FedRAMP-Compliant Monitoring System in Cybersecurity

Security Application Development
Created: 28/10/2025

Brief Overview of FedRAMP's History

FedRAMP, which stands for the Federal Risk and Authorization Management Program, is a United States government-wide program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services.

Let’s Look Back At Its Evolution

Figure 1: The history of FedRAMP

Today, to enter the Federal market, every application, product, or network system related to cybersecurity has to be "FedRAMP-compliant".

Introduction of FedRAMP Certified Grafana

To guarantee the uptime and reliability of our FedRAMP cybersecurity platform, we need a dedicated monitoring system that enables proactive maintenance and rapid incident response from our operations team. The key point is that the monitoring system itself MUST be "FedRAMP-compliant", which means it has to:

  • Be independently verified by the FedRAMP program
  • Be committed to maintaining that high standard of security

The FedRAMP Certified Grafana Environment provides such an observability platform for FedRAMP customers, built on the LGTM stack: Loki (logs), Grafana (dashboards and visualization), Tempo (traces), and Mimir (metrics).

Architecture

Global View

At a high level, this system consists of two active EKS clusters in two different regions. Metrics and logs from monitoring agents are sent to both clusters simultaneously through HTTPS endpoints. By keeping two copies of the data in two different regions, we ensure high availability for Loki and Mimir.

Figure 2: FedRAMP EKS Cluster Global View
  • The RDS cluster resides in the private subnets of the primary region and has one read-write instance in one Availability Zone and one read-only replica in another. If the read-write instance fails, the read-only replica in the other Availability Zone is automatically promoted to receive traffic. The Grafana application in the secondary region communicates with the RDS instance through a VPC Peering connection.
  • Route 53 provides five public domains: two for Mimir's ALBs, two for Loki's ALBs, and one for Grafana's ALB. The domain for Grafana routes users to the region with the lowest latency. AWS Certificate Manager (ACM) provides SSL/TLS certificates for these five domains.
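The latency-based routing described above can be sketched in Terraform as below. This is a minimal illustration, not the actual configuration: the domain name, zone, and resource names are hypothetical, and in practice a matching record would exist per region.

```hcl
# Hypothetical sketch: latency-based alias record pointing the Grafana
# domain at the primary region's ALB. A second record with
# set_identifier = "secondary" and region = "us-west-2" would cover
# the other region; Route 53 answers with the lowest-latency one.
resource "aws_route53_record" "grafana_primary" {
  zone_id        = aws_route53_zone.main.zone_id
  name           = "grafana.example.gov"   # placeholder domain
  type           = "A"
  set_identifier = "primary"

  latency_routing_policy {
    region = "us-east-1"
  }

  alias {
    name                   = aws_lb.grafana_primary.dns_name
    zone_id                = aws_lb.grafana_primary.zone_id
    evaluate_target_health = true
  }
}
```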

Regional View
An Amazon EKS cluster (FIPS enabled) consists of two primary components:

  • The control plane managed by AWS
  • The data plane

Data Plane

For the data plane, this system uses the "managed node groups" node type, which blends automation and customization for managing a collection of Amazon EC2 instances. The diagram below shows the data plane.

Figure 3: FedRAMP EKS Cluster Region View

The RDS cluster uses the Aurora MySQL engine with password authentication. During the provisioning of RDS, a default admin user is created with the full set of default administrative privileges. In addition to the admin user, a minimal-privileges user for Grafana is created after the EKS cluster is deployed.
Because the RDS cluster is in a private subnet, running an automated script that connects directly from outside the VPC to create this user is not possible. The EKS cluster is in the same VPC as RDS, so we can utilize a Kubernetes Job object to run the script.
The admin and Grafana passwords are generated by Terraform resources and stored in AWS Secrets Manager.
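The Job-based user creation might look roughly like the following Terraform sketch (using the Kubernetes provider's `kubernetes_job` resource). The image, secret names, and the exact SQL are assumptions for illustration; the real Job and privilege grants will differ.

```hcl
# Hypothetical sketch: a one-shot Job running inside the VPC that
# creates the minimal-privileges Grafana user against the private RDS
# endpoint, pulling credentials from pre-created Kubernetes Secrets.
resource "kubernetes_job" "create_grafana_user" {
  metadata {
    name = "create-grafana-db-user"
  }
  spec {
    backoff_limit = 3
    template {
      metadata {}
      spec {
        restart_policy = "Never"
        container {
          name    = "mysql-client"
          image   = "mysql:8.0"
          command = ["sh", "-c",
            # Placeholder SQL; real grants would be scoped to Grafana's needs.
            "mysql -h \"$DB_HOST\" -u admin -p\"$ADMIN_PW\" -e \"CREATE USER IF NOT EXISTS 'grafana'@'%' IDENTIFIED BY '$GRAFANA_PW'; GRANT ALL ON grafana.* TO 'grafana'@'%';\""
          ]
          env {
            name = "ADMIN_PW"
            value_from {
              secret_key_ref {
                name = "rds-admin"      # assumed secret name
                key  = "password"
              }
            }
          }
          env {
            name = "GRAFANA_PW"
            value_from {
              secret_key_ref {
                name = "grafana-db"     # assumed secret name
                key  = "password"
              }
            }
          }
          env {
            name  = "DB_HOST"
            value = aws_rds_cluster.grafana.endpoint
          }
        }
      }
    }
  }
}
```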

Control Plane

Figure 4: FedRAMP Grafana Control Plane

The Amazon EKS control plane consists of control plane nodes that run the Kubernetes software, such as etcd and the Kubernetes API server. The control plane runs in an account managed by AWS, and the Kubernetes API is exposed via the Amazon EKS endpoint associated with the cluster. Each Amazon EKS cluster control plane is single-tenant and unique and runs on its own set of Amazon EC2 instances.

The cluster control plane is provisioned across multiple Availability Zones and fronted by an Elastic Load Balancing Network Load Balancer. Amazon EKS also provisions elastic network interfaces in our own VPC subnets to provide connectivity from the control plane instances to the nodes (for example, to support kubectl exec, logs, proxy data flows).

This system enables both public and private endpoints. Kubernetes API requests originating within the VPC reach the control plane via the EKS-managed ENIs in the VPC, while the cluster API server also remains accessible from the internet. CIDR restrictions limit which client IP addresses can connect to the public API server endpoint.
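The endpoint access mode and CIDR restrictions map directly onto the EKS cluster's `vpc_config`. A minimal sketch, with a placeholder allow-list range:

```hcl
# Hypothetical sketch: both endpoints enabled, with the public endpoint
# restricted to an allow-listed CIDR range (203.0.113.0/24 is a
# documentation placeholder, not a real client range).
resource "aws_eks_cluster" "monitoring" {
  name     = "fedramp-monitoring"        # assumed cluster name
  role_arn = aws_iam_role.cluster.arn

  vpc_config {
    subnet_ids              = var.private_subnet_ids
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["203.0.113.0/24"]
  }
}
```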

AWS encrypts all of the data stored by the etcd nodes and associated Amazon EBS volumes using AWS KMS by default. We also enable envelope encryption for secrets with customer-managed KMS keys. This aligns with a Zero Trust strategy, which assumes that no traffic is trustworthy by default, regardless of whether it originates from inside or outside the network: every request, whether from a user, device, application, or service, must be continuously authenticated, authorized, and validated before gaining access.
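Envelope encryption of Kubernetes Secrets with a customer-managed key is configured via the cluster's `encryption_config` block. A sketch of the relevant fragment (resource names are assumptions):

```hcl
# Hypothetical sketch: a customer-managed KMS key used for envelope
# encryption of Kubernetes Secret objects stored in etcd.
resource "aws_kms_key" "eks_secrets" {
  description         = "CMK for EKS secrets envelope encryption"
  enable_key_rotation = true
}

# Inside the aws_eks_cluster resource:
#   encryption_config {
#     resources = ["secrets"]
#     provider {
#       key_arn = aws_kms_key.eks_secrets.arn
#     }
#   }
```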

Cluster View

Inside each cluster, we installed the lgtm-distributed Helm chart, which includes Loki, Grafana, and Mimir, via Terraform. The diagram below shows the main components of this chart (Mimir doc, Loki doc).

Figure 5: FedRAMP EKS Cluster View

This chart comes with default configuration values, which are overridden by a combination of --set flags and a YAML file passed as arguments to the helm install command. The helm install command is defined as a resource "helm_release" in Terraform. Sensitive data such as RDS passwords is passed into Kubernetes via Helm configuration values, which are stored as Kubernetes Secret objects in the etcd data store.
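A `helm_release` combining a values file, plain `set` overrides, and `set_sensitive` for the database password might look like the sketch below. The value paths (`grafana.replicas`, `grafana.database.password`) and file names are illustrative assumptions, not the chart's actual keys.

```hcl
# Hypothetical sketch: installing the lgtm-distributed chart with a
# values file plus set / set_sensitive overrides. set_sensitive keeps
# the password out of Terraform's plan output.
resource "helm_release" "lgtm" {
  name       = "lgtm"
  repository = "https://grafana.github.io/helm-charts"
  chart      = "lgtm-distributed"
  namespace  = "monitoring"

  values = [file("${path.module}/lgtm-values.yaml")]  # assumed file

  set {
    name  = "grafana.replicas"               # assumed value path
    value = "2"
  }

  set_sensitive {
    name  = "grafana.database.password"      # assumed value path
    value = data.aws_secretsmanager_secret_version.grafana_db.secret_string
  }
}
```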

The lgtm-distributed chart also creates Ingress objects for Grafana, Loki and Mimir. These objects then create the corresponding Application Load Balancers with the help of AWS Load Balancer Controller. The EKS cluster by default does not have this controller and therefore it is installed via the "aws-load-balancer-controller" helm chart. Traffic between users and the load balancers is secured via HTTPS. Each region has one SSL/TLS certificate associated with all three ALBs and each certificate contains the domain names of Grafana, Loki, and Mimir.
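Installing the AWS Load Balancer Controller via its Helm chart, as described above, can be sketched as follows. The IAM role name is an assumption; in practice the controller needs an IRSA role with the ALB-management policy attached.

```hcl
# Hypothetical sketch: the controller that turns the chart's Ingress
# objects into Application Load Balancers. clusterName must match the
# EKS cluster; the service account is annotated with an IRSA role.
resource "helm_release" "alb_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"

  set {
    name  = "clusterName"
    value = aws_eks_cluster.monitoring.name
  }

  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = aws_iam_role.alb_controller.arn   # assumed IRSA role
  }
}
```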

Challenges

FedRAMP Certified Grafana: Gold Standard in Security

Deployment:

  • Hardening the Infrastructure: Grafana's host infrastructure must be hardened to meet strict government security standards, such as CIS Benchmarks or DISA STIGs.
  • FIPS 140-2/3 Compliance: FedRAMP mandates that the entire Grafana stack use only FIPS-validated cryptography, a complex task often requiring significant software and library modifications to protect data in transit and at rest.
  • High-Availability and Disaster Recovery: FedRAMP standards require Grafana to have a complex, highly available, multi-AZ architecture, plus a tested disaster recovery plan to meet specific agency RPO and RTO targets.

Case Study

Compliant Monitoring System - Passed Schellman's Validation

TMA Solutions, with extensive experience in delivering cybersecurity services to global clients, provided a compliant monitoring system for a U.S.-based client seeking to safeguard sensitive data across critical business applications. Drawing on the expertise of its professional and experienced engineering team, TMA implemented a Grafana FedRAMP Monitoring System that monitors and responds quickly to incidents to ensure product continuity, and that passed validation by Schellman, an independent, private-sector cybersecurity and compliance services firm that works with the government as a third-party assessor.

Figure 6: FedRAMP Grafana

