Amazon CloudWatch FAQs

General

Amazon CloudWatch is an AWS monitoring service for cloud resources and the applications that you run on AWS. You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, and set alarms. Amazon CloudWatch can monitor AWS resources, such as Amazon EC2 instances, Amazon DynamoDB tables, and Amazon RDS DB instances. It can also monitor custom metrics generated by your applications and services, and any log files that your applications generate, whether they are hosted on premises, in hybrid environments, or on other clouds. You can use Amazon CloudWatch to gain system-wide visibility into resource utilization, application performance, and operational health. You can use these insights to react and keep your application running smoothly.

To get started with monitoring, you can use Automatic Dashboards with built-in AWS best practices, explore account-based and resource-based views of metrics and alarms, and easily drill down to understand the root cause of performance issues.

Amazon CloudWatch can be accessed via API, command-line interface, AWS SDKs, and the AWS Management Console.

Amazon CloudWatch receives and provides metrics for all Amazon EC2 instances and should work with any operating system currently supported by the Amazon EC2 service.

Amazon CloudWatch integrates with AWS Identity and Access Management (IAM) so that you can specify which CloudWatch actions a user in your AWS Account can perform. For example, you could create an IAM policy that gives only certain users in your organization permission to use GetMetricStatistics. They could then use the action to retrieve data about your cloud resources.

You can't use IAM to control access to CloudWatch data for specific resources. For example, you can't give a user access to CloudWatch data for only a specific set of instances or a specific LoadBalancer. Permissions granted using IAM cover all the cloud resources you use with CloudWatch. In addition, you can't use IAM roles with the Amazon CloudWatch command line tools.
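The IAM policy described above could look like the following minimal sketch. It builds the policy document as a Python dictionary and serializes it to JSON. The function name is hypothetical, and the statement uses "Resource": "*" because, as noted above, permissions cover all the cloud resources you use with CloudWatch.

```python
import json


def build_read_metrics_policy():
    """Return an IAM policy document that allows only GetMetricStatistics."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["cloudwatch:GetMetricStatistics"],
                # CloudWatch data access is not scoped to specific resources,
                # so the statement applies to all of them.
                "Resource": "*",
            }
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or a template.
policy_json = json.dumps(build_read_metrics_policy(), indent=2)
print(policy_json)
```

Attaching this policy to a user or group would let those principals retrieve metric data without granting any other CloudWatch action.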

Amazon CloudWatch Logs lets you monitor and troubleshoot your systems and applications using your existing system, application and custom log files.

With CloudWatch Logs, you can monitor your logs, in near real time, for specific phrases, values or patterns. For example, you could set an alarm on the number of errors that occur in your system logs or view graphs of latency of web requests from your application logs. You can then view the original log data to see the source of the problem. Log data can be stored and accessed indefinitely in highly durable, low-cost storage so you don’t have to worry about filling up hard drives.

CloudWatch Logs is capable of monitoring and storing your logs to help you better understand and operate your systems and applications. You can use CloudWatch Logs in a number of ways.

Real-time application and system monitoring: You can use CloudWatch Logs to monitor applications and systems using log data. For example, CloudWatch Logs can track the number of errors that occur in your application logs and send you a notification whenever the rate of errors exceeds a threshold you specify. CloudWatch Logs uses your log data for monitoring, so no code changes are required.
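As a rough sketch of the error-count example above, the following Python builds the request parameters for a metric filter that counts log events containing ERROR and for an alarm on that count. The filter name, alarm name, namespace, period, and helper function are all hypothetical. The dictionaries are shaped for the boto3 calls named in the closing comment, which you should verify against the SDK reference before relying on them.

```python
def build_error_monitoring_requests(log_group, topic_arn, threshold):
    """Build parameters for an error-count metric filter and a matching alarm."""
    metric_filter = {
        "logGroupName": log_group,
        "filterName": "ErrorCount",  # hypothetical name
        "filterPattern": "ERROR",  # matches log events containing the term ERROR
        "metricTransformations": [
            {
                "metricName": "ErrorCount",
                "metricNamespace": "MyApp",  # hypothetical namespace
                "metricValue": "1",  # each matching event adds 1 to the metric
            }
        ],
    }
    alarm = {
        "AlarmName": "MyApp-HighErrorRate",  # hypothetical name
        "Namespace": "MyApp",
        "MetricName": "ErrorCount",
        "Statistic": "Sum",
        "Period": 300,  # evaluate errors per 5-minute window (assumption)
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no log events means no errors
        "AlarmActions": [topic_arn],  # for example, an SNS topic for notifications
    }
    return metric_filter, alarm


# With boto3 installed and credentials configured, these could be sent as:
#   boto3.client("logs").put_metric_filter(**metric_filter)
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Because the metric is derived from existing log data, the application itself needs no code changes.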

Long-term log retention: You can use CloudWatch Logs to store your log data indefinitely in highly durable and cost effective storage without worrying about hard drives running out of space. The CloudWatch Logs Agent makes it easy to quickly move both rotated and non-rotated log files off of a host and into the log service. You can then access the raw log event data when you need it.

The CloudWatch Logs Agent is supported on Amazon Linux, Ubuntu, CentOS, Red Hat Enterprise Linux, and Windows. The agent can monitor individual log files on the host.

Yes. The CloudWatch Logs Agent is integrated with Identity and Access Management (IAM) and includes support for both access keys and IAM roles.

Amazon CloudWatch Logs Insights is an interactive, pay-as-you-go, and integrated log analytics capability for CloudWatch Logs. It helps developers, operators, and systems engineers understand, improve, and debug their applications, by allowing them to search and visualize their logs. Logs Insights is fully integrated with CloudWatch, enabling you to manage, explore, and analyze your logs. You can also leverage CloudWatch Metrics, Alarms and Dashboards with Logs to get full operational visibility into your applications. This empowers you to understand your applications, make improvements, and find and fix problems quickly, so that you can continue to innovate rapidly. You can write queries with aggregations, filters, and regular expressions to derive actionable insights from your logs. You can also visualize timeseries data, drill down into individual log events, and export your query results to CloudWatch Dashboards.

You can immediately start using Logs Insights to run queries on all your logs being sent to CloudWatch Logs. There is no setup required and no infrastructure to manage. You can access Logs Insights from the AWS Management Console or programmatically through your applications by using the AWS SDK.
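To illustrate programmatic access, here is a minimal sketch of a Logs Insights query that combines a regular-expression filter with an aggregation, together with the parameters for starting it. The query text, log group name, and helper function are illustrative assumptions. The dictionary is shaped for the boto3 call named in the closing comment.

```python
import time

# Count events whose message matches "error" (case-insensitive) in 5-minute bins.
QUERY = """
fields @timestamp, @message
| filter @message like /(?i)error/
| stats count(*) as error_count by bin(5m)
| sort error_count desc
| limit 20
""".strip()


def build_start_query_request(log_group, lookback_seconds=3600, now=None):
    """Build parameters for a query over the most recent lookback window."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - lookback_seconds,  # epoch seconds
        "endTime": end,
        "queryString": QUERY,
    }


# With boto3 installed and credentials configured, the query could be run as:
#   logs = boto3.client("logs")
#   query_id = logs.start_query(**build_start_query_request("/my/app"))["queryId"]
#   results = logs.get_query_results(queryId=query_id)
```

Queries run asynchronously, so a real client would poll for results until the query reports a finished status.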

Amazon CloudWatch Anomaly Detection applies machine-learning algorithms to continuously analyze single time series of systems and applications, determine a normal baseline, and surface anomalies with minimal user intervention. It allows you to create alarms that auto-adjust thresholds based on natural metric patterns, such as time of day, day of week, seasonality, or changing trends. You can also visualize metrics with anomaly detection bands on dashboards to monitor, isolate, and troubleshoot unexpected changes in your metrics.

It is easy to get started with Anomaly Detection. In the CloudWatch console, go to Alarms in the navigation pane to create an alarm, or start with Metrics to overlay the metric’s expected values onto the graph as a band. You can also enable Anomaly Detection using the AWS CLI, AWS SDKs, or AWS CloudFormation templates. To learn more, please visit the CloudWatch Anomaly Detection documentation and pricing pages.
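For the SDK route, an anomaly detection alarm might be defined along the lines of the sketch below. It compares a metric against a band produced by the ANOMALY_DETECTION_BAND metric math function. The alarm name, the choice of EC2 CPUUtilization, the period, and the band width of 2 are assumptions made for illustration, and the helper function is hypothetical.

```python
def build_anomaly_alarm_request(instance_id, band_width=2):
    """Build alarm parameters that fire when CPU leaves its expected band."""
    return {
        "AlarmName": f"cpu-anomaly-{instance_id}",  # hypothetical name
        # Alarm when the metric is outside the band in either direction.
        "ComparisonOperator": "LessThanLowerOrGreaterThanUpperThreshold",
        "EvaluationPeriods": 3,
        "ThresholdMetricId": "band",  # points at the band expression below
        "Metrics": [
            {
                "Id": "m1",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {
                        "Namespace": "AWS/EC2",
                        "MetricName": "CPUUtilization",
                        "Dimensions": [
                            {"Name": "InstanceId", "Value": instance_id}
                        ],
                    },
                    "Period": 300,
                    "Stat": "Average",
                },
            },
            {
                "Id": "band",
                # A larger width gives a wider, more tolerant band.
                "Expression": f"ANOMALY_DETECTION_BAND(m1, {band_width})",
                "Label": "Expected range",
            },
        ],
    }


# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("cloudwatch").put_metric_alarm(**build_anomaly_alarm_request("i-0123456789abcdef0"))
```

Unlike a static threshold, the band moves with the metric's learned pattern, so the same alarm definition adapts to daily or weekly cycles.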

Amazon CloudWatch now includes Contributor Insights, which analyzes time-series data to provide a view of the top contributors influencing system performance. Once set up, Contributor Insights runs continuously without needing additional user intervention. This helps developers and operators more quickly isolate, diagnose, and remediate issues during an operational event.
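The idea of a top-contributors view can be illustrated locally with a few lines of Python. This is only a conceptual sketch over made-up log events and is not how Contributor Insights is implemented. The field names and the helper function are hypothetical.

```python
from collections import Counter


def top_contributors(events, key, n=3):
    """Return the n most frequent values of `key` with their event counts."""
    counts = Counter(event[key] for event in events if key in event)
    return counts.most_common(n)


# Made-up error events, each tagged with the client that triggered it.
events = [
    {"client_ip": "10.0.0.1", "status": 500},
    {"client_ip": "10.0.0.2", "status": 500},
    {"client_ip": "10.0.0.1", "status": 502},
    {"client_ip": "10.0.0.3", "status": 500},
    {"client_ip": "10.0.0.1", "status": 500},
    {"client_ip": "10.0.0.2", "status": 503},
]

print(top_contributors(events, "client_ip", n=2))
# prints [('10.0.0.1', 3), ('10.0.0.2', 2)]
```

Ranking contributors this way makes it obvious when a single client, host, or API key accounts for most of the errors during an event.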

In the CloudWatch console, go to Contributor Insights in the navigation pane to create a Contributor Insights rule. You can also enable Contributor Insights using the AWS CLI, AWS SDKs, or AWS CloudFormation templates. Contributor Insights is available in all commercial AWS Regions. To learn more, please visit the documentation on CloudWatch Contributor Insights.

Amazon CloudWatch ServiceLens is a feature that enables you to visualize and analyze the health, performance, and availability of your applications in a single place. CloudWatch ServiceLens ties together CloudWatch metrics and logs as well as traces from AWS X-Ray to give you a complete view of your applications and their dependencies. This enables you to quickly pinpoint performance bottlenecks, isolate root causes of application issues, and determine users impacted. CloudWatch ServiceLens enables you to gain visibility into your applications in three main areas: Infrastructure monitoring (using metrics and logs to understand the resources supporting your applications), transaction monitoring (using traces to understand dependencies between your resources), and end user monitoring (using canaries to monitor your endpoints and notify you when your end user experience has degraded).

If you already use AWS X-Ray, you can access CloudWatch ServiceLens on the CloudWatch console by default. If you do not yet use AWS X-Ray, you can get started by enabling AWS X-Ray on your applications using the X-Ray SDK. Amazon CloudWatch ServiceLens is available in all public AWS Regions where AWS X-Ray is available. To learn more, visit the documentation on Amazon CloudWatch ServiceLens.

Amazon CloudWatch Synthetics allows you to monitor application endpoints more easily. It runs tests on your endpoints every minute, 24x7, and alerts you as soon as your application endpoints don’t behave as expected. These tests can be customized to check for availability, latency, transactions, broken or dead links, step by step task completions, page load errors, load latencies for UI assets, complex wizard flows, or checkout flows in your applications. You can also use CloudWatch Synthetics to isolate alarming application endpoints and map them back to underlying infrastructure issues to reduce mean time to resolution.

It's easy to get started with CloudWatch Synthetics. You can write your first passing canary in minutes. To learn more, visit the documentation on Amazon CloudWatch Synthetics.

Pricing

Please see our pricing page for the latest information.

All Amazon EC2 instance types automatically send key health and performance metrics to CloudWatch at no cost. If you enable EC2 Detailed Monitoring, you will be charged for custom metrics based on the number of metrics sent to CloudWatch for the instance. The number of metrics sent for an instance depends on the instance type. See the available CloudWatch Metrics for Your Instances for details.

Except as otherwise noted, our prices are exclusive of applicable taxes and duties, including VAT and applicable sales tax. Learn more.

Prior to July 2017, charges for CloudWatch were split under two different sections in your AWS bill and Cost and Usage Reports. For historical reasons, charges for CloudWatch Alarms, CloudWatch Metrics, and CloudWatch API usage were reported under the “Elastic Compute Cloud” (EC2) detail section of your bill, while charges for CloudWatch Logs and CloudWatch Dashboards were reported under the “CloudWatch” detail section. To help consolidate and simplify your monthly AWS CloudWatch usage and billing, we moved the charges for your CloudWatch Metrics, Alarms, and API usage from the “EC2” section of your bill to the “CloudWatch” section, effectively bringing together all of your CloudWatch monitoring charges under the “CloudWatch” section. Note that this has no impact on your total AWS bill amount. Your bill and Cost and Usage Reports will now simply display charges for CloudWatch under a single section.

Additionally, there is a Billing Metric in CloudWatch named “Estimated Charges” that can be viewed as Total Estimated Charge or broken down By Service. The “Total Estimated Charge” metric will not change. However, the “EstimatedCharges” metric broken down by Service will change for dimension ServiceName equal to “AmazonEC2” and dimension ServiceName equal to “AmazonCloudWatch”. Due to the billing consolidation, you may see your AmazonEC2 billing metric decrease and your AmazonCloudWatch billing metric increase as usage and billing charges move out of EC2 and into CloudWatch.
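To inspect the per-service billing metric described above, a request might be assembled as in this sketch. It assumes that billing alerts are enabled for the account, that charges are reported in USD, and that the request is sent to the US East (N. Virginia) Region where billing metric data is stored. The helper function and the daily period are illustrative choices.

```python
from datetime import datetime, timedelta, timezone


def build_estimated_charges_request(service_name, days=7, now=None):
    """Build parameters for reading EstimatedCharges for one service."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [
            {"Name": "ServiceName", "Value": service_name},
            {"Name": "Currency", "Value": "USD"},  # assumption: USD billing
        ],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": 86400,  # one data point per day
        "Statistics": ["Maximum"],  # the metric is a running month-to-date total
    }


# With boto3 installed and credentials configured, this could be sent as:
#   cw = boto3.client("cloudwatch", region_name="us-east-1")
#   cw.get_metric_statistics(**build_estimated_charges_request("AmazonCloudWatch"))
```

Running the same request for "AmazonEC2" and "AmazonCloudWatch" shows how charges are distributed between the two dimensions.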

Logs Insights is priced per query and charges based on the amount of ingested log data scanned by the query. For additional details about pricing, see CloudWatch pricing.

Yes, if you cancel a query manually, you are charged for the amount of ingested log data scanned up to the point at which you cancelled the query.

No, you are not charged for failed queries.
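The charging rules above (pay for data scanned, pay for the scanned portion of a cancelled query, pay nothing for a failed query) can be sketched as a small estimator. The price per GB used here is a placeholder and not an actual AWS price, and treating 1 GB as 10^9 bytes is also an assumption. Check the pricing page for real figures.

```python
def logs_insights_query_cost(bytes_scanned, status, price_per_gb=0.005):
    """Estimate the charge for one query.

    price_per_gb is a placeholder value for illustration only.
    """
    if status == "Failed":
        return 0.0  # failed queries are not charged
    if status not in ("Complete", "Cancelled"):
        raise ValueError(f"unexpected query status: {status}")
    # Completed queries pay for everything scanned. Cancelled queries pay
    # for whatever had been scanned up to the moment of cancellation.
    return (bytes_scanned / 1e9) * price_per_gb


print(logs_insights_query_cost(10e9, "Complete"))
print(logs_insights_query_cost(2e9, "Cancelled"))
print(logs_insights_query_cost(50e9, "Failed"))
```

Narrowing the time range or the set of log groups reduces the data scanned and therefore the estimated charge.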

Cross-account observability

Cross-account observability in CloudWatch lets you monitor and troubleshoot applications that span across multiple accounts within a Region. Using cross-account observability, you can seamlessly search, visualize, and analyze your metrics, logs, and traces, without having to worry about account boundaries. You can start with an aggregated cross-account view of your application to visually identify the resources exhibiting errors and dive deep into correlated traces, metrics, and logs to root cause the issue. The seamless cross-account data access and navigation enabled by cross-account monitoring helps you reduce the manual effort required to troubleshoot issues and save valuable time in resolution. Cross-account observability is an addition to CloudWatch’s unified observability capability.

Cross-account observability introduces two new account concepts. A “monitoring account” is a central AWS account that can view and interact with observability data generated across other accounts. A “source account” is an individual AWS account that generates observability data for the resources that reside in it. Once you identify your monitoring and source accounts, you complete your cross-account monitoring configuration by selecting which telemetry data to share with your monitoring account. Within minutes, you can easily set up central monitoring accounts from which you have a complete view of the health and performance of your applications deployed across many related accounts or an entire AWS organization. With cross-account observability in CloudWatch, you can get a bird's-eye view of your cross-application dependencies that can impact service availability, and you can pinpoint issues proactively and troubleshoot with reduced mean time to resolution.

Using cross-account observability, you can search for log groups stored across multiple accounts from a central view, run cross-account Logs Insights queries, Live Tail analytics, and create Contributor Insights rules across accounts to identify top-N contributors generating log entries. You can use metrics search to visualize metrics from many accounts in a consolidated view, create alarms that evaluate metrics from other accounts to be notified of anomalies and trending issues, and visualize them on centralized dashboards. You can also use this capability to set up a single, cross-account metric stream to include metrics that span multiple AWS accounts in an AWS Region. With cross-account observability, you can also view an interactive map of your cross-account applications using ServiceLens with one-step drill downs to relevant metrics, logs, and traces.

Both cross-account monitoring in CloudWatch and the cross-account, cross-Region features will be available on the CloudWatch console. The cross-account, cross-Region drop-down menus will be removed from the console when you set up cross-account observability in CloudWatch. Note that the cross-account observability experience in CloudWatch is available in one Region at a time. The cross-account, cross-Region feature allows access to organization-wide telemetry through IAM roles. Cross-account observability in CloudWatch uses the Observability Access Manager API to define access policies. Learn more in our documentation.
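As a hedged sketch of the Observability Access Manager flow, the monitoring account creates a sink and each source account creates a link to it, choosing which telemetry to share. The helper functions, the sink name, and the label template are illustrative assumptions. The resource type strings cover only metrics, logs, and traces, and the parameter shapes should be verified against the API reference.

```python
# Telemetry types a source account can choose to share (subset for illustration).
SHAREABLE_TYPES = {
    "metrics": "AWS::CloudWatch::Metric",
    "logs": "AWS::Logs::LogGroup",
    "traces": "AWS::XRay::Trace",
}


def build_sink_request(name):
    """Parameters for creating a sink in the monitoring account."""
    return {"Name": name}


def build_link_request(sink_arn, telemetry=("metrics", "logs", "traces")):
    """Parameters for linking a source account to the monitoring account's sink."""
    unknown = [t for t in telemetry if t not in SHAREABLE_TYPES]
    if unknown:
        raise ValueError(f"unsupported telemetry types: {unknown}")
    return {
        "LabelTemplate": "$AccountName",  # how the source account is labeled
        "ResourceTypes": [SHAREABLE_TYPES[t] for t in telemetry],
        "SinkIdentifier": sink_arn,
    }


# With boto3 installed and credentials configured, these could be sent as:
#   oam = boto3.client("oam")
#   sink = oam.create_sink(**build_sink_request("central-monitoring"))  # monitoring account
#   oam.create_link(**build_link_request(sink["Arn"], ("metrics", "logs")))  # source account
```

A sink policy that authorizes the source accounts would also be needed in the monitoring account before links can be created.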

Application Performance Monitoring (APM)

Amazon CloudWatch offers complete visibility into application transaction spans, providing developers with a powerful new search and analytics experience at any scale. This comprehensive solution goes beyond sampling, enabling quick connections between transaction-related business impact and application performance. With out-of-the-box analytics and visualization capabilities, CloudWatch delivers instant insights into overall application transaction health and performance. Seamlessly integrated with CloudWatch Application Signals, this feature empowers teams to efficiently monitor, troubleshoot, and optimize their applications with ease.

You can turn on Application Signals in the AWS Management Console for CloudWatch or when enabling CloudWatch on AWS resources, such as Amazon EKS clusters. Application instrumentation is included in the Amazon CloudWatch Agent. Application services, their APIs, and their dependencies are discovered and visualized in a summary view and on a service map. To reflect business impact and importance, you can create service level objectives (SLOs) in a few clicks on standard application metrics, real-user monitors, or synthetic monitors. The “Enable more APM” view in the console shows monitored and unmonitored resources so that you can gradually add application visibility. You can use CloudWatch settings to increase trace sampling on critical services and capture more examples of critical transactions such as payment orders. To add external availability monitoring or UI workflows, you can add synthetic canaries, and to add client visibility, you can enable RUM telemetry on your web applications. To get started on Application Signals with complete visibility into application transaction spans, see the documentation.

Amazon CloudWatch Application Signals discovers application services such as a mortgage payment processor running in EKS and generates a standard set of application metrics for volume, latency, errors and faults of APIs (such as to add users, place orders, pay, etc.) and dependencies (such as calls between application services, to AWS services or to external endpoints). Customers can reflect business impact and importance of application services, their APIs and dependencies by defining service level objectives. New application-centric observability views in the AWS Management Console for CloudWatch will then summarize application health against SLOs and offer a drill down to expediently establish a root cause.
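The arithmetic behind a request-based service level objective can be sketched in a few lines. This is a generic illustration of SLO attainment and error budget, not the calculation Application Signals performs internally. The function names and sample numbers are hypothetical.

```python
def slo_attainment(good_requests, total_requests):
    """Fraction of requests that met the objective (1.0 when there is no traffic)."""
    if total_requests == 0:
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(good_requests, total_requests, goal):
    """Fraction of the error budget still unspent (negative when overspent)."""
    allowed_bad = (1 - goal) * total_requests
    actual_bad = total_requests - good_requests
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else 0.0
    return 1 - actual_bad / allowed_bad


# Example: a 99% availability goal over 1,000 requests, 5 of which faulted.
attained = slo_attainment(995, 1000)
budget = error_budget_remaining(995, 1000, goal=0.99)
print(f"attainment={attained:.3f}, budget remaining={budget:.0%}")
```

Tracking the remaining budget rather than raw error counts ties alerting to the business impact that the SLO is meant to express.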

Use Application Signals for an integrated application performance monitoring experience. With integrated monitoring, you can automatically collect and correlate application telemetry while prioritizing business-critical applications. You can also leverage alarms, traces, and events data to take automated actions and reduce the mean time to recovery (MTTR). Application Signals fits when you want to monitor applications running on Amazon EKS, Amazon EC2, Amazon ECS, databases, components, or on-premises resources. Simply specify the resources to monitor, and enable Application Signals for Amazon EKS in your CloudWatch console without manual configuration. For all other application environments, you can quickly deploy the CloudWatch Agent and start monitoring your applications. With Application Signals you can create, measure, and track SLOs aligned with your business and operational KPIs. SLOs are crucial in managing critical applications, improving availability, decreasing downtime, and enabling a consistent customer experience. Application Signals also fits when you need a comprehensive view of all your applications and the ability to manage application performance. Leverage automatic, prebuilt, and standardized dashboards with all your applications, services, and telemetry data. These visualization capabilities help you quickly scan and access metrics such as volume, availability, latency, and errors impacting your applications. Application Signals service maps enable you to drill down into traces, APIs, and compute resources to get a comprehensive view of the root causes of your application issues. The integration of Amazon CloudWatch RUM and Amazon CloudWatch Synthetics within Application Signals gives you real-time user data and canaries in a single view. This is important if you need to rapidly pinpoint the root cause in your code, dependencies, or hosting environment before any end users are impacted.

CloudWatch Application Insights helps you monitor your applications that use Amazon EC2 instances along with other application resources. It identifies and sets up key metrics, logs, and alarms across your application resources and technology stack (for example, your Microsoft SQL Server database, web (IIS) and application servers, OS, load balancers, and queues). It continuously monitors metrics and logs to detect and correlate anomalies and errors. When errors and anomalies are detected, Application Insights generates CloudWatch Events that you can use to set up notifications or take actions. To assist with troubleshooting, it creates automated dashboards for detected problems, which include correlated metric anomalies and log errors, along with additional insights to point you to a potential root cause.

Amazon CloudWatch Application Signals extends Amazon CloudWatch with standardized application metrics and application-centric observability views in the AWS Management Console for CloudWatch. You can get started without writing custom instrumentation. The new views summarize application health to help determine business impact and prioritize, then offer a drill down to expediently establish root cause.
When you opt in to Application Signals with complete visibility into application transaction spans, you can access a powerful new search and analytics experience at any scale. This comprehensive solution goes beyond sampling, enabling quick connections between transaction-related business impact and application performance. With out-of-the-box analytics and visualization capabilities, CloudWatch delivers instant insights into overall application transaction health and performance. This feature empowers teams to efficiently monitor, troubleshoot, and optimize their applications with ease.

X-Ray traces

X-Ray traces help developers analyze and debug production, distributed applications, providing an end-to-end view of requests as they travel through the application.

X-Ray makes it easy for you to:

  1. Create a service map: X-Ray tracks requests to map services used, showing connections, dependency trees, and issues across Availability Zones or Regions.

  2. Identify errors and bugs: X-Ray analyzes response codes to automatically surface bugs, enabling easy debugging without reproduction.

  3. Build custom analysis and visualization apps: X-Ray's query APIs allow creating apps that leverage the data it records.

  • Trace: A set of data points that share the same trace ID, recorded as a request travels through application services.

  • Segment: Data encapsulating a single component of a distributed application, including system-defined and user-defined data.

  • Annotation: System-defined or user-defined metadata associated with a segment.

  • Errors: System annotations on segments for calls that result in errors, including messages, stack traces, and source details.

  • Sampling: X-Ray collects data for a statistically significant number of requests, not every single one, for performance and cost-effectiveness.

  • X-Ray Daemon: A service that collects traces and sends them to X-Ray, simplifying the process compared to direct API usage.
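To make the terms above concrete, the sketch below builds two minimal segment documents that share one trace ID, with a user-defined annotation on the second. The service names, timings, and helper functions are hypothetical. The field names and ID formats follow the X-Ray segment document format as best understood here, so verify them against the X-Ray developer guide before relying on them.

```python
import json
import secrets
import time


def new_trace_id(now=None):
    """Trace ID: version 1, epoch seconds as 8 hex digits, then 24 random hex digits."""
    epoch = int(now if now is not None else time.time())
    return f"1-{epoch:08x}-{secrets.token_hex(12)}"


def build_segment(name, trace_id, start, end, annotations=None):
    """Build one segment: the work done by a single component for a request."""
    segment = {
        "name": name,
        "id": secrets.token_hex(8),  # 16 hex digits, unique within the trace
        "trace_id": trace_id,
        "start_time": start,  # epoch seconds, fractional part allowed
        "end_time": end,
    }
    if annotations:
        segment["annotations"] = annotations  # indexed key-value metadata
    return segment


# Two components handling the same request share the trace ID.
trace_id = new_trace_id()
t0 = time.time()
frontend = build_segment("frontend", trace_id, t0, t0 + 0.120)
orders = build_segment("orders", trace_id, t0 + 0.010, t0 + 0.095,
                       annotations={"order_type": "express"})
print(json.dumps([frontend, orders], indent=2))
```

In practice the X-Ray SDK generates these documents and hands them to the X-Ray daemon, which is the simpler path compared to calling the API directly with hand-built JSON.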

You can get started with X-Ray by including the X-Ray language SDK in your application and installing the X-Ray daemon. For more information see the X-Ray user guide.

X-Ray can be used with distributed applications of any size to trace and debug both synchronous requests and asynchronous events. For example, X-Ray can be used to trace web requests made to a web application or asynchronous events that utilize Amazon SQS queues.

You can use X-Ray with applications running on EC2, ECS, Lambda, Amazon SQS, Amazon SNS, and Elastic Beanstalk. In addition, the X-Ray SDK automatically captures metadata for API calls made to AWS services using the AWS SDK, and it provides add-ons for MySQL and PostgreSQL drivers.

If you’re using Elastic Beanstalk, you will need to include the language-specific X-Ray libraries in your application code. For applications running on other AWS services, such as EC2 or ECS, you will need to install the X-Ray daemon and instrument your application code.

Yes, X-Ray provides a set of APIs for ingesting request data, querying traces, and configuring the service. You can use the X-Ray API to build analysis and visualization applications in addition to those provided by X-Ray.

Yes. X-Ray logs all API calls as management events. It also logs calls on traces as data events, including on PutTraceSegments and GetTimeSeriesServiceStatistics among other APIs. Data events are not logged by default. To log data events, you must configure your CloudTrail trail or event data store to collect them.

Container Monitoring

CloudWatch Container Insights collects, aggregates, and summarizes metrics and logs from your containerized applications and microservices running on Amazon ECS, Amazon EKS, Kubernetes platforms on Amazon EC2, and AWS Fargate (for both Amazon ECS and Amazon EKS). Container Insights collects container metrics such as CPU, memory, disk, and network metrics out of the box and provides deeper diagnostic information, such as container restart failures, to help you isolate issues and resolve them quickly. Container Insights delivers your container observability in automatic dashboards enabling you to monitor your application health and performance easily. You can also set CloudWatch alarms on Container Insights metrics to be notified of anomalies before your application performance is impacted.

Container Insights with enhanced observability is now available for Amazon Elastic Kubernetes Service (Amazon EKS) on EC2, Amazon Elastic Container Service (Amazon ECS) on EC2, and ECS on Fargate. Enhanced observability delivers detailed metrics such as container-level ECS and EKS performance metrics, EKS Kube-state metrics, and EKS control plane metrics out of the box, allowing you to visually drill down and up across various container layers to easily spot issues such as memory leaks in individual containers. Container Insights now shows you a list of container layers consuming high levels of resources, so that you can identify risks in your environment, even if you have not yet set up alarms, and take proactive action before your end user experience is impacted. Container Insights with enhanced observability comes with an easy getting-started experience: you can auto-instrument your clusters with the CloudWatch Observability add-on for EKS, or opt in with a single toggle on ECS, to start ingesting telemetry immediately.

Container Insights with enhanced observability enables you to visually drill up and down across your Amazon EKS and Amazon ECS container layers and easily spot issues like memory leaks in individual containers, reducing mean time to resolution. Container level visibility comes out-of-the-box with enhanced observability. To enable enhanced observability, please follow the steps provided in the Amazon CloudWatch Container Insights documentation.

Yes. Using Container Insights with enhanced observability for Amazon Elastic Kubernetes Service (EKS), you can monitor your control plane status. For example, you can use it to understand autoscaling status and to plan test cluster lifecycles in your automated testing.

Container Insights with enhanced observability is an optional feature that provides out-of-the-box detailed health and performance metrics, which include container-level ECS and EKS performance metrics, EKS Kube-state metrics, and EKS control plane metrics for faster problem isolation and troubleshooting. Without enhanced observability, Container Insights provides aggregated metrics at cluster and service levels.

Yes. You can decide to use Container Insights with or without enhanced observability on a per-cluster basis. For EKS, you can enable enhanced observability by installing the CloudWatch Observability add-on in your clusters after they are created, using the add-ons tab in your cluster info view. For ECS, you can toggle Enhanced under the Monitoring tab in the cluster create workflow, or update your existing clusters in the same way, to onboard Container Insights with enhanced observability. You can also onboard enhanced observability at the account level in ECS. This enables any new clusters under that account to onboard Container Insights with enhanced observability out of the box. Please see the CloudWatch Container Insights documentation for details.

You can get started collecting detailed performance metrics, logs, and metadata from your containers and clusters by installing CloudWatch Observability add-on into your EKS clusters or by opting in at cluster or account level for ECS. To start using Container Insights, please follow the steps provided in the Amazon CloudWatch Container Insights documentation.

Container Insights with enhanced observability supports Amazon EKS running on EC2, Amazon ECS running on EC2 and AWS Fargate.

More details on Container Insights pricing are available on the CloudWatch pricing page.

No. The metric types currently supported are Gauge and Counter. Histogram and Summary metrics are planned for an upcoming release.

Prometheus is a popular open source monitoring project, part of the Cloud Native Computing Foundation (CNCF). The open source community has built over 150 plugins and defined a framework that DevOps teams can use to expose custom metrics to be collected using a pull-based approach from their applications. With this new feature, DevOps teams can automatically discover services for containerized workloads such as AWS App Mesh, NGINX, and Java/JMX. They can then expose custom metrics on those services, and ingest the metrics in CloudWatch. By curating the collection and aggregation of Prometheus metrics, CloudWatch users can monitor, troubleshoot, and alarm on application performance degradation and failures faster while reducing the number of monitoring tools required.

Prometheus metrics are automatically ingested as CloudWatch custom metrics. Each metric data point is retained for 15 months with automatic roll-up: data points with a period under 60 seconds are available for 3 hours, 1-minute data points for 15 days, 5-minute data points for 63 days, and 1-hour data points for 15 months. To learn more, see the documentation on CloudWatch metrics retention.
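The roll-up schedule above can be expressed as a small lookup that returns the finest resolution still available for a data point of a given age. This sketch assumes the data was originally published at sub-minute resolution and approximates 15 months as 455 days. The names are hypothetical.

```python
from datetime import timedelta

# (maximum age, finest resolution still available), ordered from newest to oldest.
RETENTION_TIERS = [
    (timedelta(hours=3), "sub-minute"),
    (timedelta(days=15), "1 minute"),
    (timedelta(days=63), "5 minutes"),
    (timedelta(days=455), "1 hour"),  # assumption: 15 months taken as 455 days
]


def finest_resolution(age):
    """Return the finest resolution available for a data point of this age.

    Returns None once the data point is older than the retention period.
    """
    for max_age, resolution in RETENTION_TIERS:
        if age <= max_age:
            return resolution
    return None


for age in (timedelta(hours=1), timedelta(days=10),
            timedelta(days=30), timedelta(days=400), timedelta(days=500)):
    print(age, "->", finest_resolution(age))
```

This is useful when choosing the period for a historical query, since asking for 1-minute data from two months ago would return nothing finer than 5-minute aggregates.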

Yes. Each Kubernetes (k8s) cluster has its own log group for the events (e.g., /aws/containerinsights//prometheus) with its own configurable retention period. For more details, please refer to the documentation on log group retention.

No. All metrics are ingested as CloudWatch Logs events and can be queried using CloudWatch Logs Insights queries. For more information, see the documentation on CloudWatch Logs Insights search language syntax.

You will be charged for what you use for the following: (1) CloudWatch Logs ingested by the Gigabyte (GB), (2) CloudWatch Logs stored, and (3) CloudWatch custom metrics. Please refer to the CloudWatch pricing page for pricing details in your AWS Region.

Database Insights

CloudWatch Database Insights is a database observability solution that provides a curated experience designed for DevOps engineers, application developers, and database administrators (DBAs) to expedite database troubleshooting and gain a holistic view into their database fleet health. This is available for Amazon Aurora MySQL and PostgreSQL.

CloudWatch Database Insights consolidates logs and metrics from your applications, your databases, and the operating systems on which they run into a unified view in the console. Using its pre-built dashboards, recommended alarms, and automated telemetry collection, DBAs and DevOps engineers can monitor the health of their database fleets and use a guided troubleshooting experience to drill down to individual instances for root-cause analysis. Application developers can correlate the impact of database dependencies with the performance and availability of their business-critical applications. This is because they can drill down from the context of their application performance view in CloudWatch Application Signals to the specific dependent database in CloudWatch Database Insights.

You can get started with Database Insights in CloudWatch by enabling it on your Aurora clusters. Database Insights now delivers overall database fleet health and performance visibility through a landing page where you can navigate to the instance-level dashboards for detailed database and SQL query analysis.

Database Insights is available in all public AWS Regions. Database Insights applies a new vCPU-based pricing – see pricing page for details. For further information, visit the Database Insights documentation.

  • RDS Performance Insights is a standard database performance tuning and monitoring feature which allows customers to assess the load on their databases in a pre-built dashboard, one instance at a time.
  • Inclusive of all the Performance Insights capabilities, Database Insights is an advanced, comprehensive database observability feature that is designed for DevOps engineers and database administrators (DBAs) to troubleshoot databases and their supporting applications at scale. It provides fleet-level views, integration with application performance monitoring (APM) via Application Signals, correlation of database metrics with logs and events, and visualization of SQL query statistics.

Internet Monitoring

Amazon CloudWatch Internet Monitor helps you to continually monitor internet availability and performance metrics between your AWS-hosted applications and application end users. With Internet Monitor, you can quickly visualize the impact of issues, pinpoint locations and providers that are affected, and then take action to improve your end users' network experience. You can see a global view of traffic patterns and health events and drill down into information about events at different geographical granularities. If an issue is caused by the AWS network, you'll receive an AWS Health Dashboard notification that tells you the steps that AWS is taking to mitigate the problem. Internet Monitor also provides insights and recommendations that can help you improve your users' experience by using other AWS services.

To use Internet Monitor, you create a monitor and associate your application's resources with it (Amazon Virtual Private Clouds (VPCs), CloudFront distributions, or WorkSpaces directories) so that Internet Monitor knows where your application's internet traffic is. Internet Monitor then provides internet measurements from AWS that are specific to the locations and networks that communicate with your application.

Then you can use the CloudWatch dashboard to learn about health events, view performance and availability scores, explore your application's historical data at different geographic granularities, and get insights into how to configure your application to improve performance for your end users.

Internet Monitor publishes internet measurements to CloudWatch Logs and CloudWatch Metrics, so you can easily use CloudWatch tools to better understand application health in geographies and networks specific to your application. Internet Monitor also sends health events to Amazon EventBridge so that you can set up notifications.

As you explore Internet Monitor, it helps to be familiar with the components and concepts you'll see referenced in the service. Internet Monitor uses or references the following: Monitor, CloudWatch logs, CloudWatch metrics, city-networks, health events, Autonomous System Numbers (ASNs), monitored resource, internet measurements, round-trip time, bytes transferred, and performance and availability scores.

Read a quick description of these components in the documentation.

Internet Monitor pricing has the following components: a fee per monitored resource, a fee per city-network, and charges for the diagnostic logs published to CloudWatch Logs. For more information, see the Amazon CloudWatch Internet Monitor pricing page.

For Internet Monitor, Regional support depends on the types of resources that you add to your monitor. For Amazon CloudFront distributions and Amazon WorkSpaces directories, Internet Monitor is available in all supported Regions. For Amazon Virtual Private Clouds (VPCs), VPCs from an opt-in Region can be added only to a monitor created in the same Region. For a complete list of supported AWS Regions, see Amazon CloudWatch Internet Monitor endpoints.

Lambda Monitoring

CloudWatch Lambda Insights is a feature for monitoring, troubleshooting, and optimizing the performance and cost of your Lambda functions. Lambda Insights simplifies the isolation and analysis of performance issues impacting your Lambda environments. DevOps and systems engineers have access to automatic dashboards in the CloudWatch console, giving them end-to-end operational visibility of metrics, logs, and traces summarizing the performance and health of their AWS Lambda functions.

You can get started collecting detailed performance metrics, logs, and metadata from your Lambda functions by following the steps in the CloudWatch Lambda Insights documentation.

CloudWatch Lambda Insights automatically collects custom metrics from performance events ingested as CloudWatch logs from your Lambda functions. More details on pricing are available on the CloudWatch pricing page.

Network Monitoring

Network Monitor provides visibility into the performance and availability of the network connecting your AWS-hosted applications to your on-premises destinations. Network Monitor enables you to quickly visualize packet loss and latency of your hybrid network connections, set alerts and thresholds, and then take action to improve your end users’ network experience. If your hybrid network connections are over AWS Direct Connect, Network Monitor allows you to identify the source of any network performance degradation within minutes.

Network Monitor vends round-trip latency and packet loss for each probe configured in the monitor. Additionally, for hybrid network connections over AWS Direct Connect, Network Monitor vends a metric for the AWS Network Health Indicator. These metrics are aggregated per VPC subnet and per destination endpoint and posted to Amazon CloudWatch. You can then access CloudWatch dashboards from within the Network Monitor console to view these metrics, set up alarms, and view the AWS network health status to see when network issues affected performance. You can also benchmark packet loss and round-trip latency by observing the 30-day history of recorded metrics or set up alarms to be notified of network events.

Network Monitor pricing has the following components: a fee per monitored resource and charges for the metrics published to CloudWatch. For more information, see the Amazon CloudWatch pricing page and navigate to the Network Monitor tab.

To use Network Monitor, you create a monitor and associate your application’s resources with it. You choose source subnets belonging to your Amazon Virtual Private Clouds (VPCs) and then you choose the destination IP addresses in your on-premises network. Network Monitor creates a mesh of the possible source and destination combinations, each of which is called a probe, within a single monitor. Network Monitor creation is fully managed by AWS and you should be able to view real-time metrics within minutes of setting up your monitors. Network Monitor vends these real-time metrics to CloudWatch Metrics, so you can easily use CloudWatch tools to better understand network health in AWS Regions specific to your network. Please reference the CloudWatch documentation for detailed setup instructions.
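As a rough illustration of the mesh described above, the following sketch enumerates one probe for each source subnet and destination IP pair. The subnet IDs, IP addresses, and the Probe type are hypothetical placeholders for illustration only; they are not Network Monitor API objects.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Probe:
    """Hypothetical stand-in for one source/destination pair in a monitor."""
    source_subnet: str
    destination_ip: str


def build_probe_mesh(source_subnets, destination_ips):
    """Return one probe for every source subnet and destination IP combination."""
    return [Probe(s, d) for s, d in product(source_subnets, destination_ips)]


# Placeholder identifiers, chosen only for illustration.
subnets = ["subnet-aaaa1111", "subnet-bbbb2222"]
destinations = ["10.0.0.10", "10.0.0.20", "10.0.0.30"]

probes = build_probe_mesh(subnets, destinations)
print(len(probes))  # 2 subnets x 3 destinations = 6 probes
```

The number of probes grows as the product of the two lists, which is worth keeping in mind when estimating how many metrics a monitor will publish.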

Digital Experience Monitoring

Amazon CloudWatch DEM lets you monitor how your end users experience your applications (including performance, availability, and usability). 

Spot intermittent issues, get notified even when there is no user traffic, and monitor your endpoints and UI using CloudWatch Synthetic canaries. Complement synthetic monitoring with CloudWatch RUM to understand end user impact and to get better visibility of your digital experience. With CloudWatch Evidently, improve your end user’s digital experience by experimenting and validating new designs and features. 

Amazon CloudWatch RUM is a real user monitoring feature that gives you visibility into an application’s client-side performance to help you reduce mean time to resolution (MTTR). With CloudWatch RUM, you can collect client-side data on web application performance in real time to identify and debug issues. It complements the CloudWatch Synthetics data to give you more visibility into the end-user’s digital experience. You can visualize anomalies in performance and use the relevant debugging data (such as error messages, stack traces, and user sessions) to fix performance issues (such as JavaScript errors, crashes, and latencies). You can also understand the range of end-user impacts, including number of sessions, geolocations, or browsers. CloudWatch RUM aggregates data on your users' journey through your application, which can help you determine which features to launch and bug fixes to prioritize.

Create an app monitor in CloudWatch RUM and add the lightweight web client in the HTML header of your application. Then start using CloudWatch RUM’s dashboards to receive user insights from different geolocations, devices, platforms, and browsers. 

Amazon CloudWatch Evidently allows you to conduct experiments and identify unintended consequences of new features before rolling them out for general use, thereby reducing risk related to new feature roll-outs. Evidently allows you to validate new features across the full application stack before release, which makes for a safer release. When launching new features, you can expose them to a smaller user base, monitor key metrics such as page load times or conversions, and then dial up traffic. Evidently also allows developers to try out different designs, collect user data, and release the most effective design in production. It assists you in interpreting and acting on experiment results without the need for advanced statistical knowledge. You can use the insights provided by Evidently’s statistical engine (such as anytime p-value and confidence intervals) to make decisions while an experiment is in progress.

You can use the CloudWatch RUM JavaScript code snippet to collect client-side user journeys and performance metrics. If desired, you can also add custom metrics like conversions using the Evidently API. Next, new features to be tested can be instrumented with the CloudWatch Evidently SDK, which provides the ability to control how users get exposed to new features. Now you can run launches and experiments, using either the AWS console or CLI. 

Amazon CloudWatch Synthetics allows you to monitor application endpoints more easily. It runs tests on your endpoints every minute, 24x7, and alerts you as soon as your application endpoints don’t behave as expected. These tests can be customized to check for availability, latency, transactions, broken or dead links, step by step task completions, page load errors, load latencies for UI assets, complex wizard flows, or checkout flows in your applications. You can also use CloudWatch Synthetics to isolate alarming application endpoints and map them back to underlying infrastructure issues to reduce mean time to resolution.

It's easy to get started with CloudWatch Synthetics. You can write your first passing canary in minutes. To learn more, visit the documentation on Amazon CloudWatch Synthetics.

The two services can be used separately, but are even better together.

AppConfig is a capability of AWS Systems Manager that you can use to create, manage, and deploy feature flags and other application configuration. When developing new features, you can use AppConfig to deploy a new feature to production, but hide it behind a flag toggle. Once you are ready to launch, you simply update your configuration to release the feature instantly or gradually.

For more advanced feature management and experimentation, you can use Evidently, which is a new capability of Amazon CloudWatch. With Evidently you can run experiments on different feature variations and compare them with a baseline, or launch a feature variation on a schedule, while monitoring business metrics like visit duration and revenue. Evidently also integrates with CloudWatch RUM, which provides client-side application performance monitoring, so RUM metrics can be used directly in Evidently.

Metrics analytics

CloudWatch Metrics Insights is a high-performance query engine that helps you slice and dice your operational metrics in real time and create aggregations on the fly using standard SQL queries. Metrics Insights helps you understand the status of your application health and performance by giving you the ability to analyze your metrics at scale. It is integrated with CloudWatch Dashboards, so you can save your queries into your health and performance dashboards to proactively monitor and pinpoint issues quickly.

To get started, choose the Metrics tab in your CloudWatch console, where you will find Metrics Insights as a built-in query engine under the Query tab at no additional cost. While Metrics Insights uses standard SQL, you can also get started by using the visual query builder. To use the query builder, you select your metrics of interest, namespaces, and dimensions visually, and the console automatically constructs your SQL queries for you based on your selections. You can use the query editor to type in your raw SQL queries at any time to dive deep and pinpoint issues at a more granular level. Metrics Insights also comes with a set of out-of-the-box sample queries that can help you start monitoring and investigating your application performance instantly. Metrics Insights is also available programmatically through CloudFormation, the AWS SDK, and CLI.
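For programmatic use, a Metrics Insights query is a SQL string placed in the Expression field of a GetMetricData query entry. The sketch below only builds that string and entry; the helper names and the "topcpu" ID are assumptions made for illustration, and no AWS call is made. With the AWS SDK for Python, the resulting entry would typically be passed in the MetricDataQueries list of get_metric_data together with a start and end time.

```python
def metrics_insights_query(namespace, metric, group_by, limit=10):
    """Build a Metrics Insights SQL string for the top-N averages of a metric."""
    return (
        f'SELECT AVG({metric}) FROM SCHEMA("{namespace}", {group_by}) '
        f"GROUP BY {group_by} ORDER BY AVG() DESC LIMIT {limit}"
    )


def metric_data_query(query_id, sql, period_seconds=300):
    """Wrap the SQL in the entry shape used by GetMetricData's MetricDataQueries."""
    return {"Id": query_id, "Expression": sql, "Period": period_seconds}


sql = metrics_insights_query("AWS/EC2", "CPUUtilization", "InstanceId")
request = metric_data_query("topcpu", sql)
print(request["Expression"])
```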

AWS resource and custom metrics monitoring

Amazon CloudWatch allows you to monitor AWS cloud resources and the applications you run on AWS. Metrics are provided automatically for a number of AWS products and services, including Amazon EC2 instances, EBS volumes, Elastic Load Balancers, Auto Scaling groups, EMR job flows, RDS DB instances, DynamoDB tables, ElastiCache clusters, RedShift clusters, OpsWorks stacks, Route 53 health checks, SNS topics, SQS queues, SWF workflows, and Storage Gateways. You can also monitor custom metrics generated by your own applications and services.

You can publish and store custom metrics down to one-second resolution. Extended retention of metrics was launched on November 1, 2016, and enabled storage of all metrics for customers from the previous 14 days to 15 months. CloudWatch retains metric data as follows:

Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.

Data points with a period of 60 seconds (1 minute) are available for 15 days.

Data points with a period of 300 seconds (5 minutes) are available for 63 days.

Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months).

Data points that are initially published with a shorter period are aggregated together for long-term storage. For example, if you collect data using a period of 1 minute, the data remains available for 15 days with 1-minute resolution. After 15 days this data is still available, but is aggregated and is retrievable only with a resolution of 5 minutes. After 63 days, the data is further aggregated and is available with a resolution of 1 hour. If you need availability of metrics longer than these periods, you can use the GetMetricStatistics API to retrieve the datapoints for offline or different storage.
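The retention schedule above can be expressed as a small lookup. The sketch below is an illustration of the schedule only, not part of any CloudWatch API; the function name and the high_resolution flag are assumptions introduced here.

```python
from datetime import timedelta

# (minimum period in seconds, how long data at that period is retained)
RETENTION_SCHEDULE = [
    (1, timedelta(hours=3)),      # high-resolution custom metrics (< 60 s)
    (60, timedelta(days=15)),
    (300, timedelta(days=63)),
    (3600, timedelta(days=455)),
]


def finest_period_for_age(age, high_resolution=False):
    """Return the smallest period (seconds) still retrievable for data of
    this age, or None once the data has expired."""
    for period, retention in RETENTION_SCHEDULE:
        if period < 60 and not high_resolution:
            continue  # sub-minute data exists only for high-resolution metrics
        if age <= retention:
            return period
    return None


print(finest_period_for_age(timedelta(days=10)))   # 60
print(finest_period_for_age(timedelta(days=30)))   # 300
print(finest_period_for_age(timedelta(days=150)))  # 3600
```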

The feature is currently available in US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), EU (Frankfurt), S. America (São Paulo), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Sydney), EU (London), Canada (Central), US East (Ohio), and China (Beijing).

The minimum resolution supported by CloudWatch is one-second data points, which is a high-resolution metric, or you can store metrics at one-minute granularity. Sometimes metrics are received by CloudWatch at varying intervals, such as three-minute or five-minute intervals. If you do not specify that a metric is high resolution, by setting the StorageResolution field in the PutMetricData API request, then by default CloudWatch will aggregate and store the metrics at one-minute resolution.

Depending on the age of data requested, metrics will be available at the resolutions defined in the retention schedules above. For example, if you request one-minute data for a day from 10 days ago, you will receive 1440 data points (24 hours x 60 minutes). However, if you request one-minute data from five months back, the UI will automatically change the granularity to one hour and the GetMetricStatistics API will not return any output.

CloudWatch does not support metric deletion. Metrics expire based on the retention schedules described above.

No. You can always retrieve metrics data for any Amazon EC2 instance based on the retention schedules described above. However, the CloudWatch console limits the search of metrics to two weeks after a metric is last ingested to ensure that the most up-to-date instances are shown in your namespace.

Yes, Amazon CloudWatch supports querying data from multiple sources, helping you monitor metrics on AWS, on premises, and other clouds. You can now troubleshoot critical events in minutes, not hours, and gain visibility into your application health, surfacing insights faster for seamless operations. Centralize your querying, visualizing, and alarming across all of your monitoring tools in one place.

Yes. Amazon CloudWatch stores metrics for terminated Amazon EC2 instances or deleted Elastic Load Balancers for 15 months.

To get started, you navigate to the metrics query builder in the Amazon CloudWatch console and open the data source selector. The selector allows you to start a wizard to add a new data source to query and alarm on. You choose the data source you want to query and specify access details such as a URL or path and credentials. For more details, see documentation.

If you view the same time window in a 5-minute period versus a 1-minute period, you may see that data points are displayed in different places on the graph. For the period you specify in your graph, Amazon CloudWatch will find all the available data points and calculate a single, aggregate point to represent the entire period. In the case of a 5-minute period, the single data point is placed at the beginning of the 5-minute time window. In the case of a 1-minute period, the single data point is placed at the 1-minute mark. We recommend using a one-minute period for troubleshooting and other activities that require the most precise graphing of time periods.
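To see why the same data lands in different places, the following sketch averages a handful of samples into fixed periods and stamps each result with the start of its period. It is a simplified model of the behavior described above, using hypothetical sample values, and is not how CloudWatch is implemented internally.

```python
from collections import defaultdict
from datetime import datetime, timezone


def aggregate_average(points, period_seconds):
    """Group (timestamp, value) points into fixed periods and average each one.

    Each result is stamped with the start of its period, which is why the same
    data appears at different positions for 1-minute and 5-minute periods.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        epoch = int(ts.timestamp())
        start = epoch - (epoch % period_seconds)
        buckets[start].append(value)
    return [
        (datetime.fromtimestamp(start, tz=timezone.utc), sum(vals) / len(vals))
        for start, vals in sorted(buckets.items())
    ]


# Five hypothetical 1-minute samples from 12:00 to 12:04 UTC.
samples = [
    (datetime(2024, 1, 1, 12, m, tzinfo=timezone.utc), v)
    for m, v in enumerate([10, 20, 30, 40, 50])
]

five_minute = aggregate_average(samples, 300)  # one point, stamped 12:00
one_minute = aggregate_average(samples, 60)    # five points, 12:00 to 12:04
```

With a 5-minute period, the spike at 12:04 is folded into a single point drawn at 12:00; with a 1-minute period it stays at 12:04.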

You can use Amazon CloudWatch to monitor data produced by your own applications, scripts, and services. A custom metric is any metric you provide to Amazon CloudWatch. For example, you can use custom metrics as a way to monitor the time to load a web page, request error rates, number of processes or threads on your instance, or amount of work performed by your application. You can get started with custom metrics by using the PutMetricData API, our sample monitoring scripts for Windows and Linux, CloudWatch collectd plugin, as well as a number of applications and tools offered by AWS partners.

A custom metric can be one of the following:

Standard resolution, with data having one-minute granularity

High resolution, with data at a granularity of one second

By default, metrics are stored at one-minute resolution in CloudWatch. You can define a metric as high-resolution by setting the StorageResolution parameter to one in the PutMetricData API request. If you do not set the optional StorageResolution parameter, then CloudWatch will default to storing the metrics at one-minute resolution.
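As a minimal sketch, the helper below builds one metric entry in the shape that PutMetricData accepts and sets StorageResolution only when high resolution is requested. The helper name, the metric name PageLoadTime, and the values are hypothetical. With the AWS SDK for Python, such entries would typically be passed in the MetricData list of put_metric_data along with a Namespace; that call is not made here.

```python
from datetime import datetime, timezone


def metric_datum(name, value, unit="None", high_resolution=False, timestamp=None):
    """Build one MetricData entry in the shape accepted by PutMetricData."""
    datum = {
        "MetricName": name,
        "Value": float(value),
        "Unit": unit,
        "Timestamp": timestamp or datetime.now(timezone.utc),
    }
    if high_resolution:
        # 1 marks the metric as high resolution; omitting the field
        # leaves it at the default one-minute resolution.
        datum["StorageResolution"] = 1
    return datum


standard = metric_datum("PageLoadTime", 1.8, unit="Seconds")
high_res = metric_datum("PageLoadTime", 1.8, unit="Seconds", high_resolution=True)
```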

When you publish a high-resolution metric, CloudWatch stores it with a resolution of one second, and you can read and retrieve it with a period of one second, five seconds, 10 seconds, 30 seconds, or any multiple of 60 seconds.
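The allowed read periods can be checked with a few lines of code. This is an illustrative helper based only on the values listed above; the function name is an assumption and it is not part of the CloudWatch API.

```python
def is_valid_period(period_seconds, high_resolution=False):
    """Check a requested period against the values listed above."""
    if period_seconds <= 0:
        return False
    if period_seconds % 60 == 0:
        return True  # any multiple of 60 seconds works for every metric
    # Sub-minute periods apply only to high-resolution metrics.
    return high_resolution and period_seconds in (1, 5, 10, 30)


print(is_valid_period(300))                       # True
print(is_valid_period(5))                         # False
print(is_valid_period(5, high_resolution=True))   # True
print(is_valid_period(15, high_resolution=True))  # False
```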

Custom metrics follow the same retention schedule listed above.

Currently, only custom metrics that you publish to CloudWatch are available at high resolution. High-resolution custom metrics are stored in CloudWatch at one-second resolution. High resolution is defined by the StorageResolution parameter in the PutMetricData API request, with a value of one, and is not a required field. If you do not specify a value for the optional StorageResolution field, then CloudWatch will store the custom metric at one-minute resolution by default.

No, high-resolution custom metrics are priced in the same manner as standard one-minute custom metrics.

You can monitor your own data using custom metrics, CloudWatch Logs, or both. You may want to use custom metrics if your data is not already produced in log format, for example operating system processes or performance measurements. Or, you may want to write your own application or script, or one provided by an AWS partner. If you want to store and save individual measurements along with additional detail, you may want to use CloudWatch Logs.

You can retrieve, graph, and set alarms on the following statistical values for Amazon CloudWatch metrics: Average, Sum, Minimum, Maximum, and Sample Count. Statistics can be calculated for time intervals that are multiples of one minute (60 seconds). For high-resolution custom metrics, statistics can be computed for time periods between one second and three hours.
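For reference, the five statistics relate to the raw samples in a period as shown in this small sketch. The latency values are hypothetical and the function is an illustration, not a CloudWatch API.

```python
def summarize(values):
    """Compute the five basic statistics for the samples in one period."""
    if not values:
        raise ValueError("a period needs at least one sample")
    return {
        "SampleCount": len(values),
        "Sum": sum(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "Average": sum(values) / len(values),
    }


# Hypothetical request latencies (ms) reported within one 60-second period.
stats = summarize([120, 80, 100, 140])
print(stats)
```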

Amazon CloudWatch Application Insights for .NET and SQL Server is a capability that you can use to easily monitor your .NET and SQL Server applications. It helps identify and set up key metrics and logs across your application resources and technology stack, i.e. database, web (IIS) and application servers, OS, load balancers, queues, etc. It constantly monitors these telemetry data to detect and correlate anomalies and errors, to notify you of any problems in your application. To aid in troubleshooting, it creates automatic dashboards to visualize problems it detects which includes correlated metric anomalies and log errors, along with additional insights to point you to their potential root cause.

Automatically recognize application metrics and logs: It scans your application resources, provides a list of recommended metrics and logs to monitor, and sets them up automatically, making it easier to set up monitoring for your applications. 

Intelligent problem detection: It uses built-in rules and machine learning algorithms to dynamically monitor and analyze symptoms of a problem across your application stack and detect application problems. It helps you reduce the overhead of dealing with individual metric spikes, events, or log exceptions, and instead get notified of real problems, along with contextual information about these problems.

Faster troubleshooting: It assesses the detected problems to give you insights on them, such as the possible root cause of the detected problem and list of metrics and logs impacted because of the problem. You can provide feedback on generated insights to make the problem detection engine specific to your use case.

On-board application: Specify the application you want to monitor by choosing the AWS Resource Group associated with it.

Identify application components: It analyzes your application resources to identify application components (standalone resources, or groups of related resources such as auto scaling groups and load balancer groups). You can also customize components by grouping resources for better insights and easy onboarding.

Enable monitoring: For your application components, you can specify the technology tier i.e. IIS front-end, .NET worker tier, etc. Based on your selection it provides a recommended set of metrics and logs that can be customized based on your needs. Once you save these “monitors”, Application Insights for .NET and SQL Server sets up CloudWatch to collect these on your behalf.

Once onboarded, Application Insights for .NET and SQL Server uses a combination of pre-built rules and machine learning models to start identifying application problems. It creates automated dashboards on CloudWatch with the list of problems detected, and a detailed view for these problems along with related anomalies and errors.

CloudWatch Metric Streams is a feature that enables you to continuously stream CloudWatch metrics to a destination of your choice with minimal setup and configuration. It is a fully managed solution, and doesn’t require you to write any code or maintain any infrastructure. With a few clicks, you can configure a metric stream to destinations like Amazon Simple Storage Service (S3). You can also send your metrics to a selection of third-party service providers to keep your operational dashboards up to date.

Metric Streams provides an alternative way of obtaining metrics data from CloudWatch without the need to poll APIs. You can create a metric stream with just a few clicks, and your metrics data will start to flow to your destination. You can easily direct your metrics to your data lake on AWS, such as on Amazon S3, and start analyzing usage or performance with tools such as Amazon Athena. Metric Streams also makes it easier to send CloudWatch metrics to popular third-party service providers using an Amazon Kinesis Data Firehose HTTP endpoint. You can create a continuous, scalable stream including the most up-to-date CloudWatch metrics data to power dashboards, alarms, and other tools that rely on accurate and timely metric data.

You can create and manage Metric Streams through the CloudWatch console, or provision and configure them programmatically through the CloudWatch API, AWS SDK, AWS CLI, or AWS CloudFormation. You can also use AWS CloudFormation templates provided by third-party service providers to set up Metric Streams delivery to destinations outside AWS. For more information, see the documentation on CloudWatch Metric Streams.

Yes. It is possible to choose to send all metrics by default, or create filter rules to include and exclude groups of metrics defined by namespace, e.g. AWS/EC2. Metric Streams automatically detects new metrics matching filter rules and includes metric updates in the stream. When resources are terminated, Metric Streams will automatically stop sending updates for the inactive metrics.
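The filter behavior can be sketched as follows. The first helper builds namespace filter settings using the IncludeFilters and ExcludeFilters field names from the metric stream API; the second is a local approximation of how those filters select metrics. Both helper names are assumptions for illustration, and the sketch treats include and exclude lists as mutually exclusive.

```python
def metric_stream_filters(include=None, exclude=None):
    """Build namespace filter settings for a metric stream.

    With no filters, every metric is streamed. Include and exclude lists are
    treated as mutually exclusive here.
    """
    if include and exclude:
        raise ValueError("use either include or exclude filters, not both")
    settings = {}
    if include:
        settings["IncludeFilters"] = [{"Namespace": ns} for ns in include]
    if exclude:
        settings["ExcludeFilters"] = [{"Namespace": ns} for ns in exclude]
    return settings


def is_streamed(namespace, settings):
    """Decide locally whether a metric namespace passes the filters."""
    included = [f["Namespace"] for f in settings.get("IncludeFilters", [])]
    excluded = [f["Namespace"] for f in settings.get("ExcludeFilters", [])]
    if included:
        return namespace in included
    return namespace not in excluded


only_ec2 = metric_stream_filters(include=["AWS/EC2"])
print(is_streamed("AWS/EC2", only_ec2))  # True
print(is_streamed("AWS/RDS", only_ec2))  # False
```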

Metric Streams can output in either OpenTelemetry or JSON format. You can select the output format when creating or managing metric streams.

Yes. You can visit the monitoring section of the Metric Streams console page. You will see automatic dashboards for the volume of metric updates over time. These metrics are also available under the AWS/CloudWatch namespace and can be used to create alarms to send notifications in the case of an unusual spike in volume.

Log monitoring

CloudWatch Logs lets you monitor and troubleshoot your systems and applications using your existing system, application and custom log files.

With CloudWatch Logs, you can monitor your logs, in near real time, for specific phrases, values or patterns. For example, you could set an alarm on the number of errors that occur in your system logs or view graphs of latency of web requests from your application logs. You can then view the original log data to see the source of the problem. Log data can be stored and accessed for as long as you need in highly durable, low-cost storage so you don’t have to worry about filling up hard drives.

Amazon CloudWatch Vended logs are logs that are natively published by AWS services on behalf of the customer. VPC Flow logs is the first Vended log type that will benefit from this tiered model. However, more AWS Service log types will be added to Vended Log type in the future.

Please refer to Regional Products and Services for details of CloudWatch Logs service availability by region.

Please see our pricing page for the latest information.

CloudWatch Logs is capable of monitoring and storing your logs to help you better understand and operate your systems and applications. When you use CloudWatch Logs with your logs, your existing log data is used for monitoring, so no code changes are required. Here are two examples of what you can do with Amazon CloudWatch and your logs:

Real time Application and System Monitoring: You can use CloudWatch Logs to monitor applications and systems using log data in near real time. For example, CloudWatch Logs can track the number of errors that occur in your application logs and send you a notification whenever the rate of errors exceeds a threshold you specify. Amazon CloudWatch uses your log data for monitoring and consequently it doesn't involve any code changes from you.

Long Term Log Retention: You can use CloudWatch Logs to store your log data for as long as you need in highly durable and cost-effective storage without worrying about hard drives running out of space. The CloudWatch Logs Agent makes it easy to quickly move both rotated and non-rotated log files off of a host and into the log service. You can then access the raw log event data when you need it.

You can configure the EC2Config service to send a variety of data and log files to CloudWatch including: custom text logs, Event (Application, Custom, Security, System) logs, Event Tracing (ETW) logs, and Performance Counter (PCW) data. Learn more about the EC2Config service here.

The CloudWatch Logs Agent will send log data every five seconds by default and is configurable by the user.

CloudWatch Logs can ingest, aggregate and monitor any text based common log data or JSON-formatted logs.

The CloudWatch Logs Agent will record an error if it has been configured to report non-text log data. This error is recorded in /var/log/awslogs.log.

You can monitor log events as they are sent to CloudWatch Logs by creating Metric Filters. Metric Filters turn log data into Amazon CloudWatch Metrics for graphing or alarming. Metric Filters can be created in the Console or the CLI. Metric Filters search for and match terms, phrases or values in your log events. When a Metric Filter finds one of the terms, phrases or values in your log events, it counts it in an Amazon CloudWatch Metric that you choose. For example, you can create a Metric Filter to search for and count the occurrence of the word “Error” in your log events. Metric Filters can also extract values from space delimited log events, such as the latency of web requests. You can also use conditional operators and wildcards to create exact matches. The Amazon CloudWatch Console can help you test your patterns before creating Metric Filters.

A Metric Filter pattern can contain search terms or a specification of your common log or JSON event format.

For example, if you want to search for the term Error, the pattern for the metric filter would just be the term Error. Multiple search terms can be included to search for multiple terms. For example, if you wanted to count events which contained the terms Error and Exception you would use the pattern Error Exception. If you wanted to match the term Error Exception exactly, you would put double quotes around the search term, "Error Exception". You can specify as many search terms as you like.
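The term-matching rules above can be approximated locally as shown below. This is a simplified sketch for building intuition and is not the CloudWatch Logs matching engine: it uses plain substring checks, and the function name and sample events are assumptions.

```python
import shlex


def matches(pattern, event):
    """Approximate term matching: every term must appear in the event.

    Unquoted terms are matched independently; a double-quoted term must
    appear as an exact phrase. Matching is case sensitive.
    """
    terms = shlex.split(pattern)
    return all(term in event for term in terms)


event = "Exception raised after Error in worker"
print(matches("Error", event))                # True
print(matches("Error Exception", event))      # True, both terms are present
print(matches('"Error Exception"', event))    # False, the exact phrase is absent
```

For real filters, the console's pattern testing remains the authoritative check.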

CloudWatch Logs can also be used to extract values from a log event in common log or JSON format. For example, you could track the bytes transferred from your Apache access logs. You can also use conditional operators and wildcards to match and extract the data you are interested in. To use the extraction feature of Metric Filters, log events must be space delimited and use a starting and ending double quote ("), or a starting square bracket ([) and a closing square bracket (]), to enclose fields. Alternatively, they can be JSON-formatted log events. For the full details of the syntax and examples, please see the Developer Guide for Metric Filters.
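To illustrate how a space-delimited event breaks into fields, the sketch below splits an Apache-style access log line while keeping quoted and bracketed fields intact, then reads the bytes transferred. The regular expression, function name, and sample line are assumptions for illustration; Metric Filters perform this extraction for you with their own pattern syntax.

```python
import re

# A field is either [bracketed text], "quoted text", or a run of non-space characters.
FIELD = re.compile(r'\[([^\]]*)\]|"([^"]*)"|(\S+)')


def split_fields(event):
    """Split a space-delimited log event into fields, honoring quotes and brackets."""
    return [
        next(g for g in m.groups() if g is not None)
        for m in FIELD.finditer(event)
    ]


# Sample line in Apache common log format (values are illustrative).
line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
ip, ident, user, timestamp, request, status, size = split_fields(line)
bytes_transferred = int(size)
print(request, bytes_transferred)
```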

CloudWatch Logs lets you test the Metric Filter patterns you want before you create a Metric Filter. You can test your patterns against your own log data that is already in CloudWatch Logs or you can supply your own log events to test. Testing your pattern will show you which log events matched the Metric Filter pattern and, if extracting values, what the extracted value is in the test data. Metric Filter testing is available for use in the console and the CLI.

Amazon CloudWatch Metric Filters do not support regular expressions. To process your log data with regular expressions, consider using Amazon Kinesis and connecting the stream to a regular expression processing engine.

Log management

You can retrieve any of your log data using the CloudWatch Logs console or through the CloudWatch Logs CLI. Log events are retrieved based on the Log Group, Log Stream and time with which they are associated. The CloudWatch Logs API for retrieving log events is GetLogEvents.

You can use the CLI to retrieve your log events and search through them using command line grep or similar search functions.
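
The same retrieve-then-search workflow can be scripted. The sketch below assumes a client object that behaves like `boto3.client("logs")`, pages through GetLogEvents for one log stream, and keeps the events whose message matches a pattern, much like piping CLI output through grep. The function name `grep_log_events` and its parameters are hypothetical.

```python
import re


def grep_log_events(client, log_group, log_stream, pattern, start_ms, end_ms):
    """Fetch events from one log stream and keep those matching `pattern`.

    `client` is assumed to behave like boto3.client("logs").
    Timestamps are milliseconds since the Unix epoch.
    """
    regex = re.compile(pattern)
    matches = []
    token = None
    while True:
        kwargs = dict(
            logGroupName=log_group,
            logStreamName=log_stream,
            startTime=start_ms,
            endTime=end_ms,
            startFromHead=True,  # read oldest events first
        )
        if token:
            kwargs["nextToken"] = token
        page = client.get_log_events(**kwargs)
        matches.extend(e for e in page["events"] if regex.search(e["message"]))
        # The same forward token comes back once the end of the stream is reached.
        if page.get("nextForwardToken") in (None, token):
            break
        token = page["nextForwardToken"]
    return matches
```

With real credentials, a call might look like `grep_log_events(boto3.client("logs"), "my-group", "my-stream", r"ERROR", start_ms, end_ms)`, where the group and stream names are placeholders.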

You can store your log data in CloudWatch Logs for as long as you want. By default, CloudWatch Logs will store your log data indefinitely. You can change the retention for each Log Group at any time.
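
As a sketch of changing retention programmatically, the helpers below wrap the PutRetentionPolicy and DeleteRetentionPolicy operations. They assume a client that behaves like `boto3.client("logs")`; the helper names and the log group name in the usage note are hypothetical. The service accepts only certain retention values, so check the API reference for the allowed list.

```python
def set_retention(client, log_group: str, days: int) -> None:
    """Keep events in `log_group` for `days` days.

    `client` is assumed to behave like boto3.client("logs"). Only specific
    values of `days` are accepted by the service.
    """
    client.put_retention_policy(logGroupName=log_group, retentionInDays=days)


def clear_retention(client, log_group: str) -> None:
    """Remove the retention setting so events are kept indefinitely (the default)."""
    client.delete_retention_policy(logGroupName=log_group)
```

For example, `set_retention(boto3.client("logs"), "/my/app", 30)` would expire events after 30 days.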

Amazon CloudWatch Logs Standard is one of two log classes offered by CloudWatch. Logs Standard delivers comprehensive log management intended for real-time monitoring and advanced analytics capabilities like Live Tail, metric extraction, alarming, and data protection. You can monitor your logs, in near real time, for specific phrases, values or patterns. For example, you could set an alarm on the number of errors that occur in your system logs or view graphs of latency of web requests from your application logs. You can then view the original log data to see the source of the problem.

Amazon CloudWatch Logs Infrequent Access (Logs-IA) is one of two log classes offered by CloudWatch. Logs-IA is purpose-built for consolidating all your logs natively on AWS. It offers the managed ingestion, cross-account log analytics, and encryption of CloudWatch Logs Standard, with a low per-GB ingestion price. This combination of tailored capabilities and low cost makes CloudWatch Logs-IA ideal for ad-hoc querying and after-the-fact forensic analysis. Log data can be stored and accessed indefinitely in highly durable, low-cost storage so you don’t have to worry about filling up hard drives.

Amazon CloudWatch Logs Infrequent Access (Logs-IA) is available in all AWS Regions where CloudWatch Logs is available. You can get started in the console or programmatically via the AWS CLI or APIs.

Log analytics

To access Logs Insights, your IAM policy must include permissions for logs:DescribeLogGroups and logs:FilterLogEvents.
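
A minimal IAM policy granting the two permissions named above might look like the following sketch, built as a Python dictionary and serialized to JSON. The `"Resource": "*"` scope is an assumption made for brevity; you may want to narrow it in practice.

```python
import json

# Minimal policy with the two actions named above.
logs_insights_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["logs:DescribeLogGroups", "logs:FilterLogEvents"],
            "Resource": "*",  # assumption: broad scope, for illustration only
        }
    ],
}

policy_json = json.dumps(logs_insights_policy, indent=2)
```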

You can use Logs Insights to query all logs being sent to CloudWatch. Logs Insights automatically discovers log fields in logs from AWS services such as Lambda, CloudTrail, Route53, and VPC Flow Logs, and in any application log that generates log events in JSON format. Additionally, for all log types sent to CloudWatch, it generates 3 system fields: @message, @logStream, and @timestamp. @message contains the raw unparsed log event, @logStream contains the name of the source that generated the log event, and @timestamp contains the time at which the log event was added to CloudWatch.

With CloudWatch Logs Insights, you can interactively search and analyze your log data in Amazon CloudWatch Logs. You can perform queries to help you more efficiently and effectively respond to operational issues.

CloudWatch Logs Insights supports three query languages that you can use for your queries:

  • A purpose-built Logs Insights query language (Logs Insights QL) with a few simple but powerful commands. You can write commands to retrieve one or more log fields, find log events that match one or more search criteria, aggregate your log data, and extract ephemeral fields from your text-based logs.
  • OpenSearch Service Piped Processing Language (PPL). OpenSearch PPL enables you to analyze your logs using a set of commands delimited by pipes (|). With PPL, you can query and analyze data using piped-together commands, which makes complex queries easier to understand and compose. PPL provides commands to filter and aggregate data, and a rich set of math, string, date, and conditional functions for analysis.
  • OpenSearch Service Structured Query Language (SQL). With OpenSearch SQL queries you can analyze your logs in a declarative manner. With OpenSearch SQL, you can use commands such as SELECT, FROM, WHERE, GROUP BY, HAVING, and various other Spark SQL commands and functions. You can execute JOINs across log groups, correlate data using subqueries, and use a rich set of JSON, mathematical, string, conditional, and other Spark SQL functions to analyze your logs.

Logs Insights offers in-product help in the form of sample queries, command descriptions, and query auto-completion to help you get started. You can find additional details about the query language here.

The service limits are documented here.

Logs Insights QL is available in all regions where CloudWatch Logs is available. OpenSearch PPL and OpenSearch SQL are available in regions where OpenSearch Direct Query Service is available.

You can write queries containing aggregations, filters, regular expressions, and text searches. You can also extract data from log events to create ephemeral fields, which can be further processed by the query language to help you access the information you are looking for. The query language supports string, numeric, and mathematical functions, such as concat, strlen, trim, log, and sqrt, among others. You can also use boolean and logical expressions, and aggregate functions such as min, max, sum, average, and percentile, among others. You can find additional details about the query language and supported functions here.
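
As an illustration of a query that combines a regular-expression filter with an aggregation, the sketch below counts error messages in five-minute bins and runs the query through the StartQuery and GetQueryResults operations. It assumes a client that behaves like `boto3.client("logs")`; the query text, the function name `run_insights_query`, and its parameters are examples, not a prescribed approach.

```python
import time

# Example Logs Insights QL query: case-insensitive match on "error",
# counted per 5-minute bin.
QUERY = """
fields @timestamp, @message
| filter @message like /(?i)error/
| stats count(*) as error_count by bin(5m)
"""


def run_insights_query(client, log_group, query, start_s, end_s, poll_seconds=1.0):
    """Run a Logs Insights query and return its rows as plain dicts.

    `client` is assumed to behave like boto3.client("logs").
    `start_s` and `end_s` are seconds since the Unix epoch.
    """
    query_id = client.start_query(
        logGroupName=log_group,
        startTime=start_s,
        endTime=end_s,
        queryString=query,
    )["queryId"]

    # Poll until the query leaves the Scheduled/Running states.
    while True:
        response = client.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)

    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")

    # Each row is a list of {"field": ..., "value": ...} pairs.
    return [{cell["field"]: cell["value"] for cell in row} for row in response["results"]]
```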

You can find a list of query commands here. You can find a list of supported functions here.

You can use visualizations to identify trends and patterns that occur over time within your logs. Logs Insights supports visualizing data using line charts, stacked area charts, bar charts and pie charts. It generates visualizations for all queries containing one or more aggregate functions, where data is grouped over a time interval or specific fields. You can find additional details about visualizing data here.

In addition to visualizing the results of a Logs Insights query, customers can create out-of-the-box OpenSearch dashboards for vended logs such as VPC, CloudTrail and WAF. These dashboards are backed by OpenSearch indexes and require customers to explicitly opt in, because OpenSearch Serverless instances are created as part of this integration and incur cost.

  • VPC Flow Logs Dashboard: This dashboard captures network flow data for Virtual Private Cloud. It’s designed to help customers analyze network traffic, detect unusual patterns, and monitor resource usage. Currently only VPC v2 fields (default format) are supported. Custom formatted fields are not supported.
  • CloudTrail Dashboard: This dashboard provides an overview of API activity within an AWS environment using CloudTrail logs. It’s useful for monitoring API activity, auditing actions, and identifying potential security or compliance issues.
  • WAF Dashboard: This dashboard provides insights into web traffic being monitored by AWS WAF (Web Application Firewall). It helps in identifying traffic patterns, blocked requests, and potential threats from specific regions or IPs.

For OpenSearch pricing and free trial, please see the CloudWatch Logs pricing section.

You can use Java-style regular expressions with Logs Insights. Regular expressions can be used in the filter command. You can find examples of queries with regular expressions using the in-product help or here.

You can use backticks to escape special characters. Log field names that contain characters other than alphanumeric characters, @, and . require escaping with backticks.
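
When queries are assembled programmatically, the escaping rule above can be applied with a small helper. This is a minimal sketch that follows the rule exactly as stated here (alphanumeric characters, @, and . need no escaping); the function name `escape_field` is hypothetical, and field names that themselves contain a backtick are not handled.

```python
import re

# Characters that need no escaping, per the rule stated above.
_SAFE_FIELD = re.compile(r"[A-Za-z0-9@.]+")


def escape_field(name: str) -> str:
    """Wrap a log field name in backticks unless it is safe to use as-is."""
    if _SAFE_FIELD.fullmatch(name):
        return name
    return f"`{name}`"


# Build a `fields` command from a mix of safe and unsafe names.
query = "fields " + ", ".join(
    escape_field(f) for f in ["@timestamp", "detail.status", "user-agent"]
)
```

The resulting `query` is ``fields @timestamp, detail.status, `user-agent` ``.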

System fields generated by Logs Insights begin with @. Logs Insights currently generates 3 system fields: @message, which contains the raw, unparsed log event as sent to CloudWatch; @logStream, which contains the name of the source that generated the log event; and @timestamp, which contains the time when the log event was added to CloudWatch.

Logs Insights enables you to query log data that was added to CloudWatch Logs on or after November 5, 2018.

You can search for log events from a specific log stream by adding the query command filter @logStream = "log_stream_name" to your log query.

CloudWatch Logs already supports integration options with other AWS Services such as Amazon Kinesis, Amazon Kinesis Data Firehose, Amazon Elasticsearch and AWS Partner ISV solutions such as Splunk, Sumo Logic, and DataDog, among others, to provide you with choice and flexibility across all environments, for your custom log processing, enrichment, analytics, and visualization needs. In addition, the query capabilities of CloudWatch Logs Insights are available for programmatic access through the AWS SDK, to facilitate AWS ISV Partners to build deeper integrations, advanced analytics, and additional value on top of CloudWatch Logs Insights.

ISV Partner integrations with CloudWatch Logs Insights enable you to bring your log data into one place and analyze it using the tools and frameworks of your choice in a high-performance, cost-effective way, without having to move large amounts of data. They also give you faster access to your logs by removing the associated data transfer latencies, and they eliminate the operational complexity of configuring and maintaining certain data transfers.

Logs anomaly detection

Powered by AI/ML, Amazon CloudWatch Logs Anomaly Detection is an automated logs analytics feature that clusters related logs to accelerate log investigation, compares your logs over time to surface key insights, and monitors your logs and notifies you when unusual behavior occurs for faster remediation. Using advanced algorithms, CloudWatch can automatically detect unusual patterns and changes in your application logs, alerting you to potential problems. You no longer need to update queries or filters every time your logs change. With Logs Anomaly Detection, you can catch emerging errors and spikes in log messages early before they impact you, identify new issues without needing to know specifics in advance, get alerted to unusual activity without needing to configure parameters, and continuously monitor your most important logs. By surfacing potential issues proactively, CloudWatch Logs Anomaly Detection helps you stay ahead of problems and deliver reliable performance.
 

Amazon CloudWatch Logs Anomaly Detection helps to automatically detect unusual behavior in your application logs. While tools such as Metric Filters enable you to monitor for specific, known variables, anomaly detection can identify previously unknown conditions such as a newly occurring error code in your logs or a sudden increase in a particular log message. Logs Anomaly Detection flexibly evolves with your application logs over time and does not require you to define complicated configuration parameters such as query or filter syntax. Logs Anomaly Detection provides an additional degree of assurance for your most critical application log groups.

Amazon CloudWatch Logs Anomaly Detection does not require a specific format of logs to work. The feature uses Machine Learning to flexibly parse your logs. CloudWatch Logs Anomaly Detection is most suited for application logs, such as those generated from application code running in EC2, EKS, ECS, Lambda, and other resources for running application code.

Amazon DevOps Guru offers an anomaly detection capability that is purpose-built for specific application sources such as Lambda. Amazon CloudWatch Logs Anomaly Detection is a solution that works with any application log. CloudWatch Logs Anomaly Detection is available within the CloudWatch console.

Logs Live Tail

Amazon CloudWatch Logs Live Tail is a new interactive analytics capability that provides you a real-time view of your incoming logs. Using Live Tail you can quickly troubleshoot issues: developers can use a streaming view of their logs to debug their code, and IT engineers can reliably monitor the status of their deployments. Live Tail delivers a real-time interactive view of logs in the context of related events to help reduce mean time to detection and, in turn, mean time to resolution.

You should use the interactive CloudWatch Live Tail capability for out-of-the-box detection of application or deployment issues within your native AWS Observability tools. Live Tail allows DevOps teams to gain deep visibility into critical application logs and debug code from within their development environment without having to switch between multiple tools. Using Live Tail to monitor the status and health of deployments, IT engineers, operations support, and central security teams can efficiently monitor their services and applications to expedite root-cause analysis and reduce mean time to resolution.

In addition to providing Live Tail capabilities on custom application logs, Live Tail also helps customers gain deep insights on logs from AWS services including Amazon Virtual Private Cloud, Amazon Route53, AWS Lambda, Amazon Elastic Kubernetes Service, Amazon Elastic Container Service, and more. Using the Live Tail widget, AWS services can embed the same interactive live tailing experience into their consoles. Additionally, direct integration can be implemented by other services (such as Amazon Managed Grafana and AWS Thinkbox) to provide the same deep-dive analytics capabilities to you from within your own console, for any application log that generates log events.

For this feature to work as intended, the following operations must be allowed for users. When starting a Live Tail session, if you are not part of the Admin Role and do not have a policy that includes logs:*, make sure to add the following actions to your policy statement: logs:StartLiveTail and logs:StopLiveTail.
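
A quick local check of whether a policy document already covers these two actions, either explicitly or through a wildcard such as logs:*, can be sketched as follows. This is a deliberately simplified model: it looks only at Allow statements and ignores Deny statements, conditions, and resource scoping, so it is not a substitute for IAM's own evaluation. The function name `missing_actions` is hypothetical.

```python
from fnmatch import fnmatchcase

REQUIRED_ACTIONS = ["logs:StartLiveTail", "logs:StopLiveTail"]


def missing_actions(policy: dict, required=REQUIRED_ACTIONS) -> list:
    """Return the required actions that no Allow statement in `policy` grants.

    Simplification: Deny statements, conditions, and resources are ignored.
    """
    allowed = []
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        allowed += [actions] if isinstance(actions, str) else list(actions)

    # Compare case-insensitively and honor wildcards such as "logs:*".
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), pattern.lower()) for pattern in allowed)
    ]


wildcard_policy = {"Statement": [{"Effect": "Allow", "Action": "logs:*", "Resource": "*"}]}
partial_policy = {
    "Statement": [{"Effect": "Allow", "Action": ["logs:StartLiveTail"], "Resource": "*"}]
}
```

Here `missing_actions(wildcard_policy)` returns an empty list, while `missing_actions(partial_policy)` reports that logs:StopLiveTail still needs to be added.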

Learn more about Live Tail service limits.

Live Tail is available in the US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), EU (Paris), and South America (São Paulo) Regions.

You can filter based on Log Groups, Log Streams, and keywords. The log groups selection supports multiple selections across multiple accounts when in the monitoring account (cross-account observability). The log streams selection supports multiple selections based on name or a prefix. Filtering by keywords is case sensitive. One or more keywords (for instance, error, exception, or fault) can be entered to further narrow the focus of the search. You can type in the keywords or copy and paste from the samples provided in the Info panel. Learn more about filter patterns.
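
The same filters can be applied when starting a Live Tail session programmatically. The sketch below assumes a client that behaves like `boto3.client("logs")` and its StartLiveTail operation, which returns an event stream under `responseStream`. The function name `tail_logs`, its parameters, and the cap on collected events are assumptions for illustration.

```python
def tail_logs(client, log_group_arns, keyword, stream_prefixes=None, max_events=10):
    """Collect up to `max_events` messages from a Live Tail session.

    `client` is assumed to behave like boto3.client("logs").
    `keyword` is used as the (case-sensitive) filter pattern.
    """
    kwargs = dict(logGroupIdentifiers=log_group_arns, logEventFilterPattern=keyword)
    if stream_prefixes:
        kwargs["logStreamNamePrefixes"] = stream_prefixes

    stream = client.start_live_tail(**kwargs)["responseStream"]
    messages = []
    try:
        for event in stream:
            # Session updates carry batches of log events that matched the filter.
            for log_event in event.get("sessionUpdate", {}).get("sessionResults", []):
                messages.append(log_event["message"])
                if len(messages) >= max_events:
                    return messages
    finally:
        # Close the stream if the object supports it, ending the session.
        close = getattr(stream, "close", None)
        if callable(close):
            close()
    return messages
```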

No. Live Tail provides a real-time view of the log data collected by CloudWatch. For historical logs, refer to the Logs Insights and Log Groups features.

Logs data protection

Data protection is a feature in CloudWatch Logs that allows you to define your own rules and policies to automatically detect and mask sensitive data within logs that are collected from your systems and applications. This is done using machine learning (ML) and pattern matching. Data can be viewed unmasked with elevated Identity and Access Management (IAM) privileges.

To prevent logging sensitive data, customers sometimes rely on manual investigation or configure short log retention policies to delete logs, which runs the risk of losing valuable operational logs. CloudWatch Logs data protection automatically identifies and masks sensitive information in your logs using pattern matching and ML without requiring that anybody access them. This feature is useful for industries under strict regulations that need to make sure that no personal information gets stored. Also, customers building payment or authentication services where a lot of personal and sensitive information is required can use this new feature to reduce the probability of unneeded information getting stored in their logs.

When you create the data protection policy in CloudWatch Logs, you can specify the data you want to protect. There are many data identifiers for you to choose from, such as email addresses, driver’s licenses from many countries, credit card numbers, addresses, and more. The variety of targeted data identifiers provides the flexibility to choose what sensitive data is used by your applications and mask the sensitive data that does not need to be easily accessible. It is important that you decide what information is sensitive to your application and select the relevant identifiers for your use cases.
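
To make the idea of a data identifier concrete, the sketch below masks email addresses in a log message with a regular expression. It only illustrates pattern-based masking in general; it is not how CloudWatch Logs detects sensitive data, and the pattern, sample message, and function name `mask_emails` are assumptions.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(message: str) -> str:
    """Replace each email address with asterisks of the same length."""
    return EMAIL_RE.sub(lambda match: "*" * len(match.group()), message)


masked = mask_emails("login ok for jane.doe@example.com from 10.0.0.5")
```

After masking, the address is no longer readable in `masked`, while the rest of the message is unchanged.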

Alarms

You can create an alarm to monitor any Amazon CloudWatch metric in your account. For example, you can create alarms on an Amazon EC2 instance CPU utilization, Amazon ELB request latency, Amazon DynamoDB table throughput, Amazon SQS queue length, or even the charges on your AWS bill.

You can also create an alarm on custom metrics that are specific to your custom applications or infrastructure. If the custom metric is a high-resolution metric, you have the option of creating high-resolution alarms that evaluate over periods as short as 10 seconds or 30 seconds.

With composite alarms, you can combine multiple alarms into alarm hierarchies. This reduces alarm noise by triggering just once when multiple alarms fire at the same time. You can provide an overall state for a grouping of resources like an application, AWS Region, or Availability Zone.

Please reference the CloudWatch pricing page to learn more.

When you create an alarm, you can configure it to perform one or more automated actions when the metric you chose to monitor exceeds a threshold you define. For example, you can set an alarm that sends you an email, publishes to an SQS queue, stops or terminates an Amazon EC2 instance, or executes an Auto Scaling policy. Since Amazon CloudWatch alarms are integrated with Amazon Simple Notification Service, you can also use any notification type supported by SNS. You can use the AWS Systems Manager OpsCenter action to automatically create an OpsItem when an alarm enters the ALARM state. This helps you to quickly diagnose and remediate issues with AWS resources from a single console.

When you create an alarm, you first choose the Amazon CloudWatch metric you want it to monitor. Next, you choose the evaluation period (e.g., five minutes or one hour) and a statistical value to measure (e.g., Average or Maximum). To set a threshold, set a target value and choose whether the alarm will trigger when the value is greater than (>), greater than or equal to (>=), less than (<), or less than or equal to (<=) that value.
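
The steps above map onto the parameters of the PutMetricAlarm operation. The sketch below assumes a client that behaves like `boto3.client("cloudwatch")` and creates an alarm on EC2 CPU utilization; the function name `create_cpu_alarm`, the alarm naming scheme, and the default values are assumptions for illustration.

```python
# Map the comparison symbols to the operator names used by the API.
COMPARISON_OPERATORS = {
    ">": "GreaterThanThreshold",
    ">=": "GreaterThanOrEqualToThreshold",
    "<": "LessThanThreshold",
    "<=": "LessThanOrEqualToThreshold",
}


def create_cpu_alarm(client, instance_id, threshold, comparison=">",
                     period_seconds=300, statistic="Average"):
    """Create an alarm on the CPU utilization of one EC2 instance.

    `client` is assumed to behave like boto3.client("cloudwatch").
    """
    client.put_metric_alarm(
        AlarmName=f"cpu-{instance_id}",  # hypothetical naming scheme
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        Statistic=statistic,             # e.g., Average or Maximum
        Period=period_seconds,           # evaluation period, e.g., five minutes
        EvaluationPeriods=1,
        Threshold=threshold,
        ComparisonOperator=COMPARISON_OPERATORS[comparison],
    )
```

For example, `create_cpu_alarm(boto3.client("cloudwatch"), "i-0123456789abcdef0", 80.0, ">=")` would alarm when the five-minute average CPU utilization is at or above 80 percent; the instance ID is a placeholder.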

Alarms continue to evaluate metrics against your chosen threshold, even after they have already triggered. This allows you to view their current, up-to-date state at any time. You may notice that one of your alarms stays in the ALARM state for a long time. If your metric value is still in breach of your threshold, the alarm will remain in the ALARM state until it no longer breaches the threshold. This is normal behavior. If you want your alarm to treat this new level as OK, you can adjust the alarm threshold accordingly.
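
The behavior described above can be seen in a small simulation. This is a simplified model that evaluates one datapoint per period against the threshold; it ignores multi-period evaluation, missing data, and the INSUFFICIENT_DATA state, and the function name `alarm_states` is hypothetical.

```python
import operator

COMPARISONS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def alarm_states(datapoints, threshold, comparison=">"):
    """Return the alarm state after each datapoint is evaluated."""
    breaches = COMPARISONS[comparison]
    return ["ALARM" if breaches(value, threshold) else "OK" for value in datapoints]


# The alarm stays in ALARM for as long as the metric breaches the threshold.
states = alarm_states([40, 85, 90, 88, 60], threshold=80)

# Raising the threshold makes the same level count as OK.
relaxed = alarm_states([40, 85, 90, 88, 60], threshold=95)
```

With a threshold of 80, `states` is OK, ALARM, ALARM, ALARM, OK; with a threshold of 95, every period is OK.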

Alarm history is available for 14 days. To view your alarm history, log in to CloudWatch in the AWS Management Console, choose Alarms from the menu at left, select your alarm, and click the History tab in the lower panel. There you will find a history of any state changes to the alarm as well as any modifications to the alarm configuration.

Dashboards

Amazon CloudWatch Dashboards allow you to create, customize, interact with, and save graphs of AWS resources and custom metrics.

To get started, visit the Amazon CloudWatch Console and select “Dashboards”. Click the “Create Dashboard” button. You can also copy the desired view from Automatic Dashboards by clicking on Options -> “Add to Dashboard”.

Automatic Dashboards are pre-built with AWS service recommended best practices, remain resource aware, and dynamically update to reflect the latest state of important performance metrics. You can now filter and troubleshoot to a specific view without adding additional code to reflect the latest state of your AWS resources. Once you have identified the root cause of a performance issue, you can quickly act by going directly to the AWS resource.

Yes. Dashboards will auto refresh while you have them open.

Yes, a dashboard is available to anyone with the correct permissions for the account with the dashboard.

Events

Amazon CloudWatch Events (CWE) is a stream of system events describing changes in your AWS resources. The events stream augments the existing CloudWatch Metrics and Logs streams to provide a more complete picture of the health and state of your applications. You write declarative rules to associate events of interest with automated actions to be taken.

Currently, Amazon EC2, Auto Scaling, and AWS CloudTrail are supported. Via AWS CloudTrail, mutating API calls (i.e., all calls except Describe*, List*, and Get*) across all services are visible in CloudWatch Events.

When an event matches a rule you've created in the system, you can automatically invoke an AWS Lambda function, relay the event to an Amazon Kinesis stream, notify an Amazon SNS topic, or invoke a built-in workflow.

Yes. Your applications can emit custom events by using the PutEvents API, with a payload uniquely suited to your needs.
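
A custom event can be emitted with a single PutEvents call. The sketch below assumes a client that behaves like `boto3.client("events")`; the function name `emit_custom_event` and the source, detail type, and payload shown in the usage note are hypothetical.

```python
import json


def emit_custom_event(client, source, detail_type, detail: dict):
    """Send one custom event and raise if the service rejects it.

    `client` is assumed to behave like boto3.client("events").
    """
    response = client.put_events(
        Entries=[
            {
                "Source": source,
                "DetailType": detail_type,
                "Detail": json.dumps(detail),  # the payload is sent as a JSON string
            }
        ]
    )
    if response.get("FailedEntryCount", 0):
        raise RuntimeError(f"event rejected: {response['Entries']}")
    return response
```

For example, `emit_custom_event(boto3.client("events"), "com.example.orders", "OrderPlaced", {"orderId": "1234"})` would publish an event that rules can match on its source or detail type.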

CloudWatch Events is able to generate events on a schedule you set by using the popular Unix cron syntax. By monitoring for these events, you can implement a scheduled application.
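
A scheduled rule is created by passing a schedule expression to the PutRule operation. The sketch below builds the expression and creates the rule, assuming a client that behaves like `boto3.client("events")`. It relies on the six-field `cron(...)` form (minutes, hours, day of month, month, day of week, year), in which exactly one of the two day fields is `?`; treat that detail as my understanding of the syntax and confirm it against the schedule expression reference. The helper names are hypothetical.

```python
def cron_expression(minutes="0", hours="*", day_of_month="*", month="*",
                    day_of_week="?", year="*"):
    """Build a six-field cron schedule expression."""
    # Exactly one of the two day fields is expected to be "?".
    if (day_of_month == "?") == (day_of_week == "?"):
        raise ValueError("exactly one of day_of_month and day_of_week must be '?'")
    return f"cron({minutes} {hours} {day_of_month} {month} {day_of_week} {year})"


def schedule_rule(client, name, expression):
    """Create or update a scheduled rule and return its ARN.

    `client` is assumed to behave like boto3.client("events").
    """
    return client.put_rule(Name=name, ScheduleExpression=expression)["RuleArn"]


# Every day at 12:00 UTC.
daily_noon = cron_expression(minutes="0", hours="12")
```

The rule still needs a target, such as a Lambda function, before the schedule triggers any work.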

CloudWatch Events is a near real time stream of system events that describe changes to your AWS resources. With CloudWatch Events, you can define rules to monitor for specific events and perform actions in an automated manner. AWS CloudTrail is a service that records API calls for your AWS account and delivers log files containing API calls to your Amazon S3 bucket or a CloudWatch Logs log group. With AWS CloudTrail, you can look up API activity history related to creation, deletion and modification of AWS resources and troubleshoot operational or security issues.

AWS Config is a fully managed service that provides you with an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance. Config rules help you determine whether configuration changes are compliant. CloudWatch Events is for reacting in near real time to resource state changes. It doesn’t render a verdict on whether the changes comply with policy or give detailed history like Config/Config Rules do. It is a general purpose event stream.