This section introduces Amazon CloudWatch.

The SOA-C01 exam has been updated to AWS Certified SysOps Administrator – Associate (SOA-C02).

Amazon CloudWatch monitors
  • AWS resources
  • applications running on AWS
CloudWatch
  • Collects and tracks metrics for AWS resources and applications.
  • The CloudWatch home page displays metrics about every AWS service in use.
  • You can create custom dashboards to display metrics.
  • You can create alarms that watch metrics and send notifications.
  • Alarms can also automatically make changes to the monitored resources when a threshold is breached.

Access CloudWatch by

CloudWatch Namespaces

A CloudWatch namespace

  • Is a container for CloudWatch metrics.
  • Metrics in different namespaces are isolated from each other.
  • There is no default namespace.
  • You must specify a namespace for each data point you publish to CloudWatch.
  • You can specify a namespace name when you create a metric.
  • Namespace names must contain valid XML characters and be fewer than 256 characters in length.
  • Possible characters are: alphanumeric characters (0-9A-Za-z), period (.), hyphen (-), underscore (_), forward slash (/), hash (#), and colon (:).
  • AWS namespaces use the naming convention AWS/service (for example, AWS/EC2).
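
The naming rules above can be checked on the client before publishing. The following is a minimal sketch, not an official validator: the function names are hypothetical, and the character set is taken only from the list in this section, so the service itself remains the authority on what it accepts.

```python
import re

# Characters listed in this section: 0-9 A-Z a-z . - _ / # :
_NAMESPACE_RE = re.compile(r"[0-9A-Za-z.\-_/#:]+")
MAX_NAMESPACE_LENGTH = 255  # "fewer than 256 characters"


def is_valid_namespace(name: str) -> bool:
    """Return True if `name` satisfies the naming rules listed above."""
    if not name or len(name) > MAX_NAMESPACE_LENGTH:
        return False
    return _NAMESPACE_RE.fullmatch(name) is not None


def uses_aws_convention(name: str) -> bool:
    """Return True if `name` follows the AWS/service naming convention."""
    return name.startswith("AWS/") and len(name) > len("AWS/")
```

For example, a custom namespace such as MyApp/Orders (a made-up name) passes the check, while a name containing an asterisk does not.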
CloudWatch Metrics

A CloudWatch metric

  • Represents a time-ordered set of data points published to CloudWatch.
  • Is similar to a variable to monitor, with the data points as the values of that variable over time.
  • AWS services send metrics to CloudWatch.
  • You can also send custom metrics to CloudWatch.
  • You can retrieve statistics about the data points as an ordered set of time-series data.
  • Metrics exist only in the Region in which they were created.
  • Metrics cannot be deleted, but they automatically expire after 15 months if no new data is published to them.
  • Data points expire on a rolling basis; as new data points come in, data older than 15 months is dropped.
  • Metrics are uniquely defined by a name, a namespace, and zero or more dimensions.
  • Each data point in a metric has a time stamp and, optionally, a unit of measure.
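
To make the identity rule concrete, here is a small sketch that models a metric's identity as a (namespace, name, dimensions) triple. `MetricKey` and `metric_key` are hypothetical helpers for illustration, not part of any AWS SDK, and the instance ID in the usage note is made up.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Optional, Tuple


@dataclass(frozen=True)
class MetricKey:
    """Hypothetical value object: the three parts that identify a metric."""
    namespace: str
    name: str
    dimensions: FrozenSet[Tuple[str, str]] = frozenset()


def metric_key(namespace: str, name: str,
               dimensions: Optional[Mapping[str, str]] = None) -> MetricKey:
    """Build a hashable key; the order of the dimensions does not matter."""
    return MetricKey(namespace, name, frozenset((dimensions or {}).items()))
```

With this model, `metric_key("AWS/EC2", "CPUUtilization", {"InstanceId": "i-0123"})` and `metric_key("AWS/EC2", "CPUUtilization")` compare as different metrics, because their dimensions differ even though the namespace and name match.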
CloudWatch Metrics Time Stamps
  • Each metric data point must be associated with a time stamp.
  • The time stamp can be up to two weeks in the past and up to two hours into the future.
  • If no time stamp is given, CloudWatch creates one based on the time the data point was received.
  • Time stamps are dateTime objects.
  • Coordinated Universal Time (UTC) is recommended.
  • When you retrieve statistics from CloudWatch, all times are in UTC
  • CloudWatch alarms check metrics based on the current time in UTC.
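
The time stamp window can be sketched as a client-side check. This is an illustration under assumptions: `resolve_timestamp` is a hypothetical helper, and the defaulting to the current time only mimics what the list above says CloudWatch does on receipt.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_PAST = timedelta(weeks=2)    # up to two weeks in the past
MAX_FUTURE = timedelta(hours=2)  # up to two hours into the future


def resolve_timestamp(ts: Optional[datetime] = None,
                      now: Optional[datetime] = None) -> datetime:
    """Return a UTC time stamp for a data point, or raise if out of range."""
    now = now or datetime.now(timezone.utc)
    if ts is None:
        # No time stamp given: fall back to the time of receipt.
        return now
    if ts.tzinfo is None:
        raise ValueError("use a timezone-aware datetime, preferably UTC")
    ts = ts.astimezone(timezone.utc)
    if ts < now - MAX_PAST:
        raise ValueError("time stamp is more than two weeks in the past")
    if ts > now + MAX_FUTURE:
        raise ValueError("time stamp is more than two hours in the future")
    return ts
```

Passing `now` explicitly keeps the function easy to test; in normal use it can be omitted.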
CloudWatch Metrics Retention

CloudWatch retains metric data as follows:

  • Data points with a period of less than 60 seconds are available for 3 hours. These are also called high-resolution custom metrics.
  • Data points with a period of 60 seconds (1 minute) are available for 15 days.
  • Data points with a period of 300 seconds (5 minutes) are available for 63 days.
  • Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months).
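
The retention tiers can be read as a lookup from the age of a data point to the finest period at which it is still available. The sketch below encodes only the four tiers listed above; the function name is hypothetical, and it assumes a metric that was published at high resolution (a standard-resolution metric would start at 60 seconds).

```python
from datetime import timedelta
from typing import Optional

# (maximum age, finest period in seconds still retained), from the list above.
RETENTION_TIERS = [
    (timedelta(hours=3), 1),
    (timedelta(days=15), 60),
    (timedelta(days=63), 300),
    (timedelta(days=455), 3600),
]


def finest_period_available(age: timedelta) -> Optional[int]:
    """Return the smallest retained period for data of this age, or None if expired."""
    for max_age, period in RETENTION_TIERS:
        if age <= max_age:
            return period
    return None
```

For instance, data that is 30 days old can be retrieved with a 5-minute period at best, and data older than 455 days is no longer available.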
CloudWatch Dimensions

A dimension

  • is a name/value pair
  • part of the identity of a metric.
  • You can assign up to 30 dimensions to a metric (the limit was previously 10).
  • Describes a characteristic of a metric.
  • Is also used to filter the results that CloudWatch returns.
  • For a few AWS services, such as EC2, CloudWatch can aggregate data across dimensions.
  • Example – Server=Production,Domain=City01
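
As an illustration of dimensions as name/value pairs, the sketch below parses the Server/Domain example into a list of pairs and filters on them. Both helper names are hypothetical, and the matching rule (every wanted pair must be present) is an assumption made for the example, not a statement of how CloudWatch itself filters.

```python
from typing import Dict, List

Dimension = Dict[str, str]


def parse_dimensions(text: str) -> List[Dimension]:
    """Parse 'Name=Value,Name=Value' into a list of name/value pairs."""
    pairs = []
    for item in text.split(","):
        name, sep, value = item.strip().partition("=")
        if not sep or not name or not value:
            raise ValueError(f"expected Name=Value, got {item!r}")
        pairs.append({"Name": name, "Value": value})
    return pairs


def has_dimensions(metric_dimensions: List[Dimension],
                   wanted: List[Dimension]) -> bool:
    """Return True if every wanted pair is among the metric's dimensions."""
    return all(pair in metric_dimensions for pair in wanted)
```

Parsing `"Server=Production,Domain=City01"` yields two pairs, and a filter on `Server=Production` alone would match that metric.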
CloudWatch Statistics
  • Statistics are metric data aggregations over specified periods of time.
  • Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the specified time period.
Available statistics
  • Minimum – The lowest value observed during the specified period. Useful for finding periods of low activity.
  • Maximum – The highest value observed during the specified period. Useful for finding periods of high activity.
  • Sum – All values submitted for the matching metric added together. Useful for finding the total volume of activity.
  • Average – The value of Sum / SampleCount during the specified period.
  • SampleCount – The count (number) of data points used for the statistical calculation.
  • pNN.NN – The value of the specified percentile, with up to two decimal places (for example, p95.45). Not available for metrics that include negative values.
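
The basic statistics can be reproduced locally for the data points that fall in one period. This is a plain-Python sketch with a hypothetical function name; it shows the arithmetic only, including Average as Sum / SampleCount.

```python
from typing import Dict, Sequence


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Compute the basic statistics for the data points of one period."""
    if not values:
        raise ValueError("at least one data point is required")
    total = sum(values)
    count = len(values)
    return {
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": total,
        "SampleCount": count,
        "Average": total / count,  # Sum / SampleCount
    }
```

For the data points 10, 20, and 60, this gives a Sum of 90, a SampleCount of 3, and an Average of 30.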
CloudWatch Metrics Units
  • Each statistic has a unit of measure.
  • Example units – Bytes, Seconds, Count, and Percent.
  • You can specify a unit when you create a custom metric.
  • If no unit is specified, CloudWatch uses None as the unit.
  • CloudWatch attaches no significance to a unit internally.
  • Metric data points that specify a unit of measure are aggregated separately.
  • When you request statistics without specifying a unit, CloudWatch aggregates all data points of the same unit together.
CloudWatch Metrics Periods
  • Period is the length of time associated with a specific Amazon CloudWatch statistic.
  • Periods are defined in seconds; valid values are 1, 5, 10, 30, or any multiple of 60.
  • For example, for a period of six minutes, use 360 as the period value.
  • You can adjust how the data is aggregated by varying the length of the period.
  • Only custom metrics that you define with a storage resolution of 1 second support sub-minute periods.
  • To retrieve statistics, specify a period, start time, and end time.
  • The default values for the start time and end time get you the last hour’s worth of statistics.
  • For statistics aggregated over the entire hour, specify a period of 3600.
  • Aggregated statistics are stamped with the time corresponding to the beginning of the period.
  • Periods are also important for CloudWatch alarms.
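
The period rules can be sketched in two small functions: one checks that a period value is valid, the other groups data points into periods and labels each group with the start of its period. The names are hypothetical, and aligning periods to multiples of the period length since the epoch is a simplifying assumption of this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

SUB_MINUTE_PERIODS = {1, 5, 10, 30}


def is_valid_period(seconds: int) -> bool:
    """Valid periods are 1, 5, 10, 30, or any positive multiple of 60."""
    return seconds in SUB_MINUTE_PERIODS or (seconds > 0 and seconds % 60 == 0)


def sum_per_period(points: Iterable[Tuple[int, float]],
                   period: int) -> Dict[int, float]:
    """Sum (epoch_seconds, value) points per period, keyed by period start."""
    if not is_valid_period(period):
        raise ValueError(f"invalid period: {period}")
    buckets: Dict[int, float] = defaultdict(float)
    for ts, value in points:
        start = ts - (ts % period)  # beginning of the period
        buckets[start] += value
    return dict(buckets)
```

Changing `period` from 60 to 360 in a call to `sum_per_period` regroups the same data points into six-minute buckets, which is the sense in which the period controls aggregation.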
CloudWatch Metrics Aggregation

Amazon CloudWatch aggregates statistics according to the period length that you specify when retrieving statistics. You can publish as many data points as you want with the same or similar time stamps. CloudWatch aggregates them according to the specified period length. CloudWatch does not aggregate data across Regions.

You can publish data points for a metric that share not only the same time stamp, but also the same namespace and dimensions. CloudWatch returns aggregated statistics for those data points. You can also publish multiple data points for the same or different metrics, with any timestamp.

For large datasets, you can insert a pre-aggregated dataset called a statistic set. With statistic sets, you give CloudWatch the Min, Max, Sum, and SampleCount for a number of data points. This is commonly used when you need to collect data many times in a minute.
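
A statistic set can be built locally and combined with others before publishing. The sketch below uses hypothetical helper names and only builds and merges dictionaries; it does not call CloudWatch. The key names spell out the four values named above in full.

```python
from typing import Dict, Sequence

StatisticSet = Dict[str, float]


def statistic_set(values: Sequence[float]) -> StatisticSet:
    """Pre-aggregate raw samples into a statistic set."""
    if not values:
        raise ValueError("at least one sample is required")
    return {
        "SampleCount": len(values),
        "Sum": sum(values),
        "Minimum": min(values),
        "Maximum": max(values),
    }


def merge_sets(a: StatisticSet, b: StatisticSet) -> StatisticSet:
    """Combine two statistic sets without access to the raw samples."""
    return {
        "SampleCount": a["SampleCount"] + b["SampleCount"],
        "Sum": a["Sum"] + b["Sum"],
        "Minimum": min(a["Minimum"], b["Minimum"]),
        "Maximum": max(a["Maximum"], b["Maximum"]),
    }
```

Because Average is Sum / SampleCount, it can still be derived from a merged set. Percentiles cannot be derived this way, since the individual samples are gone.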

CloudWatch Percentiles
  • A percentile indicates the relative standing of a value in a dataset.
  • For example, the 95th percentile means that 95 percent of the data is lower than this value and 5 percent of the data is higher than this value.
  • Percentiles are often used to isolate anomalies.
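
To show what a percentile means in practice, here is a sketch using the nearest-rank method. This section does not say which method CloudWatch uses internally, so treat the choice of nearest-rank as an assumption made for illustration; the function name is hypothetical.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest value with at least p percent
    of the data at or below it."""
    if not values:
        raise ValueError("at least one value is required")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

Comparing a high percentile such as the 95th with the average of the same data is one way to spot anomalies: a few very slow requests raise the 95th percentile sharply while barely moving the average.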
CloudWatch Alarms
  • An alarm watches a single metric over a specified time period and performs one or more specified actions.
  • It initiates actions on your behalf.
  • The action is based on the value of the metric relative to a threshold over time.
  • An action can be a notification sent to an SNS topic or an Auto Scaling policy.
  • You can add alarms to dashboards.
  • Alarms invoke actions for sustained state changes only.
  • Always select a period greater than or equal to the frequency of the metric being monitored.
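
The idea of acting only on a sustained breach can be sketched as follows. This is a simplified model, not the service's actual evaluation logic: the function name and the `evaluation_periods` parameter are illustrative, only a "greater than" comparison is shown, and the state names OK, ALARM, and INSUFFICIENT_DATA are the ones CloudWatch alarms report.

```python
from typing import Sequence


def evaluate_alarm(period_values: Sequence[float],
                   threshold: float,
                   evaluation_periods: int) -> str:
    """Return the alarm state given one aggregated value per period.

    The alarm fires only if the metric exceeded the threshold in every
    one of the most recent `evaluation_periods` periods.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = period_values[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    if all(value > threshold for value in recent):
        return "ALARM"
    return "OK"
```

With a threshold of 80 and three evaluation periods, the values 50, 85, 90, 95 put the alarm in ALARM, while 50, 85, 70, 95 leave it in OK because the breach was not sustained.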

CloudWatch Monitoring

  • CloudWatch can monitor EC2 instances, Auto Scaling groups, ELBs, Route 53 health checks, EBS volumes, Storage Gateways, CloudFront, DynamoDB, ElastiCache nodes, RDS instances, EMR job flows, Redshift, SNS topics, SQS queues, OpsWorks, CloudWatch Logs, estimated charges on your AWS bill, and custom metrics and logs generated by your applications and services.
  • By default, EC2 instances are monitored at 5-minute intervals.
  • EC2 instances are monitored at 1-minute intervals if the ‘detailed monitoring’ option is set on the instance.
  • By default, CloudWatch monitors CPU, network, disk, and status checks.
  • RAM utilization is a custom metric and must be added manually to EC2 instances in order to be tracked.
Two types of status checks:

System Status Checks (Physical Host):

  • Checks the underlying physical host
  • Checks for loss of network connectivity
  • Checks for loss of system power
  • Checks for software issues on the physical host
  • Checks for hardware issues on the physical host
  • The best way to resolve these issues is to stop the instance and start it again (this moves it to a different physical host).
Instance Status Checks
  • Checks the VM itself
  • Checks for failed system status checks
  • Checks for misconfigured networking or startup configuration
  • Checks for exhausted memory
  • Checks for corrupted file systems
  • Checks for an incompatible kernel
