Section Menu
- Introduction to Monitoring and alerting EC2
- Monitoring EC2 Instances
- EC2 and CloudWatch Alarms
- EC2 System Checks and metrics
- Custom metrics for EC2
- Push custom metrics from EC2 to CloudWatch
- Quiz
Introduction to Monitoring and alerting EC2
Monitoring EC2
- Basic metrics (CPU Utilization, Network) are pushed every 5 minutes
- Can be increased to 1 min at extra charge
- requires more compute
- requires more storage
- Track various standard metrics
- Track custom metrics
- Automatically start, stop, terminate, reboot or recover EC2
- Can automatically detect a VM that has crashed due to some software or hardware issue and terminate that instance.
- Ensures you do not have dead servers in your pool
- Reduces charges – you’re charged for VMs as long as they are up.
- Two status checks (system and instance) report the overall status
Monitoring EC2 Instances
- Basic monitoring (5 min) enabled by default
- Detailed monitoring: 1 min
- Can be enabled from the EC2 console
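Detailed monitoring can also be switched on from the AWS CLI. The sketch below wraps the `aws ec2 monitor-instances` and `aws ec2 unmonitor-instances` commands in two helper functions. The function names are hypothetical, and the calls assume the AWS CLI is installed and configured with credentials that allow these actions.

```shell
# Hypothetical helpers; nothing is called until you invoke them.

# Turn on detailed (1-minute) monitoring for one instance.
enable_detailed_monitoring() {
  aws ec2 monitor-instances --instance-ids "$1"
}

# Return the instance to basic (5-minute) monitoring.
disable_detailed_monitoring() {
  aws ec2 unmonitor-instances --instance-ids "$1"
}

# Usage (replace with your own instance ID):
#   enable_detailed_monitoring i-0bdac641606572f86
```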
EC2 and CloudWatch Alarms
How to use alarms to manage our EC2 instances
- CloudWatch > Metrics > EC2
- Per-Instance Metrics
- Good for setting alarms for individual instances
- Aggregated by Instance Type
- Aggregate value for ALL EC2 instances of that type (t2.micro, m4.large, etc.)
- Must have Detailed Monitoring Enabled!
Per-Instance
- Filter by the EC2 instance ID
- Example: i-0bdac641606572f86
- Filter by Tags is NOT supported
Create a Per-Instance Alarm
- CloudWatch > Alarms > Filter on Instance ID > Select Metric (CPU Utilization)
- Adjust parameters (Time interval, etc.)
- [Next] > Define Alarm
- Define Alarm
- Alarm Threshold
- Name, Description, etc.
- Criteria for setting alarm (Metric >= X)
- Consecutive periods
- Actions
- Alarm state is: OK, ALARM, or INSUFFICIENT_DATA
- Notifications (Default)
- Send notifications to a topic set up in SNS
- SNS can also deliver the message to an SQS queue subscribed to the topic
- Auto Scaling
- Requires the EC2 instances to belong to an Auto Scaling group (typically a load-balanced group or cluster)
- Example: CPU Utilization >60
- EC2 Action
- Requires an IAM “EC2ActionsAccess” Role
- Recover
- Stop
- Terminate
- Reboot
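The console steps above have a rough CLI equivalent in `aws cloudwatch put-metric-alarm`. The following is a minimal sketch, not the course's own method: the function name, the alarm name pattern, the 5-minute period, and the two consecutive periods are assumptions chosen for illustration, and the SNS topic ARN must be one you have already created.

```shell
# Hypothetical helper: alarm when average CPU is >= 60% for two
# consecutive 5-minute periods, then notify an SNS topic.
#   $1 = instance ID, $2 = SNS topic ARN (both supplied by the caller)
create_cpu_alarm() {
  instance_id="$1"
  topic_arn="$2"
  aws cloudwatch put-metric-alarm \
    --alarm-name "cpu-high-${instance_id}" \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold 60 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions "$topic_arn"
}

# Usage (the topic ARN below is a placeholder):
#   create_cpu_alarm i-0bdac641606572f86 arn:aws:sns:us-east-1:123456789012:my-topic
```

An EC2 action (stop, terminate, reboot, recover) is attached the same way, by passing the corresponding action ARN to `--alarm-actions` instead of, or in addition to, the SNS topic.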
EC2 System Checks and metrics
Different types of Status checks
- Status checks performed Every Minute
- Pass
- All checks must pass for this status to be returned
- Fail
- System Status Checks
- Monitor systems for the underlying infrastructure – Actual Hardware
- Detects problems that require AWS involvement
- Loss of network connectivity
- Loss of system power
- Software issues on the physical host
- Hardware issues on the physical host
- You CANNOT do anything to ‘fix’ these problems
- You CAN stop and start an EBS-backed VM, or terminate and replace it, which moves it to a new host
- Instance Status Checks
- Issues that might prevent your VM from running applications
- Incorrect networking or startup configurations
- Exhausted Memory
- Corrupt file system
- Incompatible kernel
- Usually can be fixed by rebooting or making configuration changes
Demo
- You can view the status of these system checks in the EC2 console
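The same two checks can be read from the command line with `aws ec2 describe-instance-status`. This is a small sketch with a hypothetical function name; it assumes a configured AWS CLI and prints one line per instance with the instance ID, the system status, and the instance status.

```shell
# Hypothetical helper: show system and instance status checks for one instance.
show_status_checks() {
  aws ec2 describe-instance-status \
    --instance-ids "$1" \
    --include-all-instances \
    --query 'InstanceStatuses[].[InstanceId,SystemStatus.Status,InstanceStatus.Status]' \
    --output text
}

# Usage:
#   show_status_checks i-0bdac641606572f86
```

`--include-all-instances` is included so that stopped instances are listed too; without it only running instances are returned.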
CloudWatch Metrics for EC2
- CPU Utilization: Compute as %
- Disk Read and Write Operations
- Measure the completed Read / Write operations from all Instance Store volumes available to the VM
- Network In / Out
- Bytes sent or received on all network interfaces on the VM
- Status Check Failed
- Pass or Fail for BOTH the system and instance status checks
- If value = 0, both passed
- If value = 1, one or both failed
Custom Metrics
- Can publish to AWS using the AWS CLI or the API
- Can view statistical graphs within the console
- CloudWatch stores the data as a series of datapoints
- Each datapoint has an associated timestamp
- Can use these custom metrics to set up custom alarms
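As an illustration of publishing through the CLI, the sketch below sends a single datapoint with `aws cloudwatch put-metric-data`. The function name, and the namespace and metric name in the usage line, are made up for the example; the timestamp is set explicitly to the current UTC time to show that each datapoint carries one.

```shell
# Hypothetical helper: publish one datapoint to a custom metric.
#   $1 = namespace, $2 = metric name, $3 = value, $4 = unit
publish_custom_metric() {
  aws cloudwatch put-metric-data \
    --namespace "$1" \
    --metric-name "$2" \
    --value "$3" \
    --unit "$4" \
    --timestamp "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
}

# Usage (namespace and metric name are placeholders):
#   publish_custom_metric "Custom/Demo" "ActiveSessions" 42 Count
```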
Custom metrics for EC2
How to Create Custom Metrics
Create the IAM Roles
- Required:
- To send data to CloudWatch
- For CloudWatch to fetch these metrics from the EC2 instance
- Any time an AWS service needs to talk to another AWS service, it requires a role that grants the necessary privileges.
- IAM > Policies > [Create Policy]
- CloudWatch:
- Put Metric Data
- Get Metric Statistics
- List Metrics
- EC2
- Describe Tags
- Name: Ec2-Custom-CloudWatch
- IAM > Roles > [Create role]
- Add Ec2-Custom-CloudWatch policy
- Name: CustomMetricsRole
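For reference, the policy built in the console above corresponds roughly to the JSON document below. This is a sketch under assumptions: the function names are hypothetical, and `"Resource": "*"` is used because the notes do not restrict the policy to specific resources.

```shell
# Hypothetical helper: print the policy document for Ec2-Custom-CloudWatch.
policy_document() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:PutMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "ec2:DescribeTags"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Hypothetical helper: create the policy from the CLI instead of the console.
create_policy() {
  aws iam create-policy \
    --policy-name Ec2-Custom-CloudWatch \
    --policy-document "$(policy_document)"
}

# Usage:
#   policy_document     # review the JSON first
#   create_policy       # then create it (needs IAM permissions)
```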
Add the role to an existing EC2 instance
- EC2 > Instances > Select Instance
- Actions > Instance Settings > Attach/Replace IAM Roles > CustomMetricsRole
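Attaching the role can also be done with `aws ec2 associate-iam-instance-profile`. The sketch below uses a hypothetical function name and assumes an instance profile with the same name as the role already exists, which is the case when the role was created for EC2 through the console.

```shell
# Hypothetical helper: attach an instance profile to a running instance.
#   $1 = instance ID, $2 = instance profile name (e.g. CustomMetricsRole)
attach_role() {
  aws ec2 associate-iam-instance-profile \
    --instance-id "$1" \
    --iam-instance-profile "Name=$2"
}

# Usage:
#   attach_role i-0bdac641606572f86 CustomMetricsRole
```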
Install the monitoring scripts
sudo yum install perl-DateTime perl-Sys-Syslog perl-LWP-Protocol-https
In the next section, I found I also had to install the following to prevent this error:
Can't locate Digest/SHA.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 . .) at AwsSignatureV4.pm line 23.
sudo yum install -y perl-Digest-SHA
wget http://aws-cloudwatch.s3.amazonaws.com/downloads/CloudWatchMonitoringScripts-1.2.1.zip
unzip CloudWatchMonitoringScripts-1.2.1.zip
Delete the downloaded zip file
rm CloudWatchMonitoringScripts-1.2.1.zip
Switch to the script directory
cd aws-scripts-mon
List the contents of the directory
ls -la
- awscreds.template
- provides credentials for EC2 instance
- Access ID and Secret Key
- Only required if you did not assign the CloudWatch role created above.
- AwsSignatureV4.pm
- CloudWatchClient.pm
- LICENSE.txt
- mon-get-instance-stats.pl
- Important!
- mon-put-instance-data.pl
- Important!
- NOTICE.txt
Push custom metrics from EC2 to CloudWatch
File Descriptions
- awscreds.template
- provides credentials for EC2 instance
- Access ID and Secret Key
- Only required if you did not assign the IAM CloudWatch role created above.
- AwsSignatureV4.pm
- CloudWatchClient.pm
- mon-get-instance-stats.pl
- Queries AWS CloudWatch and displays the most recent metrics for the instance.
- mon-put-instance-data.pl
- Collects system metrics on an EC2 instance and sends them to AWS CloudWatch
- Memory
- Swap
- Disk space utilization
Running the scripts
./mon-put-instance-data.pl --mem-util --mem-used --mem-avail
Successfully reported metrics to CloudWatch. Reference Id: a581db4c-b47e-11e8-a10b-c96e1c074e21
Verify the Metrics
- CloudWatch > Metrics > Linux System > InstanceId
Setup as a recurring cron job
crontab -e
Add the following entry to report every 5 minutes
*/5 * * * * /home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl --mem-util --disk-space-util --disk-path=/ --from-cron
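If you prefer not to edit the crontab interactively, the entry can be installed from a script. The sketch below is one way to do it, with a hypothetical function name: it keeps the existing entries, drops any earlier copy of the same line so the job is not duplicated, and pipes the result back to `crontab -`. The script path is the one used in these notes and assumes the scripts were unzipped in the ec2-user home directory.

```shell
# The cron entry from these notes: report memory and disk usage every 5 minutes.
CRON_ENTRY='*/5 * * * * /home/ec2-user/aws-scripts-mon/mon-put-instance-data.pl --mem-util --disk-space-util --disk-path=/ --from-cron'

# Hypothetical helper: add CRON_ENTRY to the current user's crontab once.
install_cron_entry() {
  {
    # Existing entries, minus any previous copy of this exact line.
    crontab -l 2>/dev/null | grep -vF "$CRON_ENTRY"
    echo "$CRON_ENTRY"
  } | crontab -
}

# Usage:
#   install_cron_entry
#   crontab -l    # confirm the entry is present
```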
Quiz
Amazon EC2 stands for:
- Elastic Compute Cloud
- Elastic Cloud Two
- Elastic Clouding Compute
- Elastic Cloud Computation
The two different status checks which detect issues with an AWS instance are:
- System status check and Non-system-status check
- System status check and Instance-status check
- Together, these two status checks monitor the AWS systems running behind the scenes (such as the hypervisor) and the overall health of the VM itself.
- System status check and Monitoring system check
- Instance status check and Monitoring status check
“The status checks are meant for monitoring your AWS instances and their underlying hardware, network and software configurations.”
- True
- False
Which of the following metrics identifies the processing power being consumed to run the operating system and applications on a selected instance?
- CPUUtilization
- DiskUtilization
- NetworkUtilization
- PowerUtilization
“By default, AWS EC2 is configured to provide metrics at 5 minute intervals, but detailed metrics, which provide metric data every 1 minute, can also be enabled.”
- True
- False