Monitoring and Alerting for AWS RDS

September 10, 2018 Amazon Web Services (AWS), CloudWatch

Section Menu

Introduction to monitoring and alerting for AWS RDS
Monitoring AWS RDS metrics
AWS RDS events
Quiz

Introduction to monitoring and alerting for AWS RDS

https://www.udemy.com/aws-monitoring-alerting-with-aws-cloudwatch-and-aws-sns/learn/v4/t/lecture/7082820?start=0

Overview

Intro to RDS
Key metrics available
Monitoring for availability and performance

AWS RDS

Relational Database Service
Provides a scalable database service
PaaS – Platform as a service
Multiple Database engines supported
- MySQL
- PostgreSQL
- MSSQL
- Oracle
- Aurora (Amazon's MySQL-compatible engine)
Highly scalable for large data loads
Can be configured with High Availability mode
- Automatically fails over to a healthy replica

Monitoring AWS RDS

Proper monitoring and alerting are essential
RDS and CloudWatch provide various monitoring options
- Provided Metrics
- Alarms
- Logs
- Events

Demo – Overview of RDS Monitoring

Events
- Events related to the database instance
  - Includes the time and date of the event
  - Source name and source type
  - Message associated with the event
  - Can be configured to send notifications
- Database
- DB Snapshots
- DB Security Group changes
- Parameter Groups
Event Subscriptions allow you to define the type/method of sending alerts.
Logs: RDS > Instances > Select instance > Scroll way down to Logs
- Available logs depend on the type of DB engine you are using.
  - MySQL
    - Error Log
    - Slow Query Log
    - General Log
- Logs are updated every 5 seconds
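The same logs can be read outside the console. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the helper names `newest_log` and `tail_log` and the `"error"` filename filter are hypothetical choices for illustration, not part of the course material.

```python
def newest_log(log_files, contains=""):
    """Pick the most recently written log file whose name contains `contains`.

    `log_files` is a list of dicts shaped like the entries returned by the
    RDS DescribeDBLogFiles call (keys: LogFileName, LastWritten, Size).
    """
    matches = [f for f in log_files if contains in f["LogFileName"]]
    if not matches:
        return None
    return max(matches, key=lambda f: f["LastWritten"])["LogFileName"]


def tail_log(instance_id, contains="error", lines=50):
    """Return the last `lines` lines of the newest matching log file."""
    import boto3  # imported here so newest_log() works without boto3

    rds = boto3.client("rds")
    files = rds.describe_db_log_files(
        DBInstanceIdentifier=instance_id
    )["DescribeDBLogFiles"]
    name = newest_log(files, contains)
    if name is None:
        return ""
    part = rds.download_db_log_file_portion(
        DBInstanceIdentifier=instance_id,
        LogFileName=name,
        NumberOfLines=lines,
    )
    return part.get("LogFileData", "")
```

Calling `tail_log("mydb")` (with `mydb` standing in for a real instance identifier) would print the tail of the newest error log, which log names exist depends on the DB engine.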

Monitoring AWS RDS metrics

https://www.udemy.com/aws-monitoring-alerting-with-aws-cloudwatch-and-aws-sns/learn/v4/t/lecture/7082822?start=0

Key Metrics

CPU Utilization: % of CPU utilized
*Database Connections: Number of connections currently in use by all the clients of the database.
Freeable Memory: RAM available (shown in MB in the console; the CloudWatch metric is reported in bytes)
*Free Storage Space: Storage space available (shown in MB in the console; the CloudWatch metric is reported in bytes)
Read IOPS / Write IOPS
- High IOPS may suggest that the volume backing the database is becoming overloaded
  - This can cause slower responses
*Read Latency / Write Latency: Average time taken per disk I/O operation.
- High latency values are a cause for concern because they indicate that:
  - disk activity is overloaded
  - performance is degraded

* These are the most important when it comes to production monitoring!
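These metrics are published to CloudWatch under the `AWS/RDS` namespace, so they can also be pulled programmatically. A minimal sketch, assuming boto3 is installed and credentials are configured; the helper names and the one-hour window are hypothetical choices, not something the course prescribes.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names for the key RDS metrics listed above.
KEY_METRICS = [
    "CPUUtilization", "DatabaseConnections", "FreeableMemory",
    "FreeStorageSpace", "ReadIOPS", "WriteIOPS",
    "ReadLatency", "WriteLatency",
]


def metric_query(instance_id, metric_name, minutes=60, period=300, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,          # seconds per datapoint
        "Statistics": ["Average"],
    }


def bytes_to_mb(value):
    """FreeableMemory and FreeStorageSpace are reported in bytes."""
    return value / (1024 * 1024)


def fetch_averages(instance_id):
    """Return {metric name: [average values, oldest first]} for one instance."""
    import boto3  # imported here so the helpers above work without boto3

    cw = boto3.client("cloudwatch")
    result = {}
    for name in KEY_METRICS:
        points = cw.get_metric_statistics(
            **metric_query(instance_id, name)
        )["Datapoints"]
        points.sort(key=lambda p: p["Timestamp"])
        result[name] = [p["Average"] for p in points]
    return result
```

Keeping the query builder separate from the API call makes it easy to check the parameters without touching AWS.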

Demo: RDS Metrics

RDS > Instances > Select Database Instance
- Demo is old and does not match current AWS display 🙁

Create Alarm

RDS > Instances > Select Database Instance > Scroll down to “CloudWatch Alarms” > [Create Alarm]
Typical alarm setup
- Where to send the notification
- Metric to monitor
- Thresholds
- Alarm name
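The four pieces of a typical alarm map directly onto the parameters of the CloudWatch `put_metric_alarm` call. A minimal sketch for a low-storage alarm, assuming boto3 and an existing SNS topic; the function names, the alarm naming scheme, and the default threshold are hypothetical and should be tuned to your workload.

```python
def low_storage_alarm(instance_id, topic_arn, threshold_mb=1024):
    """Build put_metric_alarm arguments for a FreeStorageSpace alarm."""
    return {
        # Alarm name
        "AlarmName": f"{instance_id}-low-free-storage",
        # Where to send the notification (an SNS topic ARN)
        "AlarmActions": [topic_arn],
        # Metric to monitor
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        # Thresholds: fire after 2 consecutive 5-minute periods below the limit
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold_mb * 1024 * 1024,  # the metric is in bytes
        "ComparisonOperator": "LessThanThreshold",
    }


def create_alarm(params):
    """Create (or update) the alarm in CloudWatch."""
    import boto3  # imported here so low_storage_alarm() works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

For example, `create_alarm(low_storage_alarm("mydb", topic_arn, 2048))` would alert when average free storage stays under 2 GB for ten minutes, where `mydb` and `topic_arn` are placeholders.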

AWS RDS events

https://www.udemy.com/aws-monitoring-alerting-with-aws-cloudwatch-and-aws-sns/learn/v4/t/lecture/7082824?start=0

Key metrics should be monitored closely

Resource Utilization
System errors
Accidental termination
Cluster health
Maintenance windows
Query logs
RDS metrics
Instance level metrics
RDS resource modification

RDS Events

RDS > Events
- Generated whenever there is a new configuration change or important operational change from within the RDS instance.
You can retrieve events from RDS
- Console: 24 hr
- RDS API: 14 Days
- AWS CLI: 14 Days
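To look further back than the console allows, events can be fetched through the RDS API. A minimal sketch using boto3's `describe_events`, assuming boto3 and configured credentials; the helper names are hypothetical, and the look-back window is clamped to the 14-day limit noted above.

```python
MAX_MINUTES = 14 * 24 * 60  # 14 days, expressed in minutes


def events_query(instance_id=None, hours=24):
    """Build describe_events arguments for the last `hours` hours."""
    minutes = max(1, min(int(hours * 60), MAX_MINUTES))
    params = {"Duration": minutes}
    if instance_id:
        # SourceType must accompany SourceIdentifier.
        params["SourceType"] = "db-instance"
        params["SourceIdentifier"] = instance_id
    return params


def list_events(instance_id=None, hours=24):
    """Yield (date, source, message) tuples for matching RDS events."""
    import boto3  # imported here so events_query() works without boto3

    rds = boto3.client("rds")
    paginator = rds.get_paginator("describe_events")
    for page in paginator.paginate(**events_query(instance_id, hours)):
        for event in page["Events"]:
            yield event["Date"], event["SourceIdentifier"], event["Message"]
```

Iterating over `list_events("mydb", hours=72)` would print three days of events for a placeholder instance named `mydb`.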

Creating an Event Subscription

RDS > Event Subscriptions > [Create Event Subscription]
- Details:
  - Name
- Target:
  - Send notifications to: Select ARN (Topic)
- Source:
  - Source Type:
    - Instances
    - Security groups
    - Parameter groups
    - Snapshots
    - DB clusters
    - DB cluster snapshots
  - Instances to include:
    - All instances
    - Specific (Select from dropdown)
  - Event Categories
    - All
    - Select specific
      - availability
      - backtrack
      - backup
      - configuration change
      - creation
      - deletion
      - failover
      - failure
      - low storage
      - maintenance
      - notification
      - read replica
      - recovery
      - restoration
  - Note: by selecting All Instances and All Categories, we should be notified of any change.
  - [ Create ]
    - This took SEVERAL minutes
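The console form above corresponds to the RDS `create_event_subscription` API call. A minimal sketch with boto3, assuming an existing SNS topic; the function names are hypothetical. Leaving out the instance list or the category list is how this sketch expresses "All instances" and "All" categories.

```python
def subscription_params(name, topic_arn, instance_ids=None, categories=None):
    """Build create_event_subscription arguments for DB instance events."""
    params = {
        "SubscriptionName": name,     # Details: Name
        "SnsTopicArn": topic_arn,     # Target: SNS topic ARN
        "SourceType": "db-instance",  # Source Type: Instances
        "Enabled": True,
    }
    if instance_ids:
        # Omitted means every instance of this source type.
        params["SourceIds"] = list(instance_ids)
    if categories:
        # Omitted means every event category.
        params["EventCategories"] = list(categories)
    return params


def create_subscription(params):
    """Create the event subscription in RDS."""
    import boto3  # imported here so subscription_params() works without boto3

    return boto3.client("rds").create_event_subscription(**params)
```

A narrower subscription could pass, for example, `categories=["failover", "failure", "low storage"]` to cut down on notification noise.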

Testing

Reboot the instance
- It took a while before the notifications came through. Might be a Yahoo thing.

Recap:

Learned the various metrics available
Set up alarms and notifications
Viewed events and created an Event Subscription
Viewed logs

Quiz

What does RDS stand for?

Rational Database Service
Relational Database Service
Resolution Database Service
Revolution Database Service

Which of the following key metrics are available for AWS RDS?

CPUUtilization, DatabaseConnections
FreeableMemory, FreeStorageSpace
ReadIOPS/WriteIOPS
All of the above

For Monitoring AWS RDS databases, CloudWatch provides:

Only CloudWatch Alarms
Only CloudWatch Logs
Only CloudWatch Events
All of the above

Which of the following metrics is used to check the number of database connections in use:

DiskQueueDepth
DatabaseConnections
ReadThroughput
WriteThroughput

The FreeStorageSpace metric is used to

Determine the amount of currently available storage space
Determine the amount of available Random Access Memory
Determine the number of outstanding IOs waiting to access the disk
Determine the amount of swap space used on the DB instance.
