Monitoring and Alerting for AWS RDS

Introduction to monitoring and alerting for AWS RDS

https://www.udemy.com/aws-monitoring-alerting-with-aws-cloudwatch-and-aws-sns/learn/v4/t/lecture/7082820?start=0

Overview

  • Intro to RDS
  • Key metrics available
  • Monitoring for availability and performance

AWS RDS

  • Relational Database Service
  • Provides a scalable database service
  • PaaS – Platform as a service
  • Multiple Database engines supported
    • MySQL
    • PostgreSQL
    • Microsoft SQL Server
    • Oracle
    • Amazon Aurora (MySQL- and PostgreSQL-compatible)
  • Highly scalable for large data loads
  • Can be configured for high availability (Multi-AZ deployment)
    • Automatically fails over to a healthy standby replica

Monitoring AWS RDS

  • Proper monitoring and alerting is essential
  • RDS and CloudWatch provide various monitoring options
    • Provided Metrics
    • Alarms
    • Logs
    • Events

Demo – Overview of RDS Monitoring

  • Events
    • Events related to the database instance
      • Includes the time and date of the event
      • Source name and source type
      • Message associated with the event
      • Can be configured to send notifications
    • Database
    • DB Snapshots
    • DB Security Group changes
    • Parameter Groups
  • Event Subscriptions let you define which events trigger alerts and how the alerts are sent.
  • Logs: RDS > Instances > Select instance > scroll down to Logs
    • Available logs depend on the type of DB engine you are using.
      • MySQL
        • Error Log
        • Slow Query Log
        • General Log
    • Logs are updated every 5 seconds
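
The same logs can also be read outside the console. Below is a minimal sketch using the boto3 RDS calls `describe_db_log_files` and `download_db_log_file_portion`. The instance name and the MySQL log file names are assumptions for illustration; check the names returned for your own instance. The client is passed in so the helpers can be tried with a stub.

```python
# Assumed MySQL log file names on RDS; confirm them with list_log_files().
MYSQL_LOG_FILES = {
    "error": "error/mysql-error.log",
    "slow_query": "slowquery/mysql-slowquery.log",
    "general": "general/mysql-general.log",
}


def list_log_files(rds_client, instance_id):
    """Return the log file names from the first page of results."""
    response = rds_client.describe_db_log_files(DBInstanceIdentifier=instance_id)
    return [entry["LogFileName"] for entry in response["DescribeDBLogFiles"]]


def read_log_file(rds_client, instance_id, log_file_name, lines_per_call=1000):
    """Download a whole log file by following the pagination marker."""
    chunks = []
    marker = "0"  # "0" starts at the beginning of the file
    while True:
        response = rds_client.download_db_log_file_portion(
            DBInstanceIdentifier=instance_id,
            LogFileName=log_file_name,
            Marker=marker,
            NumberOfLines=lines_per_call,
        )
        chunks.append(response.get("LogFileData") or "")
        if not response.get("AdditionalDataPending"):
            break
        marker = response["Marker"]
    return "".join(chunks)
```

With credentials configured, usage would look like `read_log_file(boto3.client("rds"), "mydb", MYSQL_LOG_FILES["error"])`, where `mydb` is a hypothetical instance identifier.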

Monitoring AWS RDS metrics

https://www.udemy.com/aws-monitoring-alerting-with-aws-cloudwatch-and-aws-sns/learn/v4/t/lecture/7082822?start=0

Key Metrics

  • CPU Utilization: percentage of CPU in use
  • *Database Connections: number of connections currently in use by all clients of the database
  • Freeable Memory: RAM available (shown in MB in the console; CloudWatch reports bytes)
  • *Free Storage Space: storage space available (shown in MB in the console; CloudWatch reports bytes)
  • Read IOPS / Write IOPS
    • High IOPS may indicate that the volume backing the database is close to being overloaded
      • This can cause slower responses
  • *Read Latency / Write Latency: average time taken per disk I/O operation
    • High latency values are a cause for concern because they indicate that:
      • disk activity is overloaded
      • performance is degraded

* These are the most important when it comes to production monitoring!
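
As a rough sketch of how these metrics could be pulled programmatically, the helper below builds the arguments for the CloudWatch `get_metric_statistics` call in the `AWS/RDS` namespace. The instance identifier, the one-hour window, and the 5-minute period are assumptions chosen for illustration, not values from the course.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names for the key RDS metrics listed above.
KEY_RDS_METRICS = [
    "CPUUtilization",
    "DatabaseConnections",
    "FreeableMemory",
    "FreeStorageSpace",
    "ReadIOPS",
    "WriteIOPS",
    "ReadLatency",
    "WriteLatency",
]


def build_metric_query(instance_id, metric_name, end_time, minutes=60, period=300):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": period,  # seconds covered by each datapoint
        "Statistics": ["Average"],
    }


def bytes_to_mb(value_in_bytes):
    """FreeableMemory and FreeStorageSpace are reported in bytes."""
    return value_in_bytes / (1024 * 1024)


def fetch_key_metrics(instance_id):
    """Fetch the last hour of datapoints; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)
    results = {}
    for name in KEY_RDS_METRICS:
        query = build_metric_query(instance_id, name, now)
        response = cloudwatch.get_metric_statistics(**query)
        results[name] = sorted(response["Datapoints"], key=lambda d: d["Timestamp"])
    return results
```

Only `fetch_key_metrics` talks to AWS; the builder can be inspected on its own to check what would be requested.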

Demo: RDS Metrics

  • RDS > Instances > Select Database Instance
    • Demo is old and does not match current AWS display 🙁

Create Alarm

  • RDS > Instances > Select Database Instance > Scroll down to “CloudWatch Alarms” > [Create Alarm]
  • Typical alarm setup
    • Where to send the notification
    • Metric to monitor
    • Thresholds
    • Alarm name
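
The same four pieces of an alarm (notification target, metric, threshold, name) map onto the CloudWatch `put_metric_alarm` call. The sketch below is one possible way to script it; the alarm name, instance identifier, SNS topic ARN, threshold, and evaluation settings are all hypothetical values.

```python
VALID_COMPARISONS = {
    "GreaterThanThreshold",
    "GreaterThanOrEqualToThreshold",
    "LessThanThreshold",
    "LessThanOrEqualToThreshold",
}


def build_alarm_params(alarm_name, instance_id, metric_name, threshold,
                       comparison, sns_topic_arn, period=300, evaluation_periods=2):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    if comparison not in VALID_COMPARISONS:
        raise ValueError(f"unsupported comparison operator: {comparison}")
    return {
        "AlarmName": alarm_name,                 # alarm name
        "AlarmActions": [sns_topic_arn],         # where to send the notification
        "Namespace": "AWS/RDS",
        "MetricName": metric_name,               # metric to monitor
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period,                        # seconds per evaluation period
        "EvaluationPeriods": evaluation_periods,
        "Threshold": float(threshold),           # threshold
        "ComparisonOperator": comparison,
    }


def create_alarm(params):
    """Create or update the alarm; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so build_alarm_params works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)


# Hypothetical example: alert when free storage drops below 2 GiB.
low_storage_alarm = build_alarm_params(
    alarm_name="mydb-low-storage",
    instance_id="mydb",
    metric_name="FreeStorageSpace",
    threshold=2 * 1024 ** 3,  # FreeStorageSpace is reported in bytes
    comparison="LessThanThreshold",
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:rds-alerts",
)
```

Calling `create_alarm(low_storage_alarm)` would then register the alarm against the SNS topic.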

AWS RDS events

https://www.udemy.com/aws-monitoring-alerting-with-aws-cloudwatch-and-aws-sns/learn/v4/t/lecture/7082824?start=0

Key areas to monitor closely

  • Resource Utilization
  • System errors
  • Accidental termination
  • Cluster health
  • Maintenance windows
  • Query logs
  • RDS metrics
  • Instance level metrics
  • RDS resource modification

RDS Events

  • RDS > Events
    • Generated whenever there is a configuration change or an important operational change on the RDS instance.
  • You can retrieve events from RDS
    • Console: past 24 hours
    • RDS API: past 14 days
    • AWS CLI: past 14 days
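
To look back further than the console allows, events can be pulled through the API. This is a minimal sketch around the boto3 RDS `describe_events` call, which takes a `Duration` in minutes; the helper caps the lookback at the 14-day limit noted above. The instance identifier is a hypothetical example.

```python
MAX_EVENT_LOOKBACK_MINUTES = 14 * 24 * 60  # 14 days, the API/CLI retention noted above


def build_event_query(instance_id=None, hours=24):
    """Build keyword arguments for the RDS describe_events call."""
    params = {"Duration": min(hours * 60, MAX_EVENT_LOOKBACK_MINUTES)}
    if instance_id:
        # SourceType must accompany SourceIdentifier.
        params["SourceIdentifier"] = instance_id
        params["SourceType"] = "db-instance"
    return params


def list_events(instance_id=None, hours=24):
    """Yield (date, source, message) tuples; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so build_event_query works without it

    rds = boto3.client("rds")
    paginator = rds.get_paginator("describe_events")
    for page in paginator.paginate(**build_event_query(instance_id, hours)):
        for event in page["Events"]:
            yield event["Date"], event["SourceIdentifier"], event["Message"]
```

For example, `list(list_events("mydb", hours=72))` would return the last three days of events for a hypothetical instance named `mydb`.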

Creating an Event Subscription

  • RDS > Event Subscriptions > [Create Event Subscription]
    • Details:
      • Name
    • Target:
      • Send notifications to: Select ARN (Topic)
    • Source:
      • Source Type:
        • Instances
        • Security groups
        • Parameter groups
        • Snapshots
        • DB clusters
        • DB cluster snapshots
      • Instances to include:
        • All instances
        • Specific (Select from dropdown)
      • Event Categories
        • All
        • Select specific
          • availability
          • backtrack
          • backup
          • configuration change
          • creation
          • deletion
          • failover
          • failure
          • low storage
          • maintenance
          • notification
          • read replica
          • recovery
          • restoration
      • Note: by selecting All Instances and All Categories, we should be notified on any change.
      • [ Create ]
        • This took SEVERAL minutes
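
The console form above corresponds roughly to the boto3 RDS `create_event_subscription` call. The sketch below shows one way to build its arguments; leaving out the source type, source IDs, and categories subscribes to everything, which matches the "All instances / All categories" choice. The subscription name, topic ARN, and instance identifier are hypothetical.

```python
# Console source-type labels mapped to the values the API expects.
SOURCE_TYPES = {
    "Instances": "db-instance",
    "Security groups": "db-security-group",
    "Parameter groups": "db-parameter-group",
    "Snapshots": "db-snapshot",
    "DB clusters": "db-cluster",
    "DB cluster snapshots": "db-cluster-snapshot",
}


def build_subscription_params(name, sns_topic_arn, source_type=None,
                              source_ids=None, categories=None):
    """Build keyword arguments for the RDS create_event_subscription call."""
    if source_ids and not source_type:
        raise ValueError("source_type is required when source_ids are given")
    params = {"SubscriptionName": name, "SnsTopicArn": sns_topic_arn, "Enabled": True}
    if source_type:
        params["SourceType"] = source_type
    if source_ids:
        params["SourceIds"] = list(source_ids)
    if categories:
        params["EventCategories"] = list(categories)
    return params


def create_subscription(params):
    """Create the subscription; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without it

    return boto3.client("rds").create_event_subscription(**params)


# Hypothetical example: only failover and failure events for one instance.
failover_subscription = build_subscription_params(
    name="mydb-failover-events",
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:rds-alerts",
    source_type=SOURCE_TYPES["Instances"],
    source_ids=["mydb"],
    categories=["failover", "failure"],
)
```

Narrowing the categories this way keeps the notification volume down compared with subscribing to every change.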

Testing

  • Reboot the instance
    • It took a while before the notifications came through. Might be a Yahoo thing.

Recap:

  • Learned the various metrics available
  • Set up alarms and notifications
  • Viewed events and created an event subscription
  • Viewed logs

Quiz

What does RDS stand for?

  • Rational Database Service
  • Relational Database Service
  • Resolution Database Service
  • Revolution Database Service

Which of the following key metrics are available for AWS RDS?

  • CPUUtilization, DatabaseConnections
  • FreeableMemory, FreeStorageSpace
  • ReadIOPS/WriteIOPS
  • All of the above

For Monitoring AWS RDS databases, CloudWatch provides:

  • Only CloudWatch Alarms
  • Only CloudWatch Logs
  • Only CloudWatch Events
  • All of the above

Which of the following metrics is used to check the number of database connections in use:

  • DiskQueueDepth
  • DatabaseConnections
  • ReadThroughput
  • WriteThroughput

The FreeStorageSpace metric is used to:

  • Determine the amount of currently available storage space
  • Determine the amount of available Random Access Memory
  • Determine the number of outstanding IOs waiting to access the disk
  • Determine the amount of swap space used on the DB instance

 
