grofers.rds-alarms
RDS Alarms
This tool helps you create warning and critical alerts for RDS (Relational Database Service) instances on Amazon CloudWatch. For more information, you can read the blog post.
:boom: Tested and used at Grofers
Requirements
To use this tool, you will need:
- An SNS topic in AWS
- The boto library
- The AWS CLI tool
- An IAM Policy with these permissions:
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt19471460522000",
          "Effect": "Allow",
          "Action": [
            "cloudwatch:DeleteAlarms",
            "cloudwatch:DescribeAlarms",
            "cloudwatch:PutMetricAlarm"
          ],
          "Resource": [
            "*"
          ]
        },
        {
          "Sid": "Stmt1947940274000",
          "Effect": "Allow",
          "Action": [
            "rds:DescribeDBInstances"
          ],
          "Resource": [
            "*"
          ]
        }
      ]
    }
Installation
To install this tool, simply run:
$ ansible-galaxy install grofers.rds-alarms
Role Variables
The following variables can be set:
- rds_alarms_region - AWS region (Required)
- rds_alarms_common_action_list - List of ARNs for SNS topics
- rds_alarms_period - Length, in seconds, of each period over which a metric is evaluated
- rds_alarms_evaluation_periods - Number of periods compared against the threshold before the alarm changes state
- rds_alarms_warning_threshold - Warning threshold (default - 75%)
- rds_alarms_critical_threshold - Critical threshold (default - 90%)
- rds_alarms_warning_cpu_credits_threshold - Warning threshold for CPU Credits (default - 30)
- rds_alarms_critical_cpu_credits_threshold - Critical threshold for CPU Credits (default - 15)
- rds_alarms_db_instances - A dictionary with the following structure:
    rds_alarms_db_instances:
      <rds-instance-identifier>:
        warning_db_connections_threshold: 100
        critical_db_connections_threshold: 200
        warning_burst_balance_threshold: 100
        critical_burst_balance_threshold: 200
        alarm_action_list: ["arn:aws:sns:us-east-1:9783248248:MYALARM"]
        critical_threshold: 90 # Optional
        warning_threshold: 75 # Optional
        credit_warning_threshold: 30 # Optional
        credit_critical_threshold: 15 # Optional
        replica_lag_threshold: 1800 # Required only for replicas, in seconds
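The keys marked Optional appear to override the role-wide defaults for a single instance. The Python sketch below illustrates that fallback under this assumption. It is not code from the role: `ROLE_DEFAULTS` and `effective_threshold` are hypothetical names used only for illustration, and the default values are the ones listed in the variable descriptions.

```python
# Role-wide defaults as documented in the variable list (75%, 90%, 30, 15).
ROLE_DEFAULTS = {
    "warning_threshold": 75,
    "critical_threshold": 90,
    "credit_warning_threshold": 30,
    "credit_critical_threshold": 15,
}


def effective_threshold(instance_config, key, defaults=ROLE_DEFAULTS):
    """Return the per-instance value for `key`, else the role-wide default."""
    # Assumption: a per-instance key, when present, takes precedence.
    return instance_config.get(key, defaults[key])


# Hypothetical replica entry that only overrides the warning threshold.
replica = {"warning_threshold": 80, "replica_lag_threshold": 1800}
print(effective_threshold(replica, "warning_threshold"))   # per-instance override
print(effective_threshold(replica, "critical_threshold"))  # role-wide default
```

Keeping the lookup in one place makes it easy to see which instances deviate from the shared settings.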
Naming Convention
The names of the alarms created in Amazon CloudWatch follow this pattern: rds-<instance_name>-<metric_name>-<alert_type>.
For example, a warning alarm for CPU usage on an instance named my-rds-instance will be called rds-my-rds-instance-cpu-warning.
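To make the pattern concrete, a small Python helper can build the expected alarm name, which is handy when searching for alarms in CloudWatch. This is only a sketch: the function `alarm_name` and its parameter names are hypothetical, and `cpu` with `warning` is the only metric and alert type combination confirmed by the example in this section.

```python
def alarm_name(instance_name, metric_name, alert_type):
    """Build a name following rds-<instance_name>-<metric_name>-<alert_type>."""
    return "rds-{}-{}-{}".format(instance_name, metric_name, alert_type)


print(alarm_name("my-rds-instance", "cpu", "warning"))
# prints: rds-my-rds-instance-cpu-warning
```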
Example Playbook
This playbook sets up alarms for my-rds-instance-identifier using the role-wide settings defined under vars. For my-replica-rds-instance-identifier, it uses a warning threshold of 80% and the role-wide critical threshold of 90%. If the instance is a replica, it will also create an alarm for the replication lag. For t2 instances, it creates alarms for remaining CPU credits.
    - hosts: localhost
      connection: local
      vars:
        rds_alarms_common_action_list:
          - "arn:aws:sns:us-east-1:9783248248:ALARMS"
        rds_alarms_period: 60
        rds_alarms_evaluation_periods: 2
        rds_alarms_region: us-east-1
        rds_alarms_warning_threshold: 70
        rds_alarms_critical_threshold: 90
        rds_alarms_warning_cpu_credits_threshold: 60
        rds_alarms_critical_cpu_credits_threshold: 30
        rds_alarms_db_instances:
          my-rds-instance-identifier:
            warning_db_connections_threshold: 100
            critical_db_connections_threshold: 200
            alarm_action_list: ["arn:aws:sns:us-east-1:9783248248:MYALARM"]
          my-replica-rds-instance-identifier:
            warning_threshold: 80
            warning_db_connections_threshold: 100
            critical_db_connections_threshold: 200
            alarm_action_list: ["arn:aws:sns:us-east-1:9783248248:MYALARM"]
            credit_warning_threshold: 20
            credit_critical_threshold: 10
            replica_lag_threshold: 1800
      roles:
        - rds-alarms
Limitations
You will need different playbooks for different AWS regions.
License