Autoscaling ensures that you have enough AWS EC2 instances available to handle the load, and that you do not have more instances allocated than you need.

The load on a server fluctuates over time. Autoscaling allows you to allocate more EC2 resources when you need them and retire them when you no longer need them. You can either manually define peak and off-peak times, or you can define rules based on CloudWatch alarms that automatically allocate and retire resources as needed.

When you add more CPU or RAM to your virtual machines, it is called scaling up. When you reduce CPU or RAM, it is called scaling down. Collectively, this is called vertical scaling. When you add more nodes or components, it is called scaling out. When you remove nodes, it is called scaling in. Collectively, this is called horizontal scaling. EC2 autoscaling is horizontal scaling: it adds and removes instances rather than resizing them.
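To make the vocabulary concrete, here is a minimal sketch in Python. The Fleet class and the helper functions are hypothetical names invented for this illustration and are not part of any AWS API. Both operations below end up with the same total capacity, but they get there in different ways.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fleet:
    """Hypothetical model of a group of identical servers."""
    nodes: int            # how many instances are running
    vcpus_per_node: int   # how large each instance is


def scale_up(fleet: Fleet, extra_vcpus: int) -> Fleet:
    """Vertical scaling: same number of nodes, each one bigger."""
    return replace(fleet, vcpus_per_node=fleet.vcpus_per_node + extra_vcpus)


def scale_out(fleet: Fleet, extra_nodes: int) -> Fleet:
    """Horizontal scaling: more nodes of the same size."""
    return replace(fleet, nodes=fleet.nodes + extra_nodes)


def total_vcpus(fleet: Fleet) -> int:
    return fleet.nodes * fleet.vcpus_per_node


fleet = Fleet(nodes=2, vcpus_per_node=4)
print(total_vcpus(scale_up(fleet, 4)))   # 2 nodes x 8 vCPUs
print(total_vcpus(scale_out(fleet, 2)))  # 4 nodes x 4 vCPUs
```

Scaling down and scaling in are the same operations with negative amounts.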

To set up autoscaling, you need to answer what, where, and when:

  1. Define a launch configuration (What?). What will autoscaling launch: which AMI, instance type, security groups, and roles?
  2. Define an autoscaling group (Where?). Where will you deploy: which VPC, subnets, and load balancer, and with what minimum number of instances, maximum number of instances, and desired capacity?
  3. Define an autoscaling policy (When?). When should autoscaling be triggered? It can be scheduled (at fixed times) or dynamic (on demand, in response to load).
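The three definitions above can be pictured as three small records. The sketch below is a plain Python model, not the AWS SDK: every class name, field name, and ID in it is an assumption made for illustration. It mainly shows how the group's minimum and maximum bound whatever capacity a policy asks for.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LaunchConfiguration:
    """What? (hypothetical model, not the AWS API)"""
    ami_id: str
    instance_type: str
    security_groups: List[str] = field(default_factory=list)
    iam_role: str = ""


@dataclass
class AutoScalingGroup:
    """Where? (hypothetical model, not the AWS API)"""
    subnets: List[str]
    min_size: int
    max_size: int
    desired_capacity: int

    def set_desired(self, requested: int) -> int:
        # The request is clamped so it never leaves [min_size, max_size].
        self.desired_capacity = max(self.min_size, min(self.max_size, requested))
        return self.desired_capacity


@dataclass
class ScalingPolicy:
    """When? Here reduced to a fixed change in capacity."""
    name: str
    adjustment: int  # positive scales out, negative scales in

    def apply(self, group: AutoScalingGroup) -> int:
        return group.set_desired(group.desired_capacity + self.adjustment)


# Placeholder IDs, for illustration only.
config = LaunchConfiguration(ami_id="ami-example", instance_type="t3.micro")
group = AutoScalingGroup(subnets=["subnet-example"], min_size=1,
                         max_size=4, desired_capacity=2)

print(ScalingPolicy("scale-out", +3).apply(group))  # asks for 5, capped at max_size
print(ScalingPolicy("scale-in", -5).apply(group))   # asks for -1, raised to min_size
```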

For dynamic autoscaling, you need to create a CloudWatch alarm that triggers the autoscaling policy when a metric, such as average CPU utilization, crosses a threshold.
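The idea behind an alarm-driven policy can be sketched in a few lines of Python. This is a simplified local model, not how CloudWatch is implemented: the function names, the 80% threshold, and the sample data are assumptions chosen for illustration. Only the three state names (OK, ALARM, INSUFFICIENT_DATA) are taken from CloudWatch.

```python
from typing import List


def alarm_state(datapoints: List[float], threshold: float, periods: int) -> str:
    """Return 'ALARM' when the last `periods` datapoints all exceed the threshold."""
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


def react(desired: int, state: str, step: int = 1, max_size: int = 4) -> int:
    """Scale out by `step` instances while the alarm is firing, up to max_size."""
    if state == "ALARM":
        return min(max_size, desired + step)
    return desired


# Hypothetical average CPU utilization (%), one value per evaluation period.
cpu = [35.0, 82.0, 91.0, 88.0]

state = alarm_state(cpu, threshold=80.0, periods=3)
print(state)                    # the last three values are all above 80
print(react(desired=2, state=state))
```

Requiring several consecutive periods above the threshold keeps a single short spike from launching new instances.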

To set up autoscaling in the AWS console:

  1. Log in to the AWS console
  2. Click on Create Autoscaling group under Autoscaling
  3. Choose an AMI
  4. Create the launch configuration
  5. Create the autoscaling group
  6. Create the autoscaling policy
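The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 SDK is installed and AWS credentials are configured. Every resource name, the AMI ID, the security group ID, and the subnet ID are placeholders you would replace with your own. Newer AWS accounts may need to use a launch template instead of a launch configuration. The functions only build the request parameters, so they can be inspected without touching AWS; create_all sends them.

```python
def launch_config_params() -> dict:
    """What to launch (placeholder values)."""
    return {
        "LaunchConfigurationName": "example-launch-config",
        "ImageId": "ami-xxxxxxxx",            # placeholder AMI ID
        "InstanceType": "t3.micro",
        "SecurityGroups": ["sg-xxxxxxxx"],    # placeholder security group
    }


def group_params() -> dict:
    """Where to launch, and how many instances (placeholder values)."""
    return {
        "AutoScalingGroupName": "example-asg",
        "LaunchConfigurationName": "example-launch-config",
        "MinSize": 1,
        "MaxSize": 4,
        "DesiredCapacity": 2,
        "VPCZoneIdentifier": "subnet-xxxxxxxx",  # comma-separated subnet IDs
    }


def policy_params() -> dict:
    """When to scale: add one instance each time the policy is triggered."""
    return {
        "AutoScalingGroupName": "example-asg",
        "PolicyName": "example-scale-out",
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": 1,
        "Cooldown": 300,  # seconds to wait before another scaling activity
    }


def create_all(client) -> str:
    """Send the three requests with a boto3 'autoscaling' client.

    Returns the policy ARN, which a CloudWatch alarm can use as its action.
    """
    client.create_launch_configuration(**launch_config_params())
    client.create_auto_scaling_group(**group_params())
    response = client.put_scaling_policy(**policy_params())
    return response["PolicyARN"]


# Usage (creates real, billable resources):
#   import boto3
#   policy_arn = create_all(boto3.client("autoscaling"))
```

The returned policy ARN is what you would attach as the alarm action when creating the CloudWatch alarm for dynamic scaling.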