
Auto Scaling and Elastic Load Balancing are both important Amazon Web Services (AWS) services. Auto Scaling adjusts how many EC2 instances you run, while Elastic Load Balancing (ELB) distributes incoming traffic across them. They complement each other, but they differ in ways you should be aware of if you’re considering using them together.

If you don’t plan to use the two services together but want to learn more about them, see our overviews of Auto Scaling and Elastic Load Balancing before continuing with this article.

Here are 10 key differences between the two services.

1) Auto Scaling’s capacity increase starts with an instance launch

An Auto Scaling group increases capacity automatically in response to traffic or load, but it does so one step at a time: capacity only grows once a new EC2 instance has been launched and has finished booting.

This works well for increases triggered by specific events. The launch is not instantaneous, though: an instance typically needs a few minutes to boot and pass its health checks, so new capacity can lag behind a sudden spike after your current resources reach their limit.

By comparison, Elastic Load Balancing acts on every request. It adds no capacity of its own, but by distributing requests across the instances that are already registered (and, with sticky sessions, keeping a user’s session on the same instance), it puts new capacity to use as soon as an instance becomes healthy.
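To make "distributing requests across multiple servers" concrete, here is a minimal sketch in plain Python. It uses round-robin, which is one common routing strategy; it is not ELB's implementation, and the function name and instance IDs are made up for illustration.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, targets):
    """Assign each request to a target in round-robin order."""
    if not targets:
        raise ValueError("no registered targets")
    rotation = cycle(targets)
    return [(request, next(rotation)) for request in requests]


# Hypothetical instance IDs, for illustration only.
assignments = distribute(range(6), ["i-a", "i-b", "i-c"])
per_target = Counter(target for _, target in assignments)
```

Each target ends up with the same share of requests. Adding a fourth ID to the list spreads the load further, but the list itself only grows when something else (you, or Auto Scaling) launches an instance.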

2) ELB works with capacity added in either direction

You can add capacity horizontally, by adding more instances, or vertically, by moving to a larger instance size; ELB routes traffic to whatever instances are registered with it. For example, if you run a web application with high traffic in a 24×7 pattern, you can launch new instances to reduce latency or switch to larger EC2 instances to increase capacity, and the load balancer keeps spreading requests across them.

Both approaches match resources to your demand, which helps keep costs down.

Auto Scaling with EC2 instances, however, scales horizontally only: it launches or terminates instances to keep the group at its desired capacity. Traffic affects the group only through a scaling policy that changes that desired capacity; without one, the group launches instances only when it falls below its target number of running instances.

3) Auto Scaling replaces failed instances

With Auto Scaling, AWS launches new instances in response to instance failure. This is beneficial because instance failure can often be a symptom of deeper infrastructure issues, so it’s best to get new instances running as soon as possible.

You may see a somewhat higher cost while replacements launch, but it could save you from serious system downtime down the road.

With load balancing alone, there is no replacement for failed instances: the load balancer stops routing to them, and the remaining instances carry the load. Auto Scaling suits workloads that must hold a fixed level of capacity, such as web servers with strict uptime requirements; load balancing alone may be enough for compute clusters with less strict availability requirements.
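The replacement behavior can be pictured as a reconcile loop. The sketch below is plain Python with no AWS calls; `launch_instance`, `reconcile`, and the instance IDs are hypothetical stand-ins, and real Auto Scaling involves health check grace periods and other details omitted here.

```python
from itertools import count

_ids = count(1)


def launch_instance():
    """Stand-in for an EC2 launch; returns a made-up instance ID."""
    return f"i-new{next(_ids)}"


def reconcile(instances, desired_capacity):
    """Drop failed instances, then launch replacements up to desired capacity.

    `instances` maps instance ID -> True (healthy) or False (failed).
    """
    healthy = {iid: ok for iid, ok in instances.items() if ok}
    while len(healthy) < desired_capacity:
        healthy[launch_instance()] = True
    return healthy


fleet = {"i-a": True, "i-b": False, "i-c": True}
fleet = reconcile(fleet, desired_capacity=3)
```

After the call, the failed instance is gone and a fresh one has taken its place, so the group is back at three healthy instances. A load balancer on its own would only perform the first half: it would stop sending traffic to `i-b` and leave the fleet at two.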

4) ELB routes around failed instances but does not terminate them

AWS’s Elastic Load Balancing (ELB) service is a flexible traffic-routing solution that you can use to handle incoming traffic to your EC2 instances.

It can, for example, detect when an instance has failed and route incoming requests to the healthy instances instead. It does not terminate or restart the failed instance.

Amazon’s Auto Scaling Application Programming Interface (API) supports setting up Auto Scaling groups with target metrics, and a group can terminate and replace an instance that fails its health checks, but it does not route traffic itself.

You get both behaviors by attaching the load balancer to the Auto Scaling group and having the group use the ELB health checks. No custom scripting is required, although you can automate the setup with an AWS SDK in Python or Ruby.

5) Auto Scaling relies on alarms

Dynamic scaling in Auto Scaling is driven by alarms. An alarm is based on a CloudWatch metric. (Target tracking policies create and manage their alarms for you.)

An alarm can be set to trigger only when specific conditions hold (for example, when CPU usage exceeds 80% for 10 minutes).

To scale your resources, you attach a scaling action to the alarm (for example, increase capacity by launching a new instance). Auto Scaling adds or removes instances; it does not resize an existing one.

In addition to alarms based on your CloudWatch metrics, you can define scheduled actions that add or remove instances from your fleet at fixed times (for example, every Monday at 2:00 am), regardless of Amazon EC2 utilization.
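The two kinds of trigger described above can be sketched in plain Python. This is an illustration of the logic only, not the CloudWatch API: the function names are invented, and the alarm assumes one CPU datapoint per 5-minute period, so "above 80% for 10 minutes" becomes "the last two datapoints exceed 80".

```python
from datetime import datetime


def alarm_breached(datapoints, threshold=80.0, periods=2):
    """True if the last `periods` datapoints all exceed `threshold`."""
    recent = datapoints[-periods:]
    return len(recent) == periods and all(v > threshold for v in recent)


def scheduled_action_due(now, weekday=0, hour=2):
    """True at the top of the scheduled hour; weekday 0 is Monday."""
    return now.weekday() == weekday and now.hour == hour and now.minute == 0


cpu = [55.0, 78.0, 83.0, 91.0]  # assumed: one sample per 5-minute period
scale_out = alarm_breached(cpu)
```

A single high sample followed by a dip does not fire the alarm, which is the point of requiring consecutive periods: it keeps short spikes from launching instances. The scheduled check ignores utilization entirely and looks only at the clock.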

6) ELB does health checks

Both of these AWS services keep your application available, but there’s a major difference in how they manage instances.

With an ELB, you configure health checks that track whether your application is responding as expected.

If an instance fails them, your ELB takes it out of rotation. With Auto Scaling, on the other hand, you specify the minimum and maximum number of instances to maintain at all times.

If the number of running instances drifts from the desired capacity (for example, because an instance failed or a scaling policy asked for more), it adds or removes instances, always staying within your specified range.
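The contrast can be sketched in a few lines of plain Python. Both functions are simplified illustrations with invented names and default thresholds: the first models a health check that flips a target's state only after several consecutive results disagree with it (and assumes the target starts in service), and the second keeps a requested capacity inside a minimum and maximum.

```python
def track_health(results, healthy_threshold=3, unhealthy_threshold=2):
    """Replay check results (True = passed) and return the final state.

    The target changes state only after enough consecutive results
    disagree with its current state. Assumes it starts in service.
    """
    in_service = True
    streak = 0
    for passed in results:
        if passed == in_service:
            streak = 0  # result agrees with current state; reset the streak
            continue
        streak += 1
        needed = unhealthy_threshold if in_service else healthy_threshold
        if streak >= needed:
            in_service = not in_service
            streak = 0
    return in_service


def clamp_capacity(desired, minimum, maximum):
    """Keep a requested capacity within the group's configured bounds."""
    return max(minimum, min(desired, maximum))
```

For example, `track_health([True, False, True, False])` leaves the target in service because the failures are never consecutive, while `clamp_capacity(12, 2, 8)` caps a request for 12 instances at the maximum of 8.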

7) No single point of failure for Auto Scaling groups

A load balancer sits in front of your application as the entry point for its traffic. If the application behind it fails, or if performance degrades for some reason, the load balancer can only shift traffic among the instances registered with it.

Additionally, if your ELB becomes unavailable (which is rare, but can happen), traffic cannot reach your instances until it recovers, no matter how much capacity you have.

With Auto Scaling groups, however, you do not have a single point of failure: even if an entire Availability Zone (AZ) goes down, as long as the group spans other AZs in the same region, Auto Scaling can launch replacement instances there.

8) No scaling policies are needed for ELB

A load balancer accepts any traffic that its listeners and security groups allow, and it needs no scaling policies of its own: AWS scales the load balancer for you. This means that you can use ELB when you want to decouple your clients from individual Amazon EC2 instances.

There are times, however, when you need to control user access based on location or some other criterion.

For example, security policies will be necessary if you want to grant different users access to different resources without manual intervention on your part.

9) Auto Scaling has no DNS delay

Clients reach an ELB through its DNS name, so two factors add delay before a request arrives: how long the DNS lookup takes, and how long clients cache the IP addresses it returns.

ELB DNS records have a time to live (TTL) of 60 seconds, and you cannot lower it, so clients may keep using an old address for up to a minute after the load balancer’s addresses change.

As a result, requests sent to a stale address may time out before they reach your application. Auto Scaling is not in the request path, so DNS does not affect it, but it has a delay of its own: with basic monitoring, EC2 metrics are collected every five minutes (every minute with detailed monitoring), so it learns about a traffic increase only after the fact.

This means that if your service receives more traffic than expected, Auto Scaling might not react fast enough, causing timeouts for users.
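A back-of-the-envelope estimate shows why the metric interval matters. The sketch below is plain Python; the function name, the two evaluation periods, and the three-minute launch time are assumptions chosen for illustration, not AWS figures.

```python
def reaction_time_minutes(metric_period, evaluation_periods, launch_and_warmup):
    """Rough minutes from a traffic spike to usable new capacity."""
    detection = metric_period * evaluation_periods
    return detection + launch_and_warmup


# Assumed values: alarm needs 2 periods, instance takes 3 minutes to be ready.
five_minute_metrics = reaction_time_minutes(5, 2, 3)
one_minute_metrics = reaction_time_minutes(1, 2, 3)
```

Under these assumptions, five-minute metrics give roughly 13 minutes before new capacity is serving traffic, while one-minute metrics cut that to about 5.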

10) ELB requires additional firewall configuration

It’s easy to forget how a load balancer works once you start using it. In its basic form, a load balancer forwards traffic to multiple EC2 instances (which might be running on your infrastructure or within AWS), so it can handle a larger workload than any single instance.

But that core functionality means you need firewall rules (security groups, in AWS) that let incoming traffic reach the load balancer and let the load balancer reach your instances, both for requests and for health checks, so ELB can detect which of those instances are healthy enough to receive requests.

If you don’t configure those rules, health checks fail and the load balancer treats your instances as unhealthy, so requests can fail even though the application is running, which is not exactly what you want!

Configuring your firewall appropriately requires additional steps in AWS that aren’t required when using Auto Scaling alone.