Sunday, May 25, 2025

Scaling Policies for Auto Scaling Groups

 

Auto Scaling Groups – Scaling Policies are a fundamental component of AWS Auto Scaling, which helps you automatically adjust the number of Amazon EC2 instances in your Auto Scaling group (ASG) based on demand.

Here’s a concise breakdown:

 Types of Scaling Policies

  1. Target Tracking Scaling
    • How it works: Adjusts capacity to keep a specified metric (like CPU utilization) at or near a target value.
    • Example: Keep average CPU utilization at 50%.
    • Use case: Easy to set up and effective for most scenarios.
  2. Step Scaling
    • How it works: Applies step adjustments (scale out or scale in) whose size depends on the magnitude of the alarm breach.
    • Example:
      • If CPU > 60%, add 1 instance.
      • If CPU > 80%, add 2 instances.
    • Use case: More granular control over scaling actions.
  3. Simple Scaling (Legacy)
    • How it works: Performs a single scaling action when a CloudWatch alarm is triggered, then waits for the cooldown period.
    • Use case: Simpler scenarios or backward compatibility.
  4. Scheduled Scaling
    • How it works: Increase or decrease instance count at specific times (e.g., daily or weekly).
    • Example: Add 2 instances at 8 AM on weekdays.
    • Use case: Predictable load patterns like office hours.
  5. Predictive Scaling (Optional Add-On)
    • How it works: Uses machine learning to forecast future traffic and scale ahead of time.
    • Use case: Applications with regular and predictable traffic patterns.

 Key Concepts

  • Scaling Out: Add instances to handle increased load.
  • Scaling In: Remove instances to save cost during low demand.
  • Cooldown Period: Prevents additional scaling actions until previous activity has completed and stabilized.
  • Instance Warm-up: Time it takes for a new instance to become ready for traffic.

Scaling Policies for Auto Scaling Groups

Scaling policies define how and when your Auto Scaling Group adds or removes EC2 instances based on conditions or schedules.

 1. Target Tracking Scaling Policy

What it does:

Automatically adjusts capacity to maintain a target value for a specific CloudWatch metric.

 Example:

•        Target CPU utilization = 50%

•        If CPU rises above 50%, it scales out.

•        If CPU drops below 50%, it scales in.

 Use case:

Ideal for most apps with varying workloads.

# json
{
  "PolicyType": "TargetTrackingScaling",
  "TargetTrackingConfiguration": {
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 50.0
  }
}
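A policy like the one above can also be attached with the AWS CLI; the group name and policy name below are illustrative, not from the original setup:

```shell
# Attach a target tracking policy to an existing ASG (names are examples)
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name twtech-ASG \
  --policy-name twtech-cpu50-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": { "PredefinedMetricType": "ASGAverageCPUUtilization" },
    "TargetValue": 50.0
  }'
```

The call returns the ARNs of the CloudWatch alarms that Auto Scaling creates and manages on your behalf (this command requires AWS credentials and an existing group).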

 2. Step Scaling Policy

What it does:

Adjusts the number of instances in steps, based on the severity of metric breaches (CloudWatch alarms).

 Example:

•        CPU > 60% → add 1 instance

•        CPU > 80% → add 2 instances

Use case:

When you want fine-grained control.

# json
{
  "PolicyType": "StepScaling",
  "AdjustmentType": "ChangeInCapacity",
  "StepAdjustments": [
    {
      "MetricIntervalLowerBound": 0,
      "MetricIntervalUpperBound": 20,
      "ScalingAdjustment": 1
    },
    {
      "MetricIntervalLowerBound": 20,
      "ScalingAdjustment": 2
    }
  ]
}
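The interval bounds in the step policy are offsets from the alarm threshold (here, a 60% CPU alarm): a breach of 0–20 points maps to +1 instance, and 20 points or more maps to +2. A small illustrative shell function (not part of any AWS API) mirrors that mapping:

```shell
# Illustrative only: maps an average-CPU reading to a step adjustment,
# assuming the CloudWatch alarm threshold is 60%.
step_adjustment() {
  local cpu=$1
  local threshold=60
  local breach=$(( cpu - threshold ))
  if   (( breach < 0 ));  then echo 0   # alarm not breached: no scaling action
  elif (( breach < 20 )); then echo 1   # CPU 60-80%: add 1 instance
  else                         echo 2   # CPU >= 80%: add 2 instances
  fi
}

step_adjustment 70   # prints 1
step_adjustment 85   # prints 2
```

Note that AWS evaluates these bounds relative to the alarm threshold, with the lower bound inclusive and the upper bound exclusive, which is what the function reproduces.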

3. Scheduled Scaling

What it does:

Scales capacity based on a specific time schedule.

 Example:

•        Add 2 instances every weekday at 8 AM.

•        Remove 2 instances at 6 PM.

 Use case:

Workloads with predictable usage patterns.

# json
{
  "ScheduledActionName": "twtech-scale-up-morning",
  "StartTime": "2025-05-25T08:00:00Z",
  "Recurrence": "0 8 * * 1-5",
  "MinSize": 2,
  "MaxSize": 5,
  "DesiredCapacity": 4
}
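The same scheduled action can be created from the AWS CLI; the group name is illustrative, and the recurrence cron expression is evaluated in UTC by default:

```shell
# Create a recurring scheduled action (weekdays at 08:00)
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name twtech-ASG \
  --scheduled-action-name twtech-scale-up-morning \
  --recurrence "0 8 * * 1-5" \
  --min-size 2 \
  --max-size 5 \
  --desired-capacity 4
```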

 4. Simple Scaling (Legacy)

What it does:

Takes a single scaling action when an alarm is triggered, then waits for cooldown.

Use case:

Basic scaling scenarios (not recommended for complex workloads).

 5. Predictive Scaling

What it does:

Uses machine learning to forecast traffic and proactively scale.

 Use case:

Apps with repeating traffic patterns.

Enable via AWS Auto Scaling console or CLI with metrics like CPU or request count.
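A predictive scaling policy takes a configuration like the following minimal sketch; the metric pair and target value here are assumptions for illustration:

```json
{
  "PolicyType": "PredictiveScaling",
  "PredictiveScalingConfiguration": {
    "MetricSpecifications": [
      {
        "TargetValue": 50.0,
        "PredefinedMetricPairSpecification": {
          "PredefinedMetricType": "ASGCPUUtilization"
        }
      }
    ],
    "Mode": "ForecastAndScale"
  }
}
```

With "Mode": "ForecastOnly" instead, Auto Scaling produces forecasts without acting on them, which is a common way to validate predictions before enabling scaling.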

twtech Best Practices

•        Use Target Tracking for ease of setup.

•        Combine Step Scaling with CloudWatch alarms for detailed control.

•        Use Scheduled Scaling for predictable workloads.

•        Enable instance scale-in protection for important nodes.

•        Tune and monitor cooldown/warm-up periods to avoid flapping (rapid scale-in/scale-out loops).

Project: Hands-on

AWS Auto Scaling policy types: Dynamic scaling policies, Predictive scaling policies, and Scheduled actions.

How twtech creates a Dynamic scaling policy: twtech-dynamic-scaling-policy





To make the policy take effect, twtech has to edit the auto-scaling group: twtech-ASG

Set maximum capacity to: 3

Save changes: update

Connect to the instance using EC2 Instance Connect (the instance was created without a key pair).

How twtech stresses the EC2 instance to drive CPU utilization up to: 100%.

Install stress on Amazon Linux 2: run the following command first

sudo yum install stress -y

Then drive CPU utilization to 100% by keeping all CPU cores busy:

stress -c 4
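If the stress package is not available, a similar load can be generated with coreutils alone; this illustrative snippet spins up one busy loop per vCPU and stops them afterwards:

```shell
# Spin up 4 CPU-bound busy loops (roughly one per vCPU), then stop them.
pids=""
for i in 1 2 3 4; do
  yes > /dev/null &        # 'yes' loops as fast as it can, pinning a core
  pids="$pids $!"
done
sleep 2                    # keep the load brief here; run for minutes to trip the alarm
kill $pids                 # stop all busy loops
wait $pids 2>/dev/null || true
```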

From a different window: go to the Auto Scaling group, then the Monitoring tab, to see the metrics

Auto Scaling: Activity history (5)

When instances are stressed, the Dynamic scaling policy triggers the Auto Scaling group to automatically provision new instances.


More instances scale out due to stress on CPU utilization: twtech-ASG

How twtech also verifies automatic scaling from the console: twtech-ASG

How twtech also tracks auto scaling from the CloudWatch service, then on the menu tab: Alarms


There are two types of alarms created:

Alarm high, in red colour (scales out or adds instances)

And

Alarm low, in green colour (scales in or reduces instances)


What if the instances under the Auto Scaling group are terminated? New instances will automatically be provisioned by the Auto Scaling group.


Activity history: keeps a record of the activities performed by the Auto Scaling group
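The same history can be pulled from the AWS CLI (the group name is illustrative; requires AWS credentials):

```shell
# List recent scaling activities for the group
aws autoscaling describe-scaling-activities \
  --auto-scaling-group-name twtech-ASG \
  --max-items 5
```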

