How to Enable Autoscaling for a GKE Cluster

Learn how to enable autoscaling in Google Kubernetes Engine (GKE) for responsive workload management and cost optimization.

Patrick Londa
Author
Dec 8, 2022

Autoscaling is an automated node-provisioning process that scales your GKE clusters according to their workload needs. With autoscaling enabled, a GKE cluster adds nodes to its node pools to keep workloads available when demand is high and removes nodes to save on costs when demand is low.

You can control your cluster’s autoscaling by specifying a minimum and maximum number of nodes. You can also choose between the default balanced autoscaling profile and the optimize-utilization profile.

In this post, we’ll walk you through the basics of GKE cluster autoscaling and show you how to enable it for your node pools.

Understanding Your GKE Autoscaling Options

When you enable autoscaling, you can set guardrails and preferences:

Minimum and Maximum Nodes

One of the decisions you need to make is the minimum and maximum number of nodes, either per zone (minimum nodes, maximum nodes) or in total (total minimum nodes, total maximum nodes) across your node pools.
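To make the difference between per-zone and total limits concrete, here is a minimal arithmetic sketch. The helper name total_node_bounds is hypothetical (it is not part of gcloud), and the sketch assumes the per-zone limits apply equally to every zone the node pool spans.

```shell
# Sketch: per-zone autoscaling limits multiply across the zones of a node pool.
# total_node_bounds is a hypothetical helper, not a gcloud feature.
total_node_bounds() {
  local zones=$1 min_per_zone=$2 max_per_zone=$3
  echo "$(( zones * min_per_zone )) $(( zones * max_per_zone ))"
}

# A node pool spread over 3 zones with a per-zone minimum of 1 and maximum of 4:
total_node_bounds 3 1 4   # prints "3 12"
```

In other words, a per-zone maximum of 4 in a three-zone node pool allows up to 12 nodes in that pool.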

For a minimum, keep in mind that the cluster as a whole never scales down to zero nodes, because at least 1 node is needed to run the system Pods. An individual node pool can have a minimum of 0 as long as another node pool in the cluster can host those Pods.

When setting a maximum, you might want to consider the implications of a dramatic scale up. For example, if you scale beyond the IP address space you have allocated, you will receive an error and no longer be able to add new nodes. Consider these types of dependencies when selecting a maximum.
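As a rough way to sanity-check a maximum against your IP space, the sketch below estimates how many nodes a Pod IP range can support. It assumes a VPC-native cluster where every node is assigned one fixed-size Pod CIDR block (a /24 when the default of 110 Pods per node is used). The helper max_nodes_for_pod_range is hypothetical, and your own ranges may differ.

```shell
# Sketch: upper bound on node count imposed by a cluster's Pod IP range.
# Assumes each node receives one fixed-size Pod CIDR block (a /24 by default).
# max_nodes_for_pod_range is a hypothetical helper, not a gcloud feature.
max_nodes_for_pod_range() {
  local pod_range_prefix=$1        # e.g. 20 for a /20 Pod range
  local per_node_prefix=${2:-24}   # e.g. 24 for a /24 block per node
  echo $(( 1 << (per_node_prefix - pod_range_prefix) ))
}

# A /20 Pod range carved into /24 blocks leaves room for 16 nodes:
max_nodes_for_pod_range 20   # prints 16
```

The primary range of the node subnet sets its own, separate limit, so the lower of the two numbers is the one to compare against your maximum.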

Balanced vs. Optimize-Utilization

There are two types of autoscaling profiles. The default profile is balanced, which means it scales up and down with a balance between availability of resources and node utilization. For example, balanced autoscaling allows for more nodes with lower utilization so that they are available if workloads increase.

The optimize-utilization autoscaling profile, by comparison, prioritizes concentrating utilization in fewer nodes, which enables the removal of underutilized nodes. The result is faster scale-downs and lower resource costs. If you choose this profile, you may experience performance delays when new workloads require new resources to be provisioned. Depending on your performance requirements, optimize-utilization may be a useful way to lower your GKE operating costs.
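The profile is a cluster-level setting, so it can be changed on its own. Below is a minimal sketch of a wrapper that does this with the --autoscaling-profile flag. The function name set_autoscaling_profile is hypothetical, and the cluster name and region in the usage comment are placeholders.

```shell
# Sketch: switch a cluster's autoscaling profile.
# set_autoscaling_profile is a hypothetical wrapper around gcloud.
set_autoscaling_profile() {
  local cluster=$1 profile=$2 region=$3
  # Reject anything other than the two profiles described above.
  case "$profile" in
    balanced|optimize-utilization) ;;
    *) echo "unknown profile: $profile" >&2; return 1 ;;
  esac
  gcloud container clusters update "$cluster" \
    --autoscaling-profile="$profile" \
    --region="$region"
}

# Usage (needs an authenticated gcloud and an existing cluster):
# set_autoscaling_profile demo-1 optimize-utilization us-central1
```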

Configuring Autoscaling for an Existing Node Pool

If you want to enable autoscaling, you can start by updating your existing node pools using GCP Console or the GCP CLI.

Using the GCP Console:

1. In the Google Cloud console, navigate to the Google Kubernetes Engine page.

2. Select the cluster you want to update from the displayed cluster list, and go to the Nodes tab.

3. Under Node Pools, select the node pool that you want to update and click Edit.

4. Under Size, check Enable autoscaling.

5. Specify values for Minimum number of nodes and Maximum number of nodes, and click Save.

Using the gcloud CLI:

You can use the following command to enable autoscaling for an existing node pool:

gcloud container clusters update CLUSTER_NAME \
    --enable-autoscaling \
    --autoscaling-profile=PROFILE \
    --node-pool=POOL_NAME \
    --min-nodes=MIN_NODES \
    --max-nodes=MAX_NODES \
    --region=COMPUTE_REGION

Plug in your details for the cluster, node pool, and region. A cluster created with default settings has a single node pool named default-pool. For a regional cluster, pass your Compute Engine region with --region. For a zonal cluster, pass the zone with --zone instead.

The --enable-autoscaling flag turns on autoscaling. You can customize your configuration with --autoscaling-profile (balanced or optimize-utilization), the per-zone limits --min-nodes and --max-nodes, or the total limits --total-min-nodes and --total-max-nodes.
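If you would rather cap the size of the whole node pool than set per-zone limits, the total flags take the place of --min-nodes and --max-nodes. The sketch below wraps that variant in a hypothetical function, enable_autoscaling_total, and every argument is a placeholder you supply.

```shell
# Sketch: enable autoscaling with limits on the node pool's total size
# instead of per-zone limits. enable_autoscaling_total is a hypothetical wrapper.
enable_autoscaling_total() {
  local cluster=$1 pool=$2 total_min=$3 total_max=$4 region=$5
  gcloud container clusters update "$cluster" \
    --enable-autoscaling \
    --node-pool="$pool" \
    --total-min-nodes="$total_min" \
    --total-max-nodes="$total_max" \
    --region="$region"
}

# Usage (needs an authenticated gcloud and an existing cluster):
# enable_autoscaling_total demo-1 pool-1 3 12 us-central1
```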

Here’s an example of enabling autoscaling with an optimize-utilization profile for the pool-1 node pool of the demo-1 cluster:

gcloud container clusters update demo-1 \
    --enable-autoscaling \
    --autoscaling-profile=optimize-utilization \
    --node-pool=pool-1 \
    --min-nodes=1 \
    --max-nodes=4 \
    --region=us-central1
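After running an update like this, you may want to confirm that the setting took effect. One way, sketched here, is to read the node pool's autoscaling fields with gcloud container node-pools describe. The wrapper name show_autoscaling is hypothetical, and the field names assume the standard node pool resource as gcloud returns it.

```shell
# Sketch: print a node pool's autoscaling settings (enabled flag, min, max).
# show_autoscaling is a hypothetical wrapper around gcloud.
show_autoscaling() {
  local cluster=$1 pool=$2 region=$3
  gcloud container node-pools describe "$pool" \
    --cluster="$cluster" \
    --region="$region" \
    --format="value(autoscaling.enabled,autoscaling.minNodeCount,autoscaling.maxNodeCount)"
}

# Usage (needs an authenticated gcloud and an existing cluster):
# show_autoscaling demo-1 pool-1 us-central1
```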

Enabling Autoscaling When Creating a New GKE Cluster or Node Pool

If you are creating new clusters or node pools, you can enable GKE autoscaling by using settings in the GCP Console or flags in the gcloud CLI:

Using the GCP Console:

1. In the Google Cloud console, go to the Google Kubernetes Engine page.

2. To set up a new cluster, click Create. To create a new node pool, select an existing cluster and click Add Node Pool.

3. Specify details for your cluster or node pool. For a new cluster, click default-pool under Node Pools from your navigation pane.

4. Next, you need to select the Enable autoscaling checkbox. For node pools, you’ll find it under the Size section. 

5. Modify the values for Minimum number of nodes and Maximum number of nodes according to your requirements, and click Create.

Using the gcloud CLI:

When you are creating new clusters or new node pools, you can ensure that they have autoscaling enabled by including the --enable-autoscaling flag and specifying --min-nodes and --max-nodes values.

Here’s an example of creating a new cluster with this command:

gcloud container clusters create my-cluster --enable-autoscaling \
    --num-nodes=30 \
    --min-nodes=15 --max-nodes=50 \
    --region=us-central1

Here’s an example of creating a new node pool with this command:

gcloud container node-pools create node-pool-1 \
    --cluster=sample-cluster \
    --num-nodes=5 \
    --enable-autoscaling \
    --min-nodes=5 --max-nodes=15

By updating your existing node pools and creating new clusters and node pools with autoscaling enabled, you can ensure that your clusters adapt automatically to changing workloads.
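To find node pools that still need the update, you could loop over your clusters with gcloud. The sketch below is one possible approach, not an official tool. The function name list_pools_without_autoscaling is hypothetical. It assumes a recent gcloud release that accepts the --location flag, and it assumes gcloud prints an enabled autoscaler as True.

```shell
# Sketch: list node pools without autoscaling across all clusters in the
# current gcloud project. Prints one "cluster/pool" line per such node pool.
# list_pools_without_autoscaling is a hypothetical helper.
list_pools_without_autoscaling() {
  gcloud container clusters list --format="value(name,location)" |
  while read -r cluster location; do
    # </dev/null keeps the inner command from consuming the outer loop's input.
    gcloud container node-pools list \
      --cluster="$cluster" --location="$location" \
      --format="value(name,autoscaling.enabled)" </dev/null |
    while read -r pool enabled; do
      # Assumption: an enabled autoscaler is reported as "True".
      [ "$enabled" = "True" ] || echo "$cluster/$pool"
    done
  done
}

# Usage (needs an authenticated gcloud with a default project set):
# list_pools_without_autoscaling
```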

Checking Whether Autoscaling is Enabled with Blink

Autoscaling can make your GKE clusters more effective and efficient. If you want to make it a standard that your organization's GKE clusters have autoscaling enabled, then you can manually enable it using the steps above. Unfortunately, that requires many manual updates and doesn’t ensure that future clusters will also have autoscaling enabled.

With Blink, you can use no-code automations to quickly check if you have any GKE clusters without autoscaling enabled. You can then send a notification to Slack on a regular basis if there are any clusters missing this setting. With that information only a click away, you can make adjustments faster.

Get started with Blink and better manage your GKE clusters today.