Published on March 03, 2025
Kubernetes has long been able to scale the number of workload replicas easily using Horizontal Pod Autoscaling (HPA), but one challenge has always been adjusting CPU and memory resources for Deployments, StatefulSets, and DaemonSets without needing to restart them. That all changed with the introduction of In-Place Vertical Pod Scaling, which was added starting in Kubernetes 1.27.
This new feature allows you to adjust CPU and memory resources (both requests and limits) in running pods without having to recreate the Pod, which provides a smoother, less disruptive way to dynamically adjust resources. In this post, we’ll go through how to enable this feature, demonstrate changes to the Pod spec, explain the new entries in Pod status, and discuss how this will affect open-source projects like the Vertical Pod Autoscaler (VPA).
What is “In-Place Vertical Pod Scaling”?
In earlier versions of Kubernetes, if you needed to adjust the resource allocation of a Pod, it would need to be terminated and recreated. This is because, even though Docker and other container runtimes allow the resource configuration to be updated dynamically at runtime, the Pod spec marked the resources field as immutable: once set, it could not be modified. This meant that the only way to “modify” the resources of a Pod was to delete the old one and create a new one with the updated resource values.
In-Place Vertical Pod Scaling promises to change this, by adding new fields that allow for the resources to be modified at runtime, and providing new status fields that indicate the progress in performing the change. As this is an alpha-level feature, with changes still being made, a feature gate needs to be enabled to try it out.
The concept of “in-place” scaling for Pods goes against the “treat workloads as cattle, not pets” ethos that Kubernetes was built upon. However, there are some special use-cases, such as when using any kind of stateful or long-running workload, where exceptions exist and justify the existence of features like this.
How to enable In-Place Vertical Pod Scaling
Starting in Kubernetes 1.27, and as of this writing, to test out the In-Place Vertical Pod Scaling feature you must enable a feature gate, which needs to be set to true before it can be used in the cluster. To do this you’ll need to:
1. Update API Server, Scheduler, and Controller Manager Flags
In your Kubernetes cluster configuration, you’ll need to enable the InPlacePodVerticalScaling feature gate. You can do this by modifying the kube-apiserver, kube-scheduler, and kube-controller-manager flags.
In each of these components, add the flag:
--feature-gates=InPlacePodVerticalScaling=true
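If you just want a local playground for this, one option is a throwaway kind cluster, since kind lets you set feature gates for all components in its cluster config. Here is a minimal sketch (the node layout and file name are just examples):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  # propagated to the API server, scheduler, controller manager, and kubelets
  InPlacePodVerticalScaling: true
nodes:
- role: control-plane
- role: worker

Save this as kind-config.yaml and create the cluster with kind create cluster --config kind-config.yaml.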
2. Update the Kubernetes version on all nodes
Ensure that your nodes are running a Kubernetes version that supports this feature (1.27 or higher) and that the Kubelet’s feature gates are updated to enable In-Place Vertical Pod Scaling.
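If you manage the kubelet configuration file directly (for example with kubeadm), the gate can also be set through the KubeletConfiguration rather than a command-line flag. A minimal sketch of the relevant part of that file:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  InPlacePodVerticalScaling: true

After updating the config, restart the kubelet on each node so the gate takes effect.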
Certain cloud providers, such as GKE, also provide a way to test this feature by creating a cluster with all alpha features turned on. Keep in mind that this may affect the stability of the cluster, so caution should be taken before making this change.
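As a rough sketch of what that looks like on GKE (the cluster name and zone are placeholders, and alpha clusters come with restrictions such as needing auto-repair and auto-upgrade disabled):

gcloud container clusters create alpha-test \
  --zone us-central1-a \
  --enable-kubernetes-alpha \
  --no-enable-autorepair \
  --no-enable-autoupgrade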
What does it look like in the Pod?
Once the feature gate is enabled, modifying resource requests or limits on a running Pod becomes possible. A container’s resource requests and limits will be mutable for CPU and memory resources. With the feature, these fields represent the desired CPU and memory resource requests and limits for the container. There is also now a resizePolicy array with two required fields:
- resourceName: Specifies which resource this policy applies to. Currently only "cpu" and "memory" are supported.
- restartPolicy: Defines whether a container restart is required when this resource is modified. This field can have two possible values: NotRequired, which means that the container does not need to be restarted when this resource is changed, and RestartContainer, where the container must be restarted to apply changes to this resource.
By default, if no resizePolicy is specified for a resource, Kubernetes treats it as if restartPolicy: NotRequired is set. An example of all these fields in the Pod spec is shown below:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        memory: "100Mi"
        cpu: "100m"
      limits:
        memory: "200Mi"
        cpu: "200m"
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: RestartContainer
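To actually trigger an in-place resize in the 1.27 alpha, you patch the running Pod’s spec directly. Here is a minimal sketch against the nginx Pod above (the new values are arbitrary examples):

kubectl patch pod nginx --patch \
  '{"spec":{"containers":[{"name":"nginx","resources":{"requests":{"cpu":"200m"},"limits":{"cpu":"400m"}}}]}}'

If the node can accommodate the change, it is applied without the Pod being recreated, and the progress is reported through the status fields described next.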
The status field of a Pod now has more information as well. As of 1.27, the resize field tracks the progress of resize operations. It can have the following values:
- Proposed: The resources field was modified to update the desired resources, but the Kubelet has not yet started the process of resizing.
- InProgress: The Kubelet has accepted the resize request and is in the process of applying it to the Pod’s containers.
- Deferred: The requested resize cannot be completed at this moment. The Kubelet will continue to retry the resize, and it may be granted when other Pods are removed and node resources are freed up.
- Infeasible: The requested resize cannot be performed on the container, such as when the resize exceeds the maximum resources available on the node.
- "": An empty or unset value indicates that the last resize operation was completed.
However, this will change in 1.33: a change merged only three weeks ago as of writing replaces the resize status with two new Pod conditions, PodResizePending and PodResizing. For now, we’ll continue showing the pre-1.33 schema.
The allocatedResources field in containerStatuses of the Pod’s status reflects the resources currently allocated to the Pod’s containers as reported by the container runtime, if the container is running. For a non-running container, these are the resources allocated for the container for when it starts:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  # Pod spec as above
status:
  resize: Proposed
  containerStatuses:
  - name: nginx
    allocatedResources:
      cpu: "200m"
      memory: "100Mi"
    # other status fields
Finally, a new Pod condition type Resizing was also introduced, which indicates that a Pod’s resources are being modified in-place. (Note that two more conditions will be present when 1.33 is released, as explained earlier in this post.) This condition looks like so in the Pod’s status:
status:
  conditions:
  - type: Resizing
    status: "True"
    lastTransitionTime: "2023-04-01T12:00:00Z"
    reason: ResizeStarted
    message: "Pod resources are being modified"
Potential Impact on the Vertical Pod Autoscaler
The Vertical Pod Autoscaler is an important scaling tool that helps automatically adjust resource requests and limits for Pods based on usage patterns. It observes current resource usage (leveraging information from the Kubernetes Metrics Server), suggests a resource value with some buffer room for unexpected spikes, and applies that value to the Pod. However, the way it does this is by modifying the resource values in the Pod’s controlling object (think of a Deployment or StatefulSet) and then evicting the existing Pods so that new Pods with the updated resource values can replace them. For stateful workloads such as a single database, or long-running workloads with in-memory cached information, this has always been a particular source of frustration, because you could get locked into failure scenarios that you could not scale your way out of. Imagine these situations:
- A database that runs on a Pod has a high startup CPU spike that is dependent on the number of rows it has to process. However, after startup the process uses very little CPU. The VPA sees this and readjusts the CPU request/limits to be lower, and evicts the Pod to apply these values. The new Pod comes up with lower resource values, and due to its startup sequence, it doesn’t have enough resources to start. Oops. This DB is also stateful so you can’t have more than one Pod running at a time, and now you have downtime. Double oops.
- A long-running query that’s being executed on a Pod has been pegging its CPU usage for a few minutes, and the VPA sees this and increases the CPU limit. Now your job can use more CPU to finish faster, but the Pod needs to be recreated to see this new value, which kills the query. Triple oops.
These are just a few situations where scaling becomes a problem for stability, and the workarounds are suboptimal given the limitations of the VPA. For the first example, the Pod now needs to request a minimum amount of CPU that never gets used outside of its startup procedure, which contributes to inefficient workload resource allocation; for the second example, autoscaling is just completely disabled.
However, with In-Place Vertical Pod Scaling, we can do these kinds of adjustments on-the-fly without restarting the Pod, which avoids the situations previously mentioned. And it seems that there are PRs on the way to add this feature to the VPA. This promises to make the situations previously described a thing of the past!
Final thoughts
Kubernetes 1.27’s In-Place Vertical Pod Scaling feature is a welcome improvement to resource management, offering the ability to scale Pods without needing to recreate them. This is particularly useful for workloads with changing resource demands, allowing more flexibility and less downtime. With the potential integration into the Vertical Pod Autoscaler, Kubernetes is becoming even more powerful for managing dynamic workloads with minimal disruption.
Subscribe (yes, we still ❤️ RSS) or join our mailing list below to see more blog posts like this!