
Karpenter graduates to beta

Introduction

Karpenter is a Kubernetes node lifecycle manager created by AWS, initially released in 2021 with the goal of minimizing the node configuration that cluster operators need to manage. Over the past year, it has seen tremendous growth, reaching over 4900 stars on GitHub and merging code from more than 200 contributors. It is in the process of being donated to the Cloud Native Computing Foundation (CNCF) as part of the Kubernetes Autoscaling Special Interest Group (SIG).

With this growth comes a growing need for Karpenter’s APIs to mature and offer more stringent stability guarantees to users who don’t want to deal with the breaking changes the project made in its alpha state. The graduation to beta marks a significant milestone in the project’s evolution. With this transition, customers benefit from the increased maturity and API stability that the beta version offers. It also marks a commitment from us to prioritize backward compatibility, which means customers can confidently adopt new features and enhancements without worrying about disruptive changes down the line. This release, like previous releases, incorporates feedback from the open-source community.

The API changes are being rolled out as part of the Karpenter v0.32.0 release. Existing deployments need to be upgraded to this version, following the migration path outlined in this post and further detailed in the Karpenter upgrade guide.

Existing alpha APIs are now deprecated but remain available in this single version. Starting with release v0.33.0, Karpenter will only support its v1beta1 APIs.

Karpenter APIs follow a maturity progression of alpha → beta → stable.

The graduation from alpha to beta required significant changes to the APIs, which are highlighted in this post. We don’t anticipate the graduation from beta to stable to require the same level of changes.

If you’re curious about the Kubernetes APIs graduation process, then please see this post.

What is changing

In the journey to a stable v1, we’ve made significant changes to our APIs from alpha to beta to improve ease of use, dropping areas of the APIs that commonly gave users problems. One of these areas was naming: we saw confusion around the use of the word provisioner (an overloaded term in the realm of storage) and generally wanted to reduce the number of concepts that users have to reason about.

With this release, Karpenter deprecates the Provisioner, AWSNodeTemplate, and Machine APIs, and introduces NodePool, EC2NodeClass, and NodeClaim. We’ve taken a holistic view and streamlined the APIs around the single concept of a Node.

Walkthrough

API group and kind naming

The v1beta1 version introduces the following new APIs while deprecating the existing ones:

  • karpenter.sh/Provisioner becomes karpenter.sh/NodePool
  • karpenter.sh/Machine becomes karpenter.sh/NodeClaim
  • karpenter.k8s.aws/AWSNodeTemplate becomes karpenter.k8s.aws/EC2NodeClass

Each of these naming changes comes with schema changes that need to be considered as you update to the latest version of Karpenter. Let’s look at each change and what the new API definition looks like.

v1alpha5/Provisioner → v1beta1/NodePool

NodePool serves as the successor to Provisioner, exposing configuration-based parameters that impact the compatibility between Nodes and Pods during scheduling (such as requirements, taints, and labels). It also encompasses behavior-based settings for fine-tuning Karpenter’s scheduling and deprovisioning decision-making.

A pool resolves to a mix of instance types and sizes, while still enforcing limits on how workloads request resources. It facilitates the grouping of provisioning and deprovisioning behavior. Importantly, a pool shouldn’t have any cloud-specific configurations to maintain a portable configuration.

In Karpenter v1beta1, all non-behavioral fields are encapsulated within the NodePool template field. NodePools template NodeClaims, which are then orchestrated by the Karpenter controller. This mirrors the concept of Deployments, where Pods are templated and orchestrated by the deployment controller.

You can read more about NodePools in the documentation. An example NodePool looks something like this:

apiVersion: karpenter.sh/v1beta1
kind: NodePool
...
spec:
  template:
    metadata:
      annotations:
        custom-annotation: custom-value
      labels:
        team: team-a
        custom-label: custom-value
    spec:
      nodeClassRef:
        name: default
      requirements:
      - key: karpenter.k8s.aws/instance-category
        operator: In
        values: ["c", "m", "r"]
      ...
      kubelet:
        systemReserved:
          cpu: 100m
          memory: 100Mi
          ephemeral-storage: 1Gi
        maxPods: 20
  disruption:
    expireAfter: 360h
    consolidationPolicy: WhenUnderutilized

When you look at the example specification, you’ll notice a new section called disruption. This groups the previous settings for consolidation, expiration, and empty nodes (ttlSecondsAfterEmpty, ttlSecondsUntilExpired, and consolidation.enabled).

Karpenter sets defaults for the disruption configuration if it isn’t specified when applying the NodePool manifest. The default values are shown below, and you can read more about the behavior of these fields in the documentation.

Field                                 Default
spec.disruption.consolidationPolicy   WhenUnderutilized
spec.disruption.expireAfter           720h
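
In other words, omitting the disruption block is equivalent to specifying the defaults explicitly, as in this minimal sketch:

disruption:
  consolidationPolicy: WhenUnderutilized
  expireAfter: 720h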

v1alpha1/AWSNodeTemplate → v1beta1/EC2NodeClass

EC2NodeClass serves as the successor to AWSNodeTemplate, exposing cloud provider-specific fields that affect the launch and bootstrap process for a Node, including the Amazon Machine Image (AMI), security groups, and subnets you want to use, as well as details about block storage, user data, and instance metadata settings.

The Karpenter spec.instanceProfile field has been removed from the EC2NodeClass in favor of the spec.role field. Karpenter now auto-generates the instance profile in your EC2NodeClass given the role that you specify.

The spec.launchTemplateName field for referencing unmanaged launch templates within Karpenter, which was already deprecated, has been removed. If you are still using it, then you need to migrate to Karpenter-managed launch templates using EC2NodeClass.

You can read more about EC2NodeClass in the documentation. An example EC2NodeClass looks something like this:

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: Bottlerocket
  role: KarpenterNodeRole-karpenter-demo
  subnetSelectorTerms:
  - tags:
      karpenter.sh/discovery: karpenter-demo
  securityGroupSelectorTerms:
  - tags:
      karpenter.sh/discovery: karpenter-demo
  tags:
    test-tag: test-value

v1alpha5/Machine → v1beta1/NodeClaim

In Karpenter v0.28.0, a new type called Machine was added. This enabled several node provisioning improvements that let native Kubernetes controllers join nodes to the cluster while still letting Karpenter manage and track the node. If you’re on a version of Karpenter before v0.28.0, then you won’t have this resource type.

With the v0.32.0 release, this has changed to NodeClaim. NodeClaims aren’t intended to be created by cluster operators; instead, they’re created and deleted by Karpenter. You shouldn’t have to make any changes for NodeClaims to work, but if you’re troubleshooting a node in a cluster, this is a great place to see the lifecycle and health of a node as Karpenter manages it.
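
For example, once the v1beta1 CRDs are installed you can inspect NodeClaims with kubectl; the NodeClaim name below is illustrative:

# List the NodeClaims that Karpenter has created in the cluster
kubectl get nodeclaims
# Inspect a single NodeClaim to see its conditions and events
kubectl describe nodeclaim default-abc12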

Annotation changes

Karpenter v1beta1 deprecates the karpenter.sh/do-not-evict and karpenter.sh/do-not-consolidate annotations, unifying them under the single annotation karpenter.sh/do-not-disrupt. It can be applied to both Pods and Nodes and prevents Karpenter from disrupting the node or evicting the Pod.
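
A minimal sketch of opting a Pod out of disruption looks like this; the Pod name and image are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: important-job        # illustrative name
  annotations:
    karpenter.sh/do-not-disrupt: "true"
spec:
  containers:
  - name: main
    image: public.ecr.aws/docker/library/busybox:1.36   # illustrative image
    command: ["sleep", "3600"]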

More flexible selectors for AMIs, subnets, and security groups in NodeClass

The alpha selector settings were somewhat limited in their capacity to identify and use different settings for the nodes being provisioned. The existing behavior applied AND logic across all of a selector’s fields, which made it harder to match settings across various clusters and Regions. To address this, we’ve extended the selectors so that you can specify multiple terms. These terms are combined using OR logic: a resource matches as soon as any one term matches, while fields within a single term are still ANDed.

An example for matching an AMI with name my-name1 or my-name2, and owner 123456789 or amazon, would look like this:

amiSelectorTerms:
- name: my-name1
  owner: 123456789
- name: my-name2
  owner: 123456789
- name: my-name1
  owner: amazon
- name: my-name2
  owner: amazon

Similar settings can be made for subnetSelectorTerms and securityGroupSelectorTerms, which you can read more about in the Karpenter documentation.

securityGroupSelectorTerms:
- id: abc-123
  name: default-security-group # Not the same as the name tag
  tags:
    key: value
# Selector Terms are ORed
- id: abc-123
  name: custom-security-group # Not the same as the name tag
  tags:
    key: value

Drift enabled by default

Starting from the next release (v0.33.0), the drift feature will be enabled by default. If you don’t specify the Drift featureGate, the feature is assumed to be enabled. You can disable the drift feature by specifying --feature-gates DriftEnabled=false in the command line arguments to Karpenter. This feature gate is expected to be fully dropped when core APIs (NodePool, NodeClaim) are bumped to v1.

Migration path

Update Karpenter controller AWS IAM role

The Karpenter controller uses an AWS Identity and Access Management (AWS IAM) role to grant the permissions required to launch and operate Amazon Elastic Compute Cloud (Amazon EC2) instances in your AWS account. As part of the upgrade, create a new permission policy with the following changes:

  1. Scope the ec2:RunInstances, ec2:CreateFleet, and ec2:CreateLaunchTemplate permissions down to the tag-based constraint karpenter.sh/nodepool instead of the previous tag key karpenter.sh/provisioner-name.
  2. Grant permissions for the actions iam:CreateInstanceProfile, iam:AddRoleToInstanceProfile, iam:RemoveRoleFromInstanceProfile, iam:DeleteInstanceProfile, and iam:GetInstanceProfile. All of these permissions (with the exception of iam:GetInstanceProfile) are constrained by a tag-based policy that ensures the controller only operates on instance profiles that it was responsible for creating. These permissions are needed to support Karpenter-managed instance profiles.

Once the migration is complete, and you’ve rolled out the new nodes as described in the following sections, you can safely remove the previous permission policy.

An example of the permission policy is available in the Karpenter GitHub repository, and it is distributed as part of the project’s getting started AWS CloudFormation template.

API migration

To transition from the alpha to the new v1beta1 APIs, you should first install the new v1beta1 Custom Resource Definitions (CRDs). Subsequently, you need to generate the beta equivalent of each alpha API for both Provisioners and AWSNodeTemplates. It’s worth noting that the migration from Machine to NodeClaim is managed by Karpenter as you transition your CustomResources from Provisioners to NodePools and remains seamless for users.

We’re happy to introduce karpenter-convert, a command line utility designed to streamline the creation of NodePool and EC2NodeClass objects. The following steps show how to use it:

  1. Install the command line utility: go install github.com/aws/karpenter/tools/karpenter-convert/cmd/karpenter-convert@latest
  2. Migrate each provisioner into a NodePool: karpenter-convert -f provisioner.yaml > nodepool.yaml
  3. Migrate each AWSNodeTemplate into an EC2NodeClass: karpenter-convert -f nodetemplate.yaml > nodeclass.yaml
  4. For each EC2NodeClass generated by the tool, you need to manually specify the AWS role. The tool leaves a placeholder $KARPENTER_NODE_ROLE, which you need to replace with your actual role name (see the substitution sketch after this list).
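
As a sketch of step 4, the placeholder can be filled in with a quick substitution; the role name below reuses KarpenterNodeRole-karpenter-demo from the earlier EC2NodeClass example, and the command assumes GNU sed:

# Replace the $KARPENTER_NODE_ROLE placeholder left by karpenter-convert with your node role name
sed -i 's/$KARPENTER_NODE_ROLE/KarpenterNodeRole-karpenter-demo/g' nodeclass.yaml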

For each Provisioner resource, you need to decide whether you want to roll nodes one at a time or roll all of a Provisioner’s nodes at once. Detailed step-by-step guidance is provided in the following sections.

Periodic rolling with drift

With drift enabled, for each Provisioner in your cluster, perform the following actions:

  1. Migrate your alpha CRDs to v1beta1
  2. Add a taint to the old Provisioner, such as karpenter.sh/legacy=true:NoSchedule (see the example patch after this list)
  3. Karpenter drift marks all machines/nodes owned by that Provisioner as drifted
  4. Karpenter drift launches replacements for the nodes in the new NodePool resource
    1. Currently, Karpenter only supports rolling one node at a time, which means that it may take some time for Karpenter to completely roll all nodes under a single Provisioner
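
A sketch of step 2, assuming a Provisioner named default; note that a JSON merge patch replaces any existing spec.taints:

# Taint the legacy Provisioner so new pods avoid its nodes and Karpenter marks them as drifted
kubectl patch provisioner default --type merge -p \
  '{"spec":{"taints":[{"key":"karpenter.sh/legacy","value":"true","effect":"NoSchedule"}]}}'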

Forced deletion

For each Provisioner in your cluster, perform the following actions:

  1. Create a NodePool/EC2NodeClass in your cluster that is the v1beta1 equivalent of the v1alpha5 Provisioner/AWSNodeTemplate
  2. Delete the old Provisioner with kubectl delete provisioner <provisioner-name> --cascade=foreground (see the sketch after this list)
    1. Karpenter deletes each Node that is owned by the Provisioner, draining all nodes simultaneously and launching nodes for the newly pending pods as soon as the Nodes enter a draining state
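
A sketch of these steps, using the manifests generated by karpenter-convert and an assumed Provisioner named default:

# Apply the v1beta1 equivalents of the old resources
kubectl apply -f nodepool.yaml -f nodeclass.yaml

# Delete the old Provisioner; Karpenter drains its nodes and launches replacements
kubectl delete provisioner default --cascade=foreground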

Manual rolling

For each Provisioner in your cluster, perform the following actions:

  1. Create a NodePool/EC2NodeClass in your cluster that is the v1beta1 equivalent of the v1alpha5 Provisioner/AWSNodeTemplate
  2. Add a taint to the old Provisioner such as karpenter.sh/legacy=true:NoSchedule
  3. Delete each node owned by the Provisioner one at a time by running kubectl delete node <node-name> (see the sketch after this list)
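
A sketch for step 3, assuming a Provisioner named default; nodes launched by an alpha Provisioner carry the karpenter.sh/provisioner-name label:

# List the nodes owned by the old Provisioner
kubectl get nodes -l karpenter.sh/provisioner-name=default

# Delete them one at a time, waiting for replacement capacity to come up between deletions
kubectl delete node <node-name>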

Conclusion

In this post, we showed you the modifications introduced by the new APIs and provided insight into the reasoning behind these changes, which have been shaped by feedback from the community. We’re thrilled to witness the growing maturity of the Karpenter project. We anticipate that the majority of these changes will eventually move to the stable v1 API, which enables a broader user base to take full advantage of Karpenter’s capabilities in workload-native node provisioning.

There are some other deprecations and changes that we didn’t cover in this post. Please head to the Karpenter upgrade guide for a comprehensive migration guideline.

Before you upgrade Karpenter to v0.32.0, we recommend reading the full release notes. If you have any questions, then please feel free to reach out in the Kubernetes Slack #karpenter channel or on GitHub, where we welcome feedback that helps us prioritize and develop new features.
