Amazon Web Services
Posted July 13, 2023

Introduction

In November 2021, AWS introduced Karpenter, an open-source, high-performance Kubernetes cluster autoscaler licensed under the Apache License 2.0. Karpenter helps improve your application availability and cluster efficiency by rapidly launching right-sized compute resources in response to changing application load. Since its release, we’ve seen a growing number of customers migrate from Kubernetes Cluster Autoscaler to Karpenter. However, for customers running a heterogeneous Amazon Elastic Kubernetes Service (Amazon EKS) cluster with Windows node groups, the lack of Windows node support in Karpenter was a showstopper, until now.

The OSS community did a great job starting development on Windows node group support in Karpenter. The AWS team took it a step further by reviewing the proposed design, adding enhancements to improve the customer experience, and integrating it with our internal continuous integration (CI) process.

When Karpenter is installed in your cluster, it observes the aggregate resource requests of unscheduled pods, launches new nodes when additional capacity is needed, and deprovisions nodes when that capacity is no longer needed. By doing this, Karpenter reduces the scheduling latencies and infrastructure costs of your cluster.

Figure 1: Karpenter high-level scheduling

In this post, we focus on scaling Windows Server 2019 and Windows Server 2022 nodes out and in using Karpenter for Amazon EKS. To learn more about Karpenter architecture and components, visit the Karpenter website.

Prerequisites

Ensure you are running eksctl commands with an AWS Identity and Access Management (AWS IAM) profile that has permissions to create and manage Amazon EKS clusters. This AWS IAM security principal is used for the AWS Command Line Interface (AWS CLI) configuration in the Getting Started section referenced below.

Ensure you are using eksctl v0.124.0 or higher to operate Karpenter.
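One way to check the eksctl version requirement is a small shell comparison. This is a minimal sketch rather than part of the original walkthrough: the helper name version_ge and the variable REQUIRED_EKSCTL_VERSION are hypothetical, and the comparison assumes your sort command supports the -V (version sort) option.

```shell
# Minimum eksctl version stated in the prerequisites.
REQUIRED_EKSCTL_VERSION="0.124.0"

# version_ge A B: succeeds when version A is greater than or equal to version B.
# Assumption: `sort -V` (version sort) is available on this system.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v eksctl >/dev/null 2>&1; then
  # `eksctl version` prints the installed version; strip a leading "v" if present.
  installed="$(eksctl version)"
  installed="${installed#v}"
  if version_ge "${installed}" "${REQUIRED_EKSCTL_VERSION}"; then
    echo "eksctl ${installed} meets the minimum ${REQUIRED_EKSCTL_VERSION}"
  else
    echo "eksctl ${installed} is older than ${REQUIRED_EKSCTL_VERSION}; please upgrade"
  fi
else
  echo "eksctl not found on PATH"
fi
```

If eksctl is not installed yet, the script only reports that it is missing; the installation pointers continue in the remaining prerequisites.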
Follow the Getting Started section in the Amazon EKS documentation to install the AWS CLI, kubectl, and eksctl on your development machine. Alternatively, you can use AWS Cloud9 or AWS CloudShell to handle deployment and maintenance tasks.

Solution overview

1. Create OS variables to be used throughout the post.
2. Deploy Karpenter service requirements.
3. Create an Amazon EKS cluster with the necessary iamIdentityMappings for Karpenter.
4. Enable Amazon EKS Windows support.
5. Install Karpenter with Helm.
6. Create Karpenter provisioner and NodeTemplate.
7. Test Karpenter for Windows – scale out.
8. Test Karpenter for Windows – scale in.
9. Clean up test resources.

Walkthrough

1. Create OS variables to be used throughout the post

export KARPENTER_VERSION=v0-c990a2d9fb10c1bfeffd5c6af64bf8575536d67e
export AWS_PARTITION="aws"
export CLUSTER_NAME="windows-karpenter-demo"
export AWS_DEFAULT_REGION="us-west-2"
export AWS_ACCOUNT_ID="$(aws sts get-caller-identity --query Account --output text)"
export TEMPOUT=$(mktemp)

2. Create Karpenter service requirements

Karpenter integrates directly with the Amazon Elastic Compute Cloud (Amazon EC2) API endpoint to take specific actions based on events such as Spot interruptions or instance state changes. The following command uses AWS CloudFormation to deploy the necessary AWS resources, such as the Amazon SQS queue and the Amazon EventBridge rules that deliver those events to it.

curl -fsSL https://karpenter.sh/v0.29/getting-started/getting-started-with-karpenter/cloudformation.yaml > "${TEMPOUT}" \
&& aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file "${TEMPOUT}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

Upon successful execution of the AWS CloudFormation template, you’ll see the following output:

Waiting for changeset to be created..
Waiting for stack create/update to complete
Successfully created/updated stack - Karpenter-windows-karpenter-demo

3. Create an Amazon EKS cluster with the necessary iamIdentityMappings for Karpenter

Next, we deploy a temporary Amazon EKS cluster using eksctl in order to test Karpenter integration with Windows. The configuration creates the AWS IAM role for the Karpenter service account and adds the identity mapping for the Karpenter node role to the aws-auth ConfigMap.

eksctl create cluster -f - <<EOF
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: ${CLUSTER_NAME}
  region: ${AWS_DEFAULT_REGION}
  version: "1.27"
  tags:
    karpenter.sh/discovery: ${CLUSTER_NAME}
iam:
  withOIDC: true
  serviceAccounts:
  - metadata:
      name: karpenter
      namespace: karpenter
    roleName: ${CLUSTER_NAME}-karpenter
    attachPolicyARNs:
    - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME}
    roleOnly: true
iamIdentityMappings:
- arn: "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}"
  username: system:node:{{EC2PrivateDNSName}}
  groups:
  - system:bootstrappers
  - system:nodes
managedNodeGroups:
- instanceType: m5.large
  amiFamily: AmazonLinux2
  name: ${CLUSTER_NAME}-linux-ng
  desiredCapacity: 2
  minSize: 1
  maxSize: 10
EOF

export CLUSTER_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)"
export KARPENTER_IAM_ROLE_ARN="arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}-karpenter"
echo $CLUSTER_ENDPOINT $KARPENTER_IAM_ROLE_ARN

eksctl uses AWS CloudFormation to create all the necessary resources to build an Amazon EKS cluster. If the cluster creation fails, the failure reason is reported in the command output (or in the AWS CloudFormation console). Upon successful creation of your cluster, you’ll see output similar to the following:
2023-06-14 06:20:19 [✔] all EKS cluster resources for "windows-karpenter-demo" have been created
2023-06-14 06:20:19 [ℹ] nodegroup "windows-karpenter-demo-linux-ng" has 2 node(s)
2023-06-14 06:20:19 [ℹ] node "ip-192-168-12-160.ec2.internal" is ready
2023-06-14 06:20:19 [ℹ] node "ip-192-168-53-156.ec2.internal" is ready
2023-06-14 06:20:19 [ℹ] waiting for at least 1 node(s) to become ready in "windows-karpenter-demo-linux-ng"
2023-06-14 06:20:19 [ℹ] nodegroup "windows-karpenter-demo-linux-ng" has 2 node(s)
2023-06-14 06:20:19 [ℹ] node "ip-192-168-12-160.ec2.internal" is ready
2023-06-14 06:20:19 [ℹ] node "ip-192-168-53-156.ec2.internal" is ready
2023-06-14 06:20:20 [ℹ] kubectl command should work with "/Users/bpfeiff/.kube/config", try 'kubectl get nodes'
2023-06-14 06:20:20 [✔] EKS cluster "windows-karpenter-demo" in "us-east-1" region is ready

Note that the sample output in this post was captured in us-east-1; your output shows the Region set in AWS_DEFAULT_REGION.

4. Enable Amazon EKS Windows support

To deploy Windows nodes to our cluster, we need to enable Amazon EKS Windows support.

kubectl apply -f - <<EOF
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: amazon-vpc-cni
  namespace: kube-system
data:
  enable-windows-ipam: "true"
EOF

5. Install Karpenter with Helm

Next, we use Helm to install Karpenter.
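Before installing, it may be worth confirming that the ConfigMap change from step 4 took effect. The following is a minimal sketch and not part of the original walkthrough; the helper name is_windows_ipam_enabled is hypothetical, and the check assumes kubectl is already pointed at the new cluster.

```shell
# is_windows_ipam_enabled VALUE: succeeds only when VALUE is the string "true",
# which is what step 4 writes to the enable-windows-ipam key.
is_windows_ipam_enabled() {
  [ "$1" = "true" ]
}

if command -v kubectl >/dev/null 2>&1; then
  # Read the key from the amazon-vpc-cni ConfigMap; the result is empty if the
  # key is missing or the cluster cannot be reached.
  ipam_value="$(kubectl get configmap amazon-vpc-cni -n kube-system \
    -o jsonpath='{.data.enable-windows-ipam}' 2>/dev/null)"
  if is_windows_ipam_enabled "${ipam_value}"; then
    echo "Windows IPAM is enabled"
  else
    echo "Windows IPAM is not enabled (value: '${ipam_value}')"
  fi
fi
```

With that confirmed, the Helm installation proceeds as follows.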
# Log out of the Helm registry to perform an unauthenticated pull against the public ECR
helm registry logout public.ecr.aws

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version ${KARPENTER_VERSION} --namespace karpenter --create-namespace \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=${KARPENTER_IAM_ROLE_ARN} \
  --set settings.aws.clusterName=${CLUSTER_NAME} \
  --set settings.aws.defaultInstanceProfile=KarpenterNodeInstanceProfile-${CLUSTER_NAME} \
  --set settings.aws.interruptionQueueName=${CLUSTER_NAME} \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi \
  --wait

Upon successful installation, you’ll see the following output.

Release "karpenter" does not exist. Installing it now.
Pulled: public.ecr.aws/karpenter/karpenter:v0-c990a2d9fb10c1bfeffd5c6af64bf8575536d67e
Digest: sha256:33e2597488e3359653515bb7bd43a4ed6c1e811cb95c261175f8808a9ea4fc97
NAME: karpenter
LAST DEPLOYED: Wed Jun 14 08:16:36 2023
NAMESPACE: karpenter
STATUS: deployed
REVISION: 1
TEST SUITE: None

6. Create Karpenter provisioner and NodeTemplate

Now we create two Karpenter provisioners to support Windows Server 2019 and Windows Server 2022 in the same Amazon EKS cluster. A Karpenter provisioner sets constraints on the nodes that Karpenter can create and on the pods that can run on those nodes. Each provisioner references an AWSNodeTemplate of the same name, which selects the Windows AMI family.
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: windows2019
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: kubernetes.io/os
      operator: In
      values: ["windows"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: windows2019
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: windows2019
spec:
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  amiFamily: Windows2019
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
---
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: windows2022
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
    - key: kubernetes.io/os
      operator: In
      values: ["windows"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: windows2022
  ttlSecondsAfterEmpty: 30
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: windows2022
spec:
  subnetSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  securityGroupSelector:
    karpenter.sh/discovery: ${CLUSTER_NAME}
  amiFamily: Windows2022
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
EOF

7. Scale out the deployment

We now have our Amazon EKS cluster prepped for running Windows nodes and all the necessary components of Karpenter. We scale a sample application to see Karpenter automatically add nodes to the Amazon EKS cluster based on demand.

7.1 Run the following code to create your Windows Server 2022 sample application.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis-simple-2022
spec:
  selector:
    matchLabels:
      app: windows-server-iis-simple-2022
      tier: backend
      track: stable
  replicas: 0
  template:
    metadata:
      labels:
        app: windows-server-iis-simple-2022
        tier: backend
        track: stable
    spec:
      containers:
      - name: windows-server-iis-simple-2022
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2022
        imagePullPolicy: IfNotPresent
        command:
        - powershell.exe
        - -command
        - while(1){sleep 2; ping -t localhost;}
      nodeSelector:
        kubernetes.io/os: windows
        node.kubernetes.io/windows-build: 10.0.20348
EOF

The Windows Server version used by each pod must match that of the node. If you want to use multiple Windows Server versions in the same cluster, then you should set additional node labels and nodeSelector fields. To simplify this, Kubernetes automatically adds a label named node.kubernetes.io/windows-build to each Windows node. This label reflects the Windows major, minor, and build number that need to match for compatibility. Here are the values used for each Windows Server version:

Product Name          Version
Windows Server 2019   10.0.17763
Windows Server 2022   10.0.20348

Based on the build version specified in the pod nodeSelector, Karpenter launches new Windows nodes with the matching operating system. For example, if the build version is specified as 10.0.17763, then Karpenter uses the windows2019 provisioner to launch Windows nodes. For more information, please refer to the Guide for Running Windows Containers in Kubernetes.

7.2 Run the following command to scale your Windows Server 2022 sample application.

kubectl scale deployment windows-server-iis-simple-2022 --replicas 10

7.3 You can use the Karpenter logs to track the scaling progress.
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

The following output shows the windows2022 provisioner scaling from 0 nodes to 1 to run the 10 replicas we requested.

2023-06-14T12:19:01.581Z INFO controller.machine_lifecycle launched machine {"commit": "c990a2d", "machine": "windows2022-4hq46", "provisioner": "windows2022", "provider-id": "aws:///us-east-1f/i-039507775a01898e6", "instance-type": "c6a.xlarge", "zone": "us-east-1f", "capacity-type": "on-demand", "allocatable": {"cpu":"3920m", "ephemeral-storage":"44Gi","memory":"6012Mi","pods":"110","vpc.amazonaws.com/PrivateIPv4Address":"14"}}

7.4 Run the following command to track the deployment progress of your pods.

kubectl rollout status deploy/windows-server-iis-simple-2022

You’ll see the 10 replicas being created on the new Karpenter-provisioned Windows worker node.

Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 0 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 1 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 2 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 3 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 4 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 5 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 6 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 7 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 8 of 10 updated replicas are available...
Waiting for deployment "windows-server-iis-simple-2022" rollout to finish: 9 of 10 updated replicas are available...
deployment "windows-server-iis-simple-2022" successfully rolled out

7.5 Run the following code to create your Windows Server 2019 sample application.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: windows-server-iis-simple-2019
spec:
  selector:
    matchLabels:
      app: windows-server-iis-simple-2019
      tier: backend
      track: stable
  replicas: 0
  template:
    metadata:
      labels:
        app: windows-server-iis-simple-2019
        tier: backend
        track: stable
    spec:
      containers:
      - name: windows-server-iis-simple-2019
        image: mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019
        imagePullPolicy: IfNotPresent
        command:
        - powershell.exe
        - -command
        - while(1){sleep 2; ping -t localhost;}
      nodeSelector:
        kubernetes.io/os: windows
        node.kubernetes.io/windows-build: 10.0.17763
EOF

7.6 Run the following command to scale your Windows Server 2019 sample application.

kubectl scale deployment windows-server-iis-simple-2019 --replicas 10

Karpenter launches a new Windows Server 2019 worker node as more pods are requested to be scheduled. This process is identical to the one for Windows Server 2022, and you can reuse the steps above to track the progress of launching the Windows Server 2019 worker node.

8. Scale in the deployment

Karpenter handles scale out and scale in of Windows nodes based on demand. We’ll now tear down our sample applications and watch Karpenter terminate our Windows nodes.

8.1 Run the following commands to delete your sample application deployments.

kubectl delete deployment windows-server-iis-simple-2022
kubectl delete deployment windows-server-iis-simple-2019

The Windows instances launched earlier by Karpenter will now be terminated. You can use the Karpenter logs to track the scale-in progress.

kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller

Once all pods have been terminated and the 30-second ttlSecondsAfterEmpty set on each provisioner expires, Karpenter deletes the idle instances.
2023-06-20T16:27:12.878Z DEBUG controller.node added TTL to empty node {"commit": "c990a2d", "node": "ip-192-168-99-4.ec2.internal", "provisioner": "windows2022"}
2023-06-20T16:27:15.140Z DEBUG controller.node added TTL to empty node {"commit": "c990a2d", "node": "ip-192-168-88-252.ec2.internal", "provisioner": "windows2019"}
2023-06-20T16:27:42.051Z INFO controller.deprovisioning deprovisioning via emptiness delete, terminating 1 machines ip-192-168-99-4.ec2.internal/c6a.xlarge/on-demand {"commit": "c990a2d"}
2023-06-20T16:27:42.138Z INFO controller.termination cordoned node {"commit": "c990a2d", "node": "ip-192-168-99-4.ec2.internal"}
2023-06-20T16:27:42.478Z INFO controller.termination deleted node {"commit": "c990a2d", "node": "ip-192-168-99-4.ec2.internal"}
2023-06-20T16:27:42.751Z INFO controller.machine_termination deleted machine {"commit": "c990a2d", "machine": "windows2022-4hq46", "node": "ip-192-168-99-4.ec2.internal", "provisioner": "windows2022", "provider-id": "aws:///us-east-1f/i-039507775a01898e6"}
2023-06-20T16:27:54.105Z INFO controller.deprovisioning deprovisioning via emptiness delete, terminating 1 machines ip-192-168-88-252.ec2.internal/c6a.xlarge/on-demand {"commit": "c990a2d"}
2023-06-20T16:27:54.177Z INFO controller.termination cordoned node {"commit": "c990a2d", "node": "ip-192-168-88-252.ec2.internal"}
2023-06-20T16:27:54.480Z INFO controller.termination deleted node {"commit": "c990a2d", "node": "ip-192-168-88-252.ec2.internal"}
2023-06-20T16:27:54.754Z INFO controller.machine_termination deleted machine {"commit": "c990a2d", "machine": "windows2019-khmc5", "node": "ip-192-168-88-252.ec2.internal", "provisioner": "windows2019", "provider-id": "aws:///us-east-1a/i-0978aeb1680f37d7c"}
2023-06-20T16:31:21.596Z DEBUG controller.awsnodetemplate discovered subnets {"commit": "c990a2d", "awsnodetemplate": "windows2019", "subnets": ["subnet-05d7fed709f082b75 (us-east-1a)", "subnet-0109ebad1a6808805 (us-east-1f)", "subnet-0ff0ebe5e1a8630f1 (us-east-1a)", "subnet-0d01b14a3e9c91d1f (us-east-1f)"]}
2023-06-20T16:33:19.192Z DEBUG controller.deprovisioning discovered subnets {"commit": "c990a2d", "subnets": ["subnet-05d7fed709f082b75 (us-east-1a)", "subnet-0109ebad1a6808805 (us-east-1f)", "subnet-0ff0ebe5e1a8630f1 (us-east-1a)", "subnet-0d01b14a3e9c91d1f (us-east-1f)"]}
discovered instance types {"commit": "c990a2d", "count": 649}

Cleaning up

When you’ve finished, clean up the resources associated with the example cluster deployment to avoid incurring unwanted charges.

eksctl delete cluster --name "${CLUSTER_NAME}" --region "${AWS_DEFAULT_REGION}"

If this command times out, run it again to confirm that the cluster has been removed.

Conclusion

In this post, we showed how you can use Karpenter to scale your Windows worker nodes on Amazon EKS out and in. Customers no longer need to maintain two autoscaler solutions on a heterogeneous Amazon EKS cluster with Windows and Linux nodes. A big shout-out to topikachu, who proactively started the development of the add-on.
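A follow-up on the cleanup step: the CloudFormation stack from step 2 was created with the AWS CLI rather than eksctl, so deleting the cluster would not be expected to remove it. The sketch below deletes that stack as well. It is an addition to the original walkthrough, the helper name karpenter_stack_name is hypothetical, and the deletion may fail if resources defined in the stack are still in use.

```shell
# karpenter_stack_name CLUSTER: prints the stack name that step 2 used,
# that is, "Karpenter-" followed by the cluster name.
karpenter_stack_name() {
  printf 'Karpenter-%s\n' "$1"
}

# Only act when CLUSTER_NAME is set and the AWS CLI is available.
if [ -n "${CLUSTER_NAME:-}" ] && command -v aws >/dev/null 2>&1; then
  stack_name="$(karpenter_stack_name "${CLUSTER_NAME}")"
  aws cloudformation delete-stack --stack-name "${stack_name}"
  # Block until CloudFormation reports the deletion as complete.
  aws cloudformation wait stack-delete-complete --stack-name "${stack_name}"
fi
```

The guard on CLUSTER_NAME keeps the script from issuing a delete in a shell where the variables from step 1 are no longer set.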