This post was coauthored by Venkatesh Nannan, Sr. Engineering Manager at Rippling.

Introduction

Rippling is a workforce management system that eliminates the friction of running a business, combining HR, IT, and Finance apps on a unified data platform. Rippling’s mission is to free up intelligent people to work on hard problems.

Existing Stack

Rippling uses a modular monolith architecture with different Docker entrypoints for multiple services and background jobs. These components are managed within a single, large, multi-tenant production cluster on Amazon Elastic Kubernetes Service (Amazon EKS) per region, at a scale of over 1000 nodes.

Rippling’s infra stack consists of:

- Karpenter for cluster autoscaling: a flexible, high-performance Kubernetes cluster autoscaler that maintains optimal compute capacity.
- Horizontal Pod Autoscaler for scaling Kubernetes pods based on demand.
- KEDA, an event-driven autoscaler for scaling background job processing containers based on event volume.
- IAM Roles for Service Accounts (IRSA), which provide temporary AWS Identity and Access Management (IAM) credentials to Kubernetes pods, enabling access to AWS resources such as Amazon Simple Storage Service (Amazon S3) buckets.
- Argo CD, an open-source GitOps continuous delivery tool that deploys applications and add-on software to the Kubernetes cluster.
- AWS Load Balancer Controller, which exposes Kubernetes services to end-users. The TargetGroupBinding custom resource binds pods to Application Load Balancer (ALB) target groups.
- Amazon EKS managed node groups spanning multiple Availability Zones (AZs).

In addition to these technologies, we were using Cilium CNI for controlling network traffic between pods. However, we were running into challenges with this part of our stack, so we decided to look for alternatives.

Figure 1: High level architecture of Rippling

Challenges

As Amazon EKS version 1.23 approached end-of-life, upgrading to v1.27 became imperative.
However, during our initial attempts at upgrading to v1.24 in our non-production environment, we encountered a significant hurdle. New nodes running Cilium failed to join the cluster, increasing our downtime and requiring operational work on the CNI plugin. As a company, we prioritize using managed services to streamline operations and focus on adding value to our business. This Kubernetes upgrade task gave us an opportunity to look at alternatives that would be easier to maintain. We saw that AWS had just announced VPC CNI support for k8s network policies using eBPF. We realized that migrating to this solution would enable us to replace our third-party networking add-on and rely solely on VPC CNI for both cluster networking and network policy implementation. This change would help reduce the overhead of managing operational software needed for cluster networking.

Introduction of Amazon VPC CNI support for network policies

When AWS announced VPC CNI support for k8s Network Policies using eBPF, we wanted to use the Amazon VPC CNI to secure the traffic in our Kubernetes clusters and simplify our EKS cluster management and operations. As network policy agents are bundled in existing VPC CNI pods, we would no longer need to run additional daemon pods and network policy controllers on the worker nodes. We followed the blue-green cluster upgrade strategy and were able to safely migrate the traffic from the old cluster to the new cluster with minimal risk of breaking existing workloads.

Planning the migration

We took an inventory of the network policies applied in our existing cluster and the ingress/egress features they used. This helped us identify deviations from upstream K8s Network Policies. This step is necessary for migrating, as Amazon VPC CNI supports only the upstream k8s network policies as of this writing.
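To make "upstream k8s network policy" concrete, here is a minimal sketch of a policy that uses only the standard `networking.k8s.io/v1` API, with no third-party CRDs. The namespace, labels, port, and file name are hypothetical placeholders and are not taken from Rippling's setup.

```shell
# Hypothetical file name, for illustration only.
POLICY_FILE="allow-frontend-to-api.yaml"

# Quoted 'EOF' keeps the manifest literal (no shell expansion).
cat <<'EOF' > "$POLICY_FILE"
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api   # hypothetical policy name
  namespace: demo               # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      app: api                  # pods this policy protects
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend     # only these pods may connect
      ports:
        - protocol: TCP
          port: 8080
EOF

echo "Wrote $POLICY_FILE"
# After review, apply it to a test cluster:
#   kubectl apply -f "$POLICY_FILE"
```

A policy that can be expressed this way, using pod and namespace selectors, IP blocks, and ports, needs no conversion. Rules that depend on vendor-specific features would have to be redesigned first.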
Rippling was not using advanced features from our third-party network policy engines such as Global Network Policies, DNS based policy rules, or rule priority. Therefore, we did not need Custom Resource Definition (CRD) transformations going into the migration process. AWS recommends converting third-party NetworkPolicy CRDs to Kubernetes Network Policy resources and testing the converted policies in a separate test cluster before migrating from a third-party engine to the VPC CNI Network Policy engine in production. To assist in the migration process, AWS has developed a tool called K8s Network Policy Migrator that converts existing supported Calico/Cilium network policy CRDs to Kubernetes native network policies. After conversion, you can directly test the converted network policies on your new clusters running the VPC CNI network policy controller. The tool is designed to streamline the migration process and support a smooth transition.

Picking migration strategy

There are broadly two strategies to migrate the CNI plugin in the EKS cluster: (1) In-place and (2) Blue-Green.

The in-place strategy replaces an existing third-party CNI plugin with the VPC CNI plugin with network policy support in an existing EKS cluster. This would entail the following steps:

1. Create a new label “cni-plugin=3p” on the existing Amazon EKS managed node groups and Karpenter NodePool resources.
2. Update the existing third-party CNI DaemonSet to schedule CNI pods on those labeled nodes.
3. Deploy the Amazon EKS add-on version of Amazon VPC CNI and schedule its pods to nodes without the “cni-plugin” label. At this point the existing nodes have third-party CNI plugin pods and not the VPC CNI pods.
4. Launch new Amazon EKS managed node groups and Karpenter NodePool resources without the “cni-plugin=3p” label so that VPC CNI pods can be scheduled to those nodes.
5. Drain and delete the existing Amazon EKS managed node groups and Karpenter NodePool resources to move the workloads to the new worker nodes with VPC CNI.
6. Finally, delete the third-party CNI and associated network policy controllers from the cluster.

As you can see, this process is involved, needs careful orchestration, and is more prone to errors that impact application availability.

The second approach is to use the Blue-Green strategy, in which a new EKS cluster is launched with the VPC CNI plugin and then the workloads are migrated to it. This approach is safer since it can be rolled back and provides the ability to test the setup in isolation before routing the live production traffic. Therefore, we chose the Blue-Green strategy for our migration.

Migration

As part of the blue-green strategy, we created a new EKS cluster with the Amazon VPC CNI and enabled Network Policy support by customizing the VPC CNI Amazon EKS add-on configuration. We also deployed the Argo CD agent on the cluster and bootstrapped it using Argo CD’s App of apps pattern to deploy the applications into the cluster. Network policies were also deployed to the cluster using Argo CD. We first tested the migration from the third-party CNI to VPC CNI in a non-production environment to make sure that applications and services passed functional tests. We then used the same strategy in the production environment to migrate the traffic from the old cluster to the new cluster with minimal risk.

Lessons learned

Amazon VPC CNI uses the VPC IP space to assign IP addresses to k8s pods. This led us to realize our existing VPCs were not properly sized to meet the growing number of k8s pods. We added a secondary CIDR block from the permitted 100.64.0.0/10 range to the VPC and configured the VPC CNI Custom Networking feature to assign those IP addresses to the k8s pods. This proactive measure supports scalability as our infrastructure expands, mitigating concerns about IP address exhaustion.
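The custom networking setup described above can be sketched as follows. This is an illustration under assumptions, not Rippling's actual configuration: the Availability Zone, subnet ID, and security group ID are placeholders, and the file name is hypothetical. The `aws-node` environment variables in the closing comments are the ones documented for VPC CNI custom networking.

```shell
# Placeholder values (assumptions); replace with real IDs.
AZ="us-west-2a"
POD_SUBNET_ID="subnet-EXAMPLE"      # subnet created from the secondary CIDR
POD_SECURITY_GROUP_ID="sg-EXAMPLE"  # security group for pod ENIs
ENICONFIG_FILE="eniconfig-${AZ}.yaml"

# Unquoted EOF so the variables above are expanded into the manifest.
# Naming the ENIConfig after the AZ lets nodes pick it up by zone label.
cat <<EOF > "$ENICONFIG_FILE"
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: ${AZ}
spec:
  subnet: ${POD_SUBNET_ID}
  securityGroups:
    - ${POD_SECURITY_GROUP_ID}
EOF

echo "Wrote ${ENICONFIG_FILE}"
# Against a real cluster, one ENIConfig per AZ would be applied and
# custom networking switched on in the aws-node DaemonSet:
#   kubectl apply -f "$ENICONFIG_FILE"
#   kubectl set env daemonset aws-node -n kube-system \
#     AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true \
#     ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
```

Existing nodes typically need to be replaced after this change so that their secondary ENIs are created in the new subnets.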
Leveraging automation and Infrastructure-as-Code (IaC) is recommended, especially when replicating existing clusters and migrating the workloads to them.

Conclusion

In this post, we discussed how Rippling migrated from a third-party CNI to Amazon VPC CNI in their Amazon EKS clusters and enabled network policy support to secure pod-to-pod communications. Rippling used the blue-green strategy for the migration to minimize the application impact and safely cut over the traffic to the new cluster. This migration helped Rippling use the native features offered by AWS and reduced the burden of managing the operational software in their EKS clusters.

Venkatesh Nannan, Sr. Engineering Manager – Infrastructure at Rippling

Venkatesh Nannan is a seasoned Engineering leader with expertise in building scalable cloud-native applications, specializing in backend development and infrastructure architecture.
-
Users modernizing their applications using Amazon Elastic Kubernetes Service (Amazon EKS) on AWS often run into critical IPv4 address space exhaustion driven by scale. They want to maximize usage of the VPC CIDRs and subnets provisioned for the EKS pods without introducing additional operational complexity. We believe that use of IPv6 address space is the long-term solution for users to build scalable networking solutions. However, we also understand that Amazon EKS users may be constrained to IPv4 environments owing to dependencies on other networking components and applications’ support for IPv6. Therefore, Amazon EKS is introducing support for Enhanced Subnet Discovery to help users streamline network configuration and scale IPv4 based clusters without adding operational complexity.

How it works

The Amazon VPC Container Network Interface (CNI) plugin is deployed on each Amazon Elastic Compute Cloud (Amazon EC2) worker node in your EKS cluster. It creates and attaches Elastic Network Interfaces (ENIs) to your worker nodes, and assigns a private IPv4 or IPv6 address from your VPC CIDR to each pod in the EKS cluster. By default, VPC CNI assigns IP addresses to pods from the same subnet as the worker node’s primary network interface, which is sometimes referred to as the “usable subnet”. Without any additional configuration, a node can only attach ENIs from this usable subnet, the one in which the EC2 instance was launched.

With the new feature of VPC CNI, we are now expanding the scope of “usable subnet(s)”. When enhanced subnet discovery is enabled, pod IPs are automatically allocated from all available subnets/CIDRs in the VPC that are tagged for use. New subnets can be created and tagged using the specific tag “kubernetes.io/role/cni”, and they are integrated seamlessly into the existing network configuration. This enables you to scale your applications effectively with minimal disruption to ongoing operations.
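Subnets that already exist can be made discoverable by adding the same tag. The sketch below is a hedged illustration: `run` and `tag_subnets_for_cni` are hypothetical helper names, the subnet IDs are placeholders, and the tag value `1` is an assumption. By default the helper only prints the AWS CLI command; set `DRY_RUN=0` to execute it.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Tag one or more existing subnets so that VPC CNI can discover them.
tag_subnets_for_cni() {
  run aws ec2 create-tags \
    --resources "$@" \
    --tags "Key=kubernetes.io/role/cni,Value=1"
}

# subnet-EXAMPLE1 and subnet-EXAMPLE2 are placeholder IDs.
tag_subnets_for_cni subnet-EXAMPLE1 subnet-EXAMPLE2
```

Keeping the dry-run default makes it easy to review exactly which subnets will be tagged before anything changes in the account.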
Prerequisites

The following prerequisites are necessary to continue with this post:

- An AWS Account
- An EKS cluster with version 1.25 or higher (we use v1.29 in the walkthrough)
- Amazon VPC CNI version 1.18.0 or later
- The latest version of AWS Command Line Interface (AWS CLI) configured on your device, or AWS CloudShell
- eksctl, a simple CLI tool for creating and managing EKS clusters (v0.165.0 or higher)

Setup

    export AWS_REGION=<YOUR_AWS_REGION>     # Replace with your AWS Region
    export AWS_ACCOUNT=<YOUR_ACCOUNT>       # Replace with your AWS Account number
    export CLUSTER_NAME=eks-enhsubsel-demo  # Replace with your EKS cluster name

In this walkthrough we simulate an IP exhaustion scenario by creating an Amazon VPC with a /24 CIDR block, which yields 256 IP addresses. It is divided into three public and three private subnets, each assigned a /27 CIDR block (32 IP addresses, of which 27 are usable because AWS reserves five addresses in every subnet), as shown in the following figure. Once the VPC CIDR range is exhausted, we associate a secondary CIDR with the VPC, and then we create VPC subnets with the “kubernetes.io/role/cni” tag so that VPC CNI can automatically discover and use the new subnets to allocate Pod IP addresses.

Figure 1: Amazon VPC setup

Let’s start by creating an EKS cluster with VPC CNI version 1.18.0.

    cat << EOF > cluster.yaml
    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: ${CLUSTER_NAME}
      region: ${AWS_REGION}
      version: "1.29"
    vpc:
      cidr: 10.0.0.0/24
    addons:
      - name: vpc-cni
        version: 1.18.0
      - name: coredns
      - name: kube-proxy
    managedNodeGroups:
      - name: ${CLUSTER_NAME}-mng
        instanceType: m6a.large
        privateNetworking: true
        minSize: 2
        desiredCapacity: 2
        maxSize: 5
    EOF
    eksctl create cluster -f cluster.yaml

Wait for the cluster creation to complete and make sure that the vpc-cni addon is up and running in the cluster.
    aws eks describe-addon --addon-name vpc-cni --cluster-name $CLUSTER_NAME --region $AWS_REGION

    {
        "addon": {
            "addonName": "vpc-cni",
            "clusterName": "eks-enhsubsel-demo",
            "status": "ACTIVE",
            "addonVersion": "v1.18.0-eksbuild.1",
            ....
        }
    }

As the subnets are assigned /27 CIDRs, note that the private subnets have few or no available IP addresses:

    aws ec2 describe-subnets --region $AWS_REGION \
      --filters Name=tag:Name,Values="eksctl-eks-enhsubsel-demo-cluster/SubnetPrivate*" \
      --query "Subnets[].{VPC:VpcId,SubnetId:SubnetId,AvailableIPs:AvailableIpAddressCount}" \
      --output table

    -----------------------------------------------------------------------
    |                           DescribeSubnets                           |
    +--------------+----------------------------+-------------------------+
    | AvailableIPs |          SubnetId          |           VPC           |
    +--------------+----------------------------+-------------------------+
    |  16          |  subnet-08411e385d62f29da  |  vpc-07f75e9b1d954689a  |
    |  7           |  subnet-0755097835150b642  |  vpc-07f75e9b1d954689a  |
    |  0           |  subnet-0975a78066e7e76d6  |  vpc-07f75e9b1d954689a  |
    +--------------+----------------------------+-------------------------+

Deploy a sample application and simulate the IP exhaustion scenario in the EKS cluster.

    cat <<EOF | kubectl apply -f -
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: inflate
    spec:
      replicas: 50
      selector:
        matchLabels:
          app: inflate
      template:
        metadata:
          labels:
            app: inflate
        spec:
          terminationGracePeriodSeconds: 0
          containers:
            - name: inflate
              image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
              resources:
                requests:
                  cpu: 50m
    EOF

Because there are not enough IPs, many pods remain in the “ContainerCreating” state, as the Amazon VPC CNI is unable to allocate IP addresses to them.

Now, let’s explore how we can use the Enhanced Subnet Discovery feature of VPC CNI to automatically discover new VPC subnets with available IP space, and use them to allocate IP addresses for the k8s pods. By default, an Amazon VPC can have up to five IPv4 CIDR blocks (the primary block plus four secondary blocks) to extend the VPC IP space, and this quota can be increased.
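The exhaustion follows directly from subnet arithmetic. The short sketch below (the `usable_ips` helper name is hypothetical) computes how many addresses a subnet of a given prefix length can hand out, taking into account that AWS reserves five addresses in every subnet.

```shell
# Usable IPv4 addresses in an AWS subnet of the given prefix length.
# AWS reserves five addresses in every subnet.
usable_ips() {
  local prefix="$1"
  echo $(( (1 << (32 - prefix)) - 5 ))
}

echo "Usable IPs per /27 subnet:   $(usable_ips 27)"
echo "Usable IPs in three /27s:    $(( 3 * $(usable_ips 27) ))"
echo "Usable IPs per /19 subnet:   $(usable_ips 19)"
```

The three private /27 subnets offer at most 81 addresses combined, shared between the nodes, system pods, the addresses VPC CNI keeps pre-allocated, and the 50 `inflate` replicas. A single /19 subnet, by contrast, provides 8187.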
Start by adding a secondary CIDR block “10.1.0.0/16” to the Amazon EKS VPC.

    export EKS_VPC_ID=$(aws eks describe-cluster --name $CLUSTER_NAME \
      --region $AWS_REGION --query "cluster.resourcesVpcConfig.vpcId" --output text)

    aws ec2 associate-vpc-cidr-block --vpc-id $EKS_VPC_ID \
      --cidr-block "10.1.0.0/16" --region $AWS_REGION

    {
        "CidrBlockAssociation": {
            "AssociationId": "vpc-cidr-assoc-06515a22930a5d6e9",
            "CidrBlock": "10.1.0.0/16",
            "CidrBlockState": {
                "State": "associating"
            }
        },
        "VpcId": "vpc-07f75e9b1d954689a"
    }

Wait for the association to complete, and then create new VPC subnets from the secondary CIDR block. We also tag the subnets with “kubernetes.io/role/cni=1” so that VPC CNI can auto-discover them.

    aws ec2 create-subnet --vpc-id $EKS_VPC_ID --region $AWS_REGION \
      --availability-zone "${AWS_REGION}a" --cidr-block 10.1.0.0/19 \
      --tag-specifications "ResourceType=subnet,Tags=[{Key=kubernetes.io/role/cni,Value=1}]"

    aws ec2 create-subnet --vpc-id $EKS_VPC_ID --region $AWS_REGION \
      --availability-zone "${AWS_REGION}b" --cidr-block 10.1.32.0/19 \
      --tag-specifications "ResourceType=subnet,Tags=[{Key=kubernetes.io/role/cni,Value=1}]"

    aws ec2 create-subnet --vpc-id $EKS_VPC_ID --region $AWS_REGION \
      --availability-zone "${AWS_REGION}c" --cidr-block 10.1.64.0/19 \
      --tag-specifications "ResourceType=subnet,Tags=[{Key=kubernetes.io/role/cni,Value=1}]"

In the default setup, VPC CNI assigns both the primary and secondary IP addresses of an ENI from the VPC subnet associated with the Amazon EKS worker node’s primary network interface.
    aws ec2 describe-network-interfaces --region $AWS_REGION \
      --query "NetworkInterfaces[*].{ID:NetworkInterfaceId,DNSName:PrivateDnsName,PrimaryIP:PrivateIpAddress,SecondaryIPs:PrivateIpAddresses[].PrivateIpAddress}" \
      --filters Name=tag:cluster.k8s.amazonaws.com/name,Values=$CLUSTER_NAME \
      --output table

    --------------------------------------------------------------------------------------
    |                             DescribeNetworkInterfaces                              |
    +-------------------------------------------+-------------------------+--------------+
    |                  DNSName                  |           ID            |  PrimaryIP   |
    +-------------------------------------------+-------------------------+--------------+
    |  ip-10-0-0-155.us-west-2.compute.internal |  eni-07f66d0e6b2408fc7  |  10.0.0.155  |
    +-------------------------------------------+-------------------------+--------------+
    ||                                   SecondaryIPs                                   ||
    |+----------------------------------------------------------------------------------+|
    ||  10.0.0.155                                                                      ||
    ||  10.0.0.136                                                                      ||
    ||  10.0.0.140                                                                      ||
    ||  10.0.0.141                                                                      ||
    ||  10.0.0.145                                                                      ||
    ||  10.0.0.150                                                                      ||
    ||  10.0.0.135                                                                      ||
    ||  10.0.0.151                                                                      ||
    ||  10.0.0.148                                                                      ||
    ||  10.0.0.149                                                                      ||
    |+----------------------------------------------------------------------------------+|
    .........
    |                             DescribeNetworkInterfaces                              |
    +-------------------------------------------+-------------------------+--------------+
    |                  DNSName                  |           ID            |  PrimaryIP   |
    +-------------------------------------------+-------------------------+--------------+
    |  ip-10-0-0-102.us-west-2.compute.internal |  eni-087692b80786865e0  |  10.0.0.102  |
    +-------------------------------------------+-------------------------+--------------+
    ||                                   SecondaryIPs                                   ||
    |+----------------------------------------------------------------------------------+|
    ||  10.0.0.102                                                                      ||
    ||  10.0.0.105                                                                      ||
    ||  10.0.0.110                                                                      ||
    ||  10.0.0.126                                                                      ||
    ||  10.0.0.124                                                                      ||
    ||  10.0.0.125                                                                      ||
    ||  10.0.0.118                                                                      ||
    ||  10.0.0.100                                                                      ||
    ||  10.0.0.116                                                                      ||
    ||  10.0.0.117                                                                      ||
    |+----------------------------------------------------------------------------------+|

Now, verify that the new Enhanced Subnet Discovery feature is enabled by checking the “ENABLE_SUBNET_DISCOVERY” environment variable in the Amazon VPC CNI add-on. You can use kubectl to verify this:

    kubectl describe ds aws-node -n kube-system | grep ENABLE_SUBNET_DISCOVERY

    ENABLE_SUBNET_DISCOVERY: true

As of this writing, the Enhanced Subnet Discovery feature is enabled by default in Amazon VPC CNI 1.18.0 and above. If the environment variable is not set to true, then you can use kubectl or the AWS CLI to set this configuration:

    kubectl set env daemonset aws-node -n kube-system ENABLE_SUBNET_DISCOVERY=true \
      -c aws-node

or

    aws eks update-addon --cluster-name $CLUSTER_NAME --region $AWS_REGION \
      --addon-name vpc-cni \
      --configuration-values '{"env":{"ENABLE_SUBNET_DISCOVERY":"true"}}'

When this feature is enabled, VPC CNI looks for VPC subnets that are tagged with “kubernetes.io/role/cni” and have available IP space. It attaches additional ENIs from these subnets to the Amazon EKS worker nodes so that it can assign the IP addresses to the k8s pods.
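To see which subnets are candidates for discovery, you can list the subnets in the cluster VPC that carry the tag key, together with their free address counts. This is a hedged sketch: `run` and `list_cni_subnets` are hypothetical helper names, `vpc-EXAMPLE` is a placeholder used when `EKS_VPC_ID` is not set, and by default the helper only prints the AWS CLI command (set `DRY_RUN=0` to execute it).

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# List subnets in a VPC that carry the kubernetes.io/role/cni tag key.
list_cni_subnets() {
  local vpc_id="$1"
  run aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=${vpc_id}" \
              "Name=tag-key,Values=kubernetes.io/role/cni" \
    --query "Subnets[].{SubnetId:SubnetId,AZ:AvailabilityZone,AvailableIPs:AvailableIpAddressCount}" \
    --output table
}

# Falls back to a placeholder VPC ID when EKS_VPC_ID is unset.
list_cni_subnets "${EKS_VPC_ID:-vpc-EXAMPLE}"
```

Including the Availability Zone in the output is a deliberate choice: an ENI can only be attached to an instance in the same zone as its subnet, so a tagged subnet only helps nodes running in that zone.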
In our walkthrough, because many pods were stuck in the “ContainerCreating” state, VPC CNI automatically discovered the new /19 subnets and attached ENIs from them to the existing worker nodes. We can verify this using the following command, which no longer returns any pods:

    kubectl get pods -o wide | grep ContainerCreating

    <<EMPTY OUTPUT>>

Now, when we look at the ENIs attached to the worker nodes, note that an additional ENI from the secondary CIDR subnet is attached and pods are assigned addresses from the 10.1.x.x IP range.

    aws ec2 describe-network-interfaces --region $AWS_REGION \
      --query "NetworkInterfaces[*].{ID:NetworkInterfaceId,DNSName:PrivateDnsName,PrimaryIP:PrivateIpAddress,SecondaryIPs:PrivateIpAddresses[].PrivateIpAddress}" \
      --filters Name=tag:cluster.k8s.amazonaws.com/name,Values=$CLUSTER_NAME \
      --output table

    --------------------------------------------------------------------------------------
    |                             DescribeNetworkInterfaces                              |
    +-------------------------------------------+-------------------------+--------------+
    |                  DNSName                  |           ID            |  PrimaryIP   |
    +-------------------------------------------+-------------------------+--------------+
    |  ip-10-0-0-152.us-west-2.compute.internal |  eni-0525ae09d044a6688  |  10.0.0.152  |
    +-------------------------------------------+-------------------------+--------------+
    ||                                   SecondaryIPs                                   ||
    |+----------------------------------------------------------------------------------+|
    ||  10.0.0.152                                                                      ||
    ||  10.0.0.144                                                                      ||
    ||  10.0.0.154                                                                      ||
    ||  10.0.0.147                                                                      ||
    ||  10.0.0.133                                                                      ||
    ||  10.0.0.157                                                                      ||
    ||  10.0.0.153                                                                      ||
    ||  10.0.0.158                                                                      ||
    ||  10.0.0.146                                                                      ||
    ||  10.0.0.132                                                                      ||
    |+----------------------------------------------------------------------------------+|
    |                             DescribeNetworkInterfaces                              |
    +-------------------------------------------+-------------------------+--------------+
    |                  DNSName                  |           ID            |  PrimaryIP   |
    +-------------------------------------------+-------------------------+--------------+
    |  ip-10-1-79-53.us-west-2.compute.internal |  eni-0b2632fa77e9fbf68  |  10.1.79.53  |
    +-------------------------------------------+-------------------------+--------------+
    ||                                   SecondaryIPs                                   ||
    |+----------------------------------------------------------------------------------+|
    ||  10.1.79.53                                                                      ||
    ||  10.1.78.231                                                                     ||
    ||  10.1.75.23                                                                      ||
    ||  10.1.81.171                                                                     ||
    ||  10.1.95.26                                                                      ||
    ||  10.1.86.60                                                                      ||
    ||  10.1.76.92                                                                      ||
    ||  10.1.65.140                                                                     ||
    ||  10.1.75.174                                                                     ||
    ||  10.1.94.30                                                                      ||
    |+----------------------------------------------------------------------------------+|

Cleaning up

To avoid ongoing charges, make sure to delete the EKS cluster resources created in your AWS account.

    # Delete EKS cluster resources
    eksctl delete cluster -f cluster.yaml

Key considerations

Shared subnets

When using this feature in a cross-account scenario, where the VPC and subnets are created in a central AWS Account and shared with the participant AWS Account to deploy the EKS cluster, you should tag the subnets in the participant account where the cluster is launched. Refer to Use shared VPC Subnets in Amazon EKS for a detailed walkthrough.

Custom networking

Custom networking, a feature of Amazon VPC CNI, offers another option for addressing IP exhaustion by assigning Pod IPs from secondary VPC IP address spaces. When custom networking is enabled in VPC CNI, it creates secondary ENIs in the subnet defined under a custom resource named ENIConfig that includes an alternate subnet CIDR range created from a secondary VPC CIDR. The VPC CNI assigns Pods IP addresses from the CIDR range defined in the ENIConfig custom resource. Furthermore, the pods can use different security groups than that of the node’s primary network interface. Therefore, you might consider custom networking if you have a security requirement to run Pods on a different network with different security groups. Note that, unlike custom networking, enhanced subnet discovery does not need the creation of the ENIConfig custom resource, and thus reduces the configuration overhead.
Custom networking takes precedence when both features are enabled on the VPC CNI.

Pod networking use cases

You can use this feature along with other VPC CNI use cases, such as “SNAT for Pods”, “Security groups for Pods”, “Kubernetes network policies”, and “Increase available IP addresses on a worker node”. Refer to Choose Pod networking use cases for a detailed comparison.

Conclusion

In this post we showed you how Amazon VPC CNI based subnet discovery can provide scale and flexibility to adjust your IPv4 address allocations to accommodate the growth of your EKS clusters with low operational overhead. We demonstrated how the feature enables adaptability to changes in size, and simplifies IP address management while supporting the dynamic needs of modern IT environments.

Visit the Amazon EKS best practices guide for recommendations and additional considerations for securely scaling EKS clusters. For installation instructions for Amazon VPC CNI, refer to the Amazon EKS user guide. You may provide feedback on the Amazon VPC CNI plugin by leaving a comment or opening an issue on the AWS Containers Roadmap that is hosted on GitHub.