Showing results for tags 'k8s'.

  1. Introduction

Snapchat is an app that hundreds of millions of people around the world use to communicate with their close friends. The app is powered by microservice architectures deployed in Amazon Elastic Kubernetes Service (Amazon EKS) and datastores such as Amazon CloudFront, Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, and Amazon ElastiCache. This post explains how Snap builds its microservices, leveraging Amazon EKS with AWS Identity and Access Management (IAM). It also discusses how Snap protects its K8s resources with the intelligent threat detection offered by GuardDuty, augmented with Falco and in-house tooling, to secure Snap’s cloud-native service mesh platform.

The following figure (Figure 1) shows the main data flow when users send and receive snaps. To send a snap, the mobile app calls the Snap API Gateway, which routes the call to a Media service that persists the sender’s message media in S3 (Steps 1-3). Next, the Friend microservice validates the sender’s permission to Snap the recipient by querying the messaging core service (MCS), which checks whether the recipient is a friend (Steps 4-6), and the conversation is stored in Snap DB, powered by DynamoDB (Steps 7-8). To receive a snap, the mobile app calls MCS to get the message metadata, such as the pointer to the media file, and then calls the Media microservice to load the media file from the content system that persists user data, powered by CloudFront and Amazon S3 (Steps 9-11).

Figure 1: Snap’s end-to-end user data flow when users send and receive snaps

The original Snap service mesh design included a single-tenant microservice per EKS cluster. Snap discovered, however, that managing thousands of clusters added an operational burden as microservices grew. Additionally, they discovered that many environments were underused and unnecessarily consuming AWS account resources such as IAM roles and policies. This required enabling microservices to share clusters and redefining tenant isolation to meet security requirements. Finally, Snap wanted to limit access to microservice data and storage while keeping them centralized in a network account meshed with Google’s cloud. The following figure illustrates Kubernetes-based microservices, Friends and Users, deployed in Amazon EKS or Google Kubernetes Engine (GKE). Snap users reach Snap’s API Gateway through Envoy. Switchboard, Snap’s mesh service configuration panel, updates Edge Envoy endpoints with available microservice resources after deploying them.

Figure 2 – Snap’s high-level mesh design

Bootstrap

The purpose of this stage is the preparation and implementation of a secure multi-cloud compute provisioning system. Snap uses Kubernetes clusters as a construct that defines an environment hosting one or more microservices, such as Friends and Users in the first figure. Snap’s security bootstrap includes three layers when designing a Kubernetes-based multi-cloud: authentication, authorization, and admission control. Snap uses IAM roles for Kubernetes service accounts (IRSA) to provision fine-grained service identities for microservices running in shared EKS clusters, allowing access to AWS services such as Amazon S3 and DynamoDB. For operator access scoped to the K8s namespace, Snap built a tool to manage K8s RBAC that maps K8s roles to IAM, allowing developers to perform service operations following the principle of least privilege.
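To make the IRSA pattern concrete, here is a minimal, hypothetical sketch of a Kubernetes ServiceAccount annotated with an IAM role and a workload that uses it. The role ARN, account ID, namespace, and names are illustrative assumptions, not Snap’s actual configuration.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: friends-service            # hypothetical name
  namespace: friends
  annotations:
    # EKS exchanges the pod's projected OIDC token for temporary credentials of this role
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/friends-dynamodb-access
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: friends
  namespace: friends
spec:
  replicas: 2
  selector:
    matchLabels:
      app: friends
  template:
    metadata:
      labels:
        app: friends
    spec:
      serviceAccountName: friends-service   # pods inherit only this role's IAM permissions
      containers:
        - name: friends
          image: 111122223333.dkr.ecr.us-east-1.amazonaws.com/friends:latest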
Beyond RBAC and IRSA, Snap wanted to impose deployment validations, such as making sure containers are instantiated from approved image registries (such as Amazon Elastic Container Registry (Amazon ECR)) or from images built and signed by approved CI systems, as well as preventing containers from running with elevated permissions. To accomplish this, Snap built admission controller webhooks.

Build-time

Snap believes in empowering its engineers to architect microservices autonomously within K8s constructs. Snap’s goal was to maximize Amazon EKS security benefits while abstracting K8s semantics. Specifically: 1/ the security of the Cloud, safeguarding the infrastructure that runs Amazon EKS, object stores (Amazon S3), data stores (KeyDB, ElastiCache, and DynamoDB), and the network that interconnects them; the security in the Cloud includes 2/ protecting the K8s cluster API server and etcd from malicious access, and finally, 3/ protecting Snap’s applications’ RBAC, network policies, data encryption, and containers.

Switchboard – Snap’s mesh service configuration panel

Snap built a configuration hub called Switchboard to provide a single control panel across AWS and GCP to create K8s clusters. Microservice owners can define environments in regions with specific compute types offered by cloud providers. Switchboard also enables service owners to follow approval workflows to establish trust for service-to-service authentication and to specify public routes and other service configurations. It allows service owners to manage service dependencies and traffic routes between K8s clusters. Switchboard presents a simplified configuration model based on the environments. It manages metadata, such as the service owner, team email, and on-call paging information.

Snap wanted to empower tenants to control access to microservice data objects (images, audio, and video) and metadata stores such as databases and cache stores, so they deployed the data stores in separate data accounts controlled by IAM roles and policies in those accounts. Snap needed to centralize microservice network paths to mesh with GCP resources. Therefore, Snap deployed the microservices in a centralized account, using IAM roles for service accounts that assume roles in the tenants’ data AWS accounts. The following figure shows how multiple environments (Kubernetes clusters) host three tenants using two different IRSA roles. Friends’ Service A can read and write to a dedicated DynamoDB table deployed in a separate AWS account. Similarly, MCS’ Service B can get and cache sessions or friends in ElastiCache.

Figure 3 – Multiple environments hosting three tenants using two different IRSA roles

One of Snap’s design principles was to maximize autonomy while maintaining their desired level of isolation between environments, all while minimizing operational overhead. Snap chose Kubernetes service accounts as the minimal isolation level. Amazon EKS support for IRSA allowed Snap to leverage OIDC to simplify the process of granting IAM permissions to application pods. Snap also uses RBAC to limit access to K8s cluster resources and to secure cluster users’ authentication. Snap is considering adopting Amazon EKS Pod Identities to reuse associations when running the same application in multiple clusters. This is done by applying identical associations to each cluster without modifying the role trust policy.

Deployment-time

Cluster access by human operators

AWS IAM users and roles are currently managed by Snap, which generates policies based on business requirements. Operators use Switchboard to request access to their microservice.
Switchboard maps an IAM user to a cluster RBAC policy that grants access to Kubernetes objects. Snap is evaluating AWS IAM Identity Center to allow Switchboard to federate AWS Single Sign-On (SSO) with a central identity provider (IdP), enabling cluster operators to have least-privilege access using cluster RBAC policies, enforced through AWS IAM.

Isolation strategy

Snap chose to isolate K8s cluster resources by namespace, limiting container permissions with IAM roles for service accounts and CNI network policies. In addition, Snap provisions separate pod identities for add-ons such as the CNI, Cluster Autoscaler, and Fluentd. Add-ons use separate IAM policies through IRSA rather than overly permissive EC2 instance IAM roles.

Network partitioning

Snap’s mesh defines rules that restrict or permit network traffic between microservice pods with Amazon VPC CNI network policies. Snap wanted to minimize IP exhaustion caused by IPv4 address space limitations due to its massive scale. Instead of working around IPv4 limitations using Kubernetes IPv4/IPv6 dual-stack, Snap wanted to migrate gracefully to IPv6. Snap can connect IPv4-based Amazon EKS clusters to IPv6 clusters using Amazon EKS IPv6 support and the Amazon VPC CNI.

Container hardening

Snap built an admission controller webhook to audit and enforce pod security context, preventing containers from running with elevated permissions (RunAs) or accessing volumes at the cluster or namespace level. Snap validates that workloads don’t use configurations that break container isolation, such as hostIPC, hostNetwork, and hostPort.

Figure 4 – Snap’s admission controller service

Network policies

Kubernetes Network Policies enable you to define and enforce rules for traffic flow between pods. Policies act as a virtual firewall, which allows you to segment and secure your cluster by specifying network traffic rules for pods, namespaces, IP addresses, and ports. Amazon EKS extends and simplifies native support for network policies in the Amazon VPC CNI, Amazon EC2 security groups, and network access control lists (NACLs) through the upstream Kubernetes Network Policy API.

Run-time

Audit logs

Snap needs to audit system activities to enhance compliance, intrusion detection, and policy validation, and to track unauthorized access, policy violations, suspicious activities, and incident responses. Snap uses Amazon EKS control plane logging, which ingests API server, audit, authenticator, controller manager, and scheduler logs into CloudWatch. It also uses AWS CloudTrail for cross-AWS-service access and Fluentd to ingest application logs into CloudWatch and Google’s operations suite.

Runtime security monitoring

Snap has begun using GuardDuty EKS Protection. This helps Snap monitor EKS cluster control plane activity by analyzing Amazon EKS audit logs to identify unauthorized and malicious access patterns. This functionality, combined with their admission controller events, provides coverage of cluster changes. For runtime monitoring, Snap uses the open source Falco agent to monitor EKS workloads in the Snap service mesh. GuardDuty findings are contextualized by Falco rules based on the processes running in containers. This context helps identify the cluster tenants with whom to triage the findings. Falco agents support Snap’s runtime monitoring goals and deliver consistent reporting. Snap complements GuardDuty with Falco to ensure changes are not made to a running container, by monitoring and analyzing container syscalls (container drift detection rule).
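The container-hardening checks described above (no privilege escalation, no hostIPC/hostNetwork/hostPort) map to fields a workload declares in its pod and container security contexts. Below is a minimal illustrative pod spec that would pass such checks; it is not Snap’s actual admission policy, and the names and image are assumptions.

apiVersion: v1
kind: Pod
metadata:
  name: hardened-example           # illustrative name
  namespace: friends
spec:
  hostNetwork: false               # do not share the node's network namespace
  hostIPC: false                   # do not share the node's IPC namespace
  hostPID: false
  securityContext:
    runAsNonRoot: true             # refuse to start if the image would run as root
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: 111122223333.dkr.ecr.us-east-1.amazonaws.com/friends:latest
      securityContext:
        privileged: false
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
      ports:
        - containerPort: 8080      # no hostPort binding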
Conclusion

Snap’s cloud infrastructure has evolved from running a monolith inside Google App Engine to microservices deployed in Kubernetes across AWS and GCP. This streamlined architecture helped improve Snapchat’s reliability. Snap’s Kubernetes multi-tenant vision required abstracting cloud provider security semantics, such as AWS security features, to comply with strict security and privacy standards. This blog reviewed the methods and systems used to implement a secure compute and data platform on Amazon EKS and Amazon data stores, including bootstrapping, building, deploying, and running Snap’s workloads. Snap is not stopping here. Learn more about Snap and AWS’s collaboration with Snap. View the full article
  2. Today, we are announcing the general availability of provider-defined functions in the AWS, Google Cloud, and Kubernetes providers in conjunction with the HashiCorp Terraform 1.8 launch. This release represents yet another step forward in our unique approach to ecosystem extensibility. Provider-defined functions allow anyone in the Terraform community to build custom functions within providers and extend the capabilities of Terraform.

Introducing provider-defined functions

Previously, users relied on a handful of built-in functions in the Terraform configuration language to perform a variety of tasks, including numeric calculations, string manipulations, collection transformations, validations, and other operations. However, the Terraform community needed more capabilities than the built-in functions could offer. With the release of Terraform 1.8, providers can implement custom functions that you can call from the Terraform configuration. The schema for a function is defined within the provider's schema using the Terraform provider plugin framework. To use a function, declare the provider as a required_provider in the terraform{} block:

terraform {
  required_version = ">= 1.8.0"
  required_providers {
    local = {
      source  = "hashicorp/local"
      version = "2.5.1"
    }
  }
}

Provider-defined functions can perform multiple tasks, including:

  • Transforming existing data
  • Parsing combined data into individual, referenceable components
  • Building combined data from individual components
  • Simplifying validations and assertions

To access a provider-defined function, reference the provider:: namespace with the local name of the Terraform provider. For example, you can use the direxists function by including provider::local::direxists() in your Terraform configuration. Below you’ll find several examples of new provider-defined functions in the officially supported AWS, Google Cloud, and Kubernetes providers.

Terraform AWS provider

The 5.40 release of the Terraform AWS provider includes its first provider-defined functions to parse and build Amazon Resource Names (ARNs), simplifying Terraform configurations where ARN manipulation is required. The arn_parse provider-defined function parses an ARN and returns an object of individual referenceable components, such as a region or account identifier. For example, to get the AWS account ID from an Amazon Elastic Container Registry (ECR) repository, use the arn_parse function to retrieve the account ID and set it as an output:

# create an ECR repository
resource "aws_ecr_repository" "hashicups" {
  name = "hashicups"
  image_scanning_configuration {
    scan_on_push = true
  }
}

# output the account ID of the ECR repository
output "hashicups_ecr_repository_account_id" {
  value = provider::aws::arn_parse(aws_ecr_repository.hashicups.arn).account_id
}

Running terraform apply against the above configuration outputs the AWS account ID:

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Outputs:

hashicups_ecr_repository_account_id = "751192555662"

Without the arn_parse function, you would need to define and test a combination of built-in Terraform functions to split the ARN and reference the proper index, or define a regular expression to match on a substring. The function handles the parsing for you in a concise manner so that you do not have to worry about doing this yourself. The AWS provider also includes a new arn_build function that builds an ARN from individual attributes and returns it as a string.
This provider-defined function can create an ARN that you cannot reference from another resource. For example, you may want to allow another account to pull images from your ECR repository. The arn_build function below constructs an ARN for an IAM policy using an account ID:

# allow another account to pull from the ECR repository
data "aws_iam_policy_document" "cross_account_pull_ecr" {
  statement {
    sid    = "AllowCrossAccountPull"
    effect = "Allow"
    principals {
      type = "AWS"
      identifiers = [
        provider::aws::arn_build("aws", "iam", "", var.cross_account_id, "root"),
      ]
    }
    actions = [
      "ecr:BatchGetImage",
      "ecr:GetDownloadUrlForLayer",
    ]
  }
}

The arn_build function helps to guide and simplify the process of combining substrings to form an ARN, and it improves readability compared to using string interpolation. Without it, you'd have to look up the exact ARN structure in the AWS documentation and manually test it.

Terraform Google Cloud provider

The 5.23 release of the Terraform Google Cloud provider adds a simplified way to get regions, zones, names, and projects from the IDs of resources that aren’t managed by your Terraform configuration. Provider-defined functions can now help parse Google IDs when adding an IAM binding to a resource that’s managed outside of Terraform:

resource "google_cloud_run_service_iam_member" "example_run_invoker_jane" {
  member   = "user:jane@example.com"
  role     = "run.invoker"
  service  = provider::google::name_from_id(var.example_cloud_run_service_id)
  location = provider::google::location_from_id(var.example_cloud_run_service_id)
  project  = provider::google::project_from_id(var.example_cloud_run_service_id)
}

The Google Cloud provider also includes a new region_from_zone provider-defined function that helps obtain region names from a given zone (e.g. "us-west1" from "us-west1-a"). This simple string processing could previously be achieved in multiple ways using Terraform’s built-in functions, but the new function simplifies the process:

locals {
  zone = "us-central1-a"

  # ways to derive the region "us-central1" using built-in functions
  region_1 = join("-", slice(split("-", local.zone), 0, 2))
  region_2 = substr(local.zone, 0, length(local.zone) - 2)

  # our new region_from_zone function makes this easier!
  region_3 = provider::google::region_from_zone(local.zone)
}

Terraform Kubernetes provider

The 2.28 release of the Terraform Kubernetes provider includes provider-defined functions for encoding and decoding Kubernetes manifests into Terraform, making it easier for practitioners to work with the kubernetes_manifest resource. Users that have a Kubernetes manifest in YAML format can use the manifest_decode function to convert it into a Terraform object. The example below shows how to use the manifest_decode function by referring to a Kubernetes manifest in YAML format embedded in the Terraform configuration:

locals {
  manifest = <<-EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: test
      labels:
        name: test
  EOF
}

resource "kubernetes_manifest" "example" {
  manifest = provider::kubernetes::manifest_decode(local.manifest)
}

If you prefer to decode a YAML file instead of using an embedded YAML format, you can do so by combining the built-in file function with the manifest_decode function.
$ cat manifest.yaml
---
kind: Namespace
apiVersion: v1
metadata:
  name: test
  labels:
    name: test

resource "kubernetes_manifest" "example" {
  manifest = provider::kubernetes::manifest_decode(file("${path.module}/manifest.yaml"))
}

If your manifest YAML contains multiple Kubernetes resources, you may use the manifest_decode_multi function to decode them into a list, which can then be used with the for_each attribute on the kubernetes_manifest resource:

$ cat manifest.yaml
---
kind: Namespace
apiVersion: v1
metadata:
  name: test-1
  labels:
    name: test-1
---
kind: Namespace
apiVersion: v1
metadata:
  name: test-2
  labels:
    name: test-2

resource "kubernetes_manifest" "example" {
  for_each = {
    for m in provider::kubernetes::manifest_decode_multi(file("${path.module}/manifest.yaml")) : m.metadata.name => m
  }
  manifest = each.value
}

Getting started with provider-defined functions

Provider-defined functions allow Terraform configurations to become more expressive and readable by declaring practitioner intent and reducing complex, repetitive expressions. To learn about all of the new launch-day provider-defined functions, please review the documentation and changelogs of the aforementioned providers:

  • Terraform AWS provider
  • Terraform Google provider
  • Terraform Kubernetes provider

Review our Terraform Plugin Framework documentation to learn more about how provider-defined functions work and how you can make your own. We are thankful to our partners and community members for their valuable contributions to the HashiCorp Terraform ecosystem. View the full article
  3. Multi-cluster Ingress (MCI) is an advanced feature typically used in cloud computing environments that enables the management of ingress (the entry point for external traffic into a network) across multiple Kubernetes clusters. This functionality is especially useful for applications that are deployed globally across several regions or clusters, offering a unified method to manage access to these applications. MCI simplifies the process of routing external traffic to the appropriate cluster, enhancing both the reliability and scalability of applications.

Here are key features and benefits of Multi-cluster Ingress:

  • Global Load Balancing: MCI can intelligently route traffic to different clusters based on factors like region, latency, and health of the service. This ensures users are directed to the nearest or best-performing cluster, improving the overall user experience.
  • Centralized Management: It allows for the configuration of ingress rules from a single point, even though these rules are applied across multiple clusters. This simplification reduces the complexity of managing global applications.
  • High Availability and Redundancy: By spreading resources across multiple clusters, MCI enhances the availability and fault tolerance of applications. If one cluster goes down, traffic can be automatically rerouted to another healthy cluster.
  • Cross-Region Failover: In the event of a regional outage or a significant drop in performance within one cluster, MCI can perform automatic failover to another cluster in a different region, ensuring continuous availability of services.
  • Cost Efficiency: MCI helps optimize resource utilization across clusters, potentially leading to cost savings. Traffic can be routed to clusters where resources are less expensive or more abundant.
  • Simplified DNS Management: Typically, MCI solutions offer integrated DNS management, automatically updating DNS records based on the health and location of clusters. This removes the need for manual DNS management in a multi-cluster setup.

How does Multi-cluster Ingress (MCI) work?

Multi-cluster Ingress (MCI) works by managing and routing external traffic into applications running across multiple Kubernetes clusters. This process involves several components and steps to ensure that traffic is efficiently and securely routed to the appropriate destination based on predefined rules and policies. Here’s a high-level overview of how MCI operates:

1. Deployment Across Multiple Clusters

  • Clusters Preparation: You deploy your application across multiple Kubernetes clusters, often spread across different geographical locations or cloud regions, to ensure high availability and resilience.
  • Ingress Configuration: Each cluster has its own set of resources and services that need to be exposed externally. With MCI, you configure ingress resources that are aware of the multi-cluster environment.

2. Central Management and Configuration

  • Unified Ingress Control: A central control plane is used to manage the ingress resources across all participating clusters. This is where you define the rules for how external traffic should be routed to your services.
  • DNS and Global Load Balancer Setup: MCI integrates with global load balancers and DNS systems to direct users to the closest or most appropriate cluster based on various criteria like location, latency, and the health of the clusters.

3. Traffic Routing

  • Initial Request: When a user makes a request to access the application, DNS resolution directs the request to the global load balancer.
  • Global Load Balancing: The global load balancer evaluates the request against the configured routing rules and the current state of the clusters (e.g., load, health). It then selects the optimal cluster to handle the request.
  • Cluster Selection: The criteria for cluster selection can include geographic proximity to the user, the health and capacity of the clusters, and other custom rules defined in the MCI configuration.
  • Request Forwarding: Once the optimal cluster is selected, the global load balancer forwards the request to an ingress controller in that cluster.
  • Service Routing: The ingress controller within the chosen cluster then routes the request to the appropriate service based on the path, host, or other headers in the request.

4. Health Checks and Failover

  • Continuous Monitoring: MCI continuously monitors the health and performance of all clusters and their services. This includes performing health checks and monitoring metrics to ensure each service is functioning correctly.
  • Failover and Redundancy: In case a cluster becomes unhealthy or is unable to handle additional traffic, MCI automatically reroutes traffic to another healthy cluster, ensuring uninterrupted access to the application.

5. Scalability and Maintenance

  • Dynamic Scaling: As traffic patterns change or as clusters are added or removed, MCI dynamically adjusts routing rules and load balancing to optimize performance and resource utilization.
  • Configuration Updates: Changes to the application or its deployment across clusters can be managed centrally through the MCI configuration, simplifying updates and maintenance.

Example Deployment YAML for Multi-cluster Ingress with FrontendConfig and BackendConfig

This example includes the following (a sketch is shown below):

  • A simple web application deployment.
  • A service to expose the application within the cluster.
  • A MultiClusterService to expose the service across clusters.
  • A MultiClusterIngress to expose the service externally with FrontendConfig and BackendConfig.

The post What is Multi-cluster Ingress (MCI) appeared first on DevOpsSchool.com. View the full article
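The original post’s manifests did not survive aggregation, so the following is only a rough sketch of what such a setup can look like on GKE, assuming multi-cluster ingress is enabled and a config cluster is registered. The resource names, namespace, annotation keys, and the FrontendConfig/BackendConfig settings are assumptions for illustration; check the GKE documentation for the exact fields supported by your version.

# Application deployed to every member cluster
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: gcr.io/google-samples/hello-app:1.0
          ports:
            - containerPort: 8080
---
# MultiClusterService selects the pods in each member cluster and creates derived Services
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: web-mcs
  namespace: demo
  annotations:
    cloud.google.com/backend-config: '{"default": "web-backendconfig"}'   # assumed annotation key
spec:
  template:
    spec:
      selector:
        app: web
      ports:
        - name: http
          protocol: TCP
          port: 8080
          targetPort: 8080
---
# MultiClusterIngress programs the global external load balancer from the config cluster
apiVersion: networking.gke.io/v1
kind: MultiClusterIngress
metadata:
  name: web-mci
  namespace: demo
  annotations:
    networking.gke.io/frontend-config: web-frontendconfig                 # assumed annotation key
spec:
  template:
    spec:
      backend:
        serviceName: web-mcs
        servicePort: 8080
---
# BackendConfig and FrontendConfig tune health checks and HTTPS behavior (example values)
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: web-backendconfig
  namespace: demo
spec:
  healthCheck:
    requestPath: /healthz
    port: 8080
---
apiVersion: networking.gke.io/v1beta1
kind: FrontendConfig
metadata:
  name: web-frontendconfig
  namespace: demo
spec:
  redirectToHttps:
    enabled: true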
  4. Autopilot mode for Google Kubernetes Engine (GKE) is our take on a fully managed, Pod-based Kubernetes platform. It provides category-defining features with a fully functional Kubernetes API, with support for StatefulSet with block storage, GPU, and other critical functionality that you don’t often find in nodeless/serverless-style Kubernetes offerings, while still offering a Pod-level SLA and a super-simple developer API. But until now, Autopilot, like other products in this category, did not offer the ability to temporarily burst CPU and memory resources beyond what was requested by the workload.

I’m happy to announce that now, powered by the unique design of GKE Autopilot on Google Cloud, we are able to bring burstable workload support to GKE Autopilot. Bursting allows your Pod to temporarily utilize resources outside of those resources that it requests and is billed for. How does this work, and how can Autopilot offer burstable support, given the Pod-based model? The key is that in Autopilot mode we still group Pods together on Nodes. This is what powers several unique features of Autopilot, such as our flexible Pod sizes. With this change, the capacity of your Pods is pooled, and Pods that set a limit higher than their requests can temporarily burst into this capacity (if it’s available).

With the introduction of burstable support, we’re also introducing another groundbreaking change: 50m CPU Pods — that is, Pods as small as 1/20th of a vCPU. Until now, the smallest Pod we offered was ¼ of a vCPU (250m CPU) — five times bigger. Combined with burst, the door is now open to run high-density-yet-small workloads on Autopilot, without constraining each Pod to its resource requests. We’re also dropping the 250m CPU resource increment, so you can create any size of Pod you like between the minimum and the maximum size, for example 59m CPU, 302m, 808m, 7682m, etc. (the memory-to-CPU ratio is automatically kept within a supported range, sizing up if needed). With these finer-grained increments, Vertical Pod Autoscaling now works even better, helping you tune your workload sizes automatically. If you want to add a little more automation to figuring out the right resource requirements, give Vertical Pod Autoscaling a try!

Here’s what our customer Ubie had to say: “GKE Autopilot frees us from managing infrastructure so we can focus on developing and deploying our applications. It eliminates the challenges of node optimization, a critical benefit in our fast-paced startup environment. Autopilot's new bursting feature and lowered CPU minimums offer cost optimization and support multi-container architectures such as sidecars. We were already running many workloads in Autopilot mode, and with these new features, we're excited to migrate our remaining clusters to Autopilot mode.” - Jun Sakata, Head of Platform Engineering, Ubie

A new home for high-density workloads

Autopilot is now a great place to run high-density workloads. Let’s say, for example, you have a multi-tenant architecture with many smaller replicas of a Pod, each running a website that receives relatively little traffic but has the occasional spike. Combining the new burstable feature and the lower 50m CPU minimum, you can now deploy thousands of these Pods in a cost-effective manner (up to 20 per vCPU), while enabling burst so that when that traffic spike comes in, that Pod can temporarily burst into the pooled capacity of these replicas.
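A burstable Pod in this model is simply one whose limits are set above its requests. Here is a small illustrative spec (names and values are assumptions, not from the announcement): it is scheduled and billed at 50m CPU but may temporarily use up to its limit when pooled capacity is free.

apiVersion: v1
kind: Pod
metadata:
  name: tiny-web                   # illustrative name
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:
          cpu: 50m                 # the new Autopilot minimum; this is what you are billed for
          memory: 64Mi
        limits:
          cpu: 250m                # temporary burst ceiling above the request
          memory: 128Mi
      ports:
        - containerPort: 80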
What’s even better is that you don’t need to concern yourself with bin-packing these workloads, or the problem where you may have a large node that’s underutilized (for example, a 32-core node running a single 50m CPU Pod). Autopilot takes care of all of that for you, so you can focus on what’s key to your business: building great services.

Calling all startups, students, and solopreneurs

Everyone needs to start somewhere, and if Autopilot wasn’t the perfect home for your workload before, we think it is now. With the 50m CPU minimum size, you can run individual containers in us-central1 for under $2/month each (50m CPU, 50MiB). And thanks to burst, these containers can use a little extra CPU when traffic spikes, so they’re not completely constrained to that low size. And if this workload isn’t mission-critical, you can even run it in Spot mode, where the price is even lower. In fact, thanks to GKE’s free tier, your costs in us-central1 can be as low as $30/month for small workloads (including production-grade load balancing) while tapping into the same world-class infrastructure that powers some of the biggest sites on the internet today.

Importantly, if your startup grows, you can scale in place without needing to migrate — since you’re already running on a production-grade Kubernetes cluster. So you can start small, while being confident that there is nothing limiting about your operating environment. We’re counting on your success as well, so good luck! And if you’re learning GKE or Kubernetes, this is a great platform to learn on. Nothing beats learning on real production systems — after all, that’s what you’ll be using on the job. With one free cluster per account, and all these new features, Autopilot is a fantastic learning environment. Plus, if you delete your Kubernetes workload and networking resources when you’re done for the day, any associated costs cease as well. When you’re ready to resume, just create those resources again, and you’re back at it. You can even persist state between learning sessions by deleting just the compute resources (and keeping the persistent disks). Don’t forget there’s a $300 free trial to cover your initial usage as well!

Next steps:

  • Learn about Pod bursting in GKE.
  • Read more about the minimum and maximum resource requests in Autopilot mode.
  • Learn how to set resource requirements automatically with VPA (a small sketch follows this item).
  • Try GKE’s Autopilot mode for a workload-based API and simpler Day 2 ops with lower TCO.
  • Going to Next ‘24? Check out session DEV224 to hear Ubie talk about how it uses burstable workloads in GKE.

View the full article
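For the Vertical Pod Autoscaling next step mentioned above, a minimal sketch looks like the following; the target Deployment name is an assumption, and the resource requires Vertical Pod Autoscaling to be enabled on the cluster.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: tiny-web-vpa               # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: tiny-web                 # hypothetical Deployment to right-size
  updatePolicy:
    updateMode: "Auto"             # apply recommendations automatically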
  5. It’s Saturday night. You’re out to dinner with friends. Suddenly, a familiar tune emits from your pocket. Dread fills you as you fish your phone out of your pocket and unlock it. You tap the alert. Maybe it’s a lucky night and this is one alert you can just snooze or resolve. Maybe it’s a bad night, and the next step is you pulling your laptop from your bag — because you bring your laptop everywhere when you’re on-call — and trying to troubleshoot a problem in a crowded, noisy restaurant. The post How to Escape the 3 AM Page as a Kubernetes Site Reliability Engineer appeared first on Security Boulevard. View the full article
  6. Today, Amazon EKS announces general availability of extended support for Kubernetes versions. You can run EKS clusters using any Kubernetes version for up to 26 months from the time the version becomes available on EKS. Extended support for Kubernetes versions is available for Kubernetes versions 1.21 and higher. For a full list of versions and support dates, see here. View the full article
  7. On March 29, 2024, Red Hat disclosed CVE-2024-3094, scoring a critical CVSS rating of 10. The post Bombshell in SSH servers! What CVE-2024-3094 means for Kubernetes users appeared first on ARMO. The post Bombshell in SSH servers! What CVE-2024-3094 means for Kubernetes users appeared first on Security Boulevard. View the full article
  8. Explore how Akeyless Vaultless Secrets Management integrates with the Kubernetes Secrets Store CSI Driver to enhance security and streamline secrets management in your Kubernetes environment. The post Enhancing Kubernetes Secrets Management with Akeyless and CSI Driver Integration appeared first on Akeyless. The post Enhancing Kubernetes Secrets Management with Akeyless and CSI Driver Integration appeared first on Security Boulevard. View the full article
  9. Kubernetes has changed the way many organizations approach the deployment of their applications. But despite its benefits, the additional layers of abstraction and reams of data can cause complexity around Kubernetes monitoring. We’ve seen so many of these challenges borne out in the results of the 2024 Observability Pulse survey. In the survey report, 36% […] View the full article
  10. If you are a DevOps engineer who works with Kubernetes, you might have encountered a situation where some of your Pods get stuck in a terminating state and refuse to go away. This can be frustrating, especially if you need to free up resources or deploy new versions of your applications. In this article, I will explain why this happens, how to diagnose the problem, and how to resolve it in a few simple steps.

Why do Pods get stuck in terminating?

When you delete a Pod, either manually or through a deployment, the Pod enters the terminating phase. This means that the Pod is scheduled to be deleted, but it is not yet removed from the node. The terminating phase is supposed to be a short-lived transitional state, where the Pod gracefully shuts down its containers, releases its resources, and sends a termination signal to the kubelet. The kubelet then removes the Pod from the API server and deletes its local data. However, sometimes a Pod will get stuck in the terminating phase, meaning the deletion process is incomplete. Below are some of the reasons why this may happen:

  • The Pod has a finalizer that prevents it from being deleted until a certain condition is met. A finalizer is a field in the Pod’s metadata that specifies an external controller or resource that needs to perform some cleanup or finalization tasks before the Pod is deleted. For example, a Pod might have a finalizer that waits for a backup to finish or a volume to unmount.
  • The Pod has a preStop hook that takes too long to execute or fails. A preStop hook is a command or a script that runs inside the container before it is terminated. It is used to perform some graceful shutdown actions, like closing connections, flushing buffers, or sending notifications. However, if the preStop hook takes longer than the terminationGracePeriodSeconds (which defaults to 30 seconds), the kubelet will forcefully kill the container, and the Pod will remain in the terminating state. Similarly, if the preStop hook fails or returns a non-zero exit code, the Pod will not be deleted.
  • The Pod is part of a StatefulSet that has a PodManagementPolicy of OrderedReady. A StatefulSet is a controller that manages Pods that have a stable identity and order. The PodManagementPolicy determines how the Pods are created and deleted. If the policy is OrderedReady, the Pods are created and deleted one by one in a strict order. This means that if a Pod is stuck in terminating, the next Pod in the sequence will not be created or deleted until the previous one is resolved.

If you need a refresher on Pods in Kubernetes, you can read our article What Are Pods in Kubernetes? A Quick Explanation for a simple overview.

Resolving Pod Stuck in Termination Due to Finalizer

In this section, we shall use a simple example to demonstrate how to resolve the termination issue due to the finalizer. Create a YAML file named deployment.yaml with the following specs:

apiVersion: v1
kind: Pod
metadata:
  name: finalizer-demo
  finalizers:
    - kubernetes
spec:
  containers:
    - name: finalizer-demo
      image: nginx:latest
      ports:
        - containerPort: 80

Create a Pod by running the following command:

kubectl create -f deployment.yaml

Verify the Pod has been created using the following command:

kubectl get pods

After the Pod creation process is complete, delete it by running the following command:

kubectl delete pod finalizer-demo

If you run kubectl get pods again, you’ll see the Pod is now stuck in the terminating stage. When you deleted the Pod, it was not deleted.
Instead, it was modified to include a deletion time. To view this, run the following command and check the metadata section:

kubectl get pod/finalizer-demo -o yaml

You should see the deletion timestamp in the metadata section. To delete this Pod, you’ll need to remove the finalizer manually. You do this by running the command:

kubectl edit pod finalizer-demo -n default

This will open the Pod's YAML definition in your default editor. Find the finalizers field in the metadata section and delete the line that contains the finalizer name. Save and exit the editor. This will update the Pod's definition and trigger its deletion.

Resolving Pod Stuck in Termination Due to PreStop Hook

Just like in the previous section, we shall use a simple example to demonstrate how to resolve the termination issue due to the preStop hook. Create a YAML file named deployment.yaml with the following specs:

apiVersion: v1
kind: Pod
metadata:
  name: prestop-demo
spec:
  terminationGracePeriodSeconds: 3600
  containers:
    - name: prestop-demo
      image: nginx:latest
      ports:
        - containerPort: 80
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/sh
              - -c
              - sleep 3600

Create a Pod by running the following command:

kubectl create -f deployment.yaml

Verify the Pod has been created using the following command:

kubectl get pods

After the Pod creation process is complete, delete it by running the following command:

kubectl delete pod prestop-demo

If you run kubectl get pods again, you’ll see the Pod is now stuck in the terminating stage. When you deleted the Pod, it was not deleted. Instead, it was modified to include a deletion time. To view this, run the following command and check the metadata section:

kubectl get pod/prestop-demo -o yaml

Note that the deletionGracePeriodSeconds is set at 3600. This means the Pod will be terminated one hour after the delete command is executed. However, you can remove it instantly and bypass the grace period. You do that by setting the new grace period to 0 using the following command:

kubectl delete pod prestop-demo -n default --force --grace-period=0

The command above bypasses the graceful termination process and sends a SIGKILL signal to the Pod's containers. This might cause some data loss or corruption, so use this method with caution.

Resolving Pod Stuck in Termination Due to PodManagementPolicy of OrderedReady

Again, we shall use a simple example to demonstrate how to resolve the termination issue due to a PodManagementPolicy of OrderedReady. Create a YAML file named statefulset.yaml with the following specs:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 3
  podManagementPolicy: OrderedReady # default value
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi

Create a StatefulSet by running the following command:

kubectl create -f statefulset.yaml

Verify that the StatefulSet has been created using the following command:

kubectl get statefulsets

Now, try to delete the StatefulSet to see if it will be stuck in the terminating stage:

kubectl delete statefulset web

If you run the kubectl get pods command, you will see that the Pods are deleted one by one, starting from the highest ordinal to the lowest.
This is because the OrderedReady policy ensures that the Pods are created and deleted in order. This can be slow and inefficient, especially if you have a large number of Pods. To change the PodManagementPolicy to Parallel, you need to edit the StatefulSet's YAML definition. You do this by running the command:

kubectl edit statefulset web -n default

This will open the StatefulSet's YAML definition in your default editor. Find the podManagementPolicy field in the spec section and change its value from OrderedReady to Parallel. Save and exit the editor. This will update the StatefulSet's definition and allow it to create and delete Pods in parallel without waiting for the previous ones to be resolved. Now, if you try to delete the StatefulSet again, you will see that the Pods are deleted in parallel, without any order. This can be faster and more efficient, especially if you have a large number of Pods.

kubectl delete statefulset web

If you run the kubectl get pods command, you will see all the Pods terminating at once.

Prevent Future Occurrences

Below are some preventative measures you can take to avoid having Pods stuck in termination:

  • Implement Pod hooks and preStop for graceful shutdown. You can use these hooks to perform some cleanup or finalization tasks, such as closing connections, flushing buffers, or sending notifications. This will help the Pod to shut down gracefully and avoid errors or data loss.
  • Add liveness/readiness probes to prevent recreating loops. Use these probes to detect and recover from failures, such as deadlocks, crashes, or network issues. This will prevent the Pod from getting stuck in a loop of recreating and terminating. Learn more about readiness probes in this article: Kubernetes Readiness Probe: A Simple Guide with Examples. (A minimal sketch of the probe and preStop suggestions appears after this item.)
  • Handle orphaned/erroneous child processes. Sometimes, a container might spawn child processes that are not properly terminated when the container exits. These processes might keep running in the background, consuming resources and preventing the Pod from being deleted. Below are the methods you can use to handle these processes:
    • Use a PID namespace. A PID namespace is a feature that isolates the process IDs of a group of processes. This means that each Pod has its own set of process IDs, and the processes inside the Pod cannot interact with the processes outside the Pod. This will ensure that all the processes in the Pod are terminated together when the Pod is deleted.
    • Use an init container. An init container is a special type of container that runs before the main containers in the Pod. It is used to perform some initialization tasks, such as setting up the environment, installing dependencies, or configuring the network. You can also use an init container to run a process manager, such as tini or dumb-init, that will act as the parent process of the main containers and handle the signals and reaping of the child processes. This will ensure that no processes are left behind when the Pod is deleted.
  • Monitor and alert on long termination times. Use metrics and events to monitor and alert on long termination times. This will allow you to address termination issues as soon as they occur.

Check out our Kubernetes Learning Path to start learning Kubernetes today.

Conclusion

In this article, we have explained why Kubernetes Pods get stuck in the ‘Terminating’ phase, how to diagnose the problem, and how to resolve it in a few simple steps. I hope you found this article helpful and interesting.
If you have any questions or feedback, please feel free to leave a comment below. Are you ready for practical learning? Subscribe now on our plan and pricing page to unlock 70+ top DevOps courses. Begin your DevOps journey today! View the full article
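The probe and preStop suggestions from the preventative-measures list above might look like this minimal sketch; the image, paths, ports, and timings are illustrative assumptions.

apiVersion: v1
kind: Pod
metadata:
  name: graceful-demo
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: web
      image: nginx:latest
      ports:
        - containerPort: 80
      readinessProbe:              # stop routing traffic to the Pod when it is not ready
        httpGet:
          path: /
          port: 80
        periodSeconds: 5
      livenessProbe:               # restart the container if it stops responding
        httpGet:
          path: /
          port: 80
        initialDelaySeconds: 10
        periodSeconds: 10
      lifecycle:
        preStop:                   # short, bounded cleanup instead of a long sleep
          exec:
            command: ["/bin/sh", "-c", "nginx -s quit; sleep 5"]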
  11. When developers are innovating quickly, security can be an afterthought. That’s even true for AI/ML workloads, where the stakes are high for organizations trying to protect valuable models and data. When you deploy an AI workload on Google Kubernetes Engine (GKE), you can benefit from the many security tools available in Google Cloud infrastructure. In this blog, we share security insights and hardening techniques for training AI/ML workloads on one framework in particular — Ray.

Ray needs security hardening

As a distributed compute framework for AI applications, Ray has grown in popularity in recent years, and deploying it on GKE is a popular choice that provides flexibility and configurable orchestration. You can read more on why we recommend GKE for Ray. However, Ray lacks built-in authentication and authorization, which means that if you can successfully send a request to the Ray cluster head, it will execute arbitrary code on your behalf. So how do you secure Ray? The authors state that security should be enforced outside of the Ray cluster, but how do you actually harden it? Running Ray on GKE can help you achieve a more secure, scalable, and reliable Ray deployment by taking advantage of existing global Google infrastructure components, including Identity-Aware Proxy (IAP). We’re also making strides in the Ray community to make safer defaults for running Ray with Kubernetes using KubeRay. One focus area has been improving Ray component compliance with the restricted Pod Security Standards profile and adding security best practices, such as running the operator as non-root to help prevent privilege escalation.

Security separation supports multi-cluster operation

One key advantage of running Ray inside Kubernetes is the ability to run multiple Ray clusters, with diverse workloads, managed by multiple teams, inside a single Kubernetes cluster. This gives you better resource sharing and utilization because nodes with accelerators can be used by several teams, and spinning up Ray on an existing GKE cluster saves waiting on VM provisioning time before workloads can begin execution. Security plays a supporting role in landing those multi-cluster advantages by using Kubernetes security features to help keep Ray clusters separate. The goal is to avoid accidental denial of service or accidental cross-tenant access. Note that the security separation here is not “hard” multitenancy — it is only sufficient for clusters running trusted code and teams that trust each other with their data. If further isolation is required, consider using separate GKE clusters. The architecture is shown in the following diagram. Different Ray clusters are separated by namespaces within the GKE cluster, allowing authorized users to make calls to their assigned Ray cluster, without accessing others.

Diagram: Ray on GKE Architecture

How to secure Ray on GKE

At Google Cloud, we’ve been working on improving the security of KubeRay components and making it easier to spin up a multi-team environment with the help of Terraform templates, including sample security configurations that you can reuse. Below, we’ve summarized fundamental security best practices included in our sample templates (a minimal sketch of the namespace-level controls follows this item):

  • Namespaces: Separate Ray clusters into distinct namespaces by placing one Ray cluster per namespace to take advantage of Kubernetes policies based on the namespace boundary.
  • Role-based access control (RBAC): Practice least privilege by creating a Kubernetes Service Account (KSA) per Ray cluster namespace, avoiding the default KSA associated with each Ray cluster namespace, and minimizing permissions down to no RoleBindings until deemed necessary. Optionally, consider setting automountServiceAccountToken: false on the KSA to ensure the KSA’s token is not available to the Ray cluster Pods, since Ray jobs are not expected to call the Kubernetes API.
  • Resource quotas: Harden against denial of service due to resource exhaustion by setting limits for resource quotas (especially for CPUs, GPUs, TPUs, and memory) on your Ray cluster namespace.
  • NetworkPolicy: Protect the Ray API as a critical measure of Ray security, since there is no authentication or authorization for submitting jobs. Use Kubernetes NetworkPolicy with GKE Dataplane V2 to control which traffic reaches the Ray components.
  • Security context: Comply with Kubernetes Pod Security Standards by configuring Pods to run with hardened settings that prevent privilege escalation, disallow running as root, and restrict potentially dangerous syscalls.
  • Workload identity federation: If necessary, secure access from your Ray deployment Pods to other Google Cloud services, such as Cloud Storage, with workload identity federation by leveraging your KSA in a Google Cloud IAM policy.

Additional security tools

The following tools and references can provide additional security for your Ray clusters on GKE:

  • Identity-Aware Proxy (IAP): Control access to your Ray cluster with Google’s distributed global endpoint, with IAP providing user and group authorization and Ray deployed as a Kubernetes Ingress or Gateway service.
  • Pod Security Standards (PSS): Turn Pod Security Standards on for each of your namespaces in order to prevent common insecure misconfigurations such as hostPath volume mounts. If you need more policy customization, you can also use Policy Controller.
  • GKE Sandbox: Leverage GKE Sandbox Pods based on gVisor to add a second security layer around Pods, further reducing the possibility of breakouts for your Ray clusters. Currently available for CPUs (also GPUs with some limitations).
  • Cluster hardening: By default, GKE Autopilot already applies a lot of cluster hardening best practices, but there are some additional ways to lock down the cluster. The Ray API can be further secured by removing access from the Internet by using private nodes.
  • Organization policies: Ensure your organization's clusters meet security and hardening standards by setting custom organization policies — for example, guarantee that all GKE clusters are Autopilot.

Google continues to contribute to the community through our efforts to ensure safe and scalable deployments. We look forward to continued collaboration to ensure Ray runs safely on Kubernetes clusters. Please drop us a line with any feedback at ray-on-gke@google.com or comment on our GitHub repo.

To learn more, check out the following resources:

  • Terraform templates for hardening Ray on GKE
  • GKE cluster hardening guide

View the full article
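As a companion to the best-practices list above, here is a minimal illustrative sketch of the namespace-level controls for one Ray tenant; the namespace name, quota values, and policy are assumptions, not taken from the referenced Terraform templates.

apiVersion: v1
kind: Namespace
metadata:
  name: ray-team-a                 # one Ray cluster per namespace
---
# Cap what one Ray cluster can consume so a runaway job cannot starve other tenants
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ray-team-a-quota
  namespace: ray-team-a
spec:
  hard:
    requests.cpu: "64"
    requests.memory: 256Gi
    requests.nvidia.com/gpu: "4"
---
# Allow only in-namespace ingress so other tenants cannot reach the unauthenticated Ray API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: ray-team-a
spec:
  podSelector: {}
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector: {}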
  12. Author: Tim Hockin (Google)

The Go programming language has played a huge role in the success of Kubernetes. As Kubernetes has grown, matured, and pushed the bounds of what "regular" projects do, the Go project team has also grown and evolved the language and tools. In recent releases, Go introduced a feature called "workspaces" which was aimed at making projects like Kubernetes easier to manage. We've just completed a major effort to adopt workspaces in Kubernetes, and the results are great. Our codebase is simpler and less error-prone, and we're no longer off on our own technology island.

GOPATH and Go modules

Kubernetes is one of the most visible open source projects written in Go. The earliest versions of Kubernetes, dating back to 2014, were built with Go 1.3. Today, 10 years later, Go is up to version 1.22 — and let's just say that a whole lot has changed. In 2014, Go development was entirely based on GOPATH. As a Go project, Kubernetes lived by the rules of GOPATH. In the buildup to Kubernetes 1.4 (mid 2016), we introduced a directory tree called staging. This allowed us to pretend to be multiple projects, but still exist within one git repository (which had advantages for development velocity). The magic of GOPATH allowed this to work.

Kubernetes depends on several code-generation tools which have to find, read, and write Go code packages. Unsurprisingly, those tools grew to rely on GOPATH. This all worked pretty well until Go introduced modules in Go 1.11 (mid 2018). Modules were an answer to many issues around GOPATH. They gave more control to projects on how to track and manage dependencies, and were overall a great step forward. Kubernetes adopted them. However, modules had one major drawback — most Go tools could not work on multiple modules at once. This was a problem for our code-generation tools and scripts. Thankfully, Go offered a way to temporarily disable modules (GO111MODULE to the rescue). We could get the dependency tracking benefits of modules, but the flexibility of GOPATH for our tools. We even wrote helper tools to create fake GOPATH trees and played tricks with symlinks in our vendor directory (which holds a snapshot of our external dependencies), and we made it all work. And for the last 5 years it has worked pretty well. That is, it worked well unless you looked too closely at what was happening. Woe be upon you if you had the misfortune to work on one of the code-generation tools, or the build system, or the ever-expanding suite of bespoke shell scripts we use to glue everything together.

The problems

Like any large software project, we Kubernetes developers have all learned to deal with a certain amount of constant low-grade pain. Our custom staging mechanism let us bend the rules of Go; it was a little clunky, but when it worked (which was most of the time) it worked pretty well. When it failed, the errors were inscrutable and un-Googleable — nobody else was doing the silly things we were doing. Usually the fix was to re-run one or more of the update-* shell scripts in our aptly named hack directory. As time went on we drifted farther and farther from "regular" Go projects. At the same time, Kubernetes got more and more popular. For many people, Kubernetes was their first experience with Go, and it wasn't always a good experience. Our eccentricities also impacted people who consumed some of our code, such as our client library and the code-generation tools (which turned out to be useful in the growing ecosystem of custom resources).
The tools only worked if you stored your code in a particular GOPATH-compatible directory structure, even though GOPATH had been replaced by modules more than four years prior. This state persisted because of the confluence of three factors:

  • Most of the time it only hurt a little (punctuated with short moments of more acute pain).
  • Kubernetes was still growing in popularity - we all had other, more urgent things to work on.
  • The fix was not obvious, and whatever we came up with was going to be both hard and tedious.

As a Kubernetes maintainer and long-timer, my fingerprints were all over the build system, the code-generation tools, and the hack scripts. While the pain of our mess may have been low on average, I was one of the people who felt it regularly.

Enter workspaces

Along the way, the Go language team saw what we (and others) were doing and didn't love it. They designed a new way of stitching multiple modules together into a new workspace concept. Once enrolled in a workspace, Go tools had enough information to work in any directory structure and across modules, without GOPATH or symlinks or other dirty tricks. When I first saw this proposal I knew that this was the way out. This was how to break the logjam. If workspaces was the technical solution, then I would put in the work to make it happen.

The work

Adopting workspaces was deceptively easy. I very quickly had the codebase compiling and running tests with workspaces enabled. I set out to purge the repository of anything GOPATH related. That's when I hit the first real bump - the code-generation tools. We had about a dozen tools, totalling several thousand lines of code. All of them were built using an internal framework called gengo, which was built on Go's own parsing libraries. There were two main problems:

  • Those parsing libraries didn't understand modules or workspaces.
  • GOPATH allowed us to pretend that Go package paths and directories on disk were interchangeable in trivial ways. They are not.

Switching to a modules- and workspaces-aware parsing library was the first step. Then I had to make a long series of changes to each of the code-generation tools. Critically, I had to find a way to do it that was possible for some other person to review! I knew that I needed reviewers who could cover the breadth of changes and reviewers who could go into great depth on specific topics like gengo and Go's module semantics. Looking at the history for the areas I was touching, I asked Joe Betz and Alex Zielenski (SIG API Machinery) to go deep on gengo and code-generation, Jordan Liggitt (SIG Architecture and all-around wizard) to cover Go modules and vendoring and the hack scripts, and Antonio Ojea (wearing his SIG Testing hat) to make sure the whole thing made sense. We agreed that a series of small commits would be easiest to review, even if the codebase might not actually work at each commit.

Sadly, these were not mechanical changes. I had to dig into each tool to figure out where they were processing disk paths versus where they were processing package names, and where those were being conflated. I made extensive use of the delve debugger, which I just can't say enough good things about. One unfortunate result of this work was that I had to break compatibility. The gengo library simply did not have enough information to process packages outside of GOPATH. After discussion with gengo and Kubernetes maintainers, we agreed to make gengo/v2.
I also used this as an opportunity to clean up some of the gengo APIs and the tools' CLIs to be more understandable and not conflate packages and directories. For example you can't just string-join directory names and assume the result is a valid package name. Once I had the code-generation tools converted, I shifted attention to the dozens of scripts in the hack directory. One by one I had to run them, debug, and fix failures. Some of them needed minor changes and some needed to be rewritten. Along the way we hit some cases that Go did not support, like workspace vendoring. Kubernetes depends on vendoring to ensure that our dependencies are always available, even if their source code is removed from the internet (it has happened more than once!). After discussing with the Go team, and looking at possible workarounds, they decided the right path was to implement workspace vendoring. The eventual Pull Request contained over 200 individual commits.

Results

Now that this work has been merged, what does this mean for Kubernetes users? Pretty much nothing. No features were added or changed. This work was not about fixing bugs (and hopefully none were introduced). This work was mainly for the benefit of the Kubernetes project, to help and simplify the lives of the core maintainers. In fact, it would not be a lie to say that it was rather self-serving - my own life is a little bit better now.

This effort, while unusually large, is just a tiny fraction of the overall maintenance work that needs to be done. Like any large project, we have lots of "technical debt" — tools that made point-in-time assumptions and need revisiting, internal APIs whose organization doesn't make sense, code which doesn't follow conventions which didn't exist at the time, and tests which aren't as rigorous as they could be, just to throw out a few examples. This work is often called "grungy" or "dirty", but in reality it's just an indication that the project has grown and evolved. I love this stuff, but there's far more than I can ever tackle on my own, which makes it an interesting way for people to get involved. As our unofficial motto goes: "chop wood and carry water".

Kubernetes used to be a case-study of how not to do large-scale Go development, but now our codebase is simpler (and in some cases faster!) and more consistent. Things that previously seemed like they should work, but didn't, now behave as expected. Our project is now a little more "regular". Not completely so, but we're getting closer.

Thanks

This effort would not have been possible without tons of support. First, thanks to the Go team for hearing our pain, taking feedback, and solving the problems for us. Special mega-thanks goes to Michael Matloob, on the Go team at Google, who designed and implemented workspaces. He guided me every step of the way, and was very generous with his time, answering all my questions, no matter how dumb. Writing code is just half of the work, so another special thanks to my reviewers: Jordan Liggitt, Joe Betz, Alexander Zielenski, and Antonio Ojea. These folks brought a wealth of expertise and attention to detail, and made this work smarter and safer. View the full article
  13. In the ever-evolving world of software development, efficiency and clarity in managing complex systems have become paramount. Kubernetes, the de facto orchestrator for containerized applications, brings its own set of challenges, especially when dealing with the vast amounts of JSON-formatted data it generates. Here, jq, a lightweight and powerful command-line JSON processor, emerges as a vital tool in a DevOps professional's arsenal. This comprehensive guide explores how to leverage jq to simplify, process, and analyze Kubernetes data, enhancing both productivity and insight. Understanding jq and Kubernetes Before diving into the integration of jq with Kubernetes, it's essential to grasp the basics. jq is a tool designed to transform, filter, map, and manipulate JSON data with ease. Kubernetes, on the other hand, manages containerized applications across a cluster of machines, producing and utilizing JSON outputs extensively through its API and command-line tools like kubectl. View the full article
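To make the idea concrete, here are two illustrative jq pipelines over kubectl's JSON output (a minimal sketch; the fields shown are standard Pod fields, but adapt the filters to your own resources):

# Print each pod's name and the node it is scheduled on.
kubectl get pods -o json | jq -r '.items[] | "\(.metadata.name)\t\(.spec.nodeName)"'

# Count pods per phase (Running, Pending, ...) across all namespaces.
kubectl get pods --all-namespaces -o json \
  | jq '[.items[].status.phase] | group_by(.) | map({phase: .[0], count: length})'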
  14. Kubernetes revolutionised container orchestration, allowing faster and more reliable application deployment and management. But even though it transformed the world of DevOps, it introduced new challenges around security maintenance, networking and application lifecycle management. Canonical has a long history of providing production-grade Kubernetes distributions, which gave us great insights into Kubernetes’ challenges and the unique experience of delivering K8s that match the expectations of both developers and operations teams. Unsurprisingly, there is a world of difference between them. Developers need a quick and reproducible way to set up an application environment on their workstations. Operations teams with clusters powering the edge need lightweight high-availability setups with reliable upgrades. Cloud installations need intelligent cluster lifecycle automation to ensure applications can be integrated with each other and the underlying infrastructure.

We provide two distributions, Charmed Kubernetes and MicroK8s, to meet those different expectations. Charmed Kubernetes wraps upstream K8s with software operators to provide lifecycle management and automation for large and complex environments. It is also the best choice if the Kubernetes cluster has to integrate with custom storage, networking or GPU components. MicroK8s has a thriving community of users; it is a production-grade, ZeroOps solution that powers laptops and edge environments. It is the simplest way to get Kubernetes anywhere and focus on software product development instead of working with infrastructure routines and operations.

After providing Kubernetes distributions for over seven years, we decided to consolidate our experience into a new distribution that combines the best of both worlds: ZeroOps for small clusters and intelligent automation for larger production environments that also want to benefit from the latest community innovations. Canonical Kubernetes will be our third distribution and an excellent foundation for future MicroK8s and Charmed Kubernetes releases. You can find its beta in our Snap Store under the simple name k8s. We based it on the latest upstream Kubernetes 1.30 beta, which officially came out on 12 March. It will be a CNCF conformant distribution with an enhanced security posture and best-in-class open source components for the most demanding user needs: network, DNS, metrics server, local storage, ingress, gateway, and load balancer.

ZeroOps with the most essential features built-in

Canonical Kubernetes is easy to install and easy to maintain. Like MicroK8s, Canonical Kubernetes is installed as a snap, giving developers a great installation experience and advanced security features such as automated patch upgrades. Adding new nodes to your cluster comes with minimum hassle. It also provides a quick way to set up high availability. You need two commands to get a single node cluster, one for installation and another for cluster bootstrap. You can try it out now on your console by installing the k8s snap from the beta channel:

sudo snap install k8s --channel=1.30-classic/beta --classic
sudo k8s bootstrap

If you look at the status of your cluster just after bootstrap – with the help of the k8s status command – you might immediately spot that the network, dns, and metrics-server are already running. In addition to those three, Canonical Kubernetes also provides local-storage, ingress, gateway, and load-balancer, which you can easily enable.
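For example, assuming the beta's CLI (the enable subcommand and feature names here follow the beta documentation and may change), checking and enabling features looks roughly like this:

# Inspect the cluster right after bootstrap; network, dns and metrics-server
# should already be reported as running.
sudo k8s status

# Enable one of the optional built-in features mentioned above (assumed syntax;
# check the current documentation if it has changed).
sudo k8s enable local-storage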
Under the hood, these are powered by Cilium, CoreDNS, OpenEBS, and Metrics Server. We bundle these as built-in features to ensure tight integration and a seamless experience. We want to emphasise standard Kubernetes APIs and abstractions to minimise disruption during upgrades while enabling the platform to evolve. All our built-in features come with default configurations that make sense for the most popular use cases, but you can easily change them to suit your needs.

Same Kubernetes for developer workstations, edge, cloud and data centres

Typical application development flows start with the developer workstation and go through CI/CD pipelines to end up in the production environment. These software delivery stages, spanning various environments, should be closely aligned to enhance developer experience and avoid infrastructure configuration surprises as your software progresses through the pipeline. When done right, you can deploy applications faster. You also get better security assurance as everyone can use the same K8s binary offered by the same vendor across the entire infrastructure software stack.

When you scale up from the workstation to a production environment, you will inevitably be exposed to a different class of problems inherent to large-scale infrastructure. For instance, managing and upgrading cluster nodes becomes complicated and time-consuming as the number of nodes and applications grows. To provide the smooth automation administrators need, we offer Kubernetes lifecycle management through Juju, Canonical’s open source orchestration engine for software operators. If you have Juju installed on your machine already, a Canonical Kubernetes cluster is only a single command away:

juju deploy k8s --channel edge

By letting Juju Charm automate your lifecycle management, you can benefit from its rich integration ecosystem, including the Canonical Observability Stack.

Enhanced security posture

Security is critical to any Kubernetes cluster, and we have addressed it from the beginning. Canonical Kubernetes 1.30 instals as a snap with a classic confinement level, enabling automatic patch upgrades to protect your infrastructure against known vulnerabilities. Canonical Kubernetes will be shipped as a strict snap in the future, which means it will run in complete isolation with minimal access to the underlying system’s resources. Additionally, Canonical Kubernetes will comply with security standards like FIPS, CIS and DISA-STIG.

Critical functionalities we have built into Canonical Kubernetes, such as networking or dns, are shipped as secure container images maintained by our team. Those images are built with Ubuntu as their base OS and benefit from the same security commitments we make on the distribution. While it is necessary to contain core Kubernetes processes, we must also ensure that the user or operator-provided workloads running on top get a secure, adequately controlled environment. Future versions of Canonical Kubernetes will provide AppArmor profiles for the containers that do not inherit the enhanced features of the underlying container runtime. We will also work on creating an allowlist for kernel modules that can be loaded using the Kubernetes Daemonsets. It will contain a default list of the most popular modules, such as GPU modules needed by AI workloads. Operators will be able to edit the allowlist to suit their needs.

Try out Canonical Kubernetes 1.30 beta

We would love for you to try all the latest features in upstream Kubernetes through our beta.
Get started by visiting http://documentation.ubuntu.com/canonical-kubernetes

Besides getting a taste of the features I outlined above, you’ll be able to try exciting changes that will soon be included in the upcoming upstream GA release on 17 April 2024. Among others, CEL for admission controls will become stable, and the drop-in directory for Kubelet configuration files will go to the beta stage. Additionally, Contextual logging and CRDValidationRatcheting will graduate to beta and be enabled by default. There are also new metrics, such as image_pull_duration_seconds, which can tell you how much time the node spent waiting for the image.

We want Canonical Kubernetes to be a great K8s for everyone, from developers to large-scale cluster administrators. Try it out and let us know what you think. We would love your feedback! You can find contact information on our community page. We’ll also be available at KubeCon in Paris, at booth E25 – if you are there, come and say hi. View the full article
  15. In the realm of containerized applications, Kubernetes reigns supreme. But with great power comes great responsibility, especially when it comes to safeguarding sensitive data within your cluster. Terraform, the infrastructure-as-code darling, offers a powerful solution for managing Kubernetes Secrets securely and efficiently. This blog delves beyond the basics, exploring advanced techniques and considerations for leveraging Terraform to manage your Kubernetes Secrets. Understanding Kubernetes Secrets Kubernetes Secrets provides a mechanism to store and manage sensitive information like passwords, API keys, and tokens used by your applications within the cluster. These secrets are not directly exposed in the container image and are instead injected into the pods at runtime. View the full article
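As a quick illustration of the underlying object (plain kubectl rather than the article's Terraform code; all names and values are placeholders):

# Create a generic Secret holding database credentials.
kubectl create secret generic db-credentials \
  --from-literal=username=app \
  --from-literal=password='S3cr3t!'

# Inspect it; note that Secret data is only base64-encoded by default,
# which is why access to it should be tightly controlled.
kubectl get secret db-credentials -o yaml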
  16. Authors: Amit Dsouza, Frederick Kautz, Kristin Martin, Abigail McCarthy, Natali Vlatko

A quick look: exciting changes in Kubernetes v1.30

It's a new year and a new Kubernetes release. We're halfway through the release cycle and have quite a few interesting and exciting enhancements coming in v1.30. From brand new features in alpha, to established features graduating to stable, to long-awaited improvements, this release has something for everyone to pay attention to! To tide you over until the official release, here's a sneak peek of the enhancements we're most excited about in this cycle!

Major changes for Kubernetes v1.30

Structured parameters for dynamic resource allocation (KEP-4381)

Dynamic resource allocation was added to Kubernetes as an alpha feature in v1.26. It defines an alternative to the traditional device-plugin API for requesting access to third-party resources. By design, dynamic resource allocation uses parameters for resources that are completely opaque to core Kubernetes. This approach poses a problem for the Cluster Autoscaler (CA) or any higher-level controller that needs to make decisions for a group of pods (e.g. a job scheduler). It cannot simulate the effect of allocating or deallocating claims over time. Only the third-party DRA drivers have the information available to do this.

Structured Parameters for dynamic resource allocation is an extension to the original implementation that addresses this problem by building a framework to support making these claim parameters less opaque. Instead of handling the semantics of all claim parameters themselves, drivers could manage resources and describe them using a specific "structured model" pre-defined by Kubernetes. This would allow components aware of this "structured model" to make decisions about these resources without outsourcing them to some third-party controller. For example, the scheduler could allocate claims rapidly without back-and-forth communication with dynamic resource allocation drivers. Work done for this release centers on defining the framework necessary to enable different "structured models" and to implement the "named resources" model. This model allows listing individual resource instances and, compared to the traditional device plugin API, adds the ability to select those instances individually via attributes.

Node memory swap support (KEP-2400)

In Kubernetes v1.30, memory swap support on Linux nodes gets a big change to how it works - with a strong emphasis on improving system stability. In previous Kubernetes versions, the NodeSwap feature gate was disabled by default, and when enabled, it used UnlimitedSwap behavior as the default behavior. To achieve better stability, UnlimitedSwap behavior (which might compromise node stability) will be removed in v1.30. The updated, still-beta support for swap on Linux nodes will be available by default. However, the default behavior will be to run the node set to NoSwap (not UnlimitedSwap) mode. In NoSwap mode, the kubelet supports running on a node where swap space is active, but Pods don't use any of the page file. You'll still need to set --fail-swap-on=false for the kubelet to run on that node. However, the big change is the other mode: LimitedSwap. In this mode, the kubelet actually uses the page file on that node and allows Pods to have some of their virtual memory paged out. Containers (and their parent pods) do not have access to swap beyond their memory limit, but the system can still use the swap space if available.
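For illustration, a kubelet configuration fragment opting a node into the LimitedSwap behavior described above might look like the sketch below (the fields follow the KubeletConfiguration API; how you merge this into your node's existing configuration and restart the kubelet depends on your setup):

# Write a fragment to merge into the node's kubelet configuration
# (commonly /var/lib/kubelet/config.yaml), then restart the kubelet.
cat <<'EOF' > kubelet-swap-fragment.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
memorySwap:
  swapBehavior: LimitedSwap
EOF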
Kubernetes' Node special interest group (SIG Node) will also update the documentation to help you understand how to use the revised implementation, based on feedback from end users, contributors, and the wider Kubernetes community. Read the previous blog post or the node swap documentation for more details on Linux node swap support in Kubernetes.

Support user namespaces in pods (KEP-127)

User namespaces is a Linux-only feature that better isolates pods to prevent or mitigate several CVEs rated high/critical, including CVE-2024-21626, published in January 2024. In Kubernetes 1.30, support for user namespaces is migrating to beta and now supports pods with and without volumes, custom UID/GID ranges, and more!

Structured authorization configuration (KEP-3221)

Support for structured authorization configuration is moving to beta and will be enabled by default. This feature enables the creation of authorization chains with multiple webhooks with well-defined parameters that validate requests in a particular order and allows fine-grained control – such as explicit Deny on failures. The configuration file approach even allows you to specify CEL rules to pre-filter requests before they are dispatched to webhooks, helping you to prevent unnecessary invocations. The API server also automatically reloads the authorizer chain when the configuration file is modified. You must specify the path to that authorization configuration using the --authorization-config command line argument. If you want to keep using command line flags instead of a configuration file, those will continue to work as-is. To gain access to new authorization webhook capabilities like multiple webhooks, failure policy, and pre-filter rules, switch to putting options in an --authorization-config file. From Kubernetes 1.30, the configuration file format is beta-level, and only requires specifying --authorization-config since the feature gate is enabled by default. An example configuration with all possible values is provided in the Authorization docs. For more details, read the Authorization docs.

Container resource based pod autoscaling (KEP-1610)

Horizontal pod autoscaling based on ContainerResource metrics will graduate to stable in v1.30. This new behavior for HorizontalPodAutoscaler allows you to configure automatic scaling based on the resource usage for individual containers, rather than the aggregate resource use over a Pod. See our previous article for further details, or read container resource metrics.

CEL for admission control (KEP-3488)

Integrating Common Expression Language (CEL) for admission control in Kubernetes introduces a more dynamic and expressive way of evaluating admission requests. This feature allows complex, fine-grained policies to be defined and enforced directly through the Kubernetes API, enhancing security and governance capabilities without compromising performance or flexibility. CEL's addition to Kubernetes admission control empowers cluster administrators to craft intricate rules that can evaluate the content of API requests against the desired state and policies of the cluster without resorting to Webhook-based access controllers. This level of control is crucial for maintaining the integrity, security, and efficiency of cluster operations, making Kubernetes environments more robust and adaptable to various use cases and requirements. For more information on using CEL for admission control, see the API documentation for ValidatingAdmissionPolicy.
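To make the CEL admission idea concrete, here is a minimal, hedged sketch of a ValidatingAdmissionPolicy and its binding (modelled on the upstream API; the policy name, expression, and resource selection are illustrative, and older clusters may need the v1beta1 API version instead):

kubectl apply -f - <<'EOF'
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: demo-replica-limit
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["apps"]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 5"
    message: "Deployments in this cluster may not request more than 5 replicas."
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: demo-replica-limit-binding
spec:
  policyName: demo-replica-limit
  validationActions: ["Deny"]
EOF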
We hope you're as excited for this release as we are. Keep an eye out for the official release blog in a few weeks for more highlights! View the full article
  17. Amazon CloudWatch Container Insights with Enhanced Observability for EKS now auto-discovers critical health and performance metrics from your NVIDIA GPUs and delivers them in automatic dashboards to enable faster problem isolation and troubleshooting for your AI/ML workloads. Container Insights with Enhanced Observability delivers you out-of-the-box trends and patterns on your infrastructure health and removes the overhead of manual dashboard and alarm set-ups saving you time and effort. View the full article
  18. Optimizing resource utilization is a universal aspiration, but achieving it is considerably more complex than one might express in mere words. The process demands extensive performance testing, precise server right-sizing, and numerous adjustments to resource specifications. These challenges persist and, indeed, become more nuanced within Kubernetes environments than in traditional systems. At the core of constructing a high-performing and cost-effective Kubernetes cluster is the art of efficiently managing resources by tailoring your Kubernetes workloads. To delve into the intricacies of Kubernetes, it's essential to understand the different components that interact when deploying applications on k8s clusters. During my research for this article, an enlightening piece on LinkedIn caught my attention, underscoring the tendency of enterprises to overprovision their Kubernetes clusters. In this article, I propose solutions for enterprises to enhance their cluster efficiency and reduce expenses. View the full article
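As a small illustration of the kind of right-sizing adjustment this involves (names and values are placeholders, not recommendations), you might compare observed usage against requests and then tighten them:

# Compare actual usage with what the workload currently requests
# (requires the metrics-server to be installed).
kubectl top pod -n my-namespace

# Adjust requests and limits for an overprovisioned deployment.
kubectl set resources deployment my-app -n my-namespace \
  --requests=cpu=250m,memory=256Mi --limits=cpu=500m,memory=512Mi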
  19. In the rapidly evolving landscape of container orchestration, Kubernetes has emerged as the de facto standard, offering a robust framework for deploying, managing, and scaling containerized applications. One of the cornerstone features of Kubernetes is its powerful and flexible scheduling system, which efficiently allocates workloads across a cluster of machines, known as nodes. This article delves deep into the mechanics of Kubernetes scheduling, focusing on the pivotal roles of pods and nodes, to equip technology professionals with the knowledge to harness the full potential of Kubernetes in their projects. Understanding Kubernetes Pods A pod is the smallest deployable unit in Kubernetes and serves as a wrapper for one or more containers that share the same context and resources. Pods encapsulate application containers, storage resources, a unique network IP, and options that govern how the container(s) should run. A key concept to grasp is that pods are ephemeral by nature; they are created and destroyed to match the state of your application as defined in deployments. View the full article
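As a minimal sketch of these concepts (the pod name, image, and request values are illustrative), the resource requests below are exactly the numbers the scheduler uses when choosing a node for the pod:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-web
spec:
  containers:
  - name: web
    image: nginx:1.25
    resources:
      requests:
        cpu: "250m"
        memory: "128Mi"
EOF

# See which node the scheduler placed the pod on.
kubectl get pod demo-web -o wide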
  20. This article will lead you through installing and configuring Prometheus, a popular open-source monitoring and alerting toolset, in a Kubernetes context. Prometheus is extensively used for cloud-native applications since it is built to monitor and gather metrics from many services and systems. This post will walk you through setting up Prometheus to successfully monitor your Kubernetes cluster. Prerequisites Before you begin, ensure you have the following prerequisites in place: View the full article
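One common way to get started (not necessarily the article's exact steps) is the community Helm chart; the release and namespace names below are arbitrary:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace

# Verify that the Prometheus and related pods come up.
kubectl get pods -n monitoring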
  21. We've all been there - you're testing out a new Kubernetes deployment only to be greeted with frustrating errors like "imagepullbackoff" or "errImagePull". I have faced these two errors many times; I know the frustration of just wanting your pods to run seamlessly. In this article, I'll walk through some common causes for image pull failures, how to troubleshoot and fix these errors, and how to avoid them in the future.

What Causes ImagePullBackOff & ErrImagePull Errors?

The ImagePullBackOff and ErrImagePull errors are two of the most common pod failures in Kubernetes. They both mean that the pod cannot start because the container image cannot be pulled from the registry. The difference between them is that ErrImagePull is the initial error, and ImagePullBackOff is the subsequent error after Kubernetes retries to pull the image several times and fails. Below are 5 possible causes of these errors.

Cause 1: Network Issues Preventing Image Pull

One of the possible causes of the ImagePullBackOff or ErrImagePull errors is network issues that prevent Pods and nodes from accessing the remote container image registries. This can be due to the following:

- The registry URL is incorrect or unreachable.
- The network or firewall configuration is blocking the connection to the registry.
- The proxy settings are not configured properly.

To troubleshoot this cause, do the following:

Check network connectivity
Validate that the Pods and nodes can access the remote container image registries by using the `curl` or `wget` commands. If the command returns a valid response, it means that the network connectivity is fine. But if the command returns an error or times out, it means that there is a network issue that needs to be fixed.

Check firewall rules
If you have a network firewall, make sure that it allows outbound access to the required ports for the registry. For instance, if your registry is Docker Hub, you must connect to port 443 for HTTPS. You can use commands like `iptables` or `firewall-cmd` to see and change the firewall rules on your nodes. If the firewall rules are not configured properly, you need to update them to allow the connection to the registry.

Check proxy settings
If you are pulling images through a proxy, make sure that you configure the HTTPS proxy settings on your nodes and Pods. You can use the `https_proxy` environment variable to set the proxy URL on your nodes and Pods. You can also use the `imagePullSecrets` field in your Pod spec to provide the proxy credentials to your Pods. For example, you can create a secret named `my-proxy-secret` with the proxy credentials and then use it in your Pod spec as shown below:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: my-image:latest
    imagePullPolicy: Always
  imagePullSecrets:
  - name: my-proxy-secret

If the proxy settings are wrong, you need to update them to enable the image pull through the proxy.

Cause 2: Invalid Image Names or Tags

Another reason for these errors is the mismatch between the image names or tags used and the image names and tags in the registry. This might be because there are issues with image names or tags, like typos, mismatch with the registry or using the latest tag which might cause unexpected updates. To fix this problem, do the following:

Validate image names and tags
Check that all the image names and tags in your Pod specs are correct and match the images in the registry.
Use the `kubectl get pod` and `kubectl describe pod` commands to check the image names and tags in your Pod specs. If the image name or tag is incorrect, you need to fix it in your Pod spec and redeploy your Pod.

Pull image manually
Try pulling the image directly from the command line interface (CLI) to verify that the image name and tag are valid and exist in the registry. You can use the docker pull or podman pull commands to pull the image from the registry. If the command succeeds, it means that the image name and tag are valid and exist in the registry. If the command fails, it means that the image name or tag is invalid or does not exist in the registry. You need to fix the image name or tag in your Pod spec or push the image to the registry if it does not exist.

Check for misspelled names or tags
Sometimes, the image name or tag can be misspelled. For example, you might have typed my-image:lates instead of my-image:latest. To avoid this, you should use descriptive and consistent image names and tags. Additionally, avoid using the latest tag, which can cause unexpected image updates and inconsistencies. If you are not familiar with the Pod spec or the deployment config, you can refer to our article: Kubernetes Architecture Explained: Overview for DevOps Enthusiasts

Cause 3: Insufficient Storage or Disk Issues

Another potential cause of these errors is insufficient storage or disk issues on the nodes. This will prevent the image from being downloaded and stored. This occurs when:

- The node disk is full or has insufficient space for the image
- The node disk is slow or has high I/O latency
- The image is too large or has too many layers

To diagnose this as the potential root cause, you should:

Check available storage capacity
Pods may fail to start if there is insufficient disk space on the node to store the image. Run the df -h command to check the available storage capacity on the node. If the command shows that the disk is full or has low free space, then you have to delete unused files, images, and Pods to free space or simply add more disk space to the node. To learn more on how to remove unused images, check our article on How To Remove Unused And Dangling Docker Images

Check disk I/O performance
Saturated disk I/O can cause image pull timeouts or failures, especially if the image is large or has many layers. Run the iostat command to check the disk I/O performance on the node. If the command shows that the disk I/O is high or has high latency, you need to improve the disk I/O performance by reducing the disk load, using faster disks, or optimizing the image size or layers.

Cause 4: Unauthorized Access

Unauthorized access to the registry or the image is another potential cause of this error. This can happen because:

- The registry requires authentication and the credentials are missing or invalid
- The service account does not have permission to pull the image
- The credentials are expired or revoked

To troubleshoot a potential unauthorized access problem, you should:

Validate image pull secret
If the registry requires authentication, you need to provide the credentials to the Pod using an image pull secret. You can run the kubectl create secret command to create an image pull secret with the credentials and then use the imagePullSecrets field in your Pod spec to reference it. For example, you can create a secret named my-secret with the credentials for Docker Hub and then use it in your Pod.
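A minimal sketch of that step (the registry URL, username, password, and service account are placeholders):

# Create an image pull secret for the registry (Docker Hub shown here).
kubectl create secret docker-registry my-secret \
  --docker-server=https://index.docker.io/v1/ \
  --docker-username=my-user \
  --docker-password='REDACTED'

# Either reference it from the Pod spec via imagePullSecrets, or attach it to
# the service account so every Pod using that account inherits it.
kubectl patch serviceaccount default \
  -p '{"imagePullSecrets": [{"name": "my-secret"}]}'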
If the image pull secret is invalid, you need to fix it by replacing it with the correct credentials and referencing it in your Pod spec.

Ensure the service account has pull permissions
If you are using a service account to pull the image, make sure that it has permission to do so. You can run the following commands to check the role and role binding of the service account my-service-account in the default namespace:

# Check the role of the service account
kubectl get role my-role -n default

# Check the role binding of the service account
kubectl get rolebinding my-rolebinding -n default

If the role or role binding is missing or incorrect, you need to fix it or create a new one with the correct permissions and reference it in your Pod spec. For instance, you can create a role named my-role with the permission to pull images, and a role binding named my-rolebinding that binds the role to the service account my-service-account, and then use it in your Pod spec like this:

If the credentials are expired or revoked, you need to renew them or create new ones and update the image pull secret or the service account accordingly.

Cause 5: Image Registry Issues

Another possible cause of these errors is image registry issues that prevent the image from being available or accessible. This can happen due to the following reasons:

- The registry is down or unreachable
- The registry does not have the requested image or tag
- The registry has errors or bugs that affect the image pull

When you want to fix this problem, you should:

Confirm registry is up and running
Check the status and availability of the registry by using the curl or wget commands to test the registry URLs from your browser or CLI. If the command returns a valid response, it means that the registry is up and running. But if the command returns an error or times out, it means that the registry is down or unreachable. You will need to contact the registry provider or administrator to resolve the issue.

Check the registry for requested images
Check the registry for the existence of the requested images and tags by using the registry web interface or API. For example, if you are using Docker Hub as your registry, you can use the following URL to check the image my-image:latest:

https://hub.docker.com/r/my-user/my-image/tags?page=1&ordering=last_updated

If the URL shows the image and tag, it means that the registry has the requested image and tag. If the URL does not show the image and tag, it means that the registry does not have the requested image and tag. You need to push the image and tag to the registry or fix the image name and tag in your pod spec if they are incorrect.

Trace the registry logs for errors
Trace the logs on the registry for any errors or bugs that might affect the image pull. You can use the registry web interface or API to access the logs, or contact the registry provider or administrator to get the logs. If the logs show any errors or bugs, you will also need to contact the registry provider or administrator to resolve them.
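For the reachability check above, something like the following can help (Docker Hub's registry endpoint is used as an example; substitute your own registry and pod name):

# An HTTP status code (even 401) means the registry answered; a timeout
# points to network or firewall issues instead.
curl -sS -o /dev/null -w '%{http_code}\n' https://registry-1.docker.io/v2/

# The pod's events usually quote the registry's error message verbatim.
kubectl describe pod my-pod | grep -A 10 'Events:'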
How to Avoid ImagePullBackOff & ErrImagePull Errors?

Following the best practices below will help you avoid these errors:

- Use descriptive and consistent image names and tags, and avoid using the latest tag.
- Use a reliable and secure registry service, such as Docker Hub, Azure Container Registry, or Amazon Elastic Container Registry, and configure the registry’s URL correctly in your Pod spec.
- Use secrets to store and provide the registry credentials to your Pods, and avoid hard-coding the credentials in your Pod spec or Dockerfile.
- Test your images locally before pushing them to the registry, and make sure they are compatible with your Kubernetes cluster version and architecture.
- Monitor your network and firewall settings, and ensure that your nodes and Pods can communicate with the registry without any issues.
- Monitor your node disk space, and ensure that you have enough space for your images and Pods.

By following these best practices, you can reduce the chances of encountering the ImagePullBackOff or ErrImagePull errors and improve the reliability and performance of your Kubernetes deployments. Check out our Kubernetes Learning Path to master Kubernetes!

Conclusion

Always be methodical in your approach. Start with networking, then image configuration, credentials, and logs/events for clues. Nine times out of ten, the issue is one of those basics rather than something deeper in the Kubernetes API. I hope these tips help you resolve image pull errors faster so you can get back to developing awesome apps. If you have any questions or feedback, please leave a comment below. KodeKloud is a platform that offers you 70+ learning resources, including courses, labs, quizzes, and projects, to help you acquire various DevOps skills. You can sign up for a free account at KodeKloud to start your DevOps journey today. View the full article
  22. Deployed by more than 60% of organizations worldwide, Kubernetes (K8s) is the most widely adopted container-orchestration system in cloud computing. K8s clusters have emerged as the preferred solution for practitioners looking to orchestrate containerized applications effectively, so these clusters often contain various software, services, and resources, enabling users to deploy and scale applications with relative ease. To support a typical K8s environment operation, a cluster is often granted access to other environments such as artifact repositories, CI/CD environments, databases etc. Thus, K8s clusters can store customer data, financial records, intellectual property, access credentials, secrets, configurations, container images, infrastructure credentials, encryption keys, certificates, and network or service information. With so many clusters containing potentially valuable and lucrative data exposed to the internet, K8s provides a tempting target for threat actors. This risk escalates with the number of organizations that have misconfigurations that leave K8s clusters exposed and vulnerable to attacks. View the full article
  23. Do you find yourself lying awake late at night, worried that your greatest observability fears will materialize as one of the most horrific specters of Kubernetes-driven chaos reaches up through your mattress to consume your very soul? Even as your mind races and you wonder just who that creepy character sneaking around the metaphysical boiler […] View the full article
  24. Learn how to launch an Apache Kafka cluster with the Apache Kafka Raft (KRaft) consensus protocol and SSL encryption. This article is a continuation of my previous article Running Kafka in Kubernetes with KRaft mode... View the full article
  25. While troubleshooting a production issue in your Kubernetes cluster, you notice that a critical application pod is repeatedly crashing. In the pod’s logs, you see an error message indicating “Permission Denied.” After further investigation, you discover that the issue is related to security authorization. What could be a likely cause of this problem? Correct Answer: […] View the full article