Overview

Data transfer costs can play a significant role in determining the overall design of a system. Amazon Elastic Container Registry (Amazon ECR), Amazon Elastic Container Service (Amazon ECS), and Amazon Elastic Kubernetes Service (Amazon EKS) can incur data transfer charges depending on a variety of factors. It can be difficult to visualize what that means relative to an Amazon ECS or Amazon EKS deployment. This blog illustrates common deployment patterns for AWS container services and explains the resulting data transfer charges that might be incurred.

Amazon ECR

There are two types of Amazon ECR registries—public and private—and each one has different data transfer charges.

Amazon ECR public registry

Amazon ECR Public is a managed AWS container image registry service that is secure, scalable, and reliable. Amazon ECR supports public image repositories with resource-based permissions using AWS Identity and Access Management (IAM) so that specific users can access your public repositories to push images. Developers can use their preferred command line interface (CLI) to push and manage Docker images, Open Container Initiative (OCI) images, and OCI-compatible artifacts. Images are publicly available to pull, either anonymously or through an Amazon ECR Public authentication token.

All data transferred into Amazon ECR Public incurs no charge from Amazon ECR. Data transferred out is subject to charges when more than 5 TB per month is transferred out to non-AWS destinations and you have authenticated to Amazon ECR with an AWS account. Clients that have not authenticated can anonymously transfer up to 500 GB per month out to non-AWS destinations. After that limit is reached, no further anonymous data transfers are allowed.

Architecture of data transfer into and out of ECR public registry: Data transfer into ECR Public does not incur charges. Data transfer out exceeding 5 TB per month to non-AWS destinations incurs charges.

Figure 1. Amazon ECR public registry
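The thresholds above can be turned into a small calculation. The following is a minimal sketch, not an official pricing formula: the function name is hypothetical, and treating 1 TB as 1024 GB is an assumption made only for illustration. It splits one month of transfer out to non-AWS destinations into the billable authenticated portion and the anonymous portion that is served or blocked.

```python
# Assumption for this sketch: 1 TB is treated as 1024 GB.
GB_PER_TB = 1024
AUTHENTICATED_FREE_GB = 5 * GB_PER_TB  # 5 TB per month (from the text)
ANONYMOUS_LIMIT_GB = 500               # 500 GB per month (from the text)


def ecr_public_transfer_out(authenticated_gb, anonymous_gb):
    """Split one month of transfer out to non-AWS destinations into buckets."""
    # Only authenticated transfer above the 5 TB threshold is billable.
    billable_gb = max(0, authenticated_gb - AUTHENTICATED_FREE_GB)
    # Anonymous transfer is capped; anything above the cap is not served.
    served_anonymous_gb = min(anonymous_gb, ANONYMOUS_LIMIT_GB)
    return {
        "billable_authenticated_gb": billable_gb,
        "served_anonymous_gb": served_anonymous_gb,
        "blocked_anonymous_gb": anonymous_gb - served_anonymous_gb,
    }
```

Multiply the billable gigabytes by the current rate from the Amazon ECR Pricing page to estimate a charge; no rate is hard-coded here because rates change.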

Amazon ECR private registry

An Amazon ECR private registry hosts container images in a highly available and scalable architecture. You can use an Amazon ECR private registry to manage private image repositories consisting of Docker and OCI images and artifacts.

Data transferred into the Amazon ECR private registry incurs no charge from Amazon ECR. Data transferred between Amazon ECR and other services within the same Region is free. Data transferred between Amazon ECR and other services in different Regions is charged at internet data transfer rates on both sides of the transfer. Note that this is aggregated with other outbound data transfers across multiple services, and rate tiers apply based on the amount of data transferred.

Architecture of data transfer into and out of ECR private registry: Data transfer across Regions for either image pulls or ECR cross-Region replication incurs charges. Data transfer within the same Region does not incur charges.

Figure 2. Amazon ECR private registry

Amazon ECR offers built-in functionality to replicate container images to different locations. This could be useful for disaster recovery purposes, a promote-to-production pipeline, and image pull time and data transfer cost reductions when running containers in different Regions. This data transfer is charged at the cross-Region data transfer rates described on the Amazon Elastic Compute Cloud (Amazon EC2) On-Demand Pricing page.
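To see why replication can reduce data transfer, compare the volume that crosses the Region boundary with and without it. This is a rough sketch under two stated assumptions: every pull transfers the full image (no layer caching on the host), and replication copies each pushed image version across Regions exactly once. The function and parameter names are hypothetical.

```python
def cross_region_gb(image_gb, pulls_per_month, pushes_per_month, replicate):
    """Estimate gigabytes crossing the Region boundary in one month."""
    if replicate:
        # Each pushed image version crosses once; later pulls come from
        # the local Region, which the text describes as free.
        return image_gb * pushes_per_month
    # Without replication, every pull from the remote Region crosses.
    return image_gb * pulls_per_month
```

For example, with a 0.5 GB image, 1,000 pulls, and 20 pushes per month, the sketch gives 500 GB without replication and 10 GB with it. Replicated images also add storage in the destination Region, which this sketch does not model.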

Refer to the Amazon ECR Pricing page and Amazon EC2 On-Demand Pricing page for more information.

Data transfers for Amazon ECS

Three common deployment models for Amazon ECS are Amazon ECS Anywhere, clusters with external network access, and clusters without external network access.

Amazon ECS Anywhere

Amazon ECS Anywhere is a feature of Amazon ECS that lets you easily run and manage container-based applications on premises, including on your own virtual machines (VMs) and bare-metal servers. With Amazon ECS Anywhere, you do not need to install or operate local container orchestration software, thus reducing operational overhead. Amazon ECS Anywhere offers a completely managed solution that lets you standardize container management across all of your environments.

Data transfer fees depend on how the customer-managed infrastructure connects to the Amazon ECS Anywhere control plane. If the connection goes over the open internet (Figure 3), there is no charge for data transferred between the Amazon ECS Anywhere control plane and the Amazon ECS agent.

Architecture of ECS Anywhere over the internet: When communication occurs between the ECS Anywhere control plane and ECS agent over the open internet, there is no charge for data transfer.

Figure 3. Amazon ECS Anywhere communication over the internet

If the connection goes through AWS Site-to-Site VPN or AWS Direct Connect (Figure 4), the standard data transfer out fees for Site-to-Site VPN or Direct Connect apply to communication between the Amazon ECS Anywhere control plane and the Amazon ECS agent.

Architecture of ECS Anywhere over Direct Connect: Standard data transfer fees apply to Site-to-Site VPN or Direct Connect for communication from the ECS Anywhere control plane to the on-premises ECS agent.

Figure 4. Amazon ECS Anywhere with AWS Direct Connect

Refer to the Amazon ECS Anywhere Pricing page, AWS Site-to-Site VPN Pricing page, and AWS Direct Connect Pricing page for more details.

Amazon ECS clusters with external access

Compute capacity for container instances in Amazon ECS can be deployed within virtual private clouds (VPCs) that allow access to the Amazon ECS service endpoint using the internet. For example, the compute capacity for container instances can be deployed in public subnets (with an internet gateway) or private subnets (with a NAT gateway, shown in Figure 5), or it can route to the internet through another VPC using AWS Transit Gateway. Both Amazon EC2 and AWS Fargate can be used for the compute capacity in this type of deployment, and there is no difference in data transfer costs based on which service is chosen.

Diagram of sample ECS deployment with ECR, ECS tasks, and an RDS database deployed across multiple Availability Zones: Internet access is provided through NAT Gateway in public subnets in each Availability Zone. Data transfer is charged for communication across Availability Zones and outbound communication to ECR and the ECS control plane.

Figure 5. Amazon ECS deployment with internet access

The following data transfers are not charged for in the sample deployment:

  • Data transferred in from the Amazon ECS control plane (responses to API calls from the data plane) and Amazon ECR (image pulls)
  • Communication between tasks and the Application Load Balancer (assuming targets are available in each Availability Zone and cross-zone load balancing is avoided)
  • Communication between tasks and the database instance in the same Availability Zone

In this deployment, data transfer charges accrue for the following:

  • Data transferred out through the NAT gateway, including polling traffic from the Amazon ECS agent to the Amazon ECS control plane and outbound data to Amazon ECR
  • Cross-Availability Zone traffic to the database

It is important to note that although the NAT gateway does not charge for data transferred in, there is still a data processing charge per gigabyte on data that flows through the NAT gateway, regardless of direction. The data transfer out charges are in addition to this charge. The NAT gateway pricing can be found on the Amazon VPC Pricing page.
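The two NAT gateway charges stack as described above: processing applies to traffic in both directions, and data transfer out applies only to outbound traffic. A minimal sketch follows, with hypothetical function and parameter names; the rates are inputs to be taken from the Amazon VPC Pricing page and are not built in.

```python
def nat_gateway_charges(gb_out, gb_in, processing_rate_per_gb,
                        transfer_out_rate_per_gb):
    """Estimate per-gigabyte NAT gateway charges for one billing period."""
    # Data processing applies to every gigabyte, regardless of direction.
    processing = (gb_out + gb_in) * processing_rate_per_gb
    # Data transfer out applies only to outbound gigabytes.
    transfer_out = gb_out * transfer_out_rate_per_gb
    return {
        "processing": processing,
        "transfer_out": transfer_out,
        "total": processing + transfer_out,
    }
```

The NAT gateway also carries an hourly charge, which is left out here because it does not depend on traffic volume.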

Amazon ECS clusters with no external access

Another common pattern for deploying Amazon ECS workloads is to restrict all external network access (Figure 6). Because container instances in Amazon ECS clusters still need to communicate with the Amazon ECS service endpoint, AWS PrivateLink must be used to reach the service endpoints. Again, both Amazon EC2 and Fargate can be used for the compute capacity in this type of deployment. However, there is a difference in data transfer costs based on which service is chosen.

The following data transfers are not charged for in the sample deployment:

  • Communication between tasks and the Application Load Balancer
  • Communication between tasks and the database instance in the same Availability Zone

In this deployment, data transfer charges accrue for cross-Availability Zone traffic to the database.

The diagram depicts the Amazon ECS and Amazon ECR PrivateLink VPC interface endpoints as single objects. However, there are actually multiple endpoints required for each. For a description of endpoint requirements, consult this AWS Compute blog post or the Amazon ECS and Amazon ECR documentation.

Diagram of sample ECS deployment with ECR, ECS Tasks, and an RDS database deployed across multiple Availability Zones with no internet access: PrivateLink is used to communicate to S3, ECR, and ECS. Data transfer is charged for communication across Availability Zones.

Figure 6. Amazon ECS deployment in private network

Although there is no data transfer fee, PrivateLink imposes both a per-hour service charge and a per-gigabyte data processing charge for data processed through each VPC endpoint, billed per Availability Zone. More details can be found on the AWS PrivateLink Pricing page.
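Because the hourly charge is billed per endpoint and per Availability Zone, it scales with both counts, while the processing charge scales with traffic. A minimal sketch, with hypothetical names and with rates passed in as inputs rather than assumed:

```python
def privatelink_charges(endpoints, availability_zones, hours, gb_processed,
                        hourly_rate, per_gb_rate):
    """Estimate PrivateLink interface endpoint charges for one period."""
    # Each endpoint is billed hourly in every Availability Zone it spans.
    hourly = endpoints * availability_zones * hours * hourly_rate
    # Data processing is billed on total gigabytes through the endpoints.
    processing = gb_processed * per_gb_rate
    return hourly + processing
```

For instance, five endpoints across two Availability Zones over 720 hours produce 7,200 endpoint-hours before any traffic is processed, which is why the number of endpoints matters even for quiet clusters.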

Using AWS Fargate in private Amazon ECS clusters

If Fargate provides the compute capacity for the Amazon ECS cluster, the PrivateLink endpoints for Amazon ECS are not required. This gets rid of the hourly and data processing service charges for communication to the Amazon ECS service endpoint. Note that the PrivateLink VPC endpoints for Amazon ECR are still required to pull images from Amazon ECR.
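The difference between the two launch types can be sketched as a lookup of interface endpoints. The service-name suffixes below (ecs, ecs-agent, ecs-telemetry, ecr.api, ecr.dkr) reflect the endpoint names as I understand them and should be verified against the Amazon ECS and Amazon ECR documentation; the function name is hypothetical, and the Amazon S3 endpoint shown in Figure 6 is deliberately left out.

```python
# Assumed interface endpoint suffixes; confirm against current documentation.
ECS_ENDPOINTS = ("ecs", "ecs-agent", "ecs-telemetry")
ECR_ENDPOINTS = ("ecr.api", "ecr.dkr")


def interface_endpoints(launch_type, region):
    """List interface endpoint service names for a private ECS cluster."""
    if launch_type not in ("EC2", "FARGATE"):
        raise ValueError(f"unknown launch type: {launch_type}")
    # Amazon ECR endpoints are needed either way to pull images.
    suffixes = list(ECR_ENDPOINTS)
    if launch_type == "EC2":
        # Only EC2 container instances need the Amazon ECS endpoints.
        suffixes = list(ECS_ENDPOINTS) + suffixes
    return [f"com.amazonaws.{region}.{suffix}" for suffix in suffixes]
```

Under these assumptions, an EC2-backed cluster needs five interface endpoints and a Fargate-backed cluster needs two, each billed hourly per Availability Zone.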

Data transfer for Amazon EKS

Similar to Amazon ECS, Amazon EKS data transfer charges follow the guidelines described on the Amazon EC2 Pricing page. Figure 7 represents a sample Amazon EKS workload with two pods deployed to different Amazon EKS worker nodes in different Availability Zones.

The following data transfers are not charged in this example:

  • Traffic in and out of the control plane (not shown on the diagram)
  • Image pulls from Amazon ECR (not shown on the diagram)
  • Communication between pods and the Application Load Balancer
  • Communication between pods and the database instance in the same Availability Zone

The following data transfers accrue charges in this example:

  • Data transfer between the Kubernetes pod and database in a different Availability Zone
  • Data out (egress) from the Application Load Balancer
  • Communication between pods in different Availability Zones

Diagram of a sample EKS workload with two pods deployed to different EKS worker nodes in different Availability Zones, communicating with an RDS database: Data transfer is charged for communication across Availability Zones.

Figure 7. Sample application of Amazon EKS deployment

There are several other configuration options and deployment strategies to consider in an Amazon EKS deployment relative to data transfer costs, such as cluster access (public or private), pod-to-pod communication, and communication between the pods and the load balancer.

Public Amazon EKS clusters

Public Amazon EKS clusters (Figure 8) have the public endpoint access turned on and the private endpoint turned off. Communication between the control plane and worker nodes exits the VPC based on the VPC’s routing. For worker nodes in private subnets, communication traverses a NAT gateway and exits through an internet gateway. There is no data transfer charge associated with this. However, there is a per-gigabyte NAT gateway data processing fee. In addition, worker nodes that are not in the same Availability Zone as the NAT gateway incur a per-gigabyte data transfer charge for cross-Availability Zone traffic. For worker nodes in public subnets, communication exits through the internet gateway and is not charged.

Diagram of an EKS cluster with three nodes deployed across three Availability Zones: The nodes communicate with the EKS control plane through a NAT Gateway deployed in a single Availability Zone. Data transfer is charged for communication across Availability Zones to the NAT gateway.

Figure 8. Public Amazon EKS cluster
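The cross-Availability Zone portion of this traffic can be estimated by summing the traffic from nodes that sit outside the NAT gateway's Availability Zone. This is a minimal sketch with hypothetical names and made-up node identifiers:

```python
def cross_az_gb_to_nat(node_traffic_gb, node_az, nat_az):
    """Sum gigabytes sent to the NAT gateway from nodes in other AZs."""
    return sum(
        gb
        for node, gb in node_traffic_gb.items()
        if node_az[node] != nat_az  # same-AZ traffic is not counted
    )


traffic = {"node-1": 10, "node-2": 20, "node-3": 30}
placement = {"node-1": "az-a", "node-2": "az-b", "node-3": "az-c"}
print(cross_az_gb_to_nat(traffic, placement, "az-a"))  # 50
```

Deploying a NAT gateway in each Availability Zone would bring this figure to zero, at the cost of additional NAT gateway hourly charges.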

Private Amazon EKS clusters

In a cluster with private Amazon EKS endpoints (Figure 9), worker nodes communicate with private endpoints. The Kubernetes API server endpoint URL resolves to elastic network interfaces within the (customer) VPC. Worker nodes inside the VPC communicate with these network interfaces. In this deployment model, there is a chance that a worker node communicates with a network interface in a different Availability Zone. This communication is subject to a cross-Availability Zone data transfer charge.

Diagram of a private EKS cluster with three nodes deployed across three Availability Zones: Worker nodes inside the VPC will communicate with the elastic network interfaces for the Kubernetes API server endpoints. In this deployment model, there is a chance that a worker node communicates with an ENI in a different Availability Zone. This communication would be subject to a cross-Availability Zone data transfer charge.

Figure 9. Private Amazon EKS cluster
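How large that chance is depends on how the endpoint name resolves for each connection. As an illustration only, the sketch below assumes a worker node is equally likely to reach any of the network interfaces; that uniform choice is an assumption of this example, not documented Amazon EKS behavior, and the names are hypothetical.

```python
def cross_az_probability(node_az, eni_azs):
    """Probability that a node reaches an interface in another AZ,
    assuming every interface is equally likely to be chosen."""
    if not eni_azs:
        raise ValueError("at least one network interface is required")
    remote = sum(1 for az in eni_azs if az != node_az)
    return remote / len(eni_azs)
```

With interfaces in two Availability Zones, a node that shares a zone with one of them would cross zones half the time under this assumption, while a node in a third zone would always cross.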

Pod-to-pod communication

Kubernetes applications often use communication between containers and pods (Figure 10).

Containers within the same pod are guaranteed to be on the same worker node and communicate over the loopback device, resulting in no data transfer charges. Data transfer charges between pods depend on the pod placement:

  • There is no data transfer charge for pods deployed on the same node or within the same Availability Zone
  • Pods that communicate and are placed in different Availability Zones incur a data transfer charge

Diagram of pod-to-pod communication across Availability Zones: Pods that communicate within the same Availability Zone do not incur data transfer charges. Data transfer is charged if pods communicate across Availability Zones.

Figure 10. Pod-to-pod communication
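The placement rule reduces to a comparison of Availability Zones. The following sketch uses a hypothetical Pod record and made-up names to total the gigabytes that would be charged across a set of pod-to-pod flows:

```python
from typing import NamedTuple


class Pod(NamedTuple):
    name: str
    node: str
    az: str


def charged_gb(flows):
    """Sum gigabytes for flows whose endpoints are in different AZs.

    flows: iterable of (source Pod, destination Pod, gigabytes).
    """
    # Same node or same Availability Zone: no data transfer charge.
    return sum(gb for src, dst, gb in flows if src.az != dst.az)


a = Pod("a", "node-1", "az-1")
b = Pod("b", "node-1", "az-1")  # same node as a
c = Pod("c", "node-2", "az-1")  # same AZ, different node
d = Pod("d", "node-3", "az-2")  # different AZ
print(charged_gb([(a, b, 5), (a, c, 7), (a, d, 11)]))  # 11
```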

Load balancer-to-pod communication

Amazon EKS workloads often include a load balancer to evenly distribute the traffic across pods. Pods are deployed alongside Kubernetes Service and Ingress objects. The add-on AWS Load Balancer Controller monitors these objects and automatically provisions the AWS load balancer resource. The path that traffic takes and resulting data transfer charges depend on how the Service and Ingress objects are configured. There are two common methods of configuration.

The first method involves having a target group that consists of all worker nodes within the cluster (Figure 11). This grouping includes Service: LoadBalancer, Service: NodePort, and Ingress TargetType: instance. Here, an ephemeral port (NodePort) is opened on each node in the cluster, and traffic is distributed evenly across all nodes. If the destination pod is on the node that receives the traffic, no additional data transfer occurs. If the destination pod is scheduled on a different node, data transfer charges might accrue if the target pod is in a different Availability Zone.

Diagram of Application Load Balancer communicating to EKS pods using TargetType = “instance”: In this configuration, the traffic might be forwarded across the Availability Zone, resulting in data transfer charges.

Figure 11. Load balancer-to-pod communication using TargetType: instance
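The share of traffic that ends up crossing Availability Zones in this mode can be estimated with a simple model. The sketch below assumes the load balancer spreads requests evenly across nodes (as described above) and that the receiving node then forwards each request to a pod chosen uniformly at random from all pods in the cluster. The uniform pod choice is an assumption of this example, and the function name is hypothetical.

```python
from collections import Counter


def expected_cross_az_fraction(node_azs, pod_azs):
    """Expected fraction of requests forwarded to a pod in another AZ.

    node_azs: one AZ label per worker node.
    pod_azs:  one AZ label per target pod.
    """
    nodes = Counter(node_azs)
    pods = Counter(pod_azs)
    # Probability that the receiving node and the chosen pod share an AZ.
    same_az = sum(
        (nodes[az] / len(node_azs)) * (pods[az] / len(pod_azs))
        for az in nodes
    )
    return 1.0 - same_az
```

With one node and one pod in each of three Availability Zones, this model puts roughly two thirds of requests on a cross-zone hop between the receiving node and the pod.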

The second method involves a target group consisting of the pod IP addresses (Figure 12). In this method, the load balancer targets are the IP addresses of the pods. Communication bypasses the kube-proxy and targets the pod directly, keeping traffic in the same Availability Zone and avoiding data transfer charges.

Diagram of Application Load Balancer communicating to EKS pods using TargetType = “IP”: In this configuration, traffic stays within the same Availability Zone and does not incur data transfer charges.

Figure 12. Load balancer-to-pod communication using TargetType: IP
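With the AWS Load Balancer Controller, the target type for an Ingress is selected through the alb.ingress.kubernetes.io/target-type annotation. The sketch below builds such a manifest as a Python dictionary and prints it as JSON, which kubectl accepts. The resource name, service name, port, and the use of an "alb" ingress class are placeholder assumptions for illustration.

```python
import json


def alb_ingress_manifest(name, service_name, service_port, target_type="ip"):
    """Build an Ingress manifest that selects the ALB target type."""
    if target_type not in ("ip", "instance"):
        raise ValueError("target_type must be 'ip' or 'instance'")
    backend = {"service": {"name": service_name,
                           "port": {"number": service_port}}}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "annotations": {
                # "ip" registers pod IP addresses as load balancer targets.
                "alb.ingress.kubernetes.io/target-type": target_type,
            },
        },
        "spec": {
            "ingressClassName": "alb",  # assumed ingress class name
            "rules": [{"http": {"paths": [
                {"path": "/", "pathType": "Prefix", "backend": backend},
            ]}}],
        },
    }


# Placeholder names; replace with your own Service and port.
print(json.dumps(alb_ingress_manifest("web", "web-svc", 80), indent=2))
```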

Tips

The following are tips to avoid excess data transfer and data processing charges in your container workloads:

  • Additional components in the network path, such as NAT gateways, PrivateLink, or Transit Gateway, might incur additional data transfer or data processing charges.
  • Review potential data transfer charges at both the source and target of your communication channel. Remember that “data transfer in” to a destination is also “data transfer out” from a source.
  • Limit cross-Region data transfer where possible. Use the Amazon ECR built-in cross-Region replication features to limit what is replicated.
  • Use cross-Region data replication to replicate images in Amazon ECR into additional Regions and then pull images from the local Region.
  • Try to limit container images to only the essentials required to run your workload. Images with extraneous binaries cost more to store and transfer, increase startup time, and increase the attack surface area.
  • Determine the most efficient network path for your traffic. For example, do your requirements indicate a need for a private cluster with no external connectivity? As more networking components are added, data transfer costs increase.
  • Consider consolidating PrivateLink endpoints in a central VPC connected by Transit Gateway, described in this AWS PrivateLink whitepaper.
  • In an Amazon ECS deployment with no external network connectivity, consider using Fargate to host your containers. This gets rid of the need for the PrivateLink endpoint for Amazon ECS.
  • When using a load balancer in your Amazon EKS workload, try to avoid NodePort services and target the pods directly using the IP-based TargetType for target groups.

Conclusion

To determine the most cost-efficient architecture for your container-based workloads, it’s important to understand how data transfer charges are calculated. Design decisions about compute capacity deployment, access to public AWS services, and general network architecture all affect data transfer charges.

Interested in learning more? Check out the Overview of Data Transfer Costs for Common Architectures blog post for an explanation of data transfer charges for common network architectures, and be sure to visit the AWS Containers Blog for more great container-related content!
