
Access Patterns of a Service Running on OpenShift Using AWS Athena


Red Hat


One thing every developer and operations engineer wants to know is who is accessing their application or service. They want the requester, IP address, and request type, not only to analyze the traffic patterns of their customer base, but also to stop certain IP addresses and CIDRs from accessing the service. Insight into traffic patterns is helpful from a security point of view as well as for understanding customer behavior.

In this article, we show how to gather that information for a containerized ParksMap application running on an OpenShift Container Platform cluster deployed on AWS with IPI (installer-provisioned infrastructure). The front end of the application uses an existing container image, docker.io/openshiftroadshow/parksmap-katacoda:1.2.0.

The back-end service for the ParksMap application provides data, via a REST service API, on major national parks from all over the world. The ParksMap front-end web application will query this data and display it on an interactive map in your web browser.

You can get detailed step-by-step instructions on deploying the application from this link:

https://learn.openshift.com/introduction/getting-started/

Plan

To get the above information for our application, we first need to capture the access logs of the Elastic Load Balancer (ELB) used by OpenShift. The collected ELB access logs are stored in an Amazon Simple Storage Service (S3) bucket. We can then analyze the access logs directly from the S3 bucket using AWS Athena, an interactive query service, and get the desired information on the access patterns of our service running on OpenShift on AWS.

Details

ELB (Elastic Load Balancer) Access Logs

The access logs for Elastic Load Balancing capture detailed information about the requests made to your load balancer and store it as log files in the Amazon S3 bucket that you specify. Each log entry contains details such as the time a request was received, the client's IP address, latencies, request path, and server responses.

You can use these access logs to analyze traffic patterns and to troubleshoot your back-end applications. For more information, see Access logs for your Classic Load Balancer.
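To make the log contents concrete, here is a minimal Python sketch that splits one Classic Load Balancer access log entry into named fields. The field order follows the AWS access log documentation; the function name, the dictionary keys, and the sample entry are illustrative assumptions, not part of the setup described in this article.

```python
import shlex

# Field names in the order they appear in a Classic Load Balancer log entry.
FIELDS = [
    "timestamp", "elb", "client", "backend",
    "request_processing_time", "backend_processing_time",
    "response_processing_time", "elb_status_code", "backend_status_code",
    "received_bytes", "sent_bytes", "request", "user_agent",
    "ssl_cipher", "ssl_protocol",
]


def parse_elb_log_line(line):
    """Split one access log entry into a dict keyed by field name."""
    # shlex keeps the double-quoted request and user agent as single values.
    values = shlex.split(line)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    entry = dict(zip(FIELDS, values))
    # The client field has the form "ip:port".
    entry["client_ip"], entry["client_port"] = entry["client"].rsplit(":", 1)
    # The request field has the form "METHOD URL PROTOCOL".
    entry["method"], entry["url"], entry["protocol"] = entry["request"].split(" ", 2)
    return entry


# Illustrative entry in the documented format (not taken from a real cluster).
SAMPLE = (
    "2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 "
    "10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -'
)

entry = parse_elb_log_line(SAMPLE)
print(entry["client_ip"], entry["method"], entry["elb_status_code"])
```

This sketch assumes well-formed entries; a production parser would need to handle unusual user agent strings and malformed requests more defensively.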

Configuring the ELB Access Logs

First, you must enable the access logs feature on the Load Balancer, which is disabled by default. Logs are stored in an Amazon S3 bucket, which incurs additional storage costs.

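Elastic Load Balancing publishes a log file at the interval you choose, which for a Classic Load Balancer is either 5 or 60 minutes. Requests are logged on a best-effort basis, so treat the logs as a picture of the traffic rather than a complete accounting of every request.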

The IPI installer creates load balancers that distribute traffic to the OpenShift control plane API and to the Route/Ingress resources. Clients access the service or application running on OpenShift through the Classic Load Balancer used for the Route/Ingress resources, which listens on ports 80 (HTTP) and 443 (HTTPS).

To enable access logs for your load balancer using the AWS Web UI


This creates an S3 bucket named my-loadbalancer-logs and saves the access log files under the prefix my-app.

To enable access logs for your load balancer using the AWS CLI

Command to get the Classic Load Balancer name from the CLI

mshetty@mshetty-mac .aws % aws elb describe-load-balancers  --output=text | grep LOADBALANCERDESCRIPTIONS | awk '{print $6}'
a1c90996a01f9448ea57b224d65c4483

Now that we have the Classic Load Balancer name, we can enable access logs for it using the AWS CLI. Before doing that, create the “openshift-loadbalancer-logs” S3 bucket.

Attach a policy to your S3 bucket to grant Elastic Load Balancing permission to write logs to your bucket.
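As a rough illustration of what such a policy looks like, the Python sketch below builds a bucket policy document that lets Elastic Load Balancing write objects under the log prefix. The function name is hypothetical, the account ID 123456789012 is a placeholder for your own AWS account, and 127311923021 is the Elastic Load Balancing account ID that AWS documents for us-east-1; check the AWS documentation for the ID that matches your region before using it.

```python
import json


def build_elb_log_bucket_policy(bucket, prefix, account_id, elb_account_id):
    """Return a bucket policy allowing ELB to write access logs under the prefix."""
    # ELB writes log objects below <prefix>/AWSLogs/<your-account-id>/.
    resource = f"arn:aws:s3:::{bucket}/{prefix}/AWSLogs/{account_id}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The ELB service account for the region (assumption: us-east-1).
                "Principal": {"AWS": f"arn:aws:iam::{elb_account_id}:root"},
                "Action": "s3:PutObject",
                "Resource": resource,
            }
        ],
    }


policy = build_elb_log_bucket_policy(
    bucket="openshift-loadbalancer-logs",
    prefix="my-app",
    account_id="123456789012",      # placeholder: your AWS account ID
    elb_account_id="127311923021",  # assumption: the cluster runs in us-east-1
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be saved to a file and attached with aws s3api put-bucket-policy --bucket openshift-loadbalancer-logs --policy file://policy.json, where policy.json is a file name chosen for this example.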

Next, create a .json file that enables Elastic Load Balancing to capture and deliver logs every 60 minutes to the S3 bucket that you created for the logs:

{
 "AccessLog": {
   "Enabled": true,
   "S3BucketName": "openshift-loadbalancer-logs",
   "EmitInterval": 60,
   "S3BucketPrefix": "my-app"
 }
}
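If you prefer to generate this file rather than type it by hand, a small Python sketch such as the following produces the same attributes document and rejects an interval that a Classic Load Balancer does not accept (only 5 or 60 minutes are valid). The function name is an assumption made for this example.

```python
import json

# Classic Load Balancers accept only these publishing intervals, in minutes.
VALID_INTERVALS = (5, 60)


def build_access_log_attributes(bucket, prefix, interval=60):
    """Return the attributes document that enables ELB access logs."""
    if interval not in VALID_INTERVALS:
        raise ValueError(f"EmitInterval must be one of {VALID_INTERVALS}")
    return {
        "AccessLog": {
            "Enabled": True,
            "S3BucketName": bucket,
            "EmitInterval": interval,
            "S3BucketPrefix": prefix,
        }
    }


attributes = build_access_log_attributes("openshift-loadbalancer-logs", "my-app")
# Save this output as the .json file passed to the AWS CLI.
print(json.dumps(attributes, indent=2))
```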

To enable access logs, specify the .json file in the modify-load-balancer-attributes command as follows:

mshetty@mshetty-mac 4.5 % aws elb modify-load-balancer-attributes --load-balancer-name a1c90996a01f9448ea57b224d65c4483 --load-balancer-attributes file://my-json-file.json
{
"LoadBalancerName": "a1c90996a01f9448ea57b224d65c4483",
"LoadBalancerAttributes": {
    "AccessLog": {
        "Enabled": true,
        "S3BucketName": "openshift-loadbalancer-logs",
        "EmitInterval": 60,
        "S3BucketPrefix": "my-app"
    }
}
}

If you get an error such as “An error occurred (InvalidConfigurationRequest) when calling the ModifyLoadBalancerAttributes operation: Access Denied for bucket: openshift-loadbalancer-logs. Please check S3 Bucket permission”, the cause is most likely an error in the bucket policy.

Analyzing Load Balancer Access Logs With AWS Athena

ELB access logs can be useful when troubleshooting and investigating specific requests. However, if you want to find and analyze patterns in the overall access log files, you might want to use dedicated log analytics tools like AWS Athena, especially if you are dealing with large amounts of traffic generating heavy log file volume.

Amazon Athena is a simple-to-use interactive query service that lets you analyze data in S3 using standard SQL, without managing any infrastructure. Athena charges you based on the amount of data scanned per query.


The query below lists every client IP address and the number of times it has accessed the service, excluding your own IP address.

SELECT request_ip, COUNT(request_ip)
FROM elb_logs
WHERE request_ip != '<your-ip-address>'
GROUP BY request_ip
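To see what this query computes, the following Python sketch applies the same filter and grouping to a short list of client IP addresses. It is only an in-memory analogue for illustration: the function name and the sample addresses (taken from the documentation-reserved ranges) are assumptions, and Athena itself runs the SQL directly against the log files in S3.

```python
from collections import Counter


def count_requests_by_ip(request_ips, excluded_ip):
    """Count requests per client IP, skipping the excluded address.

    Mirrors: SELECT request_ip, COUNT(request_ip) ... WHERE request_ip != X
    GROUP BY request_ip.
    """
    return Counter(ip for ip in request_ips if ip != excluded_ip)


# Illustrative client IPs, one per logged request.
sample_ips = [
    "198.51.100.7",
    "203.0.113.9",
    "198.51.100.7",
    "192.0.2.10",
    "198.51.100.7",
]

counts = count_requests_by_ip(sample_ips, excluded_ip="192.0.2.10")
for ip, total in counts.most_common():
    print(ip, total)
```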

In the results, two IP addresses stand out: they have accessed the service an unusually high number of times, even though the service has not been announced.
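Spotting such outliers by eye works for a short result set. For longer ones, a simple heuristic can help, sketched here in Python: flag every address whose request count is at least some multiple of the median count. The threshold factor, the function name, and the sample counts are assumptions chosen for illustration, not values from the query results above.

```python
from statistics import median


def standout_ips(counts, factor=10):
    """Return IPs whose request count is at least `factor` times the median."""
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(ip for ip, total in counts.items() if total >= factor * typical)


# Illustrative per-IP request counts.
request_counts = {
    "198.51.100.7": 4200,
    "203.0.113.9": 3900,
    "192.0.2.10": 12,
    "192.0.2.11": 9,
    "192.0.2.12": 15,
}

print(standout_ips(request_counts))
```

A median-based threshold is used because a few very large counts barely move the median, whereas they would inflate an average.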


Blocking Unwanted Access with NACL

A network access control list (ACL) is an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets. A network ACL functions at the subnet level and has separate inbound and outbound rules, each of which can either allow or deny traffic.

When a packet arrives at the subnet, it is evaluated against the inbound rules of the ACL associated with that subnet, starting at the top of the list of rules (the lowest rule number) and moving to the bottom. The first rule that matches is applied.
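The evaluation order can be sketched in a few lines of Python. This is a simplified model that matches on the source address only and ignores protocol and port; the rule numbers, the addresses, and the function name are assumptions for illustration.

```python
import ipaddress


def evaluate_inbound(rules, source_ip):
    """Return the action of the lowest-numbered rule matching source_ip."""
    addr = ipaddress.ip_address(source_ip)
    # Rules are checked in ascending rule-number order; the first match wins.
    for rule in sorted(rules, key=lambda r: r["number"]):
        if addr in ipaddress.ip_network(rule["cidr"]):
            return rule["action"]
    # Traffic that matches no numbered rule is denied.
    return "deny"


inbound_rules = [
    {"number": 100, "cidr": "0.0.0.0/0", "action": "allow"},
    {"number": 90, "cidr": "203.0.113.9/32", "action": "deny"},
]

print(evaluate_inbound(inbound_rules, "203.0.113.9"))
print(evaluate_inbound(inbound_rules, "198.51.100.20"))
```

Because rule 90 is checked before rule 100, the deny rule takes effect for the blocked address while all other traffic still reaches the allow rule. A deny rule numbered higher than the allow-all rule would never be reached.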

Now let's use the NACL to block our two unwanted IP addresses.
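The same rules can also be created from the command line. The Python sketch below only prints the aws ec2 create-network-acl-entry commands for a list of addresses, so you can review them before running anything. The network ACL ID, the addresses, the starting rule number, and the function name are all placeholders or assumptions; pick rule numbers lower than your existing allow rules so the deny entries are evaluated first.

```python
import shlex


def deny_entry_commands(acl_id, ips, first_rule_number=50):
    """Build one AWS CLI command per IP that adds an inbound deny rule."""
    commands = []
    for offset, ip in enumerate(ips):
        args = [
            "aws", "ec2", "create-network-acl-entry",
            "--network-acl-id", acl_id,
            "--ingress",                           # inbound rule
            "--rule-number", str(first_rule_number + offset),
            "--protocol=-1",                       # -1 means all protocols
            "--cidr-block", f"{ip}/32",            # a single IPv4 address
            "--rule-action", "deny",
        ]
        commands.append(shlex.join(args))
    return commands


# Placeholder ACL ID and addresses; substitute your own values.
for command in deny_entry_commands(
    "acl-0123456789abcdef0", ["198.51.100.7", "203.0.113.9"]
):
    print(command)
```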


Summary

Using ELB access logs, network ACLs (NACLs), and AWS Athena, we were able to find out details of the clients accessing our application and to prevent unwanted clients from accessing our service running on OpenShift on AWS.

In the next post, we will see how to use AWS CloudFront and AWS WAF to whitelist or blacklist certain countries from accessing our service running on OpenShift on AWS.

References

Enable access logs for your Classic Load Balancer

Querying Classic Load Balancer Logs

Access logs for your Classic Load Balancer

Monitor your Classic Load Balancer
