Search the Community
Showing results for tags 'observability'.
-
Without a doubt, you’ve heard about the persistent talent gap that has troubled the technology sector in recent years. It’s a problem that isn’t going away, plaguing everyone from engineering teams to IT security pros, and if you work in the industry today you’ve likely experienced it somewhere within your own teams. Despite major changes […]View the full article
-
Amazon CloudWatch Container Insights now offers observability for Windows containers running on Amazon Elastic Kubernetes Service (EKS), and helps customers collect, aggregate, and summarize metrics and logs from their Windows container infrastructure. With this support, customers can monitor utilization of resources such as CPU, memory, disk, and network, as well as get enhanced observability such as container-level EKS performance metrics, Kube-state metrics and EKS control plane metrics for Windows containers. CloudWatch also provides diagnostic information, such as container restart failures, for faster problem isolation and troubleshooting for Windows containers running on EKS. View the full article
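As a rough illustration of how such metrics can be queried programmatically, the sketch below builds a CloudWatch GetMetricData request payload for a pod-level CPU metric. The namespace, metric, and dimension names follow common Container Insights conventions but should be verified against your own account, and the helper function itself is hypothetical, not part of any AWS SDK.

```python
from datetime import datetime, timedelta, timezone

def container_insights_query(cluster, namespace, pod, period=300):
    """Build a CloudWatch GetMetricData request for pod CPU utilization.

    Uses the Container Insights metric namespace ("ContainerInsights")
    and common dimension names; confirm these against the metrics your
    cluster actually emits before relying on them.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=1)
    return {
        "StartTime": start,
        "EndTime": end,
        "MetricDataQueries": [
            {
                "Id": "podCpu",
                "MetricStat": {
                    "Metric": {
                        "Namespace": "ContainerInsights",
                        "MetricName": "pod_cpu_utilization",
                        "Dimensions": [
                            {"Name": "ClusterName", "Value": cluster},
                            {"Name": "Namespace", "Value": namespace},
                            {"Name": "PodName", "Value": pod},
                        ],
                    },
                    "Period": period,
                    "Stat": "Average",
                },
            }
        ],
    }

# The resulting dict can be passed to boto3, e.g.:
#   boto3.client("cloudwatch").get_metric_data(
#       **container_insights_query("my-eks", "default", "web-0"))
```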
-
- amazon cloudwatch container insights
- windows
- (and 2 more)
-
Maintaining the health of your data is one of the biggest challenges organizations face today. It’s the only way to ensure your information assets are fit for purpose, driving accurate insights. This is where data observability steps in. With data observability, you have complete visibility into the state of both your data and data systems, putting […]View the full article
-
You can now set up cross-account observability for Amazon CloudWatch Internet Monitor, so that you can get read-only access to monitors from multiple accounts within an AWS Region. Deploying applications by using resources in separate accounts is a good practice to establish security and billing boundaries between teams and to reduce the impact of operational events. For example, when you set up cross-account observability for Internet Monitor, you can access and view performance and availability measurements generated by monitors in different AWS accounts. View the full article
-
Every software-driven business strives for optimal performance and user experience. Observability—which allows engineering and IT Ops teams to understand the internal state of their cloud applications and infrastructure based on available telemetry data—has emerged as a crucial practice for achieving it. For years, application performance monitoring (APM) was the de facto practice […]View the full article
-
Observability data can be voluminous and riddled with false positives. Edge Delta has a realistic view of AI. David Wynn, Principal Solution Architect at Edge Delta, joins us to talk about how you can gain value from observability data with AI. Whether it’s to find important but hidden error messages or to build complex queries easily, Edge Delta is out to bolster observability using AI. The post Edge Delta simplifies observability with the power of AI appeared first on Amazic. View the full article
-
- edge delta
- observability
- (and 1 more)
-
As organizations seek to drive more value from their data, observability plays a vital role in ensuring the performance, security and reliability of applications and pipelines while helping to reduce costs. At Snowflake, we aim to provide developers and engineers with the best possible observability experience to monitor and manage their Snowflake environment. One of our partners in this area is Observe, which offers a SaaS observability product that is built and operated on the Data Cloud.

We’re excited to announce today that Snowflake Ventures is making an investment in Observe to significantly expand the observability experience we provide for our customers. Following the investment, Observe plans to develop best-in-class observability features that will help our customers monitor and manage their Snowflake environments even more effectively. Solutions such as out-of-the-box dashboards and new visualizations will empower developers and engineers to accelerate their work and troubleshoot problems more quickly and easily.

In addition, because Observe is built on the Data Cloud, our customers will have the option to keep their observability data within their Snowflake account instead of sending it out to a third-party provider. This further simplifies and enhances their data governance by allowing them to keep more of their data within the secure environment of their Snowflake account.

Observe is an example of how more companies are building and operating SaaS applications on the Data Cloud. By doing so, these companies gain access to our scalable infrastructure and powerful analytics while being able to offer a more advanced and differentiated experience to Snowflake customers. We will continue to expand the signals we provide for developers and engineers to manage, monitor and troubleshoot their workloads in the Data Cloud. Our partnerships with companies like Observe help turn signals into actionable insights that are presented in compelling and innovative ways.
The post Snowflake Invests in Observe to Expand Observability in the Data Cloud appeared first on Snowflake. View the full article
-
As organizations look to expand DevOps maturity, improve operational efficiency, and increase developer velocity, they are embracing platform engineering as a key driver. Indeed, recent research found that 54% of organizations are investing in platforms to enable easier integration of tools and collaboration between teams involved in automation projects. Platform engineering creates and manages a shared infrastructure and set of tools, such as internal developer platforms (IDPs), to enable software developers to build, deploy, and operate applications more efficiently. The goal is to abstract away the underlying infrastructure’s complexities while providing a streamlined and standardized environment for development teams. As a result, teams can focus on writing code and building features rather than dealing with infrastructure nuances.

During a breakout session at the Dynatrace Perform 2024 conference, Dynatrace DevSecOps activist Andreas Grabner and staff engineer Adam Gardner demonstrated how to use observability to monitor an IDP for key performance indicators (KPIs). The pair showed how to track factors including developer velocity, platform adoption, DevOps Research and Assessment (DORA) metrics, security, and operational costs. Recent Dynatrace research has found that only 40% of a typical engineer’s time is spent on productive tasks, and 36% of developers resign because of a bad developer experience, Grabner noted. “If your developers are leaving the company, the IDP may have something to do with it,” he said.

Platform engineering: Build for self-service

Self-service deployment is a key attribute of platform engineering. It gives developers the means to create environments and toolsets unique to the projects they’re working on. “[An IDP] must be a product that developers want to use because it helps them get the job done,” Grabner said. “It makes them more productive . . . and reduces the complexity of things such as reading a new app or service. They shouldn’t worry about the platform; they should just start writing code.”

Because of their versatility, teams can use IDPs for all types of software engineering projects, not just those in cloud-native scenarios. IDPs can eliminate much of the administrative minutiae that stalls development projects. Grabner gave the example of one Dynatrace banking customer who built an IDP that enables developers to provision new Azure machines or Chef policies without administrative help. “IDPs are not constrained to building microservices or a new serverless app,” Grabner noted.

Before putting an IDP in place, organizations must encourage their platform engineering teams to adopt a product mindset with feedback loops between developers and users. They should also establish milestones to ensure the built product solves a defined business problem. The Dynatrace IDP encompasses platform services, delivery services, and access to observability and automation tools. The Dynatrace Operator automatically ingests all observability data from OpenTelemetry and Prometheus. Furthermore, OneAgent observes and gathers all remaining workload logs, metrics, traces, and events.

Automate deployment for faster developer velocity

Additionally, the IDP used during the session connects to the open-source Backstage developer portal platform and a library of templates stored in a GitLab repository. The templates can deploy automatically into the development environment with just a few clicks. Argo works in a GitOps fashion to automate the deployment of files stored in Git. “Argo has an eagle eye on the Git repository,” Gardner said. “Every time something changes, it’s synced to Kubernetes.”

Backstage holds many of an organization’s critical development resources, which must be treated with the same respect as business-critical data. “That means making it available, resilient, and secure,” Grabner said. Observability is not only about measuring performance and speed, but also about capturing granular business analytics to support data-driven decision-making. These metrics can include how many people are using the IDP, how quickly tasks are running in the IDP, and more.

Intelligent monitoring is also crucial. “If you don’t monitor, you risk building a product that nobody needs,” Grabner continued. Observability is a critical component of an IDP. It illuminates the activity of components such as Backstage, GitHub, Argo, and other tools. Service-level objectives (SLOs) are similarly important. SLOs ensure developers can accelerate their velocity and remain productive with an optimally functioning platform.

Test continuously

Synthetic testing simulates user behaviors within an application or service to pinpoint potential problems. This process is vital to an IDP’s effectiveness. An observability solution can monitor both synthetic and real-user tests to verify an application is on track. GitLab, a source code repository and collaborative software development platform for DevOps and DevSecOps projects, is populated with a set of pre-filled templates. The combination gives developers a unique set of tools they can deploy on a self-service basis with full monitoring by Dynatrace. “Every time [developers] pick a template in Backstage, they get their own version of the Git repository based on the template. Then, Argo deploys the app,” Grabner said. “It has worked kind of flawlessly.”

Observability at the core

“Platform engineering is about being responsible for making sure platforms are available,” Gardner said. “Dynatrace can tell us whether Argo is up and whether it’s killing GitHub with too many syncs. It lets us see events such as starts and traces in a standardized manner.” This certainty can accelerate developer velocity and improve the developer experience, resulting in better software and happier developers.

Dynatrace has made the reference IDP architecture available on GitHub for anyone to use. It includes a notebook with configuration and deployment instructions. “It explains every single step that was involved in building the IDP, creating the configuration, and setting up Argo,” Gardner said. “You can launch a code space that starts a container that shows you everything about how an app was built and deployed.”

Curious to learn more about observability to optimize KPI success? Check out the Perform 2024 session: Observability guide to platform engineering. Discover how unified observability unlocks platform engineering success in the free ebook: Driving DevOps and platform engineering for digital transformation. The post How platform engineering and IDP observability can accelerate developer velocity appeared first on Dynatrace news. View the full article
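The SLOs the session emphasizes come down to simple arithmetic over request counts. The sketch below is a minimal, generic illustration of SLO attainment and error-budget tracking, not anything Dynatrace-specific; the 99.9% target is an assumed example.

```python
def slo_report(total_requests, failed_requests, slo_target=0.999):
    """Compute SLO attainment and remaining error budget for a service.

    The 99.9% default target is illustrative; real SLOs are defined per
    service and tracked by the observability platform.
    """
    if total_requests == 0:
        # No traffic observed: treat the SLO as fully met.
        return {"attainment": 1.0, "budget_remaining": 1.0}
    attainment = 1 - failed_requests / total_requests
    # The error budget is the number of failures the SLO tolerates.
    allowed_failures = (1 - slo_target) * total_requests
    budget_remaining = 1 - failed_requests / allowed_failures
    return {"attainment": attainment, "budget_remaining": budget_remaining}

# 500 failures out of 1M requests against a 99.9% SLO:
# half the error budget (1,000 allowed failures) is spent.
report = slo_report(1_000_000, 500)
```

A report like this, recomputed per rolling window, is what lets a platform team see whether developer-facing services are eating through their budget faster than planned.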
-
Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, The Modern DevOps Lifecycle: Shifting CI/CD and Application Architectures. Forbes estimates that cloud budgets will break all previous records as businesses spend over $1 trillion on cloud computing infrastructure in 2024. Since most application releases depend on cloud infrastructure, having good continuous integration and continuous delivery (CI/CD) pipelines and end-to-end observability becomes essential for ensuring highly available systems. By integrating observability tools in CI/CD pipelines, organizations can increase deployment frequency, minimize risks, and build highly available systems. Complementing these practices is site reliability engineering (SRE), a discipline ensuring system reliability, performance, and scalability. View the full article
-
There’s no debate — in our increasingly AI-driven, lean and data-heavy world, automating key tasks to increase effectiveness and efficiency is the ultimate name of the game. No matter what job you hold today, you’re likely being pushed to not only do more with less, but also perform your work with a tighter focus on […]View the full article
-
BizDevSecOps might sound like a mouthful, but it marks a necessary evolution. As business goals and technology efforts continue to converge, organizations need to ensure teams are performing to their full potential. Business considerations are now part of the security, operations, and development framework. During a session at Dynatrace Perform 2024, Dynatrace colleagues Kristof Renders, director of innovation services, and Brian Chandler, principal solutions architect, demonstrated four BizDevSecOps use cases for the Dynatrace unified observability and security platform. Additionally, the pair illustrated the effect users can expect after implementation.

Getting granular with user experience

It all starts—and ends—with user experience. When users encounter issues with applications or services, performance and productivity drop. As a result, organizations need complete visibility into the user experience, both individually and at scale. The Dynatrace real-time user experience dashboard helps organizations discover where issues are happening and how they’re affecting users. “You can see where drop-off happens,” Chandler said. “You can see where people can do business KPIs [key performance indicators], and you can see where downticks happen. We’ve built the ability to track all business SLOs [service-level objectives].” And with Dynatrace Site Reliability Guardian, all teams across the organization can understand how their specific silo operates in relation to critical systems.

Triaging BizDevSecOps problems using segmentation

Equipped with data that offers insight into the user experience, organizations are better prepared to triage potential problems using PurePath distributed traces. This starts with segmentation. “You can segment by user session,” Chandler said. “This lets you jump right into triage. You can get an overview of the individual user—from what ISP they’re using, to where they’re connecting, to their screen resolution.” These are all metrics Dynatrace collects directly out of the box. Organizations can also drill deeper to discover what’s happening on the server side. “Dynatrace PurePath can trace hop to hop what went on in a user interaction to give a highly sophisticated root-cause analysis,” Chandler noted. “All of this data can be bubbled up to a unified dashboard.” Users can then connect this dashboard data with underlying technical data, such as service-level agreement metrics.

Managing BizDevSecOps incidents quickly and effectively

With problems triaged and root causes identified, BizDevSecOps teams are ready for incident management. For Renders, the key to incident management is the ability to connect cause and effect: identifying what’s going wrong, why it’s going wrong, and where it started. In modern IT environments, however, creating these connections isn’t easy. Where organizations used to have a half dozen legacy applications running on premises, they now have hundreds of local and cloud-based applications pulling data from different sources simultaneously. This creates complexity. While the direct effects of IT problems are obvious, the sources are often obscured. “When something goes wrong, you want to get to a solution as quickly as possible,” Renders said. “Dynatrace will tell you that something is wrong and what is wrong. We can connect the root cause to the process owner.”

Deploying secure, well-architected applications

While many BizDevSecOps use cases center around identifying issues and mitigating their effect, Dynatrace can also help organizations ensure that application design, delivery, and deployment align with industry best practices, such as the six pillars of the AWS Well-Architected Framework. “We can actually go and look at leveraging security information to stop badly performing apps from being released,” Renders noted. “Then, we can ask Dynatrace if an app is adhering to development pillars. We can go into our workflows and map out a well-architected application.” For example, when a new app build is deployed and automated tests are executed, the outcome may trigger a quality gate. Dynatrace then performs automated quality validation through SLOs that either pass or fail the application and provide feedback to developers.

Unified observability is key to BizDevSecOps progress

From user experience to triage, incident management, and DevSecOps, Dynatrace delivers a unified observability and security platform that combines advanced AI and automation capabilities. The platform presents data in intuitive, user-friendly ways, enabling teams to gather and analyze data while reducing mean time to repair and improving the performance and availability of applications. To learn more about how Dynatrace enables BizDevSecOps use cases, view the Perform session, “Top use cases for Biz, Dev, Sec, and Ops teams to get started with Dynatrace.” And for more information on news and insights from Perform, check out our guide. The post The benefits of unified observability and security for BizDevSecOps use cases appeared first on Dynatrace news. View the full article
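The quality-gate step described above, where SLO validation passes or fails a new build, can be sketched in a few lines. This is a hypothetical illustration of the pass/fail logic only, not the Dynatrace API; the metric names and thresholds are assumptions.

```python
def quality_gate(metrics, objectives):
    """Evaluate build metrics against per-metric objectives.

    Each objective is (comparison, threshold); a single violation fails
    the gate, mirroring the pass/fail behavior a quality gate enforces.
    Metric names here are illustrative, not a product API.
    """
    violations = []
    for name, (op, threshold) in objectives.items():
        value = metrics.get(name)
        if value is None:
            violations.append(f"{name}: missing")
        elif op == "<=" and not value <= threshold:
            violations.append(f"{name}: {value} > {threshold}")
        elif op == ">=" and not value >= threshold:
            violations.append(f"{name}: {value} < {threshold}")
    return {"passed": not violations, "violations": violations}

# Hypothetical objectives for a new build: latency and error budget
# must stay under limits, user satisfaction (Apdex) above a floor.
objectives = {
    "p95_latency_ms": ("<=", 250),
    "error_rate": ("<=", 0.01),
    "apdex": (">=", 0.9),
}
result = quality_gate(
    {"p95_latency_ms": 180, "error_rate": 0.03, "apdex": 0.95}, objectives
)
# The elevated error rate fails the gate, and the violation list is the
# feedback handed back to developers.
```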
-
Post co-written by Shahar Azulay, CEO and Co-Founder at Groundcover

Introduction

The abstraction introduced by Kubernetes allows teams to easily run applications at varying scale without worrying about resource allocation, autoscaling, or self-healing. However, abstraction isn’t free: it adds complexity and makes it harder to track down the root cause of problems that Kubernetes users experience. To mitigate these issues, detailed observability into each application’s state is key, but achieving it can be challenging. Users have to ensure they’re exposing the right metrics, emitting actionable logs, and instrumenting their application’s code with specific client-side libraries to collect spans and traces. As if that weren’t hard enough on its own, gaining such visibility into the many small, distributed, interdependent pieces of code that comprise a modern Kubernetes microservices environment becomes a much harder task at scale.

An emerging technology called eBPF (Extended Berkeley Packet Filter) is poised to relieve many of these problems. Referred to as an invaluable technology by many industry leaders, eBPF allows users to trace any type of event regarding their application performance – such as network operations – directly from the Linux kernel, with minimal performance overhead, and without configuring sidecars or agents. The eBPF sensor is out-of-band to the application code, which means zero code changes or integrations and immediate time to value across your stack. The result is granular visibility into what’s happening within an Amazon Elastic Kubernetes Service (Amazon EKS) cluster. In this post, we’ll cover what eBPF is, why it’s important, and what eBPF-based tools are available for you to get visibility into your Amazon EKS applications. We’ll also review how Groundcover uses eBPF to enable its cost-efficient architecture.
In today’s fast-paced world of software development, Kubernetes has emerged as a game-changer, offering seamless scalability and resource management for applications. However, with this abstraction comes the challenge of maintaining comprehensive observability in complex microservices environments. In this blog post, we’ll explore how eBPF is revolutionizing Kubernetes observability on Amazon EKS. Amazon EKS-optimized Amazon Machine Images (AMIs) fully support eBPF, which empowers users with unparalleled insights into their applications.

Solution overview

The need for observability

As applications scale and interdependencies grow, gaining detailed insights into their state becomes vital for effective troubleshooting. Kubernetes users face the daunting task of instrumenting applications, collecting metrics, and emitting actionable logs to track down issues. This becomes even more challenging in modern microservices environments, where numerous small, distributed, independent pieces of code interact with each other.

Introducing eBPF: A game-changer for Kubernetes observability

In the quest for enhanced observability, eBPF emerges as an invaluable technology. Unlike traditional observability approaches, eBPF empowers users to trace any type of event regarding their application performance – such as network operations – directly from the Linux kernel, with minimal performance overhead, and without configuring sidecars or agents.

The out-of-band advantage of eBPF

The advantage of eBPF lies in its out-of-band approach to observability. Out-of-band means eBPF collects the data without being part of the application’s code. One advantage is that no code change is needed, and installation can be done at the instance level without per-application configuration.
Another advantage is that it eliminates unexpected effects of the observability stack on your time-critical application code, as all data collection is done outside of the application process. Users can gain granular visibility into applications deployed on their Kubernetes clusters without instrumenting their code or integrating any third-party libraries. By tracing at the kernel level, eBPF enables users to analyze the inner workings of their applications, with immediate time to value, full coverage, and unprecedented depth.

eBPF-based tools for Amazon EKS observability

A plethora of eBPF-based tools has emerged to provide comprehensive observability into Amazon EKS applications. These tools offer tracing capabilities for various events, including network operations, system calls, and even custom application events. For instance, BCC is a toolkit that helps simplify eBPF bootstrapping and development and includes several useful tools for tasks like network traffic analysis and resource utilization profiling. Another example is bpftrace, which is a little more focused on one-liners and short scripts for quick insights. These tools were built to be ad hoc, so you can run them directly on any Linux machine and get real-time value. Amazon EKS users who want to inspect their Kubernetes environment with eBPF tools should check out Inspector Gadget, which manages the packaging, deployment, and execution of eBPF programs in a Kubernetes cluster, including many based on BCC tools. In the following section, we’ll dive deeper into some of the open-source projects that use eBPF to collect insight data about applications.

Walkthrough

Caretta

Caretta helps teams instantly create a visual network map of the services running in their Kubernetes cluster. Caretta uses eBPF to collect data in an efficient way and is equipped with a Grafana Node Graph dashboard to quickly display a dynamic map of the cluster.
The main idea behind Caretta is gaining a clear understanding of the inter-dependencies between the different workloads running in the cluster. Caretta maps all service interactions and their traffic rates, leveraging Kubernetes APIs (application program interfaces) to create a clean and informative map of any Kubernetes cluster that can be used for on-demand granular observability, cost optimization, and security insights. This allows teams to quickly reach insights such as identifying central points of failure or pinpointing security anomalies.

Installing Caretta

Installing Caretta is as simple as installing a Helm chart onto your Amazon EKS cluster:

helm repo add groundcover https://helm.groundcover.com/
helm install caretta --namespace caretta --create-namespace groundcover/caretta
kubectl port-forward -n caretta pods/<caretta-grafana POD NAME> 3000:3000

Once installed, Caretta provides a full network dependency map that captures the Kubernetes service interactions. The map is also interleaved with Prometheus metrics that it exposes to measure the total throughput of each link observed since launching the program. That information, scraped and consolidated by a Prometheus agent, can be easily analyzed with standard queries such as sorting, calculating rates, filtering namespaces and time ranges, and of course — visualizing as a network map. The following is an example of a metric exposed by a Caretta agent, with labels that provide granular insight into the network traffic captured by eBPF:

caretta_links_observed{client_kind="Deployment",client_name="checkoutservice",client_namespace="demo-ng",link_id="198768460",role="server",server_port="3550"}

This is useful for detecting unknown dependencies or network interactions, but also for easily spotting bottlenecks, like hot zones that handle large volumes of network data and might be the root cause of problems in your Amazon EKS cluster.
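Metrics like the sample above are published in the Prometheus text exposition format, which is straightforward to work with programmatically. The sketch below is a minimal parser for simple samples of that shape; the trailing value of 42 is added purely for illustration (the real line would carry whatever count Caretta observed), and a real Prometheus client library should be preferred for anything beyond quick inspection.

```python
import re

def parse_prom_sample(line):
    """Parse a simple Prometheus exposition-format sample into
    (name, labels, value).

    Minimal by design: it does not handle escaped quotes, exemplars,
    or samples without labels.
    """
    m = re.match(
        r'(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s*(?P<value>\S+)',
        line,
    )
    if not m:
        raise ValueError(f"unrecognized sample: {line!r}")
    # Label pairs look like key="value", separated by commas.
    labels = dict(re.findall(r'(\w+)="([^"]*)"', m.group("labels")))
    return m.group("name"), labels, float(m.group("value"))

name, labels, value = parse_prom_sample(
    'caretta_links_observed{client_kind="Deployment",client_name="checkoutservice",'
    'client_namespace="demo-ng",link_id="198768460",role="server",server_port="3550"} 42'
)
```

Grouping parsed samples by `client_name` and `server_port`, for example, is enough to rebuild a crude dependency table from a raw scrape.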
But what can we do when network monitoring isn’t enough and application-level API monitoring is where the problem lies?

Hubble

Hubble is a network observability and troubleshooting component within Cilium (which is also based on eBPF for networking). Hubble uses eBPF to gain deep visibility into network traffic and to collect fine-grained telemetry data within the Linux kernel. By attaching eBPF programs to specific network events, Hubble can capture data such as packet headers, network flows, latency metrics, and more. It provides a powerful and low-overhead mechanism for monitoring and analyzing network behavior in real time. With the help of eBPF, Hubble can perform advanced network visibility tasks, including flow-level monitoring, service dependency mapping, network security analysis, and troubleshooting. It then aggregates this data and presents it to the user through the command line interface (CLI) or UI. Hubble enables platform teams to gain insights into network communications within their cloud-native environments and gives developers the ability to understand how their applications communicate without first becoming networking experts.

Just like Caretta, Hubble knows how to create service dependency maps and metrics about the network connections it tracks with eBPF. Metrics like requests per second and packet drops are captured by Hubble and can be used to detect issues hiding at the network layer of your Amazon EKS environment. Hubble also provides Layer 7 metrics by tracking HTTP and DNS connections inside the cluster using eBPF. Here, metrics like request rate, latency, and error rate kick in, unlocking application-level monitoring. You can use Hubble to detect application bugs, identify which high-latency APIs slow down the application, or observe slow performance degradation that might otherwise go unnoticed. Installing Hubble requires installing Cilium, which is documented in the Cilium getting started guide and is out of scope for this post.
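To make those Layer 7 metrics concrete, the sketch below aggregates a batch of HTTP flow records into request rate, error rate, and p95 latency. The record shape ("status", "latency_ms") is an assumption for illustration and does not reflect Hubble’s actual flow schema.

```python
def l7_summary(flows, window_seconds):
    """Aggregate HTTP flow records into Layer 7 metrics: request rate,
    error rate, and p95 latency.

    Each flow is a dict with "status" and "latency_ms"; this record
    shape is hypothetical, chosen only to illustrate the arithmetic.
    """
    if not flows:
        return {"req_per_s": 0.0, "error_rate": 0.0, "p95_latency_ms": 0.0}
    # Count server-side failures (HTTP 5xx) as errors.
    errors = sum(1 for f in flows if f["status"] >= 500)
    latencies = sorted(f["latency_ms"] for f in flows)
    # Nearest-rank p95: the value below which ~95% of samples fall.
    p95_index = max(0, int(len(latencies) * 0.95) - 1)
    return {
        "req_per_s": len(flows) / window_seconds,
        "error_rate": errors / len(flows),
        "p95_latency_ms": latencies[p95_index],
    }

# 20 flows over a 10-second window: 18 fast successes, 2 slow 503s.
flows = [{"status": 200, "latency_ms": 12}] * 18 + \
        [{"status": 503, "latency_ms": 90}] * 2
summary = l7_summary(flows, window_seconds=10)
```

A rising error rate or a p95 that drifts upward over successive windows is exactly the kind of slow degradation the article describes catching at the application layer.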
Groundcover: Pioneering cost-efficient Amazon EKS observability

Groundcover, a next-generation observability company, is an example of how future observability platforms will leverage eBPF as their main data collection sensor. Groundcover, focused on cloud-native environments, utilizes eBPF to create a full-stack observability platform that provides instant value without compromising on scale, granularity, or cost. Its eBPF sensor was built with a performance-first mindset, harnessing kernel resources to create a cost-efficient architecture for Amazon EKS observability. By collecting all its observability data using eBPF, Groundcover eliminates the need for intrusive code changes and manual labor, as well as the need to deploy multiple external agents. This streamlined approach not only enhances the coverage and depth of observability but also optimizes costs by reducing performance overhead.

Conclusion

In this post, we showed you the open-source eBPF tool ecosystem, which customers can try out on their own, and demonstrated how this technology translates into next-generation observability platforms. As the demand for Kubernetes observability continues to grow, eBPF has emerged as a transformative technology, redefining how we monitor applications in Amazon EKS clusters. With its unparalleled performance and seamless integration into the Linux kernel, eBPF offers granular visibility without disrupting existing application code. Through eBPF-based tools, developers and operations teams can now troubleshoot and optimize their applications effortlessly, making Kubernetes observability more accessible and efficient. As more businesses embrace eBPF’s capabilities, we can expect organizations to migrate to Kubernetes with more confidence, assured of their expected observability coverage. This enables users to unleash the full potential of their applications on Amazon EKS and focus on building great products on top of Kubernetes.
With eBPF’s bright future ahead, Groundcover and other industry leaders are paving the way for a new era of Kubernetes observability.

Shahar Azulay, Groundcover

Shahar, CEO and co-founder of Groundcover, is a serial R&D leader. He brings experience in the worlds of cybersecurity and machine learning, having worked as a leader at companies such as Apple, DayTwo, and Cymotive Technologies. Shahar spent many years in the Cyber division at the Israeli Prime Minister’s Office and holds three degrees, in Physics, Electrical Engineering, and Computer Science. He strives to bring technological learnings from this rich background to today’s cloud-native battlefield in the sharpest, most innovative form, to make the world of dev a better place. View the full article
-
- kubernetes
- observability
- (and 2 more)
-
Amazon CloudWatch Container Insights now delivers enhanced observability for Amazon Elastic Kubernetes Service (EKS) with out-of-the-box detailed health and performance metrics, including container level EKS performance metrics, Kube-state metrics and EKS control plane metrics for faster problem isolation and troubleshooting. View the full article
-
- cloudwatch
- container insights
- (and 3 more)
-
Observability is becoming a keystone of contemporary DevOps practices. Even departments that weren’t traditionally a part of DevOps are seeing the benefits of being brought under the auspices of observability teams. In 2023, however, organizations are finding that the road to adoption is bumpier than expected. Here are seven of the biggest challenges DevOps teams […] View the full article
-
Do you find yourself lying awake late at night, worried that your greatest observability fears will materialize as one of the most horrific specters of Kubernetes-driven chaos reaches up through your mattress to consume your very soul? Even as your mind races and you wonder just who that creepy character sneaking around the metaphysical boiler […]View the full article
-
Forum Statistics
67.4k Total Topics, 65.3k Total Posts