
Monitoring & Observability

  • Metrics & Time Series Databases (e.g., Prometheus, Grafana, InfluxDB)

  • Logging & Log Management (e.g., ELK Stack, Loki, Splunk)

  • Tracing & Distributed Systems Monitoring (e.g., Jaeger, Zipkin, OpenTelemetry)

  • Alerting & Incident Management (e.g., PagerDuty, Opsgenie)

  • Synthetic Monitoring & Uptime Checks

  1. Amazon CloudWatch Application Signals now supports creating Service Level Objectives (SLOs) using metrics from your service dependencies. You can now monitor the performance of your services' dependencies, and proactively resolve problems through SLO goal setting, thanks to this new ability. Using Application Signals you can create period-based or request-based SLOs that track key metrics like latency and faults for the outgoing requests from your services to their dependencies. You can see how your dependencies perform and how this impacts the reliability of your overall service. For example, if your e-commerce service relies on a payment processor, you can set an SLO…

  2. Started by Logz.io,

    John Willis, celebrated for his influential “The DevOps Handbook” and his newest adventure, “Rebels of Reason,” kicked things off by spotlighting the lightning-fast progress of AI. In just two short years—from 2022 to 2024—AI’s capabilities in reasoning and coding have soared dramatically, creating an unstoppable wave that’s transforming industries almost overnight. Willis humorously described how […]View the full article

  3. Started by Logz.io,

    Explore improvements Enhanced Lucene Query Editor Syntax highlighting and auto suggestions to streamline query building and troubleshooting. Date Picker Updates Now located next to the query line, the date picker features an improved timestamp for better readability. Trace Tab Indicator An indicator now appears on the Traces tab whenever trace data is linked to the […]View the full article

    • 0 replies
    • 13 views
  4. I remember a time when digging through logs, events, and dashboards felt like trying to find a single sock in a pile of laundry fresh out of the dryer—frustrating, time-consuming, and somehow, the answer was always just out of reach. That’s where AI Agents come in. Instead of wasting hours sifting through data, you simply […]View the full article

    • 0 replies
    • 4 views
  5. AWS CloudWatch is a widely used observability tool that comes built into AWS. It provides easy access to logs, metrics, and alarms, making it a convenient choice for teams monitoring AWS workloads. But while CloudWatch offers a lot of power, many teams unknowingly misconfigure or misuse it, leading to unexpected costs, limited visibility, and operational […]View the full article

    • 0 replies
    • 4 views
  6. As an observability leader, at Logz.io, we pride ourselves on continuous innovation. That’s why, last year, we released our AI agents to revolutionize observability by helping businesses, and their engineering and DevOps teams, automate data analysis and root cause analysis. The primary way in which engineering and DevOps teams interact with the agents is by […]View the full article

    • 0 replies
    • 5 views
  7. Observability Costs Are Out of Control – Here’s How to Fix It In today’s cloud-native world, keeping logs, metrics, and traces under control isn’t just about monitoring performance – it’s about managing costs. And if you’re an engineering leader or platform owner, you know that observability budgets can spiral fast. That’s exactly what we tackled […]View the full article

    • 0 replies
    • 6 views
  8. We’re excited to announce a series of upgrades to our AI Agent, Log Management Explore UI and core integrations designed to empower you with even deeper observability and streamlined operations. These updates enhance account visibility, multi-telemetry trace insights, and logging capabilities while ensuring seamless compatibility with OpenTelemetry. Read on to discover how these enhancements can […]View the full article

    • 0 replies
    • 5 views
  9. Today’s distributed, cloud-native systems generate logs at a high rate, making it increasingly difficult to derive actionable insights. AI and Generative AI (GenAI) technologies, particularly large language models (LLMs), are transforming log management tools by enabling teams to sift through this data, identify anomalies, and deliver real-time, context-rich intelligence to streamline troubleshooting. By applying transformer-based architectures, which […]View the full article

    • 0 replies
    • 6 views
  10. Introduction to OTel In formal terms, OpenTelemetry is an open source framework used for instrumenting, generating, collecting, and exporting telemetry data for applications, services, and infrastructure. It provides vendor-neutral tools, SDKs and APIs for generating, collecting, and exporting telemetry data such as traces, metrics, and logs to any observability backend, including both open source and […]View the full article

    • 0 replies
    • 5 views
  11. We’re thrilled to announce that Logz.io received a Special Mention for Best Use of AI from the 2024 O11ys Awards, a celebration of innovation and excellence in observability. The 2024 O11ys Awards recognized our AI Agent, calling it: This recognition validates our mission to simplify observability with AI, empowering teams to troubleshoot faster, optimize costs, […]View the full article

    • 0 replies
    • 4 views
  12. Introducing Our New Support Help Center We’re thrilled to launch our brand-new and improved Support Help Center, designed to streamline how you interact with our support team and access the resources you need. This enhanced platform empowers users to: This is more than just a support portal—it’s a centralized hub to enhance your experience, provide […]View the full article

    • 0 replies
    • 4 views
  13. Managing modern systems requires a constant balance between operational efficiency and innovation; going a little further, maintaining seamless operations and delivering exceptional customer experiences increasingly depend on ensuring robust observability. For years, the ELK stack (Elasticsearch, Logstash, Kibana) has been the go-to solution for many organizations for log management and observability, offering flexibility, control and […]View the full article

    • 0 replies
    • 4 views
  14. Complexity rules the day within the world of data systems and pipelines. A goal for any observability practice is to help reduce complexity and give users and administrators a clear view of what’s happening in any system. This is the path to unified observability, a mature system where monitoring and troubleshooting are streamlined. This has […]View the full article

  15. As digital applications and infrastructures grow increasingly complex, managing and understanding log data has become increasingly vital in achieving practical observability, enabling organizations to detect, diagnose, and prevent issues across their systems. However, traditional log analysis methods often struggle with the volume and complexities of modern log data in cloud-native environments. Further, many organizations have […]View the full article

  16. In today’s rapidly evolving digital landscape, organizations heavily rely on their applications and systems to deliver optimal performance. As such, driving down the key metric of Mean Time to Resolution (MTTR) is clearly one of the biggest challenges facing observability practitioners today. According to the 2024 Observability Pulse Report, based on our annual survey of […]View the full article

  17. These days, one of the most important decisions that organizations can make as it relates to their observability strategy is: “How much data do we want to retain in Hot storage to ensure we have everything needed for real time analysis — without running up associated costs?” The reality is, for their immediate troubleshooting efforts, […]View the full article

  18. A critical component of any monitoring and observability system is alerting. But alerts in and of themselves aren’t enough—when something goes wrong, time is of the essence, and your team needs to figure out not just what’s going on but how to fix it, and fast. Additionally, constantly chasing down alerts can be the bane […]View the full article

  19. Explore Upgrades and Improvements New Visualization Features We’re rolling out new visualization capabilities in the Explore log management interface that are available now in some accounts and will be added to all in the coming weeks and months. With these updates you can: Warm Tier: There is now a new option for log storage and access […]View the full article

  20. Too much time spent troubleshooting? You’re not alone. Manual investigation, jumping between dashboards, and piecing together scattered data are time-consuming and frustrating. That’s why we built the Logz.io AI Agent—to simplify root cause analysis (RCA), optimize performance, and help teams de-risk deployments. In our recent webinar, we explored how AI is reshaping observability workflows. The […]View the full article

  21. Cloud native technologies have made it harder to understand how systems are behaving. Logs are the answer, but they can be voluminous and complex in any environment. How do you make sense of them? Logz.io co-founder and CTO Asaf Yigal recently participated in a webinar hosted by LeadDev.com on this topic alongside fellow experts Tanvi […]View the full article

  22. Explore Upgrades and Improvements We’ve improved the filter pane to include: Additionally, a new time-picker option lets you mix absolute and relative times and manually set the date and time to the second. Additionally, you can view your data in either UTC or your local time zone. Saved searches from Explore can now be used […]View the full article

  23. Logz.io introduces its AI Agent in Beta, using GenAI to revolutionize observability. The AI Agent simplifies monitoring with automated data analysis and root cause detection, accelerating issue resolution by 3-5x for beta users—marking a critical step toward fully autonomous observability. Today, we’re thrilled to announce the launch of the Logz.io AI Agent, as we blaze […]View the full article

  24. Started by Logz.io,

    In conversations about cloud observability today, discussions often shift from “what’s possible” to “what’s practical.” Too often, these conversations highlight the shortcomings of current observability processes, tools and financial models. As observability data workloads continue to grow at an unprecedented pace, traditional dashboarding and alerting-based approaches are struggling to keep up. This hinders decision-making, extends […]View the full article

  25. At Logz.io, we believe that observability should be simple, smart, and fast—powered by AI to help teams move with confidence. This Fall, our users recognized that commitment by awarding Logz.io 15 badges on G2 across multiple categories and global markets. From ease of use to fast implementation, users and businesses alike are experiencing how AI-driven […]View the full article

  26. Observability is critical to innovation, but for many organizations, soaring costs make it impossible to achieve. Logz.io is here to change that. The Problem with Traditional Observability Pricing You don’t have to look very far these days to see that the issue of ever-increasing costs is getting in the way of many organizations achieving their […]View the full article

  27. The emergence of generative AI in observability tools was inevitable, but there’s already been an extreme degree of hype in the market. Monitoring, DevOps and ITOps have never been immune to trends, and with GenAI capabilities, the hype machine is running out of control. Organizations looking to ride the wave of GenAI undoubtedly recall the […]View the full article

  28. When Gartner publishes their annual observability industry research, it’s always exciting to find your company named among the most successful and high-profile providers in this space. That’s why Logz.io is thrilled to find itself listed as a Visionary for the third consecutive year in the Gartner® Magic Quadrant™ for Observability Platforms (previously known as the […]View the full article

  29. “Logz.io Observability IQ Assistant helps us to find the root cause of the issues faster, and it reduces a lot of the manual processes that we were doing before.” That’s the assessment of Senior DevOps Engineer and Logz.io user Armin Morattab when discussing the impact of AI on his day-to-day job. He dives deep on […]View the full article

    • 0 replies
    • 28 views
  30. Your team is responsible for ensuring the reliability and performance of your organization’s critical applications and infrastructure. What keeps you up at night? Your applications are more complex, distributed and cloud-native than ever, meaning that understanding what’s happening under the hood has never been more complex than it is now. Is it system bugs, or […]View the full article

    • 0 replies
    • 11 views
  31. At Logz.io, we’ve found that for most organizations observability challenges start with log management. Today more than ever, log management is a highly complex practice that involves mountains of ephemeral data, and the related obstacles are preventing people from achieving their observability goals, full stop. That’s why we designed our new log management UI to […]View the full article

    • 0 replies
    • 77 views
  32. Kubernetes has just reached its 10th anniversary, signifying the maturity of the containers movement. Now it’s time to explore the next frontier in cloud-native evolution: WebAssembly, a.k.a. WASM or Wasm. Moving beyond containers and Kubernetes, WASM bears the promise to revolutionize the cloud landscape with unparalleled performance, portability, and security. In a recent episode of […]View the full article

    • 0 replies
    • 46 views
  33. Started by Logz.io,

    Despite advances in the world of observability, log management hasn’t evolved much in recent years. In our webinar, Your Faster, Easier Log Management UI is Here, we share our vision for log management and provide a demo of Logz.io Explore, the new UI for our Log Management solution. NOTE: Watch to the end of the […]View the full article

    • 0 replies
    • 43 views
  34. There’s no question that achieving end-to-end observability is among the most challenging tasks facing engineering and ops teams today. A quick look back at the 2024 Observability Pulse survey throws this conclusion into stark relief as: Logz.io is committed to making observability smarter, faster, and easier — from data ingestion, to troubleshooting, to managing costs. […]View the full article

    • 0 replies
    • 42 views
  35. The business of observability is all about data: what you’re observing in the data, how you’re visualizing it, what it indicates about the state of your environment, and how to address issues that may occur. Creating your own perspective for observability, and understanding what you’re seeing, can be difficult. The use of both generative and […]View the full article

    • 0 replies
    • 34 views
  36. Redis is no longer open source. In March 2024 the project was relicensed, leaving its vast community confused. But the community did not give up, and started work to fork Redis to keep it open. On my recent OpenObservability Talks episode, I delved into Valkey, a prominent fork of Redis. Valkey was established under The […]View the full article

    • 0 replies
    • 33 views
  37. There’s so much hype around the use of AI in observability — but how does that translate into making tangible progress with your day-to-day tasks? At Logz.io we’ve introduced an AI-based chatbot assistant to the Open 360™ platform that automatically delves into your stack, fine-tunes your workflows and enables conversation directly with your systems and […]View the full article

    • 0 replies
    • 69 views
  38. It may sound complicated and daunting, but so much of observability is about discovering the unknown unknowns in your critical systems. The capabilities of observability engineering can help you make those discoveries. Most organizations have some form of monitoring, alerting and troubleshooting, which can be adequate to a point but fall short when trying to […]View the full article

    • 0 replies
    • 73 views
  39. Intro – LogMetrics Feature At Logz.io we provide an observability platform with the ability to ship logs, metrics, and traces and then interact with them using our app. LogMetrics is an integral part of our observability offering, which bridges the gap between logs and metrics. It provides the seamless conversion of one type of signal […]View the full article

    • 0 replies
    • 52 views
  40. In the past few years we’ve been witnessing tectonic shifts in the open source realm, with established projects taken off open source or otherwise turning to the dark side. On the other hand, we’ve seen active forks aiming to keep these projects open gaining momentum. What does it mean for the Free and Open Source […]View the full article

    • 0 replies
    • 61 views
  41. At Logz.io, we believe the future of observability will center on the rapid advancement of automation, innovations around artificial intelligence, and streamlining processes that currently remain far too complex. This is no different than many other areas of technology, but the opportunities in observability are vast, and we see all of these areas connecting and […]View the full article

    • 0 replies
    • 51 views
  42. In the fast-paced realm of DevOps and Site Reliability Engineering (SRE), success starts with effective monitoring. Understanding the fundamental metrics is crucial for identifying and mitigating issues proactively. In this article, we’ll delve into the leading metrics frameworks — R.E.D., U.S.E., and the “Four Golden Signals” — which will provide you with a solid foundation […]View the full article

    • 0 replies
    • 65 views
  43. Generative AI and large language models (LLMs) are fundamentally changing the way we interact with data, especially in the realm of Kubernetes and observability. These technologies are reshaping our field, and there is a lot to understand and unpack so organizations like yours can make sense of it all. What data is important, and what […]View the full article

    • 0 replies
    • 70 views
  44. AI has been the biggest macro-trend in technology for some time now, and the observability space is no exception to this rule. Just look at the findings of the 2024 Observability Pulse Report; it’s evident that organizations are hungry for AI capabilities that help address pervasive issues of observability process maturity, the talent shortage, ever-increasing […]View the full article

    • 0 replies
    • 66 views
  45. Troubleshooting within Kubernetes environments can be a daunting task. If only we could have a magical artificial intelligence advisor that could gather all the data about what goes on in the system, tell us what’s wrong, and even how to solve it. Wouldn’t it be nice? K8sGPT is a young open source project that uses […]View the full article

    • 0 replies
    • 84 views
  46. Started by Logz.io,

    In technology, having “modern” capabilities is standard. Staying ahead of the curve is critical, and keeping outdated technology or processes going can be a recipe for disaster in a complex, ever-changing landscape. Ensuring the smooth functioning and performance of software systems is paramount. This is where modern observability—a sophisticated approach to monitoring and understanding the […]View the full article

    • 0 replies
    • 57 views
  47. Observability isn’t new. But organizations are struggling to adopt mature observability practices, and the impact on business is palpable. Organizations are seeing the value of observability for their applications and infrastructure; the results of our 2024 Observability Pulse survey of 500 global IT professionals reflect that across the board. But respondents are challenged by the notion […]View the full article

    • 0 replies
    • 176 views
  48. Amazon CloudWatch is excited to announce a resource filtering capability for cross-account observability, providing customers with the flexibility to share a subset of their logs or metrics across multiple AWS accounts using configurable filters. View the full article

  49. In today’s data-driven landscape, managing and analyzing vast amounts of data, especially logs, is crucial for organizations to derive insights and make informed decisions. However, handling this data efficiently presents a significant challenge, prompting organizations to seek scalable solutions without the complexity of infrastructure management. Amazon OpenSearch Serverless lets you run OpenSearch in the AWS Cloud, without worrying about scaling infrastructure. With OpenSearch Serverless, you can ingest, analyze, and visualize your time-series data. Without the need for infrastructure provisioning, OpenSearch Serverless simplifies data management and enables you to d…

  50. Amazon OpenSearch Service now supports Amazon Route 53 alias records for defining custom domain endpoints. Alias records provide better flexibility when configuring routing to AWS resources. For more information about Route 53 alias records, please see documentation. View the full article

  51. Starting today, the Amazon CloudWatch metrics for monitoring AWS Config data usage will display only billable usage. With this enhancement, non-billable usage will no longer be displayed in both the Amazon CloudWatch Config metrics and AWS Config console. This allows you to validate AWS Config setup and usage using Amazon CloudWatch metrics and correlate billable usage with associated costs. View the full article

  52. Without a doubt, you’ve heard about the persistent talent gap that has troubled the technology sector in recent years. It’s a problem that isn’t going away, plaguing everyone from engineering teams to IT security pros, and if you work in the industry today you’ve likely experienced it somewhere within your own teams. Despite major changes […]View the full article

    • 0 replies
    • 95 views
  53. Amazon CloudWatch RUM, which enables customers to monitor their web applications by collecting client side performance and error data in real time, is generally available in the following 5 AWS Regions starting today: Asia Pacific (Hyderabad), Asia Pacific (Melbourne), Europe (Spain), Europe (Zurich), and Middle East (UAE). View the full article

  54. Amazon OpenSearch Service adds support for Hebrew and HanLP (Chinese NLP) language analyzer plugins. These are now available as optional plugins that you can associate with your Amazon OpenSearch Service clusters. View the full article

  55. Amazon CloudWatch Container Insights with Enhanced Observability for EKS now auto-discovers critical health metrics from your AWS accelerators Trainium and Inferentia, and AWS high performance network adapters (Elastic Fabric Adapters) as well as NVIDIA GPUs. You can visualize these out-of-the-box metrics in curated Container Insights dashboards to help monitor your accelerated infrastructure and optimize your AI workloads for operational excellence. View the full article

  56. The Internet has a plethora of moving parts: routers, switches, hubs, terrestrial and submarine cables, and connectors on the hardware side, and complex protocol stacks and configurations on the software side. When something goes wrong that slows or disrupts the Internet in a way that affects your customers, you want to be able to localize and understand the issue as quickly as possible. New Map The new Amazon CloudWatch Internet Weather Map is here to help! Built atop a collection of global monitors operated by AWS, you get a broad, global view of Internet weather, with the ability to zoom in and understand performance and availability issues that affect a particular…

  57. All AWS customers who navigate to Amazon CloudWatch Internet Monitor console can now view the internet weather map, at no charge, which shares a 24-hour global snapshot of internet latency and availability outages. The map lets you see, at a glance, recent internet issues across the world, including specific cities and service providers. View the full article

  58. Amazon Managed Workflows for Apache Airflow (MWAA) now offers larger environment sizes, giving customers of the managed service the ability to define a greater number of workflows in each Apache Airflow environment, supporting more complex tasks that can utilize increased resources. View the full article

  59. Despite advances in the world of observability, log management hasn’t evolved much in recent years. Users are familiar with the experience of Kibana or OpenSearch Dashboards (OSD), but those don’t always meet modern use cases. Logz.io is ready to change the conversation with the introduction of Explore, the new path forward for Log Management for […]View the full article

    • 0 replies
    • 101 views
  60. Today, AWS announces the release of workflow monitor for live video, a media-centric tool to simplify and elevate the monitoring of your video workloads. Accessible via the AWS Elemental MediaLive console and API, workflow monitor discovers and visualizes resources. It creates signal maps showing video across AWS Elemental MediaConnect, MediaLive, and MediaPackage along with Amazon S3 and Amazon CloudFront to provide end-to-end visibility. With the workflow monitor, you can create your own alarm templates or start from a set of recommended alarms, and build custom templates for alarm notifications. View the full article

  61. Amazon CloudWatch Container Insights now offers observability for Windows containers running on Amazon Elastic Kubernetes Service (EKS), and helps customers collect, aggregate, and summarize metrics and logs from their Windows container infrastructure. With this support, customers can monitor utilization of resources such as CPU, memory, disk, and network, as well as get enhanced observability such as container-level EKS performance metrics, Kube-state metrics and EKS control plane metrics for Windows containers. CloudWatch also provides diagnostic information, such as container restart failures, for faster problem isolation and troubleshooting for Windows containers runn…

  62. You can now create or associate a monitor for a distribution directly from the Amazon CloudFront console. By adding your distribution to a monitor, you can gain improved visibility into your application's internet performance and availability using Amazon CloudWatch Internet Monitor. You can create a monitor for the distribution, or add the distribution to an existing monitor, directly from the distribution metrics dashboard on the CloudFront console. View the full article

  63. Amazon OpenSearch Service is now extending the ability to update the number of data nodes without requiring a blue/green deployment for clusters without dedicated cluster manager (master) nodes. This change will allow you to make node count changes faster. Clusters with dedicated cluster manager nodes already supported updating the data node count without a blue/green deployment. View the full article

  64. Amazon CloudWatch RUM, which enables customers to monitor their web applications by collecting client side performance and error data in real time, is generally available in the following 11 AWS Regions starting today: Africa (Cape Town), Asia Pacific (Jakarta), Asia Pacific (Mumbai), Asia Pacific (Osaka), Asia Pacific (Seoul), Canada (Central), Europe (Milan), Europe (Paris), Middle East (Bahrain), South America (Sao Paulo), and US West (N. California). View the full article

  65. Amazon CloudWatch now supports using AWS CloudFormation to manage tags when you create, update, or delete alarms. View the full article

  66. You can now set up cross-account observability for Amazon CloudWatch Internet Monitor, so that you can get read-only access to monitors from multiple accounts within an AWS Region. Deploying applications by using resources in separate accounts is a good practice, to establish security and billing boundaries between teams and reduce the impact of operational events. For example, when you set up cross-account observability for Internet Monitor, you can access and view performance and availability measurements generated by monitors in different AWS accounts. View the full article

  67. Amazon OpenSearch Ingestion now enables you to enrich events with geographical location data from an IP address, allowing you to add additional context to your observability and security data in realtime. Additionally, you can configure mapping templates in Amazon OpenSearch clusters to automatically display these enriched events on a geographical map using OpenSearch Dashboards. View the full article

  68. OR1, the OpenSearch Optimized Instance family, now doubles the max allowed storage per instance. OR1 also expands availability to four additional regions: Canada Central, EU (London), and Asia Pacific (Hyderabad, Seoul). OR1 delivers up to 30% price-performance improvement over existing instances (based on internal benchmarks), and uses Amazon S3 to provide 11 9s of durability. The new OR1 instances are best suited for indexing-heavy workloads, and offer better indexing performance compared to the existing memory optimized instances available on OpenSearch Service. View the full article

  69. Amazon CloudWatch now supports Anomaly Detection on metrics shared across your accounts. CloudWatch Anomaly Detection now lets you track unexpected changes in metric behavior across multiple accounts from a single monitoring account through CloudWatch cross-account observability. View the full article

  70. Amazon OpenSearch Service is an Apache-2.0-licensed distributed search and analytics suite offered by AWS. This fully managed service allows organizations to secure data, perform keyword and semantic search, analyze logs, alert on anomalies, explore interactive log analytics, implement real-time application monitoring, and gain a more profound understanding of their information landscape. OpenSearch Service provides the tools and resources needed to unlock the full potential of your data. With its scalability, reliability, and ease of use, it’s a valuable solution for businesses seeking to optimize their data-driven decision-making processes and improve overall operationa…

  71. Every software-driven business strives for optimum performance and user experience. Observability—which allows engineering and IT Ops teams to understand the internal state of their cloud applications and infrastructure based on available telemetry data —has emerged as a crucial practice to help engage this process. For years, application performance monitoring (APM) was the de facto practice […]View the full article

    • 0 replies
    • 86 views
  72. You can use Amazon Data Firehose to aggregate and deliver log events from your applications and services captured in Amazon CloudWatch Logs to your Amazon Simple Storage Service (Amazon S3) bucket and Splunk destinations, for use cases such as data analytics, security analysis, and application troubleshooting. By default, CloudWatch Logs are delivered as gzip-compressed objects. You might want the data to be decompressed, or want logs to be delivered to Splunk, which requires decompressed data input, for application monitoring and auditing. AWS released a feature to support decompression of CloudWatch Logs in Firehose. With this new feature, you can specify an option in…

  73. Data volumes are soaring. Environments are increasingly intricate. The risk of applications and systems encountering breakdowns is sky-high, and the mean time to recovery (MTTR) for production incidents is moving in the wrong direction. Disruptions not only jeopardize critical infrastructure but also have a direct impact on the bottom line of organizations. Swift recovery of […]View the full article

    • 0 replies
    • 78 views
  74. Today, AWS IoT Core for LoRaWAN announces a new fleet monitoring application that enables developers to capture and visualize critical operational and health parameters related to the functioning of LoRaWAN-based gateways and devices. AWS IoT Core for LoRaWAN is a fully managed LoRaWAN Network Server that supports cloud connectivity for LoRaWAN-based wireless devices. Using the new metrics feature, developers can now quickly capture system health data, such as connection signal strength, data rate, and gateway latency, and analyze their fleet’s performance. View the full article

  75. In Part 2 of this series, we discussed how to enable AWS Glue job observability metrics and integrate them with Grafana for real-time monitoring. Grafana provides powerful customizable dashboards to view pipeline health. However, to analyze trends over time, aggregate from different dimensions, and share insights across the organization, a purpose-built business intelligence (BI) tool like Amazon QuickSight may be more effective for your business. QuickSight makes it straightforward for business users to visualize data in interactive dashboards and reports. In this post, we explore how to connect QuickSight to Amazon CloudWatch metrics and build graphs to uncover trends…

  76. Krones provides breweries, beverage bottlers, and food producers all over the world with individual machines and complete production lines. Every day, millions of glass bottles, cans, and PET containers run through a Krones line. Production lines are complex systems with many possible errors that could stall the line and decrease the production yield. Krones wants to detect a failure as early as possible (sometimes even before it happens) and notify production line operators to increase reliability and output. So how do you detect a failure? Krones equips its lines with sensors for data collection, which can then be evaluated against rules. Krones, as the line manufact…

  77. The topic of continuous profiling has been an ongoing discussion in the observability world for some time. I said back in 2021 that profiling was set to be the next major telemetry signal in observability, and in fact, since then there’s been growing interest in profiles. Startups and large observability vendors have gotten into this […] View the full article

    • 0 replies
    • 70 views
  78. Amazon CloudWatch Logs now supports increased default API quotas. The default quota for ingesting logs has increased from 1,500 to 5,000 Transactions Per Second (TPS) in select regions. The increased quotas are available automatically with no changes required. View the full article

  79. Kubernetes has changed the way many organizations approach the deployment of their applications. But despite its benefits, the additional layers of abstraction and reams of data can cause complexity around Kubernetes monitoring. We’ve seen many of these challenges borne out in the results of the 2024 Observability Pulse survey. In the survey report, 36% […] View the full article

    • 0 replies
    • 75 views
  80. The more data you have, the harder it becomes to read through it, let alone identify trends or crucial patterns. Couple that with a shortage of time, and the ability not only to visualize but also to communicate with your data becomes paramount. To help empower your data analysis like never before, we’re introducing a […] View the full article

    • 0 replies
    • 96 views
  81. Amazon OpenSearch Service has been a long-standing supporter of both lexical and semantic search, facilitated by its utilization of the k-nearest neighbors (k-NN) plugin. By using OpenSearch Service as a vector database, you can seamlessly combine the advantages of both lexical and vector search. The introduction of the neural search feature in OpenSearch Service 2.9 further simplifies integration with artificial intelligence (AI) and machine learning (ML) models, facilitating the implementation of semantic search. Lexical search using TF/IDF or BM25 has been the workhorse of search systems for decades. These traditional lexical search algorithms match user queries with…

  82. Amazon CloudWatch Synthetics, an outside-in monitoring capability to continually verify your customers’ experience using snippets of code called canaries, is extending historical data for canary runs that pass or fail from 7 days to 30 days. Canary run troubleshooting artifacts such as screenshots from the canary run, HAR files, and log files for historical runs can be viewed for up to 30 days to easily pinpoint persistent versus intermittent canary run failure patterns on the CloudWatch console. View the full article

  83. You can now obtain an aggregated picture of the performance and health of your WorkSpaces instances using the Amazon CloudWatch Automatic dashboard. This enables WorkSpaces administrators to quickly start monitoring WorkSpaces metrics and identify issues and their potential causes. You can also use the CloudWatch Automatic dashboard as a starting point and create your own custom dashboards to meet your monitoring needs. View the full article

  84. Organizations like yours are increasingly reliant on complex IT infrastructures to support their operations. Pervasive use of Kubernetes and microservices architectures continues to up the ante. Amidst this complexity, achieving comprehensive visibility into systems and applications has become both imperative for ensuring performance, reliability, and security, and ever more challenging to achieve. End-to-end, or […] View the full article

    • 0 replies
    • 73 views
  85. Amazon CloudWatch Container Insights with Enhanced Observability for EKS now auto-discovers critical health and performance metrics from your NVIDIA GPUs and delivers them in automatic dashboards to enable faster problem isolation and troubleshooting for your AI/ML workloads. Container Insights with Enhanced Observability delivers out-of-the-box trends and patterns on your infrastructure health and removes the overhead of manual dashboard and alarm setup, saving you time and effort. View the full article

  86. Amazon CloudWatch Synthetics announces new runtime version releases: syn-nodejs-puppeteer-7.0 for NodeJS Runtime and syn-python-selenium-3.0 for Python Runtime. The NodeJS Runtime update includes dependency upgrades to puppeteer (v21.9.0) and Chromium (v121.0.6167.0.85). The Python Runtime update includes dependency upgrades to Chromium and Chromedriver (v121.0.6167.85). To learn more, see NodeJS release notes and Python release notes. View the full article

  87. Amazon CloudWatch announces support for streaming of daily metrics on CloudWatch Metric Streams. With Metric Streams, you can create a continuous, near real-time stream of metrics to a destination of your choice. You can use Metric Streams to send metrics to your data lake on Amazon Web Services (AWS), such as Amazon Simple Storage Service (Amazon S3), or AWS Partner solutions including Datadog, New Relic, Splunk, Dynatrace and Sumo Logic. This new capability provides additional metrics for streaming, adding daily metrics with timestamps up to two days old. View the full article

  88. We are excited to announce that Amazon OpenSearch Serverless can now scan and search up to 10 TB of time series data across one or more indexes within a collection. OpenSearch Serverless is a serverless deployment option for Amazon OpenSearch Service that makes it simple for you to run search and analytics workloads without having to think about infrastructure management. With support for much larger datasets than before, you can unlock more valuable operational insights and make data-driven decisions to troubleshoot application downtime, improve system performance, or identify fraudulent activities. View the full article

  89. We are excited to announce that Amazon OpenSearch Serverless is enhancing access controls for VPC endpoints. With this feature, administrators can attach endpoint policies to control which AWS principals are allowed or denied access to the OpenSearch resources through their VPC endpoint(s). With a VPC endpoint policy, users can also combine actions along with AWS principals and resources to have finer control over allowing or denying traffic through their VPC endpoint(s). View the full article

  90. The .NET platform is taking cloud native deployment and observability seriously, most notably with the .NET Aspire stack unveiled at the recent .NET Conf 2023. In the latest episode of OpenObservability Talks, we reviewed the journey to making .NET a “by default, out of the box observable platform,” as ASP.NET […] View the full article

    • 0 replies
    • 76 views
  91. There’s no debate: in our increasingly AI-driven, lean and data-heavy world, automating key tasks to increase effectiveness and efficiency is the ultimate name of the game. No matter what job you hold today, you’re likely being pushed to not only do more with less, but also perform your work with a tighter focus on […] View the full article

    • 0 replies
    • 72 views
  92. Amazon CloudWatch Logs now lets customers use Internet Protocol version 6 (IPv6) addresses for their new and existing domains. Customers moving to IPv6 can simplify their network stack by running their CloudWatch log groups on a dual-stack network that supports both IPv4 and IPv6. View the full article

  93. Amazon CloudWatch announces a comprehensive set of enhancements to the alarm and dashboard experience. It introduces out-of-the-box, best practice alarm recommendations for 24 AWS services, streamlining your monitoring setup. You can easily view all metrics with recommended alarms using a convenient toggle. Creating alarms is simpler with pre-filled configuration in the alarm wizard or by bulk downloading infrastructure-as-code templates for the recommended alarms. View the full article

  94. We are pleased to announce Amazon OpenSearch Serverless now offers improved security options for workloads with the support of Transport Layer Security (TLS) version 1.3. OpenSearch Serverless is the serverless option for Amazon OpenSearch Service that makes it simpler for you to run search and analytics workloads without having to think about infrastructure management. View the full article

  95. Amazon OpenSearch Service now lets you update cluster volume size, volume type, IOPS and throughput without requiring a blue/green deployment. This makes it easier for you to make changes to your EBS settings without having to plan upfront for a blue/green deployment. View the full article

  96. AWS AppSync is a fully managed service that allows customers to connect applications to data and events with GraphQL APIs. AppSync allows customers to create APIs that connect to multiple data sources like microservice APIs, relational databases, and NoSQL databases. With AppSync APIs, applications can efficiently fetch data from different sources in one request. View the full article

  97. Amazon Managed Blockchain (AMB) Query now supports Amazon CloudWatch usage metrics, enabling customers to monitor their AMB Query API usage. View the full article

  98. Amazon CloudWatch Synthetics announces the release of Synthetics NodeJS Runtime versions syn-nodejs-puppeteer-6.2 and syn-nodejs-puppeteer-5.2, and Python Runtime version syn-python-selenium-2.1. This release brings updated Chromium dependency libs for forward compatibility with Lambda OS and adds a new Lambda Ephemeral Storage usage metric in the customer account. To learn more, see release notes. View the full article

  99. Amazon OpenSearch Service now provides improved visibility into the progress of domain updates. You can see granular status values representing different stages of an update, simplifying monitoring and automation of configuration changes. View the full article

  100. Starting today, AWS Elemental MediaConnect offers additional output metrics for SRT and MediaLive outputs to improve video stream monitoring. The new metrics are total packets sent, Forward Error Correction (FEC) packets, Automatic Repeat Requests (ARQ), number of resent packets, number of packets not recovered, round trip time, and bitrate for each output from a MediaConnect flow. View the full article