Showing results for tags 'sre'.
-
As a DevOps Engineer or Site Reliability Engineer (SRE), managing cloud infrastructure deployments is a critical aspect of your daily activities. It is vital to use tools that automate the provisioning and configuration of cloud infrastructure to achieve efficient and scalable infrastructure management. One of the best tools for this is HashiCorp Terraform, and as […] The article Why HashiCorp Terraform is Essential for SREs and DevOps Engineers appeared first on Build5Nines. View the full article
-
As we close out 2022, we at DevOps.com wanted to highlight the most popular articles of the year. Following is the latest in our series of the Best of 2022. Site reliability engineering (SRE) teams and platform engineering teams share similar goals—like maximizing automation and reducing toil—and similar methodologies. But they have different priorities, and […] The post Best of 2022: SRE Vs. Platform Engineering: What’s the Difference? appeared first on DevOps.com. View the full article
-
Ask most SREs how many incidents they’d have to respond to in a perfect world, and their answer would probably be ‘zero.’ After all, making software and infrastructure so reliable that incidents never occur is the dream that SREs are theoretically chasing. Reducing the number of actual incidents as much as possible is a noble […] The post Why More Incidents Are Better appeared first on DevOps.com. View the full article
-
Site reliability engineering (SRE) isn’t a new term or practice. The practice of applying software engineering skills and principles to operations problems and tasks happened even before site reliability engineer was a defined job title. But organizing a proactive approach to building and maintaining software drives long-term success in improving operational efficiency, data-driven roadmap planning […] View the full article
-
Site reliability engineers (SREs) have a considerable set of tasks to juggle no matter where they work or how long their company has had an SRE practice. But if you’re the very first one to join an organization—as many SREs are these days, given that the trend is trickling down into smaller and smaller companies—you […] The post 5 Tips If You’re the First SRE Hire appeared first on DevOps.com. View the full article
-
Here: https://about.gitlab.com/blog/2022/05/17/how-we-removed-all-502-errors-by-caring-about-pid-1-in-kubernetes/?utm_id=FAUN_Kaptain321_Link_title
Tagged with: kubernetes, gitlab (and 3 more)
-
What happens when the tools and services you depend on to drive site reliability engineering turn out to be susceptible to reliability failures of their own? That’s the question teams at about 400 businesses presumably asked themselves in the wake of a major outage in Atlassian Cloud... View the full article
-
The amount of routine toil that site reliability engineers (SREs) perform declined slightly in the last year even though IT environments in general are becoming more complex to manage. An annual survey of 300 SREs conducted by Catchpoint, a provider of an IT monitoring platform, in collaboration with VMware and the DevOps Institute suggests that […] The post Survey Reveals Slight Decline in Level of SRE Toil appeared first on DevOps.com. View the full article
-
Who else is glad that 2020 is almost over? We’ve had one of the most difficult years in recent history. With everything going on, it’s been difficult to think further than a few days out, much less into the new year. But, we’re hopeful that 2021 will be a better year for everyone. And we’re predicting some exciting things in the future for SRE. View the full article
-
November zine covering the latest and greatest from the SRE and resilience engineering community. View the full article
-
One of the fundamental premises of software reliability engineering is that you should base your reliability goals—i.e., your service level objectives (SLOs)—on the level of service that keeps your customers happy. The problem is, defining what makes your customers happy requires communication between software reliability engineers (SREs) and product managers (PMs) (aka business stakeholders), and […] The post SREs: Stop Asking Your Product Managers for SLOs appeared first on DevOps.com. View the full article
-
Good blog post looking at some of the many roles an SRE can play, and how to find people with those skill sets... https://thechief.io/c/blameless/how-build-your-sre-team/
-
BOO! Did we scare you? We couldn’t help it, we’re just so happy it’s spooky season. Here’s the October issue of SREview! View the full article
-
By adopting a multilevel approach to site reliability engineering and arming your team with the right tools, you can unleash benefits that impact the entire service-delivery continuum. In today’s application-driven economy, the infrastructure supporting business-critical applications has never been more important. In response, many companies are recruiting site reliability engineering (SRE) specialists to help them […] The post Why It’s Time for Site Reliability Engineering to Shift Left appeared first on DevOps.com. View the full article
-
OpenTelemetry is an open source project to provide a de facto standard trace API and a metric and log agent to end vendor lock-in. It’s no surprise that understanding the internal health of applications and systems is a high priority for developers and SREs. It’s a well-understood problem with dozens of vendors focused on helping […] The post OpenTelemetry and the Future of Monitoring Instrumentation appeared first on DevOps.com. View the full article
-
Are you excited about reliability? Is your significant other tired of hearing about distributed systems? Are you the one being paged when systems go down? Have you had “aha!” moments when reading the SRE books? If you answered ‘yes’ to any of these questions, join us for a virtual conference on everything SRE! We’re looking for presenters on topics such as building reliable systems, monitoring and alerting, distributed systems, chaos engineering, and automated testing. https://www.papercall.io/conf42-sre-2021
-
Conf42: Chaos Engineering is back in 2021! Come and join other engineers and SREs and talk about failure, dealing with failure, breaking things on purpose and other fun things. https://www.papercall.io/conf42-chaos-engineering-2021
Tagged with: sre, chaos engineering (and 2 more)