
Showing results for tags 'sre'.


  1. It’s Saturday night. You’re out to dinner with friends. Suddenly, a familiar tune emits from your pocket. Dread fills you as you fish your phone out of your pocket and unlock it. You tap the alert. Maybe it’s a lucky night and this is one alert you can just snooze or resolve. Maybe it’s a bad night, and the next step is you pulling your laptop from your bag — because you bring your laptop everywhere when you’re on-call — and trying to troubleshoot a problem in a crowded, noisy restaurant. The post How to Escape the 3 AM Page as a Kubernetes Site Reliability Engineer appeared first on Security Boulevard. View the full article
  2. Claim your spot for the free Google Site Reliability Engineering in partnership with Uplimit right now! Starts March 11. View the full article
  3. AI-engineered tools can be used to improve the SRE measurements and observability pillar of practice. View the full article
  4. In the rapidly changing world of technology, it is essential for professionals in the DevOps, IT Ops, Platform Engineering, and SRE domains to stay up to date on the latest innovations and best practices. Google Cloud Next is the ideal place to do just that! Think of it as a golden opportunity to gain insights, expand your knowledge, and connect with like-minded peers. Read on for five compelling reasons why attending Next ’23 is a must for operations professionals this year.

     1. Uncover the potential of generative AI for operations

     A recent IDC survey¹ found that IT operations is the area with the most potential to benefit from generative AI assistance. From automating routine tasks to predicting and preventing potential issues, generative AI could revolutionize the way IT teams work. At Next ’23, you'll have the chance to delve into the latest AI breakthroughs and explore how they can help supercharge IT operations. Our expert speakers will share real-world success stories of building an AIOps platform on Google Cloud, revealing the immense potential AI holds for optimizing workflows, enhancing system reliability, and driving efficiency in IT environments.

     2. Embrace platform engineering

     Platform engineering has emerged as a crucial discipline for organizations aiming to build robust, scalable, and efficient software delivery platforms. Next ’23 is a great place for professionals to dive deep into the principles and practices of platform engineering. Through sessions such as a panel with industry-leading thought leaders and platform engineering customer success stories, you’ll gain the expertise you need to architect and build your own software delivery platform for your developers.

     3. Learn about the latest innovations in GKE

     Google Kubernetes Engine fans, you don't want to miss out on this opportunity to gain insights into the latest GKE features, best practices, and real-world use cases from your peers. As the first managed Kubernetes offering, GKE has won the hearts of countless customers, and at Next ’23 this year, you'll get the chance to discover why! GKE customers like ANZ Bank, Equifax, Ordaōs Bio, Etsy, and Moloco will share their success stories. More than that, we’ll also unveil the latest innovations that we’re adding to GKE to help you run modern workloads.

     4. Get inspired by your peers

     Learning from peers is invaluable for technology professionals. At Next ’23, you’ll be exposed to customer success stories about how leading organizations have leveraged Google Cloud's solutions to drive business growth and innovation. Hear from Charles Schwab on their best practices operating Google Cloud services; learn from SAP about how they control cost and build financial resilience with automation; listen to Wayfair on how they cut costs by 64% by moving from open source to Google Cloud operational tools. Snap will share their approach to observability; Priceline will discuss how they optimize Kubernetes for reliability and cost-efficiency; and Uber will show you how they build an AIOps platform with Google Cloud services.

     5. Network and build community

     Networking and community building are an integral part of the Google Cloud Next experience, and this year, it gets even better! For the first time since the pandemic, our premier event is going back to its roots as an in-person gathering. Engage with your peers at the Innovator Hive’s specialized zone for DevOps and IT Ops professionals. Get your hands on the interactive demos, try the micro-challenges, throw yourself into immersive learning, play the games, and meet with Google experts and your peers for architecture deep dives. We’re confident these interactions will open doors to new perspectives, ideas, and future partnerships.

     Google Cloud Next is not just another tech event; it's an immersive experience that has the power to transform your career and ignite your passion for all things cloud and technology. From exploring the latest GKE innovations to discovering the magic of AI in IT operations, from embracing platform engineering to learning from your peers, Next ’23 is brimming with opportunities to expand your horizons and elevate your skills. So, don't miss your chance to be a part of this extraordinary event. Mark your calendar, pack your enthusiasm, and join us at Google Cloud Next. Together, we'll unleash the full potential of cloud technology and pave the way for a brighter, more innovative future.

     ¹ US – Generative AI, IDC, April 2023; N=200; Base=All Respondents; Notes: Managed by IDC’s Global Primary Research Group; Data Not Weighted. Use caution when interpreting small sample sizes.
  5. As a DevOps Engineer or Site Reliability Engineer (SRE), managing cloud infrastructure deployments is a critical aspect of your daily activities. It is vital to use tools that automate the provisioning and configuration of cloud infrastructure to achieve efficient and scalable infrastructure management. One of the best tools for this is HashiCorp Terraform, and as […] The article Why HashiCorp Terraform is Essential for SREs and DevOps Engineers appeared first on Build5Nines. View the full article
  6. We are all looking to advance our careers and to find tips and tricks to help us get the leading edge in the industry. Technology certifications are a great way to prove you have the expertise needed for the job. Sure, having expertise certainly helps when it comes to being more efficient with practical, […] The article How Adoption of ChatGPT Can Benefit Your Career in DevOps, SRE or Software Development appeared first on Build5Nines. View the full article
  7. Artificial Intelligence (AI) has the potential to greatly assist Site Reliability Engineers (SREs), or even DevOps Engineers, in a number of ways. Some potential applications of AI in SRE work might include automation of routine tasks, such as monitoring systems for issues and alerting team members when issues are detected, as well as responding to […] The article Future Benefits of using AI as a Site Reliability Engineer with insight from ChatGPT appeared first on Build5Nines. View the full article
  8. As we close out 2022, we at DevOps.com wanted to highlight the most popular articles of the year. Following is the latest in our series of the Best of 2022. Site reliability engineering (SRE) teams and platform engineering teams share similar goals—like maximizing automation and reducing toil—and similar methodologies. But they have different priorities, and […] The post Best of 2022: SRE Vs. Platform Engineering: What’s the Difference? appeared first on DevOps.com. View the full article
  9. Ask most SREs how many incidents they’d have to respond to in a perfect world, and their answer would probably be ‘zero.’ After all, making software and infrastructure so reliable that incidents never occur is the dream that SREs are theoretically chasing. Reducing the number of actual incidents as much as possible is a noble […] The post Why More Incidents Are Better appeared first on DevOps.com. View the full article
  10. Site reliability engineering (SRE) isn’t a new term or practice. The practice of applying software engineering skills and principles to operations problems and tasks happened even before site reliability engineer was a defined job title. But organizing a proactive approach to building and maintaining software drives long-term success in improving operational efficiency, data-driven roadmap planning […] View the full article
  11. Site reliability engineers (SREs) have a considerable set of tasks to juggle no matter where they work or how long their company has had an SRE practice. But if you’re the very first one to join an organization—as many SREs are these days, given that the trend is trickling down into smaller and smaller companies—you […] The post 5 Tips If You’re the First SRE Hire appeared first on DevOps.com. View the full article
  12. Here: https://about.gitlab.com/blog/2022/05/17/how-we-removed-all-502-errors-by-caring-about-pid-1-in-kubernetes/?utm_id=FAUN_Kaptain321_Link_title
  13. What happens when the tools and services you depend on to drive site reliability engineering turn out to be susceptible to reliability failures of their own? That’s the question teams at about 400 businesses presumably asked themselves in the wake of a major outage in Atlassian Cloud... View the full article
  14. The amount of routine toil that site reliability engineers (SREs) perform declined slightly in the last year even though IT environments in general are becoming more complex to manage. An annual survey of 300 SREs conducted by Catchpoint, a provider of an IT monitoring platform, in collaboration with VMware and the DevOps Institute suggests that […] The post Survey Reveals Slight Decline in Level of SRE Toil appeared first on DevOps.com. View the full article
  15. Who else is glad that 2020 is almost over? We’ve had one of the most difficult years in recent history. With everything going on, it’s been difficult to think further than a few days out, much less into the new year. But, we’re hopeful that 2021 will be a better year for everyone. And we’re predicting some exciting things in the future for SRE. View the full article
  16. All crucial systems are built in order to be “safe for failure.” They serve two functions, as they are both vitamins and morphine, treaties and weapons. We must make sure our systems, too, are dually focused. View the full article
  17. November zine covering the latest and greatest from the SRE and resilience engineering community. View the full article
  18. One of the fundamental premises of software reliability engineering is that you should base your reliability goals—i.e., your service level objectives (SLOs)—on the level of service that keeps your customers happy. The problem is, defining what makes your customers happy requires communication between software reliability engineers (SREs) and product managers (PMs) (aka business stakeholders), and […] The post SREs: Stop Asking Your Product Managers for SLOs appeared first on DevOps.com. View the full article
  19. In this interview, we’ll delve into what draws Yury to SRE and chaos engineering, how she defines resilience, as well as her predictions on emerging trends in the SRE landscape. View the full article
  20. In this blog post, we’ll cover how SRE takes onboarding to the next level. View the full article
  21. In this blog post, we’ll look at the business value of SRE through customer focus, observability, and efficiency. View the full article
  22. BOO! Did we scare you? We couldn’t help it, we’re just so happy it’s spooky season. Here’s the October issue of SREview! View the full article
  23. By adopting a multilevel approach to site reliability engineering and arming your team with the right tools, you can unleash benefits that impact the entire service-delivery continuum. In today’s application-driven economy, the infrastructure supporting business-critical applications has never been more important. In response, many companies are recruiting site reliability engineering (SRE) specialists to help them […] The post Why It’s Time for Site Reliability Engineering to Shift Left appeared first on DevOps.com. View the full article
  24. SRE practices and tools can help achieve security objectives. In this blog post, we’ll break down how to use SRE to enhance your security procedures. View the full article
  25. OpenTelemetry is an open source project to provide a de facto standard trace API and a metric and log agent to end vendor lock-in. It’s no surprise that understanding the internal health of applications and systems is a high priority for developers and SREs. It’s a well-understood problem with dozens of vendors focused on helping […] The post OpenTelemetry and the Future of Monitoring Instrumentation appeared first on DevOps.com. View the full article