How To Land Your First DevOps Role

DevOps roles seem impossible to attain. They're everywhere and nowhere. Let's go through the steps to make yourself employable for that DevOps career!

Fixing The ClickHouse Node Failure On Distributed Systems - A How-To Guide

Part One: ClickHouse Failures, by Marcel Birkner. ...

DevOps and SRE, Chapter 4: Explaining It To Business Management Executives

Why the collection of practices that we know today as DevOps and SRE (Site Reliability Engineering) is becoming the norm for modern systems management.

How I cleared the AWS Solutions Architect Associate - C02 exam

I believe this period is best spent studying for the certification you have always dreamt of. I used it to earn my AWS Solutions Architect Associate certification. Here are my takeaways from the exam.

How to Operate Less and Innovate More Using Observability and AI

Take a look at a few ways DevOps pros and SRE teams can leverage observability and AI to operate less and innovate more.

3 Lessons DevOps Can Learn From 5 Biggest Outages of Q2 2020

Read this article to learn 3 lessons from the biggest outages of IBM Cloud, T-Mobile, and GitHub. The second quarter of 2020 was marked by several serious outages at prominent services including IBM Cloud, GitHub, Slack, Zoom and even T-Mobile (Source: StatusGator Report).

How to Build an Effective and Sustainable On-Call Schedule For Your Team

A lot of tech companies struggle with creating an effective and efficient on-call schedule internally for their product and service, which results in longer downtimes when something goes wrong. They often over-burden their team members with repeated on-call duty, resulting in team fatigue. Here’s how to create an on-call schedule that your team might just love.

How To Adjust Size Of A Kubernetes Cluster Using Cluster Autoscaler

Spawning an AWS EKS cluster has never been easier, and the options are many: CloudFormation, Terraform or CDK. For the lazy, you can even use the great CLI utility eksctl from Weaveworks.

Why SRE?

Imagine that you have a good idea and decide to create a digital solution. The service is innovative and you have no competitors. After some months you see your user base grow exponentially. More and more features are added in each release.

A Modern Approach to Sandboxes

In this article, I’ll go over some of the tricks we used to make our cloud sandbox safe, reliable, and low maintenance.

Which Type of Delivery Pipeline Does Your Business Rely On?

A delivery pipeline is the thing that takes freshly written software out of the hands of a developer, and turns it into running services, potentially accessible to the public.

How To NOT pass the Certified Kubernetes Administrator Exam

You will find a lot of articles with good tips on how to ace the CKA exam. You should definitely read them! They sure helped me a lot when I was preparing for the exam. But now that I am certified, I can say in retrospect that some of this advice may not be worth your time. In this article I will focus on tips I would call debatable: they do not give you a real competitive advantage and may even impede your success. Let’s start!

Choosing the Right SRE Tools

In this blog, we’ll talk about what to look for in an SRE tool, and how the right tools will help you on your journey to reliability excellence.

Docker Networking (Bridge-Network)

In this blog we are going to see how two containers can communicate with each other using Docker networking. One of the most important things that Docker ...

How should a company organize DevOps and SRE activities?

Making DevOps and SRE work for you: SRE and DevOps are trending buzzwords, especially in startup ecosystems. What exactly are they? What is the difference?

5 Tips for Getting Alert Fatigue Under Control - DZone DevOps

It’s important to minimize alert or pager fatigue as much as possible, for the health and well-being of your team members. Here are 5 tips to help.

Reduce Engineering Problems With a Resiliency Mindset - DZone DevOps

To reach your optimal state of resilience, there are some crucial SRE best practices you should adopt to strengthen your processes.

SRE: Service Reliability

The primary role of the Site Reliability Engineer is to identify and manage asset risks that could adversely affect plan or business operations.

What is site reliability engineering (SRE) and how is it different from DevOps?

I’ll explain what SRE is and why SRE helps maintain software quality in production systems. I will also discuss how DevOps and SRE relate to each other.