Hirschorn: I’m going to talk about some of the chaos practices that we use at Condé Nast, which have been running for more than two years now, and also about incident management. I’ll also cover an often overlooked but probably even more important aspect: growing the resilience of your organization in the face of adversity and failure. I’m a VP of Engineering at Condé Nast. I’ve been there for just over three years. I look after global strategy and operations for technology and engineering. You’ll see in a second a picture of what we do.

You might be like, who’s Condé Nast? We are a portfolio company. We publish more than 30 different publications around the world, both in print and in digital. Here are some well-known brands that you might recognize. We have Vogue, GQ, WIRED, Vanity Fair, Glamour, The New Yorker, Bon Appétit. The list goes on. There are more than 30 brands in our portfolio.

This is wild. I think it gives you a sense of our global footprint as a company, and what we have to try and serve around the world. When I say more than half a billion customers, I’m like, is that requests or customers? I can’t imagine one-sixteenth of the world is actually looking at our publications every month, but apparently so. This is a bit old, actually; we’re now in about 40 countries around the world. This is just giving you a taste of what publications run where.

What underpins this is a Kubernetes platform that we’ve been building for the last two-and-a-half years. That platform itself runs about 10 clusters at the minute. We run on AWS, in five geographically distributed regions: Japan, China, Frankfurt, Ireland, and us-east-1.

#resilience #qcon london 2020 #transcripts #chaos engineering #devops #presentation

Growing Resilience: Serving Half a Billion Users Monthly at Condé Nast