Why Upstream?

In “Upstream,” Dan Heath explores a variety of problems, ranging from homelessness to high school graduation rates to the state of sidewalks in different neighborhoods of the same city. In each example, Dan shows how upstream thinking decreased downstream work. Upstream thinking means taking proactive, collective action to improve outcomes rather than reacting after an issue has already occurred.

You can also apply this method to software development.

With technology moving at a breakneck pace, it’s difficult to keep up with unplanned work, such as the incidents and unknown unknowns that come with increasing software complexity and interdependencies. Yet we can’t halt development. As Dan points out, “Curiosity and innovation and competitiveness push them forward, forward, forward. When it comes to innovation, there’s an accelerator but no brake” (“Upstream,” p. 224).

We can’t impede innovation, but we can apply Dan Heath’s wisdom on upstream thinking to move away from reactive modes of work and make our teams and our systems more reliable.

Barriers to Upstream Thinking

Before we can focus on implementing upstream thinking, we should acknowledge common barriers. Dan notes the problem here: “Organizations are constantly dealing with urgent short-term problems. Planning for speculative future ones is, by definition, not urgent. As a result, it’s hard to convince people to collaborate when hardship hasn’t forced them to” (220).

That can make everything feel like a barrier to upstream thinking. But Dan sorts the barriers into three groups: problem blindness, lack of ownership, and tunneling.

Problem Blindness

Problem blindness is self-explanatory: you are unaware that you have a problem. Issues and daily grievances are brushed off as just the way things are.

Consider alert fatigue. When you’re paged so often that you begin ignoring the alerts, you’re exhibiting problem blindness. Not only are you ignoring potentially important notifications, but you’re desensitized and possibly becoming burned out.

In this situation, you might hear people say things like, “Oh, that’s just the way it is. Our alerts are noisy. You can ignore them,” or “I can’t remember the last time I got a weekend off. You’ll get used to it.” Tony Lykke faced this issue and spoke about it at SREcon19 Americas. His talk, “Fixing On-Call when Nobody Thinks it’s (Too) Broken,” describes this apathy.

It’s important to grow wise to problems. If you aren’t aware of them, you can’t begin to fix them. Question the status quo. Are there problems within your organization that have been dismissed or swept under the rug? These are sources of problem blindness. As Dan says, “The escape from problem blindness begins with the shock of awareness that you’ve come to treat the abnormal as normal” (37).
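
For alert fatigue in particular, one way to question the status quo is to measure how often a page actually required action. The Python sketch below is only an illustration under stated assumptions: the `Page` record, its `actionable` flag, the `noisy_alerts` function, and the thresholds are all hypothetical, and they do not come from the book or from any real paging tool. It assumes you can export a log of pages and that responders mark whether each one needed a response.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One page sent to the on-call engineer (hypothetical record shape)."""
    alert_name: str
    actionable: bool  # True if the responder actually had to do something


def noisy_alerts(pages, max_actionable_ratio=0.5, min_pages=3):
    """Return {alert_name: actionable_ratio} for alerts that fired at least
    `min_pages` times and were actionable at most `max_actionable_ratio`
    of the time. Both thresholds are arbitrary starting points."""
    totals = defaultdict(int)
    actionable = defaultdict(int)
    for page in pages:
        totals[page.alert_name] += 1
        if page.actionable:
            actionable[page.alert_name] += 1

    report = {}
    for name, total in totals.items():
        ratio = actionable[name] / total
        if total >= min_pages and ratio <= max_actionable_ratio:
            report[name] = ratio
    return report


# Made-up sample data: one alert that is mostly ignored, one that is always real.
pages = [
    Page("disk_usage_warning", False),
    Page("disk_usage_warning", False),
    Page("disk_usage_warning", False),
    Page("disk_usage_warning", True),
    Page("checkout_errors", True),
    Page("checkout_errors", True),
    Page("checkout_errors", True),
]
print(noisy_alerts(pages))  # {'disk_usage_warning': 0.25}
```

A report like this does not fix anything by itself, but it turns “our alerts are noisy” from background grumbling into a list of specific alerts the team can tune, downgrade, or delete.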

