9 Comments

Blameless culture is all that matters. Great article, Gregor!

Right, best kind of culture! Thanks Petar.

Leveraging offshore talent is super great for helping with this problem. You can have part of your engineering team working while the rest are sleeping. That is how I recommend doing it!

Indeed Matt! The follow-the-sun model is a great way to make on-call less stressful for people, with no need for anyone to be on-call 24/7.

Simplicity should be maintained until complexity becomes unavoidable. Often, adding layers of complexity introduces more problems than it resolves. As a company scales, on-call rotations may become necessary, but the real value lies in creating systems that reduce the need for them.

Couldn't agree more Tomek! Reducing the issues as much as possible is always the right way of thinking.

Let's imagine the company has an on-call process in place.

One thing I find very useful for avoiding being called during on-call duty is having proper runbooks.

Those runbooks should be clear enough for the SRE team to use when an incident happens, so they can solve or mitigate the issue without calling you in the middle of the night.

+1 for runbooks, Marcos! Very true, they can save so much stress, anxiety, and time in general. Thanks for pointing this out.

The line "On-call should be focused on resolving the incident and not on blaming or looking for the person that caused the incident" speaks volumes. I am not an engineer, but I do work closely with our developer and architect teams. The blame game, when it comes from stakeholders, takes time away from resolving the matter at hand. I work to bring us all back to targeting the root issue so we can get to a resolution promptly.
