Good reads: Rethinking On-Call

Rethinking On-Call: Compensation, Runbooks, and Sustainable Practices

Old School Burke

24 May 2024

Let us talk about one of the most-hated aspects of the engineering experience: being on call.

How to Write Good Runbooks: This article emphasizes the importance of well-crafted runbooks in incident management. It offers practical advice on making runbooks actionable, reducing the stress and uncertainty that often accompany on-call situations.
Navigating On-Call Compensation in the Tech Industry in 2023: Discusses the evolving landscape of on-call compensation, providing insights into how fair and motivating compensation practices are essential for maintaining team morale and performance.
Incident Metrics Tell You Nothing About Reliability by Dan Slimmon: Dan critically assesses how effective incident metrics are for gauging system reliability, and proposes a more nuanced approach to understanding what these metrics truly indicate about our systems.
Oncall and Sustainable Software Development: This piece links effective on-call practices with sustainable software development, suggesting ways to align on-call duties with a broader commitment to developer health and software quality.
Project Star: Streamlining Our On-Call Process: A case study from LinkedIn detailing how they refined their on-call process to boost developer satisfaction and productivity, providing a practical example of successful on-call management.

