Counterpoint: I'm at a small business and I'm primary for 24x7 oncall. I don't even take shifts with my coworker. But, this is because I'm empowered to make out of hours (overnight, weekend, holiday) calls STOP. I get woken up by something about once or twice a quarter.
When something wakes me up, the next day I start a process to ensure I don't get that same alert again: bugfixes, adjusting thresholds or time-to-critical, detecting problems and auto-remediation, determining it can be a "business hours" response.
This also requires buy-in from development. Literally yesterday I had an education opportunity with one of the developers about a ticket slated to go into production that evening that would have immediately eliminated one of our leading monitoring indicators, because it would have started creating hundreds or thousands of Sentry issues an hour. "I was thinking it was more like logging, where more information is better, where with monitoring we want the fewest messages possible."
Always, always, look at every pager hit and ask "what can prevent that from happening again?"
When something wakes me up, the next day I start a process to ensure I don't get that same alert again: bugfixes, adjusting thresholds or time-to-critical, detecting problems and auto-remediation, determining it can be a "business hours" response.
This also requires buy-in from development. Literally yesterday I had an education opportunity with one of the developers about a ticket slated to go into production that evening that would have immediately eliminated one of our leading monitoring indicators, because it would have started creating hundreds or thousands of Sentry issues an hour. "I was thinking it was more like logging, where more information is better, where with monitoring we want the fewest messages possible."
Always, always, look at every pager hit and ask "what can prevent that from happening again?"