Here's an idea: pay extra for on-call work. As a professional I want to fix stuff if I break it, but there's a limit to demands on my time.
It's especially infuriating to spend an evening away from my family to fix a problem that someone else caused and could have fixed in 5 minutes, but I had to spend several hours getting familiar with.
At this point, if management asked me to start on an on-call rotation I'd want to know how I was going to be compensated for the additional time and opportunity cost of being on call, or I'd start looking around for a new gig.
Because orgs don't want to pay extra for requiring you to be on call.
Unless it's codified in labor law, they can extract that off-hours work from you for free (in other professions, you're compensated just for being on call, and then further if a call comes in).
As others in this thread have mentioned, people are wising up to the perils of being on call, along with the lack of compensation that goes with it.
This probably comes from Ubisoft's French roots. It sounds like a very French attitude to take towards being called after hours (which is illegal there).
> people are wising up to the perils of being on call
I did, too late.
It started with being asked to sleep in the guest bedroom while on shift and only got worse from there. Take it from me: it takes active investment to keep a family healthy while on call, especially when you're on call at a shaky startup with several pages per day that require extensive remediation.
I'm not stupid and I know on call was not the reason in itself, but it was a significant catalyst and debit upon my time and mental health -- not sleeping adds up. Be aware of it lest you end up like me, finally off call with an empty home to show for it.
It is not as bad if you can rely on a strong team (3.5 years there, no significant consequences in personal life).
Some things that help:
1. Everybody is in the roll, not just the new guy(s).
2. The team is big enough for everyone to have a reasonable ratio of on-call vs off-call days. Merging two or more small teams into one single on-call roll of death does not count; people who are not knowledgeable on the problem at hand (e.g. everyone, eventually) will just fuck up and end up calling someone from the correct team anyway (after the problem has grown worse and the customer is angrier).
3. Team is encouraged to trade days or cover for each other if needed.
4. On-call guy has veto power over deployments. If you want to push something urgent at the end of the day, you better make yourself available to the guy that can veto your deployment.
5. Management understands that developers' productivity will slow down while on call, and plans accordingly.
Yeah, I worked at a hedge fund for nearly a decade. I got called back to the office from vacation after driving across the country for a development - not production - issue. I started getting 2-4 am calls for outages in dev by our offshore support who were too lazy and/or incompetent to read a log file and take action. Glad I don't work there anymore.
Every policy that gets put in place has the possibility of influencing behavior of those involved.
If your organization pays extra for on-call work, that incentivizes buggier code, because you will get paid to fix or solve it later (even if it's not intentionally buggy).
You see this all the time with contract work and it's a huge issue with low bids that end up costing a lot more because of the sunk cost.
Edit: In fact, you already have developers fighting this incentive at your organization. If developers are already writing buggy code hoping that it will be someone else's issue and not caring when it's their turn to be on call, imagine when they get paid extra to fix the bugs they introduced to begin with.
Rotate through different employees and give them times when they are responsible for //being awake and ready to work on short notice//, i.e. this is a normal hourly wage with the bonus that you don't have to be at work.
For both the above and 'on call', you also charge the penalty to the 'department' responsible for the failure's root cause (as determined later) and pay out a bonus to those needed to respond to the incident.
That's fine, but unless your on-call scheduling is erratic and unpredictable, it's no different from pay that's included in your salary.
Like the article says, you really get the best results when the people responsible for the code are also responsible for issues that come up after hours. That means your team should be rotating out so that everyone feels ownership and responsibility for the deployment.
> Placing folks around the globe doesn't really help with the weekend or holiday situation.
Given that non-working "weekends" and "holidays" aren't global standards, it kind of can, though arrangements to simultaneously mitigate those situations and working hours situations may be more complex than ones intended to handle only one or the other problem.
This works. We did this for developer on-call rotations (which was also a development opportunity for rising engineers). We also did this when I ran Ops.
It's not perfectly fair in the micro sense, but it's fair enough to not trip most people's frustration meter and it's very easy to administer.
The way it used to work in the organisation I used to be part of (generally a good bunch) was that there was a flat-rate, plus an extra payment if we were required to do something.
Occasionally there was time in lieu given as well, if we were actually called and it was overnight/took a while.
Worked well, but as a general rule we were delivering a quality product, had good guys working for us and so we all felt responsible if something went wrong out of hours and would look at what happened and how we would prevent it going forward.
We also gave comp time for significant disruptions/outages, but that was mostly recognition of the practical reality that someone who normally worked days and unexpectedly worked 1-4 AM responding to the pager was going to be useless the next day anyway.
I specifically wanted to avoid variable pay, the associated timesheet tracking and approval processes, the HR and finance/payroll integration, and any temptation (beyond normal professional responsibility) to either decrease or increase the hours spent on a problem response. I'm sure it's different for different businesses, but I judged that paying quarterly bonuses for weeks spent on call was low enough finance integration effort and still gave the employees a sense that they were being paid some differential for being on call.
Your last sentence is the key outcome to strive for and as long as you have decent leadership and culture, I think that's fairly easy to achieve in a small group. I don't need to be paid "extra" to do a little extra to help my team and company.
IMHO, the problem with on-call rotation is that people just try to push the problems until it is someone else's turn. No one is looking at the root causes. No one is making changes to the systems to avoid problems in the future. No one is building operational support into the applications (mentioned by Johnny555 in another comment).
If you are an hourly employee, there is already extra pay for nights/weekend/long shifts.
If you are salaried, you're expected to do the job that's needed, without particular daily time keeping.
Easiest is to build robust infrastructure that recovers from failure, uses circuit breakers, soaks in forked workloads before being live, and such, so being on call is a low risk venture.
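For anyone who hasn't run into the term, here is a rough sketch of what a circuit breaker might look like. This is only an illustration, not any particular library's API: the names `CircuitBreaker`, `CircuitOpenError`, `max_failures`, and `reset_after` are all made up for the example. The idea is that after a few consecutive failures the breaker stops calling the broken dependency for a cool-down period, so the failure doesn't cascade (and doesn't necessarily page someone at 3 AM).

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    # Hypothetical defaults: trip after 3 consecutive failures, retry after 30 s.
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast without touching the dependency.
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures in a row, (re)opens it.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

You would wrap calls to a flaky dependency as `breaker.call(fetch_something, arg)` and handle `CircuitOpenError` with a fallback or a cached value. Real deployments usually want per-dependency breakers and metrics on top of this.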