Someone has to be oncall, so why can’t it be you?
A common question I got when I interviewed candidates for KMS was: “How is your oncall load?” My answer was: “KMS’s oncall load is reasonable for a tier-0 service, and we are proud of our operational excellence. But we are a tier-0 service in AWS. When KMS has an availability drop or a latency spike, many other AWS services are impacted. So if you join KMS you will be in the oncall rotation, and you will be involved in large service events; that is guaranteed. This is not a place for the faint of heart.”
A lot of people are scared of the oncall culture at AWS. They worry about being paged in the middle of the night; they worry about large events that might affect millions of people’s lives; they don’t want that responsibility. But when they have an emergency at home and call 911, they expect the rescue crew to be there within minutes, no matter the time of day, don’t they? Someone has to do the hard work. If that “someone” is not us, then who? Why should someone else always take care of us? What do we give back to them, and to society? We can’t all be takers. A society can function only if enough people contribute more than they take from others.
I have come to realize that the total engineering ownership model is the secret of Amazon’s success: “You operate the service you develop. If you develop a service of horrible quality, you have to deal with the operational issues, including being paged.” It works because it keeps the skin of everyone involved in the service in the game. That is the only way to make anything worthwhile in life: to have your skin in the game.
Last week a guest accidentally started a fire at my cottage. Nobody was hurt, but the living room burned. We were lucky that the neighborhood firefighters responded quickly. Within minutes, three fire trucks arrived. The firefighters were well prepared, well trained, and well equipped. The fire was contained and the structure of the house was saved.
I visited the fire station the next day to collect the incident report from the oncall fire chief. I was impressed by the professionalism and empathy shown by the fire chief and his crew. They cared deeply about the people and the properties they saved. It is not just that they have skin in the game, quite literally; they stand ready to leave for a dangerous fire scene at any moment of their shift, and they do it for the love of the game! They are proud of their work because their work makes this world better and safer.
So when I whine about being paged two or three times a night during my oncall weeks, when I have to get out of my warm bed and open my laptop to restore KMS service, I think about the firefighters and all the other people who make our lives possible. I think about their sacrifices and feel ashamed of fussing over my tiny inconvenience. We do this for the love of the game, and the game is life!