Published in Gett Tech · Apr 10, 2020 · Managing major incidents from home — What I learned from Mission Impossible · Full disclosure: I am not Tom Cruise (shocker, I know 🙃). What I am is an incident manager who, not unlike Ethan Hunt (portrayed by the mighty Tom), also has to deal with life's uncertainties and surprises. All the more in this day and age where COVID-19 forced… · Incident Management · 5 min read
Published in Gett Tech · Jan 23, 2020 · Show me the money! — Monitoring Production the “Jerry Maguire” way · Every escalation engineer knows this simple truth: “if you find it faster, you will solve it faster” (“it” being the incident you want to avoid). … · Incident Management · 5 min read
Published in Gett Tech · Nov 7, 2019 · Scientia Potentia Est — Knowledge Is Power · One of my favorite proverbs of all time is this: “Give a man a fish and you feed him for a day. Teach a man to fish and you feed him for life” (attributed to Maimonides, 12th century A.D.). Why do I like this proverb so much? Because it embodies… · Engineering · 5 min read
Published in Gett Tech · Jul 11, 2019 · The importance of Proactive Service Monitoring · When I started my journey in the IT world, my badge bore the motto of the company that saw fit to grant me this opportunity (Comverse Technologies, you are forever in my mind 😃): “Our goal is to meet or exceed our customers’ expectations”. Take a moment and think about… · Dev Ops · 5 min read
Published in Gett Tech · May 30, 2019 · A “Fire” Incident case study · Here at Gett, we have several degrees of severity for issues in the production environment: 🔨 Trivial: a small issue that affects the occasional user in a remote aspect of the application; ⚠️ Medium: slightly more inconvenient, but not business-disrupting; ❗️ Critical: a potentially business-affecting issue if not treated within the… · Dev Ops · 4 min read
Published in Gett Tech · Feb 28, 2019 · Gett Global Technical support team — the Gatekeepers of the system · Gett is a global operation driving scores of passengers daily around 3 geographical regions and providing service to hundreds of thousands of end users. Did you ever stop to ask yourself what makes this great operation tick? Besides the amazingly talented R&D engineers that write the code and the fearless… · Dev Ops · 4 min read
Published in Gett Tech · Jan 30, 2019 · Gett incident management — Minimizing Crisis resolution time · Incident management is a crucial part of any service-providing company. It's the process by which a critical service disruption is managed from start to finish, all the while taking into account the following: timely problem detection; alerting the correct service owners of the problem; coordinating the repair efforts; alerting stakeholders (of… · Dev Ops · 4 min read