On Data Science in Accident Prevention
What would you say if we could prevent half of the accidents by focusing on just 10% of locations?

The US Occupational Safety and Health Administration (OSHA) estimates that US companies spend around $150 billion on workplace safety each year.
Is this money being used effectively? Does it produce the intended results? Which initiatives are the best?
The results are “measured” only on a broad scale, in terms of total injuries and lost-time rates. Currently there is no measurement or estimation of the impact of individual initiatives. Programs and initiatives are put in place based on the judgement of senior staff, in the hope of bringing the trend down over the coming years. If that doesn’t work out, there is usually one answer: do more.
A New Way
Recent advances in Data Science and Analytics enable us to take a look under the hood, understand the drivers of injury rates, and determine the impact of each individual workplace safety initiative.
This information can be used in the following ways:
- Run safety initiatives at their optimal level
- Under budget constraints, decide which initiatives can be scaled back without loss in performance
- Focus on the most effective initiatives for increased impact
- Identify business units and job scenarios that are not using the safety initiatives effectively
- Determine quantified risk (probability of injury) of a specific business unit or job scenario
- Quantify the safety engagement of groups or individual employees
- Propose specific actions to reduce the quantified risk and increase engagement

Case in Point
A data-driven methodology was used at a leading oil and gas services company to understand and reduce driving accidents in the US.
A statistically valid model was developed that quantified the impact of each initiative intended to reduce driving accidents. Based on this, an optimal level of engagement was determined for each initiative, highlighting what can be done better to reduce driving accidents over the next year without adding more controls.
The statistical model was further used to determine the risk of each operational location for the coming month, showing that half of the accidents would happen at only 15% of the locations. We were able to determine precisely which locations those would be.
Now the safety function can focus on a fraction of the locations in a targeted approach.
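To make the idea of accident concentration concrete, here is a minimal sketch of how it could be measured once every location has a predicted risk score. It is not the model used in the study. The function names, the risk scores, and the accident counts are all hypothetical and serve only to illustrate the calculation: rank locations by predicted risk, then check what share of the observed accidents falls within the top-ranked share of locations.

```python
from typing import Dict, List, Tuple


def capture_curve(risk: Dict[str, float],
                  accidents: Dict[str, int]) -> List[Tuple[float, float]]:
    """Rank locations by predicted risk (highest first) and return, for each
    cut-off, (share of locations covered, share of accidents captured)."""
    ranked = sorted(risk, key=lambda loc: risk[loc], reverse=True)
    total = sum(accidents.get(loc, 0) for loc in ranked)
    if total == 0:
        raise ValueError("no accidents observed, nothing to measure")
    curve = []
    captured = 0
    for i, loc in enumerate(ranked, start=1):
        captured += accidents.get(loc, 0)
        curve.append((i / len(ranked), captured / total))
    return curve


def smallest_share_for(curve: List[Tuple[float, float]],
                       target: float = 0.5) -> float:
    """Smallest share of top-ranked locations that captures at least
    `target` of all accidents."""
    for location_share, accident_share in curve:
        if accident_share >= target:
            return location_share
    return 1.0


# Hypothetical data: predicted risk per location and accidents observed later.
example_risk = {"A": 0.90, "B": 0.70, "C": 0.40, "D": 0.30, "E": 0.20,
                "F": 0.10, "G": 0.10, "H": 0.05, "I": 0.05, "J": 0.02}
example_accidents = {"A": 6, "B": 4, "C": 3, "D": 2, "E": 1,
                     "F": 1, "G": 1, "H": 1, "I": 1, "J": 0}

example_curve = capture_curve(example_risk, example_accidents)
print(f"Half of the accidents fall within the top "
      f"{smallest_share_for(example_curve):.0%} of locations")
```

The steeper this curve rises at the start, the more useful the risk ranking is for targeting. A ranking with no predictive value would need roughly half of the locations to capture half of the accidents.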
The Results
In some cases the insights confirmed what senior employees “knew” by intuition and proved it with real numbers. In other cases, they shed new light on the processes in ways nobody had expected.

The Future
At the beginning of 2017, the statistical model was used to determine the probability of an accident at each operational location. By the end of the year, the locations classified as high-risk had accounted for over half of the year’s accidents.

The concentration of accidents under the high-probability flag allows the company’s safety personnel to focus on a manageable subset of locations.
Furthermore, knowing the specific impact of each safety initiative, we can propose specific actions and estimate how much improvement to expect.

In this case, a step-change improvement of 10% was identified as possible just by tuning existing controls to their optimal levels, which were determined in the first part of this study.
Other controls are in place that could not be included because the data was not available. For the controls that were considered, this approach clearly shows both their potential and their limitations.
One More Thing
The same approach was used to determine the quantified risk of each location each month. Performance from the previous month was used to calculate the risk level of the next month.
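The key step here is aligning last month's data with this month's outcome, so that the model only ever sees information that was available before the month it predicts. The sketch below shows one way to build such pairs. It rests on assumptions: the article does not specify which performance measures were used, so the record layout with a single numeric metric per location and month, and the name `lagged_pairs`, are hypothetical.

```python
from typing import List, Tuple

# Hypothetical record layout:
# (location, month index, performance metric, accident count)
Record = Tuple[str, int, float, int]


def lagged_pairs(records: List[Record]) -> List[Tuple[str, int, float, int]]:
    """Pair each location's metric from month m-1 with its accident count
    in month m. Months without data for the previous month are skipped."""
    by_key = {(loc, month): (metric, acc)
              for loc, month, metric, acc in records}
    pairs = []
    for (loc, month), (_, accidents) in sorted(by_key.items()):
        previous = by_key.get((loc, month - 1))
        if previous is not None:
            previous_metric = previous[0]
            pairs.append((loc, month, previous_metric, accidents))
    return pairs


# Hypothetical data for two locations.
example_records = [
    ("A", 1, 0.8, 0),
    ("A", 2, 0.5, 1),
    ("A", 3, 0.9, 0),
    ("B", 2, 0.3, 2),  # no month 1 for B, so no pair can be built for month 2
]

for loc, month, previous_metric, accidents in lagged_pairs(example_records):
    print(loc, month, previous_metric, accidents)
```

Each resulting row could then serve as one training example for whatever risk model is chosen, with the previous month's metric as the input and the current month's accident count as the outcome.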

The results were similarly astonishing: we found 64% of the accidents concentrated within a small subset of locations.

Conclusion
We were able to quantify the impact of individual safety initiatives, determine the optimal level of each initiative, quantify the risk of each location on a yearly and monthly basis, suggest specific actions to reduce the accident rate, and identify accident locations ahead of time.
A similar approach can be used on other accident types and job scenarios, as well as to quantify job risk and business risk overall.
Data analytics is proving to be a powerful tool for answering “non-technical” questions and providing reliable recommendations for confident, data-driven business decisions.
The following projects would be possible with the same methodology:
- Optimal Engagement: how safety and job quality initiatives are carried out most effectively (by whom, how often, when, etc.)
- Targeting Audits: the probability of a site audit uncovering non-conformances, to reduce unnecessary checks and “pencil-whipping” of mostly positive reviews
- Job Intervention: the real-time probability of undesirable operational events such as accidents or quality non-conformances (similar to credit card fraud detection upon each card swipe)
- Measuring Fatigue: the effects of long work hours and periods with little rest on performance and incident rates
www.linkedin.com/in/marekdanis
Disclaimer: All numbers presented are for illustration purposes only and encoded with preserved ratios.