Simple Applications of Multiple Regression Models — Part 1: O-Ring Erosion

Manali Shinde
One Datum At A Time
Mar 29, 2018

Welcome Readers!

Recently, I went through the process of learning (and re-learning) regression and multiple regression, the building blocks of machine learning and predictive analysis. In light of this knowledge, I thought I would make a two-part series about simple multiple regression models. In part 1, I’d like to talk about how I constructed the model and used it to make a prediction, something similar to forecasting. In part 2, I’ll move on to include more predictors, using scikit-learn, and talk a little about optimizing your model. Let’s begin!

Background

On January 28th, 1986, the Space Shuttle Challenger broke apart 73 seconds into its flight, destroying millions of dollars’ worth of equipment and taking the lives of 7 astronauts. After much investigation and speculation, it was concluded that the disaster happened because an O-ring seal in the right solid rocket booster failed at liftoff. It was discovered that the O-ring seals were not designed to function well in very cold temperatures (which were seen on this flight), so they did not seal properly, and pressurized gas made its way out of the booster and led to the shuttle’s demise.

During the investigation, data from earlier shuttle flights was used to examine how the O-rings held up under different temperatures (F) and leak-check pressures (PSI).

Below, you can observe the libraries that I will be using in this analysis, as well as the head of the data. I will be changing the column names so that they are easier to interpret. For this model, the dependent variable will be the number of O-rings under distress, since that is what we want to check for degradation, and our predictors will be temperature and leak-check pressure.
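As a rough sketch of that setup step, here is one way the loading and renaming could look with pandas. The column names and the `load_oring` helper are my own choices for illustration, and the rows in `SAMPLE` are made-up placeholders in the five-column layout described on the UCI page (O-rings at risk, O-rings in distress, launch temperature, leak-check pressure, flight order), not the real data.

```python
import io

import pandas as pd

# Made-up placeholder rows, NOT the real flight data. They only mimic the
# whitespace-separated, header-less layout of the downloaded file.
SAMPLE = """\
6 0 66 50 1
6 1 70 50 2
6 0 69 100 3
6 2 53 200 4
"""

# Readable column names (my own naming, chosen for this sketch).
COLUMNS = [
    "rings_at_risk",
    "rings_distressed",
    "temperature_f",
    "leak_psi",
    "flight_order",
]


def load_oring(source):
    """Read the whitespace-separated table and attach readable column names."""
    return pd.read_csv(source, sep=r"\s+", header=None, names=COLUMNS)


df = load_oring(io.StringIO(SAMPLE))
print(df.head())
```

To work with the real data, you would pass the path of the file downloaded from the UCI page to `load_oring` instead of the `StringIO` placeholder.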

The Regression Model

I wanted to make a predictive model to observe the O-ring degradation at different temperatures. This model could come in handy for forecasting and for predicting what an optimal temperature would be to support these O-rings. To do this, I utilized the statsmodels package and imported the ols tool. The ols, or ordinary least squares, method is a way to estimate unknown parameters in a regression model. This tool comes in handy when we want to observe the intercept and slope of a regression line in our data. The ols tool in Python gives us the intercept, the slopes, and the R-squared value, a handy measure of how much of the variation in the data our model explains.

In Python, you call the ols tool, write the formula as Y variable ~ X variable(s), pass in the data, and then tell it to fit the model. You are then presented with a neat summary table that gives you all the information you may need to construct an equation for your model.

(Note: this is the R notation. There is a Python-specific notation, but it requires an extra step, so why not use this?)
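Here is a minimal sketch of those steps using the statsmodels formula interface. The column names are my own, and the ten rows are made-up placeholder values rather than the real flight data, so the numbers it prints mean nothing beyond showing the mechanics.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Made-up placeholder values, NOT the real flight data.
df = pd.DataFrame({
    "rings_distressed": [0, 1, 0, 2, 0, 1, 0, 0, 1, 0],
    "temperature_f": [66, 70, 69, 53, 75, 58, 78, 81, 63, 72],
    "leak_psi": [50, 50, 100, 200, 200, 200, 100, 50, 200, 100],
})

# R-style formula: dependent variable ~ predictor + predictor.
model = smf.ols("rings_distressed ~ temperature_f + leak_psi", data=df).fit()

# The summary table lists the intercept, the slopes, and R-squared.
print(model.summary())

# The same coefficients as a pandas Series, handy for building the equation.
print(model.params)

# The fitted result can also predict for new predictor values.
new_flight = pd.DataFrame({"temperature_f": [31], "leak_psi": [50]})
prediction = model.predict(new_flight)
print(prediction)
```

With so few rows, statsmodels may print a warning about some of the diagnostic tests in the summary; the fit itself still runs.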

PS: you can also use scikit-learn to do this…Python is a language of many libraries and many ways to get the result you want!

The Equation and Application

Anyone who has taken high school mathematics will easily recognize the equation y = b + m1x1 + m2x2 (…), or, in regression notation:

y = B0 + B1x1 + B2x2 (…)

Here B0 is the intercept, B1 and B2 are the slopes of the predictors x1 and x2, and y is the prediction we would like to make. Luckily, our ols model has provided us with the intercept and the slopes of both the variables we want to measure (temperature and leak check pressure). Now, the only thing left to do is construct the equation and put it into use!

Below, you can observe the model/equation and a few calculations. We know that the temperature we want to make a prediction for is 31F, but we don’t know the pressure (PSI) we should be making the prediction for. Therefore, it is beneficial to make a prediction for each of the pressures we have been given in our data (0, 50, 100, 500).

I have manually used the equation to make predictions about O-ring degradation. However, when using scikit-learn, you can use the .predict method to easily make these predictions as well.
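The manual calculation can be sketched in a few lines of plain Python. The function name is my own, and the three coefficient values are placeholders I picked only so the example runs; you would replace them with the intercept and slopes from your own ols summary.

```python
def predict_distressed_rings(intercept, b_temp, b_pressure, temperature_f, leak_psi):
    """Evaluate y = B0 + B1 * temperature + B2 * pressure."""
    return intercept + b_temp * temperature_f + b_pressure * leak_psi


# PLACEHOLDER coefficients for illustration only. These are not the fitted
# values from the O-ring data; substitute the numbers from your own summary.
B0, B_TEMP, B_PSI = 3.0, -0.05, 0.002

# One prediction per leak-check pressure, all at 31F.
for psi in (0, 50, 100, 500):
    y = predict_distressed_rings(B0, B_TEMP, B_PSI, temperature_f=31, leak_psi=psi)
    print(f"{psi:>3} PSI at 31F: {y:.2f} O-rings")
```

Keeping the coefficients as arguments means the same function works unchanged once you plug in the real intercept and slopes.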

In sum, the model predicts that at all four of the tested leak pressures and a temperature of 31F, around 2 O-rings will degrade.

Simple regression models like this one are a good way to start out and practice your predictive modeling skills. The next step would be learning how to optimize the model, and you will be well on your way to constructing machine learning models!

In part two of this series, I will go deeper into making good models (i.e. optimization) and how to use scikit-learn (the more popular library for machine learning) to make your model. Tune in for more and thanks for reading!

Open Source Data taken from: https://archive.ics.uci.edu/ml/datasets/Challenger+USA+Space+Shuttle+O-Ring

More on the Challenger flight: https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster

Manali Shinde
One Datum At A Time

A health informatician and aspiring health data analyst. I am a photographer, writer, dancer, and public health advocate. Join me on my journey!