Predictive Maintenance — a dive into DB2 and Watson Studio

Vincent Cheng
7 min read · May 20, 2020

--

MarketsandMarkets forecasts the global predictive maintenance market to grow from USD 3.0 billion in 2019 to USD 10.7 billion by 2024.

Predictive maintenance is a strategy that directly monitors the condition of equipment to detect when maintenance is needed, minimizing unplanned failures. It is one of the top applications of artificial intelligence and machine learning. Predictive maintenance is generally thought to be most applicable to the manufacturing industry, since any equipment downtime is very costly to a manufacturer. By the same token, servicing equipment unnecessarily is also expensive, since you may be paying someone to inspect equipment that is functioning perfectly.

For this tutorial, you will need to download this hard drive data set, which includes the date, serial number, model, failure status, and a number of SMART parameters. SMART, which stands for Self-Monitoring, Analysis and Reporting Technology, is a monitoring system built into most modern hard drives. These systems are in place to detect various reliability problems at an early stage, giving warning signs well before the hard drive fails. By the end of this lab, you will be able to identify the likelihood of failure for a specific hard drive model.
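
Before loading anything into Db2, it can help to glance at the data locally. The following is a minimal sketch and not part of the original lab: it assumes the CSV has header columns named model and failure (with failure set to 1 when a drive fails and 0 otherwise), and it uses a tiny inline sample with made-up rows instead of the real file.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for the real CSV. The column names
# (date, serial_number, model, failure, smart_1_raw) are assumptions.
SAMPLE_CSV = """date,serial_number,model,failure,smart_1_raw
5/1/2020,A1,Hitachi HDS5C3030,0,0
5/1/2020,A2,Hitachi HDS5C3030,1,12
5/1/2020,B1,Example Model X,0,7
5/1/2020,B2,Example Model X,0,3
"""


def failure_rate_by_model(csv_file):
    """Return {model: fraction of rows where failure == 1}."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["model"]] += 1
        failures[row["model"]] += int(row["failure"])
    return {model: failures[model] / totals[model] for model in totals}


rates = failure_rate_by_model(io.StringIO(SAMPLE_CSV))
for model, rate in sorted(rates.items()):
    print(f"{model}: {rate:.2f}")
```

To try this on the downloaded file, pass open("EoM_HardDriveData.csv", newline="") instead of the inline sample, adjusting the column names if your copy differs.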

Below are step-by-step instructions that go over how to import that data into DB2 and connect it to Watson Studio to generate an automated machine learning pipeline — all on IBM Cloud.

1. IBM Db2: Create a DB2 instance

IBM® Db2® Warehouse on Cloud is a managed public cloud service. You can also set up IBM Db2 Warehouse on premises with your own hardware or in a private cloud. As a data warehouse, it includes features such as in-memory data processing and columnar tables for online analytical processing (OLAP). These deployment options share a common database engine, so your data workloads can be moved and optimized with ease.

To get started, sign up or log in to your IBM Cloud account at https://www.ibm.com/cloud
Once you are signed in, search for Db2 in the search bar.
In the Create tab, select Dallas as your region. You can complete this lab with a FREE Lite plan for your DB2 instance.
Once the instance has been created, you will be automatically guided to your Db2 Getting Started page.
You will need to create a service credential to connect your Db2 instance to Watson Studio later in the lab. On the navigation panel, go to Service credentials and select New Credential.
You may rename the credential or leave as is. Then click Add.
Now that you have successfully created a service credential, you will be able to connect an app or external consumer. Leave this page open in your browser and duplicate the tab.
Next, go to Manage, then Open Console.
In the top left navigation panel click Load, then Load Data.
Use the Load Data screen to load a single delimited text file (CSV) from your computer to the system.
In the Source stage, choose what type of data you have and whether it is on your local system or in online object storage. You can drag the EoM_HardDriveData.csv file into the File Selection box or browse your local computer to upload it.
In the Target stage, select where you want to put your data. It can be an existing table or one you create on the fly. You will select a schema to create your table in. Each Db2 instance has a unique schema name, so your schema name will NOT be the same as the one in the screenshot. Select the schema with a similar 8-character name (NOT one that starts with DB2 or SQL).
In the Define stage, you have the option to change the code page, reformat columns, separate columns and prepare the database. Notice the error symbol on the right. To fix this error, click the down arrow in the Date Format field and select the format M/D/YYYY.
In the Finalize stage, review your selections before you start loading your data. No changes are needed, so click Begin Load.
Db2 will take a few seconds to load your data. Once it finishes, you have successfully loaded data into Db2. If you receive an error, make sure you have selected the appropriate schema. You may need to retry using another schema listed in the Target stage.
(OPTIONAL) TRY IT YOURSELF: Repeat the previous steps to append more hard drive data (e.g., from the 2013-HardDriveFailure folder, try appending the 2013-05-01.csv file).
Click View Table once you finished uploading your data.
As you can see in the table, the hard drive dataset includes the date, serial number, model, failure and a number of S.M.A.R.T. parameters represented. The availability of these parameters can depend on the specific vendor and model of hard drive — therefore, we will see missing values represented as 0’s in some S.M.A.R.T. parameter columns where the hard drive vendor did not supply them.
Use the Connections screen to monitor all available connections to the database and which application is using each connection. Under the navigation panel, select Connection Information. You will need these connection details later in the lab.
Select Without SSL and leave this page open. Duplicate your tab once more so you can refer back to this connection information later in the lab.
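
If you later want to reach the same database from code rather than from the Watson Studio UI, the service credential and connection information above carry everything a driver needs. The sketch below is an illustration under stated assumptions, not part of the lab: the dictionary keys (db, hostname, port, username, password) are assumed to match the fields in your service credential JSON, and every value shown is a placeholder. It only assembles a connection string in the key=value format used by the ibm_db Python driver; it does not open a connection.

```python
def build_db2_dsn(credentials, use_ssl=False):
    """Assemble an ibm_db-style connection string from a credentials dict.

    The key names are assumptions; check them against your own
    service credential JSON before relying on this.
    """
    parts = [
        f"DATABASE={credentials['db']}",
        f"HOSTNAME={credentials['hostname']}",
        f"PORT={credentials['port']}",
        "PROTOCOL=TCPIP",
        f"UID={credentials['username']}",
        f"PWD={credentials['password']}",
    ]
    if use_ssl:
        # Only needed when connecting to the SSL port.
        parts.append("SECURITY=SSL")
    return ";".join(parts) + ";"


# Placeholder values only; copy the real ones from your service credential.
example_credentials = {
    "db": "BLUDB",
    "hostname": "example-host.example.com",
    "port": 50000,
    "username": "abc12345",
    "password": "not-a-real-password",
}

dsn = build_db2_dsn(example_credentials)
# Mask the password before printing so it never lands in logs.
print(dsn.replace(example_credentials["password"], "****"))
```

With the ibm_db package installed, a string like this would typically be passed as ibm_db.connect(dsn, "", ""). Keep real credentials out of source control.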

2. IBM Watson Studio: Let’s Create a Project

Watson Studio provides you with the environment and tools to solve your business problems by collaboratively working with data. You can choose the tools you need to analyze and visualize data, to cleanse and shape data, or to create and train machine learning models.

To get started, sign up or log in to your Watson Studio account on IBM Cloud at https://www.ibm.com/cloud/watson-studio
Once you have successfully signed in, Create a project to get started.
Select Create an empty project.
Name your project Predictive Maintenance and add a description.
Once you've created your project, you will see an overview of your project dashboard. Take a moment to explore the dashboard by going through each tab.
After exploring the different tabs, select Assets, then Add to project.
Select Connection.
You can see a number of connections to third-party services as well as IBM services that you can connect to. Select Db2.
Input your Db2 connection details using the service credentials and the connection information (see the screenshots below). Deselect the option "The port is configured to accept SSL connections", then click Create.
Use the DB2 Service Credential above.
Use the DB2 Connection Information above.
After your Db2 connection has been successfully connected, it will be listed under your data assets.
Next, you will connect to your data tables. Click Add to project and select Connected Data.
Name the connected data asset Db2_PredictiveMaintenance and click Select source.
Select Db2_PredictiveMaintenance, then your unique schema name, and then the DB2_PREDICTIVEMAINTENANCE table.
Add Db2_PredictiveMaintenance as the name and add a description. Then click Create.
Now you will see that you have successfully connected to your tables in Db2. Click the Db2_PredictiveMaintenance data asset to explore the details of this table.
From here, you can see the top 1000 rows. A very handy feature of Watson Studio is the Refine capability, which enables you to clean, prepare, and transform your dataset.
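
To make "clean, prepare, and transform" a little more concrete, here is a rough local equivalent of two preparation steps this dataset invites: normalizing the M/D/YYYY dates and spotting S.M.A.R.T. columns that are 0 in every row (which, as noted earlier, can mean the vendor did not supply them). This is only a sketch in plain Python, not what Refine does internally; the column names and sample rows are made up for illustration.

```python
import csv
import io
from datetime import datetime

# Made-up rows; column names are assumptions about the CSV header.
SAMPLE_CSV = """date,serial_number,model,failure,smart_1_raw,smart_5_raw
5/1/2020,A1,Hitachi HDS5C3030,0,0,0
5/2/2020,A1,Hitachi HDS5C3030,0,15,0
"""


def prepare_rows(csv_file):
    """Convert M/D/YYYY dates to ISO format and cast numeric columns to int."""
    rows = []
    for row in csv.DictReader(csv_file):
        cleaned = dict(row)
        parsed = datetime.strptime(row["date"], "%m/%d/%Y")
        cleaned["date"] = parsed.date().isoformat()
        cleaned["failure"] = int(row["failure"])
        for name, value in row.items():
            if name.startswith("smart_"):
                # Treat an empty cell as 0, matching how the table shows gaps.
                cleaned[name] = int(value or 0)
        rows.append(cleaned)
    return rows


def all_zero_smart_columns(rows):
    """List the smart_* columns whose value is 0 in every row."""
    if not rows:
        return []
    smart_names = [name for name in rows[0] if name.startswith("smart_")]
    return [name for name in smart_names if all(r[name] == 0 for r in rows)]


rows = prepare_rows(io.StringIO(SAMPLE_CSV))
print(rows[0]["date"])
print(all_zero_smart_columns(rows))
```

A column that is 0 everywhere for a given model carries no signal for that model, so it is a reasonable candidate to review before training.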

3. IBM Watson Studio: AutoAI (Watson Machine Learning)

AutoAI is a graphical tool in Watson Studio that automatically analyzes your data and generates candidate model pipelines customized for your predictive modeling problem. These model pipelines are created over time as AutoAI analyzes your dataset and discovers data transformations, algorithms, and parameter settings that work best for your problem setting. Results are displayed on a leaderboard, showing the automatically generated model pipelines ranked according to your problem optimization objective.

Back in the Predictive Maintenance Project, re-upload the EoM_HardDriveData.csv data.
To start an AutoAI project, select Add to project and AutoAI.
Name your AutoAI experiment Hard Drive Failure AutoAI and give it any description you'd like. You will need to create a Watson Machine Learning service instance, so click Associate a Machine Learning service instance.
After you click Associate a Machine Learning service instance, select WatsonMachineLearning in the dropdown menu.
Select Reload and your instance of WatsonMachineLearning will be created. Click Create.
Click Browse and select the EndofMonth.csv file
Select Asset once you have selected the appropriate file
Under Configure details, select failure as your prediction column. Explore the Experiment settings, and leave everything at the default setting once you are done. Then click Run experiment to begin model pipeline creation.
The Relationship Map infographic shows you the creation of pipelines for your data. The duration of this phase depends on the size of your data set. In this case, it can take up to about 40 minutes to complete the experiment. You can explore other parts of Watson Studio while the pipelines build.
You can click Swap view on the right to see the Progress map of your model, which better shows which steps the AutoAI tool is going through. The Progress map shows the entire pipeline of what Watson is doing with the data.
Scroll down to see the highest-ranked pipelines displayed in a leaderboard. The leaderboard lets you view more information about each pipeline and save selected model pipelines after reviewing them. As you can see, Pipeline 1 performed the best, with an accuracy of 1.0 and the fastest build time.
In the Model Evaluation section, you can see a summary of the pipeline, including Model Evaluation, Confusion Matrix, and a Precision Recall Curve, which were used to evaluate the resulting pipeline model. Select Feature Importance to see some of the indicators for failure.
Feature importance shows some of the indicators for hard drive failure. According to this AutoAI experiment, the feature smart_1_raw was identified as the biggest indicator of failure. After exploring your model, go ahead and save it.
The model name is automatically generated. Click Save.
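
To read the evaluation page with more confidence, it helps to know how accuracy, precision, and recall fall out of a confusion matrix. The sketch below computes them from two short lists of labels. The labels are invented for illustration and are not results from this experiment; 1 stands for a failed drive and 0 for a healthy one.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(actual, predicted):
    """Return accuracy, precision, and recall derived from the counts."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Invented labels: 8 healthy drives, 2 failures, one failure missed.
actual = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

metrics = summarize(actual, predicted)
print(metrics)
```

In this toy case accuracy is 0.9 even though half of the failures were missed (recall 0.5), which is why it is worth looking at the Confusion Matrix and Precision Recall Curve alongside the accuracy score when failures are rare.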

4. IBM Watson Studio: Deployment

After you train and save a model, you will create a deployment space so you can embed the model into your applications. Find your saved model in the Watson Machine Learning Models on your project assets page.
Select the Deployments tab, then click Add Deployment.
Name the deployment space as Predictive Maintenance Deployment Space and click Save
Your deployment space will take a few seconds to become ready. Once the status is ready, click your deployment space to see more details.
In the overview tab, you can see the different details about your deployment space
In your deployment space, AutoAI has automatically created code snippets in Java, JavaScript, Python, and Scala. To interact programmatically with an AutoAI deployment, refer to the deployment syntax in the Watson Machine Learning Python Client Library.
Alternatively, on the Test tab of the deployment details page, you can test your deployment by entering JSON-formatted payload data in the input data box. Input 5/1/2020 for date and Hitachi HDS5C3030 for model.
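
For reference, here is a sketch of what a JSON payload for the test box might look like, built in Python so the structure is easy to check. It assumes the fields/values layout under input_data that Watson Machine Learning online deployments generally accept, and it includes only the two inputs named above. A real deployment will likely expect every column used in training, so treat the field list as an illustrative placeholder.

```python
import json


def build_scoring_payload(fields, rows):
    """Build a scoring payload with one 'fields' list and matching value rows."""
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("each row must have one value per field")
    return {
        "input_data": [
            {
                "fields": list(fields),
                "values": [list(row) for row in rows],
            }
        ]
    }


# Only the two inputs mentioned in the step above; the remaining
# training columns are omitted here and would need to be added.
fields = ["date", "model"]
rows = [["5/1/2020", "Hitachi HDS5C3030"]]

payload = build_scoring_payload(fields, rows)
print(json.dumps(payload, indent=2))
```

The printed JSON can be pasted into the input data box, or sent to the deployment's scoring endpoint using one of the generated code snippets.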

Congratulations!

You have successfully finished the lab and deployed an automated predictive maintenance model with DB2 and Watson Studio.

IBM is helping companies across industries apply predictive maintenance to improve business performance. Check out these 5 IBM client examples demonstrating how predictive maintenance in the cloud is helping businesses from five different industries excel.

Feel free to connect with me on LinkedIn or email me directly at vincent.cheng@ibm.com if you have any questions about this lab.

Check out this tutorial by Parker Merritt where he goes over how to connect IBM Cloud Pak for Data to an Amazon Web Services S3 data source to prepare data for analysis, and generate a similar automated AI pipeline.

Disclaimer: All data collected from this tutorial was pulled directly from an external public source and used for informational purposes only.

Vincent Cheng

Data & AI Technology Specialist at IBM interested in: Technology, analytics, design and entrepreneurship