Exploring new methods of data exploration in the maintenance domain
Part 2: A vision for the future
In Part 1 we discussed how important data for maintenance may be held in a number of different silos. This data may be held in incompatible formats, with different labels and names describing common entities. Traditional database management systems hold the structured data, but there is also a whole raft of unstructured data in documents, images and video footage that may be exploited. This situation increases costs and obscures latent value that could be realised by having a shared, common data repository. Existing approaches to combining data include data lakes, with offerings such as SAP HANA, or adopting emerging standards such as MIMOSA or the expansion of ISO 10303 (PLCS) to a full lifecycle perspective. In this blog we discuss another option.
In this Maintenance Guru blog, we will show how disparate data held in differing formats can be converted and loaded into an RDFox triple store, so that we can realise insights that would not otherwise be possible. We will also describe the ‘reasoning’ capability in RDFox, which uses logic and scripted rules to derive new relationships in the data.
This blog will avoid some of the scripting and programming details in order to focus on the maintenance domain. Another version of this blog, showing the scripting detail, will be released on the Oxford Semantic Technologies blog.
A data triple is expressed as three elements: a subject -> a predicate -> an object. This can be illustrated in a graph format as:
The triple forms a small directed graph, which makes sense semantically and logically and provides a more natural way of structuring data and its relationships compared with a relational database.
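As a minimal sketch of the subject -> predicate -> object idea, a triple can be held as a three-field tuple and a triple store as a set of them. The equipment names and predicates below are illustrative assumptions, and this is plain Python rather than the RDFox API.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical facts about the filling system, for illustration only.
triples = {
    Triple("FillMotor", "isSuppliedBy", "MotorStarter"),
    Triple("FillMotor", "type", "Motor"),
    Triple("MotorStarter", "type", "Starter"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    )


# Everything the store knows about the motor.
motor_facts = match(triples, s="FillMotor")
```

Each triple is one directed edge in the graph, so asking "what do we know about the motor?" is simply a matter of collecting the edges that leave the motor node.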
If we look at how we may use graphs to model what is in a database table, we can re-use a diagram we first saw in the previous blog:
The part node corresponds to the data table that represents an ‘entity’. This may be a ‘type’ or a ‘class’ in the graph. Instances of the rows in the table appear in pink in the graph, with the relationships between nodes representing the column names or attributes of the entity. The data table may have a primary key that uniquely identifies each row of values; in the graph this is represented as the ID node that references all the other values in that row. This means data contained in relational database tables may be automatically converted into triples, provided the SQL tables or flat-file spreadsheets (or CSV files) are structured correctly.
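The table-to-graph mapping described above can be sketched in a few lines. This is an illustration under stated assumptions: the CSV content, the column names and the `rows_to_triples` helper are all hypothetical, and the primary key column is assumed to be called `id`.

```python
import csv
import io

# Hypothetical 'part' table exported as CSV; the rows are made-up sample data.
PART_CSV = """id,name,manufacturer
P001,Fill pump,Acme
P002,Fill motor,Acme
"""


def rows_to_triples(csv_text, entity_type, key_column):
    """Turn each row into triples: the key is the subject, columns are predicates."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = row[key_column]
        # The table itself becomes the type (class) of every row instance.
        triples.append((subject, "type", entity_type))
        for column, value in row.items():
            if column != key_column and value:
                triples.append((subject, column, value))
    return triples


part_triples = rows_to_triples(PART_CSV, "Part", "id")
```

Each row yields one `type` triple plus one triple per non-key column, which is why a well-structured table converts mechanically.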
To illustrate the power of using triple stores, we will first describe three separate views of a fixed plant subsystem, which are normally in different information systems.
The system of interest has the following operating context. It is an automated filling system for a liquid holding vessel which has an electrically driven pump to supply liquid to a holding tank. The liquid in the holding tank is maintained between lower and upper levels from a level indication system. There is a system controller that starts and stops the pump via its electric motor starter.
The controller also suppresses repeated starts. If starts occur too frequently, the motor winding temperatures may rise and exceed a threshold value. Induction motors draw 8 times their normal running current when they start, so the heat spike is marked. The danger is that the increase in motor temperature starts to degrade the lacquer insulation on the windings, which risks short-circuit conditions that cascade to eventual motor burn-out.
The electrical system view
Below is a diagram of the electrical distribution supplying the vessel filling system elements. The system is normally supplied from the national grid with three-phase 440 V AC, via a grid feeder air-circuit breaker (a breaker is essentially a large switch). The diesel generator, which has its own supply breaker, is used as a back-up if the grid supply fails. The electrical system has under-voltage, over-voltage and over-current protection devices on all the breakers and starters. Fuses are also included for over-current protection.
The value of the trip protection is set by rules of ‘electrical discrimination’ such that if a fault occurs on a downstream device the protection on the next upstream device will trip before any other, thereby protecting as much of the fault-free circuitry as possible. This rule helps us model the dependencies in the system.
Having an electrical distribution drawing makes it possible to describe how power is distributed from sources, via isolation devices such as breakers and fuses, to end services such as machinery, actuators or electronics.
The electrical distribution diagram is already a type of graph: equipment is connected by supply cabling from sources (Grid and Diesel Generator) to end equipment such as the fill motor and sensors. It should be possible to extract these relationships from CAD drawings or other specialised drawing applications.
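To make the "diagram as graph" point concrete, here is a small sketch that stores the distribution as supply edges and walks upstream from an item to its power sources. The exact topology is an assumption for illustration (in particular the `Switchboard` node and the order of the controller, starter and fuses); the real drawing may differ.

```python
# Each item maps to the items that supply it (assumed topology, for illustration).
SUPPLIED_BY = {
    "GridFeederBreaker": ["GridSupply"],
    "GeneratorBreaker": ["DieselGenerator"],
    "Switchboard": ["GridFeederBreaker", "GeneratorBreaker"],
    "MainFuses": ["Switchboard"],
    "Controller": ["MainFuses"],
    "MotorStarter": ["Controller"],
    "FillMotor": ["MotorStarter"],
    "LevelSensor": ["Switchboard"],
}


def upstream_sources(item, supplied_by):
    """Follow supply edges upstream and return the nodes that have no supplier."""
    sources, seen, stack = set(), set(), [item]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        suppliers = supplied_by.get(node, [])
        if not suppliers and node != item:
            sources.add(node)
        stack.extend(suppliers)
    return sources


motor_sources = upstream_sources("FillMotor", SUPPLIED_BY)
```

With this assumed topology the fill motor traces back to both the grid supply and the diesel generator, which matches the back-up arrangement described in the text.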
The process view
The following diagram shows a state machine that describes the filling process for the tank. State machines are a commonly used tool in Systems Engineering and are one of the diagram types in the SysML modelling language. Control system software specifications are often based on state machines and sequence diagrams.
This diagram essentially reproduces the logic inside the fill-system controller, but clearly shows the cyclic process of our system of interest under normal and abnormal states. Many SysML models should be able to be decomposed into data elements that can be structured and imported into a triple store.
The diagram shows the controller starting and stopping the motor-pump by opening and closing its starter air-circuit contactor in normal operation. If a motor over-temperature or over-current condition exists, the contactor is tripped, which puts it into a tripped state (open, but not cocked to reclose). The contactor cannot be reset to the open position in the idle state until the trip conditions are cleared, after which the normal operational cycle can recommence. Each of the steps, and the equipment it involves, has an ‘effect’ on the filling process.
We can build a generalised ontology that models the relationships between processes and equipment, and use it to populate the filling system data. We can use an ‘affects’ predicate between the equipment and the process. For example, the Filling process is affected by the level sensor, starter contactor, motor and pump. The data elements contained in the state machine diagram may be converted to the triple format. Part of the affects -> process view can be seen in the figure below.
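A minimal sketch of the ‘affects’ relationship follows, using the equipment and processes named in this blog. The identifiers (for example `MotorStarter` and `FillPump`) are assumed spellings, and the helper functions are hypothetical.

```python
# 'affects' triples: equipment -> affects -> process.
AFFECTS = [
    ("LevelSensor", "affects", "Filling"),
    ("MotorStarter", "affects", "Filling"),
    ("FillMotor", "affects", "Filling"),
    ("FillPump", "affects", "Filling"),
    ("MotorStarter", "affects", "Idle"),
    ("MotorStarter", "affects", "Trip"),
]


def equipment_affecting(process, triples):
    """Which equipment items affect the given process?"""
    return sorted(s for s, p, o in triples if p == "affects" and o == process)


def processes_affected_by(equipment, triples):
    """Which processes does the given equipment item affect?"""
    return sorted(o for s, p, o in triples if p == "affects" and s == equipment)


filling_equipment = equipment_affecting("Filling", AFFECTS)
starter_processes = processes_affected_by("MotorStarter", AFFECTS)
```

The same set of triples answers the question in both directions, from process to equipment and from equipment to process, without any change to the data.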
The equipment view
The following diagram shows a physical breakdown of the tank filling system, with the functionally significant equipment items at the lower levels. This type of breakdown is often seen in FMEAs. The data in this view is held separately from the electrical view but corresponds to it, and it adds functional location identifiers. Each node on the right of the taxonomy below is a maintainable item.
The equipment view contains electrically supplied components that have similar equipment breakdowns, and dependency data as shown in the electrical system view.
This is a graph view of the equipment data. Because the equipment view was originally held in SQL, it was not difficult to write a Python script to convert it to a triple format. The colour coding in the figure below shows the relationships between RDF types and instances. If we drew all the relationships within the figure, it would look like spaghetti.
Having seen and described the three views we want to combine, we now turn to the objectives we set.
The first objective is to take these three views and combine them into a single triple data store.
Having converted each of the views into Turtle scripts, and ensured that we have defined common RDF types to classify the same data attributes in all three views, we can import the data into a common data store. With that, much of the work of combining the electrical and mechanical equipment data is already achieved.
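The sketch below illustrates why shared identifiers matter when the views are combined. Three small sets of triples stand in for the electrical, process and equipment views; the facts, the `partOf` predicate and the namespace URL are illustrative assumptions, and the `to_turtle` helper only covers the simple case where every object is another resource.

```python
# One tiny, hypothetical extract from each view. Note the shared 'FillMotor' name.
ELECTRICAL_VIEW = {("FillMotor", "isSuppliedBy", "MotorStarter")}
PROCESS_VIEW = {("FillMotor", "affects", "Filling")}
EQUIPMENT_VIEW = {
    ("FillMotor", "type", "Motor"),
    ("FillMotor", "partOf", "FillSystem"),
}


def to_turtle(triples, namespace="http://example.org/maintenance#"):
    """Serialise resource-to-resource triples as simple Turtle statements."""
    lines = [f"@prefix : <{namespace}> ."]
    for s, p, o in sorted(triples):
        predicate = "a" if p == "type" else f":{p}"  # 'a' is Turtle shorthand for rdf:type
        lines.append(f":{s} {predicate} :{o} .")
    return "\n".join(lines)


# Combining the views is a set union once the names agree.
combined = ELECTRICAL_VIEW | PROCESS_VIEW | EQUIPMENT_VIEW
motor_predicates = sorted({p for s, p, o in combined if s == "FillMotor"})
turtle_text = to_turtle(combined)
```

Because all three views use the same identifier for the motor, the combined store links its supply, its process effects and its place in the equipment breakdown with no extra join logic.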
The second objective is to use rules to extend the data and use a query to address many fundamental questions.
For example, a question might be:
What system processes are affected if I safely isolate the fill system pump motor for maintenance?
This question also relates to data we would want in an FMEA: the effects of the pump failure on the processes, and the resulting functional failures, given the close relationship between required processes and functions.
Answering this question depends on understanding the motor and its relevant relationships in the triple store. The query also relies on a business rule that is not yet expressed in the triple store. Safely isolating the motor from its electrical supplies may be governed by an internal safety rule stating that electrical machinery must be isolated by removing the supply fuses and tagging the associated operating switches. This rule ensures that we avoid a possible latent-process failure and significantly reduces the chance of a maintainer being accidentally electrocuted.
The following diagram shows the graph that is directly relevant to this query, where the Supplied_by chain from the motor to its nearest upstream set of fuses is shown.
We electrically isolate the motor by removing the next upstream fuses. We will need to consider the other machinery that is isolated at the same time: the starter and the controller. These other Fitted_equipment items may have effects on other processes. The starter affects the Filling, Idle and Trip processes, whilst the motor affects the Filling process.
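The reasoning in the paragraph above can be sketched as three steps: find the nearest upstream fuses, find everything those fuses supply, then collect the processes that equipment affects. This is a simplified illustration in plain Python, not the SPARQL query itself. It assumes a single supply chain (motor, starter, controller, main fuses) with one supplier per item, and the identifiers are hypothetical.

```python
# Assumed supply chain with one supplier per item (a simplification).
SUPPLIED_BY = {
    "FillMotor": "MotorStarter",
    "MotorStarter": "Controller",
    "Controller": "MainFuses",
}
EQUIPMENT_TYPE = {
    "FillMotor": "Motor",
    "MotorStarter": "Starter",
    "Controller": "Controller",
    "MainFuses": "Fuse",
}
# 'affects' facts from the text; none are stated for the controller.
AFFECTS = {
    "FillMotor": {"Filling"},
    "MotorStarter": {"Filling", "Idle", "Trip"},
}


def nearest_upstream_fuses(item):
    """Walk up the supply chain until an item of type Fuse is found."""
    node = SUPPLIED_BY.get(item)
    while node is not None and EQUIPMENT_TYPE.get(node) != "Fuse":
        node = SUPPLIED_BY.get(node)
    return node


def isolated_by_removing(fuses):
    """Everything downstream of the fuses loses supply when they are removed."""
    isolated, frontier = set(), [fuses]
    while frontier:
        current = frontier.pop()
        for item, supplier in SUPPLIED_BY.items():
            if supplier == current and item not in isolated:
                isolated.add(item)
                frontier.append(item)
    return isolated


def affected_processes(item):
    """Return the fuses to pull, the equipment isolated, and the processes hit."""
    fuses = nearest_upstream_fuses(item)
    isolated = isolated_by_removing(fuses) if fuses else {item}
    processes = set()
    for equipment in isolated:
        processes |= AFFECTS.get(equipment, set())
    return fuses, isolated, processes


fuses, isolated, processes = affected_processes("FillMotor")
```

Under these assumptions, isolating the motor means pulling the main fuses, which also isolates the starter and controller, so the Filling, Idle and Trip processes are all affected.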
It is worthy of note that in the previous paragraph we have used the English language to describe business rules and the relationships that very closely match the semantics that the triples represent. The triples are both computer and human friendly.
At this point, it is worth introducing the idea of transitive relationships in the graph. We can use rules to create these transitive relationships in the database, which helps query performance.
If we use the graph directly above, we can see that the motor links to the starter through an isSuppliedBy relationship. The starter is similarly linked to the controller and main fuses. The motor is therefore transitively linked to the motor main fuses and the controller. We can take advantage of this by defining the transitive relationships via a rule.
The transitive relationships from the motor, shown in red in the graph above, make it faster and easier to query for the isolating fuses than traversing several isSuppliedBy links one at a time. All of the new relationships were formed using a single rule.
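To show what a single transitivity rule does, here is a plain-Python sketch that repeatedly applies "if A isSuppliedBy B and B isSuppliedBy C, then A isSuppliedBy C" until nothing new can be derived. It illustrates the idea only and is not RDFox rule syntax; the supply chain is the same assumed motor, starter, controller and main fuses chain.

```python
# Directly asserted isSuppliedBy links (assumed chain, for illustration).
ASSERTED = {
    ("FillMotor", "MotorStarter"),
    ("MotorStarter", "Controller"),
    ("Controller", "MainFuses"),
}


def apply_transitive_rule(pairs):
    """Apply the transitivity rule until no new links are inferred (a fixpoint)."""
    closure = set(pairs)
    while True:
        inferred = {
            (a, d)
            for (a, b) in closure
            for (c, d) in closure
            if b == c
        } - closure
        if not inferred:
            return closure
        closure |= inferred


all_links = apply_transitive_rule(ASSERTED)
new_links = all_links - ASSERTED
```

Once the inferred links are stored, finding the fuses that supply the motor becomes a single lookup of a direct link rather than a walk along the whole chain.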
So what? What have we achieved?
This small demonstration uses three separate sources of data about a subsystem and shows the following:
- Through the conversion of the given data in three separate formats, we have formed a baseline triple store that contains the data in a single database.
- We have used reasoning and rules to infer new relationships in the triple store that link data from different views of a system.
- We have queried the resulting triple store to extract information that is valuable in real-life maintenance circumstances.
- The triple store could be further developed to form the basis of an FMEA and other useful engineering frameworks. We have shown that using the affects property, we can determine the processes affected by a machine, which indicates the effects of failure in an FMEA.
All of this might have been achieved through more conventional means using relational databases, but the level of effort required would have been greater than using the triple store. The inherent reasoning to derive new relationships and reuse of the data is game-changing. The triple store data schemas are inherent in the data itself; there are none of the primary or foreign keys of the SQL relational model. Extending the data and relations does not impact the original schema, and therefore updates are much easier and more efficient.
The use of RDFox also allows the system to naturally scale to many millions of triples while retaining high performance.
The downsides of using the graph approach include a learning curve to understand how to use SPARQL as a query language and how to exploit the RDF and RDFS libraries. RDFox does have an add-on module called a faceted search that could shield an inexperienced user from the complexities of SPARQL, which can be found here.
In the future, the use of triple stores could be significantly enhanced if the engineering and academic communities could define template ontologies as open standards to describe equipment and engineered products within their full lifecycle context.
I would be interested in being involved in defining an open standard for an FMEA ontology that is suitable for defining maintenance as I think industrial adoption of the graph technology could be simplified by having access to open-source ontologies.
Additionally, I would welcome hearing about any reader’s experience of using OWL, RDF triples, linked data or ontologies in the maintenance engineering domain. A future blog may return to this subject and show how we can use rules and reasoning to derive other FMEA-type data from this data set.
In the next blog, we will describe on-condition maintenance in more depth and address some common misunderstandings. RCM may recommend many more on-condition inspection tasks, and a common criticism I have heard is this: why are so many inspections done that do not find any failures? Isn’t this wasteful? Does this waste call the effectiveness of RCM into question? We can answer this question and refute the criticism, as well as describe how Predictive Maintenance belongs in the family of on-condition maintenance.