The Real Value of IoT is the Data

Oliver Niedung
Published in DataReply
Feb 6, 2023 · 9 min read

The next evolution of the Internet of Things is driven by semantics, context, AI, and data exchange. How can you embrace it?

The Internet of Things (IoT) is a rapidly growing technology that connects devices to the internet, allowing them to communicate and share data. The promise was that through connectivity, everybody could easily connect, create insight, and improve operations and customer satisfaction through active feedback loops. The cloud seemed the perfect home for all the sensor data: with the snap of a finger (that is, inference), an AI would make precise predictions, or operations would be continuously optimized through reinforcement learning (RL).

Reality showed that both foreseeable and unexpected challenges limited this success:

  • Security: Most companies were “secure by non-connectivity”, and security is still a predominant cause of delay and concern.
  • Cacophony: Depending on whom you ask for IoT advice, you will get different and often conflicting answers.
  • Transformation: Digital transformation has key dimensions besides the technology transformation: business model and organizational transformation are equally essential for success.
  • Enterprise-level scalability: There are many successful do-it-yourself IoT projects with a few devices and limited scope. These projects failed to scale to large device fleets and flexible use cases.
  • Device complexity: Connected sensors and controllers typically require additional connected devices, bringing in additional management overhead.
  • Legacy: “Never change a running system” in the industry often means 7–25 years, a lifespan that IT has always had trouble with, despite all IT/OT convergence efforts. Yet the legacy systems often deliver essential data.
  • Strategic importance without executive contribution: IoT projects are crucial to a company’s reason for existence. However, due to the slow realization of value and a focus on technology rather than transformation, the involvement of C-level executives is often limited.
  • Cost: What starts as a cheap and easy proof of concept becomes a significant cost factor as it matures and scales.

Given this complexity, how will IoT continue?

The value of having data available to enable smart, connected, and near-real-time decisions is undoubted. Getting the data is any IoT solution’s key motivation and its true lasting value. Through data, information and insights are created, which allows better short- and long-term decisions. Information empowers predictive maintenance, faster interventions, and better utilization of machines. If a product breaks down or needs an upgrade, the data will point to the solution.

Ensuring secure connectivity with the cloud remains a challenge, but dedicated IoT security services offered by cloud hyperscalers and ISVs (independent software vendors) make it possible to achieve security without high cost or complex barriers. Additionally, data stored in the cloud is often compliant with ISO 27000-series standards.

Ideally, all devices should comply with the seven properties of highly secured devices, and the industry is coming closer to that level.

Cloud Services Strategy Redirection

IoT cloud providers are starting to change strategy: Google is the first, retiring Google Cloud IoT Core by August 2023. Given the competition in the cloud and the cost of scaling across fleets of devices, it is difficult for cloud providers to earn money with data ingress. Cloud providers earn money by working with the data, not by receiving and storing it; ingestion and storage have become commodities.

The risk and cost of providing dedicated and proprietary IoT PaaS (Platform-as-a-Service) became increasingly obvious to PaaS providers and customers.

There is increasing clarity about what is needed to address that risk:

  • Standards: Standards have clearly become more accepted and have reduced much duplicated effort; think of OPC UA, MQTT, the I4.0 Asset Administration Shell, or ONNX.
  • Cloud-Native Computing Foundation (CNCF): Having cloud-native solution providers work together on new standards and better collaboration enables engineers to accelerate automation, leverage new technologies, improve platforms, and simplify migration on top of open-source systems.
  • Rightsizing: Not all data is required for decision-making, but some of it is essential (feature selection). Some features are created by combining, transforming, or aggregating existing features (feature engineering). Typically, the right mix of time series and events, the best sampling frequency, and potentially hybrid edge-to-cloud approaches must be determined by experience; a minimal sketch follows this list.
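
To make rightsizing concrete, here is a minimal Python sketch using pandas; the sensor values, the 32 °C threshold, and the one-minute window are illustrative assumptions, not taken from any particular project:

```python
import pandas as pd

# Hypothetical raw telemetry: 1 Hz temperature readings from one sensor.
raw = pd.DataFrame(
    {"temperature_c": [29.8, 30.1, 30.0, 35.2, 30.1, 29.9]},
    index=pd.date_range("2023-02-06 08:00:00", periods=6, freq="s"),
)

# Feature engineering: aggregate the high-frequency series into
# one-minute windows, keeping only the statistics a model actually needs.
features = raw["temperature_c"].resample("1min").agg(["mean", "max", "std"])

# Feature selection / event extraction: forward only out-of-band readings
# (here, above a hypothetical 32 °C threshold) instead of every sample.
events = raw[raw["temperature_c"] > 32.0]

print(features)
print(events)
```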

The technology focus is shifting from transferring all data to the cloud to using the data correctly and generating new insights. Becoming a data-driven company is a CEO-committed goal for many companies, which need to enable new business models, reduce costs, attract the right talent, win new customers, and comply with sustainable development goals (SDGs). As there is no magic wand that easily creates insights from bare time series or makes predictions about a device’s health, it became obvious that “the right data” is needed and that it must be put in context.

Semantics, Ontologies, and Context

Semantics refers to the meaning or interpretation of data, and how that meaning is represented and understood by software and systems. For example, in natural language processing (NLP), semantics is used to understand the meaning of human language and how it can be processed by machines.

Ontologies represent concepts and relationships between them in a specific domain to enable machine understanding and facilitate communication between systems by providing a common vocabulary and understanding.

In summary: Standardized ontologies help to bridge the gap between the human and machine understanding of the world by providing a shared understanding of the concepts and relationships.

What does this mean for the ability to make decisions based on data?

With semantics, machines and humans understand the meaning of data, e.g. “30” means “30 °C”. The ontology then puts this into a relationship, e.g. the “30 °C” is a value from a temperature sensor connected to an industrial controller. This helps to understand and communicate “30”, but it does not yet enable smart decisions.
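
As a minimal illustration of that difference in Python (the field names and the URN below are invented for this sketch, not drawn from any standard):

```python
from dataclasses import dataclass

# A bare reading: "30" carries no meaning on its own.
raw_value = 30

@dataclass
class SensorReading:
    value: float
    unit: str           # e.g. "°C"
    quantity_kind: str  # e.g. "Temperature"
    source: str         # reference to the emitting asset in an ontology

# The same value, enriched so that machines and humans can interpret it.
reading = SensorReading(
    value=30.0,
    unit="°C",
    quantity_kind="Temperature",
    source="urn:factory:controller-7:temperature-sensor-2",  # hypothetical URN
)
```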

For decision-enabling insights, the meaning needs to be put in context, e.g.:

  • In which production line?
  • In which production cycle?
  • Who operated the machine?
  • Is the workpiece compliant?
  • How much carbon was generated during this cycle?

If you ask an experienced engineer a “meaning” question, they might first ask you for some context; only after a detailed analysis of that context can they give you a precise answer. Computer systems likewise need access to this surrounding information. Context is the surrounding information that gives meaning to data, particularly for decision-making.

Common approaches in computer science that provide context are graphs and semantic nets. There are also many digital twin approaches that provide context, such as the Industrie 4.0 Asset Administration Shell with its submodels.

Industrie 4.0 Asset Administration Shell meta-model as a graph
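
As one possible sketch of such a context graph (all identifiers and the example.org namespace are invented), the open-source rdflib library can model the surroundings of a single reading and answer context questions with SPARQL:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/factory#")  # hypothetical namespace
g = Graph()

# Context as a semantic net: the reading, its sensor, and the surroundings.
g.add((EX.reading42, EX.hasValue, Literal(30.0)))
g.add((EX.reading42, EX.measuredBy, EX.temperatureSensor2))
g.add((EX.temperatureSensor2, EX.attachedTo, EX.controller7))
g.add((EX.controller7, EX.partOfLine, EX.productionLineA))
g.add((EX.productionLineA, EX.operatedBy, EX.operatorMueller))

# "In which production line was this value measured?" becomes a traversal.
query = """
PREFIX ex: <http://example.org/factory#>
SELECT ?line WHERE {
    ex:reading42 ex:measuredBy ?sensor .
    ?sensor ex:attachedTo ?controller .
    ?controller ex:partOfLine ?line .
}
"""
for row in g.query(query):
    print(row.line)  # -> http://example.org/factory#productionLineA
```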

The Road Ahead

IoT is essential for collecting the data required for smart decisions; it will not disappear. But the data must come with semantics, must be represented in standardized ontologies, and must be put into context to enable smart decisions taken by humans, machines, or both.

The momentum from a technology perspective:

  1. Data ingress to the cloud and data storage are commodities, and data amounts are growing.
  2. Cloud hyperscalers are shifting away from dedicated PaaS toward CNCF-compliant cloud environments.
  3. There are increasing requirements for, and acceptance of, (international) standards in IT and OT (Operational Technology).
  4. The convergence of IT and OT continues. IT is innovating rapidly. The world relies on slowly innovating, robust OT to produce the required physical goods.
  5. Digital twins are on the rise. One fundamental benefit is to replace atoms with bits. Through digital twins, OT can be subject to the same innovation speed as IT. Digital twins enable simulation, continuously optimized operations through reinforcement learning (RL), real-time handling of production, and full transparency across the entire supply chain.
  6. There is demand for, and progress toward, encapsulating data in semantics, ontologies, and context.
  7. Smart decisions depend on the availability of data and context.
  8. Artificial intelligence (AI) is on the rise, and handling complexity is becoming less difficult.
  9. OT knowledge is often implicit knowledge held by engineers, and it can often be made explicit. Capturing it could help address the expected shortage of experts as the workforce ages.
  10. The industry needs to offer attractive products and services. To achieve this, companies need to reduce costs and differentiate. Smart decisions along the value chain, nested in digital feedback loops, become irreplaceable.

Technologies that companies processing IoT-related data are currently dealing with:

  • Data platform and data strategy: The main strategy of recent years was to bring the data to where the brain is: to the cloud. Large data lakes were built, but they did not deliver on the promise of ending data silos. The latest evolution is the data mesh, which treats data as a product and organizes data-focused teams around their domains.
  • Real-time event streaming / data-in-motion: Reacting in sub-second intervals to changes, even in complex environments, is another “data-driven” industry trend. It promises faster data processing, increased efficiency, improved decision-making, better customer experience, improved accuracy, and even better scalability; a minimal consumer sketch follows this list.
  • Machine learning: While most of us already use ML (machine learning) in our cellphones, video calls, games, internet searches, social media, and online shopping, there is still a massive “implementation gap” between what is possible in the industry and what is already in production. Data pre-processing still consumes most of the effort (typically 80%), and there is (for good or bad) a lot of concern about putting AI into closed-loop manufacturing. Even ML-based decision support systems are not yet universally available.
  • Semantic search: IoT solutions are designed to show machine conditions on dashboards and optimize operations within their capabilities. But they typically don’t answer essential questions outside of their scope. For this, companies implement semantic search technologies that try to understand the meaning, intent, and context of a query and return semantically related results. This is becoming a powerful way to enable additional use cases. Elasticsearch and Azure Digital Twins are prominent implementations; a product-neutral sketch also follows this list.
  • Industrial metaverse: The concept of digital twins in manufacturing is frequently referred to as the “industrial metaverse.” This term encompasses the visual and sometimes interactive experience with digital twins, as well as their virtual representation through a headless graph and process twins. The real strength of digital twins lies in their ability to replace physical objects with digital ones.
  • Trust-based data exchange: In many of my recent activities, I have come across the same pattern: When essential data is exchanged in a trusted manner (in and between public and private sectors and countries), some of the most challenging use cases in the energy sector, supply chains, and sustainable development goals (SDGs) can be solved. Many parties are looking for data collaboration and data-driven value co-creation. The World Economic Forum (WEF) published a good paper about this topic (Data Free Flow with Trust, January 2023) and a secure, sovereign system of data sharing is central to the mission of International Data Spaces.
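
To make data-in-motion concrete, here is a minimal consumer sketch using the open-source kafka-python client; the topic name, broker address, and alert threshold are illustrative assumptions:

```python
import json

from kafka import KafkaConsumer  # kafka-python client

# Hypothetical topic and broker; adjust to your environment.
consumer = KafkaConsumer(
    "machine-telemetry",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

# React to each event within moments of arrival instead of waiting
# for a nightly batch job.
for message in consumer:
    event = message.value
    if event.get("temperature_c", 0.0) > 80.0:  # hypothetical threshold
        print(f"Overheating on {event.get('machine_id')}: {event}")
```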
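And for semantic search, a product-neutral sketch using the sentence-transformers library (the checkpoint is a commonly used public model; the corpus and query are invented) shows how a query can match a document with which it shares no keywords:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical mini-corpus of maintenance log entries.
docs = [
    "Spindle motor overheated during the night shift",
    "Conveyor belt alignment corrected on line A",
    "Coolant pressure dropped below threshold on press 3",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode(docs, normalize_embeddings=True)

# The query shares no keywords with the best match, but the embeddings
# capture the intent ("too hot" ~ "overheated").
query_vec = model.encode(["Which machine got too hot?"], normalize_embeddings=True)[0]

scores = doc_vecs @ query_vec  # cosine similarity (vectors are normalized)
print(docs[int(np.argmax(scores))])  # -> the spindle motor entry
```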

What is the best thing to do now?

  1. If you haven’t yet embraced the cloud: get started immediately. This sounds obvious, but there are still many laggards.
  2. Approach the opportunity “with the end in mind”. Don’t do a purely technology-focused proof of concept in the hope that it can also become an MVP (minimum viable product). Ignore technology at this stage, and don’t tolerate others fixating on it. Rather, consider your customers’ needs, your differentiation, and cost optimization, and build the smallest possible MVP. The MVP would typically complement your current offering, not replace it. Then move into agile development.
  3. Don’t reinvent the wheel. Data ingress and data storage are established commodities, so there is no need to invest additional time or money into building alternatives.
  4. Use standards as often as possible.
  5. Analyze your data assets and derive your data strategy. Determine if there is an opportunity for value co-creation through trust-based data exchange.
  6. Go deep with sales and support to see what customers are really asking for. Focus on the generated insight that helps you get better and sell better. Consider making your customers part of your journey; they will often appreciate it.
  7. Take the best path to a solution: buy vs. build. There is much more value in buying from experts than most people think. Is a potential dependency on a third party really an unresolvable dependency?


Oliver Niedung, Data Reply

Oliver has worked on industry-leading IoT and device-centric projects at Microsoft, specialized in digital twins, and is now focusing on Data & AI at Data Reply.