Data Validation: The Example of Energy Data

George Sidiras
6 min read · May 29, 2024


The transition to the digital era is not easy. There are difficulties, but the benefits will be substantial. However, there are ethical considerations, such as privacy, ownership, and the responsibilities that individuals will have in the digital world. The analog system of the physical world will be transformed into a digital system that represents our world as convincingly as possible.

Digitizing analog media

In addition, one of the benefits of this transition will be the reuse of data through catalogs and dictionaries. Citizens will be able to share their data as they choose. Data becomes valuable through reuse, and data that is not used loses its value. Data is valuable because it can be used to solve problems, make decisions, and improve our understanding of the world around us. The more reusable a dataset is, the more valuable it becomes, because a reusable dataset can serve a variety of purposes, for different people, at different times.

The data space will probably be the medium for this transition. We need data, a lot of data, to represent the physical entities.

Data validation can help to achieve technical interoperability in data spaces by transforming the input data and imposing the governance needed for the transaction.

Transforming the input data means converting the data into a format that is compatible with the data space. This may involve changing the data type, structure, or encoding. Data transformation can also be used to remove any sensitive data or to comply with data protection regulations. Imposing the governance needed for the transaction means ensuring that the data is shared and used in accordance with the data space’s governance policies. This may involve checking the permissions of the parties involved in the transaction, logging the transaction, and enforcing any other relevant governance rules.
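The two steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the list of sensitive fields, the permitted parties, the input timestamp format, and the choice to treat timestamps as UTC are all hypothetical and do not come from any data space specification.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dataspace")

# Hypothetical governance policy (assumption, not from a real data space).
SENSITIVE_FIELDS = {"owner_name", "address"}
ALLOWED_PARTIES = {"community-a", "grid-operator"}


def transform(record):
    """Convert a raw meter reading into the assumed data space format."""
    # Remove sensitive data before anything leaves the owner.
    cleaned = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
    # Change the data type: the reading arrives as text, we store a number.
    cleaned["energy_kwh"] = float(cleaned["energy_kwh"])
    # Change the encoding: normalise the timestamp to ISO 8601.
    # Assumption: the input timestamps are already in UTC.
    ts = datetime.strptime(cleaned["timestamp"], "%d/%m/%Y %H:%M")
    cleaned["timestamp"] = ts.replace(tzinfo=timezone.utc).isoformat()
    return cleaned


def share(record, sender, receiver):
    """Apply governance: check permissions, transform, log the transaction."""
    if sender not in ALLOWED_PARTIES or receiver not in ALLOWED_PARTIES:
        raise PermissionError(f"{sender} -> {receiver} is not permitted")
    result = transform(record)
    log.info("shared record from %s to %s", sender, receiver)
    return result


raw = {
    "owner_name": "A. Citizen",
    "address": "Main St 1",
    "energy_kwh": "12.5",
    "timestamp": "29/05/2024 14:30",
}
shared = share(raw, "community-a", "grid-operator")
print(json.dumps(shared))
```

The permission check runs before the transformation, so a refused transaction never touches the data. In a real data space the policy would come from the governance framework rather than from constants in the code.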

Data Validation

Let’s take the example of energy communities to examine the difficulties that arise. In my view, energy communities will play an important role in the energy transition, in which energy management can be transferred to the local level. Citizens will be able to manage and share their energy records, and their energy locally, if they wish. To achieve this, energy communities need to be properly established so that enough people can use them. On the one hand, there are standards such as the Common Information Model (CIM) and the Common Grid Model Exchange Standard (CGMES) that provide a common framework for digitization in the energy market. On the other hand, there are challenges such as the cost of digitization, the different models used in different countries, and the upgrades made to the CIM.

Now, from a more technical perspective:
The int:net website shows that the current installation is on CGMES version 2.4 and CIM version 1. The latest version of CGMES is 3.0, which supports semantic versioning. This means that each version of CGMES will produce a different dataset. There is a CIMXML version, a JSON-LD version, and tools to convert JSON-LD to JSON.

As an example, digital twins can represent energy as two graphs over the same dataset: a forward graph and a reverse graph. A forward graph represents the flow of energy from generation to consumption. It includes all of the energy assets and systems involved in the energy supply chain, such as power plants, transmission lines, distribution networks, and customer premises. A reverse graph represents the flow of energy from consumption to generation. It includes all of the energy assets and systems involved in energy storage, demand response, and distributed energy resources (DERs). The JSON-LD will have the header from DCAT. This will be useful for seeing how to implement these validations. In the future, we will need the vocabulary for the IDSA information model and the DCAT catalogs.

Here are some additional thoughts. The different versions of CGMES can be a challenge for energy communities that need to integrate and share data. The use of digital twins to represent energy is a promising approach, as it could help energy communities to better understand and manage their energy systems. The development of the IDSA information model and the DCAT catalogs will be important for ensuring that energy communities can share data in a standardized and interoperable way. Overall, I think that the challenges facing energy communities in terms of data validation and governance are significant, but that there are also promising developments underway.

All of this data and information has to be validated, and the validation has to follow DevOps, DataOps, and DataSecOps practices, meaning that it has to be automated. SHACL will be used for the validation.
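The forward and reverse graphs mentioned above can be illustrated with a small Python sketch. The asset names are hypothetical, and the reverse graph is modelled here simply as the same edges with their direction flipped. That is an assumption made for illustration, not a statement about how CGMES or any digital twin product stores its data.

```python
# Hypothetical assets; each edge points in the direction of energy flow.
forward = {
    "power_plant": ["transmission_line"],
    "transmission_line": ["distribution_network"],
    "distribution_network": ["customer_premises"],
    "customer_premises": [],
}


def reverse_graph(graph):
    """Build the reverse graph: same nodes, every edge flipped."""
    reverse = {node: [] for node in graph}
    for source, targets in graph.items():
        for target in targets:
            reverse[target].append(source)
    return reverse


def walk(graph, start):
    """Return the nodes reachable from start, in depth-first visiting order."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.append(node)
            stack.extend(graph[node])
    return seen


reverse = reverse_graph(forward)
print(walk(forward, "power_plant"))        # generation -> consumption
print(walk(reverse, "customer_premises"))  # consumption -> generation
```

Because both graphs are derived from one dataset, a validation rule only has to be checked once against the stored edges, and either direction can be rebuilt on demand.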
Finally, I think that the validation can be done with a Neo4j graph. Furthermore, when citizens need to upload data, they must map their files to DCAT catalogs so that the files can be validated with the SHACL rules.
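To make the mapping step more concrete, here is a minimal sketch in plain Python. It is not SHACL and does not use an RDF library: it is a simplified stand-in that checks required properties, roughly in the spirit of a SHACL minimum-count constraint. The property names (`dct:title`, `dct:publisher`, `dcat:distribution`, `dcat:downloadURL`, `dcat:mediaType`) are DCAT and Dublin Core terms, while the set of required properties, the helper functions, and the example file are assumptions made for illustration.

```python
# Assumed rule set: which properties a catalog entry must carry, and their
# Python types. In a real setup these would be written as SHACL shapes.
REQUIRED = {
    "dct:title": str,
    "dct:publisher": str,
    "dcat:distribution": list,
}


def map_file_to_dcat(file_name, owner, url, media_type):
    """Map a citizen's uploaded file to a DCAT-style dataset entry."""
    return {
        "@type": "dcat:Dataset",
        "dct:title": file_name,
        "dct:publisher": owner,
        "dcat:distribution": [
            {"dcat:downloadURL": url, "dcat:mediaType": media_type}
        ],
    }


def validate_dataset(entry):
    """Return a list of violations; an empty list means the entry conforms."""
    errors = []
    if entry.get("@type") != "dcat:Dataset":
        errors.append("@type must be dcat:Dataset")
    for prop, expected in REQUIRED.items():
        if prop not in entry:
            errors.append(f"missing {prop}")
        elif not isinstance(entry[prop], expected):
            errors.append(f"{prop} must be of type {expected.__name__}")
    return errors


entry = map_file_to_dcat(
    "meter-readings-2024.csv",            # hypothetical file
    "A. Citizen",
    "https://example.org/data/meter-readings-2024.csv",
    "text/csv",
)
print(validate_dataset(entry))            # conforming entry
print(validate_dataset({"@type": "dcat:Dataset", "dct:title": "untitled"}))
```

Returning a list of violations instead of raising on the first one mirrors how a validation report works: the citizen sees everything that has to be fixed in a single pass.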

SOA (Service-Oriented Architecture) creates a service for different business functions of an organization based on existing applications

SOA does not change the purpose of the applications, but rather the way they are invoked, based on W3C standards. SOA was originally based on XML, but JSON-LD is now also supported. An Enterprise Service Bus (ESB) is a software architecture pattern that enables communication and integration between different applications and systems within an enterprise. It acts as a central hub that routes messages to the appropriate services, regardless of their underlying technologies or platforms, and it has become the heart of SOA.
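The routing role of an ESB can be shown with a toy example in Python. This is a sketch of the pattern only, not the API of any ESB product: the `ServiceBus` class, the two services, the message layout, and the tariff are all hypothetical.

```python
import json


class ServiceBus:
    """Toy message hub: routes each message to the service registered for its type."""

    def __init__(self):
        self._routes = {}

    def register(self, message_type, handler):
        self._routes[message_type] = handler

    def send(self, raw_message):
        # Messages arrive as JSON text, so callers need no knowledge of the
        # technology behind the service that finally handles them.
        message = json.loads(raw_message)
        handler = self._routes.get(message["type"])
        if handler is None:
            raise LookupError(f"no service registered for {message['type']!r}")
        return handler(message["payload"])


def billing_service(payload):
    # Hypothetical flat tariff of 0.25 EUR per kWh.
    return {"invoice_eur": round(payload["kwh"] * 0.25, 2)}


def metering_service(payload):
    return {"stored": True, "meter": payload["meter_id"]}


bus = ServiceBus()
bus.register("billing", billing_service)
bus.register("metering", metering_service)

print(bus.send(json.dumps({"type": "billing", "payload": {"kwh": 10}})))
print(bus.send(json.dumps({"type": "metering", "payload": {"meter_id": "m-1"}})))
```

The existing applications keep their purpose. Only the way they are invoked changes, because every caller talks to the bus instead of to the application directly.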

ESB

All of this data and information has to be validated, and the validation has to follow DevOps, DataOps, and DataSecOps practices, meaning that it has to be automated. SHACL will be used for the validation. Finally, I think that the validation can be done with a Neo4j graph, in which we also have to store the RDF.

From services to data
The transition from services to data is a shift in focus from the applications and services that we use to the data that they generate and consume. This shift is being driven by a number of factors, including the increasing availability of data, the rise of artificial intelligence (AI), and the development of new data technologies. We all produce data, and often the information we are looking for is just one click away. However, data consumers need to choose the services they use carefully, in a way that protects their values. There are technologies that can help us manage all of these problems. Individuals can manage their data assets as they wish, perhaps in a distributed pod, and give applications access to only the appropriate data.
Finally, my point of view:
A semantic layer will be mandatory to perform all of these operations, and perhaps to enrich them with GenAI. At this layer, a vector database and Neo4j are useful. The vector database is only necessary if we use GenAI (as the starting point for the search). In Neo4j, we could store the rules and the governance.

Metadata

By working together, energy communities, standards developers, and technology providers can overcome these challenges and enable energy communities to play a full role in the energy transition.
