Data validation — you need it. KYVE.

Val Savchuk
6 min read · Jul 25, 2022


Today, I am getting back on board after a long break. It was quite a tough month and I had no energy left for writing. But today is a new day, and I want to explain to you what data validation means, how data is checked, and why it matters to an average user. Yes, my blog is about blockchain technology, and that is why we will talk about trustless data and data validation today. As an example, I will tell you about data validation in KYVE.

Hello, my name is Val, I am from Ukraine, and I am a crypto ambassador and enthusiast. You are reading my Medium blog right now; feel free to check out the other interesting articles here.


What is data validation?

Data validation is the process of examining whether data fits the requirements of its subject. It can cover the meaning of the data, its form, or both.

“Data validation is the practice of checking the integrity, accuracy and structure of data before it is used for a business operation.” What is Data Validation? | TechTarget

Why should you care? Because every app you use relies on data, whether it is a calendar or a computer game. All of that data goes through validation. It can be general or deep, manual or automated, but it is always there.

“Without validating data, you run the risk of basing decisions on data with imperfections that are not accurately representative of the situation at hand.” What is Data Validation? How It Works and Why It’s Important | Safe Software

Data validation methods

Data validation, as I have already mentioned, can be manual or automated. During validation, the following characteristics can be checked: data type, data volume, uniqueness, grammar, structure, meaning, etc. All of this can be a part of data validation.

Any data should be validated before it is accepted by the system; this is the easiest way to avoid mistakes. Another simple approach is whitelisting: accepting only values from a known list, which makes validation faster. A similar idea is blacklisting, which rejects values from a known list. Other restrictions can also be used: data length range, value range, etc. (Source) But all of this is only the tip of the validation iceberg.
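To make these restrictions more concrete, here is a minimal Python sketch. The field names, the lists, and the limits (`ALLOWED_COUNTRIES`, `BLOCKED_USERNAMES`, the 3 to 20 character range, the age range) are all made up for illustration; a real system would define its own.

```python
# Hypothetical rules, invented only for this example.
ALLOWED_COUNTRIES = {"UA", "PL", "DE"}   # whitelist: only these values pass
BLOCKED_USERNAMES = {"admin", "root"}    # blacklist: these values are rejected


def validate_signup(username: str, country: str, age: int) -> list[str]:
    """Return a list of problems; an empty list means the input is accepted."""
    errors = []
    if country not in ALLOWED_COUNTRIES:
        errors.append("country is not on the whitelist")
    if username.lower() in BLOCKED_USERNAMES:
        errors.append("username is on the blacklist")
    if not 3 <= len(username) <= 20:          # data length range
        errors.append("username must be 3 to 20 characters long")
    if not 18 <= age <= 120:                  # value range
        errors.append("age must be between 18 and 120")
    return errors


print(validate_signup("val", "UA", 30))    # accepted: no problems found
print(validate_signup("root", "XX", 12))   # rejected for several reasons
```

Returning a list of problems instead of stopping at the first one is just one possible design choice; it lets the user fix everything in one go.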

I think I should explain it better (Source):

  • Data type validation is one of the simplest. It checks that each cell of data is of the correct type and format.
  • Constraint validation checks whether data falls within a specific range and whether a required value is present.
  • Structured validation builds on data type validation, but also includes structure or schema validation for complex data sets.
  • Consistency validation does not check the meaning or logic of data, only whether its style is consistent.
  • Code validation is for codes with a specific structure and form, checked against the requirements of that code.
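The five kinds of validation above can be sketched in one small Python function. This is only an illustration under my own assumptions: the record fields, the `YYYY-MM-DD` date style, and the five-digit postal code format are invented for the example.

```python
import re
from datetime import datetime

# Assumed code format for this example: exactly five digits.
POSTAL_CODE = re.compile(r"\d{5}")
REQUIRED_FIELDS = {"name", "age", "joined", "postal_code"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is valid."""
    # Structured validation: the record must contain every expected field.
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        return [f"missing fields: {sorted(missing)}"]

    errors = []
    # Data type validation: age must be an integer.
    if not isinstance(record["age"], int):
        errors.append("age must be an integer")
    # Constraint validation: age must fall within a sensible range.
    elif not 0 <= record["age"] <= 150:
        errors.append("age is out of range")

    # Consistency validation: all dates must be written in the same style.
    try:
        datetime.strptime(record["joined"], "%Y-%m-%d")
    except (TypeError, ValueError):
        errors.append("joined must be written as YYYY-MM-DD")

    # Code validation: the postal code must match its required pattern.
    code = record["postal_code"]
    if not isinstance(code, str) or not POSTAL_CODE.fullmatch(code):
        errors.append("postal_code must be five digits")
    return errors


good = {"name": "Val", "age": 30, "joined": "2022-07-25", "postal_code": "01001"}
bad = {"name": "Val", "age": "30", "joined": "25/07/2022", "postal_code": "12A45"}
print(validate_record(good))
print(validate_record(bad))
```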

There are two ways of automated validation: write your own script (if you know how) or use specific software.

Blockchain validation possibilities

The new Web 3 era and blockchain technology brought some great challenges, but even better possibilities for data validation.

Let me first remind you what a blockchain is:

“A blockchain is a growing list of records, called blocks, that are securely linked together using cryptography.” (Source)

A blockchain network secures all the data captured inside. It can’t be changed or deleted. Why? Every new piece of data that is locked in a block is added to the chain in a specific sequence, and each block is cryptographically linked to the blocks before it. So no block can be changed or removed without breaking every block that follows it. This is both a strength and a weakness.
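Here is a toy Python sketch of that linking idea. It is not how any particular blockchain is implemented; the block layout and the function names (`block_hash`, `add_block`, `is_valid`) are my own assumptions. Each block simply stores the SHA-256 hash of the block before it.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash the whole block, including its link to the previous block."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, data: str) -> None:
    """Append a block that stores the hash of the block before it."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev_hash})


def is_valid(chain: list) -> bool:
    """Check that every stored link still matches the previous block."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
for item in ["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"]:
    add_block(chain, item)

print(is_valid(chain))                    # True: all links match
chain[0]["data"] = "alice pays bob 500"   # tamper with an old block
print(is_valid(chain))                    # False: the next block's link breaks
```

Changing one old block changes its hash, so the link stored in the next block no longer matches. To hide the change, every later block would have to be rebuilt.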

As far as data validation is concerned, blockchain data blocks have their own specific structure. That is why blockchains have their own built-in ways of validating data.

How does built-in blockchain validation work?

Blockchain data is mainly transaction data. So validation means proving that a transaction is valid and legitimate. There is also a mechanism called “consensus” that makes the process trustless and reliable.

About “Consensus”:

“It’s imperative that all participants in the network come to an agreement on the state of the ledger” How Are Blockchain Transactions Validated? Consensus VS Validation | by HUPAYX

Blockchain has validators in the form of full-node runners. I haven’t told you about nodes in detail before; I will, I promise. For now, in short: nodes are the key points of the network. Put very simply, they are computers connected to the network.

Validation process (Source):

Step 1. Data is provided to all available miners.

Step 2. Miners create blocks.

Step 3. A specific miner block is chosen to be added to the network according to the consensus.

  • “Proof of Work”: the miner that first proves it has done the required computational work creates the block that may be added to the network.
  • “Proof of Stake”: the block creator is selected randomly, usually with a chance that depends on the amount staked.

Step 4. The block is provided to nodes for validation.

Step 5. The block is added to the network.

Validation confirms that the transaction meets the protocol requirements: structural elements (timestamps, version, type), token amounts, the balances of the wallets that take part in the transaction, etc.
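The checks listed above could look roughly like this in Python. This is a simplified sketch, not the rules of any real protocol: the `Transaction` fields, the supported version set, and the balance table are assumptions I made for the example.

```python
from dataclasses import dataclass

SUPPORTED_VERSIONS = {1}  # assumed: this toy protocol has a single version


@dataclass
class Transaction:
    version: int
    timestamp: int   # seconds since the epoch
    sender: str
    recipient: str
    amount: int


def validate_transaction(tx: Transaction, balances: dict, now: int) -> list[str]:
    """Return a list of problems; an empty list means the transaction is valid."""
    errors = []
    # Structural checks: version and timestamp.
    if tx.version not in SUPPORTED_VERSIONS:
        errors.append("unsupported version")
    if tx.timestamp > now:
        errors.append("timestamp is in the future")
    # Token amount check.
    if tx.amount <= 0:
        errors.append("amount must be positive")
    # Balance check for the sending wallet.
    if balances.get(tx.sender, 0) < tx.amount:
        errors.append("insufficient balance")
    return errors


balances = {"alice": 10, "bob": 0}
ok = Transaction(version=1, timestamp=100, sender="alice", recipient="bob", amount=5)
bad = Transaction(version=2, timestamp=300, sender="carol", recipient="bob", amount=50)
print(validate_transaction(ok, balances, now=200))
print(validate_transaction(bad, balances, now=200))
```

In a real network, every validating node would run the same checks on its own, which is what makes the result trustless.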

But if we talk about data in general, we can go back to the first part of the article. The steps described there can be built into any validation process. Nevertheless, blockchain still has its benefits for every validation: data gathered in blocks, miners, and nodes. They make the process reliable and trustless.

KYVE data validation

KYVE is a startup whose mission is to store an unlimited amount of any data, with an automated validation process.

To store data with no limits, KYVE uses Arweave, but validation is one of its own main features and a really essential one, as you can already guess.

“… any errors that may have occurred at previous junctions along a data element’s lifecycle are gracefully caught and handled by KYVE’s built-in and fully configurable validation step.” LITEPAPER | KYVE

What does it mean?

There are validators built into the KYVE network. The validators are decentralized; I am talking about nodes. Nodes are organized in pools. Each pool has a limit on the maximum number of nodes inside. Not every node is accepted, only those that have staked enough.
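To picture the idea of pools with a node limit and a staking requirement, here is a purely hypothetical Python sketch. It is not KYVE’s code or API: the `Pool` class, `max_nodes`, `min_stake`, and the simple "reject when full" rule are all my own assumptions, and the real network may decide this differently.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Hypothetical pool model; not taken from KYVE's implementation."""
    max_nodes: int                              # assumed cap on pool size
    min_stake: int                              # assumed minimum stake to join
    nodes: dict = field(default_factory=dict)   # node name -> staked amount

    def try_join(self, node: str, stake: int) -> bool:
        """Accept a node only if it stakes enough and the pool has room."""
        if node in self.nodes:
            return False   # already a member
        if stake < self.min_stake:
            return False   # not enough staked
        if len(self.nodes) >= self.max_nodes:
            return False   # the pool is full
        self.nodes[node] = stake
        return True


pool = Pool(max_nodes=2, min_stake=100)
print(pool.try_join("node-a", 150))   # True: enough stake, pool has room
print(pool.try_join("node-b", 50))    # False: stake is too low
print(pool.try_join("node-c", 100))   # True: takes the last free slot
print(pool.try_join("node-d", 500))   # False: pool is already full
```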

As I heard at the KYVE Team AMA, validation will soon take different types of data streams into account. Moreover, while right now we are talking mainly about Web 3 support, it is also supposed to work with Web 2.

If you don’t understand the differences, here is my article:

Web3 for Newbies. Data Storage in Web3. KYVE Network | by Val Savchuk | Medium.

Anyway, I think Web 3 and Web 2 will exist side by side for a long time, so it would be cool to have universal validation algorithms for both cases.

Conclusion

Data is always a subject to study, research, and develop further.

The more data develops, the more complex validation becomes. Moreover, data volumes are growing rapidly.

KYVE is a great example of the new era of data storage and validation. With every new project and topic I describe, I see how blockchain reaches every sphere of our life. That’s the future!

Here are some official KYVE sources:

Official website: kyve.network

KYVE official medium: https://blog.kyve.network

Official Twitter account: https://mobile.twitter.com/KYVENetwork


Val Savchuk

Ph.D. in Computer Science. Crypto enthusiast. Crypto ambassador. Business analyst.