Using The DAG Linear Ordering Model Framework In The Tangle

Vincent T.
0xCODE

--

Instead of using a blockchain for immutability and provenance of data, a different architecture can be utilized based on DAG (Directed Acyclic Graph) and the Markov Chain Monte Carlo algorithm called the Tangle. This data structure is used by the IOTA project as their DLT (Distributed Ledger Technology) for IoT (Internet-of-Things).

Like a blockchain, a Tangle is also a distributed database that can store cryptographic data immutably and transparently. The main difference is in their architecture and consensus mechanisms, which favors a DAG in terms of efficiency and speed.

The DAG

In a DAG there is never a closed loop. Each edge is directed from one vertex (also called a node) to another. It can be considered a true DAG if the system can be topologically ordered. This can be accomplished by arranging the vertices in a linear order that are consistent with all edge directions.

Figure 1. A DAG diagram showing the edges and vertices in the network. There is never a closed loop, as all edges are directed toward a vertice. (Source Wikimedia Vector version of Image:Tred-G.png, Public domain, via Wikimedia Commons)

The edges point to a vertice, in only one direction. An edge cannot point back to a vertice from which it originated from. For example you can have a vertex a point to other vertices like b and c. The vertice b cannot point back to a, or else that will create a loop.

A DAG is a type of graph whose edges have only one direction (unidirectional) to connect one vertice to another. In terms of topological ordering, it must follow a sequence such that:

For every edge the start vertex of the edge occurs earlier in the sequence than the ending vertex of the edge

This makes the ordering acyclic, so that once a vertice has been passed it cannot be passed again. This is ideal for scheduling tasks to perform in a business process function, since it is impossible to encounter the same site or node twice.

We can imagine a system that has 5 tasks. The ultimate means to an end is to go to the grocery store in the city, but there are steps needed in order to do that.

Let’s enumerate the steps as tasks (A to E):

Task A — Get inside car
Task B — Drive on freeway going to the city
Task C — Arrive at grocery store
Task D — Get out of car
Task E — Enter the grocery store and start shopping

The process must follow the tasks or nodes as follows:

Figure 2. The arrows follow the proper sequence for a process over time (A -> B -> C -> D -> E).

Each task is dependent on the previous task. It must follow an order, or else the process will fail. You cannot for example get out of the car before getting inside the car.

You also cannot have a cycle or else there will be an infinite loop, thus never completing the process. For example, if you get out of the car (Task D) and then go back inside the car (Task A), that creates a closed loop. In this case you never get to shop in the grocery store (Task E).

Figure 3. An infinite loop is created if Task D recurs back to Task A.

A DAG is more at arriving at the finality of the process through dependencies. If one task is not completed, then the whole process falls apart.

Site Nodes

The Tangle begins at a node called “site” which is like the genesis block in a blockchain network (e.g. Bitcoin, Ethereum). Tangle does not generate blocks, but instead is represented by the sites. Sites are connected by edges and are unidirectional, meaning they can only travel in one direction. The rule is that a site must validate two other sites. The sites (i.e. transactions) that are not validated are considered unconfirmed and are called tips.

Figure 4. The Tangle starts from a point called Site A.

In the diagram (Figure 4) example, confirmed transactions are colored green, while the unconfirmed transaction is colored red. Other sites are needed to validate the “tips”. Each site has a cumulative weight which is used to hinder attacks on the network because as sites get older their weight increases making attacks more costly. For example if the cumulative is:

SUM(Site1 + Site2 … + SiteX) = Cumulative Weight

Let’s say that:

D = 3, F = 1, G = 3 and H = 1 Cumulative Weight = D + F + G + H = 8

Notice also that the edge’s arrows point to their parent, opposite the direction of time. Therefore, it points back to the sites before that have validated them. Initial sites are confirmed by the first site, and proceeding sites requires 2 sites for validation.

The Tangle Algorithm

Tangle uses a type of algorithm called hashcash-lite. As more transactions increase, the greater the security of the network because it requires a ton of raw computing power to overtake the network. Therefore in order to steal a site, you have to process all previous validated transactions that are part of the branch and it becomes impossible due to the cumulative weight as transactions get older.

The more nodes or connected devices in the Tangle, the network also becomes faster rather than slower compared to a typical blockchain. In a blockchain, the network becomes slower because when there are more nodes participating in consensus, it will require more difficulty to produce blocks (for fairness).

Once transactions are confirmed they cannot be changed (i.e. immutable), which is also how a blockchain works. The best way to hack a DAG is an unconfirmed node like H (in Figure 4), which requires two transactions after it to validate. Now there is a window of opportunity for an attack to occur using the double-spending method. According to DAG developers, you need to be successful in winning the race to validate a node before it closes.

The probability of compromising the node decreases since the validation process occurs quickly, not allowing enough time for the node to be attacked. In IOTA’s implementation, they initially have a Coordinator which adds its signature to good nodes, so if a double-spend attack were attempted the Coordinator will have to validate only the legit transaction. Unconfirmed transaction nodes are selected at random using the Markov Chain Monte Carlo algorithm (sampling from a distribution), so hackers would have to know which node to attack in order to attempt a double-spend attack which makes their chances slim.

Markov Chain Monte Carlo is used at the “tips” of unconfirmed transactions. It uses a probability distribution to select which 2 new random transactions to validate first. Each site must have a dependency on other sites in order to be validated. That way there is no fixed determination that could allow hackers to target due to its random nature.

Use Cases

The Tangle is faster than a blockchain because it doesn’t require the same hashing power to generate blocks (e.g. Proof-of-Work system in Bitcoin). A Tangle uses a distributed consensus that is light weight enough to be used by systems like drones, smartphones, EV (Electric Vehicles i.e. Electric Cars), industrial and instrumentation sensors (used in manufacturing and health systems) and a new class of intelligent appliances (e.g. refrigerators, thermostats, home security systems). It also targets IoT devices in particular.

The Tangle is therefore more energy efficient than the mining used in blockchains like on the Bitcoin network. The greater the volume of transactions in a blockchain, the slower the network due to increase in mining difficulty to validate blocks for transactions. This increases competition among miners which increases hash power that requires more computational effort to produce blocks. That requires plenty of energy to perform.

The point of having the Tangle is a light weight protocol with security and scalability for micro-transactions for IoT, that is less energy intensive. While this provides a theoretically sound framework, there are still experts who argue whether the system will work or not. There is always room for improvement as this is all new, so it can be the best thing to ever happen, catastrophic failure or irrelevant to real world applications.

Git also uses a DAG for tracking history and version control in source code. A DAG helps developers track changes, but does not allow previous changes to be overwritten based on commits. There is a timeline that displays all the changes developers made that follows an acyclic ordering.

Figure 5. Example of a GIT repo using a DAG. The commits point back to their parent commits which is why the arrows are in the opposite direction from time. (Source O’Reilly)

Synopsis

The Tangle is one application of a DAG, that utilizes data processing techniques. It can help eliminate loops in the network that could cause problems if not properly handled. This can be good for creating schedules for tasks to complete based on their dependencies.

--

--

Vincent T.
0xCODE

Blockchain, AI, DevOps, Cybersecurity, Software Development, Engineering, Photography, Technology