Taraxa Consensus (2/5): Block DAG and PoS
By Steven Pu.
In the previous article, we see that a single chain topology has a tough tradeoff between throughput and security. In this article, we outline Taraxa’s approach to use a block DAG topology coupled with proof of stake (PoS) to address these issues.
One critical limitation for a single chain topology is that the network can only accept one single block at a time, so what if the network can somehow accept lots of blocks simultaneously? Not only does this help to increase the number of transactions being processed, but it also helps to minimize the waste that a single chain topology incurs when the network discards all but a single branch.
One way to represent this more inclusive approach is instead of building a linear chain, we now build a graph. One thing many may not realize is that, while most think of a blockchain as a linear chain, many classic single chains such as BTC and ETH are actually made up of branches and look more like a tree than a linear chain. This type of tree-like structure is a type of graph. More specifically speaking, it is a directed acyclic graph, or a DAG.
DAG sounds very fancy, but it’s just a fancy name which means the following,
- Blocks (more generally called vertices in math) in this graph are connected
- These connections (more generally called edges in math) have directions — e.g., block A is pointed to block B
- If you follow these directed connections starting from any block, you cannot end up at the same block no matter what path you take — i.e., there are no cycles in this graph
While not necessarily a single chain feature, proof of work (PoW) is often leveraged as a critical part of the network’s consensus algorithm for many single chain networks such as BTC and ETH, whereby nodes solve a cryptographic puzzle to attain eligibility to produce a block, essentially a decentralized lottery.
However, this approach, while simple and elegant, is highly energy intensive. According to the University of Cambridge, Bitcoin alone more than 60TWh of energy annually, beating out entire nations such as Switzerland in total annual energy consumption. Taraxa would like to avoid this level of energy consumption by shifting away from PoW into a proof of stake (PoS) mechanism.
In PoW, the privilege of producing blocks comes from external (to the blockchain) resource investments into expensive customized hardware and tons of electricity. In PoS, it comes from an internal resource of how many coins the account holds. The core economic motivation is similar in both cases, in that those participants who’ve made large investments into the ecosystem are loath to behave maliciously and damage the network’s integrity, thereby lowering its overall value and along with it diminishing their own holdings and future returns.
No free lunch.
It’s great to be inclusive and green, but there are no free lunches in this world, especially not in engineering. Let’s investigate some of the problems we run into while making these design decisions.
Problem #1: Ordering
In a single chain, there is no question of relative ordering between two blocks, since every block is on a linear order each with a clearly defined parent. In a block DAG, we are only sure about relative ordering between a certain subset of blocks, and unsure about others. This is called partial ordering, and in blockchain, that’s simply not acceptable.
In the illustration above, we see that the blue blocks have clear relative ordering, since they are directly or indirectly connected. If block A is pointing to block B, that must mean block A knows about block B at the time of its construction, hence block B must have come before block A. However, in the same illustration, the orange blocks have ambiguous relative ordering, since there are no direct or indirect connections between them.
So now we end up with some of the blocks have clear relative ordering, while others do not.
But wait a minute, why is ordering so important to begin with? Ordering is important because often, when transactions on the blockchain are processed, processing these transactions in different order could produce very different outcomes. For example, suppose A currently has 1 coin, and we have the following three transactions,
- A sends B 1 coin
- A sends C 1 coin
- D sends A 1 coin
Processing these transactions in a different order could produce very different outcomes, for example,
- Order (1, 2, 3): transaction 1 succeeds, 2 fails since A now has no more money, 3 succeeds
- Order (2, 3, 1): transaction 2 succeeds, 3 succeeds, 1 succeeds
In blockchain, ordering is a critically important attribute that the network needs to agree on (see our blockchain primer article), hence having only partial ordering is not acceptable.
Problem #2: Randomness and Delay
By transitioning away from PoW, we lose all its associated functionalities. Here we highlight two of these functionalities: randomness and delay.
Randomness refers to PoW’s process of randomly selecting the next node to produce a block. There is of course no central entity “selecting” the node, it’s simply a matter of luck when and which node will be able to solve the puzzle first (given comparable hashing power).
Why is randomness important? Block producers need to be random because if they weren’t randomly selected and are well-known ahead of block production, they could be targeted for attacks, bribery, and other generally deleterious efforts. Moreover, the block producers themselves would have a heavy incentive to cheat or collude to their own advantage — e.g., inserting their own trades ahead of others, try to create a branch to double-spend. Hence, randomness help make the network fairer and more honest.
Delay refers to the fact that it takes time to solve a PoW puzzle, in BTC for example the average is set ~10 minutes. This delay controls how often blocks are created on the network. With this delay removed along with PoW, we now need to find another way to add in a delay.
But why do we need a delay at all? For one thing, no network can handle infinitely many blocks — each node will run out of bandwidth, or storage, or memory, or CPU — whichever is the first resource to create a bottleneck. There also aren’t an infinite number of transactions to be packed into infinitely number of blocks. The number of blocks produced by the network and the number of transactions contained in each block should be, ideally, adjusted dynamically per network conditions, but in no circumstance will you have a delay of zero.
Problem #3: Block Efficiency
When a node is producing a block, it is deciding which transactions the network will process next. At any given time, each node on the network maintains a list of pending transactions which were sent to the network for processing but haven’t made it into a block yet. When a new block is being produced, a node looks at this pending transaction queue and selects a subset to place into the block. Different networks have over time evolved different conventions on how transactions are selected, most of which are driven by economic incentives.
At any given moment, if there aren’t serious problems with the network or its nodes, then each full node on the network should see very similar pending transaction queues. This is because a blockchain system is a peer to peer (P2P) network, whereby critical information such as transactions and blocks are being constantly circulated (gossiped) to every node on the network with typically very short delays (seconds). Hence, between any two given nodes, there may be a few transactions that one has heard about that the other one hasn’t, and vice versa, they should have largely heard about the same set of transactions, with each building very similar pending transaction queues.
If every node sees relatively the same set of pending transactions, then if they were asked to produce blocks, they will produce blocks that contain relatively the same set of transactions, or similar copies of each other. This is obviously not a desirable outcome, since redundant transactions between blocks are simply wasted work — not very green at all.
In our next article, we’ll describe how Taraxa handles ordering in the block DAG, and the one after that we’ll describe Fuzzy Sharding, how Taraxa’s blockchain randomly selects block producers and defines transaction jurisdiction — addressing randomness, delay, and block efficiency, all done without the overhead of coordination.