The question might seem somewhat obvious: you have a header and a body with transactions, many blockchains have that, so what might be so difficult about it? Well, as I mentioned in the previous update, there are some complications, and part of the challenge is that we're dealing with a sharded architecture, which most blockchains don't need to deal with.
Sharding architecture #
The sharding design is not fully worked out yet, but it is quite certain that it’ll be hierarchical and look something like this:
So it is a tree with a Beacon chain at the top of the hierarchy, with intermediate shards below it and leaf shards at the bottom. The architecture supports about one million shards in total.
The shards use a shared namespace for addresses and support cross-shard communication: information can be sent between shards, and a contract deployed on one shard can be used from any other shard.
All this implies the need to be able to prove and verify that things exist or have happened on other shards, while only having access to the beacon chain block root. While at it, we’d like for proofs to be efficient too, which means both block header and block body need to be designed in a way that co-locates related information and minimizes the depth of various Merkle Trees.
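To make the proof requirement more concrete, here is a minimal sketch of generating and verifying a Merkle inclusion proof against a known root. It is not the project's implementation: the names `hash_leaf`, `hash_pair`, `build_levels`, `merkle_root`, `prove` and `verify` are hypothetical, the tree is assumed to be a balanced binary tree with a power-of-two number of leaves, and the standard library's non-cryptographic `DefaultHasher` stands in for a real hash function only to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in 64-bit "hash". NOT cryptographic, an assumption made for brevity.
type Hash = u64;

/// Hashes leaf data (prefixed with 0 to separate leaves from inner nodes).
fn hash_leaf(data: &[u8]) -> Hash {
    let mut hasher = DefaultHasher::new();
    hasher.write_u8(0);
    hasher.write(data);
    hasher.finish()
}

/// Hashes two child nodes into their parent (prefixed with 1).
fn hash_pair(left: Hash, right: Hash) -> Hash {
    let mut hasher = DefaultHasher::new();
    hasher.write_u8(1);
    hasher.write_u64(left);
    hasher.write_u64(right);
    hasher.finish()
}

/// Builds every level of the tree, leaves first, root level last.
fn build_levels(leaves: &[Hash]) -> Vec<Vec<Hash>> {
    assert!(leaves.len().is_power_of_two());
    let mut levels = vec![leaves.to_vec()];
    while levels.last().unwrap().len() > 1 {
        let next: Vec<Hash> = levels
            .last()
            .unwrap()
            .chunks(2)
            .map(|pair| hash_pair(pair[0], pair[1]))
            .collect();
        levels.push(next);
    }
    levels
}

/// Returns the single node of the top level.
fn merkle_root(levels: &[Vec<Hash>]) -> Hash {
    levels.last().unwrap()[0]
}

/// Collects the sibling at each level on the way from leaf `index` to the root.
fn prove(levels: &[Vec<Hash>], mut index: usize) -> Vec<Hash> {
    let mut proof = Vec::new();
    for level in &levels[..levels.len() - 1] {
        proof.push(level[index ^ 1]);
        index /= 2;
    }
    proof
}

/// Recomputes the root from a leaf and its proof and compares it to `root`.
fn verify(root: Hash, leaf: Hash, mut index: usize, proof: &[Hash]) -> bool {
    let mut current = leaf;
    for &sibling in proof {
        current = if index % 2 == 0 {
            hash_pair(current, sibling)
        } else {
            hash_pair(sibling, current)
        };
        index /= 2;
    }
    current == root
}
```

The verifier only needs the root, the leaf, its index and one sibling per tree level, which is why shallower trees and co-located data translate directly into smaller proofs.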
In addition to cross-chain communication, we have to combine segments produced by individual shards into a global history, which also needs to be represented in a block in some verifiable way.
While the actual number of shards will be larger than in the above diagram, the number of layers is expected to be the same. This also means that both headers and bodies will differ between kinds of shards. For example, the beacon chain will have no user transactions, and leaf shards will have no shards below them. There are many more differences, of course.
Current state #
While the current version is certainly not final, I think it'll give a good idea of what to expect and how things work together. I'll visualize it as a tree of trees, where you can think of every tree as a Merkle Tree. It is about 95% accurate (segment root proofs are missing from the diagram because they are difficult to show correctly).
Leaf shard block #
Leaf shard blocks are the simplest. They reference the beacon chain in the header (which defines some consensus parameters) and include segment roots and transactions in the body.
Intermediate shard block #
On top of what leaf shards do, intermediate shards need to include information about their leaf shards in both the block header and the body.
Beacon chain block #
The beacon chain block replaces the reference to the beacon chain with the actual consensus parameters. It still tracks child shards, but also contains PoT checkpoints in its body for faster verification later.
As you can see, all three block types share some similarities, but also have unique differences dictated by their role.
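As a rough summary of the three block types, here is a sketch of how they could be modeled as one Rust enum. This is purely illustrative and not the layout from the PRs: every type and field name (`Block`, `ChildShardInfo`, `beacon_chain_ref`, `consensus_parameters`, and so on) is hypothetical, the 32-byte `Root` is an assumed placeholder, and only the contents named in the text are included.

```rust
/// Placeholder for a Merkle root. The 32-byte size is an assumption.
type Root = [u8; 32];

/// What a parent shard records about one child shard (hypothetical shape).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct ChildShardInfo {
    header_root: Root,
    segment_roots: Vec<Root>,
}

/// The three block kinds described above, with illustrative fields only.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Block {
    /// Header references the beacon chain; body has segment roots and transactions.
    Leaf {
        beacon_chain_ref: Root,
        segment_roots: Vec<Root>,
        transactions: Vec<Vec<u8>>,
    },
    /// Everything a leaf shard block has, plus information about child shards.
    Intermediate {
        beacon_chain_ref: Root,
        child_shards: Vec<ChildShardInfo>,
        segment_roots: Vec<Root>,
        transactions: Vec<Vec<u8>>,
    },
    /// Actual consensus parameters instead of a reference, child shards and
    /// PoT checkpoints, but no user transactions.
    Beacon {
        consensus_parameters: Vec<u8>,
        child_shards: Vec<ChildShardInfo>,
        pot_checkpoints: Vec<Root>,
    },
}

impl Block {
    /// Only the beacon chain carries no user transactions.
    fn may_contain_transactions(&self) -> bool {
        !matches!(self, Block::Beacon { .. })
    }

    /// Leaf shards have no shards below them, so they track no children.
    fn child_shards(&self) -> &[ChildShardInfo] {
        match self {
            Block::Leaf { .. } => &[],
            Block::Intermediate { child_shards, .. }
            | Block::Beacon { child_shards, .. } => child_shards.as_slice(),
        }
    }
}
```

An enum keeps the shared parts visible while letting the compiler enforce the differences, for example that a beacon chain block simply has no field for transactions.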
Now if we need to prove that something was stored in a leaf shard, we can do that by generating proofs for the following path:
This way we can reach any piece of a block header or body, or anything stored in the state of any block. Not only that, information submitted from leaf shards to intermediate shards and from intermediate shards to the beacon chain is verifiable for integrity (that segment roots correspond to the header). Moreover, since all block roots are added to the MMR, any historical information can be proven and verified as well. It's trees all the way down!
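The idea of following a path through nested trees can be sketched as a chain of Merkle proofs, where the root recomputed at one hop becomes the item proven at the next. The sketch below is a simplification under stated assumptions: three hops (leaf shard block, intermediate shard block, beacon chain block) rather than the exact path, hypothetical names (`ProofStep`, `fold_step`, `verify_chain`), and the non-cryptographic `DefaultHasher` as a stand-in hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in 64-bit "hash". NOT cryptographic, an assumption made for brevity.
type Hash = u64;

/// Hashes two child nodes into their parent.
fn hash_pair(left: Hash, right: Hash) -> Hash {
    let mut hasher = DefaultHasher::new();
    hasher.write_u64(left);
    hasher.write_u64(right);
    hasher.finish()
}

/// One hop of the path: the item's index in that tree and its siblings, bottom-up.
struct ProofStep {
    index: usize,
    siblings: Vec<Hash>,
}

/// Recomputes the root of one tree from an item and its proof step.
fn fold_step(item: Hash, step: &ProofStep) -> Hash {
    let mut index = step.index;
    let mut current = item;
    for &sibling in &step.siblings {
        current = if index % 2 == 0 {
            hash_pair(current, sibling)
        } else {
            hash_pair(sibling, current)
        };
        index /= 2;
    }
    current
}

/// Walks all hops and checks that the final root is the known beacon block root.
fn verify_chain(beacon_block_root: Hash, item: Hash, steps: &[ProofStep]) -> bool {
    steps
        .iter()
        .fold(item, |current, step| fold_step(current, step))
        == beacon_block_root
}
```

The verifier here holds nothing but the beacon chain block root; everything else arrives with the proof, so a wrong item or a wrong sibling at any hop changes the final root and the check fails.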
The whole structure and decoding ability was introduced in PR 245 with follow-up fixes in PR 246. There are no builders or owned versions of the data structures yet, but it will be easier to have them now that the structure is known.
There will be more changes to this in the future once/if we have to deal with data availability and potential misbehavior, but this should be sufficient for now.
Block root? #
You may notice that instead of "block hash" I used "block root." That is simply a reflection of the fact that a header is itself a Merkle Tree. Our headers are larger than those in blockchains like Bitcoin, so it is desirable to compress the size of the proof when only a small part of the header is needed (like the timestamp).
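A bit of arithmetic shows why this helps. Assuming the header is a balanced binary Merkle Tree (padded to a power of two), proving one field takes one sibling hash per level, so the proof grows with the logarithm of the number of fields rather than with the header size. The function names, field counts and 32-byte hash size below are hypothetical and not taken from the actual header layout.

```rust
/// Number of sibling hashes needed to prove one leaf of a balanced binary
/// Merkle tree whose leaf count is padded up to the next power of two.
fn proof_hashes(leaf_count: usize) -> u32 {
    assert!(leaf_count > 0);
    leaf_count.next_power_of_two().trailing_zeros()
}

/// Proof size in bytes for a given hash size (the proven field itself excluded).
fn proof_size_bytes(leaf_count: usize, hash_size: usize) -> usize {
    proof_hashes(leaf_count) as usize * hash_size
}
```

For instance, with a hypothetical header of 16 fields and 32-byte hashes, `proof_size_bytes(16, 32)` is 128 bytes no matter how large the other 15 fields are.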
Upcoming plans #
I hope you appreciate the ASCII art, I spent a non-negligible amount of time formatting it 😅.
With the block layout in this state, I'll need to write even more boilerplate for builder and owned versions of the data structures. Once that is done, I'll be back to consensus verification in an attempt to get a primitive blockchain going. There are still some open questions around plotting, but I think Alfonso and I have a pretty good intuition for how to approach it.
We post even more research updates on our Zulip as they happen, including meeting notes from our 1:1s with Alfonso.
With that, I’ll see you next time with more updates and maybe even more ASCII art 😆.