Status quo

note

State-of-the-art blockchains have yet to resolve the Blockchain Trilemma

Bitcoin and Ethereum have de facto lost decentralization due to their consensus algorithms, and they are also very slow.

The majority of modern blockchains today use a variation of Proof-of-Stake consensus, meaning they are permissioned and inherently not scalable to an unbounded number of participants. Those that are based on Proof-of-Work are arguably not secure to begin with, regardless of what algorithm they are using, due to how cheap and easy it is to purchase compute in large quantities today to attack the blockchain.

There are many blockchains that claim to be scalable, but all of them have a glass ceiling and are unable to scale bandwidth, storage and compute sub-linearly with an unbounded number of participants. Those that do not support sharding are not scalable by definition; those that are sharded have other inherent limitations that prevent unbounded growth.

Security is arguably the only property achieved by many blockchains, but it is also debatable how real or useful it is without the other two properties.

🚧 Project Abundance 🚧

The goal here is to find a way to remove existing bottlenecks and unleash the full power of the blockchain.

We need ABUNDANCE:

  • Abundance of consensus participation
    • Without practical limits
  • Abundance of bandwidth, storage and compute
    • With increased consensus participation, it must be possible to process more transactions, persist larger history and do more computation without practical limits
  • Abundance of developers
    • Standard programming languages with familiar tooling, standard and efficient execution environment, infrastructure that allows for a single app to scale to the capacity of the blockchain itself

Consensus

While Proof-of-Work and Proof-of-Stake have solidified as the primary ways to achieve consensus in blockchains, both inevitably lead to centralization and potentially to a reduction in security.

With Proof-of-Work, this is primarily driven by the need for access to exotic hardware and cheap electricity, which limits the number of people who can participate. With the growth of the network and the increase in difficulty, it also becomes impractical for small miners to participate without mining pools due to infrequent and unpredictable rewards. Another complication is the existence of services like Nicehash that allow buying large amounts of compute for limited amounts of time on an open market, which makes it very practical to take over a network and, for anything smaller than Bitcoin, arguably fairly inexpensive.

warning

As a result, the majority of Proof-of-Work blockchains are neither decentralized nor secure in practice.

With Proof-of-Stake, owners of currently staked tokens get richer every day, which arguably makes it a permissioned system. Due to being permissioned and requiring on-chain registration before participation, most Proof-of-Stake implementations have to substantially limit the number of consensus participants by imposing things like a minimum stake amount, as well as only selecting a subset of validators to be active at any time. Due to the nature of the consensus implementation, it is also important for consensus nodes to stay online, so those unable to do so are typically punished on top of simply having their tokens locked and not receiving rewards. This also leads to pooling and centralization. Blockchains like Polkadot support nominated staking, which improves the scalability of consensus participation to some degree, but it is not a full replacement for being able to participate in consensus individually.

warning

As a result, the majority of Proof-of-Stake networks are arguably not truly decentralized in the sense of supporting millions or even billions of consensus participants.

Proof-of-Space

An alternative to the above that is not talked about quite as much is Proof-of-Space. There are different variations of it, but a shared trait between them all is permissionless participation with low energy usage, while relying on a resource that is generic, widely distributed and abundant: disk storage.

The most prominent example of Proof-of-Space consensus is probably Chia. Chia is essentially an energy-efficient version of Bitcoin that wastes disk space to store random data just like Bitcoin wastes compute to calculate hashes. It also happens to suffer, just like almost every other Proof-of-Space blockchain, from the Farmer’s dilemma, which makes incentive compatibility a challenge.

While this is an improvement over Proof-of-Work, it turns out it can be done even better.

Proof-of-Archival-Storage

Proof-of-Archival-Storage is a flavor of Proof-of-Space, more specifically Proof-of-Storage, that instead of filling disks with random data stores the history of the blockchain itself. This not only resolves the Farmer’s dilemma, but also enables a few interesting side effects:

  • On-chain storage cost no longer needs to be hardcoded; it can be driven by an on-chain Automated Market Maker thanks to the ability to measure not only demand for storage (the size of blockchain history), but also supply (space pledged to the network by farmers), resulting in an approximation of the real-world price of the hardware regardless of the token price (see the sketch right after this list)
  • The history of the blockchain can’t be lost regardless of how large it becomes, in an incentive-compatible and sustainable way, as long as the blockchain is operational
  • The blockchain effectively becomes a Distributed Storage Network, since any piece of data uploaded to the network can later be retrieved from the network of farmers
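
To make the AMM idea concrete, here is a minimal sketch of how a storage price could be derived from measured demand and supply. The formula and constants are purely illustrative and not the actual pricing function:

/// Hypothetical AMM-style storage pricing: the price per byte rises as
/// demand (blockchain history size) approaches supply (space pledged by
/// farmers). Assumes `pledged_bytes` is non-zero.
fn storage_price_per_byte(history_bytes: u128, pledged_bytes: u128) -> u128 {
    const BASE_PRICE: u128 = 1_000; // base price in the smallest token unit
    // Utilization as a percentage, clamped to at least 1 for integer math
    let utilization = (history_bytes * 100 / pledged_bytes).max(1);
    // Price grows with utilization: 2x the base price at 50% utilization
    BASE_PRICE * utilization / 25
}

fn main() {
    // Plenty of pledged space relative to history -> storage is cheap
    assert_eq!(storage_price_per_byte(1_000, 1_000_000), 40);
    // A nearly full network -> storage is expensive
    assert_eq!(storage_price_per_byte(900_000, 1_000_000), 3_600);
}

Because such a price tracks measured supply and demand for hardware, it can stay roughly stable in real-world terms even when the token price fluctuates.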

To make solo farming viable without pools, Autonomys Network was deployed with a built-in voting mechanism that increases the frequency of rewards 10x compared to having block rewards alone, making it possible for even relatively small farmers to receive weekly rewards.

important

As a result, Proof-of-Archival-Storage seems to be the closest to an ideal consensus mechanism: permissionless, distributed and secure.

Everything is a contract

Every contract has an address, which is just a monotonically increasing number. This is in contrast to many blockchains where an address might be derived from a public key (in the case of an end-user wallet) or code (in the case of “smart contracts”). There is no separate notion of an Externally Owned Account (EOA) like in Ethereum; end-user wallets are also just contracts.

The address is allocated on contract creation and doesn’t change regardless of how the contract evolves in the future. This means that externally, all contracts essentially look the same regardless of what they represent. This enables a wallet contract to change its logic from verifying a single signature to multisig to 2FA to using completely different cryptography in the future, all while retaining its address/identity.
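
Conceptually, address allocation amounts to nothing more than the following; the type and method names are illustrative, not an actual API:

/// Addresses are plain numbers handed out in creation order, never derived
/// from public keys or contract code
struct AddressAllocator {
    next_address: u128,
}

impl AddressAllocator {
    /// Allocates the next address; once assigned it never changes, even if
    /// the contract's logic evolves later
    fn allocate(&mut self) -> u128 {
        let address = self.next_address;
        self.next_address += 1;
        address
    }
}

fn main() {
    let mut allocator = AddressAllocator { next_address: 0 };
    // A wallet and a token look identical externally: each is just a
    // contract at the next available address
    let wallet = allocator.allocate();
    let token = allocator.allocate();
    assert_eq!((wallet, token), (0, 1));
}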

This not only includes contracts created/deployed by users/developers, but also some fundamental blockchain features. For example, there are system contracts like code and state. In most blockchains, the code of a contract is stored in a special location by the node and retrieved before processing a transaction. Here, the code contract manages the code of all contracts instead, and deployment of a new contract is a call to the system code contract. There is no special host function provided by the node. The code contract will store the code in the corresponding slot of the newly created contract (see Storage model below for more details), which can be read by the execution environment (which knows about system contracts).

A few examples of contracts:

  • a wallet (can be something simple that only checks signature or a complex smart wallet with multisig/2FA)
  • utility functions that offer some shared logic like exotic signature verification
  • various kinds of tokens, including native token of the blockchain itself
  • even fundamental pieces of logic that allocate addresses and deploy other contracts are contracts themselves

It’ll be clear later how far this concept can be stretched, but so far the potential is quite high to make as many things as possible “just a contract.”

This helps to reduce the number of special cases distinguishing built-in functionality from something a blockchain user can deploy.

Storage model

All storage owned by a contract is organized into a container that has slots inside. It forms a tree whose root is the root of the contract’s storage, which can be used to generate inclusion/exclusion proofs when processing transactions (see Transaction processing). Having a per-contract tree with storage proofs means consensus nodes are not required to store the state of all contracts, just their storage roots. This is unlike many other blockchains, where a contract may have access to a form of key-value database.

Each slot is managed by exactly one of the existing contracts and can only be read or modified by that contract. A contract’s code and state are also slots managed by contracts (system contracts), even though the developer-facing API might abstract this in a more friendly way. It is possible for a contract to manage one of its own slots too, like when a token contract owns some number of its own tokens.

A somewhat imperfect, but hopefully useful analogy is a cloud server. A server is owned by a provider, but managed by a customer. The provider typically doesn’t have remote access to the server a customer orders; all changes to the software that the server runs are done by the customer. Similarly, slots owned by a contract are managed by other contracts.

In contrast to most other blockchains, by “state” we refer to the inherent state of the contract itself, rather than things that might belong to end users. The right mental model is to think of it as the global state of a contract.

Let’s take a generic fungible token as an example. The system state contract will manage its state, stored in the corresponding slot owned by the token contract. The state will contain things like total supply and potentially useful metadata like the number of decimal places and the ticker, but not the balances of individual users.

In pseudocode, more traditional blockchains:

fancy_token = {
    // Global state
    totalSupply: 50_000,
    ticker: "FANCY",

    // Individual balances of accounts in a global hashmap
    balances: {
        alice_wallet: 10,
        bob_wallet: 15,
    },
}

Described storage model with slots:

fancy_token = {
    // Global state owned by `fancy_token` in a slot managed by `state` contract
    state: {
      totalSupply: 50_000,
      ticker: "FANCY",
    },
    code: "...",
}

alice_wallet = {
    // Balance owned by `alice_wallet` in a slot managed by `fancy_token` contract
    fancy_token: {
        balance: 10,
    },
    code: "...",
}

bob_wallet = {
    // Balance owned by `bob_wallet` in a slot managed by `fancy_token` contract
    fancy_token: {
        balance: 15,
    },
    code: "...",
}

The state of the contract (and any other slot) is typically bounded in size and defined by the contract developer upfront. Bounded size allows the execution environment to allocate the necessary amount of memory and to limit the amount of data that potentially needs to be sent with a transaction over the network (see Transaction processing).

This implies there can’t be a traditional unbounded hashmap there. Instead, balances are stored in slots of the contracts that own the balance (like a smart wallet owned by an end user), but managed by the token contract. This is similar to how a contract’s state and code are managed by the corresponding system contracts.
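
For illustration, a token’s slot types could be declared as small fixed-size structs so that their maximum size is known upfront (this exact layout is hypothetical, not a prescribed format):

/// Illustrative global state of a fungible token: every field has a fixed
/// size, so the slot's maximum size is known before execution. There is
/// deliberately no growable hashmap of balances here.
#[allow(dead_code)]
struct TokenState {
    total_supply: u128,
    decimals: u8,
    ticker: [u8; 8], // fixed-size ticker instead of a growable string
}

/// Illustrative per-wallet balance, stored in a slot owned by the wallet
/// contract but managed by the token contract
#[allow(dead_code)]
struct TokenBalance {
    balance: u128,
}

fn main() {
    // Both slot types have sizes known at compile time
    println!("state slot: {} bytes", core::mem::size_of::<TokenState>());
    println!("balance slot: {} bytes", core::mem::size_of::<TokenBalance>());
}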

Visually, it looks something like this:

[Diagram: a wallet contract and a token contract side by side — each contract’s Code and State slots are managed by the system Code and State contracts, while the wallet’s Balance slot is managed by the token contract]

Contracts do not have access to the underlying storage implementation in the form of a key-value database; instead, they modify slots as the only way of persisting data between transactions.

Contract I/O

Not only do contract methods not have access to a general-purpose key-value store (even one private to the contract), they don’t have access to any data other than what was explicitly provided as method input. They also can’t return data in any way other than through return arguments. The execution environment will pre-allocate memory for all slots/outputs and provide it to the method to work with, removing the need for heap allocation in many cases.

One can think about contract logic as a pure function: it takes inputs and slots, potentially modifies slots and returns outputs.

Conceptually, all methods look something like this:

#[contract]
impl MyContract {
    /// Stateless compute
    #[view]
    pub fn add(
        #[input] &a: &u32,
        #[input] &b: &u32,
    ) -> u32 {
        a + b
    }

    /// Calling another contract (in this case into itself)
    #[view]
    pub fn add_through_contract_call(
        #[env] env: &Env,
        #[input] &a: &u32,
        #[input] &b: &u32,
    ) -> Result<u32, ContractError> {
        env.my_contract_add(env.own_address(), a, b)
    }

    /// Modifying its own state using the contents of the slot
    #[update]
    pub fn self_increment_by_slot(
        &mut self,
        #[slot] slot: &MaybeData<u32>,
    ) -> Result<u32, ContractError> {
        let old_value = self.value;

        // The slot may or may not exist yet
        let Some(slot_value) = slot.get().copied() else {
            return Err(ContractError::Forbidden);
        };

        self.value = old_value
            .checked_add(slot_value)
            .ok_or(ContractError::BadInput)?;
        Ok(old_value)
    }
}

The environment handle (&Env or &mut Env) allows calling other contracts and requesting ephemeral state; contract slots can be read and written to, inputs are read-only and outputs are write-only. & or &mut in Rust limits what can be done with these types; there is no other implicit “global” way to read or update the ephemeral or permanent state of the blockchain except through these explicit arguments.

Handling everything through explicit inputs and outputs results in a straightforward approach to implementation, analysis and testing, without side effects. In many cases, even heap allocations can be avoided completely, leading to fast and compact smart contract implementations.

The #[contract] macro and attributes like #[env] do not impact the code in a method in any way, but help to generate additional helper data structures, functions and metadata about the contract. The macro also verifies a lot of different invariants about the contract, with helpful compile-time error messages if something goes wrong: for example, when different methods use different types for the #[slot] argument, or when a type not allowed for FFI is used in an argument.
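
Since the macro leaves method bodies untouched, methods can be exercised directly in ordinary unit tests with no blockchain simulation at all. A minimal sketch for the MyContract example above, assuming the method remains callable as a plain associated function:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn add_works_without_any_blockchain_setup() {
        // `add` is a pure function of its inputs: no mocks, no global
        // state, no side effects to set up or assert against
        assert_eq!(MyContract::add(&2, &3), 5);
    }
}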

Method call context

When calling into another contract, a method context needs to be specified. The correct mental model for context is “user of the child process,” where “process” is a method call. Essentially, something executed with the context of a contract can be thought of as done “on behalf of” that contract, which depending on circumstances may or may not be desired.

Initially, the context is “Null.” For each call into another contract, the context of the current method can be either preserved, reset to “Null” or replaced with the current contract’s address. Those are the only options. Contracts do not have the privilege to change the context to the address of an arbitrary contract.

The safest option is to reset the context to “Null,” which means the called contract will be able to “know” who called it, but unable to act on the caller’s behalf in any further calls. Preserving the context allows “delegating” certain operations to another contract, which, while potentially dangerous, enables more advanced use cases.
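
A minimal model of these three options in plain Rust; the names are illustrative, not the actual API:

/// The only context choices available to a caller
#[derive(Clone, Copy)]
enum MethodContext {
    /// Preserve the caller's context for the child call (delegation)
    Keep,
    /// Reset the child call's context to "Null" (the safest option)
    Reset,
    /// Replace the context with the current contract's own address
    Replace,
}

/// A context is either "Null" (`None`) or some contract address
fn child_context(current: Option<u128>, own_address: u128, choice: MethodContext) -> Option<u128> {
    match choice {
        MethodContext::Keep => current,
        MethodContext::Reset => None,
        MethodContext::Replace => Some(own_address),
    }
}

fn main() {
    // A contract at address 7 is called with context 42 and resets the
    // context before calling further: the callee can see who called it,
    // but cannot act on behalf of 42 deeper in the call stack
    assert_eq!(child_context(Some(42), 7, MethodContext::Reset), None);
    assert_eq!(child_context(Some(42), 7, MethodContext::Keep), Some(42));
    assert_eq!(child_context(Some(42), 7, MethodContext::Replace), Some(7));
}

Notably, there is no option to set the context to an arbitrary address, which is exactly what prevents impersonation.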

In addition to the argument attributes mentioned before, there is also #[tmp], which is essentially ephemeral storage that only lives for the duration of a single transaction. It can be used for temporary approvals, allowing the use of the “Null” context for most operations, while still allowing contracts to do certain operations effectively on behalf of the caller. For example, in a transaction, the first call might approve a DeFi contract to spend some tokens and the second call might ask the DeFi contract to actually do the operation. Both calls are done with the “Null” method context, but still achieve the desired effect with the least permission possible.

Metadata

Metadata about a contract is a crucial piece used in many places of the system. Metadata essentially describes, in a compact binary format, all traits and methods that the contract implements, alongside recursive metadata of all types used in those methods.

Metadata contains exhaustive details about each method, allowing the execution environment to encode and decode arguments for method calls from metadata alone.

Metadata can also be used to auto-generate user-facing interfaces and FFI bindings for other languages, since it contains relatively basic types.
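
As a rough illustration only (the real format is compact binary and will certainly differ in detail), the information metadata carries has roughly this shape:

/// Hypothetical, simplified shape of the type information in metadata;
/// nested types are described recursively
#[allow(dead_code)]
enum TypeMetadata {
    U32,
    U128,
    Address,
    Struct {
        name: String,
        fields: Vec<(String, TypeMetadata)>,
    },
}

/// Hypothetical, simplified shape of per-method metadata
#[allow(dead_code)]
struct MethodMetadata {
    name: String,
    inputs: Vec<(String, TypeMetadata)>,
    output: Option<TypeMetadata>,
}

fn main() {
    // Metadata for `add(a: u32, b: u32) -> u32` from the earlier example;
    // arguments can be encoded, decoded and rendered from this alone
    let add = MethodMetadata {
        name: "add".to_string(),
        inputs: vec![
            ("a".to_string(), TypeMetadata::U32),
            ("b".to_string(), TypeMetadata::U32),
        ],
        output: Some(TypeMetadata::U32),
    };
    println!("{} takes {} inputs", add.name, add.inputs.len());
}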

The same metadata can also be used by a transaction handler contract to encode/decode method calls in a transaction. This is huge: with metadata being an inherent part of the contract itself, hardware wallets can accurately and verifiably render transaction contents in a relatively user-friendly way, especially on devices with larger screens. This means no more blind signing and no need to trust the computer the wallet is connected to; the wallet can verify and render everything on its own.

Transactions abstraction

Transactions are a way to run logic on a blockchain, which in blockchains with smart contracts means calling contracts. There are different ways to do this, but to make things as flexible as possible, few assumptions are made about what a transaction actually looks like. There are many use cases that should be supported, and it is tough to foresee them all.

Since, as mentioned in Contract overview, “everything is a contract,” the majority of transaction processing must be done by a contract too. To do this, the contract must implement the following “transaction handler” interface (simplified for readability):

pub struct TransactionHeader {
    pub block_hash: Blake3Hash,
    pub gas_limit: Gas,
    pub contract: Address,
}

pub struct TransactionSlot {
    pub owner: Address,
    pub contract: Address,
}

pub type TxHandlerPayload = [u128];
pub type TxHandlerSlots = [TransactionSlot];
pub type TxHandlerSeal = [u8];

#[contract]
pub trait TxHandler {
    /// Verify a transaction
    #[view]
    fn authorize(
        #[env] env: &Env,
        #[input] header: &TransactionHeader,
        #[input] read_slots: &TxHandlerSlots,
        #[input] write_slots: &TxHandlerSlots,
        #[input] payload: &TxHandlerPayload,
        #[input] seal: &TxHandlerSeal,
    ) -> Result<(), ContractError>;

    /// Execute previously verified transaction
    #[update]
    fn execute(
        #[env] env: &mut Env,
        #[input] header: &TransactionHeader,
        #[input] read_slots: &TxHandlerSlots,
        #[input] write_slots: &TxHandlerSlots,
        #[input] payload: &TxHandlerPayload,
        #[input] seal: &TxHandlerSeal,
    ) -> Result<(), ContractError>;
}

High-level transaction processing workflow:

[Diagram: TxHandler::authorize() → charge gas → TxHandler::execute() → refund gas]

TxHandler::authorize() is a method called by the execution environment that must, in a limited amount of time, either authorize further processing of the transaction or reject it. It can read the state of the blockchain, but can’t modify it. If authorization succeeds, the execution environment will charge TransactionHeader.gas_limit gas, call TxHandler::execute() and return unused gas afterward. It is up to the node to decide how much compute to allow during authorization, but a reasonable default for reference hardware will be used, enough for typical signature verification needs. Compute involved in transaction authorization will be added to the total gas usage. seal is where the signature will typically be stored, although that is more of a convention than a strict requirement.
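
Conceptually, the gas accounting around these two calls looks something like the following simplified model with placeholder types (not the actual node implementation):

struct Transaction {
    gas_limit: u64,
    // header, slots, payload and seal omitted for brevity
}

/// Read-only, time-limited check (e.g. signature verification)
fn authorize(_tx: &Transaction) -> Result<(), &'static str> {
    Ok(())
}

/// Executes method calls, returning the gas actually used
fn execute(tx: &Transaction) -> Result<u64, &'static str> {
    Ok(tx.gas_limit / 2)
}

/// Authorize, charge the full gas limit upfront, execute, refund the rest
fn process_transaction(balance: &mut u64, tx: &Transaction) -> Result<(), &'static str> {
    authorize(tx)?;
    *balance = balance
        .checked_sub(tx.gas_limit)
        .ok_or("cannot afford gas limit")?;
    let gas_used = execute(tx)?;
    *balance += tx.gas_limit - gas_used;
    Ok(())
}

fn main() {
    let mut balance = 100;
    let tx = Transaction { gas_limit: 40 };
    process_transaction(&mut balance, &tx).unwrap();
    assert_eq!(balance, 80); // 40 charged upfront, 20 used, 20 refunded
}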

TxHandler::execute() is responsible for transaction execution, meaning making method calls. Method calls are, by convention, serialized into payload (u128 is used to ensure its alignment in memory for performance reasons and to enable zero-copy throughout the system). It is up to the contract how it wants to encode method calls there, though an optimized reference implementation is provided. While typically not needed, authorization code may also inspect payload to, for example, only allow certain method calls and not others.

This separation should be enough to build all kinds of contracts that can serve as a “wallet” for the user: from those that do simple signature verification to complex multisig wallets with sophisticated role-based permission systems. There is a large space of tradeoffs to explore.

read_slots and write_slots contain the list of slots (see Storage model) that will be read or possibly modified during transaction execution. Most contracts will not need to inspect them in detail, though they can be used to constrain interaction to a limited set of contracts if needed. This information is crucial for being able to schedule concurrent execution of non-conflicting transactions, leveraging the fact that modern CPUs have multiple cores. This is primarily enabled by the storage model, which makes the storage used by contracts well suited for concurrent execution by avoiding data structures like global hashmaps that are likely to be updated by multiple transactions in a block.

Transaction processing

A transaction submitted to the network will include not only inputs to the method calls, but also the storage items (code, state, other slots) required for the transaction to be processed, alongside corresponding storage proofs. This allows consensus nodes to not store the state of contracts beyond a small root, yet still be able to process incoming transactions, leading to much lower disk requirements. This is especially true in the presence of dormant contracts without any activity for long periods of time, and generally removes the need to charge “rent” for state. It is, of course, possible for a node to have a cache to reduce or remove the need to download frequently used storage items.

Each method call of the contract has metadata associated with it describing which slots will be read or modified, alongside any inputs or outputs it expects and their type information. With this information and the read_slots/write_slots included in the transaction, the execution engine can run non-conflicting transactions in parallel.

For example, a balance transfer between two accounts doesn’t change the total issuance of the token. So there is no need to change the global state of the token contract, and no reason why such transfers affecting disjoint sets of accounts can’t run in parallel.
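
A minimal sketch of the conflict check this enables (the real scheduler is certainly more involved): two transactions can run concurrently when neither writes a slot the other touches.

use std::collections::HashSet;

/// Slots a transaction declares upfront, as in read_slots/write_slots
struct DeclaredSlots {
    reads: HashSet<u64>,
    writes: HashSet<u64>,
}

/// Two transactions conflict if either writes something the other touches
fn conflict(a: &DeclaredSlots, b: &DeclaredSlots) -> bool {
    a.writes
        .iter()
        .any(|slot| b.reads.contains(slot) || b.writes.contains(slot))
        || b.writes.iter().any(|slot| a.reads.contains(slot))
}

fn main() {
    // Two token transfers between disjoint sets of wallets: both read the
    // token's code/state slots, but write different balance slots, so
    // they can safely run in parallel
    let tx1 = DeclaredSlots {
        reads: HashSet::from([100]),
        writes: HashSet::from([1, 2]),
    };
    let tx2 = DeclaredSlots {
        reads: HashSet::from([100]),
        writes: HashSet::from([3, 4]),
    };
    assert!(!conflict(&tx1, &tx2));
}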

Not only that, storage items used in each method call follow a Rust-like ownership model, where a contract can’t recursively call its own method that mutates an already accessed slot, because that would violate safety invariants. Recursive calls of stateless or read-only methods are fine, though.

The right mental model is that storage access works like shared & and exclusive &mut references. It is possible to have multiple shared references to the same slot at the same time. For exclusive access to a slot already being accessed in a recursive call, the caller must share it as an input instead, explicitly lending the data. As a result, multiple calls (in the same transaction or even different transactions) can read the same slot concurrently, but only one of them is allowed to mutate a particular storage item at a time. Any violation aborts the corresponding method, which the caller can observe and either handle or propagate further up the stack.

This makes traditional reentrancy attacks impossible in such an execution environment.

Conceptually in pseudocode with RwLock it looks something like this:

use std::sync::{RwLock, TryLockError};

struct Data;
struct Error;

// A failure to take a lock maps to a contract-level error
impl<T> From<TryLockError<T>> for Error {
    fn from(_: TryLockError<T>) -> Self {
        Error
    }
}

fn entrypoint(data: &RwLock<Data>) -> Result<(), Error> {
    // This is the first write access, it succeeds
    let mut data_write_guard = data.try_write()?;

    // This will fail because we still have write access to the data
    if call_into_other_contract(data).is_err() {
        // This is okay, the data was given as an explicit argument
        modify_data(&mut data_write_guard);
    }

    Ok(())
}

fn call_into_other_contract(data: &RwLock<Data>) -> Result<(), Error> {
    // Only succeeds if there isn't already write access elsewhere
    data.try_read()?;

    Ok(())
}

fn modify_data(_data: &mut Data) {}

Here is a visual example:

[Diagram: three contracts — fn compute(...) with no state (Contract 1), fn update(&mut self, ...) mutating its own state (Contract 2) and fn read(&self, ...) reading state (Contract 3); calls that follow the borrowing rules succeed (✅), while a call requiring exclusive access to an already borrowed slot fails (❌)]

Such a loop will be caught and the transaction will be aborted:

[Diagram: fn update(&mut self, ...) on Contract 1 (mutates own state) calls fn read(&self, ...) on Contract 2, which calls back into Contract 1; the re-entrant call fails (❌) and the transaction is aborted]

How can you help?

There are a number of things that are being worked on, but there are also things that we’d really like some help with or to collaborate on. If any of this is interesting to you, join our Zulip chat and let’s discuss it. The list will be expanded over time.

If you have ideas that are not mentioned below, feel free to reach out and share them.

There may or may not be funding available for these things.

RISC-V VM

We need a RISC-V VM. The basic requirements are as follows:

  • Supports an ELF shared library as its input format; must be able to run it straight out of the compiler without any additional processing
  • Able to run RV64E code with popular extensions (probably RV64EMAC to start, adding vector and cryptographic extensions afterward)
  • Runs in a secure minimal sandbox (like seccomp, possibly in a hardware-accelerated VM)
  • Cross-platform (Linux, macOS and Windows) deterministic execution
  • Has low overhead gas metering
  • High performance (~50% of native speed is highly desirable, which includes gas metering)
  • Low memory usage
  • Quick instantiations to support cheap and frequent cross-contract calls

PolkaVM satisfies some of the requirements above, but not all, and doesn’t fully align in its design goals, but there might be an opportunity for collaboration.

P2P networking stack

We need a P2P networking stack. There is a prototype already, but it’ll need to be expanded significantly with sharding and blockchain needs in mind. Some requirements:

  • TCP-based
  • Likely libp2p-based (strictly speaking, not a hard requirement, but very desirable for interoperability)
  • Low overhead and high performance
  • Zero-copy whenever possible
  • Support for custom gossip protocols (block and transaction propagation, proof-of-time notifications)
  • Support for both structured (for distributed storage network) and unstructured (for blockchain) architecture

There is a networking stack inherited from the Subspace reference implementation, which is focused on distributed storage network needs. It generally works, though bootstrapping time and download speeds can be improved. It will likely need to be upgraded with support for various blockchain needs that were previously provided by the Substrate framework, but will have to be reimplemented differently.

Funding model for core contributors

It would be really nice to find a sustainable funding model for core contributors and valuable community members. Here are some requirements:

  • Fully on chain without legal entities and jurisdiction constraints
  • Ability to discover and reward valuable contributions without relying on committees/governance
  • Ideally P2P or using small DAOs

The first thing that comes to mind is having multiple reward addresses specified during farming, with a specified portion of rewards probabilistically ending up in the wallet of the developer/contributor the farmer wants to support. This doesn’t solve the problem of discoverability, though. One way or another, there should not be a big treasury/governance structure responsible for managing funds; it should be more direct and more distributed.