It was a challenging week working on storage access checks for slots, but it is over, and I'm quite happy with how things are looking right now. Some extra refactoring also made it possible to run tests under Miri, which spotted a few things that violated Rust's safety rules.
The work from the previous week continued with reworking the way slots are managed by the native execution environment to correctly handle recursive method calls and potential access violations. It finally concluded in PR 61, with some follow-up fixes in later PRs.
There were several challenges with it that stem from the desire to achieve high performance while retaining efficiency and maintainability. In the end, the following rules were established: a single recursive call can modify storage, but multiple calls dispatched at once (meant to be parallel, though they aren't right now) get only a read-only view. This should fit the expected use cases nicely and help constrain code complexity. There are a few paragraphs in PR 61 that explain the goals and results in more detail if you're interested in learning more.
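A minimal sketch of that rule using plain Rust borrow semantics is below; Env, Slots, call_single and dispatch_many are hypothetical names that only illustrate the exclusive-write vs shared-read split, not the actual API from PR 61:
// Hypothetical types and methods, for illustration only.
struct Slots {
    value: u64,
}

struct Env {
    slots: Slots,
}

impl Env {
    // A single (possibly recursive) call gets exclusive access and may modify
    // storage.
    fn call_single(&mut self, method: impl FnOnce(&mut Slots)) {
        method(&mut self.slots);
    }

    // Multiple calls dispatched at once (meant to become parallel later) only
    // get a shared, read-only view of storage.
    fn dispatch_many(&self, methods: &[&dyn Fn(&Slots)]) {
        for method in methods {
            method(&self.slots);
        }
    }
}

fn main() {
    let mut env = Env { slots: Slots { value: 0 } };
    env.call_single(|slots| slots.value += 1);
    env.dispatch_many(&[
        &|slots| assert_eq!(slots.value, 1),
        &|slots| assert_eq!(slots.value, 1),
    ]);
}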
I've been thinking about address formats some more and decided that for a global system 44 bits for addresses is really not enough; it should be way more than that. Addresses were also stored as [u8; 8] instead of u64 to reduce alignment requirements for data structures that might contain them, so the question became what a bigger address should look like and how much bigger it should really be. I then looked at RISC-V (planned to be used for a VM) assembly for different operations on byte arrays. Comparing two addresses is the most common operation here, and it turned out that byte array comparison generates way more assembly instructions to do the same job. This is due both to the RISC nature of the ISA and to the fact that the alignment of a byte array is 1. x86-64 has powerful instructions to read unaligned byte ranges into XMM registers and compare all bytes at once, while RISC-V assembly (at least the way it is generated by rustc for riscv64imac-unknown-none-elf) was comparing bytes one pair at a time.
As a result, I decided that u128 will be the address format, which might be relaxed to a pair of u64s to reduce the alignment requirement from 16 bytes to 8 (RISC-V assembly compares 64-bit halves separately rather than the full 128-bit value at once anyway). This landed in PR 63, which also included some refactoring of slots management, given how large a pair of addresses (owner + contract are used to identify a slot) has become.
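To illustrate the trade-off, here is a small sketch with a purely hypothetical Address type: two u64 halves keep the alignment at 8 bytes instead of 16 (a single u128), while still compiling to a couple of wide comparisons rather than the byte-at-a-time code that a 1-aligned [u8; 16] tends to produce on riscv64imac-unknown-none-elf:
#[derive(Copy, Clone, Eq, PartialEq)]
#[repr(C)]
struct Address {
    // Hypothetical layout: two aligned 64-bit halves instead of one u128.
    hi: u64,
    lo: u64,
}

// Comparing two aligned u64 pairs compiles down to a couple of wide loads and
// comparisons, unlike a 1-aligned byte array compared byte by byte.
fn same_address(a: &Address, b: &Address) -> bool {
    a.hi == b.hi && a.lo == b.lo
}

fn main() {
    let a = Address { hi: 1, lo: 2 };
    let b = Address { hi: 1, lo: 2 };
    assert!(same_address(&a, &b));
    assert_eq!(std::mem::align_of::<Address>(), 8);
}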
Based on a developer interview with Shamil, I clarified and expanded the documentation in PR 64, which I hope will make it easier to understand.
I did some initial benchmarks with PR 61, and it turned out it is possible to create an environment instance and call Flipper::flip on it about four million times per second on a single CPU core, which gives you a good perspective of how much overhead there is in typical blockchain environments that can only do orders of magnitude fewer simple transactions per second. After slot optimizations in PR 64 I got curious whether it is possible to do better, and squeezed out another million calls per second in PR 65.
Five million calls per second on a single CPU core, ~200 ns per call! I'm sure it is possible to get even lower while preserving the necessary logic and overall architecture. That is basically the baseline: any cost above it is waste and should be minimized. perf stats look something like this:
1 122,47 msec task-clock:u # 1,000 CPUs utilized
0 context-switches:u # 0,000 /sec
0 cpu-migrations:u # 0,000 /sec
167 page-faults:u # 148,780 /sec
5 459 682 279 cycles:u # 4,864 GHz
74 201 852 stalled-cycles-frontend:u # 1,36% frontend cycles idle
14 406 036 797 instructions:u # 2,64 insn per cycle
# 0,01 stalled cycles per insn
2 470 197 650 branches:u # 2,201 G/sec
14 684 branch-misses:u # 0,00% of all branches
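For context, the relationship between the two numbers is simply 1 s / 5,000,000 calls = 200 ns per call. A rough way to measure that kind of throughput looks like the sketch below; this is not the benchmark harness from the repository, and call_flip_once is a hypothetical stand-in for creating the environment and calling Flipper::flip:
use std::time::Instant;

// Hypothetical stand-in for "create an environment instance and call
// Flipper::flip on it"; the real setup uses NativeExecutor as shown later.
fn call_flip_once() {
    std::hint::black_box(());
}

fn main() {
    let iterations: u32 = 5_000_000;
    let start = Instant::now();
    for _ in 0..iterations {
        call_flip_once();
    }
    let elapsed = start.elapsed();
    println!(
        "{iterations} calls in {elapsed:?} (~{} ns/call)",
        elapsed.as_nanos() / u128::from(iterations)
    );
}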
With storage taken care of for now, there was a small problem that had bothered me for a while: the inability to run tests under Miri. Writing unsafe code in Rust is more challenging than in languages like C, and there is quite a bit of unsafe code in the native execution environment right now due to FFI and performance reasons. So running under Miri was very desirable, but unfortunately not possible with the inventory crate that was used to make the execution environment aware of all the available contracts, so the implicit use of inventory had to go away.
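For readers unfamiliar with it, this is roughly the pattern the inventory crate provides: each contract crate submits a record at link time and the environment later iterates over all of them, with no explicit registration call. The ContractRegistration type below is hypothetical (the real record in the project looks different), and, as far as I understand, it is exactly this life-before-main/linker-section machinery that Miri does not execute:
// Hypothetical registration record; the actual type in the project differs.
struct ContractRegistration {
    crate_name: &'static str,
}

inventory::collect!(ContractRegistration);

// Each contract crate would submit its record implicitly, at link time:
inventory::submit! {
    ContractRegistration { crate_name: "flipper" }
}

fn main() {
    // The execution environment can then discover every contract without any
    // explicit registration call from the user.
    for registration in inventory::iter::<ContractRegistration> {
        println!("found contract: {}", registration.crate_name);
    }
}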
I still wanted to have an ergonomic API though, and that proved to be its own challenge due to the need to register both the contracts themselves and the traits they implement, but traits as such aren't types. The best thing I came up with was to use dyn ContractTrait as a type instead, but then I discovered that associated constants (much like generic methods) make traits not object safe. I found several discussions and summarized the conclusions with some links on the Rust forum. In the next post I shared an unstable (and incomplete!) feature that allows object-safe traits to have associated constants, but it doesn't look likely that it will be stabilized any time soon. Ultimately, I had to split the associated constants into a separate trait (implemented on dyn ContractTrait) and remove the : Contract bound on ContractTrait itself, but it seemed like a price worth paying.
In the end, PR 66 landed a decent API that explicitly registers contracts to be used in the native execution environment (system contracts are registered internally automatically); it looks something like this:
#[test]
fn basic() {
    let shard_index = ShardIndex::from_u32(1).unwrap();
    let mut executor = NativeExecutor::in_memory_empty(shard_index)
        .with_contract::<Flipper>()
        .build()
        .unwrap();
    // ...
}
That also meant tests are finally running under Miri 😱
Yeah, Miri wasn’t too happy initially 😅. It took a lot more reading and some help from the Rust community to figure out why, but eventually I was able to make it work in PR 67, which also added Miri tests to CI 😊.
That was the bulk of the things I got done, with some random research and WIP stuff in a local branch that I will talk about next time. Unfortunately, there were no interviews this week, but hopefully next time!
Upcoming plans
The next step related to the execution environment will be to add the notion of a transaction. So far it was just calling methods on contracts, but the actual blockchain will have inputs serialized into a transaction. While serialization/deserialization already happens when doing calls from contract methods, the API that developers can use externally wasn't shaped that way. With transaction support and more explicit slots handling (the ability to provide them as input and extract them afterward for persistence), the workflow will be partially complete and sufficient for further integration into a bigger system with things like a transaction pool. The transaction pool, of course, doesn't exist yet (just like most other things), but that can be fixed 😉.
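Purely as an illustration of that workflow (none of these types or signatures come from the actual codebase, they are hypothetical), a transaction-shaped input with explicit slots in and out might look roughly like this:
// Hypothetical transaction shape: which contract/method to call plus the
// serialized inputs, reusing the serialization that already exists for
// contract-to-contract calls.
struct Transaction {
    contract: u128,
    method: u16,
    payload: Vec<u8>,
}

// Hypothetical slot representation: identified by owner + contract addresses,
// provided as input and returned afterward for persistence.
struct Slot {
    owner: u128,
    contract: u128,
    contents: Vec<u8>,
}

// Placeholder for executing a transaction against the environment: slots go
// in, (possibly modified) slots come out.
fn execute(tx: &Transaction, slots: Vec<Slot>) -> Vec<Slot> {
    let _ = (tx.contract, tx.method, &tx.payload);
    slots
}

fn main() {
    let tx = Transaction { contract: 1, method: 0, payload: Vec::new() };
    let updated_slots = execute(&tx, Vec::new());
    assert!(updated_slots.is_empty());
}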
Based on developer feedback, I would also like to simplify the contract API a bit, specifically remove #[result] and make it a special case of #[output], which will remove some code duplication in the execution environment and the procedural macro and will be easier to explain.
Once those are done, I will probably conduct more developer interviews (I'll try to hunt down some ink! maintainers or users initially). If there is someone I should definitely talk to, let me know.
Also, hopefully more hiring interviews this time.
See you in about a week with more updates!