Authored by Rahul Sharma
A brief overview of the development of the bin_prot serialization library in Rust.
In this post, we'll give a brief overview of the development of a serialization library that nodes running the Mina protocol use to communicate with one another.
A decentralized peer-to-peer network such as the Mina blockchain needs a set of rules (aka protocols) that its nodes follow to communicate their state to each other and stay in sync. Mina uses a wire protocol - a way of getting data from point to point in computer networking - called bin_prot.
What is bin_prot?
Bin_prot is the binary serialization library (aka the wire protocol format) developed by Jane Street, a global proprietary trading firm and one of the major contributors to the OCaml language and its ecosystem. Serialization is the process by which computers convert objects in memory into bytes, which can then be stored or transmitted. De-serialization is the reverse: reconstructing an object from bytes. The serialization format contains rules that ensure that when a given object is de-serialized, the structure is correctly reconstructed - even in different computing environments.
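To make the round trip concrete, here is a minimal sketch in Rust. The `Point` type and the byte layout (two little-endian `u16` fields) are made up for illustration and are not bin_prot's encoding; the point is only that both sides agree on one rule for turning a value into bytes and back.

```rust
/// A toy record used only for this illustration.
#[derive(Debug, PartialEq)]
struct Point {
    x: u16,
    y: u16,
}

/// Serialize: turn the in-memory value into a flat sequence of bytes.
/// The layout (x then y, each as 2 little-endian bytes) is a made-up rule,
/// not bin_prot's encoding.
fn serialize_point(p: &Point) -> Vec<u8> {
    let mut out = Vec::with_capacity(4);
    out.extend_from_slice(&p.x.to_le_bytes());
    out.extend_from_slice(&p.y.to_le_bytes());
    out
}

/// De-serialize: rebuild the value from bytes using the same rule.
/// Returns None when the input is not exactly 4 bytes long.
fn deserialize_point(bytes: &[u8]) -> Option<Point> {
    if bytes.len() != 4 {
        return None;
    }
    Some(Point {
        x: u16::from_le_bytes([bytes[0], bytes[1]]),
        y: u16::from_le_bytes([bytes[2], bytes[3]]),
    })
}
```

Because the rule is fixed, any program that knows it can rebuild the same `Point` from those four bytes, regardless of which machine produced them.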
Mina's reference implementation in OCaml uses bin_prot as the low level serialization format for communicating data and the chain state across nodes.
Bin_prot relies on OCaml's PreProcessor eXtensions (aka PPXs, which allow developers to add new syntax and features to the OCaml language that would otherwise not be possible) to automatically generate bin_prot read and write methods at compile time for any supported type by simply annotating it with a [@@deriving bin_io] annotation.
Here we define a wrapper module Bool containing a type t, which is of type bool, with deriving annotations (including show) on it, which generate code at compile time.
When compiled, the above code will generate functions such as bin_write_t and bin_read_t for the type t (that is, read and write methods on the type bool) due to the deriving annotation.
Having these serialization functions generated at compile time removes the need to perform type reflection at runtime to serialize objects, thereby making the serialization phase very efficient and robust.
Bin_prot defines the specification for most of OCaml's data structures and types. At the time of writing of this post, bin_prot is very much tied to the OCaml language and its built-in data types. As more users of bin_prot emerge, the spec is expected to turn into a language-agnostic specification.
Why was bin_prot chosen for Mina?
According to the bin_prot documentation on Jane Street's GitHub, bin_prot is a type-safe binary protocol that is extremely efficient at serialization and de-serialization. This is true even on highly structured objects and values. It can do so at high speeds, and is optimized for size - making it ideal for long-term storage of large amounts of data.
The bin_prot library is also battle-tested, having seen usage in mission-critical financial applications that not only transmit millions of messages a day, but must also do so in real time, with low latency, and in a crash-proof way. Bin_prot also supports all CPU architectures currently supported by OCaml.
Bin_prot itself is a well-supported protocol in the OCaml community, and it is tailor-made for OCaml values. More than that, it is a simple type-driven protocol with first-class support for Algebraic Data Types (ADTs). Needless to say, this makes bin_prot an excellent wire protocol for Mina given the OCaml reference implementation and the library's intimate relationship with the language.
Writing bin_prot in Rust
As part of the first milestone for the Rust implementation of the Mina Protocol, ChainSafe will build support for the bin_prot spec. As of this writing, we have implemented a serde-compatible, bin_prot-compliant library: serde-bin-prot. Serde is a framework for serializing and deserializing Rust data structures efficiently and generically. Similar to how bin_prot does it, serde generates serialization and deserialization methods at compile time for supported types.
Given how Jane Street's bin_prot uses PreProcessor eXtensions, it was natural for us to pick Rust's Serde framework, which comes with a #[derive] macro that works very similarly to OCaml's PreProcessor eXtensions. We chose Serde because it is the de facto way of serializing Rust data structures, with as little overhead as possible, thanks to the zero-copy philosophy in its design.
Rust trivia: Rust's compiler was originally written in OCaml
Serde is not a serialization library. Rather, it is a framework that developers can use to provide serialization support to and from Rust's data structures in any format they want. To support a serialization format in Serde, all we need to do is implement the Serializer and Deserializer traits, and the rest is taken care of by Serde's well-designed and well-abstracted APIs.
Here's a contrived code snippet implementing serialization of a Rust bool type in serde:
From the bin_prot boolean spec, a value of true is serialized as 0x01 and false as 0x00. The Serializer struct keeps a reference to a Write instance, which can be either a byte buffer or a file handle. The ser::Serializer trait from serde is then implemented on Serializer, where we implement the serialize_bool trait method that writes the byte value according to the spec. Similarly, we have methods for other types.
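As a rough, std-only sketch of the shape described above: the `Serializer` struct below wraps anything that implements `Write` and has a `serialize_bool` method that emits one byte. It deliberately does not implement serde's actual `ser::Serializer` trait (which requires many more methods), and the names `Serializer::new` and `bool_to_bytes` are assumptions made for this example rather than the serde-bin-prot API.

```rust
use std::io::{self, Write};

/// Illustrative stand-in for a bin_prot serializer. It only mirrors the shape
/// of serde's `ser::Serializer`; it does not implement that trait.
struct Serializer<W: Write> {
    writer: W,
}

impl<W: Write> Serializer<W> {
    fn new(writer: W) -> Self {
        Serializer { writer }
    }

    /// bin_prot boolean rule: `true` is the single byte 0x01, `false` is 0x00.
    fn serialize_bool(&mut self, v: bool) -> io::Result<()> {
        let byte = if v { 0x01u8 } else { 0x00u8 };
        self.writer.write_all(&[byte])
    }
}

/// Hypothetical convenience helper: serialize one bool into a fresh buffer.
fn bool_to_bytes(v: bool) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // `&mut Vec<u8>` implements `Write`, so the serializer only borrows the buffer.
    Serializer::new(&mut buf).serialize_bool(v)?;
    Ok(buf)
}
```

Since the writer is generic over `Write`, the same `serialize_bool` would work unchanged with a `std::fs::File` in place of the in-memory buffer.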
Testing the serde-bin-prot library
An "expect test" is a test where you don't manually write the output you'd like to check your code against - instead, this output is captured automatically and inserted by a tool into the testing code itself. If further runs produce different output, the test fails, and you're presented with the diff.
Apart from having our own test suite testing serialization and deserialization within Rust, a wire protocol library is expected to be compatible across languages. So we looked towards porting the original bin_prot tests.
The initial ported interop test code was very procedural. We created a byte array buffer and equivalent structs and wrote assert statements for each test case. This felt like a violation of the DRY (don't repeat yourself) principle and would have made updating the test cases quite a chore in the future. So we started exploring ways to write test cases similar to the generated expect blocks in the original bin_prot tests. The expect blocks are generated by the ppx_expect library. Instead of writing an expect block generator, we created a macro that lets you write the test cases just as they are written in the original bin_prot tests.
Here's what the macro we wrote looks like:
This macro is a declarative macro which accepts any sequence of .. (representing no bytes), followed by byte arrays, followed by a -> and a value whose byte representation we want to assert against. If the input to the macro matches the pattern, the code after => is generated for each invocation. As you can see, macros are complex to write and read since they work at the meta-language level. But the abstraction that they let you build makes it worth writing them. With that macro in place, we could write our tests very similarly to how bin_prot's ppx_expect does it:
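The following is a simplified sketch of such a declarative macro, not the one from serde-bin-prot. The names `expect_bytes!` and `encode_bool` are hypothetical, every byte is written with a `0x` prefix to keep the pattern simple, and bytes are compared left to right, which is an assumption of this sketch.

```rust
/// Hypothetical encoder under test: the bin_prot boolean rule (0x01 / 0x00).
fn encode_bool(v: bool) -> Vec<u8> {
    vec![if v { 0x01 } else { 0x00 }]
}

/// Hypothetical expect-style macro. After the encoder name, each case is:
///   zero or more `..` placeholders (meaning "no byte here"),
///   one or more byte literals, then `->` and the value to encode.
macro_rules! expect_bytes {
    ($encode:ident; $( $( .. )* $( $byte:literal )+ -> $value:expr ),+ $(,)?) => {
        $(
            {
                // The `..` tokens are matched but never expanded, so they add nothing.
                let expected: Vec<u8> = vec![$( $byte ),+];
                assert_eq!($encode($value), expected, "bytes for {}", stringify!($value));
            }
        )+
    };
}
```

With it, a test body can read `expect_bytes! { encode_bool; .. .. 0x01 -> true, .. .. 0x00 -> false }`, and each case expands into its own `assert_eq!`.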
Sweet! The tests now look almost the same as the original bin_prot expect blocks, despite a few quirks such as having to prefix 0x for any hex value containing a-f. But the fact that you can create custom DSLs with Rust macros using pattern matching is incredible!
With that, our test cases' LOC went down by quite a bit (+219 −340), and the tests are easier to read, maintain, and update.
Until next time!
To learn more about Mina, be sure to visit their website!
What does the future look like? Nobody knows, but it starts with infrastructure. Join us at ChainSafe to build the brazen future. Check out the careers page of our website and our open positions, and get in touch with us at firstname.lastname@example.org if you are interested!
Also, be sure to check out and follow ChainSafe's Twitter and YouTube Channel! If you would like to get in contact with one of the Mina implementation team members from ChainSafe, feel free to drop by on our Discord.