Light Client Task Force Call #2
Authored by Colin Schwarz
Light Client Task Force Call #2
Notes and Transcript
Thanks to Cayman Nava for helping to proof read and edit!
On December 4th, 2019, ChainSafe and the Lodestar team sponsored the second Light Client Task Force call. The purpose of these calls is to bring together individuals who are interested in Eth2.0 light clients and start a discussion to drive further collaboration and research in this area. The second call focused on Merkle proof formats and proof negotiation and was led by Lodestar tech lead Cayman Nava. A full recording of the call can be found here.
Below is a TLDR followed by a full transcript of the call. An attendance list is included at the bottom of this article.
TLDR
Sparse Merkle trees: There is still a lot of conjecture around these and research is far from complete. It appears that no one is planning on using them imminently, so the research team can prioritize other areas of the spec before hardening the SSZ sparse Merkle tree spec.
Proof Backing Abstraction: Proofs are generally stable and can be viewed as key-value mappings from indices to data in a tree, abstracted from their specific serialization. More work is needed for tooling to abstract proof backings and provide a high level interface for proof generators and consumers. On-chain backings likely optimize away the tree-structure in favor of statically determined offsets into serialized data, off-chain backings can be more flexible.
Proof Request Landscape: Different actors will likely provide proofs for different pieces of the data. Consensus data is held by beacon nodes / validators. EE data is held by state providers. Multiple parties ranging from block producers to end-users require EE-level proofs, so infrastructure can be shared directly at that level.
Proto's writeup on binary-tree-backed SSZ objects: Representing some SSZ data as a merkle tree with typing overlaid will provide interesting prototyping opportunities for proof generation/consumption and potential performance gains in beacon nodes. There's interest in developing a Rust and Typescript implementation.
Call #3: Will likely be week of Jan 20. There is no specific topic yet, we are open to suggestions!
Transcript
The following transcript has been edited for clarity and conciseness.
Cayman's intro:
I'd like this to be kind of a brain storming, discussion based meeting and less of just updates, so things where we all benefit from being gathered synchronously in this meeting. I think it might be useful to think about if anyone has any priorities for hardening research or if anyone has any pieces or research that should be prioritized over another. So what features do implementers need? This touches on some of what Matt was talking about with the framework that he's thinking about building (see attendance below for more details). What comes to mind to me is sparse Merkle trees as a standard in SSZ, and that's not quite hardened. Does anyone have any specific needs that aren't being met, things that could be pushed forward that may need a little bit more coordination? Is anyone going to be using any sparse Merkle tree proofs in the coming month or so?
Alex "I think a lot of it is maybe unknown at the moment. A lot of people have talked about them, I think both on the Eth2 side and also the Eth1.x side, so there's even some chatter about changing the current Ethereum state accumulator to be one of these sparse Merkle trees, but has anyone heard anything beyond conjecture at this point?"
Will "I'm not sure that they're wanting to change it to be a sparse Merkle tree. I think they're just changing it to act on a binary tree but I think that's still under research. But Matt, you're using a sparse Merkle tree?"
Matt "I'm trying to not use a sparse Merkle tree though. It seems inefficient."
Proto "The problem so far has been that current SSZ doesn't really have a sparse Merkle tree. When you emulate it, you have a binary tree with a depth of ~32 bytes which is a lot, 256 levels. Whereas with the new sparse Merkle tree proposal which changes this and makes it lower so effectively if you have 1000 accounts that's a depth of 10 on average, a million accounts, that's 20, so the number of accounts scales as a binary tree."
Matt "After its defined can it scale? can it re-balance or is it statically defined?"
Proto "Assuming hashes are universal it just balances itself, it's just that there are certain kinds of attack vectors there if you'd like to insert a lot of keys with the same prefix, if you explicitly hash them you'll do more and more work to create a longer and longer prefix. There are different proposals as well that do balancing, it's also about simplicity here, and consistency with parts of the spec."
Cayman "Ya, so balancing is a pretty big divergence from the rest of our SSZ world."
Proto "Ya compare it to say Bitcoin and think, how much work goes into recreating a prefix and what are the average depth of your tree anyways, given a certain number of accounts. It's the kind of thing we like to hear about and see where the spec should go, but if no one is using them right now in the next month we'll just postpone this for a bit and prioritize the other spec work."
Cayman on proof requests
Ya so I guess it sounds like no one's going to be needing that imminently. So that's good, we can free up some time to do other things. So maybe we can go into the meat of the topics today. I wanted to start with thinking about what kinds of proof requests that light clients are making, and thinking about how there's two different approaches to looking at this. There's kind of a spectrum of specialized kinds of requests vs general requests. On one end of the spectrum you're requesting very specific generalized indices, or very specific paths in a tree. On the general side of the spectrum you're saying ‘I want to run this function, this piece of data, give me the proof that can run that function, that will pick the pieces of the data that I need to run the function. I'm curious if anyone has any thoughts or has done any thinking on this topic. It would be nice to have a general solution but it's a lot easier to start with something that's really specialized so I'd like to open it up.
Matt "I guess from an execution environment point of view, the way that I see that we're going to determine what the proof needs to be, is that we'll have some kind of concept of a stateful node or executor who has all of the data that you would expect to be available in a fully stateful node, and they will run some user transaction that doesn't have the proofs, and then just record the keys that it requested while running the transaction, and then after execution, generate the proof from the keys requested."
Will "What do you mean keys requested in this case?"
Matt "I mean every time I'm running a stateless transaction and want to access data, it's the same thing as doing an s load, I'm asking for some key to get a value back from this mapping in the multiproof. So typically if we're running this statelessly we have a multiproof and we look up that key in that multiproof and if we don't then we revert or we panic or something. But since it's fully stateful, we have all the keys that are in existence so we just need to record the keys that we access and then build a proof after the execution completes."
Cayman "Gotcha. So is this the kind of thing that you're thinking this framework would help. Like it would build the proof in whatever format that the EE designer requested or developed?"
Matt
Ya so the two ways that I'm trying to build proofs is A, I wanna be able to build proofs in more of a static way, like we're trying to test something. Say I have some account mapping address to account the objects and I want to be able to change the balance of something, I want to be able to do that with high level language constructs and just say like account.setbalance of this address to this value. Then that statefully updates their balance and then I can generate a proof from that. So that's one way, and then the other way that I'm going to write proofs is that you have a constant already existing state and then you run a transaction to see what keys it touches, and the way that I think that this could be generic is that I'm imagining most proofs to be stable, in that the key will always map to the same values regardless of the structure. So with a sparse Merkle tree this is obviously true because you just follow the general index as the path, but for non sparse trees, like the Merkle Patricia tree, you still have a stable key value mapping. So if we lift that key value mapping and abstract that from the actual proof backing and we pass around those key values, then when it comes to actually serializing that data we can do that in a specific way to the proofs. So we can still talk about the proofs in a lot of the different areas of this tooling, just using the key values. But when it comes to actually executing it, we can give it to a specific back-end library to convert it to whatever the serial format is.
Cayman "OK, the keys here would be generalized indices and the values would be the data?"
Matt "Ya."
Will "Maybe explain that last part where you said then you let whatever back-end library... what did you say?
Matt "I mean you can think of different ways of serializing a proof. I mean we've got a Merkle Patricia tree and an SSZ sparse merkle tree, and both of them have some concept of ‘if I have a key there's a value related to that key,' and there's different ways of parsing that data out of your serialized proof. So obviously for a sparse Merkle tree, just follow the branch all the way down to the bottom to get the full path. With the Merkle Patricia tree its obviously different because some of the branches are compacted, because it doesn't have all the siblings in a sparse manner but the key value is still a stable mapping. It's just that the actual proof backing is different."
Will "I see, so what you're trying to say is just generalize a set of operations or functions and it's agnostic to what the actual proof backing is, so you could just switch different proof engines behind the scenes, but they all follow the same interface?"
Matt "That's kind of what I'm thinking, but I'm curious because I've just been spending a lot of time with sparse trees and I'm curious if people think this is applicable to non-sparse trees as well."
Proto
"One thing with generalized indices, when you start to talk about trees of trees, these SSZ objects, keys start to get more complicated since you need to navigate all the way down, so these generalized indices you're trying to solve conceptually. So for off-chain things that would work just fine. On-chain, these generalized indices may become too long. You have this big massive structure that goes very deep and you don't want to deal with this huge bit field that describes how to navigate the tree. So instead, one of the ideas here is: since everything is so stable and predictable, like an object always matches to the same shape of a tree, you can optimize during compile time a lot of this navigation away, so the interface of your backing should support, in compile time, compress this navigation step that will always be the same. You don't want to do this during runtime or you get overhead in your on-chain environments."
Matt "That makes sense."
Proto "Do note, there's a set of current consensus objects, they're not using sparse trees and they're all relatively organized. All the generalized indices, all the paths in the beacon state, the main consensus object, they don't go deeper than 64 bits so you can just use an integer and navigate the structure really easily, so passing around that sort of index is not so much of a problem."
Cayman "That's interesting, you mention on-chain vs off-chain. I think for light clients, at least consumer type light clients, we have a little more flexibility in the kind of implementations that we have. We're not needing to compile to Web Assembly and be optimizing every single step. We could be doing some of this at runtime potentially, or handle different backings."
Proto "Right, there's more space to try different things for light clients off-chain. You can load a binary tree is memory no problem, for example."
Cayman "So what is the overlap here between these proof generating utilities for light clients vs these proof generating utilities for EEs? It seems like it's roughly the same thing, is there any way that we can share implementations or is there any way that we can share the work that we're doing here where we're not having to do double work?"
Matt "I mean ya. So I guess there's two things that light clients care about. They care about proving proofs against consensus state objects, and then once you've proved those proofs against the state object, the root on an EE which is the largest unit of state an EE has, then you can use the light client to prove a proof that's directly related to the EE. So in that case, we can exactly share the same tooling for EEs with light clients. But I haven't really thought much about proving the consensus state objects or anything. I know that it's something Proto thinks a lot about.
Proto "Right, so what I tried to explain in my post that I added to the agenda is that typing is merely typing. So currently all these clients implemented SSZ, this serialization and merkleization framework as something where you have this object in memory. You can navigate the object and hash the whole thing over and over, and you have some caching where it tries to optimize the hashing. Instead, what you should be doing if you were serious about consuming proofs, serving proofs, or running on partial data, is that you should back your data structure with a tree itself and then interpret it with the typing as if it were an actual regular object. And in a language like Python or TypeScript, Rust also, it won't be too difficult to abstract this away. I tried it in Go to but it is very verbose."
Cayman "Right so you could potentially with something like that, if the EE was using SSZ, you could potentially just merge the Merkle tree from consensus all the way into an EE state."
Proto "Correct, and then if you're really serious about on-chain, you can keep your tree in the serialized format as a backing, and then in compile time you map these typings, this access to the tree, to instructions to how to navigate the backing itself. So you take away the components where you have this abstract binary tree in memory. It should be doable for all the stable proofs. So thats also something really really important that every tree shape is as stable and consistent as possible, so that as much as possible can be precompiled."
Cayman "You're talking about specific types of proofs that..."
Proto "Right so say you program your smart contract and you use all the types you like, so you just act like ‘this is a vector, this is a container or whatever,' you navigate it and then when you compile this, you wouldn't initialize the object or load this whole thing. Instead, you just map two instructions of like how to read the input backing, and without the whole vector or the fill data being there."
Cayman "Gotcha, the offsets would be calculated at compile time."
Proto "As much as possible, and there will be parts that are dynamic of course. Like if you look up a certain index in a tree or something like that. But again, most of the tree should be stable anyway."
Cayman "OK, this bring me to another question. So there's this consensus world and there's the EE world, I've heard some people that have wanted to minimize... I feel like it may be a good idea even to try and minimize the differences between how we're requesting proofs from the consensus world vs the EE world, where we're not having to create these two separate frameworks entirely, and I'm wondering: in what kind of cases does that make sense? EEs are more flexible, they're not known ahead of time necessarily, except that they kind of are, because we're going to be building an Eth2 EE that's more or less enshrined as more of a first class citizen. That's kind of my interpretation. So in what cases does that make sense where we could be blurring the lines, or does it not makes sense? Is it fatally going to be different or is there good overlap there?"
Matt "Are you talking just about the actual proofs, or are you also talking about where you retrieve the proofs?"
Cayman "I mean where you retrieve the proofs seems like it will be different no matter what. The consensus actors have the consensus data and state providers have the EE state, and so I don't know how you could reconcile that, I mean other than having a meganode or something, some software that bundles together a bunch of stuff and serves proofs across the board. Has anyone done any thinking about that, does it make sense to do that? That's a good point, the data is even split between two different sets of people or sets of actors."
Matt "Ya, I just can't really imagine a world where it's all coming from one place because to get the security guarantees that you want to have, most likely as a light client you want to interact directly with a consensus nodes on consensus state objects to prove that. Those consensus nodes are just not going to have the states of the EEs so you're going to have to go somewhere else. So ya, I do think that those will be in different places. As for the actual proofs and keeping those similar, maybe if an Eth2 EE is the consensus proof format, that's really a matter of how efficient we can make the consensus proof format. It would be great if it did, there's no reason to not do it that way, the only thing that matters is how efficient it is in Web Assembly. I mean it sounds like what Proto's describing where you're able to just keep it in memory, that's like exactly what we're looking for. So I just need to build these things and then see if there are other potential solutions."
Proto "Quick question: has anyone started on the server side? Generating a proof off-chain and getting it to someone else essentially."
Cayman "I have a branch of our SSZ where we... this is not even related to a server but its kind of a prototype of something you'd need before then, just to create proofs on consensus SSZ objects. It doesn't do the Merkle tree representation behind the scenes, it just kind of dumbly creates a proof from... it dumbly creates the tree whenever you want to create a proof. But I guess that would be one step, can you create a proof from a consensus object? Can you create an arbitrary multiproof from a consensus object? What do we think would be some steps that we can take to get towards that? So the server: we're talking about a piece of software that can make requests for proofs and generate proofs and then respond with those proofs, is that what you're thinking here?"
Proto "Well hopefully something generic like that will eventually be a thing. Also as a basis for these other execution environments that will have to run similar things. Currently the phase 1 spec, the light client thing in there is very very minimal, so that would be fine to hardcode and create a specialized thing. Although I think things may be changing a little bit, or we may be expounding on this as phase 1 gets more and more stable and standardized."
Will "Proto do you mean for the server just setting up a standard RPC spec..."
Proto "Ya, pretty much. There's only so much consensus information and so many different repos. The current spec is already a little bit outdated and things might change again. The current one, there's one object that tells the light client all the information it needs and just runs some checks that are sufficient to stay in sync. That's not so exciting."
Cayman "Right there's just the sync protocol in the specs repo, no proofs beyond just staying up to the head of the chain."
Proto "Right. So eventually you would want to be able to do state queries or look up certain data, or maybe even run functions. And then if execution environment things get even more interesting once there's an environment that is spec'd by some sort of SSZ structure or Merkle structure and you have a way to run view functions, like in eth1 terminology, if there was a way to run view functions then be able to prove them as a light client and the server serves the answer, that is really cool."
Cayman "Right, so getting to that end result of being able to run transactions as a light client. There's a lot of proofs that you need, a lot of data you need to get to that point. Does that make sense, to start playing around with a basic RPC structure? I mean it sounds like kind of more on the specialized side of requests, like ‘give me these paths into the consensus data' rather than ‘run this function,' more specific."
Proto "Ya, I think since there are not that many people doing this we should just be focusing on a very minimal light client spec first, getting in sync with a beacon node, then implement these kind of state queries. In the meantime, in parallel we can talk about the interface and then there's also a lot of optimizations where you want to be able to serve some kind of proof and don't want to replicate or copy all of your state and therefore you need to process a new block. These kind of changes, this tree structure, does also mean a lot for beacon nodes, regardless of the immediate light client work."
Cayman "Right. I know that I really want to implement this Merkle tree backed beacon state because I think it would help out our Lodestar implementation a lot. One of our bottlenecks is hash tree rooting state which takes a second or two right now, which is kinda crazy, and if we could spread that work out over time, not have to rehash a lot of things because of how things are naturally cached in the tree, that would bring us a lot closer to being production ready."
Cayman "So it sounds like we're maybe a little early, on the light client side, to be requesting data from EEs because a lot of the machinery and thinking hasn't been done to even get to the point where we can get to an EE state root. That sounds like what I'm hearing, is that in line with what you guys are thinking? Is there some way that we can push ahead for light client requests at the EE level now?"
Will "Cayman can we step back just a tiny bit as well? Can you define all the actors in the system in regards to, not thinking about EEs, in regards to beacon state, that would be running a light client and requesting proofs?"
Cayman "Right, so are you talking about the proof requesters?"
Will "Ya."
Cayman "I'm just imagining this end user light client. I'm imagining a Metamask type plugin where you're wanting to stay up to the head of the Eth2 chain and also see your balance updates happen within the Eth2 EE but within the Eth2 world. There may be other light clients out there, that's a good thought, but I've really just been thinking about consumer light clients who would be wanting to request proofs from consensus Eth2 actors, so validators or beacon node operators. So we have the beacon node operators who have consensus data, certain validators have data about certain shards so I guess that's a separate actor there."
Will "Ya, and even validator clients don't necessarily need to be at the full consensus state."
Cayman "So you're talking about validators being light clients on other shards or on certain shards that they're not actually full on?"
Will "Sure, or just even running a validator node that connects to a beacon node on light hardware."
Proto
This is something interesting I have tried. ZRNT, the Go spec, a little state transition for the beacon chain. I tried implementing these Merkle tree backed types, and then when the state transition gets these types, instead of the regular Go types. I was implementing this. The problem was Go is very very verbose here, so you have lots of error handling and no metaprogramming so you have to be really explicit about things. It's error prone as well. I got relatively far, not completed yet. And running the beacon state transition on top of a Merkle tree, that can be incomplete. So I definitely think it's kind of viable, though you would probably need to make some modifications to be able to run the state transition on top of a proof. And then there is the problem of this high frequency of proof data, as well, so you'll need to be connected to some full node that does have the full state. I think it's more interesting for low resource nodes to just implement a cache really really well to minimize the amount of hashing and to minimize the amount of duplicate storage by representing things as a tree. It should be interesting to teams like Lodestar, and Trinity also has kind of something similar. They already did the hashing side, so they have pyrsistent, it's a python library that implements data sharing and they use it to implement this layering to implement SSZ caching. Not 1 to 1 on a real binary tree but close enough. And so that helps them run beacon state hash much faster than they would otherwise be able to.
Cayman "Will, can we step back....?"
Will "Ya so I can name some in the EE world, and I don't know, just as a comparison. So I think the list you gave was good. So in the EE world, one, if you're an end user using something like Web3.js or Ethers trying to test-run some functions. Your wallet providers are gonna need to do that. Block producers, in general, will likely need to do this as well, since the block producers won't be stateful from an EE perspective. There may be cases where they are missing some piece of data or need to do some update on some piece of data so I think that's less spec'd out and that's interesting. Ya I think that covers the cases there so I was just curious to see where the overlap there was, so it seems like wallet providers and client tools, things like Web3-JS, things like that is gonna be the major overlap."
Cayman "Ya, sounds right. I guess this kind of echoes what Matt was saying earlier about once you get into the EE world you can use that implementation directly because we're dealing with the exact same data. So we'll be requesting consensus data up to that point and then being able to use this EE specific tooling to work on the proof at that EE level."
Cayman "Proto, you mentioned that the spec is kind of underspecified on light clients. Is there anything specifically that you're talking about with the new phase 1 PR, or any work that we can be doing to harden that up?"
Proto "So the current light client spec describes something about persistent committees and still pretty much assumes the old sharding approach, and I think we should reiterate that and see how it can be improved with a new proposal. It will be part of like phase 1 work and I guess starting from Thursday I'll be taking a look again with Danny and others and see if we can continue this work to reach phase 1 for like version 0.10 of the Eth2 spec."
Cayman "OK ya, cause I saw that there's something called a light client committee and that kind of replaced the persistent committees and are specifically for helping light clients or being like the subset of validators."
Proto "One of these interesting things, or a normal light client sort of thing is that if you were to maintain a beacon state based on some more minimal data you would want to use this light client information that's specifically compiled to not require as much hashing as the regular state and be more condensed so you don't have these proofs of data all over the place in the state, you could focus on one set so that you compress the data better. So the data of a committee would be packed together for example so you can better prove this data."
Cayman "Gotcha, most of it is shared in the tree."
Proto "Right, so every leaf shares with the other leaves so a multiproof is much more efficient."
Cayman "Alright, is there anything else anyone wants to discuss, specifically regarding proof formats?"
Proto "Last thing, there's writeups where I tried Go and Python, I do kind of want to try out Rust, you might be working on this Matt, and then Will maybe and the others. I don't want to make duplicate work, I do want to make more progress here and further these abstractions and also hopefully create something that's not as much like Go where you have to have this ugly error handling all over the place and a really inconvenient way of running all these proofs. I think it's more desirable to do in Rust and TypeScript and make some of this tooling generally available for people to look into the beacon state, prove data, be secure without running a full node. Also just being in sync with the state and providing block explorer functionality or some insight for validators trustlessly."
Matt "I just sent the repos of where we've worked on multiproofs in Rust. There's not a ton of work so if you don't want to duplicate work just look at that and see where it is and I don't think you'll duplicate too much."
Proto "So when you look at the little write up I did, do you think there's lots of overlap or do you think there's something missing, or maybe others would like to pitch in here?"
Matt "The type writeup or the other writeup on different forms of proofs?"
Proto "The latest one, the type thing."
Matt "No I think it's in the write direction and it's not really something that I've done because imp is really just based on general indices, they're both super raw. So this other one I made, called oof, is just really simple and really inefficient because I was just trying to make something that's easier to test and do diagnostics while building this EE and so we're still in need of something a little bit better engineered and easier to use so these type formats will be pretty useful."
Proto "Ya I think also for Lodestar, making tooling trustless by being able to prove the data you received from some server. And without even running a full light client at first, its already a huge step."
Cayman "Ya I agree. I think as the holiday period comes that's something I want to work on. Starting to prototype out the things from your write up and seeing how we can hook that up into Lodestar. But just more generally you're saying make some tooling for it. So just put it into a library where we have something that can just suck up a proof, turn it into this datastructure that we can interact with. Ya that would be very valuable."
Matt "If there's anyone on this call or anyone watching this recording who wants to get more involved with building multiproofs, especially if it has to do with EEs, definitely reach out to me on Twitter or contact me on Telegram or something. We really need to build more formats and find more efficient ways of doing these things. I don't think there are enough people working on this yet."
Cayman "Agreed, the space is very small, a lot of room for collaboration here. I know I'm also open arms for anyone wanting to collaborate or find any places where we can work together on this stuff."
Cayman's concluding remarks
"It sounds like the discussion has kind of wrapped up. I think we had some good discussion, came to a few points about priority and what makes sense for us to all be working on that might benefit each other and push this space forward. We don't have a time set at this point for the next Light Client Task Force call. I think if we do it in a month from now it's a little too close to the newyear and people might still be out, so I'm thinking we'll bump it back to a month and a week out or a month and two weeks out similar to the phase 2 call. No specific topic yet, if you guys have any topics that you think would be great I would love to hear it but we can take that offline."
Attendance
Cayman: Lodestar tech lead. Prominent researcher in Eth2 light client space.
Colin: Lodestar project manager.
Alex: focused on Eth implementation and Trinity Python client.
Matt: working on an execution environment that executes user smart contracts. Part of that is trying to figure out how to build tools that generate proof.
Proto: recently working on how to abstract the regular typing structure.
Sascha: interested in light clients and how they affect layer 2 solutions.
Will: interested in the intersection of light clients in a stateless world.
Zaki: interested to see how Eth2 light clients compare to Cosmos light clients.
Ansgar: With Quilt team. Interested in intersection with stateless side and light clients.
Greg: CTO at ChainSafe.