Rollup Day 2022 - Multi-Provers for Rollup Security w/ Vitalik Buterin

27 minutes 24 seconds

Speaker 1

00:00:00 - 00:00:35

Great, okay. So, thank you so much. Today I will be talking about multiproofs. So, let's start off by introducing the problem, right? So, today, almost all rollups are still on what I call training wheels, right?

Speaker 1

00:00:35 - 00:01:25

There's still some kind of mechanism that can basically override the proof and cause whatever outcome it wants inside of the rollup if it decides that the code has a bug in it. There is some kind of multisig, override, governance-y thing, whatever. And there are very few exceptions among the rollups that exist today: Fuel v1 is one of them, and I think some of StarkWare's products might be another example. But with those few exceptions, everything is on some kind of training wheels where, even though there is some kind of proof system that is theoretically there, the proof system isn't really in charge. So there's this page on l2beat.com where, if you go to the risk analysis tab, it shows you the status of some of these, right?

Speaker 1

00:01:25 - 00:02:02

And for all of these, the proof techniques are either in development or they're overridable, right? So, you know, if something is upgradable without a delay, that actually means that it's overridable. So basically every rollup that we have today is not really actually controlled by code. It's still ultimately controlled by some kind of group of humans, where you have N of them and M of them can ultimately push through whatever they want. This is the status quo.

Speaker 1

00:02:03 - 00:02:39

The question that I want to ask is: how can we actually move beyond the status quo? What would it really take to move us to a world where rollups are actually trustless or trust-minimized, and where the fraud proofs and the ZK-SNARKs that extremely smart people have been spending thousands of hours working on actually mean something? So, why is almost every rollup using training wheels today? The answer is basically code risk, right? So this is just one random sample from the GitHub repo of the Privacy and Scaling Explorations team's zkEVM.

Speaker 1

00:02:40 - 00:03:20

And if you just git clone it, and then you go to the circuits repo, and then you do `find | xargs wc -l` and fancy Linux stuff to figure out the total number of lines of code in the whole thing, you get this number: 34,469. So 34,469 lines of circuit code that is in the circuits of the zkEVM. And this doesn't even get into the complexity of the circuit compiler itself. So basically there's a lot of black box demons hiding inside this entire zk-circuit-to-polynomial-verification blah blah blah pipeline.
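As a rough Python equivalent of that count (the repo path and the source-file glob here are assumptions for illustration, not something specified in the talk):

```python
from pathlib import Path

# Rough equivalent of `find . -name '*.rs' | xargs wc -l`: count lines
# across every circuit source file under a locally cloned repo.
def count_lines(root: str, pattern: str = "*.rs") -> int:
    total = 0
    for path in Path(root).rglob(pattern):
        with open(path, encoding="utf-8", errors="ignore") as f:
            total += sum(1 for _ in f)
    return total

print(count_lines("zkevm-circuits"))  # the talk's figure: 34,469 lines
```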

Speaker 1

00:03:23 - 00:03:51

For simple programs it might be possible to make some kind of proof that is bug-free. If all you want to do is just prove that something is a polynomial, then, well, okay, fine, you can prove it. You can make KZGs, and they're simple enough, and it's probably fine, right? So EIP-4844, we can do. Maybe go a step further and start thinking about the shuffle proof in a single secret leader election.

Speaker 1

00:03:51 - 00:04:29

Okay, that looks like the circuit's getting a bit more complex, and it's like 10 equations instead of 1, and it's maybe on the edge. Get to proving an entire EVM and it's just crazy. So I think 34,469 lines of code are just not going to be bug-free for quite a long time. And so the question is, well, if we can't make 34,469 lines of code bug-free, then what are we going to do about it, right? Is there a practical near-term and medium-term alternative that actually can still get us some degree of trustlessness or trust minimization?

Speaker 1

00:04:30 - 00:04:55

So one simple option, and I think this is an option that a lot of people are gravitating to already, is this idea of a high-threshold governance override, right? So basically, you know, you have some number of guardians and you have some high threshold on it. So here you have a 6-of-8. You can make it be a 12-of-15. You can make it be 42-of-43.

Speaker 1

00:04:56 - 00:05:40

You can make it be 70% of token holders, whatever, right? You have some kind of high-threshold override where, if it's very clear that some kind of bug has happened, then the guardians can override it, and they can say: okay, fine, this thing that got accepted by the proving system is actually an invalid state root, and the governance is going to replace it with some different state root that it decides is valid. But because the threshold is very high, it's very unlikely that it would be able to actually push through something incorrect. Because if you want to push through something incorrect, you would have to actually corrupt 75% of this group of people. So this is one approach.

Speaker 1

00:05:40 - 00:06:44

You basically combine a proving system with a high-threshold override, and then you get some level of trust minimization. So in order for a state root that is correct according to the code to pass through, you only need 3 of the 8 guardians to be honest, but in order for a state root that's incorrect according to the fraud proof to pass through, you would need 6 of the 8 guardians to be dishonest. So you have some degree of trust minimization, and you get to the point where the code doesn't have absolute power, but at least the code means something, right? And so the question is, well, how long is it going to take until the rollups that are in this room, the rollups that are in the CGLA system, get to the point where they're comfortable at least doing this? At least getting to the point where they're not purely run by training wheels, but where they're actually run by some linear combination of training wheels and actual code that's an attempt to prove the EVM.
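As a sketch of the acceptance rule being described here, using the 6-of-8 example from the talk; the function and names are invented for illustration:

```python
THRESHOLD = 6   # the 6-of-8 override from the example

def root_is_finalized(prover_accepts: bool, override_votes: int) -> bool:
    """A state root stands or falls with the prover's verdict, unless
    THRESHOLD-or-more of the 8 guardians vote to override it."""
    if override_votes >= THRESHOLD:
        return not prover_accepts  # the override flips the prover's verdict
    return prover_accepts

# A correct root: the prover accepts it, and 3 honest guardians suffice,
# since the other 5 fall short of the 6-vote override.
assert root_is_finalized(True, override_votes=5)
# An incorrect root: the prover rejects it, so pushing it through takes
# 6 of the 8 guardians acting dishonestly.
assert not root_is_finalized(False, override_votes=5)
```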

Speaker 1

00:06:44 - 00:07:19

So this is option 1. But option 1, I think, has a couple of weaknesses, right? So one of those weaknesses is that it still does have some vulnerability to governance. The vulnerability to governance is not that high, but you can totally imagine scenarios where you actually do mess up, and enough of the governors actually do get corrupted at the same time, and the system does end up actually freezing or doing something really bad.

Speaker 1

00:07:19 - 00:08:11

So that's one issue. Another issue is that I think in general, with these kinds of community governance actors, there's a lot of complexity involved in choosing them. There's the social question of which ones different groups of people will trust, the legal question of who is going to be willing to be a governor and what their risks and responsibilities even are, just a whole bunch of issues that actually come up when you try to create a set of guardians. So option 1, if you have to do it, then I think it's a good idea to do it. And I think it's definitely an improvement over the status quo, which is basically where you have a governance committee that generally can just override the prover and make it lead to whatever result it wants.

Speaker 1

00:08:11 - 00:08:42

But ideally, it would be nice if we could have something other than this, right? So, option 2: multi-prover, right? So, the idea here basically is that instead of having a multisig of people, you have a multisig of different proving systems. The philosophy behind this should be pretty simple. The Ethereum network to some extent does something similar, right, because we have multiple implementations of the Ethereum protocol.

Speaker 1

00:08:43 - 00:09:31

And so, you know, right now I think both Prysm and Lighthouse have somewhere around a third of all the validators. And so if either Prysm or Lighthouse has a bug, but the other clients don't, then the worst case is that the chain stops finalizing for a few hours, and then it comes back to normal. And the average case might even be that the chain just actually keeps going and ignores them. So multiple implementations basically allow for a much more resilient network, because if one implementation has a bug in one place, then chances are another implementation will not have a bug in the exact same place. Especially if that other implementation is created by a different team that has a different philosophy and even just a fundamentally different architecture strategy.

Speaker 1

00:09:32 - 00:10:44

So in this diagram here, the judge's gavel and the scales of justice are meant to represent fraud proofs and arbitration, as some of you maybe might have already guessed. And the spooky-looking chip there that Stable Diffusion generated for me two days ago is supposed to represent a zero-knowledge proving system. So doing a multi-prover between a fraud prover and a ZK rollup is actually a really powerful idea, because fraud-proof-based systems and zkEVMs are just designed so fundamentally differently, they rely on such fundamentally different assumptions, that basically the level of correlation between them is going to be very low, right? Basically the only way in which you might have some kind of correlation is if there's some kind of bug or ambiguity in the Yellow Paper in a particular place, or the same people are involved, or there's some really clever attack against both of them, but the bar to get there is very high. You're not just going to get the exact same kind of bug in a fraud prover and a zkEVM by accident.

Speaker 1

00:10:44 - 00:11:32

And then even within fraud provers, there's a bunch of different approaches, right? So one approach, for example, is that you make a fresh, basically a new implementation of the EVM that's designed around proving one specific computational step. Another approach is you compile the Geth source code into some minimal virtual machine, like MIPS for example, and you then make a fraud prover for MIPS, and you just push the entire Geth code through that and have everything go that way, right? But instead of Geth, maybe you could stick Erigon in there, or maybe you could stick Nethermind, or maybe instead of MIPS, you could use some different machine, right? So there's a lot of different ways to make a fraud prover.
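For intuition, here is a toy sketch of the bisection game that many such fraud provers share, whatever VM they target; this is purely illustrative, not any specific system's protocol:

```python
# Two parties disagree about a long computation. Each commits to a hash
# of the machine state after every step; they binary-search for the first
# step where their traces diverge, and only that single step has to be
# re-executed by the on-chain referee.
def find_divergent_step(trace_a: list, trace_b: list) -> int:
    assert trace_a[0] == trace_b[0], "both sides agree on the start state"
    lo, hi = 0, len(trace_a) - 1   # traces agree at lo, disagree at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if trace_a[mid] == trace_b[mid]:
            lo = mid
        else:
            hi = mid
    return hi   # the one step the referee must re-execute on-chain
```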

Speaker 1

00:11:32 - 00:12:19

There are also different ways to make a zkEVM, right? So the PSE zkEVM is kind of a direct compilation. The Polygon Hermez team, I believe, is doing this clever thing where they first compile the EVM to an intermediate language, and then proving that intermediate language only takes, I think, about 7,000 lines of code instead of 37,000. Or maybe, sorry, it's the representation of the EVM in their assembly that takes 7,000 lines of code, right? So there are different approaches, and then there's the zkSync strategy of just going straight from Solidity. So there are different ways to architect these systems, and if you have three different proving systems that are architected very differently, then you might have a lot of redundancy. So this is another approach.

Speaker 1

00:12:20 - 00:12:53

Option 2B: more complex variants of the multi-prover strategy. So one idea is kind of self-canceling. So if someone submits two conflicting state roots to one particular prover and both state roots pass, then that prover gets turned off, right? So the idea here is basically that if some prover is able to accept multiple state roots, then clearly something is wrong, right? Because it's saying yes to two conflicting outputs.

Speaker 1

00:12:53 - 00:13:27

And so you shut it off, and either you reduce the size of the multisig, or you replace it with governance that has to choose a different prover. So that's one approach. Another approach is: if no successful message gets passed through a particular prover for 7 days, that prover is turned off, right? So if the prover is deadlocked, if the prover is unable to accept even things that are valid, then you can shut it off too. So one of the interesting things about these two ideas is that they're actually kind of inspired by smart contract wallet designs, right?
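A minimal sketch of those two shut-off rules, as an in-memory model with invented names rather than any real contract:

```python
import time

WEEK = 7 * 24 * 3600

class ProverSlot:
    def __init__(self):
        self.enabled = True
        self.accepted = {}              # block number -> accepted state root
        self.last_success = time.time()

    def on_accept(self, block: int, root: bytes):
        if not self.enabled:
            return
        prev = self.accepted.get(block)
        if prev is not None and prev != root:
            self.enabled = False        # rule 1: said yes to two conflicting
            return                      # roots, so something is clearly wrong
        self.accepted[block] = root
        self.last_success = time.time()

    def check_liveness(self):
        if time.time() - self.last_success > WEEK:
            self.enabled = False        # rule 2: deadlocked for 7 days
```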

Speaker 1

00:13:27 - 00:14:28

So the concept of self-canceling, that's a very close parallel to this concept of vaults that I think Emin Gün Sirer and a couple of other people were really promoting a few years ago, where basically you have a smart contract wallet where you can initiate a withdrawal, but then that withdrawal takes 24 hours to finish, and before those 24 hours are up, that exact same key can cancel the withdrawal, right? So the idea basically is that if you get hacked but you still have your key, then you can constantly prevent the hacker from actually taking the money. And then there would be some third override key, where if one key clearly keeps canceling itself, then that third key, with maybe a one-week delay, can actually take funds out. So basically this approach is that exact same idea, except instead of being applied to a personal wallet using private keys, it's applied to a rollup using multiple provers.
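Purely as illustration, a compact sketch of that vault pattern; the class, keys, and exact rules here are hypothetical, with the delays taken from the talk's example:

```python
import time

WITHDRAW_DELAY = 24 * 3600        # normal withdrawal delay
RECOVERY_DELAY = 7 * 24 * 3600    # override key's one-week delay

class Vault:
    def __init__(self, owner_key, recovery_key):
        self.owner_key, self.recovery_key = owner_key, recovery_key
        self.pending = None           # (key, amount, ready_at)

    def initiate(self, key, amount):
        assert key in (self.owner_key, self.recovery_key)
        delay = WITHDRAW_DELAY if key == self.owner_key else RECOVERY_DELAY
        self.pending = (key, amount, time.time() + delay)

    def cancel(self, key):
        # The owner key can cancel its own pending withdrawal, so a thief
        # who stole the key can be stalled as long as you still hold it too;
        # the recovery key's slower withdrawal is not cancelable this way.
        assert self.pending and key == self.owner_key
        assert self.pending[0] == self.owner_key
        self.pending = None

    def withdraw(self, key):
        assert self.pending is not None
        who, amount, ready_at = self.pending
        assert key == who and time.time() >= ready_at
        self.pending = None
        return amount
```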

Speaker 1

00:14:28 - 00:14:55

And then the second idea is basically social recovery for provers. It's like: if a prover is clearly not able to do something, then some other mechanism can switch it. You can actually do some surprisingly interesting and clever stuff here. This gets us to a third technique, right? So this is a two-prover plus governance tie-break.

Speaker 1

00:14:55 - 00:15:16

So we're gonna make two provers, and we're gonna make them very different. So one of them is gonna be ZK and one of them is going to be optimistic. And we have a 2-of-3 mechanism. Basically, we have a 2-of-3 between the ZK, the optimistic, and governance. So there's actually a bunch of different ways to architect this.

Speaker 1

00:15:16 - 00:16:07

So one of them is the thing that I mentioned on Twitter about a month or two ago. Basically the idea being that when you submit a block, there is a 24-hour time window, and after that 24-hour time window, the block gets accepted if there is a SNARK. And within that 24-hour time window, if someone opens up a fraud proof, then the fraud-proofing game and the SNARKing game both run. And then if they agree, then whatever they say is accepted as the result. And only if they disagree does the governance have to come in and provide the correct answer. So if you want to take an approach that minimizes the governance's role and makes the governance a more emergency thing, then you'd probably want to do that.

Speaker 1

00:16:07 - 00:16:26

You'd want to create a system where, by default, you have a kind of 2-of-2 between the ZK prover and the fraud prover. And the way you do this is by having a time window. And because you have a 2-of-3, you can be more aggressive. Instead of 7 days, you can make it 24 hours.

Speaker 1

00:16:26 - 00:17:15

And you can say, for a block to be accepted, it has to both have a SNARK and have a 24-hour window pass, to make sure fraud proofs didn't come in. And then if they disagree, then you do some governance thing. But if you are okay with governance being more regularly active, then there is another approach, which is: for every state root that gets submitted, you just let all three of these games run, and then as soon as two of these games accept the block, that block gets accepted, right? And so in the happy case, blocks would actually succeed basically immediately, right? Because when you have a block, then a ZK proof passes for that block, and the governance accepts the block, and you have your proof, and then the block is finalized within less than an hour.
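A minimal sketch of that 2-of-3 acceptance rule with invented names; a real contract would also have to implement the time windows and dispute games described above:

```python
from enum import Enum

class Verdict(Enum):
    PENDING = 0
    ACCEPT = 1
    REJECT = 2

def try_finalize(verdicts: dict) -> Verdict:
    """verdicts maps 'zk', 'fraud', 'governance' to each game's verdict;
    the state root finalizes as soon as any two of the three agree."""
    votes = list(verdicts.values())
    if votes.count(Verdict.ACCEPT) >= 2:
        return Verdict.ACCEPT
    if votes.count(Verdict.REJECT) >= 2:
        return Verdict.REJECT
    return Verdict.PENDING          # keep waiting for the games to resolve

# Happy case from the talk: the ZK proof passes and governance accepts,
# so the block finalizes without waiting out the fraud-proof window.
assert try_finalize({"zk": Verdict.ACCEPT,
                     "fraud": Verdict.PENDING,
                     "governance": Verdict.ACCEPT}) == Verdict.ACCEPT
```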

Speaker 1

00:17:16 - 00:17:58

Basically your only limit is going to be how quickly you can make ZK-SNARKs, right? Right now it looks like ZK-proving EVM blocks takes somewhere in the hours, but in the future, I have a lot of faith in you guys, the technology will improve, and we're gonna get ZK-SNARKs in 12 seconds, right? Right, okay. So, there are different options, right, is basically what I'm saying. There's this large design space of different options that you can choose, different trade-offs that you can take, depending on how much you value speed, versus how much you value minimizing the role of the governance, versus a bunch of other considerations.

Speaker 1

00:17:59 - 00:18:43

There are a lot of choices that you can make, and it's probably worth thinking really deeply through these different options and figuring out which one of them actually makes sense. The advantage of this approach is that it actually combines all of the advantages together. So with this approach, you don't have to trust the governance, because even if the governance is completely evil, even if 7 of 7 get corrupted, it can't contradict the provers if the provers agree. And you're protected from a bug in either one of the two provers. And ideally, if the provers have a very different construction, then the chance of the two having simultaneous bugs is going to be very tiny.

Speaker 1

00:18:44 - 00:19:26

So that's the advantage of this kind of design. One other interesting thing that's worth talking about here is: what does the code look like for the multi-aggregator? Because the goal of this is to try to minimize the number of lines of code that you have to certify, yes, this is definitely bug-free, and if it's not bug-free, people are gonna lose eleven and a half billion dollars, right? So you don't want that to be 34,469 lines of code. Maybe it has to be 100 lines of code, maybe it has to be 200. You want to minimize that as much as possible, formally prove that as much as possible, coordinate on using the same code as much as possible.

Speaker 1

00:19:26 - 00:19:48

So the other question is: how do we actually minimize the multi-aggregators themselves? There are a lot of different possibilities. One interesting one is that you just literally use a Gnosis Safe wallet, right? You just literally throw coins into a Gnosis Safe wallet where you have three different keys that are owners. It's just a plain old 2-of-3 Gnosis Safe.

Speaker 1

00:19:48 - 00:20:24

And the three owners just are: one of them is itself a Gnosis Safe of 4-of-7 guardians. Another is an account that pushes through a message if a SNARK tells it to push through a message. And a third one pushes through a message if a fraud prover tells it to push through a message. So if you do that, then you can even reuse existing code for the thing that actually does the aggregating, which I think is really cool, and I think really does reduce the surface area of code that you have to trust unconditionally by quite a bit. But these are also things that are worth thinking about.
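A sketch of that aggregation shape in Python; this is not Gnosis Safe's actual API, just the 2-of-3 logic with placeholder owner checks:

```python
def guardian_owner(approvals: int, threshold: int = 4) -> bool:
    return approvals >= threshold         # owner 1: a 4-of-7 guardian safe

def snark_owner(proof, verify_snark) -> bool:
    return proof is not None and verify_snark(proof)  # owner 2: SNARK verified

def fraud_proof_owner(game_won: bool) -> bool:
    return game_won                       # owner 3: fraud-proof game says yes

def message_executes(owner_votes: list, threshold: int = 2) -> bool:
    """The outer wallet is a plain 2-of-3: the message goes through
    as soon as any two of the three owners sign off."""
    return sum(owner_votes) >= threshold

# e.g. the SNARK verifies and the fraud-proof game resolves in favor,
# so the message executes with the guardians never waking up:
assert message_executes([guardian_owner(0),
                         snark_owner(object(), lambda p: True),
                         fraud_proof_owner(True)])
```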

Speaker 1

00:20:25 - 00:21:00

So, conclusions here. I think the big one is that zkEVMs are not going to be bug-free for a long time. And this is something that's probably worth internalizing and accepting. Basically, what's amazing about the ZK space is that I think it's the one part of the crypto space that actually has exceeded expectations in how fast things have come. You know, you got the merge, and it's like, oh, it's going to come in 2015, oh, in 2017, and then, oops, it came in 2022.

Speaker 1

00:21:00 - 00:21:52

But then we got zkEVMs, and it's like, oh, they're going to come in 2030, maybe 2027, and then, oops, we have prototypes in 2022, right? So that's the good news about zkEVMs. But the bad news about zkEVMs is that I think we're going to have this long period of time during which they exist, but they're fairly untested. We don't know if there are bugs in them: there might be bugs in some scary proof systems, there might be bugs in circuit compilers, there might be bugs in the zkEVM code itself. And so for some period of time, which I think will easily last quite a few years, we are going to have these circuits that we trust to a high degree, but don't trust completely. And the fact that we trust them to a high degree means that we should use them.

Speaker 1

00:21:53 - 00:22:32

And it means that we should actually get the benefit from them and not just create systems where we're giving lots of power to a multisig. But the fact that they're not perfect means that we also have to compensate for the possibility that something about them actually breaks. So with multiple implementations and governance backups, we can minimize the chance that bugs are going to actually lead to catastrophic outcomes. There is a tradeoff space between security against bugs and security against bad governance, right? And I think this year, everyone's optimizing for security against bugs, which is probably correct.

Speaker 1

00:22:32 - 00:23:22

Ten years from now, I think everyone should be optimizing for security against bad governance, which is probably going to be correct then. And between now and 10 years from now, we should slowly move that slider from trusting governance more to trusting the code more, as the code becomes more trustworthy. So I think keeping governance involved is a good idea, but it's also a good idea to keep it only involved in emergencies. Now, this is intended to be a talk about layer 2s, but I think one other interesting thing that's worth mentioning here is that there is also a layer 1 angle to this concept of ZK multi-proving. And the issue here is: we want to use zkEVMs on layer 1.

Speaker 1

00:23:23 - 00:23:43

My vision for the future of running an Ethereum node is basically that you should not need to have a piece of fancy hardware. You've got your phone. You should be able to have a full Ethereum node, staking a million dollars of ETH if you want to, running on a phone. And how do you validate an Ethereum block? You get an Ethereum block.

Speaker 1

00:23:43 - 00:23:57

That Ethereum block might contain three and a half megabytes of data. OK, fine. And a SNARK. Download the three and a half megabytes of data, hash it, stick the hash into a public input, verify the SNARK, done. Block is correct.
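A hedged sketch of that verification flow; `verify_snark`, the proof format, and the state-root plumbing are placeholders for whatever a real client would use:

```python
import hashlib

def validate_block(block_data: bytes, proof,
                   pre_state_root: bytes, post_state_root: bytes,
                   verify_snark) -> bool:
    # Step 1: download the ~3.5 MB of block data and hash it.
    data_hash = hashlib.sha256(block_data).digest()
    # Step 2: bind that hash (plus the pre/post state roots) into the
    # SNARK's public inputs and verify. No re-execution, no fancy hardware.
    public_inputs = (data_hash, pre_state_root, post_state_root)
    return verify_snark(proof, public_inputs)
```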

Speaker 1

00:23:58 - 00:24:35

So, you know, it would be lovely if we could get to the point where verifying Ethereum blocks and running an Ethereum full node is as simple and low-resource and decentralization-friendly as that. But in order to get to that future, we need to have a SNARK that can verify everything, right? We need to have a zkEVM and a ZK Ethereum consensus layer and probably recursive ZK-of-ZK and, like, ZK everything. And we're basically gonna have to trust everything to a bunch of polynomial math some weirdos in universities invented. So, OK, not everyone's in university.

Speaker 1

00:24:35 - 00:25:04

Dropouts contributed a lot to the ZK space. Three cheers for dropouts who contributed to polynomials. Yay. But if we want to actually get there on layer 1, then we're also going to go through this period of time where we can't trust one implementation to be infallible. And so then the question is, well, what would a multiple-implementations vision for ZK-SNARKs at layer 1 actually look like?

Speaker 1

00:25:04 - 00:25:32

And I think here there are some interesting answers. So one possibility for this is that, okay, we have different clients. We have Prysm and Geth and Lighthouse. We have 5 execution clients, we have 5 consensus clients, and maybe we're going to also have 5 zkEVM engines. And so instead of there being 25 client combinations, we're going to go up to 125 client combinations, which makes Ethereum 5 times more diverse and secure.

Speaker 1

00:25:33 - 00:26:27

So then the question is, OK, somebody creates a block, and then the peer-to-peer network is just going to generate a proof for that block of each type, right? Maybe you have a node running zkEVM engine A that created a block, but then, because the data's all in the clear, once it gets out there, someone else can come along and make a ZK-SNARK that is compatible with zkEVM engine B, and then people running engine B on their clients are going to be able to verify it. And so you are going to be able to basically get the same benefits of client diversity, but in this zkEVM world. But then the question is, well, how do we actually get there? What is the first part of the Ethereum consensus that we are actually willing to ZK?

Speaker 1

00:26:28 - 00:27:08

Is it actually going to be possible to ZK-prove things quickly enough? There are a lot of these different issues, but I think, realistically, we'll see some kind of multi-proving future like that, and possibly even a hybrid future, right? We might even see a future where, for example, all the institutional stakers still run regular full nodes, but some home stakers run a couple of experimental ZK provers. We might end up going through a couple of those phases, but I do expect that some kind of multi-proving and hybrid proving system is going to be the future on layer 1 as well, and not just layer 2. So, thank you.

Speaker 1

00:27:09 - 00:27:15

I found a headshot. I have a nice headshot. Thank you.

Speaker 1

00:27:15 - 00:27:19

Thank you. Oh, no, did I miss the slide?

Speaker 1

00:27:19 - 00:27:19

No, there's a happy giraffe there. OK. Yeah, there's a happy giraffe.