The near and mid-term future of improving the Ethereum network’s permissionlessness and decentralization
I am sitting here writing this on the last day of the Ethereum developer interop in Kenya, where we have made great progress implementing and ironing out the technical details of upcoming Ethereum improvements, notably PeerDAS, the Verkle tree transition, and decentralized approaches to storing history under EIP-4444. From my own perspective, the pace of Ethereum development, and our ability to ship large and important features that meaningfully improve the experience of node operators and (L1 and L2) users, is increasing.
The Ethereum client teams are working together to complete the Pectra devnet
Given these increased technical capabilities, an important question to ask is: are we developing toward the right goals? A hint at how to think about this question comes from a recent series of unhappy tweets by long-time Geth core developer Peter Szilagyi:
These concerns are valid. They are concerns that many people in the Ethereum community have expressed, and they are concerns that I personally have voiced on many occasions. However, I do not believe the situation is as hopeless as Peter's tweets imply; rather, many of the issues are already being addressed by protocol changes in progress, and many others can be addressed by very realistic tweaks to the current roadmap.
To understand what this means in practice, let's go through the three examples Peter raised, one by one. The goal here is not to single out Peter; these concerns are widely shared across the community, and it is important to address them.
MEV and builder dependence
In the past, Ethereum blocks were created by miners who used relatively simple algorithms to create blocks. Users would send transactions to the public p2p network, typically called the “mempool” (or “txpool”). Miners would listen to the mempool, accept valid transactions, and include them in blocks for a fee. They would include as many transactions as they could, prioritizing those with the highest fees if there wasn’t enough space.
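As a rough illustration (a sketch of the idea, not any real client's code, and the gas limit is just an assumed parameter), the pre-MEV "default" strategy amounts to a greedy fill by fee:

```python
from dataclasses import dataclass

@dataclass
class Tx:
    fee_per_gas: int  # what the miner earns per unit of gas
    gas_used: int

def build_block(mempool: list[Tx], gas_limit: int = 30_000_000) -> list[Tx]:
    """Greedy 'default software' strategy: take the highest-paying
    transactions first until the block's gas limit is filled."""
    block, gas_left = [], gas_limit
    for tx in sorted(mempool, key=lambda t: t.fee_per_gas, reverse=True):
        if tx.gas_used <= gas_left:
            block.append(tx)
            gas_left -= tx.gas_used
    return block
```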
This was a very simple system that was friendly to decentralization: as a miner, you could simply run the default software and earn the same fee income per block as highly specialized mining farms. However, around 2020, people started exploiting what is known as Miner Extractable Value (MEV): income that can only be obtained by executing complex strategies that take advantage of activities happening inside various DeFi protocols.
For example, consider a decentralized exchange like Uniswap. Let’s say at time T, the USD/ETH exchange rate – on centralized exchanges and Uniswap – is $3000. At time T+11, the USD/ETH exchange rate on centralized exchanges rises to $3005. But Ethereum doesn’t have the next block yet. At time T+12, it does. Whoever creates that block can include as their first transaction a sequence of Uniswap purchases that buy all available ETH on Uniswap at prices ranging from $3000 to $3004. This is additional income, called MEV. Applications outside of DEXs also have similar issues. The Flash Boys 2.0 paper from 2019 goes into this in detail.
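To make the arithmetic concrete, here is a minimal sketch of that arbitrage against a Uniswap-v2-style constant-product pool. The reserve numbers are made up, and swap fees and gas costs are ignored:

```python
def arbitrage_profit(usd_reserve: float, eth_reserve: float,
                     external_price: float) -> float:
    """Buy ETH from an x*y=k pool until its marginal price equals the
    external price, then value the purchased ETH at that price."""
    k = usd_reserve * eth_reserve
    # After the trade: usd' / eth' = external_price and usd' * eth' = k,
    # so usd' = sqrt(k * external_price).
    usd_after = (k * external_price) ** 0.5
    eth_after = k / usd_after
    usd_spent = usd_after - usd_reserve
    eth_bought = eth_reserve - eth_after
    return eth_bought * external_price - usd_spent

# A pool priced at $3000 (30M USD / 10,000 ETH); CEX price jumps to $3005.
print(arbitrage_profit(30_000_000, 10_000, 3005))  # ~$21 of MEV on this pool
```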
The problem is that this breaks the story of why mining (or, after 2022, block proposing) can be "fair": now, larger actors that are better at optimizing these kinds of extraction algorithms can earn a higher return per block.
Since then, there has been a schism between two strategies, which I will call MEV minimization and MEV quarantining. MEV minimization comes in two forms: (i) actively developing MEV-free alternatives to Uniswap (e.g., Cowswap), and (ii) building in-protocol techniques, like encrypted mempools, that reduce the information available to block producers and thus the revenue they can capture. In particular, encrypted mempools can prevent sandwich attacks and similar strategies that place trades immediately before and after a user's transaction in order to financially exploit it ("front-running").
MEV quarantining accepts MEV, but tries to limit its impact on centralization by separating the task of choosing block contents from the role of validators. Validators remain responsible for attesting to and proposing blocks, but the task of choosing block contents is outsourced to specialized builders through an auction protocol. Individual stakers no longer need to worry about optimizing DeFi arbitrage themselves; they simply join the auction protocol and accept the highest bid. This is called proposer-builder separation (PBS). This approach has precedent in other industries: restaurants are able to stay so decentralized in large part because they rely on a fairly centralized set of suppliers that handle various high-economies-of-scale tasks for them. So far, PBS has been quite successful at keeping a level playing field between small and large validators, at least with respect to MEV. However, it has created another problem: the task of choosing which transactions get included has become more centralized.
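From the proposer's side, PBS reduces to a one-line decision, which is why no MEV expertise is needed. The sketch below uses hypothetical types; real MEV-Boost relays use a signed-header commit-reveal flow, omitted here:

```python
from dataclasses import dataclass

@dataclass
class BuilderBid:
    builder_id: str
    payment_to_proposer: int  # wei offered to the proposer
    block_commitment: str     # hash committing to the block contents

def choose_bid(bids: list[BuilderBid]) -> BuilderBid:
    # The proposer needs no arbitrage skill: just take the highest bid.
    return max(bids, key=lambda b: b.payment_to_proposer)
```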
My view on this has always been that MEV minimization is good and we should pursue it (I personally use Cowswap regularly!), but encrypted mempools face many challenges, and MEV minimization alone will likely not be enough; MEV will not drop to zero, or even close to zero. Hence, we also need some form of MEV quarantining. This gives us an interesting task: how do we make the "MEV quarantine box" as small as possible? How do we give builders as little power as possible while still allowing them to absorb the role of optimizing arbitrage and other forms of MEV collection?
If builders have the power to exclude transactions from blocks entirely, attacks become easy. Suppose you have a collateralized debt position (CDP) in a DeFi protocol, backed by an asset whose price is rapidly dropping, and you need to either top up your collateral or exit the CDP. A malicious builder could try to collude to exclude your transaction, delaying it until the price drops enough to force-liquidate your CDP. If that happens, you would have to pay a large penalty, and the builder would capture much of it. So how do we prevent builders from excluding transactions and carrying out attacks like this?
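As a toy model of this attack (all parameters invented for illustration), suppose a builder censors a user's collateral top-up while the price falls one step per block:

```python
def is_liquidatable(collateral_eth: float, debt_usd: float,
                    eth_price: float, min_ratio: float = 1.5) -> bool:
    # The position can be liquidated once the collateral's value falls
    # below min_ratio times the debt.
    return collateral_eth * eth_price < debt_usd * min_ratio

prices = [3000, 2950, 2900, 2850]   # assumed price path, one per block
collateral_eth, debt_usd = 10, 19_500

for blocks_censored, price in enumerate(prices):
    if is_liquidatable(collateral_eth, debt_usd, price):
        print(f"liquidated after {blocks_censored} censored blocks")
        break
```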
This is where inclusion lists come in.
Inclusion lists allow block proposers (i.e., stakers) to choose transactions that must be included in the block. Builders can still reorder transactions or insert their own, but they must include the proposer's chosen transactions. At one point, inclusion list designs were modified to target the next block rather than the current block. In either case, they take away the builder's power to push transactions out of the block entirely.
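Schematically, this adds a new validity condition on blocks. The sketch below is a simplification, not the actual EIP specification:

```python
def block_satisfies_inclusion_list(block_txs: set[str],
                                   inclusion_list: list[str]) -> bool:
    """A builder's block is only valid if every transaction hash on the
    proposer's inclusion list appears in it (real designs add caveats,
    e.g. for transactions that no longer fit or have become invalid)."""
    return all(tx in block_txs for tx in inclusion_list)
```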
The above is all background to a deep and complex rabbit hole. MEV is a tricky problem, and even the description above misses many important nuances. As the old saying goes, "you may not be looking for MEV, but MEV is looking for you." Ethereum researchers have already converged on the goal of "minimizing the quarantine box," reducing the harm builders can do (e.g., by excluding or delaying transactions as a way to attack particular applications).
That being said, I believe we can go further. Historically, inclusion lists have usually been treated as an "off-the-happy-path" feature: normally you don't think about them, but just in case builders start behaving maliciously, they give you a "second path." This attitude is reflected in current design decisions: in the current EIP, the gas limit of the inclusion list is around 2.1 million. But we can make a philosophical pivot in how we think about inclusion lists: treat the inclusion list as the block, and treat the builder's role of appending a few transactions to collect MEV as the off-the-happy-path feature. What if it were the builder who got the 2.1 million gas limit?
I think the idea of really pushing the quarantine box to be as small as possible is very interesting, and I am in favor of moving in that direction. It is a shift from the 2021-era philosophy: back then, we were more excited by the idea that, since we now have builders, we could "overload" their role and have them serve users in more complex ways, e.g., by supporting ERC-4337 fee markets. In this new philosophy, the transaction validation part of ERC-4337 would instead have to be enshrined in the protocol. Fortunately, the ERC-4337 team has been increasingly warm to this direction.
Summary: MEV thinking has already been moving back toward empowering block producers, including giving them the power to directly guarantee the inclusion of user transactions. Account abstraction proposals have already been moving back toward removing the dependence on centralized relayers, and even on bundlers. However, there is a good argument that we have not gone far enough, and I think pressure to push development further in this direction is very welcome.
Liquid staking
Today, individual stakers make up a relatively small share of all Ethereum staking, with most staking done by various providers – some of them centralized operators, others DAO protocols like Lido and RocketPool.
I have done my own research on this – various polls [1] [2], surveys, face-to-face conversations, asking the question "why do you – specifically you – not stake individually today?" To me, a strong individual staking ecosystem is by far the preferred outcome for Ethereum staking, and one of the best things about Ethereum is that we actually try to support a strong individual staking ecosystem rather than just surrendering to centralization. However, we are still far from that outcome. Across my polls and surveys, there are a few consistent trends:
The majority of those who do not stake individually cite the 32 ETH minimum as their main reason.
Among those who cite other reasons, the highest is the technical challenges of running and maintaining a validator node.
The loss of immediate availability of ETH, security risks of “hot” private keys, and the loss of ability to simultaneously participate in DeFi protocols are important but smaller issues.
Staking research needs to answer two key questions:
How do we address these concerns?
If, despite effective solutions to these issues, the majority of people still do not want to stake individually, how do we maintain the stability and robustness of the protocol despite this fact?
Many ongoing research and development projects are aimed at addressing these issues:
Verkle trees, combined with EIP-4444, allow staking nodes to run with very low disk requirements. They also let staking nodes sync almost instantly, which greatly simplifies the setup process as well as switching between implementations, and they make Ethereum light clients more viable by reducing the bandwidth needed to prove each state access.
Research such as these proposals changes the way staking works to allow a much larger validator set (and hence a smaller minimum stake) while reducing the overhead on consensus nodes. These ideas could be implemented as part of single-slot finality. Doing so would also make light clients more secure, as they would be able to verify the full set of signatures instead of relying on the sync committee.
Ongoing Ethereum client optimizations continue to steadily reduce the cost and difficulty of running a validator node, even as history keeps growing.
Research on capping penalties could alleviate concerns about hot-key risk and enable stakers to simultaneously put their ETH into DeFi protocols if they wish.
0x01 withdrawal credentials allow stakers to set an ETH address as their withdrawal address. This makes decentralized staking pools more viable and better able to compete with centralized ones.
However, we can do more. In theory, it is possible to let validators withdraw much more quickly: Casper FFG remains safe even if the validator set changes by a few percent each time it finalizes (i.e., once per epoch), so with some effort we could shorten the withdrawal period considerably. If we wanted to significantly reduce the minimum deposit size, we could make a hard tradeoff in another direction: since the chain can only process a roughly fixed number of signatures per unit of time, increasing the finality time by 4x would allow a 4x reduction in the minimum deposit size. Single-slot finality would later clean this up by moving beyond the "every staker participates in every epoch" model entirely.
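A back-of-envelope sketch of that tradeoff, with assumed numbers (the real constraint is the chain's signature-processing budget per unit of time):

```python
SIGNATURES_PER_EPOCH = 1_000_000   # assumed aggregate-signature budget
TOTAL_STAKED_ETH = 32_000_000      # assumed total stake

for epochs_to_finality in (1, 4):
    # Stretching finality 4x lets 4x more validators sign per cycle...
    max_validators = SIGNATURES_PER_EPOCH * epochs_to_finality
    # ...which allows the minimum deposit to shrink 4x.
    print(epochs_to_finality, "epoch(s):",
          TOTAL_STAKED_ETH // max_validators, "ETH minimum")
# 1 epoch(s): 32 ETH minimum
# 4 epoch(s): 8 ETH minimum
```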
Another important part of the whole issue is the economics of staking. A key question: do we want staking to be a relatively niche activity, or do we want everyone, or almost everyone, to stake all of their ETH? If everyone stakes, what responsibilities do we want everyone to take on? If people end up delegating those responsibilities out of laziness, that could end in centralization. These are important and deep philosophical questions. Wrong answers could lead Ethereum down a path of centralization and "recreating the traditional financial system with extra steps"; right answers could create a shining example of a successful ecosystem with a broad and diverse set of individual stakers and highly decentralized staking pools. These questions touch the core economics and values of Ethereum, so we need more diverse participation.
Hardware requirements for nodes
Many of the key decentralization questions in Ethereum ultimately come down to a question that has defined blockchain politics for a decade: how accessible do we want running a node to be, and for whom?
Today, running a node is difficult. On my laptop as I write this article, I have a Reth node that takes up 2.1TB – this is already the result of heroic software engineering and optimization. I would need to buy an additional 4TB hard drive to put in my laptop just to store this node. We all want running a node to be easier. In my ideal world, people can run nodes on their phones.
As I mentioned above, EIP-4444 and Verkle trees are two key technologies that bring us closer to this ideal. If both are implemented, a node's hardware requirements could eventually drop below 100GB, and close to zero if we eliminate the history storage responsibility entirely (perhaps only for non-staking nodes). Type 1 ZK-EVMs can eliminate the need to run EVM computation yourself, since you can simply verify a proof that the execution was done correctly. In my ideal world, we stack all of these technologies together, and even Ethereum browser extension wallets like Metamask and Rabby have a built-in node that verifies these proofs, does data availability sampling, and is thereby satisfied that the chain is correct.
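A sketch of what that built-in verifying node could do per block. The function arguments stand in for a real proof system and p2p sampling layer; none of these names are real client APIs, and the constants are assumptions:

```python
import random

NUM_CHUNKS = 128   # erasure-coded chunks per blob (assumed)
NUM_SAMPLES = 30   # if <50% of chunks exist, all samples succeed w.p. < 2^-30

def verify_block(header, verify_zk_proof, fetch_chunk) -> bool:
    # 1. A succinct proof check replaces re-executing the block
    #    (the type 1 ZK-EVM part of the stack).
    if not verify_zk_proof(header):
        return False
    # 2. Data availability sampling: fetch a few random chunks; thanks to
    #    erasure coding, availability of >=50% implies full recoverability.
    indices = random.sample(range(NUM_CHUNKS), NUM_SAMPLES)
    return all(fetch_chunk(header, i) is not None for i in indices)
```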
The vision described above is often referred to as “The Verge.”
This vision appeals even to people who worry about Ethereum node sizes. But it raises an important question: if we offload the responsibility of maintaining state and providing proofs to a few actors, isn't that a centralization vector? Even if those actors cannot cheat by providing invalid data, isn't it against Ethereum's principles to become too dependent on them?
A very near-term version of this concern is the discomfort many have with EIP-4444: if regular Ethereum nodes no longer need to store old history, then who does? A common answer is that there are certainly enough large actors (e.g., block explorers, exchanges, layer 2s) with an incentive to keep this data, and that, compared to the 100PB stored by the Wayback Machine, the Ethereum chain is tiny, so it is absurd to imagine any history actually being lost.
However, this argument still amounts to relying on a small number of large actors. In my trust model taxonomy, it is a 1-of-N assumption, but with a fairly small N, and that carries tail risks. One thing we can do instead is store old history in a peer-to-peer network in which each node stores only a small fraction of the data. Such a network would still replicate enough to ensure robustness: every piece of data would have thousands of copies, and in the future we could use erasure coding (in fact, EIP-4844-style blobs already have it built in) to increase robustness further.
Blobs have erasure coding within blobs and between blobs. The easiest way to make ultra-robust storage for all of Ethereum’s history may well be to just put beacon and execution blocks into blobs.
Image source: codex.storage
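Rough arithmetic for why such a network stays robust even though each node holds only a sliver of history. All parameters here are assumptions for illustration, not a spec:

```python
history_gb = 1_000        # ballpark size of full Ethereum history
per_node_gb = 50          # storage each participating node contributes
nodes = 50_000            # assumed number of participating nodes

fraction_per_node = per_node_gb / history_gb           # 5% of history each
copies_per_chunk = fraction_per_node * nodes
print(f"every chunk is stored ~{copies_per_chunk:.0f} times")  # ~2500 copies
```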
For a long time, this work was somewhat neglected; the Portal network exists, but it has not received attention commensurate with its importance to Ethereum's future. Fortunately, there is now strong momentum toward investing more resources in a minimal version of Portal focused on distributed storage and accessibility of history. We should build on this momentum and implement EIP-4444 as soon as possible, paired with a robust decentralized peer-to-peer network for storing and retrieving old history.
For state and ZK-EVMs, this kind of distributed approach is harder. To build blocks efficiently, you simply need to have the full state. Here, I personally favor a pragmatic approach: we define, and hold ourselves to, some level of hardware requirement for a "do-everything node", which is higher than the (ideally ever-decreasing) cost of simply validating the chain, but still low enough to be affordable to hobbyists. We rely on a 1-of-N assumption, and ensure that N is fairly large. For example, the requirement could be a high-end consumer laptop.
ZK-EVM proving may be the trickiest part, as real-time ZK-EVM provers are likely to require considerably more powerful hardware than archive nodes, even with advances like Binius and worst-case gas repricing. We could work toward a distributed proving network, where each node takes responsibility for proving, say, one percent of a block's execution, and the block producer only needs to aggregate the hundred proofs at the end. Proof aggregation trees can help further. But if that does not work well, another compromise is to allow proving hardware requirements to be higher, while ensuring that a "do-everything node" can verify Ethereum blocks directly (without proofs), fast enough to participate effectively in the network.
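A sketch of that distributed-proving idea, where `prove_slice` and `aggregate` are placeholders standing in for a real ZK-EVM prover and a real proof-aggregation scheme:

```python
from concurrent.futures import ThreadPoolExecutor

def prove_block(block, prove_slice, aggregate, n_slices: int = 100):
    """Each worker plays the role of one prover node handling ~1% of the
    block's execution; the block producer then aggregates the proofs."""
    with ThreadPoolExecutor(max_workers=n_slices) as pool:
        sub_proofs = list(pool.map(lambda i: prove_slice(block, i),
                                   range(n_slices)))
    return aggregate(sub_proofs)
```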
Conclusion
I believe that 2021-era Ethereum thinking became too comfortable with offloading responsibilities to a small number of large-scale actors, as long as some kind of market mechanism or zero-knowledge proof system existed to force those centralized actors to behave honestly. Such systems often work remarkably well in the average case, but fail catastrophically in the worst case.
We’re not doing this.
At the same time, I think it is important to recognize that current Ethereum protocol proposals have already moved significantly away from that model, and take the need for a truly decentralized network much more seriously. Ideas around statelessness, MEV mitigation, single-slot finality, and similar concepts already push much further in this direction. A year ago, the idea of doing data availability sampling by piggybacking on a few relaying nodes was seriously considered; this year, we have moved beyond the need for that, with surprisingly robust progress on PeerDAS.
But there is much more we can do in this direction, on all three of the axes above and on many other important axes as well. Helios has made great progress in giving Ethereum a practical light client. Now we need to include it by default in Ethereum wallets, get RPC providers to supply proofs along with their results so that they can be verified, and extend light client technology to layer 2 protocols. If Ethereum is going to scale via a rollup-centric roadmap, then layer 2s need the same security and decentralization guarantees as layer 1. There are plenty of other things a rollup-centric world should take more seriously; decentralized and efficient cross-L2 bridging is one example. Many dapps currently get their logs through centralized protocols, because native Ethereum log scanning has become too slow. We could improve this with a dedicated decentralized sub-protocol; here is a suggestion for how to do it.
There is an almost infinite number of blockchain projects aiming for "super speed now, decentralization later." I do not think Ethereum should be one of them. Ethereum L1 can and should be a strong base layer for L2 projects that take a hyper-scaling approach, using Ethereum as a backbone for decentralization and security. Even an L2-centric approach requires L1 itself to have enough scalability to handle a significant volume of operations. But we should have deep respect for the properties that make Ethereum unique, and keep working to maintain and improve those properties as Ethereum scales.