Infrastructure

Decentralized infrastructure: What happens when operators lose control

In decentralized systems, the protocol determines correctness, and the operator determines reliability. This shifts operators from writing the rules to following them and changes where the hardest infrastructure decisions live.

Josh Dougall, Ben Adar

May 21, 2026 • 5 min read

How infrastructure as code shaped the assumption of operator control

Traditional web infrastructure operates on one assumption: the team operating the system controls the system. This assumption has shaped the trajectory of infrastructure engineering, bringing us the rise of Infrastructure as Code (IaC) over the last decade.

IaC is the practice of defining and managing software infrastructure, such as servers, networks, and databases, through machine-readable configuration files. Tools like Terraform and Ansible let operators declare what the system should look like, and the tooling enforces that desired state automatically.
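As a rough illustration of the desired-state idea, the sketch below compares a declared configuration with what is actually running and lists the actions needed to reconcile them. It is a toy model, not how Terraform or Ansible are implemented; the `Server` type, the `plan` function, and the server names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    """Hypothetical resource type used only for this illustration."""
    name: str
    size: str


def plan(desired: dict[str, Server], actual: dict[str, Server]) -> list[str]:
    """Return the actions needed to move `actual` to `desired`."""
    actions = []
    # Anything declared but missing is created; anything that drifted is updated.
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name} ({spec.size})")
        elif actual[name] != spec:
            actions.append(f"update {name}: {actual[name].size} -> {spec.size}")
    # Anything running but no longer declared is destroyed.
    for name in sorted(actual):
        if name not in desired:
            actions.append(f"destroy {name}")
    return actions


desired = {"web-1": Server("web-1", "large"), "db-1": Server("db-1", "medium")}
actual = {"web-1": Server("web-1", "small"), "cache-1": Server("cache-1", "small")}

print(plan(desired, actual))
# ['create db-1 (medium)', 'update web-1: small -> large', 'destroy cache-1']
```

The point of the sketch is that the operator writes `desired` and the tooling makes reality match it. That is exactly the authority decentralized protocols take away.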

A survey of 753 leaders and professionals worldwide found that at least 80% were using some form of IaC. Cloud-based infrastructure is experiencing a massive increase in adoption, largely driven by demand for AI. The assumption of operator control holds even in the most highly distributed web2 environments.

Cloud computing introduced the concept of shared responsibility, but within a well-defined contractual framework. Operational authority is distributed across organizations, but each organization's scope of control is clearly delineated.

The system may be distributed across many machines and multiple regions, but a single organization ultimately owns its deployment pipeline, its access controls, and its operational policies.

Decentralized protocols remove operator control

Decentralized systems break this assumption. The team operating the system no longer controls it.

Once infrastructure depends on a decentralized protocol network, control becomes external. These systems have a higher degree of autonomy, as the protocol determines the rules, and operators must adapt to them rather than enforce them.

Who governs Ethereum's infrastructure?

As of May 2026, Ethereum has nearly 900,000 active validators operating in more than 80 countries. A node operator's entire job is to ensure uptime. Operators don’t validate transactions directly; they keep their machine’s software live so that the protocol’s consensus mechanism can.

Since so many people depend on Ethereum's stability, the coordination threshold for core changes is very high. A 2024 study found that over 1,100 individuals commented on Ethereum improvement proposals (EIPs). But the same study also found that 10 individuals are responsible for proposing 68% of all implemented Core EIPs, and on average, 10 people per client implementation (the software nodes run) are responsible for 80% of all software changes.

None of these individuals is likely to be on your infrastructure team (unless you're ChainSafe, with both the Lodestar client implementation and a full infrastructure team under your belt). Yet the decisions they make, through governance processes spanning multiple independent organizations and development teams, directly determine how the system your infrastructure serves behaves.

The inversion: From writing the rules to following them

This inverts the traditional operating model: you run it, you control it. In conventional infrastructure, the operator writes the rules: the IaC defines the desired state, the CI/CD pipeline enforces deployment schedules, and the monitoring stack alerts on deviation from expected behaviour. In decentralized systems, the protocol's rules are determined by its design and governance processes, and the operator's job is to follow them.

Why decentralized networks still go down: Solana's rise

The consequences of this inversion are not theoretical. Solana, one of the highest-throughput blockchain networks, has experienced seven major outages that completely halted block production since its mainnet launch in 2020. Solana prioritizes safety over availability in its consensus design, meaning the network will halt completely rather than risk inconsistent states or double-spending.
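To make the safety-over-availability trade-off concrete, here is a minimal sketch of a generic BFT-style supermajority rule: a block is finalized only if validators holding more than two-thirds of total stake vote for it, and otherwise the network makes no progress. This is an illustration of the general principle, not Solana's actual consensus code; the function name, validator names, and stake figures are hypothetical.

```python
def can_finalize(stakes: dict[str, int], voting: set[str]) -> bool:
    """Return True if the validators in `voting` hold more than 2/3 of total stake.

    Integer arithmetic avoids floating-point rounding at the threshold.
    """
    total = sum(stakes.values())
    voted = sum(stake for validator, stake in stakes.items() if validator in voting)
    return 3 * voted > 2 * total


# Hypothetical stake distribution, in arbitrary units.
stakes = {"validator-a": 40, "validator-b": 35, "validator-c": 25}

print(can_finalize(stakes, {"validator-a", "validator-b"}))  # True: 75 of 100 voted
print(can_finalize(stakes, {"validator-a", "validator-c"}))  # False: 65 of 100, so the chain halts
```

Under a rule like this, no single operator can restore progress alone: the chain resumes only once enough stake is back online and agrees on the same state.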

It’s only an outage if it comes from the Solana region of San Diego County, otherwise it’s just a sparkling stall in block production

thx
— mert (@mert) February 6, 2024

When a validator consensus bug caused an 18-hour outage in 2022, or when a software bug triggered a block production halt in 2024, node operators could not independently patch or restart to resolve the issue. The validator community's response time has improved significantly since the early outages: the 2024 incident was resolved in under five hours, compared with much longer downtimes during 2021-2022, when coordination mechanisms were less developed. But the fundamental dynamic remained. Recovery required coordinated action across a distributed set of independent validators, not a single operator exercising control over their own systems.

Where operational control ends in decentralized systems

In practice, decentralization shifts the boundary between control and verification.

Infrastructure teams no longer guarantee correctness directly. In a traditional database, the operator ensures consistency through replication configuration, conflict-resolution policies, and integrity constraints that the operator defines and enforces. If the data is wrong, the operator has the authority to trace the issue, correct the state, and prevent recurrence.

Decentralized systems work differently. The operator neither produces nor adjudicates the correct state. The protocol's consensus mechanism does, and correctness can be independently verified by anyone with access to the cryptographic proofs. The operator's role shifts from "ensure this system produces correct results" to "ensure our infrastructure correctly follows and serves what the protocol determines to be correct."
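One common form of independent verification is a Merkle inclusion proof: given a trusted root hash, anyone can check that a piece of data belongs to the committed set without trusting whoever served it. The sketch below uses a simplified binary Merkle tree over SHA-256 and assumes a non-empty list of leaves. It is not Ethereum's actual data structure (which uses Keccak-256 and Merkle Patricia tries), and all function names are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    """Hash adjacent pairs to build the parent level."""
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Return (sibling_hash, sibling_is_left) pairs from leaf to root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # the other node in this pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from the leaf and its proof, then compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


leaves = [b"tx-a", b"tx-b", b"tx-c"]
root = merkle_root(leaves)
proof = merkle_proof(leaves, 2)

print(verify(b"tx-c", proof, root))    # True
print(verify(b"tx-evil", proof, root))  # False
```

The verifier needs only the leaf, the proof, and the root. An infrastructure operator serving this data cannot make an incorrect answer verify, which is why their responsibility becomes serving the protocol's state faithfully rather than deciding what it is.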

Web3 infrastructure expectations vs. web2 uptime standards

This shift redistributes operational responsibility.

The web3 infrastructure market that has emerged around this redistribution makes the point clearly. Top-tier RPC node providers now routinely commit to 99.90% to 99.99% availability, and enterprise clients hold them to it. The expectations placed on infrastructure operators do not shrink because the underlying system is decentralized.
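To put those percentages in perspective, the arithmetic below converts an availability target into a downtime budget. The 30-day window is an assumption made for illustration; real service-level agreements define their own measurement periods.

```python
def downtime_budget_minutes(availability_pct: float, days: int = 30) -> float:
    """Minutes of allowed downtime over `days` at the given availability target."""
    total_minutes = days * 24 * 60
    return (1 - availability_pct / 100) * total_minutes


print(f"{downtime_budget_minutes(99.9):.2f}")   # 43.20 minutes per 30 days
print(f"{downtime_budget_minutes(99.99):.2f}")  # 4.32 minutes per 30 days
```

At 99.99%, a single protocol-level incident lasting a few minutes can consume the whole monthly budget, even if the operator's own machines never failed.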

If anything, they intensify. Because applications built on this infrastructure often handle significant financial value, unreliable RPC endpoints lead to outages in mission-critical services, poor user experience, and business risk that can threaten a company's survival. Standard web2 uptime benchmarks still exceed what web3 infrastructure currently delivers, which means web3 infrastructure teams face the additional challenge of closing that gap while operating on systems they do not fully control.

From control to coordination: The new operating model for web3 infrastructure

This creates the defining tension of decentralized infrastructure work. The AWS shared responsibility model, for all its complexity, draws a clean line: AWS operates and controls the components from the host operating system and virtualization layer down to the physical security of the facilities, while the customer manages the guest operating system and application software.

Both sides of that boundary operate within a single contractual relationship with a well-documented scope. In decentralized systems, the boundary is not between you and a provider bound by contract. It is between you and a protocol network governed by community consensus, with a very high coordination threshold for core changes, and where the behaviour of the underlying system can shift due to governance decisions, validator participation patterns, or network conditions that no single operator controls.

That shifts the role of infrastructure from control to coordination. The goal becomes building systems that remain reliable despite external dependencies and dynamic network conditions, not by enforcing how the system behaves but by adapting to it. This shift challenges many existing operational assumptions, but it also creates opportunities for more resilient and transparent systems, ones where correctness can be independently verified rather than merely trusted.
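One way adaptation can look in practice is failover logic that reacts to what the network reports instead of assuming a fixed healthy state. The sketch below picks the first node, in priority order, that is healthy and not lagging too far behind the highest block height seen. The `EndpointStatus` type, the `max_lag` threshold, and the URLs are hypothetical, and gathering the status data (for example by polling each node) is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EndpointStatus:
    """Hypothetical snapshot of one RPC node, as gathered by a health checker."""
    url: str
    healthy: bool
    block_height: int


def pick_endpoint(statuses: list[EndpointStatus], max_lag: int = 3) -> Optional[str]:
    """Return the first healthy endpoint within `max_lag` blocks of the best tip.

    `statuses` is assumed to be in priority order. Returns None if no
    endpoint is healthy.
    """
    live = [s for s in statuses if s.healthy]
    if not live:
        return None
    tip = max(s.block_height for s in live)
    for s in live:
        if tip - s.block_height <= max_lag:
            return s.url
    return None


statuses = [
    EndpointStatus("https://primary.example", healthy=True, block_height=990),
    EndpointStatus("https://secondary.example", healthy=True, block_height=1000),
    EndpointStatus("https://tertiary.example", healthy=False, block_height=1000),
]

print(pick_endpoint(statuses))  # https://secondary.example (primary is 10 blocks behind)
```

The operator does not decide what the chain tip is. The network does, and the routing logic simply follows whichever node is tracking it most faithfully.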

Understanding how to operate effectively within this model starts with a specific question: where exactly does your operational control end?

The protocol determines correctness. The operator determines reliability. That boundary is where decentralized infrastructure work actually begins, and where the hardest operational decisions around monitoring, failover strategy, and upgrade coordination live.

About ChainSafe

ChainSafe is a leading blockchain research and development firm specializing in protocol engineering, infrastructure development & operations, and co-development.

ChainSafe creates solutions for developers and teams across web3. As part of our mission to build accessible, improved tooling for developers, ChainSafe embodies an open source, community-guided ethos to advance the future of the internet.
