Somtochi Onyekwere’s recent talk at QCon London 2025 delved into Corrosion, Fly.io’s open-source distributed system designed for fast eventual consistency. Corrosion was developed as a replacement for Consul to improve scalability and data dissemination speed across Fly.io’s globally distributed cloud platform.
Fly.io’s platform lets developers deploy applications across more than 40 regions. It transforms Docker images into Firecracker VMs and uses Anycast, announcing the same IP address from multiple edge servers, to route each user to the nearest one. The platform’s initial dependency on Consul led to high latency for read and write operations. Each node also ran an Attache process to pull data from Consul, which added overhead.
To address these challenges, Fly.io developed Corrosion, a distributed system replicating SQLite data across nodes. Each node in the Fly.io network runs an instance of Corrosion, utilizing SQLite for local data management.
Corrosion uses Conflict-free Replicated Data Types (CRDTs) to manage data synchronization. CRDTs allow independent updates on different nodes. These updates can then be merged without requiring a specific order of operations. This ensures that all replicas eventually converge to the same state. The system employs the SWIM gossip protocol for cluster membership. This allows nodes to be aware of each other’s state. Corrosion uses the QUIC transport protocol for efficient data packet exchange between nodes.
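The order-independence described above can be sketched with a small last-writer-wins map. This is only an illustration under stated assumptions: the article does not say which CRDT Corrosion uses, and the `Change` type, the key names, and the `(version, node_id)` tie-break are hypothetical, not Corrosion's data model.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Change:
    key: str      # cell being written (hypothetical naming)
    value: str
    version: int  # logical clock value for this write (assumption)
    node_id: str  # tie-breaker when two writes carry the same version


def merge(state: dict, change: Change) -> dict:
    """Apply one change, keeping the write with the higher (version, node_id)."""
    current = state.get(change.key)
    if current is None or (change.version, change.node_id) > (
        current.version,
        current.node_id,
    ):
        return {**state, change.key: change}
    return state


def apply_all(changes) -> dict:
    """Fold a sequence of changes into a state, in the order given."""
    state = {}
    for change in changes:
        state = merge(state, change)
    return state


changes = [
    Change("app-1/region", "lhr", version=1, node_id="node-a"),
    Change("app-1/region", "ams", version=2, node_id="node-b"),
    Change("app-1/region", "syd", version=2, node_id="node-c"),
    Change("app-2/region", "iad", version=1, node_id="node-a"),
]

# Deliver the same changes in every possible order and compare the results.
results = [apply_all(order) for order in permutations(changes)]
converged = all(result == results[0] for result in results)
winner = results[0]["app-1/region"].value
```

Because `merge` always keeps the maximum under a total order, the result does not depend on delivery order, and applying the same change twice leaves the state unchanged. Those are the properties that let replicas converge without coordination.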
Onyekwere’s talk highlighted how CRDTs are crucial in maintaining eventual consistency, because replicas can accept updates independently and still converge. The talk also addressed how the system handles CAP theorem constraints: Corrosion prioritizes availability and partition tolerance at the cost of immediate consistency, so a read on one node may briefly return stale data. Efficient data routing and dissemination are therefore key to providing fast operations across Fly.io’s global network.
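The trade-off can be made concrete with a toy simulation of two nodes that keep accepting writes during a partition and reconcile afterwards. This is a sketch under loud assumptions: the `Node` class, the `kv` table, and the version scheme are hypothetical and do not reflect Corrosion's schema or API, and direct method calls stand in for the real network exchange.

```python
import sqlite3


class Node:
    """Hypothetical node: a local SQLite store plus a log of changes it has seen."""

    def __init__(self, name: str):
        self.name = name
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE kv ("
            "key TEXT PRIMARY KEY, value TEXT, version INTEGER, origin TEXT)"
        )
        self.log = []  # changes to forward to peers once they are reachable

    def apply(self, key, value, version, origin):
        # Last-writer-wins upsert: overwrite only when the incoming write is newer.
        self.db.execute(
            """
            INSERT INTO kv (key, value, version, origin) VALUES (?, ?, ?, ?)
            ON CONFLICT(key) DO UPDATE SET
                value = excluded.value,
                version = excluded.version,
                origin = excluded.origin
            WHERE excluded.version > kv.version
               OR (excluded.version = kv.version AND excluded.origin > kv.origin)
            """,
            (key, value, version, origin),
        )
        self.log.append((key, value, version, origin))

    def write(self, key, value):
        # Local writes succeed even when no peer is reachable (availability).
        row = self.db.execute(
            "SELECT version FROM kv WHERE key = ?", (key,)
        ).fetchone()
        self.apply(key, value, (row[0] if row else 0) + 1, self.name)

    def read(self, key):
        row = self.db.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def sync_from(self, peer):
        # Replay everything the peer has seen; stale or duplicate changes are no-ops.
        for change in list(peer.log):
            self.apply(*change)


a, b = Node("node-a"), Node("node-b")

# Partition: the nodes cannot talk, but both still accept writes.
a.write("app-1/state", "started")
b.write("app-1/state", "stopped")
during_partition = (a.read("app-1/state"), b.read("app-1/state"))

# Partition heals: exchange changes in both directions.
a.sync_from(b)
b.sync_from(a)
after_heal = (a.read("app-1/state"), b.read("app-1/state"))
```

During the partition the two nodes disagree, which is the consistency being given up. After the exchange both hold the same value, chosen by the deterministic tie-break rather than by arrival order.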
Onyekwere also discussed potential issues with Corrosion, including the lack of built-in authentication or authorization and the difficulty of handling destructive changes and schema management. The key lessons from the transition from Consul to Corrosion centered on optimizing how data is written and read in distributed environments.
Implemented in Rust for its memory safety and efficiency, Corrosion is positioned as a robust solution for modern cloud applications. In particular, it suits applications that need data to converge quickly across distributed networks and can tolerate eventual, rather than strong, consistency.