IronTide
From-scratch Rust BitTorrent engine targeting libtorrent-rasterbar parity
BitTorrent looks simple on paper and is brutal in code. Hundreds of peers, untrusted bytes on every wire, an unreliable network, mutable shared state. I picked it as a first serious systems project because there's nowhere to hide — every layer has to actually work, on real torrents, against real swarms, or nothing downloads.
The target is libtorrent-rasterbar parity, not feature-for-feature but in kind: a library that other applications can embed, configurable transport, pluggable storage, and deterministic behaviour under load. It has to work on real torrents against real swarms, and the code should be clear enough that someone else can read it and follow what is happening.
Async actors over shared state
A BitTorrent client has dozens of things happening at once: peers connecting and dropping, pieces being verified, trackers being polled, the UI updating. Mutexes everywhere is the obvious approach and a debugging nightmare. Instead, each peer is its own tokio task, each piece-picker its own task, each tracker session its own task, all communicating through typed channels. tokio's select! macro is the load-bearing primitive. I learned quickly that structured concurrency isn't optional here — without it, the system becomes impossible to reason about within a few days.
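Reduced to a sketch, the shape looks something like this. The message and type names are illustrative, not the engine's real API; the point is that each peer owns its own loop and everything reaches it through channels.

```rust
use tokio::sync::mpsc;
use tokio::time::{interval, Duration};

// Illustrative message types; the real channels carry richer commands and events.
#[derive(Debug)]
enum PeerCommand {
    RequestBlock { piece: u32, offset: u32, len: u32 },
    Shutdown,
}

#[derive(Debug)]
enum PeerEvent {
    BlockReceived { piece: u32, offset: u32, data: Vec<u8> },
    Disconnected,
}

// One actor per peer: it owns its own state and talks to the engine only via channels.
async fn peer_actor(mut commands: mpsc::Receiver<PeerCommand>, events: mpsc::Sender<PeerEvent>) {
    let mut keepalive = interval(Duration::from_secs(120));
    loop {
        tokio::select! {
            cmd = commands.recv() => match cmd {
                Some(PeerCommand::RequestBlock { piece, offset, len }) => {
                    // In the real client this writes a Request message to the socket.
                    let _ = (piece, offset, len);
                }
                Some(PeerCommand::Shutdown) | None => {
                    let _ = events.send(PeerEvent::Disconnected).await;
                    return;
                }
            },
            _ = keepalive.tick() => {
                // Periodic work (keep-alives, timeout checks) lives in the same loop,
                // so there is no shared state to lock.
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (cmd_tx, cmd_rx) = mpsc::channel(64);
    let (evt_tx, mut evt_rx) = mpsc::channel(64);
    tokio::spawn(peer_actor(cmd_rx, evt_tx));

    cmd_tx.send(PeerCommand::RequestBlock { piece: 0, offset: 0, len: 16_384 }).await.unwrap();
    cmd_tx.send(PeerCommand::Shutdown).await.unwrap();
    assert!(matches!(evt_rx.recv().await, Some(PeerEvent::Disconnected)));
}
```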
The wire protocol, byte by byte
Hand-rolling the wire protocol means writing a length-prefixed message framer, a bencode parser that doesn't allocate per key, a peer handshake with extension negotiation (BEP-10), µTP for UDP transport, Mainline DHT for trackerless discovery, and PEX for peer exchange. Bencode taught me to write a streaming parser. µTP taught me that TCP-on-UDP congestion control is harder than it looks: LEDBAT, sequence numbers, selective ACKs. DHT is where you learn what it means to talk to thousands of strangers' implementations of the same spec, all of which behave slightly differently.
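As a flavour of the lowest layer, here is a minimal sketch of reading one framed message in the BEP-3 format: a 4-byte big-endian length, a 1-byte id, then the payload, with a zero length meaning keep-alive. It is not IronTide's actual framer, and a real one caps the length before allocating, because the peer is untrusted.

```rust
use tokio::io::{AsyncRead, AsyncReadExt};

#[derive(Debug)]
pub enum WireMessage {
    KeepAlive,
    // Raw id plus payload; the real parser decodes Choke, Have, Piece, ... from here.
    Other { id: u8, payload: Vec<u8> },
}

pub async fn read_message<R: AsyncRead + Unpin>(reader: &mut R) -> std::io::Result<WireMessage> {
    let len = reader.read_u32().await?; // 4-byte big-endian length prefix
    if len == 0 {
        return Ok(WireMessage::KeepAlive); // zero length is a keep-alive
    }
    // A real framer rejects absurd lengths here instead of trusting the peer
    // with an allocation of its choosing.
    let id = reader.read_u8().await?;
    let mut payload = vec![0u8; (len - 1) as usize];
    reader.read_exact(&mut payload).await?;
    Ok(WireMessage::Other { id, payload })
}
```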
Deterministic concurrency simulation
Concurrent code under tokio is non-deterministic by default; the scheduler picks tasks however it wants, so a test passes 99 times and then fails on the build server. I swap tokio for a deterministic runtime in tests: same APIs, but every poll is scheduled by a seeded PRNG. A failing test prints its seed, and replaying that seed reproduces the exact schedule so you can watch the bug happen again. FoundationDB and TigerBeetle use this approach. Getting a working version running in this project changed how I think about testing async systems.
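The real runtime swap is more involved, but the core idea fits in a toy: a single-threaded executor that polls tasks in an order drawn from a seeded PRNG, so the same seed replays the same interleaving. This sketch only works for futures that make progress when re-polled (channels, in-memory state machines); anything that needs a real reactor, such as timers or sockets, needs the full simulated runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A waker that does nothing: the executor just re-polls everything anyway.
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn no_op(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

// Tiny xorshift PRNG so the sketch has no dependencies; the seed must be non-zero.
struct XorShift64(u64);
impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

// Polls tasks in seed-determined order until all complete.
// A failing test prints `seed`; re-running with that seed replays the schedule.
pub fn run_with_seed(seed: u64, mut tasks: Vec<Pin<Box<dyn Future<Output = ()>>>>) {
    let mut rng = XorShift64(seed.max(1));
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    while !tasks.is_empty() {
        let i = (rng.next() % tasks.len() as u64) as usize;
        if tasks[i].as_mut().poll(&mut cx).is_ready() {
            tasks.swap_remove(i);
        }
    }
}
```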
Backpressure and the cost of zero-copy
Bytes come off a TCP socket, get verified against a SHA-1 piece hash, and need to land at the right offset of the right file. Naively that's three or four allocations per chunk. With io_uring and Bytes (refcounted buffer slices) you can do it in one. But zero-copy only works if every actor in the pipeline honours it, and any buffering decision becomes a backpressure decision. When the disk is slow, the network task has to feel it — not by dropping packets, but by yielding upstream until the channel drains. Getting this right is what keeps memory usage flat instead of climbing.
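A sketch of that handoff, assuming tokio's bounded mpsc and the bytes crate. The job type, the channel capacity, and the artificially slow disk are illustrative; the mechanism is that an awaited send on a full channel is what slows the network side down.

```rust
use bytes::Bytes;
use tokio::sync::mpsc;
use tokio::time::{sleep, Duration};

struct WriteJob {
    piece: u32,
    offset: u64,
    data: Bytes, // refcounted slice of the receive buffer: handing it off copies nothing
}

#[tokio::main]
async fn main() {
    // The bounded channel IS the backpressure: at most 16 jobs in flight.
    let (tx, mut rx) = mpsc::channel::<WriteJob>(16);

    // "Network" task: produces chunks as fast as it can.
    let producer = tokio::spawn(async move {
        let buffer = Bytes::from(vec![0u8; 16 * 1024]);
        for i in 0..64u32 {
            let job = WriteJob {
                piece: i / 16,
                offset: (i as u64 % 16) * 16 * 1024,
                data: buffer.slice(..),
            };
            // When the disk falls behind, this await parks the task instead of
            // buffering without bound; upstream, the TCP window fills and the peer slows.
            tx.send(job).await.expect("disk task gone");
        }
    });

    // "Disk" task: deliberately slow consumer.
    while let Some(job) = rx.recv().await {
        sleep(Duration::from_millis(5)).await; // pretend write and fsync
        let _ = (job.piece, job.offset, job.data.len());
    }
    producer.await.unwrap();
}
```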
Pluggable backends, real abstractions
Storage is a trait. Network transport is a trait. The piece picker is a trait. Sounds like over-engineering until you write the first integration test and realise you need to swap the disk for an in-memory implementation and the network for a deterministic one. The abstractions exist because testing demanded them, not because I planned for hypothetical future backends.
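The storage seam, roughly. The real trait differs (async methods, richer error types), but the point is the same: the in-memory backend is what the integration tests plug in so a whole download can run without touching a disk.

```rust
use std::collections::HashMap;

// Hypothetical shape of the storage seam; IronTide's actual trait will differ.
pub trait Storage: Send {
    fn write_block(&mut self, piece: u32, offset: usize, data: &[u8]);
    fn read_block(&self, piece: u32, offset: usize, len: usize) -> Option<Vec<u8>>;
}

// In-memory backend: pieces live in a map instead of on disk.
pub struct MemStorage {
    piece_len: usize,
    pieces: HashMap<u32, Vec<u8>>,
}

impl MemStorage {
    pub fn new(piece_len: usize) -> Self {
        Self { piece_len, pieces: HashMap::new() }
    }
}

impl Storage for MemStorage {
    fn write_block(&mut self, piece: u32, offset: usize, data: &[u8]) {
        let piece_len = self.piece_len;
        let buf = self.pieces.entry(piece).or_insert_with(|| vec![0; piece_len]);
        buf[offset..offset + data.len()].copy_from_slice(data);
    }

    fn read_block(&self, piece: u32, offset: usize, len: usize) -> Option<Vec<u8>> {
        self.pieces.get(&piece).map(|buf| buf[offset..offset + len].to_vec())
    }
}
```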
The 17-crate workspace
Splitting the project into 17 crates wasn't aesthetic — each crate is a compilation boundary, an API surface, and a rebuild horizon. When peer-protocol changes, disk-io doesn't rebuild. When the UI changes, the engine doesn't rebuild. That matters when a full build takes 90 seconds instead of 9. It also forces you to think about layering: what does this crate depend on, and what is it allowed to know about? You end up enforcing those boundaries in the dependency graph instead of in your head.
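The workspace manifest is where that layering becomes explicit. A sketch, with only peer-protocol and disk-io taken from the text above and the other names standing in for the rest:

```toml
# Workspace root Cargo.toml, sketched. Only peer-protocol and disk-io are names
# used in this write-up; the rest are stand-ins for the other crates.
[workspace]
resolver = "2"
members = [
    "crates/bencode",
    "crates/peer-protocol",
    "crates/piece-picker",
    "crates/disk-io",
    "crates/dht",
    # ...and the remaining crates
]
```

Each crate's own [dependencies] section then enforces the layering: disk-io never lists peer-protocol, so a change on one side cannot quietly reach into the other.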
Shared shape with AI platforms
An LLM gateway and a BitTorrent engine look nothing alike from the outside. Under the hood, both are many concurrent sessions, untrusted inputs, streaming bytes through a pipeline with backpressure, and observability on every actor. Building one teaches you how to build the other.
Why Rust for this
For a one-person project with this much concurrency, I needed the compiler to catch what I couldn't hold in my head. Refactoring a hot path and having the borrow checker tell me immediately what broke is the difference between shipping and stalling.
How the learning happened
The first month was bencode and the wire format and not understanding why my parser kept allocating. The second month was the actor architecture and learning that lifetimes and Send/Sync bounds are not theoretical. The third was DHT and discovering that half the peers on the network don't implement the spec correctly and you have to handle their bugs anyway. The fourth was the deterministic runtime and realising that the bug I'd been chasing for two weeks was a race condition that only appeared under one specific scheduler order.
Read the source, run it locally, open issues.