Oh yes indeed. Just as Node.js and NIO proved to the world that bare-metal performance is always worth the consequent unreadable code, just as the flood of bespoke NoSQL databases taught us to value purpose-built solutions over general ones, and just as the Reactive Manifesto reminded us that new branding can give a youthful glow to decades-old ideas, we follow in their Chukka-booted footsteps by challenging the comforts of the ubiquitous Transmission Control Protocol and recognizing the well-deserved renaissance of artisanal protocols built on UDP. We didn’t start this fire, we’re just calling it what it is: the NoTCP movement.
The thing TCP aims to provide over raw IP or UDP is reliability. Packets are error checked, explicitly acknowledged or retransmitted as needed, and reassembled in order; the user is notified if reads or writes may have failed. This can be a handy correctness property for certain applications to build on, but we argue that branding this property “reliability” is awfully misleading. The Oxford English Dictionary defines reliable, as an adjective, to mean:
This gives us warm, fuzzy feelings toward TCP that it doesn’t properly deserve. That is, it might inappropriately suggest “TCP never fails” rather than “TCP never silently fails.” Furthermore, failure is defined to some extent by the user: all of TCP’s sequencing and retransmission of lost packets takes time — occasionally a really long time — and we simply can’t wait forever. We typically have some (perhaps limited) control over retransmission timeouts, and our socket APIs should always provide control over user timeouts (for example, POSIX’s setsockopt). Still, we rely on implementation-defined default timeouts and neglect to handle failures, complacent in our ignorance, because “TCP is reliable.”
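For the skeptics, here is a minimal sketch of what setting those timeouts explicitly might look like in Python. The helper name `make_tcp_socket` and the default values are illustrative assumptions, not recommendations, and `TCP_USER_TIMEOUT` is a Linux-specific option that other platforms may not offer.

```python
import socket

def make_tcp_socket(io_timeout_s=5.0, user_timeout_ms=10_000):
    """Create an unconnected TCP socket with explicit timeouts (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Application-side timeout: blocking connect/send/recv calls raise
    # socket.timeout instead of waiting on implementation-defined defaults.
    sock.settimeout(io_timeout_s)
    # Linux-only user timeout: how long transmitted data may remain
    # unacknowledged before the kernel gives up on the connection.
    if hasattr(socket, "TCP_USER_TIMEOUT"):
        try:
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT, user_timeout_ms)
        except OSError:
            pass  # assumption: some environments expose the constant but reject the option
    return sock
```

Any `socket.timeout` raised on a socket built this way still has to be caught and handled by the caller, which is exactly the part we tend to skip.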
TCP is a connection-oriented, multiplexing protocol over IP, where a connection is identified by the IP address and port for the local and remote hosts. So suppose we’re conscientiously setting artisanal timeouts for all our TCP messages. If an individual packet is lost, as determined by the retransmit timeout, it’s sent again over the same connection; if a user timeout is reached, the connection is closed.
This might not be the behavior we want. For example:
The unifying theme here is that the application knows its specific needs, with respect to failure modes and recovery, better than TCP could ever anticipate. When we try to handle network failures in an application built on TCP, we invariably end up poorly reimplementing portions of TCP’s own failure handling logic, missing significant optimization opportunities in the process. Layering on TCP in many cases does us more harm than good; this is a variation of the end-to-end argument.
TCP establishes a connection with a three-way handshake, which is relatively expensive in terms of latency. Adding to the problem is slow-start, which limits the throughput of a new connection for the sake of network congestion avoidance. If we want to establish multiple connections, even between the same two IPs, we need to pay these costs for each connection. This is common for HTTP/1.1 traffic, for example, where a client will fetch multiple resources from the same server in parallel.
We naturally want to avoid these per-connection costs, but here’s where things get weird: exhibiting a behavior not entirely dissimilar to Stockholm syndrome, we implement protocols that multiplex over TCP — such as HTTP/2 — rather than dropping down a layer. We take multiple, logical streams of data and interleave them in nondeterministic order over a single TCP connection.
The problem now is that TCP gives us an ordering guarantee that’s much stronger than we need: all packets are reassembled in order as they’re received, but we know that the order in which we interleaved them never mattered to begin with. One delayed packet can artificially delay the availability of the data for all streams multiplexed over that connection; this is one form of head-of-line blocking. As before, the application knows its specific needs better than TCP; in this case we’ve traded startup latency for significant steady-state latency, when neither is necessary.
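The head-of-line blocking described above can be illustrated with a toy model. This is only a sketch: the function `delivery_times`, the tuple layout, and the arrival times are made up for illustration and do not model real TCP behavior such as windows or retransmission timers.

```python
def delivery_times(packets, per_stream):
    """Return {seq: time the payload becomes available to the application}.

    packets: list of (seq, stream, arrival_time), where seq is the send order.
    per_stream=False models one ordered connection shared by all streams;
    per_stream=True models ordering enforced only within each stream.
    """
    delivered = {}
    latest = {}  # ordering domain -> latest arrival among packets seen so far
    for seq, stream, arrival in sorted(packets):
        domain = stream if per_stream else "connection"
        # In-order delivery: a packet cannot be released before every
        # earlier packet in the same ordering domain has arrived.
        latest[domain] = max(latest.get(domain, 0.0), arrival)
        delivered[seq] = latest[domain]
    return delivered

# Packet 1 (stream "b") is delayed until t=9.0; everything else arrives promptly.
packets = [(0, "a", 1.0), (1, "b", 9.0), (2, "a", 2.0), (3, "c", 3.0)]
print(delivery_times(packets, per_stream=False))  # {0: 1.0, 1: 9.0, 2: 9.0, 3: 9.0}
print(delivery_times(packets, per_stream=True))   # {0: 1.0, 1: 9.0, 2: 2.0, 3: 3.0}
```

With a single ordered connection, the late packet on stream "b" holds back data for streams "a" and "c" as well. With per-stream ordering, only stream "b" waits.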
Oh, absolutely. There are plenty of successful, purpose-built protocols that used UDP before it was cool: DNS, NTP and RTP, to name a few. A more recent success is BitTorrent’s uTP. But we think the poster-child for this movement might be Google’s experimental QUIC protocol. You probably haven’t heard of it, but it’s a “reliable,” connection-oriented protocol — in the same sense as TCP — that supports multiplexing without head-of-line blocking, and doesn’t arbitrarily tie a connection to an IP/port 4-tuple. It is encrypted, which provides an interesting, perhaps subtle guarantee: middle boxes can’t snoop the flow and congestion control information (such as ACKs and NACKs). This gives us some freedom to choose different congestion control algorithms best suited to our needs.
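To make the "application knows best" idea concrete at the small end of the scale, here is a rough sketch of a request/response exchange over UDP in which the application chooses its own timeout and retry policy, loosely in the spirit of how a DNS client retries a query. The function name `udp_request`, the buffer size, and the defaults are assumptions for illustration. It is nowhere near a real protocol: there is no request ID, no backoff, and no congestion control.

```python
import socket

def udp_request(payload, addr, timeout_s=0.2, retries=3):
    """Send one datagram and wait for one reply, resending on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        for _ in range(retries):
            sock.sendto(payload, addr)
            try:
                # No matching of replies to requests here; a real protocol
                # would carry an ID and check it.
                reply, _peer = sock.recvfrom(2048)
                return reply
            except socket.timeout:
                continue  # the application decides to try again
    raise TimeoutError(f"no reply from {addr} after {retries} attempts")
```

The point is not that this is better than TCP, only that the failure policy (how long to wait, how often to retry, when to give up) lives in the application rather than in an implementation-defined default.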
Perhaps even QUIC is too fast-food for your refined tastes, though. Maybe you think its implementation of forward error-correction is just an ECIP rip-off. You might view the layered OSI model as a fundamental mistake, and look for other ways to horizontally compose new, locally-sourced protocols from modular building-blocks. Or maybe you have some clue about how the internet actually works, and you don’t need a manifesto to tell you what you already know.
Sign the manifesto and join the movement: follow @notcp on Twitter.