Speed Advances Via Better Software – Coded TCP

Recently, two technology advances have been announced which promise to deliver significant improvements in speed on crowded or lossy networks: Coded TCP and WiFox.

Both of these advances are based entirely in software, implying that they can be easily rolled out and installed on existing equipment – no hardware changes required. Given that technical details have yet to be announced on either method, it’s worth dissecting the claims made in the announcements to understand what these advances are, how they work, and, more importantly, why they work. In this post we will address Coded TCP. In a subsequent post we’ll address WiFox.

Where the Change will Happen
Coded TCP introduces changes to the way TCP behaves. It therefore requires changes both on the client and on something upstream. Although the most obvious place to make the upstream change is on the server, the logistics of rolling out TCP stack changes to every server across the entire Internet suggest that it may be more effective to create in-line proxies to apply the TCP changes for transmission to the client. This is essentially a Layer 4 solution.

What Problem is Solved
Coded TCP addresses a subtle problem deep in the internals of TCP itself. TCP implements reliable delivery through acknowledgement of received data: the receiver sends an ACK packet back to the sender. It’s fairly smart about this, and has a number of internal timers and state machines which have been designed over time to address problems as they’ve been identified.

One such problem was “delayed duplicates,” in which a network change would suddenly increase latency, delaying the delivery of either a packet or its ACK. The sender, unaware of the sudden change, would assume that its original packet had been lost, and re-transmit. However, if the receiver had already sent an ACK for the first packet, then the sender would assume that the ACK was in response to the re-transmission (rather than the original packet), and it would incorrectly conclude that the network had gotten faster rather than slower. The incorrect time adjustment then locks the sender into a cycle of inappropriately rapid retransmissions for every packet, with a correspondingly false interpretation of the network timing. In response to this problem, TCP was re-engineered to be more patient in reacting to a missing ACK, to resume the transmission more slowly, and to require more frequent ACKs.

This slow resume feature is now itself a problem in chronically lossy networks (such as cellular and Wi-Fi). A lost TCP packet can result in a significant delay in retransmission, often on the order of several seconds, and usually requiring reception of multiple duplicate ACKs. These delays are deadly to streaming media. When the primary concern for networking was low bandwidth and cost per packet, the caution in TCP made sense. Today, when bandwidth is cheap and the primary concern is rapid delivery, the caution leads to usability problems for users accustomed to real-time interactivity.
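
To make the timing confusion concrete, here is a minimal Python sketch of the misattributed ACK and of the more patient behavior that replaced it. It is an illustration under stated assumptions, not anything taken from a real TCP stack: the function names (rto_update, send_segment) and all of the millisecond values are made up for the example, the smoothing constants (1/8 and 1/4) follow the standard retransmission timer formula in RFC 6298, and the “patient” variant follows the rule usually credited to Karn’s algorithm (ignore timing samples from retransmitted packets, and double the timer on each timeout). Real stacks also enforce a minimum timeout, which the sketch leaves out.

```python
def rto_update(srtt, rttvar, sample, alpha=0.125, beta=0.25):
    """One smoothed-RTT update in the style of RFC 6298.

    Returns (srtt, rttvar, rto), all in milliseconds.
    """
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
    srtt = (1 - alpha) * srtt + alpha * sample
    return srtt, rttvar, srtt + 4 * rttvar


def send_segment(true_rtt, rto, karn):
    """Simulate one segment whose ORIGINAL copy is ACKed at t = true_rtt.

    Returns (rtt_sample_or_None, rto_to_use_for_the_next_segment).
    Hypothetical helper written for this illustration only.
    """
    sent_at = 0.0        # time of the most recent (re)transmission
    timeout = rto
    retransmitted = False
    while sent_at + timeout < true_rtt:
        sent_at += timeout       # timer fired before the ACK: retransmit
        retransmitted = True
        if karn:
            timeout *= 2         # patient variant: back off exponentially
    if karn and retransmitted:
        return None, timeout     # ambiguous ACK: take no sample, keep backoff
    # Naive variant: credit the ACK to the latest copy that was sent.
    return true_rtt - sent_at, rto


# Assumed starting state: a 100 ms path whose latency jumps to 900 ms.
srtt, rttvar, rto = 100.0, 25.0, 200.0
true_rtt = 900.0

# Naive sender: four needless retransmissions, then a bogus 100 ms sample.
sample, _ = send_segment(true_rtt, rto, karn=False)
naive = rto_update(srtt, rttvar, sample)
print("naive sample:", sample, "-> (srtt, rttvar, rto) =", naive)

# Patient sender: keeps backing off until an unambiguous sample arrives.
rto_k, sample_k, segments = rto, None, 0
while sample_k is None:
    sample_k, rto_k = send_segment(true_rtt, rto_k, karn=True)
    segments += 1
karn = rto_update(srtt, rttvar, sample_k)
print("clean sample after", segments, "segments:", sample_k,
      "-> (srtt, rttvar, rto) =", karn)
```

With these numbers the naive sender measures 100 ms on a 900 ms path and its timeout actually shrinks, which is the “network got faster” mistake described above. The patient sender gets the right answer, but only after sitting through progressively longer timeouts, and that waiting is exactly the cost that hurts on a chronically lossy link.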

How the Problem is Solved
Since details haven’t been released, a lot of this section is speculation, but, assuming that the problem described above is the one being solved, it’s probably accurate.

Coded TCP leverages the fact that TCP is a stream-based protocol, delivering large blocks of data that are artificially subdivided into packets. The stream nature means that the data can be divided and encoded (not encrypted; there’s a very big difference) in such a way that each packet contains not only its own payload, but also information about the packets before and after it. Although there is likely going to be additional overhead per packet – which would historically have been derided as inefficient – the advantage is that a missing packet’s payload can be re-created by the receiver, based on the context of the other packets. It’s similar in concept to RAID 5: add the overhead of parity so that an array can lose a drive without losing any data. While the encoding costs some efficiency, it more than makes up for it in reliable delivery, as the stream can continue without the significant pauses that a dropped packet would normally create.
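
The RAID 5 comparison can be sketched in a few lines of Python. To be clear, this is only the parity idea from the analogy, not the actual Coded TCP encoding, which has not been published and may well be more elaborate. The function names (xor_bytes, add_parity, recover), the group size of three, and the equal-length payloads are all assumptions chosen to keep the example short.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length payloads byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(packets: List[bytes]) -> List[bytes]:
    """Append one parity packet: the XOR of every payload in the group."""
    return packets + [reduce(xor_bytes, packets)]


def recover(received: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild a group in which exactly one packet (marked None) was lost."""
    if received.count(None) != 1:
        raise ValueError("simple parity can repair exactly one loss per group")
    missing = received.index(None)
    survivors = [p for p in received if p is not None]
    # XOR of all surviving packets (data and parity) equals the lost one.
    rebuilt = reduce(xor_bytes, survivors)
    return received[:missing] + [rebuilt] + received[missing + 1:]


# Three equal-length payloads (illustrative data) plus one parity packet.
packets = [b"CODE", b"DTCP", b"DEMO"]
sent = add_parity(packets)

# Simulate losing the second packet in transit.
damaged: List[Optional[bytes]] = list(sent)
damaged[1] = None

restored = recover(damaged)
print(restored[:-1])
```

In this sketch the receiver rebuilds the lost payload locally and never has to wait for a retransmission. The price is one extra packet for every three sent, and a second loss in the same group would still have to fall back on ordinary TCP recovery.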

The Future
Coded TCP has been tested under real-world conditions, but the complexity of its deployment, along with its commercialization, will likely limit its adoption. I’d expect this technology to show up first on smartphones, which are already tightly controlled by carriers, and which already have an extensive back-end infrastructure to translate between cellular signaling and straight IP. The back-end could easily include a Coded TCP proxy, which would encode the TCP stream before cellular transmission, with decoding happening on the smartphone. A less likely scenario is inclusion of a Coded TCP proxy on Wi-Fi APs, especially given that special client software would be required to take advantage of it. This scenario is only likely to happen if the technology becomes standardized, or the company behind it is bought by Microsoft.
