Conventional wisdom has it that the buffer in a router should be sized according to the bandwidth-delay product (the so-called BDP rule). That is, the buffer must be able to store a number of bits equal to the product of the average round-trip time (RTT) and the bottleneck link capacity. This rule has persisted, even though it was devised in the days when links in the Internet had some semblance of “similar” capacity, and when the link capacities themselves were much, much smaller. Over the last decade, however, link capacities have grown exponentially, from 10 Mb/s to 100 Mb/s to 1 Gb/s, 10 Gb/s and beyond! In addition, the variability in link bandwidths across the network is now so large that the notion of a single “bottleneck link capacity” that makes sense in different parts of the network no longer holds. In other words, there is no single right answer for buffer size in a general-purpose communication system.
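To make the arithmetic concrete, here is a minimal sketch of the BDP calculation, taking the rule in its common form (buffer = RTT × bottleneck capacity). The function name and the 100 ms RTT are illustrative assumptions, not values from the interview.

```python
def bdp_bytes(rtt_ms: int, capacity_bps: int) -> int:
    """Buffer size in bytes suggested by the BDP rule: RTT x link capacity.

    rtt_ms       -- assumed average round-trip time, in milliseconds
    capacity_bps -- assumed bottleneck link capacity, in bits per second
    """
    bits = rtt_ms * capacity_bps // 1000  # ms -> s
    return bits // 8                      # bits -> bytes


if __name__ == "__main__":
    # Same (assumed) 100 ms RTT across the link speeds named in the text.
    for label, rate in [("10 Mb/s", 10_000_000),
                        ("100 Mb/s", 100_000_000),
                        ("1 Gb/s", 1_000_000_000),
                        ("10 Gb/s", 10_000_000_000)]:
        print(f"{label:>8}: {bdp_bytes(100, rate):>12,} bytes")
```

Under these assumptions the suggested buffer grows from 125,000 bytes at 10 Mb/s to 125,000,000 bytes at 10 Gb/s, which is why a single static value cannot suit every link.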
Yet, as memory has become cheaper, the temptation has been to just provision very large static buffers in all parts of the network (sized, per the BDP rule, using the maximum possible bandwidth that the hardware might ever operate at), irrespective of what the actual bottleneck link capacity is likely to be. This leads to very large or “fat” buffers from the access network segments all the way to the core network.
These “dark buffers” lurk in the shadows, sitting empty most of the time, but when they fill up they undermine TCP’s congestion avoidance mechanism, which relies on timely packet drops to function correctly. In other words, “fat” buffers disrupt TCP’s systematic packet drop process, leading to low TCP throughput and, consequently, to very poor application performance, particularly for interactive and low-latency applications. They also have implications for application performance in the cloud. This has serious consequences in a world where new and interesting applications are what operators and content providers are banking on to garner new sources of revenue!
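A rough way to see why a full “fat” buffer hurts interactive traffic is to compute how long a packet at the back of the queue must wait for the buffer to drain. The sketch below is illustrative only: the function name is hypothetical, and the buffer size (12.5 MB, roughly what the BDP rule gives for 1 Gb/s at an assumed 100 ms RTT) and link rates are assumptions.

```python
def full_buffer_delay_ms(buffer_bytes: int, link_rate_bps: int) -> int:
    """Time, in milliseconds, to drain a completely full buffer at the
    actual (bottleneck) link rate."""
    return buffer_bytes * 8 * 1000 // link_rate_bps


if __name__ == "__main__":
    buffer_bytes = 12_500_000  # assumed: sized for 1 Gb/s at 100 ms RTT
    for label, rate in [("1 Gb/s", 1_000_000_000),
                        ("100 Mb/s", 100_000_000),
                        ("10 Mb/s", 10_000_000)]:
        delay = full_buffer_delay_ms(buffer_bytes, rate)
        print(f"draining at {label:>8}: {delay:>6} ms of queueing delay")
```

With these numbers, a buffer that adds 100 ms of delay at the speed it was sized for adds 10 seconds when the real bottleneck is 10 Mb/s.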
To delve into some of the details behind bufferbloat, we invited Dr. Jim Gettys (well-known in the software community as Mr. X-Man, developer of the X Window System, who first brought buffer bloat to the attention of the world in the summer of 2010) to talk to us in an episode of our signature series “Conversations with Experts.” In it, he explains his discovery of buffer bloat, what it means for the Internet, and his suggestions on how to mitigate, and ultimately solve, buffer bloat through a complete redesign.
(Note: Since this episode was recorded when we caught up with Jim at an industry conference, there is hotel background noise in the video. However, the actual discussions with Jim are clear enough to be easily understood by the viewer.)
Bufferbloat – How “Fat” Buffers Are Killing Internet Performance As We Know It!: Dr. Jim Gettys, Bell Labs, in Conversation with Vishal Sharma, Principal Technologist, Metanoia-Inc.
In this Conversations with Experts episode, we focus on the problem of buffer bloat, which is the existence of excessively large buffers at different points in modern communication systems, specifically end-user machines, broadband routers, and metro and core switches and routers.
We begin by asking Jim to explain buffer bloat, and its relationship to packet drops. It turns out that large buffers delay packet drops, and so disrupt TCP’s congestion control mechanism, which relies on timely, periodic packet drops to function normally.
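The interaction between buffer size and drops can be illustrated with a toy model. The sketch below is a deliberately simplified, single-flow additive-increase/multiplicative-decrease (AIMD) loop, not real TCP: the window grows by one packet per round trip and halves when the queue overflows. The function name and all parameter values are assumptions chosen for illustration.

```python
def aimd_peak_queue(pipe_packets: int, buffer_packets: int, rounds: int):
    """Toy AIMD model of one flow through one bottleneck.

    pipe_packets   -- packets the link itself can hold per round trip
    buffer_packets -- packets the bottleneck buffer can hold
    rounds         -- number of round trips to simulate
    Returns (peak queue occupancy in packets, number of drop events).
    """
    cwnd = 1
    peak_queue = 0
    drops = 0
    for _ in range(rounds):
        queue = max(0, cwnd - pipe_packets)  # excess packets sit in the buffer
        if queue > buffer_packets:
            drops += 1                       # overflow: the sender sees a loss
            cwnd = max(1, cwnd // 2)         # multiplicative decrease
        else:
            peak_queue = max(peak_queue, queue)
            cwnd += 1                        # additive increase
    return peak_queue, drops


if __name__ == "__main__":
    print("small buffer:", aimd_peak_queue(10, 5, 100))
    print("large buffer:", aimd_peak_queue(10, 100, 100))
```

In this toy model the small buffer produces regular drops and keeps the queue short, while the large buffer produces no drop at all within 100 round trips, so the sender never slows down and the standing queue keeps growing.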
Jim then goes on to explain some of his experimentation with teleconferencing systems that led to the discovery of buffer bloat about a year back, and describes what he observed, and the conclusions he came to.
Thereafter, Jim proposes two solutions to buffer bloat. The first is a mitigation strategy, which involves significantly reducing the sizes of buffers found in many parts of our communication systems. He explains the myth of the bandwidth-delay product, and the problems with statically sizing buffers using that rule of thumb.
Given that it is impossible today to “right size” a buffer for proper TCP operation using any statically sized buffer, the second (more lasting) solution is to have buffers whose depth varies dynamically, achieved using Active Queue Management (AQM), i.e., solutions like Random Early Discard (RED) and its variants. We explore why AQM has hitherto not been actively adopted by the operator community, and discover some interesting facts that have made providers hesitant to deploy it.
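As a rough illustration of the AQM idea, the sketch below follows the textbook outline of RED: track a moving average of the queue length and drop arriving packets with a probability that rises as that average climbs between two thresholds. It is a simplified sketch, not a production implementation; it omits refinements of the original algorithm (such as spacing drops by packet count), and the class name and all default parameter values are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class RedQueue:
    min_th: float = 5.0    # below this average queue length, never drop
    max_th: float = 15.0   # at or above this average, always drop
    max_p: float = 0.1     # drop probability just below max_th
    weight: float = 0.002  # smoothing weight for the moving average
    avg: float = 0.0       # current average queue length (packets)

    def update_average(self, queue_len: int) -> float:
        """Exponentially weighted moving average of the queue length."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.avg

    def drop_probability(self) -> float:
        """Linear ramp from 0 to max_p between the two thresholds."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def should_drop(self, queue_len: int, rng=random.random) -> bool:
        """Decide whether to drop a packet arriving at the given queue length."""
        self.update_average(queue_len)
        return rng() < self.drop_probability()


if __name__ == "__main__":
    red = RedQueue(avg=10.0)
    print("drop probability at avg=10:", red.drop_probability())
```

The need to choose values such as the thresholds and the smoothing weight for each link is one reason a scheme of this kind requires careful configuration.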
After explaining the AQM puzzle, Jim discusses how people can actively contribute to the ongoing efforts to help address buffer bloat, and be a part of the community’s effort to build a good home router.
More details on the Buffer Bloat community’s activities are here, while Jim’s blog is here.
Bio:
Dr. Jim Gettys is currently with Alcatel-Lucent Bell Labs Research, and works on immersive, interactive applications and their requirements. He is best known as one of the two original developers of the X Window System at MIT and, later, for his work on the specification of the HTTP/1.1 protocol, and his contributions to the One-Laptop Per Child (OLPC) initiative.
Jim has had a long and distinguished career at a number of marquee companies and institutions over the last 30 years, including DEC, HP, MIT, Princeton, Harvard, and One-Laptop Per Child.
Prior to his current role at Alcatel-Lucent/Bell Labs, Jim was Software Architect and Vice-President of Software at One-Laptop Per Child, where he played a seminal role in reviewing and overhauling much of the standard Linux software, in order to make it run faster and consume less memory and power.
He previously served on the GNOME Foundation Board of Directors, and has also worked at the World Wide Web Consortium (W3C). Jim is also the editor of the HTTP/1.1 specification in the Internet Engineering Task Force. Jim also helped establish the handhelds.org community, from which the development of Linux on handheld devices can be traced.
Jim won Bob Metcalfe’s 1997 Internet Plumber of the Year award on behalf of the group that worked on HTTP/1.1, and is one of the keepers of the Flame (USENIX's 1999 Lifetime Achievement Award) on behalf of The X Window System Community at Large.
Jim has BS, MS, and Ph.D. degrees in Earth and Planetary Sciences from MIT.