Bufferbloat: The Hidden Bottleneck

Bufferbloat is a form of network performance degradation that causes high latency and jitter. It occurs when network gateways have excessively large buffer queues. Routers use buffer queues to hold packets during network congestion. Once the egress interface can send more traffic, the router transmits the queued packets. However, packets that wait in a buffer queue incur additional latency (queuing delay). As we’ll see, queuing delay can degrade data transfers as well as voice and video calls, which are time sensitive by nature.

Bufferbloat Causes

Bufferbloat happens when the egress interface of a router, such as the Internet connection, is slower than the ingress interface, such as the LAN side. In a home connection, when the LAN generates more traffic than the Internet link can carry, the result is network congestion. In this case, the router starts queuing packets instead of discarding them. Let’s see why routers with excessively large buffer queues confuse the TCP congestion avoidance algorithm, causing bufferbloat and network performance degradation. First, let’s have a look at how TCP works.

The TCP congestion avoidance algorithm

The TCP congestion avoidance algorithm runs on every TCP connection and dynamically manages a sender-side limit called the Congestion Window Size (CWS). The CWS determines how many packets the sender may transmit before receiving an acknowledgement from the receiver. Once the sender receives an acknowledgement, it sends another set of packets.

The CWS, and consequently the resulting TCP throughput, is regulated by packet loss. When a packet is dropped, the network is telling the sender that it cannot sustain the current speed. As a result, the sender reduces the amount of unacknowledged traffic it keeps in flight, which lowers its transmission rate. The following image shows how the Congestion Window Size changes over successive round trips. Notice that packet loss causes a drop in the congestion window size and, consequently, a reduction in TCP throughput.

TCP congestion algorithm
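
The sawtooth pattern described above can be reproduced with a few lines of Python. This is a minimal sketch of a simplified Reno-style model, not the exact behavior of any real TCP stack. The function name, the initial window, the slow-start threshold, and the round trips at which a loss occurs are all illustrative assumptions.

```python
def simulate_cwnd(round_trips, loss_rounds, initial_cwnd=1.0, ssthresh=32.0):
    """Return the congestion window (in segments) at the start of each round trip.

    Simplified model: the window doubles per round trip during slow start,
    grows by one segment per round trip during congestion avoidance,
    and is halved when a loss is detected.
    """
    cwnd = initial_cwnd
    history = []
    for rtt in range(round_trips):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Multiplicative decrease: a drop tells the sender to back off.
            ssthresh = max(cwnd / 2.0, 2.0)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: exponential growth up to the threshold.
            cwnd = min(cwnd * 2.0, ssthresh)
        else:
            # Congestion avoidance: additive increase of one segment.
            cwnd += 1.0
    return history


if __name__ == "__main__":
    # Hypothetical scenario: packet loss during round trips 8 and 15.
    for rtt, cwnd in enumerate(simulate_cwnd(20, loss_rounds={8, 15})):
        print(f"round trip {rtt:2d}: cwnd = {cwnd:5.1f} segments")
```

In this model, each loss cuts the window in half, after which it climbs again by one segment per round trip. The later the loss signal arrives, the larger the window has grown and the bigger the correction.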

Bufferbloat of TCP connections

When a router with large queues holds packets that would otherwise have been discarded, it signals to the sender that the network can sustain a higher throughput than it actually can. As a result, the sender further saturates the router’s queue. This continues until the router eventually has to drop a large amount of traffic. The sender then cuts its transmission rate more sharply than it would have if the queues were smaller, exacerbating the issue. For this reason, gateways with large queues do more harm than good. If your router has this problem, reconfigure it if possible.
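
To get a feel for how much latency a large queue can add, it helps to compute how long a full buffer takes to drain at the egress rate. The sketch below does that arithmetic. The function name and the buffer and link sizes in the example are assumptions chosen for illustration, not measurements of any particular gateway.

```python
def max_queuing_delay_ms(buffer_bytes, link_rate_bps):
    """Return the time, in milliseconds, needed to drain a full buffer.

    buffer_bytes:  size of the queue in bytes
    link_rate_bps: egress link rate in bits per second
    """
    if link_rate_bps <= 0:
        raise ValueError("link_rate_bps must be positive")
    # Convert bytes to bits, then divide by the drain rate.
    return buffer_bytes * 8 * 1000 / link_rate_bps


if __name__ == "__main__":
    uplink_bps = 10_000_000  # hypothetical 10 Mbit/s uplink
    for buffer_bytes in (64_000, 256_000, 1_000_000):
        delay = max_queuing_delay_ms(buffer_bytes, uplink_bps)
        print(f"{buffer_bytes:>9} byte buffer -> up to {delay:.1f} ms of queuing delay")
```

Under these assumptions, a 1 MB buffer in front of a 10 Mbit/s uplink can hold up to 800 ms of traffic, while a 64 KB buffer holds about 51 ms. A packet arriving at the back of a full queue waits that long before it is transmitted.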

If you want to read more about TCP performance, read the white paper on network performance and end-user experience. We also wrote about how congestion avoidance works and how it relates to the Mathis equation.
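
For reference, the Mathis equation estimates an upper bound on the throughput of a loss-based TCP connection as (MSS / RTT) × (C / √p), where p is the packet loss rate and C is a constant commonly taken as √(3/2). The sketch below evaluates it to show how extra round-trip time, such as the queuing delay added by bufferbloat, lowers the achievable throughput. The function name and the sample values are assumptions for illustration.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3.0 / 2.0)):
    """Estimate the upper bound on TCP throughput in bits per second.

    mss_bytes: maximum segment size in bytes
    rtt_s:     round-trip time in seconds
    loss_rate: packet loss probability, 0 < loss_rate <= 1
    c:         model constant, commonly sqrt(3/2)
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Hypothetical connection: 1460-byte MSS and 0.01% packet loss.
    for rtt_ms in (50, 100, 500):
        bps = mathis_throughput_bps(1460, rtt_ms / 1000.0, 0.0001)
        print(f"RTT {rtt_ms:3d} ms -> about {bps / 1e6:.1f} Mbit/s")
```

Because throughput is inversely proportional to the round-trip time in this model, doubling the RTT halves the estimate.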

How Can I Detect Bufferbloat?

There are many tools available to test whether a device (e.g. your Internet gateway) is causing bufferbloat. One simple test requires running two ping tests to an Internet website:

  1. The first test runs when the network is not congested; that’s the unloaded test.
  2. The second test runs when the network is congested (e.g. while a large download is in progress); that’s the loaded test.

If there’s a big difference between the latency of the two tests, then the Internet gateway is probably susceptible to bufferbloat.
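
The comparison between the two ping runs can be scripted. The following sketch assumes you have saved the text output of each ping run and that each reply line contains a field such as `time=12.3 ms`, which is how common ping implementations report round-trip time. The function names are hypothetical, and the sample output uses a documentation IP address and made-up timings.

```python
import re
import statistics

# Matches the round-trip time in reply lines such as "time=12.3 ms" or "time<1ms".
_RTT_PATTERN = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_ping_rtts(ping_output):
    """Extract the round-trip times, in milliseconds, from ping output text."""
    return [float(value) for value in _RTT_PATTERN.findall(ping_output)]


def latency_increase_ms(unloaded_output, loaded_output):
    """Return the median loaded RTT minus the median unloaded RTT."""
    unloaded = parse_ping_rtts(unloaded_output)
    loaded = parse_ping_rtts(loaded_output)
    if not unloaded or not loaded:
        raise ValueError("no RTT samples found in ping output")
    return statistics.median(loaded) - statistics.median(unloaded)


if __name__ == "__main__":
    # Made-up sample output for illustration only.
    unloaded_run = (
        "64 bytes from 192.0.2.1: icmp_seq=1 ttl=57 time=12.0 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=2 ttl=57 time=14.0 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=3 ttl=57 time=13.0 ms\n"
    )
    loaded_run = (
        "64 bytes from 192.0.2.1: icmp_seq=1 ttl=57 time=210.0 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=2 ttl=57 time=250.0 ms\n"
        "64 bytes from 192.0.2.1: icmp_seq=3 ttl=57 time=230.0 ms\n"
    )
    increase = latency_increase_ms(unloaded_run, loaded_run)
    print(f"Latency under load increased by {increase:.1f} ms")
```

The median is used rather than the mean so that a single outlier reply does not skew the comparison. What counts as a big difference depends on the connection, so the sketch reports the increase and leaves the threshold to you.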

One publicly available resource that can detect bufferbloat is the fast.com speed test by Netflix. Like standard speed tests, it generates data traffic to measure Internet speed, and it also evaluates bufferbloat. Its detailed report includes both unloaded and loaded latency. The following screenshot shows an Internet gateway that is vulnerable to bufferbloat: the difference between unloaded and loaded latency is approximately 900 milliseconds.

Connection with bufferbloat

In this other screenshot, you can see a fast.com test on a home connection where bufferbloat is not present. The difference between the loaded and unloaded latency is approximately 46 milliseconds.

Connection without bufferbloat

Solution to Bufferbloat

Unfortunately, you can’t reconfigure the majority of home gateways to avoid using long queues. In that case, the only option is to purchase a gateway that manages its queues properly. Many home gateways implement Smart Queue Management, which provides protection from bufferbloat. You can search online for different vendors and device models. On enterprise-grade networking hardware, queue management falls under the Quality of Service section of a router’s or switch’s configuration. Refer to the vendor documentation for more details.

The impact of bufferbloat on work-from-home users

It’s important to recognize and address bufferbloat, especially for work-from-home users. Left unaddressed, it will hurt workers’ productivity and job performance and drive up support costs, especially for large organizations with many remote users. NetBeez provides an easy way to remotely diagnose performance degradation issues via Windows and Mac software clients. Check out the remote worker network monitoring solution. If you want to learn more, request a demo.
