A few months ago, I downgraded my internet, going from a 900Mbps plan to a 200Mbps one. Now, I find that websites can sometimes take a painfully long time to load, that HD YouTube videos have to stop and buffer when I jump around in them, and that video calls can be annoyingly choppy.
In other words, pretty much nothing has changed. I had those exact same problems even when I had near-gigabit download service, and I’m probably not alone. I’m sure many of you have also had the experience of cursing a slow-loading website and growing even more confused when a “speed test” says that your internet should be able to play dozens of 4K Netflix streams at once. So what gives?
Like any issue, there are many factors at play. But a major one is latency, or the amount of time it takes for your device to send data to a server and get data back — it doesn’t matter how much bandwidth you have if your packets (the little bundles of data that travel over the network) are getting stuck somewhere. But while people have some idea of how latency works thanks to popular speed tests that include a “ping” metric, common methods of measuring it haven’t always provided a complete picture.
The good news is that there’s a plan to almost eliminate latency, and big companies like Apple, Google, Comcast, Charter, Nvidia, Valve, Nokia, Ericsson, T-Mobile parent company Deutsche Telekom, and more have shown an interest. It’s a new internet standard called L4S that was finalized and published in January, and it could put a serious dent in the amount of time we spend waiting around for webpages or streams to load and cut down on glitches in video calls. It could also help change the way we think about internet speed and help developers create applications that just aren’t possible with the current realities of the internet.
Before we talk about L4S, though, we should lay some groundwork.
There are a lot of potential reasons. The internet is not a series of tubes but a vast network of interconnected routers, switches, fibers, and more that connect your device to a server (or, often, multiple servers) somewhere. If there’s a bottleneck at any point in that path, your surfing experience could suffer. And there are a lot of potential bottlenecks: the server hosting the video you want to watch could have limited upload capacity, a vital piece of the internet’s infrastructure could be down, forcing the data to travel further to reach you, your computer could be struggling to process the data, and so on.
The real kicker is that the lowest-capacity link in the chain determines the limits of what’s possible. You could be connected to the fastest server imaginable via an 8Gbps connection, and if your router can only process 10Mbps of data at a time, that’s what you’ll be limited to. Oh, and also, every delay adds up, so if your computer adds 20 milliseconds of delay, and your router adds 50 milliseconds of delay, you end up waiting at least 70 milliseconds for something to happen. (These are completely arbitrary examples, but you get the point.)
In recent years, network engineers and researchers have started raising concerns about how the traffic management systems that are meant to make sure network equipment doesn’t get overwhelmed may actually make things slower. Part of the problem is what’s called “buffer bloat.”
Right? But to understand what buffer bloat really is, we first have to understand what buffers are. As we’ve touched on already, networking is a bit of a dance; each part of the network (switches, routers, modems, etc.) has its own limit on how much data it can handle. But because the devices on the network, and how much traffic they have to deal with, are constantly changing, none of our phones or computers really knows how much data to send at a time.
To figure that out, they’ll generally start sending data at one rate. If everything goes well, they’ll increase it again and again until something goes wrong. Traditionally, that thing going wrong is packets being dropped; a router somewhere receives data faster than it can send it out and says, “Oh no, I can’t handle this right now,” and just gets rid of it. Very relatable.
While dropped packets don’t generally result in data loss (we’ve made sure computers are smart enough to just send those packets again, if necessary), it’s still definitely not ideal. So the sender gets the message that packets have been dropped and temporarily scales back its data rate before immediately ramping up again, just in case things have changed within the past few milliseconds.
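This ramp-up-until-drop, back-off, ramp-up-again cycle can be sketched as a toy additive-increase/multiplicative-decrease (AIMD) loop, the classic shape of TCP congestion control. The link capacity and step sizes here are arbitrary illustrations, not real-world values:

```python
def aimd(link_capacity_mbps, rounds):
    """Toy congestion control: add bandwidth each round until the link
    is overwhelmed (a simulated drop), then cut the sending rate in half."""
    rate = 1.0  # start slow
    history = []
    for _ in range(rounds):
        if rate > link_capacity_mbps:  # a router couldn't keep up: drop
            rate = rate / 2            # back off hard...
        else:
            rate += 1.0                # ...then probe upward again
        history.append(rate)
    return history

# The rate climbs past a 10Mbps link, halves, and starts climbing again:
print(aimd(link_capacity_mbps=10, rounds=12))
```

The sawtooth this produces is why throughput keeps oscillating under classic congestion control: the sender only learns it went too far after packets are already being dropped.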
That’s because sometimes the data overload that causes packets to drop is just temporary; maybe someone on your network is trying to send a picture on Discord, and if your router could just hold on until that goes through, you could continue your video call with no issues. That’s also one of the reasons why lots of networking equipment has buffers built in. If a device gets too many packets at once, it can temporarily store them, putting them in a queue to get sent out. This lets systems handle massive amounts of data and smooths out bursts of traffic that could have otherwise caused problems.
It is! But the problem that some people are worried about is that buffers have gotten really big to ensure that things run smoothly. That means packets may have to wait in line for a (sometimes literal) second before continuing on their journey. For some types of traffic, that’s no big deal; YouTube and Netflix have buffers on your device as well, so you don’t need the next chunk of video right this instant. But if you’re on a video call or using a game streaming service like GeForce Now, the latency introduced by a buffer (or several buffers in the chain) could actually be a real problem.
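The latency cost of a big buffer is just arithmetic: a packet arriving behind a backlog has to wait for the link to drain everything ahead of it. Here’s a back-of-the-envelope sketch, where the buffer size and link speed are made-up examples:

```python
def queuing_delay_ms(backlog_bytes, link_mbps):
    """How long a newly arrived packet waits behind a backlog,
    given a link that drains link_mbps megabits per second."""
    backlog_bits = backlog_bytes * 8
    drain_bits_per_ms = link_mbps * 1_000_000 / 1_000
    return backlog_bits / drain_bits_per_ms

# A 2.5MB buffer sitting full in front of a 20Mbps link:
print(queuing_delay_ms(2_500_000, 20))  # 1000.0 ms -- a literal second
```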
There are currently some ways of dealing with this, and there have been quite a few attempts in the past to write algorithms that control congestion with an eye toward both throughput (or how much data is being transferred) and lower latency. But a lot of them don’t exactly play nice with the current widely used congestion control systems, which could mean that rolling them out for some parts of the internet would hurt other parts.
This is the trick of internet service provider, or ISP, marketing. When users say they want “faster” internet, what they mean is that they want there to be less time from when they ask for something to when they get it. However, internet providers sell connections by capacity: how much data can you suck back at once?
Bit versus byte
Talking about the amount of time it takes to download files brings up another problem with how internet services are marketed. Usually, we think of file sizes in terms of bytes — a song is 10 megabytes, and a movie is 25 gigabytes. But ISPs rate connections in bits.
If you miss the distinction, you’d be forgiven for thinking that a service plan that gives you a gigabit per second would let you download a movie in 25 seconds. However, bits are eight times smaller than bytes — one gigabit (Gb) is equivalent to 125 megabytes (MB), or 0.125 gigabytes (GB). So that movie is going to take over three minutes to download instead, assuming perfect conditions.
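The arithmetic is easy to sketch; the movie size and plan speed below are the same figures used above:

```python
def download_seconds(size_gigabytes, plan_gigabits_per_sec):
    """Ideal transfer time: files are measured in bytes, plans in bits."""
    size_gigabits = size_gigabytes * 8  # 1 byte = 8 bits
    return size_gigabits / plan_gigabits_per_sec

# A 25GB movie on a 1Gbps plan:
print(download_seconds(25, 1))  # 200.0 seconds, i.e. over three minutes
```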
(By the way, give yourself a prize if you realized that a lowercase b versus an uppercase one is how you distinguish between the two units in their abbreviated forms.)
There was a time when adding capacity really did reduce the amount of time you spent waiting around. If you’re downloading a nine-megabyte MP3 file from a totally legal website, it’s going to take a long time on 56 kilobit per second dial-up — around 21 and a half minutes. Upgrade to a blazing-fast 10Mbps connection, and you should have the song in less than 10 seconds.
But the time it takes to transfer data gets less and less noticeable as the throughput goes up; you wouldn’t notice the difference between a song download that takes 0.72 seconds on 100Mbps and one that takes 0.288 seconds on 250Mbps, even though it’s technically less than half the time. (Also, in reality, it takes longer than that because the process of downloading a song doesn’t just involve transferring the data). The numbers matter a bit more when you’re downloading larger files, but you still hit diminishing returns at some point; the difference between streaming a 4K movie 30 times faster than you can watch it versus five times faster than you can watch it isn’t particularly important.
The disconnect between our internet “speed” (usually what people are referring to is throughput — the question is less about how fast the delivery truck is going and more about how much it can carry on the trip) and how we experience those high-bandwidth connections becomes apparent when simple webpages are slow to load; in theory, we should be able to load text, images, and JavaScript at lightning speeds. However, loading a webpage means several rounds of back-and-forth communication between our devices and servers, so latency issues get multiplied. Packets getting stuck for 25 milliseconds can really add up when they have to make the journey 10 or 20 times. The amount of data we can move at one time through our internet connection isn’t the bottleneck — it’s the time our packets spend shuffling between devices. So, adding more capacity isn’t going to help.
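To see why round trips dominate, consider a lower bound on page load time that counts only latency and ignores transfer time entirely (the per-trip delay and trip count are the illustrative figures from above):

```python
def page_load_floor_ms(rtt_ms, round_trips):
    """Minimum load time from latency alone, ignoring all transfer time."""
    return rtt_ms * round_trips

# 25 ms of delay per round trip, 20 round trips to assemble the page:
print(page_load_floor_ms(25, 20))  # 500 ms spent just waiting
```

No amount of extra bandwidth shrinks that floor; only shorter or fewer round trips do.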
L4S stands for Low Latency, Low Loss, Scalable Throughput, and its goal is to make sure your packets spend as little time needlessly waiting in line as possible by reducing the need for queuing. To do this, it works on making the latency feedback loop shorter; when congestion starts happening, L4S means your devices find out about it almost immediately and can start doing something to fix the problem. Usually, that means backing off slightly on how much data they’re sending.
As we covered before, our devices are constantly speeding up, then slowing down, and repeating that cycle because the amount of data that links in the network have to deal with is constantly changing. But packets dropping isn’t a great signal, especially when buffers are part of the equation — your device won’t realize it’s sending too much data until it’s sending way too much data, meaning it has to clamp down hard.
(Image: Apple)
L4S, however, gets rid of that lag between the problem beginning and each device in the chain finding out about it. That makes it easier to maintain a good amount of data throughput without adding latency that increases the amount of time it takes for data to be transferred.
No, it’s not magic, though it’s technically complex enough that I kind of wish it were, because then, I could just hand-wave it away. If you really want to get into it (and you know a lot about networking), you can read the specification paper on the Internet Engineering Task Force’s website.
L4S lets the packets tell your device how well their journey went
For everyone else, I’ll try to boil it down as much as I can without glossing over too much. The L4S standard adds an indicator to packets, which says whether they experienced congestion on their journey from one device to another. If they sail right on through, there’s no problem, and nothing happens. But if they have to wait in a queue for more than a specified amount of time, they get marked as having experienced congestion. That way, the devices can start making adjustments immediately to keep the congestion from getting worse and to potentially eliminate it altogether. That keeps the data flowing as fast as it possibly can and gets rid of the disruptions and mitigations that can add latency with other systems.
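Here’s a loose sketch of the idea, in the spirit of the DCTCP-style signaling that L4S builds on: packets that sat in a queue past a threshold get marked, and the sender scales back in proportion to the fraction marked. The threshold, backoff factor, and delays below are illustrative, not values from the spec:

```python
def adjust_rate(rate_mbps, queue_delays_ms, mark_threshold_ms=1.0):
    """Mark packets that waited longer than the threshold, then scale
    the sending rate down in proportion to the fraction marked."""
    marked = sum(1 for d in queue_delays_ms if d > mark_threshold_ms)
    fraction = marked / len(queue_delays_ms)
    return rate_mbps * (1 - fraction / 2)  # gentle, proportional backoff

# 4 of these 10 packets sat in a queue for more than 1 ms:
delays = [0.2, 0.5, 1.5, 3.0, 0.1, 2.2, 0.4, 0.9, 1.1, 0.3]
print(adjust_rate(100, delays))  # 80.0 -- a nudge, not a halving
```

Because the signal arrives as soon as queues start to build, the sender can make many small corrections instead of one drastic cut after packets are already being dropped.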
In terms of reducing latency on the internet, L4S or something like it is “a pretty necessary thing,” according to Greg White, a technologist at research and development firm CableLabs who helped work on the standard. “This buffering delay typically has been hundreds of milliseconds to even thousands of milliseconds in some cases. Some of the earlier fixes to buffer bloat brought that down into the tens of milliseconds, but L4S brings that down to single-digit milliseconds.”
That could obviously help make the everyday experience of using the internet nicer. “Web browsing is more limited by the roundtrip time than the capacity of the connection these days for most people. Beyond about six to 10 megabits per second, latency has a bigger role in determining how quickly a web page load feels.”
However, ultra-low latency could be vital for potential future use cases. We’ve touched on game streaming, which can turn into a mess if there’s too much latency, but imagine what would happen if you were trying to stream a VR game. In that case, too much lag may go beyond just making a game less fun to play and could even make you throw up.
Well, it can’t bend the laws of physics. Data can only travel so fast, and sometimes it has to go a long way. As an example, if I were trying to do a video call with someone in Perth, Australia, there would be, at the very least, 51ms of latency each way — that’s how much time light takes to travel in a straight line from where I live to there, assuming it’s going through a vacuum. Realistically, it’ll take a bit longer. Light travels a bit slower through fiber optic cables, and the data would be taking a few extra hops along the path, as there isn’t actually a direct line from my house to Perth, as far as I’m aware.
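That physical floor is easy to compute. The distance below is an assumed straight-line figure of roughly 15,300 km (which is what would produce the ~51ms quoted above), and the 2/3 factor for fiber is a common rule of thumb:

```python
def one_way_ms(distance_km, medium_speed_fraction=1.0):
    """Propagation delay for light traveling distance_km.
    In fiber, light moves at roughly 2/3 of its vacuum speed."""
    c_km_per_ms = 299_792.458 / 1_000  # ~300 km per millisecond in vacuum
    return distance_km / (c_km_per_ms * medium_speed_fraction)

# An assumed ~15,300 km straight line to Perth:
print(round(one_way_ms(15_300), 1))         # 51.0 ms in a vacuum
print(round(one_way_ms(15_300, 2 / 3), 1))  # 76.6 ms in fiber
```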
This is why most services that aren’t dealing with real-time data will try to cache it closer to where you live. If you’re watching something popular on Netflix or YouTube, chances are you’re getting that data from a server relatively close to where you live, even if that’s not anywhere close to those companies’ main data centers.
There’s nothing L4S can do about that physical lag. However, it could keep much additional lag from being added on top of that.
This is the big question with any networking tech, especially after IPv6, an upgrade to the way computers find each other on the internet, has famously taken more than a decade to deploy. So here’s the bad news: for the most part, L4S isn’t in use in the wild yet.
However, there are some big names involved with developing it. When we spoke to White from CableLabs, he said there were already around 20 cable modems that support it today and that several ISPs like Comcast, Charter, and Virgin Media have participated in events meant to test how prerelease hardware and software work with L4S. Companies like Nokia, Vodafone, and Google have also attended, so there definitely seems to be some interest.
Apple put an even bigger spotlight on L4S at WWDC 2023 after including beta support for it in iOS 16 and macOS Ventura. This video explains that when developers use some of the existing frameworks, L4S support is automatically built in without changing any code. Apple is progressively rolling out L4S to a random set of users with iOS 17 and macOS Sonoma, while developers can turn it on for testing.
(Image: Apple)
At around the same time as WWDC, Comcast announced the industry’s first L4S field trials in collaboration with Apple, Nvidia, and Valve. That way, content providers can mark their traffic (like Nvidia’s GeForce Now game streaming), and customers in the trial markets with compatible hardware like the Xfinity 10G Gateway XB7 / XB8, Arris S33, or Netgear CM1000v2 gateway can experience it right now.
According to Jason Livingood, Comcast’s vice president of technology policy, product, and standards (and the person whose tweets put L4S on our radar in the first place), “Low Latency DOCSIS (LLD) is a key component of the Xfinity 10G Network” that incorporates L4S, and the company has learned a lot from the trials that it can use to implement tweaks next year as it prepares for an eventual launch.
To use L4S, you need an OS, router, and server that all support it
The other factor helping L4S is that it’s broadly compatible with the congestion control systems in use today. Traffic using it and older protocols can coexist without making the experience worse for each other, and since it’s not an all-or-nothing proposition, it can be rolled out bit by bit. That’s much more likely to happen than a fix that would require everyone to make a major change all at the same time.
Still, there’s a lot of work to be done before your next Zoom call can be almost latency-free. Not every hop in the network has to support L4S for it to make a difference, but the hops that are usually the bottlenecks do. (White says that, in the US, this usually means your Wi-Fi router or the links in your “access network,” aka the equipment you use to connect to your ISP and that your ISP uses to connect to everyone else.) The other end matters, too; the servers you’re connecting to will also have to support it.
For the most part, individual apps shouldn’t have to change much to support it, especially if they hand off the networking minutiae to your device’s operating system. (Though that assumes your OS supports L4S, too, which isn’t yet true for everyone.) Companies that write their own networking code to squeeze out maximum performance, however, would likely have to rewrite it to support L4S, though given the gains that are possible, it’d likely be worth doing.
Of course, we’ve seen other promising tech that doesn’t end up coming to fruition, and it can be tough to overcome the chicken-and-egg scenario that can exist early in the development lifecycle. Why would network operators bother putting in the work to support L4S when no internet traffic is using it? And if no network operators support it, why would the apps and services generating that traffic bother to implement it?
That’s a great question. The biggest indicator will be how much latency you’re already experiencing in everyday life. As I mentioned before, ping is sometimes used to measure latency, but just finding your average ping won’t necessarily tell you the whole story. What really matters is what your ping is when your network is taxed and what it spikes to.
Thankfully, some speed test apps are starting to show this data. In May 2022, Ookla added a more realistic overview of latency to Speedtest, one of the most popular tools for seeing how fast your internet is. To see it, run a test, tap “detailed result,” and look at the “responsiveness” section. When I did one, it told me my ping when pretty much nothing else was going on was 17 milliseconds, which seems pretty good. But during the download test, when I was actually using my connection, it spiked as high as 855 milliseconds — that’s almost an entire second, which would feel like an eternity if I were, say, waiting for a webpage to load, especially if it gets multiplied several times during the communication’s round trips.
(I invite anyone who’s used dial-up to tell me how soft I am and to reminisce about the days when every website took 10 seconds to load, uphill in the snow both ways.)
If you only ever do one thing on the internet at a time and use sites that barely anyone else uses, then maybe L4S won’t do much for you if and when it finally arrives. But that’s not a realistic scenario. If we can get the tech onto our increasingly busy home networks that we use to visit the same sites as everyone else, there’s a possibility it could be a quiet revolution in the user experience of the web. And once most people have it, people can start developing apps that couldn’t exist without ultra-low latency.