The Runtime · Apr 23, 2026 · 19 min read

Six ways a webpage keeps a connection — polling to WebTransport

Every real-time feature you ship picks one of six transports — short polling, long polling, SSE, WebSocket, fetch streaming, or WebTransport. They differ on direction, latency, binary support, reconnection, and server cost. Here's each one at the protocol level, with the exact bytes on the wire and a decision tree for picking.

How one transport, WebSocket, looks on the axes that matter:

  • direction: full-duplex
  • protocol: RFC 6455 over TCP (HTTP upgrade)
  • latency: ~0 in both directions
  • binary: yes (text or binary frames)
  • reconnect: manual, no built-in retry
  • server: dedicated server or a framework (Socket.IO, ws)
  • use when: chat, collaborative editing, multiplayer games, trading UIs.

Every real-time feature you build — live notifications, a chat, a collaborative editor, an LLM token stream, a live scoreboard, a multiplayer game, stock tickers — picks one of six transports. They all solve the same problem ("the server has new data, how does the browser hear about it?"), but they solve it with radically different protocols, costs, and failure modes.

Most teams pick by habit. "Socket.IO is what we use". "We just hit the endpoint every 10 seconds". That works until it doesn't. The polling burns battery on mobile. The WebSocket drops on cellular handoff. The persistent connection doesn't make it through a corporate proxy. The fetch-stream stalls behind a buffering reverse proxy.

This post walks each transport at the protocol level. Exact bytes on the wire. Exact events and APIs. Exact failure modes. Then the decision tree that tells you which one to actually use.

tl;dr

  • Short polling: setInterval(() => fetch(url), 5000). Simple, stateless, wasteful. Default for non-critical updates.
  • Long polling: server holds the request open until there's an event. Mostly obsolete.
  • Server-Sent Events: new EventSource(url). One-way server-push over HTTP with automatic reconnect, resumable via Last-Event-ID. Text only.
  • WebSocket: full-duplex over a single TCP connection via RFC 6455. Upgrade handshake flips an HTTP/1.1 GET into a frame-based socket. Manual reconnect.
  • fetch + ReadableStream: one-shot streaming HTTP response; what LLM APIs use.
  • WebTransport: HTTP/3 / QUIC; reliable streams plus unreliable datagrams. Games and media; Baseline 2026.

Pick based on direction, latency, and binary-vs-text. The decision tree at the bottom of this post is the short answer.

Why any of this is complicated

HTTP is request / response. The browser asks, the server answers. The TCP connection may or may not stay open (HTTP/1.1 keep-alive, HTTP/2 multiplexing), but the semantic unit is a round trip. The server doesn't speak unless spoken to.

That's wrong for anything that happens at the server's rhythm — a new message arrived, a price changed, another user moved the cursor. The server knows something the browser doesn't, and HTTP gives it no way to say so.

Every transport in this post is a different way of hacking around that limitation. Some keep the request open. Some upgrade the connection to a different protocol. Some use a different protocol entirely. The trade-offs aren't stylistic. They're about what your server can support, what your network path allows, and what latency your users will tolerate.

The timeline, visualised

[Timeline figure: server events plotted against polls every 5s, over a 0–30s window. Wasted polls in this window: 3 of 6, where the browser asked and the server had nothing.]

This is the whole taxonomy in one picture. Short polling wakes up on a fixed cadence; most wakes find nothing. Long polling holds the connection open until there's an event to deliver. SSE and WebSocket keep one persistent connection open and fire events the instant they happen.

The cost model differs. Short polling is stateless — the server handles any number of clients, because each request is independent. SSE and WebSocket require the server to hold an open connection per client. On a 1M-user service, that's a different architecture.

1. Short polling

The minimum-viable version:

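let lastId = 0; // id of the newest message seen so far

setInterval(async () => {
  const res = await fetch("/api/messages?since=" + lastId);
  if (!res.ok) return; // skip this tick; the next poll retries
  const messages = await res.json();
  for (const m of messages) handleMessage(m);
  if (messages.length) lastId = messages.at(-1).id;
}, 5000);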

The mechanics are exactly what you'd expect — a normal HTTP request every 5 seconds.

Latency: average half the polling interval. With a 5s interval, events are delivered 2.5s after they happen, on average.

Cost on the server: every client makes 12 requests per minute, most with empty responses. For 1000 concurrent users, that's 12k requests per minute — trivial for any HTTP server, actually.
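
The latency and request-rate figures above follow from two small formulas. A minimal sketch, assuming events land uniformly at random inside a polling interval; the function names are illustrative and not from any library:

```typescript
// Average delay between an event happening and the next poll picking it up,
// assuming events land uniformly at random inside the polling interval.
function averagePollLatencyMs(intervalMs: number): number {
  return intervalMs / 2;
}

// Total requests per minute the server sees from `clients` pollers.
function requestsPerMinute(clients: number, intervalMs: number): number {
  return clients * (60_000 / intervalMs);
}

console.log(averagePollLatencyMs(5000)); // 2500
console.log(requestsPerMinute(1000, 5000)); // 12000
```

12,000 requests per minute works out to 200 per second, which is why the server-side cost stays modest at this scale.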

Cost on the client: each poll wakes the CPU, runs JS, and repeats DNS / TLS / TCP setup if the connection went idle. On mobile, this drains battery noticeably. Chrome also throttles timers in background tabs: setInterval fires at most once per second when the tab is hidden, and can slow to once per minute after the tab has been hidden for several minutes.

Proxy-friendliness: perfect. It's just HTTP. Every proxy, firewall, VPN, and corporate network supports it.

When to use: when data doesn't need to be real-time. Notification-badge counts. "Has someone else logged in?" checks. Background sync. If you're picking a transport for the first time and don't have a requirement more specific than "updates sometimes", short polling is almost always the right default.

When not to use: when you need sub-second latency, or when there are thousands of concurrent users all polling at once (request storms at every 5s cadence).

2. Long polling

A clever trick that tries to combine HTTP's simplicity with push-style latency.

The client sends a request. The server doesn't respond immediately — it holds the connection open until either an event arrives or a timeout (usually 30–60s) expires. When the server responds, the client immediately re-issues the request, holding open the next "slot".

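async function longPoll() {
  while (!stopped) {
    try {
      const res = await fetch("/api/events?wait=true&last=" + lastId);
      const events = await res.json();
      for (const e of events) handleEvent(e);
      if (events.length) lastId = events.at(-1).id;
      // loop: immediately re-request
    } catch {
      // network error: pause briefly so a dead server is not hammered in a tight loop
      await new Promise((r) => setTimeout(r, 1000));
    }
  }
}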

Latency: ~0 when an event fires. The server resolves the held request immediately.

Cost on the server: one held connection per client. This is where long polling becomes problematic. A traditional thread-per-request server (Rails, old Python, old PHP) runs out of threads at a few hundred clients. Event-loop servers (Node, Go, async Python) can handle tens of thousands of held connections, but the RAM per connection still adds up.

Why it's mostly obsolete: SSE does the same thing with a nicer API and built-in reconnection, without requiring a re-request on every event. The only remaining use case is "the client's network blocks SSE and WebSocket". Some corporate proxies and older CDNs do this. For everyone else, skip it.

3. Server-Sent Events — the one every team should consider first

SSE is a tiny, beautiful, under-used protocol. It's one-way server push over a long-lived HTTP response.

const source = new EventSource("/api/events");
 
source.addEventListener("message", (e) => {
  const data = JSON.parse(e.data);
  handleEvent(data);
});
 
source.addEventListener("price", (e) => {
  // typed event
});
 
source.addEventListener("error", () => {
  // automatic reconnection happens — no action needed
});

The browser opens a long-lived HTTP request with Accept: text/event-stream. The server keeps the connection open and writes events as plain text. Here's the smallest complete example — one message:

data: hello

Three rules, and that's the whole format:

  1. Every line is fieldname: value. Only four field names mean anything: data, event, id, retry.
  2. A blank line dispatches the buffered event. Whatever fields came before the blank line get bundled into one MessageEvent fired on the client's EventSource.
  3. A line starting with : is a comment. The client ignores it. Servers emit them as keep-alives through buffering proxies that would otherwise close an idle connection.

One event, first as the exact bytes on the wire (text/event-stream), then as what the browser's EventSource dispatches. Each line ends with LF (\n); a blank line dispatches the event.

Wire bytes:

event: price
data: {"symbol":"BTC","usd":67413}

Dispatched on EventSource:

new MessageEvent("price", {
  data: '{"symbol":"BTC","usd":67413}',
  lastEventId: "",
})

The authoritative parse rules, straight from the spec:

For each line in the event stream: "data" appends the field value to the data buffer followed by a U+000A LINE FEED. "event" sets the event type buffer. "id" sets the last event ID buffer (unless the value contains U+0000 NULL). "retry" sets the event stream's reconnection time, if the value consists only of ASCII digits.
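
To make the parse rules concrete, here is a minimal sketch of a parser for a complete text/event-stream string. It is not the browser's implementation: it assumes the whole stream is available as one string, it ignores the retry field, and the names ParsedEvent and parseEventStream are made up for illustration.

```typescript
interface ParsedEvent {
  type: string; // "message" unless an "event:" field named it
  data: string;
  lastEventId: string;
}

function parseEventStream(text: string): ParsedEvent[] {
  const parsedEvents: ParsedEvent[] = [];
  let data = "";
  let eventType = "";
  let lastEventId = ""; // persists across events until a new "id:" arrives

  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") {
      // Blank line: dispatch the buffered event, unless no data was buffered.
      if (data !== "") {
        parsedEvents.push({
          type: eventType || "message",
          data: data.slice(0, -1), // drop the LF appended by the last data line
          lastEventId,
        });
      }
      data = "";
      eventType = "";
      continue;
    }
    if (line.startsWith(":")) continue; // comment, e.g. a keep-alive

    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space

    if (field === "data") data += value + "\n";
    else if (field === "event") eventType = value;
    else if (field === "id" && !value.includes("\0")) lastEventId = value;
    // "retry" and unknown fields are ignored in this sketch
  }
  return parsedEvents; // an event not terminated by a blank line is discarded
}

const [firstEvent] = parseEventStream("data: hello\n\n");
console.log(firstEvent.type, firstEvent.data); // message hello
```

Multiple data lines in one event are joined with a newline, which is why the buffer appends LF after each value and trims the last one on dispatch.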

The killer SSE feature: automatic reconnection with resumption

When the connection drops, the browser automatically reconnects to the same URL. It sends a Last-Event-ID header containing the last id: field it saw. The server reads that header and replays events from where the client left off.

GET /api/events HTTP/1.1
Last-Event-ID: 42
Accept: text/event-stream

No client-side reconnection code. No "did I miss messages while I was offline?" state machine. The browser handles it and the server sees the resumption point in a standard header.

What SSE is NOT good for

  • Binary data. SSE is UTF-8 text only. Base64-encoding is possible but wasteful.
  • Client → server. SSE is one-way. If you also need to send, do it via a normal fetch POST on a different endpoint.
  • Many tabs over HTTP/1.1. Browsers cap HTTP/1.1 at 6 connections per host. If a user opens six tabs that each hold an SSE connection, the seventh tab's connection queues behind the others. HTTP/2 multiplexing fixes this: serve SSE behind HTTP/2 or HTTP/3.

Server-side SSE

A minimal Node/Express handler:

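// Shape of an application event. replayFrom() and subscribe() are app-defined helpers.
type AppEvent = { id: string; type: string; payload: unknown };

app.get("/api/events", (req, res) => {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });

  // Replay anything the client missed while it was disconnected.
  const lastId = req.headers["last-event-id"];
  for (const e of replayFrom(lastId)) send(e);

  // Then stream live events until the client goes away.
  const sub = subscribe((e) => send(e));
  req.on("close", () => sub.unsubscribe());

  function send(e: AppEvent) {
    res.write(`id: ${e.id}\n`);
    res.write(`event: ${e.type}\n`);
    res.write(`data: ${JSON.stringify(e.payload)}\n\n`);
  }
});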

The only trap: most reverse proxies and CDNs buffer responses by default. You must disable buffering for SSE endpoints — set X-Accel-Buffering: no for nginx, configure Cloudflare to pass through, disable Vercel Edge Buffering for the route. Without this, events are buffered for minutes and the client sees a silent stream.

4. WebSocket — the full-duplex default

When you need the client to send too, WebSocket is the industry default.

const ws = new WebSocket("wss://example.com/chat");
 
ws.addEventListener("open", () => ws.send("hello"));
ws.addEventListener("message", (e) => handle(e.data));
ws.addEventListener("close", (e) => reconnect(e.code));
ws.addEventListener("error", (e) => log(e));
 
ws.send(JSON.stringify({ type: "typing", room: 42 }));

The handshake

WebSocket is defined by RFC 6455. The connection starts as HTTP/1.1 and gets "upgraded" to a different protocol mid-connection.

The client's opening request, an HTTP/1.1 GET with an Upgrade header:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Origin: https://example.com
Sec-WebSocket-Protocol: chat, json-v1

  • Sec-WebSocket-Key is a random 16-byte nonce, base64-encoded. Browser generates it.
  • Sec-WebSocket-Version: 13 is the only version the browser supports (RFC 6455).
  • Sec-WebSocket-Protocol negotiates sub-protocols (e.g. chat). Optional.

From the RFC:

The WebSocket Protocol enables two-way communication between a client running untrusted code in a controlled environment to a remote host that has opted-in to communications from that code. [...] The opening handshake is intended to be compatible with HTTP-based server-side software and intermediaries, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server.

The magic bit: the server proves it speaks WebSocket by computing base64(SHA1(client_key + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11")) and returning it in Sec-WebSocket-Accept. The magic GUID is a spec-defined constant. It exists purely to force the server to do the computation — a random HTTP server that doesn't know about WebSocket can't accidentally return the right value.
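
The computation is short enough to show in full. A minimal sketch using Node's built-in crypto module; the function name computeAcceptKey is illustrative, and the sample key is the one from the request above, which is also the example key in RFC 6455:

```typescript
import { createHash } from "node:crypto";

// Spec-defined constant from RFC 6455.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// base64(SHA1(client_key + GUID)), returned in Sec-WebSocket-Accept.
function computeAcceptKey(secWebSocketKey: string): string {
  return createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

console.log(computeAcceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A server would send that value back in the Sec-WebSocket-Accept header of its 101 Switching Protocols response; the browser checks it before treating the connection as a WebSocket.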

If the handshake succeeds (101 Switching Protocols), the same TCP connection is now a WebSocket. The browser stops speaking HTTP; the protocol is frame-based from here on.

The frame format

Every WebSocket message travels as one or more frames. The byte layout of a single frame:

Example: a client → server text frame carrying "hello". Opcode 0x1 is text, a UTF-8 string payload, the most common choice for JSON-based APIs.

  • byte 0: 1 (FIN=1), 000 (RSV 1/2/3 = 0), 0001 (opcode 0x1)
  • byte 1: 1 (MASK=1), 0000101 (len = 5)
  • bytes 2–5: 4-byte masking key (random, applied XOR to payload)
  • payload: "hello"

Invariants:

  • client → server frames MUST have MASK=1.
  • server → client frames MUST have MASK=0.
  • control frames (close / ping / pong) MUST have FIN=1 and payload ≤ 125 bytes.
  • fragmented messages use FIN=0 on all frames except the last.

The first byte packs FIN (1 bit — is this the last frame in a message?) plus 3 reserved bits plus the opcode (4 bits). The opcode selects the message type:

  • 0x1 — text (UTF-8 payload)
  • 0x2 — binary
  • 0x0 — continuation (for fragmented messages)
  • 0x8 — close
  • 0x9 — ping
  • 0xA — pong

The second byte is the mask bit (1 for client → server, 0 for server → client) plus a 7-bit length. Payloads up to 125 bytes fit in those 7 bits. A length value of 126 means the next 2 bytes hold a 16-bit length; 127 means the next 8 bytes hold a 64-bit length.

Every client → server frame is masked with a random 4-byte key XORed into the payload. This exists for one very specific security reason: without masking, a WebSocket client could be used to smuggle attacker-controlled bytes to a cache proxy that interprets them as HTTP responses. Masking makes the bytes unpredictable.
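
Putting the first two bytes and the masking step together, here is a sketch of how a client could encode a short text frame. Browsers do this internally, so you never write it in page code. The function name is hypothetical, only payloads up to 125 bytes are handled, and the fixed key is used so the output can be compared with the "Hello" example in RFC 6455; a real client picks a fresh random key per frame.

```typescript
function encodeClientTextFrame(text: string, maskKey: Uint8Array): Uint8Array {
  const payload = new TextEncoder().encode(text);
  if (payload.length > 125) {
    throw new Error("this sketch only handles payloads up to 125 bytes");
  }
  if (maskKey.length !== 4) throw new Error("mask key must be 4 bytes");

  const out = new Uint8Array(2 + 4 + payload.length);
  out[0] = 0x80 | 0x1; // FIN=1, RSV=000, opcode 0x1 (text)
  out[1] = 0x80 | payload.length; // MASK=1, 7-bit payload length
  out.set(maskKey, 2); // bytes 2-5: masking key
  for (let i = 0; i < payload.length; i++) {
    out[6 + i] = payload[i] ^ maskKey[i % 4]; // XOR each byte with the key
  }
  return out;
}

const exampleKey = new Uint8Array([0x37, 0xfa, 0x21, 0x3d]);
const exampleFrame = encodeClientTextFrame("Hello", exampleKey);
console.log(
  Array.from(exampleFrame, (b) => b.toString(16).padStart(2, "0")).join(" "),
);
// 81 85 37 fa 21 3d 7f 9f 4d 51 58
```

Unmasking on the server is the same XOR applied again with the same key.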

Ping, pong, and the "is it still alive?" problem

WebSocket has a built-in ping / pong mechanism (opcodes 0x9 and 0xA). The server sends a ping; the browser responds with a pong with identical payload bytes. The browser does this automatically for you — you can't intercept it.

The trap: browsers do NOT send pings to the server, and the WebSocket API gives you no way to send one. If your server never sends pings, a dead TCP connection can sit in the connected state for a long time: TCP keep-alive, where it is enabled at all, waits 2 hours by default on Linux before its first probe.

You need an application-level heartbeat. Send a {type: "heartbeat"} message every 20–30 seconds; if the server doesn't respond within 60s, close and reconnect.
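
One way to structure that check is a small watchdog that is independent of the socket itself. This is a sketch under assumptions: the Heartbeat class and HeartbeatSocket interface are hypothetical, the server is assumed to answer heartbeats with some message, and close code 4000 is an arbitrary choice from the application-defined range.

```typescript
// Minimal surface of a WebSocket that the heartbeat needs.
interface HeartbeatSocket {
  send(data: string): void;
  close(code?: number, reason?: string): void;
}

class Heartbeat {
  private socket: HeartbeatSocket;
  private timeoutMs: number;
  private now: () => number;
  private lastSeen: number;

  constructor(
    socket: HeartbeatSocket,
    timeoutMs = 60_000,
    now: () => number = () => Date.now(),
  ) {
    this.socket = socket;
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastSeen = now();
  }

  // Call whenever any message arrives from the server.
  markAlive(): void {
    this.lastSeen = this.now();
  }

  // Call every 20-30s, e.g. from setInterval. Returns false once the
  // connection is considered dead and has been closed.
  tick(): boolean {
    if (this.now() - this.lastSeen > this.timeoutMs) {
      this.socket.close(4000, "heartbeat timeout");
      return false;
    }
    this.socket.send(JSON.stringify({ type: "heartbeat" }));
    return true;
  }
}
```

In a page you would pass the real WebSocket as the socket, call markAlive() from the message listener, and call tick() from a setInterval of 20 to 30 seconds. The close event that follows a timeout then feeds your normal reconnect path.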

Reconnection

WebSocket has no native reconnection. When the connection drops, ws.onclose fires with a close code. You reconnect yourself:

let attempts = 0; // consecutive failed attempts, reset on a successful open

function connect() {
  const ws = new WebSocket(url);
  ws.addEventListener("close", (e) => {
    // 1s, 2s, 4s, ... capped at 30s
    const backoff = Math.min(30000, 1000 * 2 ** attempts);
    attempts++;
    setTimeout(connect, backoff);
  });
  ws.addEventListener("open", () => { attempts = 0; });
}

Exponential backoff (1s, 2s, 4s, ..., cap at 30s) is the standard. Add jitter to prevent thundering-herd reconnects when a server comes back online.
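
Randomizing the delay (jitter) spreads reconnects out so that clients which dropped together do not all retry at the same instant. One common variant, sometimes called "full jitter", is sketched below. The function name and defaults are illustrative, and the random source is injectable only so the behaviour can be tested.

```typescript
// Pick a uniformly random delay between 0 and the capped exponential value.
function backoffWithJitter(
  attempt: number,
  baseMs = 1000,
  capMs = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Ceilings for attempts 0..5 are 1s, 2s, 4s, 8s, 16s, 30s;
// the actual delay is a random point below each ceiling.
for (let attempt = 0; attempt <= 5; attempt++) {
  console.log(attempt, backoffWithJitter(attempt));
}
```

In the connect() function above, this would replace the fixed Math.min(...) expression when computing the delay passed to setTimeout.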

Unlike SSE, there's no Last-Event-ID equivalent built in. If you need resumable streams, you implement the protocol yourself — the client sends its last-seen sequence number after reconnect, the server replays from there.

When to use WebSocket

  • Chat and messaging.
  • Collaborative editing (cursor positions, text operations).
  • Multiplayer games.
  • Trading / crypto UIs that stream bid/ask updates at high rate.
  • Any feature where client and server exchange messages at the same cadence.

When NOT to use WebSocket

  • Server-only updates, text-only payloads → use SSE. Automatic reconnection for free.
  • One-shot streaming responses → use fetch + ReadableStream.
  • Server can be entirely stateless → use short polling.

5. fetch + ReadableStream

This is the transport the major LLM APIs use (OpenAI, Anthropic, Google). It's a normal HTTP response, but the server sends the body in chunks over time instead of all at once.

const res = await fetch("/api/stream");
const reader = res.body!.getReader();
const decoder = new TextDecoder();
 
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const chunk = decoder.decode(value, { stream: true });
  handleChunk(chunk);
}

The response uses Transfer-Encoding: chunked under the hood (HTTP/1.1) or streaming data frames (HTTP/2 / HTTP/3). The browser hands the client Uint8Array chunks as they arrive on the wire.

From MDN:

This is good for memory efficiency, because the browser doesn't have to buffer the entire response in memory before the caller retrieves it. This also means that the caller can process the content incrementally as it is received.

Differences from SSE

SSE is also an HTTP response that stays open — in that sense it's a specialised fetch-stream. The differences:

  • Browser API: EventSource for SSE; fetch + stream reader for generic streams. EventSource handles reconnection; fetch doesn't.
  • Format: SSE defines a specific text format (field lines + blank-line delimiters). Generic streams carry whatever bytes you like — raw binary, JSON lines, Protobuf.
  • Lifecycle: SSE is designed as a persistent pipe. Fetch-stream is a one-shot; when the server closes, you're done.

Use cases

  • Streaming LLM tokens. OpenAI and Anthropic both stream via SSE (text/event-stream with data: prefixed JSON events) — but the client-side pattern is still "fetch + body.getReader() + parse chunks", because the browser EventSource constructor doesn't support POST or custom headers. You're reading an SSE stream over a generic fetch stream.
  • Long-running downloads with progress. The server sends partial data; the client updates a progress UI.
  • Server-generated data that isn't event-shaped. A CSV export of 100M rows the server wants to stream rather than buffer.

The JSON-lines pattern

A common format when you want multiple "messages" over a fetch-stream: newline-delimited JSON.

{"type":"token","text":"Hello"}
{"type":"token","text":" world"}
{"type":"done","total":2}

The client decodes UTF-8 chunks and splits on \n:

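const reader = res.body!.pipeThrough(new TextDecoderStream()).getReader();
let buffer = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += value;
  // Emit every complete line; keep the trailing partial line in the buffer.
  let nl = buffer.indexOf("\n");
  while (nl !== -1) {
    const line = buffer.slice(0, nl);
    buffer = buffer.slice(nl + 1);
    if (line) handleJSON(JSON.parse(line));
    nl = buffer.indexOf("\n");
  }
}
// Flush a final line that arrived without a trailing newline.
if (buffer.trim()) handleJSON(JSON.parse(buffer));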

You'll notice this is roughly equivalent to implementing SSE by hand with less robust framing. If you're inventing your own line format on top of fetch-stream, ask whether SSE would have been the better choice.

6. WebTransport — the QUIC-native option

WebTransport is the newest transport on the list. It went Baseline 2026 in March, which means all major browsers now support it. It's built on HTTP/3, which runs on QUIC, which runs on UDP.

From MDN's WebTransport reference:

The WebTransport interface provides functionality to enable a user agent to connect to an HTTP/3 server, initiate reliable and unreliable transport in either or both directions, and close the connection once it is no longer needed.

Two APIs in one connection:

  • Streams (reliable, ordered): functionally similar to WebSocket messages, but you can have many concurrent streams over one connection without head-of-line blocking. Each stream is independent.
  • Datagrams (unreliable, unordered): functionally similar to UDP packets. Drop late datagrams on purpose. Ideal for real-time game state and media, where the newest position is what matters, not the one from 80ms ago.

const transport = new WebTransport("https://example.com:4433/rt");
await transport.ready;

// Send an unreliable datagram
const writer = transport.datagrams.writable.getWriter();
await writer.write(new Uint8Array([1, 2, 3]));

// Bidirectional reliable stream (opened before the receive loop below,
// which runs until the session closes)
const stream = await transport.createBidirectionalStream();

// Receive datagrams until the session closes
const reader = transport.datagrams.readable.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  handleDatagram(value);
}

Why this matters (for some apps)

  1. No head-of-line blocking between streams. A lost TCP packet on WebSocket blocks every message in the same connection. QUIC streams are independent — a drop on one doesn't stall the others.
  2. Survives network changes. QUIC has connection IDs, not IP-and-port tuples. When the user moves from WiFi to cellular, the QUIC connection survives (if the server supports migration).
  3. Unreliable datagrams. The only transport on this list that explicitly supports drop-on-late delivery. For games sending position updates at 60 Hz, packets that arrive 100ms late are useless.

Why most apps shouldn't use it

  1. Server support is thin. You need an HTTP/3 server with QUIC. Most web frameworks don't yet. There are Rust (h3, quiche) and Go (quic-go) libraries, but wiring them into a Node / Python / Java stack is work.
  2. Operational unfamiliarity. Running HTTP/3 behind a load balancer is newer territory. Many CDNs and LBs don't yet support WebTransport passthrough.
  3. WebSocket is usually enough. For chat, collab, trading — WebSocket's head-of-line blocking rarely matters. Reach for WebTransport only if you actually need the latency or the datagrams.

Picking one — the decision tree

[Interactive decision tree. It starts from one question: who initiates the messages?]

The shortest answer: if server-only and text → SSE; if bidirectional → WebSocket; if you can justify the ops load → WebTransport; otherwise short polling. Most teams over-engineer this. A 10-second polling loop is an acceptable start for most features and can always be upgraded when the UX demands it.
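
The same rule of thumb can be written as a function. This is one possible reading of the paragraph above, not the article's interactive tree: the Requirements fields are invented for illustration, binary server push is routed to WebSocket because SSE is text only, and long polling is left out because it only serves as a fallback.

```typescript
type Transport =
  | "short polling"
  | "SSE"
  | "WebSocket"
  | "fetch + ReadableStream"
  | "WebTransport";

interface Requirements {
  clientSendsFrequently: boolean; // bidirectional traffic
  serverPushes: boolean; // server-initiated updates that must arrive promptly
  oneShotResponse: boolean; // a single streamed response (LLM tokens, big export)
  needsBinary: boolean;
  needsDatagramsOrGameLatency: boolean;
  canRunHttp3: boolean; // the "ops load" caveat
}

function pickTransport(r: Requirements): Transport {
  if (r.needsDatagramsOrGameLatency && r.canRunHttp3) return "WebTransport";
  if (r.clientSendsFrequently) return "WebSocket";
  if (r.oneShotResponse) return "fetch + ReadableStream";
  if (r.serverPushes) return r.needsBinary ? "WebSocket" : "SSE";
  return "short polling";
}

const notificationFeed: Requirements = {
  clientSendsFrequently: false,
  serverPushes: true,
  oneShotResponse: false,
  needsBinary: false,
  needsDatagramsOrGameLatency: false,
  canRunHttp3: false,
};
console.log(pickTransport(notificationFeed)); // SSE
```

The order of the checks encodes the priorities: the most demanding requirement wins, and short polling is what remains when nothing forces an upgrade.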

The transport is not the hard part — the app protocol is

Regardless of which transport you pick, you still design an application-level protocol on top. A few concerns that apply to every transport:

Message format

JSON is the default. Fast to parse, human-readable, universally supported. Add a type discriminator:

type Msg =
  | { type: "ping"; t: number }
  | { type: "chat"; text: string; from: string }
  | { type: "typing"; room: number }
  | { type: "presence"; user: string; status: "online" | "away" };
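
The type discriminator pays off on the receiving side, where a switch on msg.type narrows each branch to the right shape. A sketch that repeats the Msg union so it stands alone; describeMsg and parseMsg are hypothetical helpers, and parseMsg only checks the type tag, not the remaining fields.

```typescript
type Msg =
  | { type: "ping"; t: number }
  | { type: "chat"; text: string; from: string }
  | { type: "typing"; room: number }
  | { type: "presence"; user: string; status: "online" | "away" };

function describeMsg(msg: Msg): string {
  switch (msg.type) {
    case "ping":
      return `ping at ${msg.t}`;
    case "chat":
      return `${msg.from}: ${msg.text}`;
    case "typing":
      return `someone is typing in room ${msg.room}`;
    case "presence":
      return `${msg.user} is ${msg.status}`;
    default: {
      const unreachable: never = msg; // compile error if a variant is unhandled
      throw new Error(`unknown message: ${JSON.stringify(unreachable)}`);
    }
  }
}

const KNOWN_TYPES = new Set(["ping", "chat", "typing", "presence"]);

// Returns null for anything that is not JSON or carries an unknown type tag.
function parseMsg(raw: string): Msg | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const tag = (value as { type?: unknown }).type;
  if (typeof tag !== "string" || !KNOWN_TYPES.has(tag)) return null;
  return value as Msg; // field-level validation is omitted in this sketch
}

console.log(describeMsg({ type: "chat", text: "hi", from: "ana" })); // ana: hi
```

Production code would usually validate every field, by hand or with a schema library, before trusting a message from the network.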

Binary formats (Protobuf, MessagePack, FlatBuffers) save 30–60% bytes on high-frequency payloads but require schema coordination. Use them when message rate is >100/s per client.

Heartbeats

Every long-lived transport needs one. Browsers and NAT boxes time out idle connections; corporate proxies drop quiet streams after 30–60s.

  • SSE: server sends a comment line (: keep-alive\n\n) every 15s.
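  • WebSocket: the server sends a ping (opcode 0x9) every 20–30s. A browser client can't send protocol-level pings, so its side of the check is an application-level heartbeat message.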
  • WebTransport: QUIC handles this at the protocol level.

Reconnection with resumption

The general algorithm:

  1. On connect, client sends its last-seen sequenceId.
  2. Server replays events from sequenceId + 1 onwards.
  3. If the server can't replay that far (event log trimmed), it tells the client to refetch full state.

SSE gives you this for free via Last-Event-ID. For the others, you implement it. Every real production WebSocket service has this; the ones without it lose messages on reconnect and blame "flaky network".
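
On the server, the three steps reduce to a bounded event log with a replay method. A minimal in-memory sketch; EventLog, replayAfter, and the "resync" signal are names made up for illustration, and a real service would likely keep the log in shared storage rather than in one process.

```typescript
interface LoggedEvent<T> {
  seq: number;
  payload: T;
}

type ReplayResult<T> =
  | { kind: "replay"; events: LoggedEvent<T>[] }
  | { kind: "resync" }; // client must refetch full state

class EventLog<T> {
  private entries: LoggedEvent<T>[] = [];
  private nextSeq = 1;
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  append(payload: T): LoggedEvent<T> {
    const entry = { seq: this.nextSeq++, payload };
    this.entries.push(entry);
    if (this.entries.length > this.capacity) this.entries.shift(); // trim oldest
    return entry;
  }

  // Steps 2 and 3: replay from lastSeen + 1, or ask for a resync when the
  // log has already been trimmed past that point.
  replayAfter(lastSeen: number): ReplayResult<T> {
    const oldest = this.entries.length ? this.entries[0].seq : this.nextSeq;
    if (lastSeen + 1 < oldest) return { kind: "resync" };
    return {
      kind: "replay",
      events: this.entries.filter((e) => e.seq > lastSeen),
    };
  }
}
```

With a capacity of 3 and five appended events, the log holds sequence numbers 3 to 5: a client that last saw 3 gets events 4 and 5 replayed, while a client that last saw 1 is told to resync.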

Backpressure

What happens when the server produces faster than the client consumes?

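  • SSE / fetch-stream: TCP backpressure. When the client stops reading, the socket's send buffer fills and the server's writes stall. A blocking server simply waits; in Node, res.write() returns false and you wait for the 'drain' event before writing more.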
  • WebSocket: ws.send() buffers in ws.bufferedAmount. If you don't check, you'll OOM. Poll bufferedAmount and back off when it grows.
  • WebTransport: writable stream exposes explicit backpressure signals. Use them.
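
For WebSocket, checking bufferedAmount can look like the sketch below. The helper names and the 1 MB threshold are assumptions to tune per application; the BufferedSocket interface lists only the two members of a browser WebSocket that the helpers touch.

```typescript
interface BufferedSocket {
  readonly bufferedAmount: number; // bytes queued but not yet sent
  send(data: string): void;
}

const HIGH_WATER_MARK = 1_000_000; // bytes; an assumed threshold

// Drops the update when too much is already queued. Suitable for state that
// the next update supersedes (cursor positions, prices), not for chat messages.
function sendIfNotCongested(
  socket: BufferedSocket,
  data: string,
  limit = HIGH_WATER_MARK,
): boolean {
  if (socket.bufferedAmount > limit) return false;
  socket.send(data);
  return true;
}

// Waits, by polling, until the queue drains below the limit, then sends.
// Suitable for messages that must not be dropped.
async function sendWhenDrained(
  socket: BufferedSocket,
  data: string,
  limit = HIGH_WATER_MARK,
  pollMs = 50,
): Promise<void> {
  while (socket.bufferedAmount > limit) {
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
  socket.send(data);
}
```

Which helper fits depends on the message: dropping is right for values that go stale quickly, waiting is right for anything the receiver must see.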

Authentication

  • Short polling / fetch-stream: cookies or bearer token on every request. Standard.
  • SSE: cookies. The EventSource constructor doesn't let you set headers. If you need a bearer token, either use the EventSourcePolyfill library or switch to fetch-stream.
  • WebSocket: cookies work. You can also send the token as a sub-protocol, or in the first message after open. Don't put tokens in the URL — they end up in logs.
  • WebTransport: cookies.

The summary

Most features don't need real-time. Build with short polling first; upgrade to SSE or WebSocket when a user can feel the latency.

When you do upgrade: for server → client text use SSE. For bidirectional use WebSocket. For game-grade latency consider WebTransport. For streaming responses that are fundamentally one-shot (LLM tokens, large downloads) use fetch + ReadableStream.

The failure modes are in the operational details: keep-alives through proxies, application-level heartbeats, explicit reconnection logic, backpressure handling. Whichever transport you pick, budget time for those.

Primary sources