
How a Browser Really Talks to a Service-Backed Web App

Pere Pages
Software Engineer
A browser and a backend service exchanging requests across the network

Type a URL, hit Enter, and a page appears. Between that keystroke and the first pixel sits a surprising amount of machinery: DNS, TCP, TLS, HTTP, load balancers, and the services behind them.

That single "load this page" gesture fans out into dozens of network hops, an identity dance, several databases (some of which are copies of each other), a handful of third-party APIs, and a quiet stream of telemetry that nobody in the UI ever sees.

This post walks the whole round trip: everything that travels upstream (the request, from browser to origin) and everything that comes back downstream (the response), and how all the moving parts are orchestrated. We'll model the system with C4 diagrams (context and container) and trace concrete flows with sequence diagrams.

The mental model I want you to leave with: a request is not a straight line. It's a pipeline of narrow gates (edge, gateway, auth), a fan-out to dependencies (DBs, APIs), and a fold-back into a single response — all stitched together by a correlation ID so we can reconstruct the story afterwards.


1. The cast of characters (C4 System Context)

Before tracing calls, let's see who is involved. The C4 System Context diagram zooms all the way out: our system as one box, the people who use it, and the external systems it depends on.

Two subtle but important points already show up here:

  • The user talks to some external systems directly. Map tiles are usually fetched by the browser straight from the maps provider (they're just images at predictable URLs, and you don't want them proxied through your backend). The redirect to the identity provider also happens in the browser.
  • The app talks to other external systems on the server side. Geocoding, payments, and anything that requires a secret key must go through the backend so the key never reaches the browser. This split — who calls what from where — is a recurring design decision.

2. What's inside the box (C4 Container)

The Container diagram opens up our system into the deployable/runnable pieces and the data stores. This is the level where "replicated databases," caches, and gateways become visible.

The important shapes here:

  • The edge is a container too. The CDN isn't just a cache; it terminates TLS, runs a Web Application Firewall (WAF), and can execute edge functions. It's the first thing your traffic actually hits.
  • The gateway is the single front door for the API. Everything the SPA calls goes through it. That's where you centralise cross-cutting concerns: rate limiting, token verification, request routing.
  • Writes and reads take different paths. The primary database accepts writes; one or more read replicas serve reads and receive changes asynchronously from the primary. This is the single most important detail for understanding scale — and the source of the trickiest consistency bug we'll discuss below.

3. Upstream: the anatomy of a request

Let's follow a request from the moment of user intent to the moment the application service starts doing real work. Each step is a gate that can transform, reject, or reroute the request.

3.1 Name resolution — where is app.example.com?

The browser can't send anything until it has an IP address. It asks a DNS resolver, which walks the hierarchy (root → TLD → authoritative) unless a cached answer exists at the OS, browser, or resolver level. For a CDN-fronted app the returned address is usually an Anycast IP: the same address is announced from many locations, and network routing quietly sends you to the nearest point of presence (PoP).

Takeaway: the first "hop" is a lookup, and it's heavily cached. Cache misses here add real latency, which is why DNS TTLs are a deliberate tuning knob.
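To make the TTL knob concrete, here is a minimal sketch of a resolver-side cache. The class name, the injected `resolve` callable, and the IP address used in the usage note are all illustrative assumptions, not part of any real resolver API.

```python
import time
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Toy DNS cache: remembers an answer until its TTL expires."""

    def __init__(
        self,
        resolve: Callable[[str], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # Hypothetical upstream lookup: name -> (ip, ttl_seconds).
        self._resolve = resolve
        self._clock = clock
        # name -> (ip, expires_at)
        self._entries: Dict[str, Tuple[str, float]] = {}
        self.misses = 0

    def lookup(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]  # cache hit: no network round trip
        self.misses += 1
        ip, ttl = self._resolve(name)  # cache miss: pay the lookup latency
        self._entries[name] = (ip, now + ttl)
        return ip
```

With a 60-second TTL, two lookups inside the same minute cost one upstream query; a lookup after the TTL expires costs another. A longer TTL means fewer misses but slower failover when the address changes.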

3.2 Transport + TLS — opening a secure channel

The browser opens a connection to the edge and performs a TLS handshake (or uses QUIC for HTTP/3, which folds transport and crypto setup together). Crucially, TLS is terminated at the edge, not at your origin. From the CDN inward, traffic often rides a separate, internally trusted channel (frequently mTLS between your own components).

Two performance notes worth internalising:

  • Connection reuse matters enormously. HTTP/2 and HTTP/3 multiplex many requests over one connection, so the handshake cost is paid once.
  • HTTP/3 over QUIC survives network changes (Wi-Fi → cellular) without a full re-handshake, because the connection is identified by a connection ID rather than the IP/port tuple.

3.3 The edge — CDN, cache, and WAF

Now the request is inside the CDN. Three things can happen:

  1. Static asset, cache hit: the edge answers immediately. Your origin never sees the request. This is how index.html, JS bundles, CSS, and images get served fast worldwide.
  2. Dynamic request: it's passed through toward the origin (possibly after an edge function tweaks headers, does geo-routing, or handles auth cookies).
  3. Rejected by the WAF: obviously malicious patterns (SQL injection strings, known bad IPs, requests that trip a rate rule) are blocked before they cost you anything.
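The three outcomes above can be sketched as a single decision function. This is a deliberately naive illustration: the signature list, the `EdgeDecision` type, and the function name are assumptions, and real WAF rule sets and cache keys are far richer.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class EdgeDecision:
    action: str  # "block", "serve_from_cache", or "forward_to_origin"
    body: Optional[bytes] = None


# Hypothetical, oversimplified signatures for illustration only.
SUSPICIOUS_FRAGMENTS = ("' or 1=1", "union select", "<script")


def handle_at_edge(
    method: str,
    path: str,
    query: str,
    client_ip: str,
    cache: Dict[str, bytes],
    blocked_ips: Set[str],
) -> EdgeDecision:
    lowered = query.lower()
    # Gate 1: the WAF rejects known-bad callers and obvious attack strings.
    if client_ip in blocked_ips or any(f in lowered for f in SUSPICIOUS_FRAGMENTS):
        return EdgeDecision("block")
    # Gate 2: a cached static asset is answered without touching the origin.
    if method == "GET" and path in cache:
        return EdgeDecision("serve_from_cache", cache[path])
    # Everything else is dynamic and continues toward the origin.
    return EdgeDecision("forward_to_origin")
```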

3.4 Load balancing — picking a healthy instance

Dynamic traffic that survives the edge reaches a load balancer. An L4 balancer routes by IP/port; an L7 balancer understands HTTP and can route by path or header. Either way it spreads traffic across healthy instances, using health checks to stop sending requests to a node that's failing its readiness probe. This is what makes horizontal scaling and zero-downtime deploys possible.
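As a rough illustration of "spread traffic across healthy instances", here is a round-robin picker that skips any instance whose health check fails. The class and the `is_healthy` callback are assumptions for the sketch; production balancers use richer algorithms (least connections, weighted, and so on).

```python
from typing import Callable, List


class RoundRobinBalancer:
    """Rotates through instances, skipping any that fail the health check."""

    def __init__(self, instances: List[str], is_healthy: Callable[[str], bool]) -> None:
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)
        self._is_healthy = is_healthy  # hypothetical readiness probe
        self._next = 0

    def pick(self) -> str:
        # Try each instance at most once per pick, starting where we left off.
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if self._is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy instances")
```

Taking an instance out of the healthy set before a deploy, and putting it back afterwards, is the essence of a zero-downtime rollout: the balancer simply stops choosing it in between.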

3.5 The API gateway — the front door for your API

The gateway (or reverse proxy — Nginx, Envoy, Kong, a managed API gateway) is where the request meets your rules:

  • Routing: /api/orders → order service, /api/maps → geo service.
  • Rate limiting / throttling: per-user or per-IP quotas.
  • Authentication: validate the caller's token here, so downstream services can trust it.
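Two of these rules are small enough to sketch. The routing table below reuses the paths from the list above, with service names invented for illustration, and the limiter is a token bucket, which is one common way (not the only one) to implement per-caller quotas.

```python
import time
from typing import Callable, Dict, Optional

# Paths from the text; the service names are illustrative.
ROUTES: Dict[str, str] = {
    "/api/orders": "order-service",
    "/api/maps": "geo-service",
}


def route(path: str) -> Optional[str]:
    """Return the upstream for a path; the longest matching prefix wins."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._capacity = capacity
        self._refill = refill_per_second
        self._clock = clock
        self._tokens = capacity
        self._updated = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Top up for the time elapsed since the last call, capped at capacity.
        elapsed = now - self._updated
        self._tokens = min(self._capacity, self._tokens + elapsed * self._refill)
        self._updated = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False  # the gateway would answer 429 Too Many Requests
```

In a real gateway you would keep one bucket per user or per IP, typically in a shared store so that every gateway instance sees the same counts.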

3.6 Authentication — proving who's calling

For a SPA the recommended flow is OIDC Authorization Code with PKCE. The login itself is a browser redirect dance with the identity provider (diagrammed in §5). By the time a normal API request arrives, the SPA is attaching an access token (typically a signed JWT) in the Authorization: Bearer … header.

The gateway verifies that token without calling the IdP on every request:

  • It fetches the IdP's public signing keys (JWKS) and caches them, refetching only when a token names a key ID (kid) it has not seen, which is how key rotation gets picked up.
  • For each request it checks the signature, expiry (exp), audience (aud), and issuer (iss) locally.

That local verification is what keeps auth cheap. The IdP is only contacted during login and token refresh, not on the hot path.
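The local checks can be sketched with the standard library alone. One loud assumption: to stay dependency-free this example uses HS256 (a shared HMAC key) as a stand-in, whereas a real IdP normally signs with an asymmetric algorithm such as RS256 and the gateway verifies with the public key from the JWKS. The function names are illustrative; in production you would use a maintained JWT library rather than hand-rolled parsing.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_token(claims: Dict[str, Any], key: bytes) -> str:
    """Helper that mints an HS256 token, only so the sketch can be exercised."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(key, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url_encode(mac.digest())}"


def verify_token(
    token: str, key: bytes, issuer: str, audience: str, now: Optional[float] = None
) -> Dict[str, Any]:
    """Check signature, exp, iss, and aud locally; raise ValueError on failure."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    # Pin the algorithm: never let the token choose how it is verified.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signed_part = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signed_part, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ValueError("token expired")
    if claims.get("iss") != issuer:
        raise ValueError("wrong issuer")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    return claims
```

Note that nothing here touches the network: once the key is cached, every check is a hash and a few comparisons.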

3.7 Into the application — and the fan-out

Finally the request reaches the application service, which runs your business logic. This is rarely self-contained; it fans out to dependencies:

  • Cache first (Redis): check for a hot value before touching the database. A cache hit can end the story here.
  • Reads → replica: ordinary queries go to a read replica to spare the primary.
  • Writes → primary: inserts/updates go to the primary, which then streams the change to replicas asynchronously.
  • External APIs → server-side: geocoding, payments, email. These carry secret keys and must originate here, never in the browser.

Each of these is itself a small upstream/downstream trip, wrapped in timeouts, retries with backoff, and circuit breakers so one slow dependency can't hang the whole request.
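The "cache first, then replica" part of that fan-out is the cache-aside pattern. Here is a minimal sketch in which plain dictionaries stand in for Redis and the read replica; the class and method names are assumptions for illustration.

```python
from typing import Dict, Optional


class OrderReader:
    """Cache-aside reads: try the cache, fall back to the replica, then remember."""

    def __init__(self, cache: Dict[str, dict], replica: Dict[str, dict]) -> None:
        self.cache = cache      # stand-in for Redis
        self.replica = replica  # stand-in for a read replica
        self.replica_reads = 0  # counts how often we had to hit the database

    def get_order(self, order_id: str) -> Optional[dict]:
        if order_id in self.cache:
            return self.cache[order_id]  # hot value: the story ends here
        self.replica_reads += 1
        order = self.replica.get(order_id)
        if order is not None:
            self.cache[order_id] = order  # populate for the next caller
        return order
```

A real implementation would also set an expiry on cached entries and invalidate or overwrite them on writes, so the cache does not serve stale data indefinitely.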

4. Downstream: assembling and returning the response

The request has fanned out. Now everything folds back into a single response that travels the same gates in reverse.

4.1 Assembly and serialization

The service combines the pieces — the DB rows, the external API results, the cached fragments — into a response object and serializes it (usually JSON; for server-side-rendered pages, HTML). It sets a status code that means something: 200 for a read, 201 Created for a new resource, 4xx for client mistakes, 5xx for its own failures.

4.2 Back through the gateway

On the way out the gateway adds the polish:

  • CORS headers so the browser will accept a cross-origin response.
  • Security headers (Content-Security-Policy, Strict-Transport-Security, etc.).
  • Caching directives (Cache-Control, ETag, Vary) that tell the CDN and browser whether and how long they may reuse this response.

4.3 Compression and caching at the edge

The CDN typically compresses the body (Brotli or gzip) and, if the caching headers allow, stores the response so the next identical request never reaches your origin. This is why setting cache headers correctly is a backend concern with front-of-house consequences: a good Cache-Control policy is free performance.

4.4 TLS back to the browser

The edge re-encrypts and streams the response to the browser over the existing secure connection. The multiplexed transport means this response can arrive interleaved with other in-flight ones.

4.5 The browser's downstream work

Receiving bytes is not the end — the browser now has its own pipeline:

  • First load: parse HTML, build the DOM, fetch and execute JS, apply CSS, paint. The SPA framework (React) boots and mounts the app, or hydrates it if the HTML was server-rendered.
  • Subsequent data: a fetch() resolves, state updates, React re-renders the affected components. No full page reload.
  • Client-side fan-out: the freshly rendered page may itself trigger more requests — map tiles pulled directly from the maps provider, analytics beacons, lazy-loaded chunks. The round trip we just traced often spawns several smaller ones.

5. The login flow, in full (sequence)

Authentication deserves its own trace because it's the one flow where the browser talks to an external system directly and the redirects can be confusing. This is OIDC Authorization Code + PKCE — the correct choice for a public client like a SPA that can't keep a secret.

Why PKCE? The code_verifier is a secret the browser generates and keeps. The browser sends only a hash of it (the code_challenge) with the authorization request and reveals the verifier itself when it exchanges the code, so only the party that started the flow can exchange the authorization code for tokens. That closes the door on an attacker who intercepts the code in the redirect. The access token authorises API calls; the ID token describes the user; the refresh token (if issued) buys new access tokens without re-prompting for credentials.
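The verifier/challenge relationship is small enough to show in full. In the real flow the SPA computes this in the browser; the Python below only illustrates the S256 transform defined in RFC 7636, and the function names (including the `idp_accepts` check that mimics what the identity provider does at token exchange) are illustrative.

```python
import base64
import hashlib
import secrets


def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_verifier() -> str:
    # 32 random bytes encode to 43 URL-safe characters,
    # the minimum length RFC 7636 allows (43 to 128).
    return _b64url(secrets.token_bytes(32))


def code_challenge_s256(verifier: str) -> str:
    # challenge = BASE64URL(SHA256(ASCII(verifier))), without padding
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def idp_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    # At token exchange the IdP re-hashes the verifier and compares it
    # with the challenge it stored when the flow started.
    return secrets.compare_digest(
        code_challenge_s256(presented_verifier), stored_challenge
    )
```

An attacker who steals the authorization code from the redirect has seen, at most, the challenge. Without the verifier that hashes to it, the token exchange fails.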


6. The initial page load, in full (sequence)

This is the "first paint" story — resolving the name, opening the connection, and pulling static assets from the edge.

Note the caching split that a Vite build makes easy: index.html is small and revalidated often, while hashed asset files (app.4f1a2b.js) are immutable and can be cached essentially forever, provided the server or CDN sends matching Cache-Control headers. Change the code, the hash changes, the URL changes, and the browser fetches the new file, with no cache-busting gymnastics required.
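One way to express that split is a small function that picks a Cache-Control value per path. The regular expression is an assumption modelled on the example filename above (a dot, a hex hash, then the extension); adjust it to whatever naming your build actually emits. The fallback policy for other files is likewise just a placeholder.

```python
import re

# Matches names like "app.4f1a2b.js": a hex content hash before the extension.
# Assumed pattern, based on the example filename in the text.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(?:js|css|woff2|png|svg)$")


def cache_control_for(path: str) -> str:
    if HASHED_ASSET.search(path):
        # The URL changes whenever the content does, so cache for a year.
        return "public, max-age=31536000, immutable"
    if path == "/" or path.endswith(".html"):
        # Always revalidate the entry point so new hashes are discovered.
        return "no-cache"
    # Placeholder policy for other, unhashed static files.
    return "public, max-age=300"
```

Note that "no-cache" does not mean "do not store"; it means "store, but revalidate before reuse", which is the behaviour you want for index.html.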


7. An authenticated data request, in full (sequence)

Now the interesting one: a logged-in user creates something. This single request touches the WAF, the gateway, a write to the primary, an external API, the cache, a consistent read, and the tracing backend.

The replication gotcha worth its own paragraph

Look at the two database interactions. We wrote to the primary, then immediately needed to read the same record. The naïve version reads from a replica — but replication is asynchronous, so the replica may not have the write yet. The user creates an order and the confirmation screen shows "order not found." This is read-after-write inconsistency caused by replication lag.

Common fixes, roughly in order of preference:

  • Route reads to the primary immediately after a write (session or request "stickiness"), then let subsequent reads drift back to replicas.
  • Read your own writes from the cache you just populated.
  • Wait for the replica to catch up to a known log position before reading, if your database exposes that.

The general rule: replicas are for other people's reads, not for reading back the thing you just wrote in the same request.
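The first fix, stickiness after a write, can be sketched as a tiny router that remembers when each session last wrote. The class name, the two-second window, and the string return values are assumptions for illustration; the right window depends on your observed replication lag.

```python
import time
from typing import Callable, Dict


class ReadRouter:
    """Sends a session's reads to the primary for a short window after it writes."""

    def __init__(
        self,
        stick_seconds: float = 2.0,  # assumed window; tune to real replication lag
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._stick = stick_seconds
        self._clock = clock
        self._last_write: Dict[str, float] = {}  # session_id -> time of last write

    def record_write(self, session_id: str) -> None:
        self._last_write[session_id] = self._clock()

    def target_for_read(self, session_id: str) -> str:
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self._stick:
            return "primary"  # the replica may not have this session's write yet
        return "replica"      # everyone else, and older sessions, use replicas
```

Only the session that wrote is pinned to the primary, so other users keep reading from replicas and the primary is not flooded.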

8. The invisible layer: observability, logging, and resilience

Everything above describes the happy path of one request. In production the hard question is "what happened to request #48291 that took 9 seconds?" — and you can only answer it if you built for it. These concerns run across every hop, not at any single one.

Distributed tracing

The browser (or the edge) mints a trace ID and puts it in the W3C traceparent header. Every component — gateway, service, database wrapper, external client — reads that header, creates a span for its own work, and passes the context onward. Each span is exported (over OTLP) to a collector and then to a backend like Jaeger or Tempo. The result is a flame graph of the entire request: you can literally see the 9 seconds and which span owns them.

The golden rule: propagate the trace context on every outbound call, or the trace breaks and you get orphaned spans.
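Here is a minimal sketch of that propagation step: given the incoming traceparent header, produce the value to send on the next outbound call. It follows the W3C format (version 00, a 32-hex trace ID, a 16-hex parent span ID, 2-hex flags); the function name is illustrative, and a real service would normally let an OpenTelemetry SDK do this.

```python
import re
import secrets
from typing import Optional

_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def child_traceparent(incoming: Optional[str]) -> str:
    """Keep the caller's trace ID, mint a new span ID for our own work."""
    match = _TRACEPARENT.match(incoming or "")
    # All-zero trace or span IDs are invalid in the W3C format.
    if match and match.group(1) != "0" * 32 and match.group(2) != "0" * 16:
        trace_id, flags = match.group(1), match.group(3)
    else:
        # No usable context: start a new trace, marked as sampled.
        trace_id, flags = secrets.token_hex(16), "01"
    span_id = secrets.token_hex(8)  # identifies this component's span
    return f"00-{trace_id}-{span_id}-{flags}"
```

Every hop that does this keeps the same trace ID and contributes one span; a hop that forgets starts a fresh trace, which is exactly how orphaned spans appear.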

Metrics

Alongside traces, each component emits aggregate numbers — the RED signals (Rate, Errors, Duration) for services, saturation for resources. These flow to a metrics backend (Prometheus-style) and drive dashboards and alerts. Metrics tell you that something is wrong and how widespread; traces tell you where; logs tell you why.

Structured logging

Logs should be structured (JSON, not free text) and should carry the same correlation ID as the trace. Shipped to a central aggregator (Loki, ELK, or a hosted equivalent), they let you filter to a single request across every service it touched. A log line without a correlation ID in a distributed system is nearly useless.
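As a sketch of what "structured, with a correlation ID" looks like in practice, here is a JSON formatter for Python's standard logging module. The field names and the `make_logger` helper are assumptions; most teams would reach for an existing structured-logging library instead.

```python
import json
import logging
from typing import TextIO


class JsonFormatter(logging.Formatter):
    """Renders each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                # Supplied per call via `extra={"trace_id": ...}`.
                "trace_id": getattr(record, "trace_id", None),
            }
        )


def make_logger(name: str, stream: TextIO) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # replace any handlers from earlier setup
    logger.setLevel(logging.INFO)
    logger.propagate = False     # avoid duplicate lines via the root logger
    return logger
```

A call such as `logger.info("order %s created", order_id, extra={"trace_id": trace_id})` then yields a line the aggregator can filter by trace_id across every service.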

Resilience patterns

Because the fan-out means many things can fail independently, each external call should be wrapped in:

  • Timeouts — never wait forever; a hung dependency shouldn't hang your request.
  • Retries with exponential backoff + jitter — for transient failures, but only on idempotent operations.
  • Circuit breakers — after repeated failures, stop calling a dead dependency and fail fast, giving it room to recover.
  • Bulkheads — isolate resource pools so one saturated dependency can't starve the others.
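Two of these patterns, retries with backoff plus jitter and a circuit breaker, fit in a short sketch. The names, thresholds, and delays are illustrative assumptions, and the breaker is simplified (single-threaded, one trial call after the cool-down); a production system would normally use a tested resilience library.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry an idempotent call with exponential backoff and full jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))  # jitter spreads out retry storms
    raise AssertionError("unreachable")


class CircuitOpenError(Exception):
    pass


class CircuitBreaker:
    """Fails fast after repeated failures, then allows a trial call later."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpenError("dependency unavailable, failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            raise
        self._failures = 0
        self._opened_at = None  # success closes the circuit
        return result
```

The two compose naturally: wrap the dependency call in the breaker, and wrap that in the retry, so retries stop as soon as the breaker opens. Timeouts belong on the underlying client call itself.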

9. Orchestration, in one paragraph

Zoom back out and the shape is clear. A request narrows through a series of gates — DNS, TLS at the edge, WAF, load balancer, gateway, auth — each of which can transform or stop it. It then fans out from the application service to caches, databases (writes to the primary, reads from replicas), and external APIs, each call defended by timeouts and circuit breakers. The results fold back into one serialized response that retraces the gates outward, gathering caching, compression, and security headers on the way. Threaded through all of it is a single correlation/trace ID that turns this sprawl into a reconstructable story. Nothing here is truly synchronous end to end: the edge caches, the database replicates asynchronously, the browser renders progressively, and telemetry is emitted as a side effect the user never sees. Understanding a modern web app means holding those two truths at once — the user experiences one clean round trip, and the system executes a choreographed fan-out with a lot of safety nets.


Appendix: a quick reference of who talks to whom

| Concern | Who initiates | From where | Why there |
|---|---|---|---|
| Static assets | Browser | Edge/CDN | Cacheable, no secrets |
| Map tiles | Browser | Maps provider (direct) | Public URLs, high volume |
| SSO login | Browser | Identity provider (redirect) | User must authenticate directly |
| Token verification | Gateway | Local (cached JWKS) | Cheap, keeps IdP off hot path |
| Geocoding / payments | App service | External API (server-side) | Requires secret keys |
| Reads | App service | Read replica | Offload the primary |
| Writes | App service | Primary DB | Single source of truth |
| Read-after-write | App service | Primary DB | Dodge replication lag |
| Traces / metrics | Every component | OTel collector | Reconstruct the request |
| Logs | Every component | Log aggregator | Debug with correlation ID |