chunkloris: uvicorn (websockets impl)

part of the chunkloris per-chunk amplification survey. this page is the per-server record for uvicorn (websockets impl) under websocket text frames.

at a glance

  • server: uvicorn (websockets impl), pinned to uvicorn 0.32.1, websockets 13.1, starlette 0.41.3
  • runtime: python-3.13
  • ecosystem: python
  • concurrency model: event-loop
  • parser: websockets (pure-python frame decoder)
  • delivery granularity: per-frame
  • chunk-limit helper: none exposed by the framework
  • verdict: per-frame. the parser/dispatcher boundary delivers one event per protocol frame (h2 / h3 DATA frame, or ws data frame). cpu cost under paced mode b is measurable per frame.
  • scaling exponent (mode a): 1.00 (wall time vs N, log-log slope across common cells)

measurements

all cells run on a 1-vcpu docker container. cpu cost is derived from the target container’s cgroup v2 cpu.stat usage_usec delta around each cell.

mode               N         wall (s)   server cpu %   µs / frame   basis                 ok
A-ws-bridge        50,000    0.237      -              4.740        wall                  ✓
A-ws-bridge        100,000   0.466      -              4.660        wall                  ✓
A-ws-bridge        250,000   1.173      -              4.690        wall                  ✓
B-ws-paced-100us   50,000    5.229      -              4.580        server-cpu-overhead   ✓
B-ws-paced-100us   100,000   10.616     -              6.160        server-cpu-overhead   ✓
B-ws-paced-100us   250,000   26.252     -              5.010        server-cpu-overhead   ✓
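
as a rough cross-check of the mode-a rows, the sketch below recomputes µs / frame from wall time and fits a least-squares slope of log(wall) against log(N). it uses only the three mode-a rows listed here; the "common cells" behind the reported 1.00 exponent may be a different set, so the fitted value should be read as approximately linear rather than as a reproduction of that figure. the function names are illustrative and are not taken from the paper repo.

```python
import math

# mode-a rows copied from the measurements table: (N frames, wall seconds)
MODE_A = [(50_000, 0.237), (100_000, 0.466), (250_000, 1.173)]


def us_per_frame(n_frames, wall_s):
    """Average wall-clock cost per frame, in microseconds."""
    return wall_s / n_frames * 1e6


def loglog_slope(points):
    """Least-squares slope of log(wall) against log(N)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(w) for _, w in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    for n, wall in MODE_A:
        print(f"{n:>7,} frames: {us_per_frame(n, wall):.3f} µs/frame")
    print(f"log-log slope: {loglog_slope(MODE_A):.2f}")
```

a slope near 1 means wall time grows in proportion to the number of frames, which is what a fixed per-frame cost predicts.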

what this means

the parser/dispatcher path on this server delivers one event per protocol frame (here, a websocket text frame), so an attacker who splits a payload into N one-byte frames consumes roughly N × (mode-b µs/frame) of server cpu on a single core.
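
the N × (µs/frame) estimate can be worked through with the mode-b figures from the measurements table. this is a back-of-envelope sketch: the frame count below is a hypothetical example, and the per-frame costs are the lowest and highest mode-b values measured on this page, not a guaranteed bound.

```python
def cpu_seconds(n_frames, us_per_frame):
    """Estimated single-core cpu time for n_frames at a fixed per-frame cost."""
    return n_frames * us_per_frame / 1e6


# lowest and highest mode-b µs/frame values from the measurements table
MODE_B_LOW_US = 4.58
MODE_B_HIGH_US = 6.16

if __name__ == "__main__":
    n = 1_000_000  # hypothetical: a 1 MB payload sent as one-byte frames
    low = cpu_seconds(n, MODE_B_LOW_US)
    high = cpu_seconds(n, MODE_B_HIGH_US)
    print(f"{n:,} frames: {low:.2f} to {high:.2f} cpu-seconds on one core")
```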

what to do today

  • consider imposing a per-connection frame-rate cap before delivering frames to the application.
  • the asgi spec does not provide a per-frame batching primitive at the application boundary; the closest workaround is to read from the receive channel and buffer at the application layer before doing per-message work.
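
one way to approximate the frame-rate cap is a token bucket wrapped around the asgi receive callable. the sketch below is an assumption-laden illustration, not something shipped by uvicorn, websockets, or starlette: the class names, the rate and burst parameters, and the choice of close code 1008 (policy violation) are all hypothetical. it also acts at the asgi boundary, so it stops the application from doing per-event work but does not remove the cost the frame decoder has already paid.

```python
import time


class TokenBucket:
    """Allows `rate` events per second, with bursts of up to `burst` events."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock
        self.tokens = float(burst)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # refill in proportion to elapsed time, never above the burst size
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class FrameRateCap:
    """Hypothetical asgi middleware: closes a websocket that exceeds the cap."""

    def __init__(self, app, rate, burst, clock=time.monotonic):
        self.app = app
        self.rate = rate
        self.burst = burst
        self.clock = clock

    async def __call__(self, scope, receive, send):
        if scope["type"] != "websocket":
            await self.app(scope, receive, send)
            return

        bucket = TokenBucket(self.rate, self.burst, self.clock)  # one per connection

        async def capped_receive():
            event = await receive()
            if event["type"] == "websocket.receive" and not bucket.allow():
                # over the cap: close the socket and tell the app it is gone
                await send({"type": "websocket.close", "code": 1008})
                return {"type": "websocket.disconnect", "code": 1008}
            return event

        await self.app(scope, capped_receive, send)
```

wrapping an existing asgi application would look like `app = FrameRateCap(app, rate=1000, burst=2000)`, where both numbers are placeholders to be tuned against legitimate traffic.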

reproducer

the full reproducer for this server is in the paper repo. the docker image pins uvicorn 0.32.1, websockets 13.1, and starlette 0.41.3, and the test container is constrained to a single cpu (--cpus=1). the prober script implements mode a (bridge-coalesced) and mode b (paced 100 µs) per the methodology section.

see the draft pdf for the full per-framework discussion.
