chunkloris: haproxy (with lua applet for body sink)

part of the chunkloris per-chunk amplification survey. this page is the per-server record for HAProxy (with Lua applet for body sink) under http/1.1 chunked transfer encoding.

at a glance

  • server: HAProxy (with Lua applet for body sink) 3.0.23
  • runtime: C
  • ecosystem: c
  • concurrency model: event-loop
  • parser: haproxy proto_http.c chunked decoder
  • delivery granularity: per-chunk
  • chunk-limit helper: none exposed by the framework
  • verdict: per-chunk — the parser/dispatcher boundary delivers one event per wire chunk. cpu cost under paced mode b is measurable per chunk.
  • scaling exponent (mode a): 0.50 (wall time vs N, log-log slope across common cells)
  • scaling exponent (mode b): 1.00

measurements

all cells run on a 1-vcpu docker container. cpu cost is derived from the target container’s cgroup v2 cpu.stat usage_usec delta around each cell.

modeNwall (s)server cpu %µs / chunkbasisok
A-bridge-coalesced50,0000.0010.100wall
A-bridge-coalesced100,0000.0010.100wall
A-bridge-coalesced250,0000.0020.200wall
B-paced-100us50,0005.2106.600server-cpu-overhead
B-paced-100us100,00010.4409.800server-cpu-overhead
B-paced-100us250,00026.0807.600server-cpu-overhead

what this means

the parser/dispatcher path on this server delivers one event per chunked-transfer-encoding chunk, so an attacker who sends a body as N one-byte chunks consumes roughly N × (mode-b µs/chunk) of server cpu on a single core. amplification scales linearly with N until the framework’s max_request_body_size (or equivalent) is hit.

what to do today

  • if this server runs as an origin behind nginx with the default proxy_request_buffering on, the per-chunk attack shape does not reach this server — nginx delivers one content-length-framed body to the upstream in a single recv().
  • if deployed direct-exposed, behind haproxy with default streaming, or behind any reverse proxy with proxy_request_buffering off, the per-chunk cost reaches this server.
  • there is no framework-level chunk-count limit in the default config; use a frontend buffer, transport-layer rate limiting, or a wrapping middleware that imposes a chunk-count cap before draining the body.

reproducer

the full reproducer for this server is in the paper repo. the docker container pins HAProxy (with Lua applet for body sink) 3.0.23 and constrains the test container to a single cpu (--cpus=1). the prober script implements mode a (bridge-coalesced) and mode b (paced 100 µs) per the methodology section.

see the draft pdf for the full per-framework discussion.

on this page