chunkloris: HAProxy (with Lua applet for body sink)

part of the chunkloris per-chunk amplification survey. this page is the per-server record for HAProxy (with Lua applet for body sink) under http/1.1 chunked transfer encoding.

at a glance

server: HAProxy (with Lua applet for body sink) 3.0.23
runtime: C
ecosystem: c
concurrency model: event-loop
parser: haproxy proto_http.c chunked decoder
delivery granularity: per-chunk
chunk-limit helper: none exposed by the framework
verdict: per-chunk — the parser/dispatcher boundary delivers one event per wire chunk. cpu cost under paced mode b is measurable per chunk.
scaling exponent (mode a): 0.50 (wall time vs N, log-log slope across common cells)
scaling exponent (mode b): 1.00

measurements

all cells run on a 1-vcpu docker container. cpu cost is derived from the target container’s cgroup v2 cpu.stat usage_usec delta around each cell.

mode	N	wall (s)	server cpu %	µs / chunk	basis	ok
`A-bridge-coalesced`	50,000	0.001	—	0.100	wall	✓
`A-bridge-coalesced`	100,000	0.001	—	0.100	wall	✓
`A-bridge-coalesced`	250,000	0.002	—	0.200	wall	✓
`B-paced-100us`	50,000	5.210	—	6.600	server-cpu-overhead	✓
`B-paced-100us`	100,000	10.440	—	9.800	server-cpu-overhead	✓
`B-paced-100us`	250,000	26.080	—	7.600	server-cpu-overhead	✓

what this means

the parser/dispatcher path on this server delivers one event per chunked-transfer-encoding chunk, so an attacker who sends a body as N one-byte chunks consumes roughly N × (mode-b µs/chunk) of server cpu on a single core. amplification scales linearly with N until the framework’s max_request_body_size (or equivalent) is hit.

what to do today

if this server runs as an origin behind nginx with the default proxy_request_buffering on, the per-chunk attack shape does not reach this server — nginx delivers one content-length-framed body to the upstream in a single recv().
if deployed direct-exposed, behind haproxy with default streaming, or behind any reverse proxy with proxy_request_buffering off, the per-chunk cost reaches this server.
there is no framework-level chunk-count limit in the default config; use a frontend buffer, transport-layer rate limiting, or a wrapping middleware that imposes a chunk-count cap before draining the body.

reproducer

the full reproducer for this server is in the paper repo. the docker container pins HAProxy (with Lua applet for body sink) 3.0.23 and constrains the test container to a single cpu (--cpus=1). the prober script implements mode a (bridge-coalesced) and mode b (paced 100 µs) per the methodology section.

see the draft pdf for the full per-framework discussion.

chunkloris: haproxy (with lua applet for body sink)

at a glance

measurements

what this means

what to do today

reproducer

related pages

more in infosec