chunkloris: hyper 1.5 + h2 0.4 (h2c)

part of the chunkloris per-chunk amplification survey. this page is the per-server record for hyper 1.5 + h2 0.4, tested over cleartext http/2 (h2c) with DATA frames as the wire unit.

at a glance

server: hyper 1.5 + h2 0.4 (h2c)
runtime: rustc 1.85
ecosystem: rust
concurrency model: event-loop
parser: h2 crate (Rust)
delivery granularity: per-chunk
chunk-limit helper: none exposed by the framework
verdict: batches correctly. the implementation coalesces wire units before waking the application, via an explicit per-stream frame credit, a pipelined reader, or a similar batching primitive. mode b cpu cost is in the band expected for batched, per-recv() delivery.
scaling exponent (mode a): 0.82 (wall time vs N, log-log slope across common cells)
scaling exponent (mode b): 1.00
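
a minimal sketch of how a log-log slope like the two exponents above can be fitted, assuming an ordinary least-squares fit of log(wall time) against log(N). the survey's exact fitting procedure is not stated on this page, so the function name and the choice of least squares are assumptions; the N and wall-time values are copied from the measurements table on this page.

```python
import math

def loglog_slope(ns, walls):
    """Ordinary least-squares slope of log(wall) against log(N)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(w) for w in walls]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# N and wall-time (s) values copied from the measurements table on this page.
N = [50_000, 100_000, 250_000]
WALL_MODE_A = [0.203, 0.365, 0.759]
WALL_MODE_B = [5.375, 10.780, 26.988]

slope_a = loglog_slope(N, WALL_MODE_A)
slope_b = loglog_slope(N, WALL_MODE_B)
print(f"mode a: {slope_a:.2f}, mode b: {slope_b:.2f}")
```

a slope near 1 means wall time grows linearly with the number of frames; a slope below 1, as in mode a, means the per-frame cost falls as N grows.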

measurements

all cells run on a 1-vcpu docker container. cpu cost is derived from the target container’s cgroup v2 cpu.stat usage_usec delta around each cell.

mode	N	wall (s)	server cpu %	µs / frame	basis	ok
`A-h2-bridge`	50,000	0.203	16.0	0.650	server-cpu-cgroup	✓
`A-h2-bridge`	100,000	0.365	13.5	0.492	server-cpu-cgroup	✓
`A-h2-bridge`	250,000	0.759	16.0	0.486	server-cpu-cgroup	✓
`B-h2-paced-100us`	50,000	5.375	3.8	4.086	server-cpu-cgroup	✓
`B-h2-paced-100us`	100,000	10.780	3.7	3.994	server-cpu-cgroup	✓
`B-h2-paced-100us`	250,000	26.988	3.4	3.670	server-cpu-cgroup	✓
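
the µs / frame column can be sketched as follows, assuming the cost is the usage_usec delta from the container's cgroup v2 cpu.stat divided by the frame count. the function names and the sample cpu.stat snapshots are made up for illustration. the second helper is a cross-check under the assumption that server cpu % is measured against wall time on the single vcpu; that relation approximately reproduces the column from the other columns in the table.

```python
def parse_usage_usec(cpu_stat_text):
    """Return the usage_usec counter from the text of a cgroup v2 cpu.stat file."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value)
    raise ValueError("usage_usec not found in cpu.stat")

def us_per_frame(stat_before, stat_after, frames):
    """cpu microseconds consumed per frame between two cpu.stat snapshots."""
    delta = parse_usage_usec(stat_after) - parse_usage_usec(stat_before)
    return delta / frames

def us_per_frame_from_row(wall_s, cpu_pct, frames):
    """Cross-check from table columns (assumes cpu % is relative to one vcpu)."""
    return wall_s * (cpu_pct / 100.0) * 1e6 / frames

# Illustrative snapshots, not real measurements.
before = "usage_usec 1000000\nuser_usec 600000\nsystem_usec 400000\n"
after = "usage_usec 1032500\nuser_usec 620000\nsystem_usec 412500\n"

print(us_per_frame(before, after, 50_000))
print(round(us_per_frame_from_row(0.203, 16.0, 50_000), 3))
```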

what this means

the implementation batches wire units before waking the application (via an explicit per-stream frame credit, a pipelined reader, or an equivalent primitive). in the table above, paced mode b costs roughly 3.7 to 4.1 µs of server cpu per frame at under 4 % cpu, and that per-frame cost does not grow as N rises from 50,000 to 250,000. this is the band expected for batched, per-recv() delivery.

what to do today

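if this is an h2 origin, prefer a frontend that terminates h2 and proxies to the origin over h1 with request buffering enabled (in nginx: proxy_request_buffering on), so the origin receives a buffered body rather than individual frames.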
consider imposing a per-stream DATA-frame credit (count, not bytes) before forwarding the body to the application handler.
HTTP/2 byte-level flow control (WINDOW_UPDATE) does not bound the number of frames; configure stream-frame-rate limits where the implementation exposes them.
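
the per-stream frame credit suggested above can be sketched like this. it is a hypothetical illustration of counting DATA frames rather than bytes; the class and method names are invented here and are not an api of hyper, h2, or any frontend. a real deployment would hook the equivalent logic in wherever frames are received and would reset or close the stream when the credit runs out.

```python
class FrameCredit:
    """Hypothetical per-stream budget counted in DATA frames, not bytes."""

    def __init__(self, initial_frames):
        self.initial = initial_frames
        self.remaining = {}  # stream id -> frames still allowed

    def on_data_frame(self, stream_id):
        """Spend one credit; return False when the stream has none left."""
        left = self.remaining.get(stream_id, self.initial)
        if left <= 0:
            return False
        self.remaining[stream_id] = left - 1
        return True

    def grant(self, stream_id, frames):
        """Top up a stream, for example after the handler drained its body."""
        left = self.remaining.get(stream_id, self.initial)
        self.remaining[stream_id] = left + frames

    def close(self, stream_id):
        """Forget a finished stream so the table does not grow without bound."""
        self.remaining.pop(stream_id, None)

credit = FrameCredit(initial_frames=2)
results = [credit.on_data_frame(1) for _ in range(3)]
print(results)
```

because the budget is spent per frame, a sender that splits a body into many tiny DATA frames exhausts it quickly, while a sender using full-size frames rarely touches it. byte-level flow control alone cannot make that distinction.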

reproducer

the full reproducer for this server is in the paper repo. the docker container pins hyper 1.5 / h2 0.4 and is constrained to a single cpu (--cpus=1). the prober script implements mode a (bridge-coalesced) and mode b (paced 100 µs) per the methodology section.

see the draft pdf for the full per-framework discussion.