chunkloris: kestrel
part of the chunkloris per-chunk amplification survey. this page is the per-server record for Kestrel under http/1.1 chunked transfer encoding.
at a glance
- server: Kestrel 9.0 (Microsoft.AspNetCore 9.0; bundled in mcr.microsoft.com/dotnet/aspnet:9.0)
- runtime: dotnet-9.0
- ecosystem: dotnet
- concurrency model: ThreadPool + async/await + System.IO.Pipelines
- parser: Http1ChunkedEncodingMessageBody (managed, in Kestrel.Core)
- delivery granularity: coalesced-by-pipe
- chunk-limit helper: none exposed by the framework
- verdict: batches correctly. the implementation coalesces wire units before waking the application, either via an explicit per-stream frame credit, a pipelined reader, or a similar batching primitive. mode b cpu cost is in the band you would expect from a per-recv() batched delivery.
- scaling exponent (mode a): 1.00 (wall time vs N, log-log slope across common cells)
measurements
all cells run on a 1-vcpu docker container. cpu cost is derived from the target container's cgroup v2 cpu.stat usage_usec delta around each cell.
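the usage_usec delta described above can be sketched as follows. this is a minimal illustration, not the survey's harness: the helper names (`parse_cpu_stat`, `read_usage_usec`, `cpu_usec_during`) and the `cgroup_dir` argument are hypothetical, and the location of the target container's cgroup directory depends on the host setup. only the `usage_usec` key of the cgroup v2 `cpu.stat` file is taken from the text.

```python
from pathlib import Path
from typing import Callable, Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def read_usage_usec(cgroup_dir: str) -> int:
    """Return cumulative cpu time (microseconds) consumed by the cgroup."""
    text = Path(cgroup_dir, "cpu.stat").read_text()
    return parse_cpu_stat(text)["usage_usec"]


def cpu_usec_during(cgroup_dir: str, run_cell: Callable[[], None]) -> int:
    """Sample usage_usec before and after one cell and return the delta."""
    before = read_usage_usec(cgroup_dir)
    run_cell()  # hypothetical callable that drives one measurement cell
    after = read_usage_usec(cgroup_dir)
    return after - before
```

because usage_usec is cumulative, the delta also includes any background work the container does during the cell, so an idle baseline may need to be subtracted.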
| mode | N | wall (s) | server cpu % | µs / chunk | basis | ok |
|---|---|---|---|---|---|---|
| A-bridge-coalesced | 50,000 | 0.072 | - | 1.440 | wall | - |
| A-bridge-coalesced | 100,000 | 0.124 | - | 1.240 | wall | - |
| A-bridge-coalesced | 250,000 | 0.321 | - | 1.280 | wall | - |
| B-paced-100us | 50,000 | 5.472 | - | 9.440 | server-cpu-overhead | - |
| B-paced-100us | 100,000 | 10.385 | - | 3.850 | server-cpu-overhead | - |
| B-paced-100us | 250,000 | 25.900 | - | 3.600 | server-cpu-overhead | - |
| CONTROL-content-length | ? | 0.001 | - | - | unknown | - |
| CONTROL-content-length | ? | 0.001 | - | - | unknown | - |
| CONTROL-content-length | ? | 0.001 | - | - | unknown | - |
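for the wall-basis rows, the per-chunk figure is consistent with wall time divided by N, and a log-log slope can be estimated from the same cells. the sketch below uses the three mode a rows from the table. the ordinary least-squares fit and the function names are assumptions: this page does not state the survey's fitting procedure or which cells count as "common", so the slope computed here need not equal the reported exponent.

```python
import math
from typing import List, Tuple

# (N, wall seconds) for the three mode a cells in the table above.
MODE_A: List[Tuple[int, float]] = [
    (50_000, 0.072),
    (100_000, 0.124),
    (250_000, 0.321),
]


def us_per_chunk(n: int, wall_s: float) -> float:
    """Wall-basis cost per chunk in microseconds."""
    return wall_s / n * 1e6


def loglog_slope(cells: List[Tuple[int, float]]) -> float:
    """Least-squares slope of log(wall) against log(N)."""
    xs = [math.log(n) for n, _ in cells]
    ys = [math.log(t) for _, t in cells]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    for n, wall in MODE_A:
        print(n, round(us_per_chunk(n, wall), 3))
    print("slope", round(loglog_slope(MODE_A), 2))
```

a slope near 1 means wall time grows roughly linearly with the number of chunks, which is the expected shape when per-chunk cost is constant.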
parser path: source citations
- chunked-decoder pump: src/Servers/Kestrel/Core/src/Internal/Http/Http1ChunkedEncodingMessageBody.cs, PumpAsync (~L104) -> Read (~L195) -> ParseChunkedPrefix (~L249) -> ReadChunkedData (~L381)
what this means
the implementation batches wire units before waking the application (either via an explicit per-stream frame credit, a pipelined reader, or an equivalent primitive). the cpu cost under paced mode b is in the band you would expect from a per-recv() batched delivery.
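to make the batching idea concrete, here is a toy decoder in python. it is not Kestrel's code and does not model System.IO.Pipelines; the function name `decode_available` is hypothetical. it only illustrates the principle: every complete chunk already in the receive buffer is decoded in one pass, and the application gets a single coalesced payload per read instead of one wake-up per chunk. trailer fields are not handled in this sketch.

```python
from typing import Tuple


def decode_available(buf: bytes) -> Tuple[bytes, int, bool]:
    """Decode every complete chunk in buf.

    Returns (payload, consumed, finished): the coalesced chunk data, the
    number of bytes of buf that were fully decoded, and whether the
    terminating zero-size chunk was seen.
    """
    out = bytearray()
    pos = 0
    while True:
        eol = buf.find(b"\r\n", pos)
        if eol < 0:
            return bytes(out), pos, False  # chunk-size line incomplete
        # Chunk extensions after ';' are ignored.
        size = int(buf[pos:eol].split(b";", 1)[0], 16)
        data_start = eol + 2
        if size == 0:
            if len(buf) < data_start + 2:
                return bytes(out), pos, False  # final CRLF not here yet
            if buf[data_start:data_start + 2] != b"\r\n":
                raise ValueError("trailer fields are not supported here")
            return bytes(out), data_start + 2, True
        data_end = data_start + size
        if len(buf) < data_end + 2:
            return bytes(out), pos, False  # chunk data incomplete
        if buf[data_end:data_end + 2] != b"\r\n":
            raise ValueError("missing CRLF after chunk data")
        out += buf[data_start:data_end]
        pos = data_end + 2
```

with three one-byte chunks in the buffer, a single call returns all three bytes at once; the caller keeps the undecoded tail (`buf[consumed:]`) and retries when more bytes arrive.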
what to do today
- this server already batches application delivery. measure parser cpu separately under paced mode b if cpu cost matters for your deployment; the application boundary is decoupled from the wire framing rate.
reproducer
the full reproducer for this server is in the paper repo. the docker container pins Kestrel 9.0 (Microsoft.AspNetCore 9.0; bundled in mcr.microsoft.com/dotnet/aspnet:9.0) and constrains the test container to a single cpu (--cpus=1). the prober script implements mode a (bridge-coalesced) and mode b (paced 100 µs) per the methodology section.
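the actual prober lives in the paper repo and is not reproduced here. as a rough sketch of what the two modes could look like on the wire, the python below frames N chunks and sends them either in one coalesced write or one chunk per write with a 100 µs gap. everything in it is an assumption for illustration: the function names, the one-byte payload, the POST request line, and the reading of "bridge-coalesced" as "all chunks in a single send".

```python
import socket
import time

LAST_CHUNK = b"0\r\n\r\n"  # zero-size chunk terminating the body


def request_head(host: str, path: str = "/") -> bytes:
    """Request line and headers for a chunked upload (hypothetical target)."""
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")


def frame_chunk(payload: bytes) -> bytes:
    """Frame one chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n%s\r\n" % (len(payload), payload)


def send_coalesced(sock, n: int, payload: bytes = b"x") -> None:
    """Assumed mode a: all n chunks in one buffer and one send call."""
    sock.sendall(frame_chunk(payload) * n + LAST_CHUNK)


def send_paced(sock, n: int, payload: bytes = b"x", gap_s: float = 100e-6) -> None:
    """Assumed mode b: one chunk per send, roughly gap_s apart."""
    chunk = frame_chunk(payload)
    for _ in range(n):
        sock.sendall(chunk)
        time.sleep(gap_s)  # sleep granularity is OS-dependent
    sock.sendall(LAST_CHUNK)


def probe(host: str, port: int, n: int, paced: bool) -> float:
    """Send one request and return the elapsed wall time in seconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request_head(host))
        (send_paced if paced else send_coalesced)(sock, n)
        while sock.recv(65536):  # drain the response until the server closes
            pass
    return time.perf_counter() - start
```

`time.sleep` rarely honors a 100 µs request exactly, so a real prober would likely need a tighter pacing loop; `TCP_NODELAY` may also be needed to stop the kernel from merging small writes.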
see the draft pdf for the full per-framework discussion.