chunkloris: kestrel

part of the chunkloris per-chunk amplification survey. this page is the per-server record for Kestrel under http/1.1 chunked transfer encoding.

at a glance

  • server: Kestrel 9.0 (Microsoft.AspNetCore 9.0; bundled in mcr.microsoft.com/dotnet/aspnet:9.0)
  • runtime: dotnet-9.0
  • ecosystem: dotnet
  • concurrency model: ThreadPool + async/await + System.IO.Pipelines
  • parser: Http1ChunkedEncodingMessageBody (managed, in Kestrel.Core)
  • delivery granularity: coalesced-by-pipe
  • chunk-limit helper: none exposed by the framework
  • verdict: batches correctly. the implementation coalesces wire units before waking the application; for Kestrel the batching primitive is the System.IO.Pipelines reader (delivery granularity: coalesced-by-pipe). mode b cpu cost is in the band you would expect from batched, per-recv() delivery.
  • scaling exponent (mode a): 1.00 (wall time vs N, log-log slope across common cells)
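
the scaling exponent can be read as the least-squares slope of log(wall time) against log(N); a slope near 1 means wall time grows linearly with the number of chunks. the sketch below fits that slope to the three mode a rows of the measurements table. the function name loglog_slope and the choice of ordinary least squares are assumptions, and the survey's figure is taken across its common cells, so these three rows alone need not reproduce it exactly.

```python
import math

def loglog_slope(points):
    """Ordinary least-squares slope of log(y) against log(x)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# (N, wall seconds) for the mode a rows in the measurements table
MODE_A = [(50_000, 0.072), (100_000, 0.124), (250_000, 0.321)]

if __name__ == "__main__":
    print(f"mode a log-log slope: {loglog_slope(MODE_A):.2f}")
```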

measurements

all cells run on a 1-vcpu docker container. cpu cost is derived from the target container’s cgroup v2 cpu.stat usage_usec delta around each cell.
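
the cpu.stat sampling described above can be sketched as follows. cgroup v2 reports usage_usec as a "key value" line in cpu.stat; the helper names, the path argument, and the run_cell callback are assumptions for illustration, not the survey's harness.

```python
from pathlib import Path

def parse_cpu_stat(text):
    """Parse cgroup v2 cpu.stat content ('key value' per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats

def read_usage_usec(cpu_stat_path):
    """Return the cgroup's cumulative cpu time in microseconds."""
    return parse_cpu_stat(Path(cpu_stat_path).read_text())["usage_usec"]

def cpu_usec_around(cpu_stat_path, run_cell):
    """Run one measurement cell and return the usage_usec delta across it."""
    before = read_usage_usec(cpu_stat_path)
    run_cell()
    return read_usage_usec(cpu_stat_path) - before
```

cpu_stat_path would point at the cpu.stat file in the target container's cgroup directory under the cgroup v2 mount (commonly /sys/fs/cgroup); the exact location depends on the docker cgroup driver in use.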

mode                    N        wall (s)  server cpu %  µs / chunk  basis                ok
A-bridge-coalesced      50,000   0.072     -             1.440       wall                 ✓
A-bridge-coalesced      100,000  0.124     -             1.240       wall                 ✓
A-bridge-coalesced      250,000  0.321     -             1.280       wall                 ✓
B-paced-100us           50,000   5.472     -             9.440       server-cpu-overhead  ✓
B-paced-100us           100,000  10.385    -             3.850       server-cpu-overhead  ✓
B-paced-100us           250,000  25.900    -             3.600       server-cpu-overhead  ✓
CONTROL-content-length  ?        0.001     -             -           unknown              ✓
CONTROL-content-length  ?        0.001     -             -           unknown              ✓
CONTROL-content-length  ?        0.001     -             -           unknown              ✓
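
the µs / chunk column can be recomputed from the other columns. for the wall basis it is wall time divided by N. for the server-cpu-overhead basis, the published figures are consistent with wall time minus the ideal pacing time (N × 100 µs), divided by N; that reading is inferred from the numbers in the table rather than stated by the survey, and the function names below are hypothetical. rounded to two decimals, both formulas match the table rows.

```python
PACING_S = 100e-6  # mode b pacing interval: 100 µs per chunk

def us_per_chunk_wall(n, wall_s):
    """Per-chunk cost on the wall basis, in microseconds."""
    return wall_s / n * 1e6

def us_per_chunk_overhead(n, wall_s, pacing_s=PACING_S):
    """Per-chunk cost after subtracting the ideal pacing time (assumed reading)."""
    return (wall_s - n * pacing_s) / n * 1e6

if __name__ == "__main__":
    for n, wall in [(50_000, 0.072), (100_000, 0.124), (250_000, 0.321)]:
        print(f"mode a  N={n:>7}  {us_per_chunk_wall(n, wall):.2f} µs/chunk")
    for n, wall in [(50_000, 5.472), (100_000, 10.385), (250_000, 25.900)]:
        print(f"mode b  N={n:>7}  {us_per_chunk_overhead(n, wall):.2f} µs/chunk")
```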

parser path: source citations

  • chunked-decoder pump: src/Servers/Kestrel/Core/src/Internal/Http/Http1ChunkedEncodingMessageBody.cs, PumpAsync (~L104) -> Read (~L195) -> ParseChunkedPrefix (~L249) -> ReadChunkedData (~L381)

what this means

the implementation batches wire units before waking the application. for Kestrel the batching primitive is the System.IO.Pipelines reader (delivery granularity: coalesced-by-pipe), so application reads are not tied one-to-one to chunks on the wire. the cpu cost under paced mode b is in the band you would expect from batched, per-recv() delivery.
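
to illustrate what coalesced delivery means, the sketch below decodes every complete chunk present in a receive buffer and hands back one combined payload per read, rather than one wakeup per chunk. it is a simplified python illustration, not Kestrel's C# implementation: decode_available is a hypothetical name, chunk extensions are skipped, and trailers are not handled.

```python
def decode_available(buf):
    """Decode all complete chunks currently in buf.

    Returns (payload, remaining, done): the coalesced payload bytes, the
    unconsumed tail of buf (a partial chunk, if any), and whether the
    terminating zero-length chunk was seen.
    """
    payload = bytearray()
    pos = 0
    while True:
        eol = buf.find(b"\r\n", pos)
        if eol == -1:
            break  # size line not fully received yet
        size_field = buf[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field, 16)  # chunk size is hexadecimal
        data_start = eol + 2
        data_end = data_start + size
        if len(buf) < data_end + 2:
            break  # chunk data or its trailing CRLF not fully received
        if buf[data_end:data_end + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        if size == 0:
            return bytes(payload), buf[data_end + 2:], True
        payload += buf[data_start:data_end]
        pos = data_end + 2
    return bytes(payload), buf[pos:], False
```

a caller would append each recv() result to the remaining bytes and call decode_available once, so three one-byte chunks arriving in a single read produce one three-byte delivery.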

what to do today

  • this server already batches application delivery. measure parser cpu separately under paced mode b if cpu cost matters for your deployment; the application boundary is decoupled from the wire framing rate.

reproducer

the full reproducer for this server is in the paper repo. the docker container pins Kestrel 9.0 (Microsoft.AspNetCore 9.0; bundled in mcr.microsoft.com/dotnet/aspnet:9.0) and constrains the test container to a single cpu (--cpus=1). the prober script implements mode a (bridge-coalesced) and mode b (paced 100 µs) per the methodology section.
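
the two probing modes differ only in how the same chunk stream is written to the socket. the sketch below shows one way to express that difference. it is not the paper's prober: the function names are hypothetical, and it assumes that bridge-coalesced means handing all chunks to the socket in a single write, while paced means one chunk per write with a 100 µs gap.

```python
import time

LAST_CHUNK = b"0\r\n\r\n"  # zero-length chunk with no trailers

def encode_chunk(data):
    """Frame data as one http/1.1 chunk: hex size, CRLF, data, CRLF."""
    return b"%x\r\n" % len(data) + data + b"\r\n"

def chunk_stream(n, payload=b"x"):
    """Yield n encoded chunks carrying the same payload."""
    for _ in range(n):
        yield encode_chunk(payload)

def send_coalesced(sock, n):
    """Assumed mode a: all chunks in one write."""
    sock.sendall(b"".join(chunk_stream(n)) + LAST_CHUNK)

def send_paced(sock, n, gap_s=100e-6):
    """Assumed mode b: one chunk per write, gap_s seconds apart."""
    for chunk in chunk_stream(n):
        sock.sendall(chunk)
        time.sleep(gap_s)
    sock.sendall(LAST_CHUNK)
```

both helpers expect the request line and headers (including Transfer-Encoding: chunked) to have been sent already. time.sleep is coarse at 100 µs on many systems, so a prober that needs accurate pacing may have to busy-wait on a monotonic clock instead.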

see the draft pdf for the full per-framework discussion.
