Treni

Leaderboard

Side-by-side benchmark results for T4 and G5 experiment sets.

Lower time is better.

G5 Foundation (Canonical)

| Metric | Value |
| --- | --- |
| Baseline pipeline mean | 2407.974 ms |
| Runtime warm request mean (3-run) | 82.707 ms |
| Runtime warm request p99 (3-run) | 91.738 ms |
| Baseline/runtime ratio (pipeline) | 29.11x |
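
The pipeline ratio can be rechecked from the two means above. This is a minimal sketch that assumes the ratio is the baseline pipeline mean divided by the runtime warm request mean; the variable names are illustrative and not taken from the benchmark code.

```python
# Values copied from the G5 Foundation figures above (milliseconds).
baseline_pipeline_mean_ms = 2407.974
runtime_warm_mean_ms = 82.707

# Assumed definition: ratio = baseline mean / runtime warm mean.
ratio = baseline_pipeline_mean_ms / runtime_warm_mean_ms
print(f"{ratio:.2f}x")  # prints "29.11x"
```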

G5 Cold First-Hit — True TTFT (3-run Means, 2026-02-17)

| Model | Pre-fix TTFT | Post-fix TTFT | Speedup |
| --- | --- | --- | --- |
| qwen | 27574.564 ms | 1774.951 ms | 15.535x |
| donut | 67360.388 ms | 572.485 ms | 117.663x |
| bart | 77520.798 ms | 743.652 ms | 104.243x |
| minilm | 23.342 ms | 22.698 ms | 1.028x |

All values above are runtime-instrumented timing.ttft_ms.
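
The speedup figures appear to be the pre-fix TTFT divided by the post-fix TTFT. The sketch below recomputes them under that assumption. The `speedup` helper and the dictionary layout are hypothetical, written only for this check.

```python
# (pre-fix TTFT ms, post-fix TTFT ms) copied from the figures above.
ttft_ms = {
    "qwen": (27574.564, 1774.951),
    "donut": (67360.388, 572.485),
    "bart": (77520.798, 743.652),
    "minilm": (23.342, 22.698),
}


def speedup(pre_ms: float, post_ms: float) -> float:
    """Assumed definition: how many times faster the post-fix first hit is."""
    return pre_ms / post_ms


# Round to three decimals to match the precision reported on this page.
speedups = {
    model: round(speedup(pre, post), 3)
    for model, (pre, post) in ttft_ms.items()
}

for model, value in speedups.items():
    print(f"{model}: {value:.3f}x")
```

Under this definition the recomputed values agree with the reported speedups to three decimals. minilm was already fast before the fix, so it barely changes.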

Internal vs External Routing (G5, 2026-02-17)

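| Metric | Internal | External | Ratio |
| --- | --- | --- | --- |
| Mean latency | 94.849 ms | 97.927 ms | 1.032x |

| Task | Internal | External |
| --- | --- | --- |
| general_short | 150.767 ms | 152.274 ms |
| receipt_extract | 80.732 ms | 81.270 ms |
| search_grounded | 46.945 ms | 57.237 ms |
| summarize_short | 100.950 ms | 100.928 ms |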

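Internal routing is faster on the mean and on three of the four tasks. summarize_short is effectively a tie, with external ahead by 0.022 ms. The ratio is external latency divided by internal latency, so a ratio > 1 means internal wins.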
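
The per-task comparison can be rechecked directly from the latencies above. This sketch assumes the ratio is external latency divided by internal latency and that the reported mean is the unweighted average of the four tasks. All names in it are illustrative.

```python
# (internal ms, external ms) per task, copied from the figures above.
latency_ms = {
    "general_short": (150.767, 152.274),
    "receipt_extract": (80.732, 81.270),
    "search_grounded": (46.945, 57.237),
    "summarize_short": (100.950, 100.928),
}

# Assumed definition: ratio = external / internal, so > 1 means internal is faster.
ratios = {task: ext / internal for task, (internal, ext) in latency_ms.items()}
internal_wins = [task for task, r in ratios.items() if r > 1.0]

# Unweighted means across the four tasks.
internal_mean = sum(i for i, _ in latency_ms.values()) / len(latency_ms)
external_mean = sum(e for _, e in latency_ms.values()) / len(latency_ms)
mean_ratio = external_mean / internal_mean

for task, r in ratios.items():
    winner = "internal" if r > 1.0 else "external"
    print(f"{task}: {r:.4f}x ({winner} faster)")
print(f"mean ratio: {mean_ratio:.3f}x")
```

Running this shows internal ahead on general_short, receipt_extract, and search_grounded, and external marginally ahead on summarize_short. Most of the gap in the mean comes from search_grounded.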

Historical Legacy Mixed-Mode Context

| Set | Runtime HTTP request mean | Runtime HTTP request p99 |
| --- | --- | --- |
| T4 (2026-02-15) | 146279.609 ms | 156769.1 ms |
| G5 (2026-02-15) | 77449.605 ms | 83346.187 ms |
| G5 registry-cached single run (2026-02-16) | 82.913 ms | 91.877 ms |

Parity Health

| Set | Checked | Failed | Strict |
| --- | --- | --- | --- |
| T4 | 3 | 0 | true |
| G5 | 3 | 0 | true |
