Treni Experiment Docs

TODO

Live execution checklist and next actions.

Priority Order

Current Checklist

Track A: Cold/Hot Foundations

True TTFT instrumentation in runtime request path.
3x cold-first-hit repeatability set (G5).
3x warm steady-state repeatability set (G5).
Cold bottleneck fix: per-model tensor lookup index cache.
Cold rerun after fix with artifact pack.
Add stage-level cold decomposition metrics (tokenizer load, index build, tensor upload, first decode step).
Optimize remaining Qwen cold-first-hit stages after decomposition.
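As an illustration of the stage-level cold decomposition item above, here is a minimal sketch of a per-stage timer. It is not the Treni runtime's instrumentation: `StageTimer` and the stage names used with it are hypothetical, and it assumes each stage can be wrapped on the host side with a wall-clock timer.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self.durations_ms = {}

    @contextmanager
    def stage(self, name):
        start = self._clock()
        try:
            yield
        finally:
            elapsed_ms = (self._clock() - start) * 1000.0
            # Repeated stages with the same name are summed.
            self.durations_ms[name] = self.durations_ms.get(name, 0.0) + elapsed_ms

    def total_ms(self):
        return sum(self.durations_ms.values())

    def shares(self):
        """Fraction of total measured time spent in each stage."""
        total = self.total_ms()
        if total == 0:
            return {name: 0.0 for name in self.durations_ms}
        return {name: ms / total for name, ms in self.durations_ms.items()}
```

Each cold-path stage (tokenizer load, index build, tensor upload, first decode step) would be wrapped in `with timer.stage("..."):`. If a stage launches asynchronous GPU work, the device would need to be synchronized before the block exits, otherwise the host-side time under-reports the stage.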

Track B: Internal vs External Routing

Minimal external baseline harness.
Matched task set and budgets.
Internal vs external run and report (G5).
Add explicit failure-amplification tests (timeouts/retries under load).
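One possible shape for the failure-amplification tests is sketched below. It assumes "amplification" means the ratio of attempts actually issued to logical requests once timeouts trigger retries; that definition, and the names `call_with_retries`, `amplification_factor` and `flaky`, are illustrative assumptions and not part of the existing harness.

```python
def call_with_retries(call, max_retries):
    """Invoke `call`, retrying on TimeoutError. Returns (succeeded, attempts)."""
    attempts = 0
    for _ in range(max_retries + 1):
        attempts += 1
        try:
            call()
            return True, attempts
        except TimeoutError:
            continue
    return False, attempts


def amplification_factor(calls, max_retries):
    """Return (attempts per logical request, number of requests that failed)."""
    if not calls:
        return 0.0, 0
    total_attempts = 0
    failures = 0
    for call in calls:
        ok, attempts = call_with_retries(call, max_retries)
        total_attempts += attempts
        if not ok:
            failures += 1
    return total_attempts / len(calls), failures


def flaky(fail_times):
    """Build a simulated call that times out `fail_times` times, then succeeds."""
    state = {"left": fail_times}

    def call():
        if state["left"] > 0:
            state["left"] -= 1
            raise TimeoutError("simulated timeout")

    return call
```

A factor of 1.0 means no retries were needed. Values above 1.0 indicate extra load generated by the retry policy itself, which is the effect such a test would compare between the internal and external paths.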

Track C: Agentic Loop Capability

Freeze 3 loop scenarios and success criteria.
Implement evaluators (success rate + steps-to-convergence).
Run internal vs external loop benchmark.
Publish trace-backed capability report.
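The two evaluator metrics named above (success rate and steps-to-convergence) could be computed along the following lines. This is a sketch under assumptions: the `LoopRun` record and its fields are hypothetical, and steps-to-convergence is taken here as the mean step count over successful runs only.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class LoopRun:
    """One agentic-loop run (hypothetical record shape)."""
    scenario: str
    succeeded: bool  # whether the scenario's success criteria were met
    steps: int       # loop steps taken before the run stopped


def success_rate(runs: List[LoopRun]) -> float:
    """Fraction of runs that met their success criteria."""
    if not runs:
        return 0.0
    return sum(1 for r in runs if r.succeeded) / len(runs)


def mean_steps_to_convergence(runs: List[LoopRun]) -> Optional[float]:
    """Mean step count over successful runs, or None if none succeeded."""
    converged = [r.steps for r in runs if r.succeeded]
    return mean(converged) if converged else None
```

Failed runs are excluded from the step metric so that runs that hit a step cap do not distort it. Reporting both numbers side by side keeps that exclusion visible.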

Expansion

Full A100 run set.
Full H100 run set.
Paper-grade figure/table package.

Immediate Next Actions

Implement stage-level cold instrumentation for Qwen/BART/Donut.
Produce a short cold-stage breakdown report from 3-run measurements.
Patch next cold bottleneck and rerun cold validation set.
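For the cold-stage breakdown report, a small aggregation over the per-run measurements may be enough. The sketch below assumes each run is available as a mapping from stage name to milliseconds; that input shape and the function name `summarize_cold_stages` are assumptions, not the format of the existing artifacts.

```python
from statistics import mean


def summarize_cold_stages(runs):
    """Aggregate per-stage timings across runs, slowest stage first.

    `runs` is a list of dicts mapping stage name -> duration in ms
    (assumed shape). Stages missing from a run are skipped for that run.
    """
    stages = sorted({stage for run in runs for stage in run})
    rows = []
    for stage in stages:
        values = [run[stage] for run in runs if stage in run]
        rows.append({
            "stage": stage,
            "mean_ms": mean(values),
            "min_ms": min(values),
            "max_ms": max(values),
            "n": len(values),
        })
    # Sort so the largest remaining bottleneck appears at the top.
    rows.sort(key=lambda row: row["mean_ms"], reverse=True)
    return rows
```

With only three runs, showing min and max next to the mean makes run-to-run spread visible without implying more statistical weight than the sample supports.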
