Traceway

Open-source observability for LLM applications. Record every call, debug failures, build test datasets from production data.

Traceway records traces and spans from your LLM application, so you can see exactly what happened on every request: which model was called, what the input was, what came back, how long it took, and how much it cost.

It runs as a single Rust binary. You can self-host it locally with SQLite, or deploy it to the cloud with Turbopuffer and Postgres.

Why Traceway

LLM applications are hard to debug. A single user request might chain multiple model calls, retrieve documents, invoke tools, and format the final response. When something goes wrong — a hallucination, a slow response, an unexpected refusal — you need to see the full picture.

Traceway gives you that picture. Every step in your pipeline becomes a span, organized into a trace. You see the exact prompts sent, the exact completions returned, the token counts, the latency, and the cost. No sampling, no aggregation — every request, recorded in full.

What you get

Traces and spans

Every LLM call, tool invocation, retrieval step, and custom operation recorded as a structured span inside a trace. Spans form a tree, so you can see parent-child relationships between steps.
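The parent-child relationship can be modeled as a simple tree keyed by span ID. A minimal sketch in Python (the field names here are illustrative, not Traceway's actual span schema):

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Span:
    # Illustrative span record; Traceway's real schema may differ.
    span_id: str
    name: str
    parent_id: Optional[str] = None
    children: list[Span] = field(default_factory=list)

def build_tree(spans: list[Span]) -> Span:
    """Link spans into a tree by parent_id and return the root span."""
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in spans:
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    return root

# One trace: a request that retrieves docs, calls a model, and uses a tool.
trace = build_tree([
    Span("1", "handle_request"),
    Span("2", "retrieve_docs", parent_id="1"),
    Span("3", "llm_call", parent_id="1"),
    Span("4", "tool:search", parent_id="3"),
])
```

Rendering this tree depth-first gives the step-by-step view the dashboard shows for a single request.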

Cost and latency tracking

Token counts and estimated costs per span, rolled up per trace. Traceway ships with a pricing table covering 50+ models across OpenAI, Anthropic, and other providers. Latency is recorded per-span with millisecond precision.
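The roll-up itself is simple arithmetic: per-span cost from token counts and per-model rates, summed over the trace. A sketch with made-up per-million-token prices (not Traceway's actual pricing table):

```python
# Illustrative per-1M-token prices in USD; not Traceway's shipped table.
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-sonnet": {"input": 3.00, "output": 15.00},
}

def span_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of one span from its token counts."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

def trace_cost(spans) -> float:
    """Roll span costs up to the trace level."""
    return sum(span_cost(*s) for s in spans)

# A trace with two model calls: (model, input_tokens, output_tokens)
total = trace_cost([
    ("gpt-4o", 1200, 300),
    ("claude-sonnet", 800, 200),
])
```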

Datasets

Collect input/output pairs from production spans into datasets. Use them as regression test suites. Supports two kinds of datapoints: generic key-value pairs and LLM conversation threads.
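The two datapoint kinds can be pictured as plain records (illustrative shapes only, not Traceway's actual schema):

```python
# Generic key-value datapoint: arbitrary input and expected output.
kv_datapoint = {
    "input": {"question": "What is the refund window?"},
    "output": {"answer": "30 days"},
}

# LLM conversation thread datapoint: an ordered message list.
thread_datapoint = {
    "messages": [
        {"role": "user", "content": "Summarize this support ticket."},
        {"role": "assistant", "content": "Customer requests a refund."},
    ]
}
```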

Evaluations

Run your dataset through a model configuration and score the results. Four scoring strategies: exact match, substring contains, LLM-as-judge, or no scoring (manual review). Compare multiple eval runs side-by-side to measure the impact of prompt or model changes.
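The two deterministic strategies are easy to sketch; LLM-as-judge needs a model call and "no scoring" leaves review to a human, so both are omitted here. A minimal scorer (function name and signature are illustrative, not Traceway's API):

```python
def score(strategy: str, expected: str, actual: str) -> float:
    """Score one eval result: 1.0 for a pass, 0.0 for a fail."""
    if strategy == "exact":
        # Exact match: the completion must equal the expected output.
        return 1.0 if actual == expected else 0.0
    if strategy == "contains":
        # Substring contains: the expected string must appear anywhere.
        return 1.0 if expected in actual else 0.0
    raise ValueError(f"unknown strategy: {strategy}")
```

Averaging these per-datapoint scores across a run gives a single number to compare between eval runs.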

Capture rules

Automatically save spans to a dataset when they match a filter. Set rules like "save every gpt-4o span that costs more than $0.01" with configurable sample rates. This lets you build datasets from production traffic without manual effort.
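The example rule above boils down to a predicate plus a sample rate. A sketch of that logic (rule and span shapes are illustrative, not Traceway's config format):

```python
import random

def matches(rule: dict, span: dict) -> bool:
    """Filter: does this span satisfy the capture rule?"""
    return span["model"] == rule["model"] and span["cost"] > rule["min_cost"]

def should_capture(rule: dict, span: dict, rng=random.random) -> bool:
    """Apply the filter, then down-sample by the rule's sample rate."""
    return matches(rule, span) and rng() < rule["sample_rate"]

# "Save every gpt-4o span costing more than $0.01", keeping 25% of matches.
rule = {"model": "gpt-4o", "min_cost": 0.01, "sample_rate": 0.25}
```

The injectable `rng` is just for testability; in production the sample draw would be random per span.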

Review queue

Human-in-the-loop workflow for labeling, correcting, and reviewing datapoints. Enqueue items, claim them, edit the data, and submit. Useful for building golden datasets and reviewing edge cases.

Real-time events

The dashboard updates live as spans come in via Server-Sent Events. No polling. You can also subscribe to events programmatically from your own code.
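An SSE stream is plain text: each event is a block of `data:` lines, and events are separated by a blank line. A minimal client-side parser sketch (the payload shape shown is illustrative, not Traceway's documented event format):

```python
import json

def parse_sse(stream: str) -> list[dict]:
    """Extract JSON payloads from a raw Server-Sent Events stream."""
    events = []
    for block in stream.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events

# Two events as they would arrive on the wire.
raw = (
    'data: {"type": "span.created", "span_id": "abc"}\n\n'
    'data: {"type": "trace.completed", "trace_id": "xyz"}\n\n'
)
```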

Proxy

An optional transparent HTTP proxy that sits in front of your LLM provider. Point your OpenAI base URL at the proxy and it records spans automatically — no SDK integration needed.
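Assuming the proxy mirrors OpenAI's path layout on its port (an assumption; confirm against the proxy docs), switching an app over is a one-line base-URL change. The official OpenAI SDKs read the standard `OPENAI_BASE_URL` environment variable:

```shell
# Route OpenAI traffic through the Traceway proxy instead of api.openai.com.
# Assumes the proxy is running locally on its default port, 3001.
export OPENAI_BASE_URL="http://localhost:3001/v1"
```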

Architecture

Your app  ──SDK──>  Traceway API  ──>  Storage (SQLite or Turbopuffer)
                         ▲
                         │
                   Dashboard UI
               (platform.traceway.ai)
  • API server — Rust binary, runs on port 3000. Handles trace/span ingestion, dataset CRUD, eval execution, and real-time events.
  • Proxy — Optional. Sits in front of your LLM provider (port 3001) and automatically records spans. You point your OpenAI base URL at it instead of api.openai.com.
  • Dashboard — SvelteKit SPA at platform.traceway.ai. Shows traces, spans, datasets, eval results, analytics, and settings.
  • Storage — SQLite for local dev, Turbopuffer for cloud. Auth data lives in Postgres (cloud only).
