# Session Replay
Group traces by session ID to replay multi-turn conversations and track user journeys.
## What are sessions?
A session groups multiple traces that belong to the same user interaction. When a user has a multi-turn conversation with your LLM application, each turn typically creates a separate trace. By tagging those traces with the same sessionId, Traceway links them together so you can replay the entire conversation in order.
Sessions are not a separate data model — they're a view over traces that share a sessionId. Any trace can optionally belong to a session, and a session is created implicitly the first time a trace with that sessionId is recorded.
## Setting session IDs in the SDK
Pass a sessionId when creating a trace:
```typescript
import { Traceway } from 'traceway';

const tw = new Traceway();

// Each turn in the conversation uses the same sessionId
const sessionId = 'session_abc123';

const reply1 = await tw.trace('chat-turn', async (ctx) => {
  const messages = [{ role: 'user', content: 'What is Kubernetes?' }];
  const response = await ctx.llmCall('gpt-4o', {
    model: 'gpt-4o',
    provider: 'openai',
    input: messages,
  }, async (span) => {
    const result = await callLLM(messages);
    span.setOutput(result);
    return result;
  });
  return response;
}, { sessionId });

// Later, the user asks a follow-up
const reply2 = await tw.trace('chat-turn', async (ctx) => {
  const messages = [
    { role: 'user', content: 'What is Kubernetes?' },
    { role: 'assistant', content: reply1 },
    { role: 'user', content: 'How does it compare to Docker Swarm?' },
  ];
  const response = await ctx.llmCall('gpt-4o', {
    model: 'gpt-4o',
    provider: 'openai',
    input: messages,
  }, async (span) => {
    const result = await callLLM(messages);
    span.setOutput(result);
    return result;
  });
  return response;
}, { sessionId });
```

With the low-level API:
```typescript
const trace = await tw.createTrace('chat-turn', {
  sessionId: 'session_abc123',
});
```

Use a stable, unique identifier for session IDs. UUIDs, database session tokens, or user_id:timestamp composites all work well. The only requirement is consistency: every trace in the same conversation must use the exact same string.
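As a sketch, a composite session ID might be assembled from stable parts. The makeSessionId helper below is illustrative, not part of the SDK:

```typescript
// Build a session ID from a user ID and the conversation start time.
// The exact scheme is up to you; consistency is the only requirement.
function makeSessionId(userId: string, startedAt: Date): string {
  return `${userId}:${startedAt.getTime()}`;
}

// Every trace in this conversation must reuse this exact string.
const sessionId = makeSessionId('user_42', new Date(1700000000000));
// → "user_42:1700000000000"
```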
## Viewing sessions in the dashboard
The Sessions page in the Traceway dashboard groups traces by their sessionId. Each row shows a session with its aggregate metrics and the number of traces (turns) it contains.
Click a session to see all of its traces in chronological order. The session detail view shows:
- The full sequence of traces, ordered by started_at
- Each trace's spans, expandable inline
- The input and output of every span, so you can read the conversation turn by turn
This makes it straightforward to replay the exact back-and-forth a user had with your application.
## Aggregate metrics
Traceway computes the following metrics at the session level by rolling up data from all traces in the session:
| Metric | Description |
|---|---|
| Total tokens | Sum of input_tokens + output_tokens across all spans in all traces |
| Total cost | Sum of estimated cost across all spans |
| Span count | Total number of spans across all traces |
| Trace count | Number of turns (traces) in the session |
| Duration | Time from the first trace's started_at to the last trace's ended_at |
| Status | Failed if any trace in the session contains a failed span |
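A rough sketch of this rollup, using simplified trace and span shapes (the field names here are illustrative, not Traceway's actual schema):

```typescript
// Simplified shapes for illustration only.
interface Span { inputTokens: number; outputTokens: number; cost: number; failed: boolean }
interface Trace { startedAt: number; endedAt: number; spans: Span[] }

function rollUp(traces: Trace[]) {
  const spans = traces.flatMap((t) => t.spans);
  return {
    totalTokens: spans.reduce((n, s) => n + s.inputTokens + s.outputTokens, 0),
    totalCost: spans.reduce((n, s) => n + s.cost, 0),
    spanCount: spans.length,
    traceCount: traces.length,
    // First trace's startedAt to last trace's endedAt
    durationMs: Math.max(...traces.map((t) => t.endedAt)) -
                Math.min(...traces.map((t) => t.startedAt)),
    // Failed if any span in any trace failed
    status: spans.some((s) => s.failed) ? 'failed' : 'ok',
  };
}
```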
These metrics help you understand the total resource consumption of a user journey, not just individual requests.
## Multi-turn conversation replay
The session replay view arranges traces chronologically, letting you read through a conversation the same way a user experienced it. For each turn you see:
- The user's input — the messages sent to the LLM
- The model's response — the completion output
- Intermediate steps — tool calls, retrieval spans, or any custom spans within that turn
- Timing and cost — how long each turn took and what it cost
This is particularly useful for agent-style applications where each turn may involve multiple LLM calls, tool invocations, and branching logic. The session view flattens this into a readable timeline.
## Use cases
### Debugging conversation flows
When a user reports that the assistant gave a wrong answer, you can look up their session and trace through the entire conversation to find where things went wrong. Was the context window missing relevant history? Did a tool call return bad data? Did the system prompt change between turns?
### Tracking user journeys
Session metrics show you how users actually interact with your application. You can identify patterns like:
- Average number of turns per session
- Sessions where cost spikes unexpectedly
- Drop-off points where users stop engaging
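For instance, with session summaries exported from the dashboard or API (the record shape here is assumed), the first two patterns reduce to simple aggregations:

```typescript
// Hypothetical exported session summaries.
const sessions = [
  { id: 's1', traceCount: 3, totalCost: 0.12 },
  { id: 's2', traceCount: 8, totalCost: 2.4 },
  { id: 's3', traceCount: 1, totalCost: 0.05 },
];

// Average number of turns per session
const avgTurns = sessions.reduce((n, s) => n + s.traceCount, 0) / sessions.length;

// Sessions where cost spikes above a threshold
const expensive = sessions.filter((s) => s.totalCost > 1);
```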
### Identifying failure patterns
Filter sessions by status to find conversations that contain failed spans. Common patterns include:
- Rate limit errors mid-conversation
- Context window overflow on later turns
- Tool calls that fail after the model generates malformed arguments
Sessions with many turns can accumulate large context windows. Monitor input_tokens across turns to catch conversations approaching model context limits before they cause failures.
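A minimal sketch of such a check, assuming you track input_tokens per turn yourself (the 128k limit and 80% threshold are illustrative, not Traceway defaults):

```typescript
// Context limit for the model in use; adjust to your model.
const CONTEXT_LIMIT = 128_000;

// Returns true when the latest turn's input is nearing the context limit.
function nearingLimit(inputTokensPerTurn: number[], threshold = 0.8): boolean {
  const latest = inputTokensPerTurn[inputTokensPerTurn.length - 1];
  return latest >= CONTEXT_LIMIT * threshold;
}
```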
## Filtering sessions
You can filter the sessions list by:
- Date range — find sessions from a specific time period
- Status — show only sessions with failures
- Cost threshold — find expensive sessions
- Tags — filter by any tags applied to the traces in the session
Combine these filters to answer questions like "which sessions in the last week cost more than $1?" or "which failed sessions involved the gpt-4o model?"
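If you export session records for your own analysis, combining the same criteria can be sketched like this (the record shape and filterSessions helper are illustrative, not a Traceway API):

```typescript
// Hypothetical exported session records.
interface SessionRow {
  startedAt: number;
  status: 'ok' | 'failed';
  totalCost: number;
  tags: string[];
}

// Each option narrows the result; omitted options match everything.
function filterSessions(
  rows: SessionRow[],
  opts: { since?: number; failedOnly?: boolean; minCost?: number; tag?: string },
): SessionRow[] {
  return rows.filter((r) =>
    (opts.since === undefined || r.startedAt >= opts.since) &&
    (!opts.failedOnly || r.status === 'failed') &&
    (opts.minCost === undefined || r.totalCost > opts.minCost) &&
    (opts.tag === undefined || r.tags.includes(opts.tag))
  );
}
```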