Observability: Logs, Traces, And Soul Timelines
A long-running agent is a black box unless you instrument it. Logs tell you what; traces tell you why; the soul timeline tells you whether the runtime is healthy at all.
Creators · Tools Literacy · ~7 min read
Why agents need their own observability
A web service that's slow is obvious — pages don't load. A soul that's quietly drifting — choosing the wrong skill, looping on the same heartbeat, burning model budget while you sleep — is invisible until you check. OpenClaw is opinionated here: every heartbeat emits a structured log, every skill call emits a trace span, and every soul has a timeline view in Mission Control. Use them or you're flying blind.
Three layers, three questions (plus the audit log)
| Layer | Question it answers | Where it lives |
|---|---|---|
| Logs | What happened in this heartbeat? | stdout / file / log drain (Loki, Datadog) |
| Traces | How long did each step take, and which step was the bottleneck? | OTLP endpoint (Jaeger, Honeycomb, Vercel Observability) |
| Soul timeline | Is this soul still healthy as a long-running thing? | Mission Control UI / Grafana dashboard |
| Audit log | Did the soul actually do what we authorized? | Append-only file in /var/openclaw/audit (lesson 1) |
What to surface in logs
OpenClaw's structured logs include heartbeat ID, soul slug, model used, token count, skill calls, duration, and outcome. They are JSON-shaped, one line per event. The default level is info; keep it there. Cranking it to debug buries the useful patterns in noise; bumping it to warn hides exactly the boring-success events you need as a baseline for spotting the abnormal one.
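Because each event is one JSON object per line, a few lines of code are enough to filter and summarize a log stream. The following is a minimal sketch, not OpenClaw tooling: the function names are hypothetical, and the field names are taken from the sample heartbeat line in this lesson.

```python
import json
from typing import Iterable, Iterator


def heartbeat_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed heartbeat.complete events, skipping blank or non-JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output mixed into the stream
        if isinstance(event, dict) and event.get("event") == "heartbeat.complete":
            yield event


def summarize(event: dict) -> str:
    """Render the fields you would want at a glance (field names assumed from the sample)."""
    tokens = event["tokens_in"] + event["tokens_out"]
    return (f'{event["soul"]} {event["heartbeat_id"]} '
            f'{event["outcome"]} {event["actual_duration_s"]}s {tokens} tokens')


sample = [
    '{"ts": "2026-04-27T08:00:00.123Z", "level": "info", "event": "heartbeat.complete", '
    '"soul": "inbox-triage", "heartbeat_id": "hb_2k4n9", "interval_s": 900, '
    '"actual_duration_s": 12.4, "tokens_in": 4218, "tokens_out": 612, "outcome": "success"}',
    "not json",
]
summaries = [summarize(e) for e in heartbeat_events(sample)]
print(summaries)
```

In practice you would pass an open log file or `sys.stdin` instead of the `sample` list.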
One line of OpenClaw heartbeat log. Grep-friendly, Loki-friendly, eyeball-friendly.
{ "ts": "2026-04-27T08:00:00.123Z", "level": "info", "event": "heartbeat.complete", "soul": "inbox-triage", "heartbeat_id": "hb_2k4n9", "interval_s": 900, "actual_duration_s": 12.4, "model": "qwen3.5:8b", "tokens_in": 4218, "tokens_out": 612, "skills_called": ["gmail.list", "gmail.label"], "approvals_pending": 0, "outcome": "success" }
Traces: where the time actually went
A heartbeat looks like a single event in logs but is a tree of work — model call, skill call, sub-skill call, return. OTLP traces give you that tree. OpenClaw exports OpenTelemetry by default; point it at a collector (Jaeger locally, Honeycomb or Vercel Observability for hosted) and you get flame graphs of every heartbeat. The first time a soul feels 'slow,' a trace shows you it's the model — or it's that one skill that's quietly making three round-trips. Don't guess.
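To make the "tree of work" idea concrete, here is a small sketch that finds where the time went in a span tree by summing self time (a span's duration minus its direct children) per span name. The `Span` class is a hypothetical in-memory structure, not the OTLP wire format or an OpenClaw type, the durations are invented for illustration, and the calculation assumes child spans run one after another rather than in parallel.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical in-memory span; not the OTLP wire format."""
    name: str
    duration_s: float
    children: list["Span"] = field(default_factory=list)


def self_time(span: Span) -> float:
    """Time spent in the span itself, excluding its direct children."""
    return span.duration_s - sum(child.duration_s for child in span.children)


def self_time_by_name(root: Span) -> dict[str, float]:
    """Walk the tree and total the self time for each span name."""
    totals: dict[str, float] = defaultdict(float)
    stack = [root]
    while stack:
        span = stack.pop()
        totals[span.name] += self_time(span)
        stack.extend(span.children)
    return dict(totals)


# Illustrative numbers only: one skill quietly makes three round-trips.
heartbeat = Span("heartbeat", 12.4, [
    Span("model.call", 3.0),
    Span("gmail.list", 8.5, [Span("http.get", 2.5) for _ in range(3)]),
    Span("gmail.label", 0.6),
])
totals = self_time_by_name(heartbeat)
slowest = max(totals, key=totals.get)
print(slowest, totals[slowest])  # http.get 7.5
```

Grouping by name is a deliberate choice here: no single `http.get` span is slower than the model call, but the three together dominate the heartbeat, which is the pattern a flame graph makes visible.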
The soul timeline
Mission Control's soul-timeline view is the long-running version of the trace. It plots heartbeats over hours and days — interval, duration, outcome, token spend. Patterns you can only see here: a soul whose duration is creeping up day over day (memory bloat), a soul whose token-per-tick has 10x'd since you swapped models, a soul whose interval drifts because heartbeats run longer than the gap between them.
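A pattern like "duration creeping up day over day" can also be checked numerically. The sketch below takes heartbeat durations grouped by day, reduces each day to its median, and fits a least-squares line through the medians. The input shape, the function names, and the sample numbers are assumptions for illustration; Mission Control's own calculations are not documented here.

```python
from statistics import median


def daily_medians(durations_by_day: dict[str, list[float]]) -> list[float]:
    """Median heartbeat duration per day, ordered by ISO date string."""
    return [median(durations_by_day[day]) for day in sorted(durations_by_day)]


def slope_per_day(values: list[float]) -> float:
    """Least-squares slope of values against their day index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Invented sample: median duration grows by two seconds every day.
durations = {
    "2026-04-21": [9, 10, 11],
    "2026-04-22": [11, 12, 13],
    "2026-04-23": [13, 14, 15],
    "2026-04-24": [15, 16, 17],
    "2026-04-25": [17, 18, 19],
}
creep = slope_per_day(daily_medians(durations))
print(f"duration is changing by {creep:.1f} s per day")
```

A slope that stays positive across a week or more is the kind of slow drift that single heartbeats and single traces will not reveal.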
Sketch your dashboard before you build it
1. Top row: number of healthy souls, number with pending approvals, number with errors in the last hour. Big numbers, no charts.
2. Per-soul row: last heartbeat timestamp, last duration, last token cost, status dot (green / yellow / red).
3. Trend chart: tokens-per-day for each soul, last 7 days. Spot a soul whose model swap doubled its cost overnight.
4. Heartbeat-anomaly chart: the ratio actual_duration_s / interval_s, log scale. Anything trending toward 1.0 is a soul that's about to overlap itself.
5. Audit feed: scrolling list of the last 50 skill calls. The sanity check: does what's happening match what you authorized?
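The per-soul status dot and the overlap ratio from the sketch above can be derived from a soul's most recent heartbeat event. This is one possible rule set, not OpenClaw's: the function names are hypothetical, the 0.5 yellow threshold is an arbitrary assumption, and the 2x-interval staleness rule simply mirrors the "heartbeat missed" condition used for alerting in this lesson.

```python
from datetime import datetime, timezone


def overlap_ratio(event: dict) -> float:
    """actual_duration_s / interval_s; values near 1.0 mean the soul is about to overlap itself."""
    return event["actual_duration_s"] / event["interval_s"]


def status_dot(last_event: dict, now: datetime) -> str:
    """Classify a soul as green, yellow, or red from its latest heartbeat (thresholds assumed)."""
    last_ts = datetime.fromisoformat(last_event["ts"].replace("Z", "+00:00"))
    age_s = (now - last_ts).total_seconds()
    ratio = overlap_ratio(last_event)
    stale = age_s > 2 * last_event["interval_s"]
    if last_event["outcome"] != "success" or stale or ratio >= 1.0:
        return "red"
    if ratio >= 0.5 or last_event["approvals_pending"] > 0:
        return "yellow"
    return "green"


event = {
    "ts": "2026-04-27T08:00:00.123Z", "soul": "inbox-triage", "interval_s": 900,
    "actual_duration_s": 12.4, "approvals_pending": 0, "outcome": "success",
}
now = datetime(2026, 4, 27, 8, 10, tzinfo=timezone.utc)
print(event["soul"], status_dot(event, now))  # inbox-triage green
```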
Alerting on heartbeat anomalies
The high-leverage alerts are not 'soul errored' — those are loud and self-announcing. The ones that catch real problems are the silent failures: a soul that hasn't ticked, a soul whose tick is taking longer than its interval, a soul whose token spend doubled overnight without a model change. Wire these as paging-grade alerts; everything else is dashboard-grade.
| Alert | Condition | Why it matters |
|---|---|---|
| Heartbeat missed | No heartbeat.complete event in 2x interval window | Soul is dead, hung, or the host is down — and you wouldn't notice otherwise |
| Tick > interval | actual_duration_s > interval_s for 3 consecutive heartbeats | Soul is overlapping itself; ticks are queuing; cost will run away |
| Token spend spike | Daily tokens_in for a soul > 2x rolling 7-day median | Model swap, prompt regression, infinite tool loop, or context bloat |
| Pending approvals piling up | approvals_pending > 5 for over an hour | Soul is stuck waiting for a human; needs attention or the gate needs tuning |
| Repeated skill error | Same skill returning error in 5 consecutive heartbeats | Skill is broken, credentials expired, or the upstream API changed |
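The first three conditions in the table translate directly into small predicates. The sketch below is an illustration under stated assumptions rather than an OpenClaw alerting API: the function names are hypothetical, events are assumed to be ordered oldest first, and "rolling 7-day median" is interpreted as the median of the seven days before today.

```python
from statistics import median


def heartbeat_missed(seconds_since_last: float, interval_s: float) -> bool:
    """No heartbeat.complete event within a 2x interval window."""
    return seconds_since_last > 2 * interval_s


def tick_exceeds_interval(events: list[dict], consecutive: int = 3) -> bool:
    """True if the most recent `consecutive` heartbeats each ran longer than their interval."""
    recent = events[-consecutive:]
    return len(recent) == consecutive and all(
        e["actual_duration_s"] > e["interval_s"] for e in recent
    )


def token_spend_spike(daily_tokens_in: list[int], factor: float = 2.0, window: int = 7) -> bool:
    """Compare today's total (last entry) with the median of the previous `window` days."""
    if len(daily_tokens_in) < window + 1:
        return False  # not enough history to judge
    *history, today = daily_tokens_in[-(window + 1):]
    return today > factor * median(history)


slow = {"actual_duration_s": 950, "interval_s": 900}
fast = {"actual_duration_s": 12.4, "interval_s": 900}
print(heartbeat_missed(1801, 900))                    # True
print(tick_exceeds_interval([fast, slow, slow, slow]))  # True
print(token_spend_spike([100] * 7 + [250]))           # True
```

Requiring three consecutive slow ticks, as the table does, keeps a single slow model response from paging anyone.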
Apply: instrument one soul this week
1. Pick the soul that runs most often.
2. Tail its logs for one full heartbeat and read every field. If anything's missing that you'd want at 3am, raise a feature request or add a log line.
3. Wire OTLP to a free Honeycomb / Jaeger / SigNoz collector. Look at one trace.
4. Sketch your one-screen dashboard on paper before you open Grafana.
5. Set the 'heartbeat missed' and 'tick > interval' alerts. Skip the rest until you've used the dashboard for a week.
The big idea: a long-running agent without observability is a long-running mystery. Wire logs, traces, and the soul timeline before you trust a soul with anything that matters.
