You probably don't need a tracing platform on week one. You need disciplined logs. Tracing platforms (Langfuse, Helicone, Arize, Datadog) become high-leverage when traffic is real and the team is multiple people. Before that, JSON logs in S3 will tell you everything you need to know.
What to log per run
- run_id, user_id (if applicable), timestamp
- Input — the trigger that started the run
- Every model call with its prompt, response, tokens, latency, cost estimate
- Every tool call with arguments, result (or error)
- Final output and exit reason (done, budget exceeded, error)
What to skip
- Secrets — redact before logging. Always.
- PII you can avoid — log the smallest amount that helps you debug.
- Anything 'in case it helps later' — bloat kills usefulness.
Knowledge check
0/1 answered1. Before adopting a tracing platform, the highest-leverage logging investment is:
Discussion
0 commentsBe the first to start the conversation.