Raju Ghosh — Notes

rajughoshdevai/notes

AI Observability#

Making LLM workloads measurable with the same rigor as any other distributed system.

The gap#

Traditional APM tools don't know about:

Token counts (input / output / cached)
Model-level latency budgets
Tool-call trees in agents
Eval scores over time
Prompt version drift
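
To make the list above concrete, here is a minimal sketch of the attributes a single LLM call span might carry. The `gen_ai.*` keys follow the OpenTelemetry GenAI semantic conventions as I understand them (they are still evolving, so check the current spec before relying on them). The `app.*` keys, the function name `llm_span_attributes`, and the example model and version values are hypothetical placeholders for illustration, not part of any standard.

```python
from typing import Optional


def llm_span_attributes(
    model: str,
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int = 0,
    prompt_version: Optional[str] = None,
) -> dict:
    """Build a flat attribute dict for one LLM call span."""
    attrs = {
        # Keys taken from the OpenTelemetry GenAI semantic conventions.
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        # Hypothetical app-specific key (assumption, not in the conventions).
        "app.llm.cached_tokens": cached_tokens,
    }
    if prompt_version is not None:
        # Hypothetical key for tracking prompt version drift.
        attrs["app.prompt.version"] = prompt_version
    return attrs


# Example values are illustrative only.
attrs = llm_span_attributes(
    "gemini-1.5-flash", 1200, 300, cached_tokens=800, prompt_version="v7"
)
```

A flat dict like this can be passed to whatever tracing SDK is in use; keeping token counts and prompt version on the span is what lets a dashboard slice by them later.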

What I'm building#

OpenTelemetry GenAI semantic conventions in our Gemini API layer
Grafana dashboards for per-tenant token spend
Tempo traces across multi-hop agent calls (OCR → productGPT → retrieval)
Alerts on eval regression, not just latency
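
Two of the items above, per-tenant token spend and eval-regression alerts, reduce to small aggregations. The sketch below is only an illustration under stated assumptions: the function names, the `(tenant, input_tokens, output_tokens)` record shape, and the `max_drop` threshold of 0.05 are all hypothetical, and in practice this logic would more likely live in a dashboard query or alert rule than in application code.

```python
from collections import defaultdict
from statistics import mean


def tokens_by_tenant(calls):
    """Sum input + output tokens per tenant.

    `calls` is an iterable of (tenant, input_tokens, output_tokens) tuples
    (a hypothetical record shape for this sketch).
    """
    totals = defaultdict(int)
    for tenant, input_tokens, output_tokens in calls:
        totals[tenant] += input_tokens + output_tokens
    return dict(totals)


def eval_regressed(baseline_scores, recent_scores, max_drop=0.05):
    """Return True when the recent mean eval score is more than
    `max_drop` below the baseline mean."""
    if not baseline_scores or not recent_scores:
        return False  # not enough data to decide either way
    return mean(baseline_scores) - mean(recent_scores) > max_drop


spend = tokens_by_tenant([("acme", 100, 50), ("globex", 10, 5), ("acme", 1, 1)])
alert = eval_regressed([0.9, 0.9], [0.8, 0.8])
```

Comparing means against a fixed drop is the simplest possible rule; a real alert would probably want a minimum sample size and some tolerance for noisy evals.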

Pages#

(placeholder) OTel GenAI semconv — the short version
(placeholder) Grafana dashboards for LLM cost attribution
(placeholder) Tracing agents: spans that actually help on-call