Notes

Structured, evergreen notes organized by topic. These pages get updated in place as my understanding evolves.

  • LLMOps


    Production AI reliability — evals, guardrails, prompt versioning, token-cost budgets, fallbacks.

    Browse notes

  • AI Observability


    LLM metrics in Grafana, OpenTelemetry GenAI semconv, traces across agent hops.

    Browse notes

  • SRE


    Multi-cloud Kubernetes, FluxCD, Karpenter, DR drills, incident write-ups.

    Browse notes