2026-06-07 — Dan Billings

Tracing requests across three GPUs and two operating systems: Jaeger without containers

This post outlines the architecture and telemetry paths needed for end-to-end distributed tracing across a home LLM cluster made up of three hosts: a macOS client (Hermes Agent), an Arch Linux service backend (Honcho API, nomic embeddings, PostgreSQL, and Jaeger), and a Windows/WSL2 inference runner (llama-server on a 5090).


1. Introduction: The Latency Problem


2. System Architecture & Observability Map (PlantUML)

This diagram shows how components communicate and how they export trace data back to the central Jaeger instance.

[Figure: system architecture and observability map (PlantUML diagram)]
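
For spans from separate hosts to appear as a single trace in Jaeger, each request has to carry a shared trace ID from hop to hop. The sketch below illustrates that idea with the W3C Trace Context `traceparent` header. It is a minimal, stdlib-only illustration, and it assumes the services propagate context in that format, which the post does not state. The helper names (`new_traceparent`, `parse_traceparent`, `child_traceparent`) are hypothetical and do not come from any library.

```python
import re
import secrets

# W3C Trace Context header layout: version-traceid-parentid-flags (lowercase hex).
TRACEPARENT_RE = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace, e.g. on the client that originates the request."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    span_id = secrets.token_hex(8)    # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> dict:
    """Validate an incoming header and split it into its fields."""
    match = TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs are explicitly invalid in the specification.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("traceparent contains an all-zero ID")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
        "flags": flags,
    }


def child_traceparent(incoming: str) -> str:
    """Build the header for the next hop: same trace ID, fresh span ID."""
    ctx = parse_traceparent(incoming)
    return f"00-{ctx['trace_id']}-{secrets.token_hex(8)}-{ctx['flags']}"


if __name__ == "__main__":
    # Hypothetical flow: client on dans-mac-mini -> service on danarch -> inference host.
    from_client = new_traceparent()
    to_inference = child_traceparent(from_client)
    print(from_client)
    print(to_inference)
```

Both printed headers share the same 32-character trace ID and differ only in the span ID, which is what allows a tracing backend to stitch spans exported from different machines into one trace.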


3. Telemetry Configuration on danarch


4. Instrumenting the macOS Edge (dans-mac-mini)


5. Interpreting the Jaeger Rich Traces