Self-hosted AI inference, fully observable.
WES (thermally-honest MPG for AI), 18 observation patterns, instant model fit checks, and programmable APIs for Ollama, vLLM, and llama.cpp. Install in 60 seconds — no sudo, no account, nothing to configure.
curl -fsSL https://wicklee.dev/install.sh | bash
What Wicklee does
- See every node — Live GPU temp, VRAM usage, and inference throughput across your entire fleet — auto-detected, zero configuration.
- Thermal Intelligence — Monitor thermal thresholds and health signals across your entire fleet. Prevent hardware degradation with real-time alerts and WES-aware health telemetry.
- Understand your costs — WES — the MPG for AI — scores each node on tok/s per watt, thermally adjusted. Wattage-per-Token is the metric cloud providers don’t surface. Now you have it for your local fleet.
Sovereign by design
Telemetry never leaves your hardware. Self-hostable, air-gap friendly, and built to complement your existing Prometheus / Grafana / Datadog stack rather than replace it.
Built for agents & LLMs
Best-node selection API, reactive automation webhooks, performance CI/CD, and an MCP server so AI agents can query your fleet's status, efficiency, and model fit directly.