Self-hosted AI inference, fully observable.

WES (a thermally honest MPG for AI), 18 observation patterns, instant model fit checks, and programmable APIs for Ollama, vLLM, and llama.cpp. Install in 60 seconds: no sudo, no account, nothing to configure.

curl -fsSL https://wicklee.dev/install.sh | bash

What Wicklee does

See every node: Live GPU temperature, VRAM usage, and inference throughput for each node in your fleet, auto-detected with zero configuration.
Thermal Intelligence: Monitor thermal thresholds and health signals on every node. Real-time alerts and WES-aware health telemetry help you catch overheating before it degrades hardware.
Understand your costs: WES, the MPG for AI, scores each node on tok/s per watt, thermally adjusted. Wattage per token is a metric cloud providers don't surface; now you have it for your local fleet.
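
To make "tok/s per watt, thermally adjusted" concrete, here is a minimal Python sketch of how such a score could be computed. It is not Wicklee's actual WES formula, which this page does not publish. The NodeSample type, the linear thermal penalty, and the 80 °C / 95 °C thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One hypothetical telemetry reading for a node (illustrative fields only)."""
    name: str
    tokens_per_second: float
    watts: float
    gpu_temp_c: float


def thermal_factor(gpu_temp_c: float,
                   throttle_start_c: float = 80.0,
                   throttle_max_c: float = 95.0) -> float:
    """Assumed penalty: 1.0 at or below throttle_start_c, falling linearly
    to 0.0 at throttle_max_c. The thresholds are placeholders, not Wicklee defaults."""
    if gpu_temp_c <= throttle_start_c:
        return 1.0
    if gpu_temp_c >= throttle_max_c:
        return 0.0
    return (throttle_max_c - gpu_temp_c) / (throttle_max_c - throttle_start_c)


def efficiency_score(sample: NodeSample) -> float:
    """Tokens per second per watt, scaled by the assumed thermal factor."""
    if sample.watts <= 0:
        raise ValueError("watts must be positive")
    raw = sample.tokens_per_second / sample.watts
    return raw * thermal_factor(sample.gpu_temp_c)


cool = NodeSample("node-a", tokens_per_second=100.0, watts=200.0, gpu_temp_c=70.0)
hot = NodeSample("node-b", tokens_per_second=100.0, watts=200.0, gpu_temp_c=87.5)
print(efficiency_score(cool), efficiency_score(hot))
```

Under these assumptions, two nodes with identical throughput and power draw score differently once one of them runs hot, which is the idea behind a thermally adjusted metric.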

Sovereign by design

Telemetry never leaves your hardware. Self-hostable, air-gap friendly, and built to complement your existing Prometheus / Grafana / Datadog stack rather than replace it.

Built for agents & LLMs

Best-node selection API, reactive automation webhooks, performance CI/CD, and an MCP server so AI agents can query your fleet's status, efficiency, and model fit directly.
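
As a rough illustration of what best-node selection involves, the Python sketch below picks a node from an in-memory list. It does not call Wicklee's API, whose endpoints and response shape are not described here. The Node fields (name, free_vram_gb, score) and the selection rule are hypothetical.

```python
from typing import List, Optional, TypedDict


class Node(TypedDict):
    """Hypothetical per-node summary; field names are assumptions."""
    name: str
    free_vram_gb: float
    score: float  # any efficiency score where higher is better


def pick_best_node(nodes: List[Node], required_vram_gb: float) -> Optional[Node]:
    """Return the highest-scoring node that has enough free VRAM for the model,
    or None if no node fits."""
    candidates = [n for n in nodes if n["free_vram_gb"] >= required_vram_gb]
    return max(candidates, key=lambda n: n["score"], default=None)


fleet: List[Node] = [
    {"name": "node-a", "free_vram_gb": 8.0, "score": 0.9},
    {"name": "node-b", "free_vram_gb": 24.0, "score": 0.6},
    {"name": "node-c", "free_vram_gb": 48.0, "score": 0.4},
]

best = pick_best_node(fleet, required_vram_gb=16.0)
print(best["name"] if best else "no node fits")
```

The sketch filters on model fit first and ranks by efficiency second, so a highly efficient node that cannot hold the model is never chosen.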
