Troubleshooting
When something isn’t working, run doctor first. It catches the vast
majority of issues in one round-trip. Beyond that, here are the most common
failure modes and what to do about each.
“No espctl tools available” / “Failed to start MCP server”
Your client can’t even spawn the MCP server.
Check:
- Is the absolute path to espctl correct in your client config? Run ls -l /path/to/espctl to confirm.
- Does it have execute permission? chmod +x if not.
- Run espctl mcp serve in a terminal manually. What does it print to stderr? Common issues:
  - cannot find store at <path>: the store doesn’t exist or has wrong permissions.
  - dynamic linker errors: the binary was built against newer libc than your system has; rebuild from source or grab a different release.
- For Claude Desktop on macOS specifically: GUI apps don’t inherit your
shell’s env vars. List every env var explicitly in
claude_desktop_config.json rather than relying on ~/.zshrc.
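As a sketch of that last point, an entry in claude_desktop_config.json might look like the following. The server name espctl, the binary path, and both values are placeholders. The two variable names are the ones used elsewhere on this page, and your setup may need others.

```json
{
  "mcpServers": {
    "espctl": {
      "command": "/path/to/espctl",
      "args": ["mcp", "serve"],
      "env": {
        "CONTROL_BASE_URL": "https://<your-server>",
        "MCP_AUTH_SECRET": "<your-secret>"
      }
    }
  }
}
```

Restart Claude Desktop after editing the file so the new environment is picked up.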
doctor reports control_plane: error
Your MCP server is running fine but can’t reach the build server.
Check:
- Run curl ${CONTROL_BASE_URL}/health. Does it return 200 with a JSON body?
- Is CONTROL_BASE_URL actually a URL? Common mistakes: missing http:// or https:// scheme, trailing slash, or pasting an SSH alias instead of a routable hostname.
- DNS: dig or nslookup the host. If it fails, you may need to use the IP form (http://<your-server-ip>) until DNS resolves.
- Firewall: outbound port 80/443 must be reachable from your machine.
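The CONTROL_BASE_URL mistakes listed above can be caught before any network call. The following is a minimal sketch, not part of espctl: checkControlBaseUrl is a hypothetical helper, and the SSH-alias test is only a heuristic (a host with no dot that is not localhost).

```typescript
// Hypothetical helper: flags the common CONTROL_BASE_URL mistakes.
// It only inspects the string; it does not contact the server.
function checkControlBaseUrl(value: string): string[] {
  const problems: string[] = [];
  const trimmed = value.trim();

  if (!/^https?:\/\//i.test(trimmed)) {
    problems.push("missing http:// or https:// scheme");
  }
  if (trimmed.endsWith("/")) {
    problems.push("trailing slash");
  }

  // Heuristic (assumption): a host with no dot that is not localhost and not
  // a bracketed IPv6 literal looks like an SSH alias, not a routable hostname.
  const host = trimmed.replace(/^https?:\/\//i, "").split(/[\/:]/)[0];
  if (host !== "" && host !== "localhost" && !host.startsWith("[") && !host.includes(".")) {
    problems.push("host has no dot; it may be an SSH alias");
  }
  return problems;
}
```

An empty array means none of the listed mistakes were found; it does not prove the server is reachable, so the curl check is still worth running.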
doctor reports control_plane: ok but builds still fail
The MCP server can reach the build server, but builds aren’t producing output.
Check:
- Is MCP_AUTH_SECRET set and correct? Builds need it; doctor only needs the build server to respond to /health. Without the secret, you’ll see “401 Unauthorized” in the response to /grant/request. If you suspect the secret was revoked, get a fresh access key from the control plane.
- Is your machine’s clock in sync with the build server? Permissions have short TTLs; if either side’s clock is off by more than ~30 seconds, every permission expires before it can be used.
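One rough way to test the clock-sync point is to compare your local time with the Date header of an HTTP response from the build server. The sketch below assumes the server sends a standard Date header (for example on the /health response); the function names are hypothetical. The header has one-second resolution and network latency adds noise, so treat the result as an estimate.

```typescript
// Tolerance taken from the ~30 seconds mentioned above.
const MAX_SKEW_SECONDS = 30;

// Returns how far the local clock is ahead (+) or behind (-) the server, in seconds.
// serverDateHeader is the value of an HTTP Date header
// (assumption: the build server includes one in its responses).
function clockSkewSeconds(localNowMs: number, serverDateHeader: string): number {
  const serverMs = Date.parse(serverDateHeader);
  if (Number.isNaN(serverMs)) {
    throw new Error(`unparseable Date header: ${serverDateHeader}`);
  }
  return Math.round((localNowMs - serverMs) / 1000);
}

// True when the skew is large enough to expire permissions before use.
function skewIsTooLarge(skewSeconds: number): boolean {
  return Math.abs(skewSeconds) > MAX_SKEW_SECONDS;
}
```

In practice you would pass Date.now() as the first argument and the header value copied from the output of curl -i ${CONTROL_BASE_URL}/health as the second.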
WebRTC connection establishes but immediately closes
on_open fires but the connection drops within seconds, or on_open
never fires at all.
Likely causes:
- Connection negotiation failed. No candidate pair worked. The peer connection state goes to Failed after ~5 seconds and the data channels never open. Cause: network restrictions or firewalls block all UDP and the fallback servers aren’t configured or reachable.
- Network restrictions on both sides. Direct peer-to-peer is impossible, which forces a relay through fallback servers. Make sure the build server returns at least one relay entry in ice_servers.
- Relay credentials expired. Relay credentials rotate per-session; if your client cached one from an earlier session, it’s stale. Open a fresh session.
- Browser blocked WebRTC. Some corporate browser policies disable WebRTC entirely. Check chrome://webrtc-internals/ (Chrome) for the connection candidate dump.
Fix pattern: Always implement a fast-fail in your client that watches
for RTCPeerConnection.connectionState === 'failed' in parallel with
waiting for on_open. Wrap connect() in a 3-attempt retry loop with a
2-second delay between attempts.
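The fix pattern above could be sketched as follows. This is an illustration under assumptions, not the client's actual API: Attempt, PeerLike, whenFailed, and connectWithRetry are hypothetical names, and connect stands in for whatever function your client uses to start a session. Only the 3 attempts, the 2-second delay, and the connectionState === 'failed' check come from the text.

```typescript
// What one connection attempt must expose. All names are hypothetical;
// adapt them to however your client surfaces on_open and the peer connection.
interface Attempt {
  opened: Promise<void>; // resolves when on_open fires
  failed: Promise<void>; // resolves when connectionState becomes 'failed'
  close(): void;
}

// Structural subset of the browser's RTCPeerConnection used below.
interface PeerLike {
  connectionState: string;
  addEventListener(type: "connectionstatechange", listener: () => void): void;
}

// Turns the 'failed' state into a promise so it can be raced against on_open.
function whenFailed(pc: PeerLike): Promise<void> {
  return new Promise<void>((resolve) => {
    const check = () => {
      if (pc.connectionState === "failed") resolve();
    };
    pc.addEventListener("connectionstatechange", check);
    check(); // covers a connection that has already failed
  });
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function connectWithRetry(
  connect: () => Attempt,
  maxAttempts = 3,
  delayMs = 2000,
): Promise<Attempt> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const conn = connect();
    // Whichever settles first decides the attempt: on_open or the failed state.
    const outcome = await Promise.race([
      conn.opened.then(() => "open" as const),
      conn.failed.then(() => "failed" as const),
    ]);
    if (outcome === "open") return conn;
    conn.close();
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  throw new Error(`connection failed after ${maxAttempts} attempts`);
}
```

The sketch has no overall timeout: if on_open never fires and the state never reaches 'failed', an attempt waits indefinitely, so you may want to race a third timer promise as well.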
Build hangs in pending for a long time
The permission was issued, but no build machine picked up the job.
Likely causes:
- No build machine is currently free to respond — the job auto-assigns shortly.
- No build machine has the requested target’s toolchain (e.g. esp32p4 may not yet be on every machine). If the job is still unassigned after a few minutes, try a more common target to distinguish “no machines at all” from “specific-target toolchain missing”.
Build fails with a compiler error
This is the easy case. Ask your AI assistant:
Run parse_build_errors on the latest build, then run the diagnose-build-error prompt against the result.
You’ll get a structured “what’s wrong, why, here’s the fix” rather than a 500-line log dump.
Send queue full / firmware download stalls
Throughput drops dramatically partway through a firmware download (only
matters for large *.bin files over a relay connection).
Cause: Production build machines cap the send queue at 128 KB. Combined with a 500 ms round-trip relay, this caps throughput at ~256 KB/s, not the multi-MB/s you’d see on a direct peer-to-peer connection.
Fix: This is by design (preventing memory exhaustion when the receiver can’t keep up). If your firmware is large enough that it matters, prefer a direct peer-to-peer connection over a relay. Direct connections aren’t affected as severely because the round-trip time is much lower.
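The ~256 KB/s figure follows from a simple bound: if at most one queue's worth of data can be in flight per round trip, throughput cannot exceed queue size divided by round-trip time. A small sketch of that arithmetic, where the function name and the 20 ms direct round trip are illustrative assumptions:

```typescript
// Upper bound on throughput (KB/s) when at most queueKB can be in flight
// per round trip of roundTripMs milliseconds.
function maxThroughputKBps(queueKB: number, roundTripMs: number): number {
  return (queueKB * 1000) / roundTripMs;
}

// 128 KB queue over a 500 ms relay round trip: the figure quoted above.
const relayKBps = maxThroughputKBps(128, 500); // 256

// Same queue over an assumed 20 ms direct round trip.
const directKBps = maxThroughputKBps(128, 20); // 6400
```

The same cap applies in both cases; the direct connection is faster only because its round trip is shorter.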
Still stuck
- Ask your AI assistant to read the install://overview resource. It returns the same env-var table from inside the MCP server, which lets you cross-reference what the server thinks its config is.
- File an issue on this project’s repository with the output of doctor attached.
See also
- doctor: health-check tool.
- Environment Variable Index: every env var in one place.
Browser MCP Console (esphome.cloud/mcp/esp-idf)
Connect returns “Failed to fetch”
The browser can’t reach the control-plane API. Check:
- Are you behind a firewall that blocks WebRTC ports?
- Try a different network or disable VPN/proxy
Connect returns 429 “Too Many Requests”
You’ve exceeded the daily anonymous build limit or the per-session concurrency cap.
- Daily limit: Wait until tomorrow, or sign in for increased quotas
- Concurrency cap: Click Cancel on any stuck connection, wait 2 minutes for in-flight jobs to expire, then click Connect again
Connect button says “Connecting…” forever
The grant was issued but WebRTC negotiation stalled. Click the Cancel button next to “Connecting…” to release the slot, then try again. This usually fixes itself on retry.
Build fails instantly (0 log lines)
The project files didn’t reach the build machine. Your AI agent must
include a project_bundle (base64-encoded git bundle) with the build
tool call. If the bundle was included, verify it decodes cleanly —
base64 must not contain line breaks.
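A quick pre-flight check on the project_bundle string can rule out the line-break problem before a build is submitted. This is a minimal sketch: checkBundleBase64 is a hypothetical helper, and it assumes the standard base64 alphabet with = padding rather than the URL-safe variant.

```typescript
// Hypothetical pre-flight check for a project_bundle string.
// Assumes the standard base64 alphabet (A-Z, a-z, 0-9, +, /) with = padding.
// Returns a description of the first problem found, or null if none.
function checkBundleBase64(bundle: string): string | null {
  if (bundle.length === 0) return "bundle is empty";
  if (/[\r\n]/.test(bundle)) return "bundle contains line breaks";
  if (!/^[A-Za-z0-9+\/]+={0,2}$/.test(bundle)) {
    return "bundle contains non-base64 characters";
  }
  if (bundle.length % 4 !== 0) return "bundle length is not a multiple of 4";
  return null; // looks like clean single-line base64
}
```

A null result only confirms the encoding is well formed; it does not verify that the decoded bytes are a valid git bundle.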
Build succeeds but firmware download is slow
Throughput over a relay connection is gated by a 256 KB send-buffer backpressure threshold and a 128 KB SCTP pending queue cap in production. Direct peer-to-peer connections are faster when available.