Troubleshooting
Common breakages and how to unstick them.
"QR never appears"
Symptoms. GET /v1/instances/:id/qr returns 200 with empty body, or the dashboard's QR panel stays blank.
Causes (in order of likelihood):
- Instance is already in qr_required and the QR string expired. QR strings rotate every ~30s. The session manager generates a new one and re-emits a qr.updated event. Wait one rotation.
- Baileys is rate-limited by WhatsApp. Repeatedly recreating sessions for the same number triggers temporary rate limits. Wait 5 to 10 minutes.
- MAX_SESSIONS_PER_WORKER exceeded. The backend rejects new socket creation past this cap (default 50). Check /readyz, which surfaces the count.
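Waiting out one rotation can be automated with a small polling loop. The sketch below is only an illustration: fetchQr is a hypothetical function you supply that wraps GET /v1/instances/:id/qr and resolves to the response body as a string, and the interval and attempt count are arbitrary choices sized to cover one ~30s rotation, not documented values.

```javascript
// Poll for a QR string until one arrives or we give up.
// fetchQr is a hypothetical async function supplied by the caller; it should
// wrap GET /v1/instances/:id/qr and resolve to the response body as a string.
async function waitForQr(fetchQr, { intervalMs = 5000, maxAttempts = 8 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const body = await fetchQr();
    if (typeof body === "string" && body.trim() !== "") {
      return body; // non-empty body: a fresh QR string is available
    }
    if (attempt < maxAttempts) {
      // Empty body: the previous QR probably expired, so wait before retrying.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`QR still empty after ${maxAttempts} attempts`);
}
```

If the loop still times out, the rate-limit and session-cap causes above are the more likely explanation, and polling harder will not help.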
"sessionInvalidated after a successful pair"
WhatsApp explicitly told us the session is dead. Most common reasons:
- The user opened WhatsApp Web in a browser and that session bumped ours off (only one Web session per number).
- The user manually unlinked the device from Linked devices on their phone.
- The number was logged in from a fresh install of WhatsApp on another phone.
What to do. The auth-state is wiped automatically; the instance moves to disconnected. Issue a new QR fetch. If you see this repeatedly for the same number, it's almost always a "two Web sessions" situation: make sure no one is opening web.whatsapp.com with the same number.
"429 rate_limited"
The per-customer token bucket is empty. Defaults: 60-burst, 1 req/sec sustained.
What to do.
- Honor the Retry-After header. It tells you when the bucket will have a token.
- If you're hitting this from a sustained workload (not a burst), talk to us and we'll raise your limit.
- Don't shard across multiple API keys to escape the limit; the bucket is per-customer, not per-key.
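Honoring Retry-After might look like the sketch below. It assumes the header follows the standard HTTP forms (a number of seconds or an HTTP-date); this page does not say which form the API sends. doRequest is a hypothetical caller-supplied function, and the 1-second fallback simply mirrors the default sustained rate of 1 req/sec.

```javascript
// Convert a Retry-After header value into a wait time in milliseconds.
// Standard HTTP allows either delay-seconds ("12") or an HTTP-date.
function retryAfterMs(headerValue, now = Date.now()) {
  if (headerValue == null || String(headerValue).trim() === "") return null;
  const value = String(headerValue).trim();
  if (/^\d+$/.test(value)) return Number(value) * 1000;
  const date = Date.parse(value);
  if (Number.isNaN(date)) return null; // unparseable: let the caller pick a default
  return Math.max(0, date - now);
}

// Retry a request while it returns 429, sleeping for the advertised delay.
// doRequest is a hypothetical function that resolves to an object shaped like
// { status, headers: { get(name) } }, which is what fetch responses provide.
async function sendWithRetry(doRequest, { maxRetries = 3, fallbackMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const res = await doRequest();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    const waitMs = retryAfterMs(res.headers.get("retry-after")) ?? fallbackMs;
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
}
```

The retry count is capped on purpose: a sustained workload that keeps hitting 429 needs a higher limit, not more retries.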
"Webhook never arrives"
Walk this checklist:
- Is the endpoint URL public-DNS? localhost and RFC1918 are rejected at create time. (Use ngrok for local dev.)
- Is your endpoint returning 2xx? Check GET /v1/webhook-deliveries?status=failed; last_error and last_response_status tell you what we saw.
- Did you pause the queue? If your endpoint has been down, the worker may have gotten ahead of you and burned its retry budget. Failed deliveries past attempt 6 are not retried automatically.
- Are you filtering server-side? If you set events: ['message.received'] on the endpoint, you won't get instance.connected events. That is by design.
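For the first item on the checklist, a rough client-side pre-flight check can catch the obvious cases before you try to register an endpoint. This is a sketch, not the server's actual validation logic: the function name is made up, it only inspects the URL text (no DNS resolution), and treating 127.0.0.0/8 like localhost is an assumption.

```javascript
// Rough pre-flight check for a webhook URL: flags localhost and RFC 1918
// IPv4 literals. It does not resolve DNS, so a public-looking hostname that
// points at a private address is not caught here.
function isLikelyRejectedWebhookUrl(rawUrl) {
  const { hostname } = new URL(rawUrl);
  if (hostname === "localhost" || hostname.endsWith(".localhost")) return true;
  const m = hostname.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false; // not an IPv4 literal; assume a public DNS name
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 ||                          // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    a === 127                            // loopback (assumed to be treated like localhost)
  );
}
```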
"Webhook signature doesn't verify"
Top culprits:
- You're verifying against a parsed/reserialized body. The signature is over the raw bytes. In Express, capture them with verify: (req, res, buf) => { req.rawBody = buf } on the JSON parser.
- Clock drift. The server enforces a 5-minute timestamp tolerance. If your endpoint host's clock is off by more than that, every webhook fails. Run NTP.
- Wrong secret. Each endpoint has its own signing secret. If you have multiple endpoints, make sure you're verifying with the one that matches X-WhatIsUp-Endpoint-Id.
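The three culprits above map onto three checks in a verifier. The sketch below shows their shape only. The signing scheme is an assumption, because this page does not specify it: a hex HMAC-SHA256 over the timestamp, a dot, and the raw body, with the timestamp in Unix seconds. Check the webhook reference for the real algorithm and header names before relying on it.

```javascript
const crypto = require("node:crypto");

const TOLERANCE_MS = 5 * 60 * 1000; // the 5-minute tolerance described above

// ASSUMPTIONS (not confirmed by this document): the signature is a hex
// HMAC-SHA256 over `${timestamp}.${rawBody}` and the timestamp is Unix seconds.
function verifyWebhook({ rawBody, timestamp, signature, secret, now = Date.now() }) {
  const ts = Number(timestamp);
  if (!Number.isFinite(ts)) return false;
  if (Math.abs(now - ts * 1000) > TOLERANCE_MS) return false; // clock drift or replay

  const expected = crypto
    .createHmac("sha256", secret) // secret must belong to the receiving endpoint
    .update(`${timestamp}.`)
    .update(rawBody) // raw bytes (Buffer), never a re-serialized object
    .digest();
  const received = Buffer.from(String(signature), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

In an Express handler, rawBody would be the req.rawBody buffer captured by the verify hook, and secret would be looked up from the X-WhatIsUp-Endpoint-Id header value.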
"The dashboard says my instance is connected but messages aren't being sent"
Double-check the obvious before going deeper:
- Are you using the right instance_id in POST /v1/instances/:id/messages? It's easy to copy the wrong one.
- Is the recipient field, to, an MSISDN with country code, no +, no spaces?
- Look at the audit log (Activity tab on the dashboard). Every send leaves a row.
If all three are fine, the issue is downstream of the API:
- The send was queued but the WhatsApp socket dropped before the message left. Check /readyz: if Baileys' health probe is failing, the session is in a bad state even if the row says connected.
- The send hit a Baileys rate limit (separate from our rate limiter). These show up as message.failed webhooks with error.code: rate_limited.
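The recipient-format check from the first list is easy to enforce in code before calling the API. The helper below is a hypothetical sketch, not part of any SDK: it strips common formatting and accepts only digit strings that could be an MSISDN with country code. The 15-digit cap comes from E.164; the 8-digit minimum is an assumption meant to catch obviously truncated input.

```javascript
// Normalize a phone number into the format described above: digits only,
// country code first, no "+", no spaces. Returns null if it cannot be valid.
function toMsisdn(input) {
  const digits = String(input).replace(/[\s().+-]/g, "");
  // E.164 caps numbers at 15 digits and country codes never start with 0.
  // The 8-digit minimum is an assumption to catch obviously truncated input.
  return /^[1-9]\d{7,14}$/.test(digits) ? digits : null;
}
```

A null result usually means the number was stored in national format (leading 0) or with an international dialing prefix such as 00, both of which need a country code fix at the source.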
"Restarting the backend lost my sessions"
Auth-state is supposed to survive restarts. If it didn't:
- Ephemeral filesystem. Containers without a persistent volume on data/sessions/ lose the auth-state on every redeploy. Mount a volume.
- Permission mismatch. If the volume is owned by a different uid than the Node process, Baileys silently fails to read the existing state and falls back to "fresh pair". Check the file permissions.
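Both causes can be checked from inside the container with a few lines of Node. This is a diagnostic sketch with a made-up function name; run it as the same user as the backend process and pass whatever path your deployment mounts for data/sessions/.

```javascript
const fs = require("node:fs");

// Report whether the current process can use a session directory.
function checkSessionDir(dir) {
  const report = { exists: false, ownerMatches: null, readWrite: false };
  let stat;
  try {
    stat = fs.statSync(dir);
  } catch {
    return report; // missing path: likely no volume mounted
  }
  report.exists = stat.isDirectory();
  // process.getuid is unavailable on Windows, so ownerMatches stays null there.
  if (typeof process.getuid === "function") {
    report.ownerMatches = stat.uid === process.getuid();
  }
  try {
    fs.accessSync(dir, fs.constants.R_OK | fs.constants.W_OK);
    report.readWrite = true;
  } catch {
    report.readWrite = false;
  }
  return report;
}
```

A result with exists true but readWrite false points at the permission mismatch; exists false points at a missing mount.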
Where to get help
- The audit log (audit_events in the dashboard) records every state-changing action.
- /readyz does deep checks: DB transaction, Redis ping, Baileys probe.
- Open an issue at github.com/aneps/zappi and include the correlation_id from the response header if you have one.