WhatIsUp.dev

Architecture

WhatIsUp.dev is one Node process, one Postgres, one Redis. That's the whole runtime topology for v1, and it's a deliberate choice: every additional service is something you have to deploy, migrate, observe, and pay for.

The three pieces

┌─────────────────────────────────────────────────────┐
│   Backend (Fastify)                                 │
│   • REST API: instances, messages, webhooks         │
│   • Baileys session manager (in-process sockets)    │
│   • SSE stream for QR + state events                │
│   • BullMQ queue producer                           │
│   • Webhook delivery worker (same process for v1)   │
└────────────┬────────────────────────────┬───────────┘
             │                            │
             ▼                            ▼
        ┌─────────┐                 ┌─────────┐
        │ Postgres│                 │  Redis  │
        │ (Kysely)│                 │ (BullMQ)│
        └─────────┘                 └─────────┘
  • Postgres holds customers, API keys, instances, webhook endpoints, webhook deliveries (with payload bodies for replay), and an append-only audit log. Migrations run via a tiny in-house runner with an advisory lock and a _migrations ledger: no Prisma, no schema generators, no surprise drift.
  • Redis is BullMQ's job board. Outbound webhook deliveries land there, the worker drains them with retries (exponential backoff, jitter), and the queue keeps the API path snappy: POST /v1/messages returns the moment the row is staged, not the moment WhatsApp acks.
  • Baileys is loaded inside the Fastify process. Each connected instance owns a WebSocket to WhatsApp Web. We persist auth state to disk per instance so a restart can resume without a fresh QR scan.

Process Model A vs B

We picked Process Model A: the webhook worker runs in-process. Pros: single binary, single deploy, none of the visibility problems that come with coordinating separate processes through Redis. Cons: a slow customer endpoint can pin Node event-loop time that should be servicing API requests.

The mitigation is the per-host concurrency cap: a customer with a sluggish endpoint can only block their own deliveries, not the worker globally. When that stops being enough, splitting the worker into its own process is a one-file change, because the queue, the delivery repo, and the Baileys session manager are already isolated behind interfaces.
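
The per-host cap can be sketched as a small in-process gate. The class and field names here are assumptions for illustration, not the real module:

```typescript
// Each webhook host gets at most `limit` in-flight deliveries; a slow
// endpoint exhausts its own slots instead of starving the whole worker.
class PerHostGate {
  private inFlight = new Map<string, number>();
  constructor(private limit = 4) {}

  /** Try to claim a delivery slot for `host`; false when saturated. */
  tryAcquire(host: string): boolean {
    const n = this.inFlight.get(host) ?? 0;
    if (n >= this.limit) return false;
    this.inFlight.set(host, n + 1);
    return true;
  }

  /** Release the slot once the HTTP attempt settles, success or failure. */
  release(host: string): void {
    const n = this.inFlight.get(host) ?? 0;
    if (n <= 1) this.inFlight.delete(host);
    else this.inFlight.set(host, n - 1);
  }
}
```

A job that fails `tryAcquire` would be re-queued with a short delay rather than held open, so the event loop stays free for API traffic.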

Customer / instance / API key model

customer
   ├── api_key (one or more)
   ├── instance
   │     ├── webhook_endpoint  (delivery target)
   │     │     └── webhook_delivery (per-event row)
   │     └── audit_event
   └── audit_event

A customer is a billing-and-isolation unit: every row in every other table FKs back to a customer. An instance is one phone number / WhatsApp connection. An API key is scoped to a customer; it can optionally be bound to a single instance for least-privilege use cases (e.g. a marketing app that should only be able to send from one number).
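
The instance-binding rule reduces to one predicate. The record shapes below are a sketch, not the real schema:

```typescript
// Illustrative shape: a key always belongs to one customer, and may
// optionally be pinned to one of that customer's instances.
interface ApiKey {
  customerId: string;
  instanceId: string | null; // null = may act on any of the customer's instances
}

function keyAllows(key: ApiKey, customerId: string, instanceId: string): boolean {
  if (key.customerId !== customerId) return false; // cross-customer: never
  if (key.instanceId === null) return true;        // customer-wide key
  return key.instanceId === instanceId;            // instance-bound key
}
```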

Trust boundaries

  • API key → customer: HMAC-SHA256(prefix|key) with an env pepper; no plaintext keys stored.
  • Outbound webhook URL: public DNS only; loopback, link-local, RFC1918, and cloud-metadata IPs are all rejected at create time and again at delivery time (DNS-rebind defense). HTTPS-only enforced when NODE_ENV=production.
  • Webhook signing secret: AES-256-GCM at rest with key rotation (SECRETS_KEY + optional SECRETS_KEY_PREVIOUS).
  • Cross-customer access: every query filters by customerId. No "global admin" path.
  • Audit log: append-only audit_events table with ON DELETE SET NULL so history outlives the row it referenced.
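
The IP filter for outbound webhook URLs can be sketched as a classifier over dotted-quad IPv4 addresses. This is a simplification: a real deployment also needs the IPv6 equivalents and a resolve-then-connect pin at delivery time, and the function name here is an assumption:

```typescript
// Reject private, loopback, link-local (incl. 169.254.169.254 cloud
// metadata), and malformed addresses; anything unparseable is rejected
// by default rather than allowed through.
function isForbiddenIPv4(ip: string): boolean {
  const m = ip.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return true; // not a dotted quad: reject by default
  const octets = m.slice(1).map(Number);
  if (octets.some((n) => n > 255)) return true;
  const [a, b] = octets;
  return (
    a === 0 ||                            // "this network"
    a === 127 ||                          // loopback
    a === 10 ||                           // RFC1918 10/8
    (a === 172 && b >= 16 && b <= 31) ||  // RFC1918 172.16/12
    (a === 192 && b === 168) ||           // RFC1918 192.168/16
    (a === 169 && b === 254)              // link-local + cloud metadata
  );
}
```

Running the check both at endpoint creation and again on the resolved address at delivery time is what closes the DNS-rebind window.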

What's deliberately not here

  • Multi-region. Single primary Postgres, single Redis. Adding a read replica is a small DNS swap; multi-region writes are a different product.
  • A blob store for media. Inbound attachments live in an in-process LRU cache and are served via signed proxy URLs that expire. Past v1, this becomes S3/R2.
  • A separate worker process. See above. The seams are in place; the split is deferred.