
Architecture

WhatIsUp.dev is one Node process, one Postgres, one Redis. That's the whole runtime topology for v1, and it's a deliberate choice: every additional service is something you have to deploy, migrate, observe, and pay for.

The three pieces

┌─────────────────────────────────────────────────────┐
│   Backend (Fastify)                                 │
│   • REST API — instances, messages, webhooks        │
│   • Baileys session manager (in-process sockets)    │
│   • SSE stream for QR + state events                │
│   • BullMQ queue producer                           │
│   • Webhook delivery worker (same process for v1)   │
└────────────┬────────────────────────────┬───────────┘
             │                            │
             ▼                            ▼
        ┌─────────┐                 ┌─────────┐
        │ Postgres│                 │  Redis  │
        │ (Kysely)│                 │ (BullMQ)│
        └─────────┘                 └─────────┘
  • Postgres holds customers, API keys, instances, webhook endpoints, webhook deliveries (with payload bodies for replay), and an append-only audit log. Migrations run via a tiny in-house runner with an advisory lock and a _migrations ledger — no Prisma, no schema generators, no surprise drift. (A minimal runner sketch follows this list.)
  • Redis is BullMQ's job board. Outbound webhook deliveries land there, the worker drains them with retries (exponential backoff, jitter), and the queue keeps the API path snappy — POST /v1/messages returns the moment the row is staged, not the moment WhatsApp acks. (See the enqueue sketch after this list.)
  • Baileys is loaded inside the Fastify process. Each connected instance owns a WebSocket to WhatsApp Web. We persist auth-state to disk per instance so a restart can resume without a fresh QR scan. (A resume sketch follows this list.)
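
To make the first bullet concrete, here is a minimal sketch of that kind of runner, assuming Kysely's sql tag and a _migrations ledger keyed by migration name. The lock key, column names, and function shape are illustrative, not the real ones.

```ts
import { Kysely, sql } from 'kysely'

// One migration = a name plus a function that applies it.
interface Migration {
  name: string
  up: (db: Kysely<any>) => Promise<void>
}

export async function migrate(db: Kysely<any>, migrations: Migration[]): Promise<void> {
  // Pin a single connection so the session-level advisory lock and unlock
  // happen on the same session even when `db` is backed by a pool.
  await db.connection().execute(async (conn) => {
    await sql`select pg_advisory_lock(727274)`.execute(conn) // arbitrary app-wide lock key
    try {
      await sql`
        create table if not exists _migrations (
          name text primary key,
          applied_at timestamptz not null default now()
        )
      `.execute(conn)

      const applied = await conn.selectFrom('_migrations').select('name').execute()
      const done = new Set(applied.map((row) => row.name))

      for (const migration of migrations) {
        if (done.has(migration.name)) continue
        // A production runner would wrap these two steps in a transaction.
        await migration.up(conn)
        await conn.insertInto('_migrations').values({ name: migration.name }).execute()
      }
    } finally {
      await sql`select pg_advisory_unlock(727274)`.execute(conn)
    }
  })
}
```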
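
The second bullet's enqueue step might look roughly like this. Queue name, payload shape, and retry counts are assumptions for the sketch; the jittered backoff itself is defined on the worker, shown further down.

```ts
import { Queue } from 'bullmq'

// Illustrative job payload: the API path stages a webhook_delivery row first,
// then hands only its id to the queue. Field names are assumptions.
interface WebhookDeliveryJob {
  deliveryId: string
  endpointId: string
}

const connection = { host: process.env.REDIS_HOST ?? 'localhost', port: 6379 }

export const webhookQueue = new Queue<WebhookDeliveryJob>('webhook-deliveries', { connection })

// Called on the API path once the row exists. The HTTP response goes out as soon
// as Redis accepts the job, not when the customer's endpoint (or WhatsApp) responds.
export async function enqueueDelivery(job: WebhookDeliveryJob): Promise<void> {
  await webhookQueue.add('deliver', job, {
    attempts: 8,
    backoff: { type: 'custom' },  // the worker supplies an exponential + jitter delay
    removeOnComplete: true,
    removeOnFail: false,          // keep failures visible; the stored payload enables replay
  })
}
```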
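
And for the third bullet, resuming a Baileys session from persisted auth-state typically looks like this, assuming the @whiskeysockets/baileys package; the auth directory layout and reconnect policy are illustrative.

```ts
import makeWASocket, { useMultiFileAuthState, DisconnectReason } from '@whiskeysockets/baileys'
import type { Boom } from '@hapi/boom'

// One auth-state directory per instance; a restart reads it back and resumes
// the session instead of forcing a new QR scan.
export async function startInstance(instanceId: string) {
  const { state, saveCreds } = await useMultiFileAuthState(`./auth/${instanceId}`)
  const sock = makeWASocket({ auth: state })

  // Persist credential updates as they arrive; this is what survives restarts.
  sock.ev.on('creds.update', saveCreds)

  sock.ev.on('connection.update', ({ connection, lastDisconnect, qr }) => {
    if (qr) {
      // Fresh QR: push it onto the SSE stream so the dashboard can render it.
    }
    if (connection === 'close') {
      const status = (lastDisconnect?.error as Boom | undefined)?.output?.statusCode
      if (status !== DisconnectReason.loggedOut) {
        void startInstance(instanceId) // transient drop: reconnect with saved state
      }
    }
  })

  return sock
}
```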

Process Model A vs B

We picked "Process Model A": the webhook worker runs in-process; "Model B" would run it as a separate service. Pros: single binary, single deploy, no state that has to be made visible across processes through Redis. Cons: a slow customer endpoint can pin Node event-loop time that should be servicing API requests.

The mitigation is the per-host concurrency cap — a customer with a sluggish endpoint can only block their own deliveries, not the worker globally. When this stops being enough, splitting the worker into its own process is a one-file change because the queue, the delivery repo, and the Baileys session manager are all already isolated behind interfaces.
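
One way the cap could look (a sketch of the idea, not the actual worker): an in-memory in-flight counter keyed by hostname, with over-cap jobs failed fast so BullMQ's backoff reschedules them. The payload shape, limits, and the exponential-plus-jitter backoff strategy below are illustrative.

```ts
import { Worker, type Job } from 'bullmq'

const connection = { host: process.env.REDIS_HOST ?? 'localhost', port: 6379 }

// In-flight deliveries per destination host. A sluggish endpoint can hold at most
// MAX_PER_HOST slots, so it starves only its own deliveries, never the whole worker.
const MAX_PER_HOST = 3
const inFlight = new Map<string, number>()

// Payload inlined here to keep the sketch self-contained; the real job likely
// carries just a delivery id and loads the row from Postgres.
type DeliveryJob = { url: string; body: string; headers: Record<string, string> }

export const webhookWorker = new Worker<DeliveryJob>(
  'webhook-deliveries',
  async (job: Job<DeliveryJob>) => {
    const host = new URL(job.data.url).hostname
    if ((inFlight.get(host) ?? 0) >= MAX_PER_HOST) {
      // Over the per-host cap: fail fast and let the backoff reschedule this job.
      // (A real worker might move the job to delayed instead of burning an attempt.)
      throw new Error(`per-host cap reached for ${host}`)
    }

    inFlight.set(host, (inFlight.get(host) ?? 0) + 1)
    try {
      const res = await fetch(job.data.url, {
        method: 'POST',
        headers: job.data.headers,
        body: job.data.body,
        signal: AbortSignal.timeout(10_000), // never wait forever on a dead endpoint
      })
      if (!res.ok) throw new Error(`endpoint responded ${res.status}`)
    } finally {
      inFlight.set(host, (inFlight.get(host) ?? 0) - 1)
    }
  },
  {
    connection,
    concurrency: 20, // global cap across all hosts
    settings: {
      // Exponential backoff with full jitter, used by jobs enqueued with type 'custom'.
      backoffStrategy: (attemptsMade: number) =>
        Math.round(Math.random() * Math.min(60_000, 1_000 * 2 ** attemptsMade)),
    },
  },
)
```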

Customer / instance / API key model

customer
   ├── api_key (one or more)
   ├── instance
   │     ├── webhook_endpoint  (delivery target)
   │     │     └── webhook_delivery (per-event row)
   │     └── audit_event
   └── audit_event

A customer is a billing-and-isolation unit — every row in every other table FKs back to a customer. An instance is one phone number / WhatsApp connection. An API key is scoped to a customer; it can optionally be bound to a single instance for least-privilege use cases (e.g. a marketing app that should only be able to send from one number).
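
A sketch of how the key-scoping check might look, combining the HMAC lookup described in the next section with the optional instance binding. The row shape and env-var name are assumptions.

```ts
import { createHmac } from 'node:crypto'

// Assumed shape of an api_key row; column names are illustrative.
interface ApiKeyRow {
  customerId: string
  instanceId: string | null // null = the key may act on any of this customer's instances
  keyHash: string
}

// Only HMAC-SHA256(prefix|key) is stored, never the key itself. The pepper lives in
// the environment (env-var name assumed here), so a DB dump alone can't be brute-forced offline.
export function hashApiKey(prefix: string, key: string): string {
  const pepper = process.env.API_KEY_PEPPER ?? ''
  return createHmac('sha256', pepper).update(`${prefix}|${key}`).digest('hex')
}

// Least-privilege check: the key must belong to the customer and, if it is bound
// to one instance, it may only act on that instance.
export function canUseInstance(row: ApiKeyRow, customerId: string, instanceId: string): boolean {
  if (row.customerId !== customerId) return false
  return row.instanceId === null || row.instanceId === instanceId
}
```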

Trust boundaries

Each boundary, and what's enforced at it:

  • API key → customer: HMAC-SHA256(prefix|key) with env pepper; no plaintext stored.
  • Outbound webhook URL: public-DNS only; loopback / link-local / RFC1918 / cloud-metadata IPs all rejected at create and delivery time (DNS-rebind defense). HTTPS-only enforced when NODE_ENV=production.
  • Webhook signing secret: AES-256-GCM at rest with key rotation (SECRETS_KEY + optional SECRETS_KEY_PREVIOUS).
  • Cross-customer access: every query filters by customerId. No "global admin" path.
  • Audit log: append-only audit_events table with ON DELETE SET NULL so history outlives the row it referenced.
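
For the outbound-URL boundary, the guard could be sketched like this, assuming Node's built-in BlockList; the ranges below cover the cases named above but are not necessarily the exact production list.

```ts
import { lookup } from 'node:dns/promises'
import { BlockList, isIP } from 'node:net'

// Address ranges a delivery must never reach. Checked when the endpoint is created
// and again right before each delivery, so a record that later starts resolving to
// something private (DNS rebinding) is still caught.
const blocked = new BlockList()
blocked.addSubnet('127.0.0.0', 8, 'ipv4')    // loopback
blocked.addSubnet('10.0.0.0', 8, 'ipv4')     // RFC1918
blocked.addSubnet('172.16.0.0', 12, 'ipv4')  // RFC1918
blocked.addSubnet('192.168.0.0', 16, 'ipv4') // RFC1918
blocked.addSubnet('169.254.0.0', 16, 'ipv4') // link-local / cloud metadata (169.254.169.254)
blocked.addAddress('::1', 'ipv6')            // IPv6 loopback
blocked.addSubnet('fc00::', 7, 'ipv6')       // IPv6 unique-local
blocked.addSubnet('fe80::', 10, 'ipv6')      // IPv6 link-local

export async function assertSafeWebhookUrl(raw: string): Promise<void> {
  const url = new URL(raw)

  if (process.env.NODE_ENV === 'production' && url.protocol !== 'https:') {
    throw new Error('webhook URLs must use https in production')
  }

  // Literal IPs and DNS names both end up as addresses we can screen.
  const host = url.hostname.replace(/^\[|\]$/g, '') // strip IPv6 brackets
  const addresses = isIP(host)
    ? [{ address: host, family: isIP(host) }]
    : await lookup(host, { all: true })

  for (const { address, family } of addresses) {
    if (blocked.check(address, family === 6 ? 'ipv6' : 'ipv4')) {
      throw new Error(`webhook host ${url.hostname} resolves to a blocked address`)
    }
  }
}
```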
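
And for signing secrets at rest, a minimal AES-256-GCM sketch with the two-key rotation scheme; the iv.ciphertext.tag storage format is an assumption.

```ts
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'

// SECRETS_KEY is the current 32-byte key (hex); SECRETS_KEY_PREVIOUS keeps decryption
// working while rows are re-encrypted after a rotation.
const keys = [process.env.SECRETS_KEY, process.env.SECRETS_KEY_PREVIOUS]
  .filter((k): k is string => Boolean(k))
  .map((k) => Buffer.from(k, 'hex'))

export function encryptSecret(plaintext: string): string {
  const iv = randomBytes(12) // 96-bit nonce, standard for GCM
  const cipher = createCipheriv('aes-256-gcm', keys[0], iv)
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()])
  return [iv, ciphertext, cipher.getAuthTag()].map((b) => b.toString('hex')).join('.')
}

export function decryptSecret(stored: string): string {
  const [iv, ciphertext, tag] = stored.split('.').map((p) => Buffer.from(p, 'hex'))
  for (const key of keys) { // try the current key, then the previous one
    try {
      const decipher = createDecipheriv('aes-256-gcm', key, iv)
      decipher.setAuthTag(tag)
      return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8')
    } catch {
      continue // auth tag mismatch: this row was encrypted with the other key
    }
  }
  throw new Error('secret cannot be decrypted with any configured key')
}
```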

What's deliberately not here

  • Multi-region. Single primary Postgres, single Redis. Adding a read replica is a small DNS swap; multi-region writes are a different product.
  • A blob store for media. Inbound attachments live in an in-process LRU cache and are served via signed proxy URLs that expire (a sketch of the signing scheme follows this list). Past v1, this becomes S3/R2.
  • A separate worker process. See above. The seams are in place; the split is deferred.
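
As referenced above, the signed proxy URLs can be sketched as an HMAC over the media id plus an expiry timestamp; the route shape, env-var name, and TTL are illustrative.

```ts
import { createHmac, timingSafeEqual } from 'node:crypto'

// Signing secret for media proxy URLs; env-var name assumed for the sketch.
const MEDIA_URL_SECRET = process.env.MEDIA_URL_SECRET ?? 'dev-only-secret'

function sign(mediaId: string, expiresAt: number): string {
  return createHmac('sha256', MEDIA_URL_SECRET).update(`${mediaId}:${expiresAt}`).digest('hex')
}

// Build a link that works without auth headers but stops working after `ttlMs`.
export function signedMediaPath(mediaId: string, ttlMs = 5 * 60_000): string {
  const expiresAt = Date.now() + ttlMs
  return `/v1/media/${mediaId}?expires=${expiresAt}&sig=${sign(mediaId, expiresAt)}`
}

// Verify on the proxy route before serving the LRU-cached bytes.
export function verifyMediaPath(mediaId: string, expiresAt: number, sig: string): boolean {
  if (Date.now() > expiresAt) return false
  const expected = Buffer.from(sign(mediaId, expiresAt), 'hex')
  const given = Buffer.from(sig, 'hex')
  return expected.length === given.length && timingSafeEqual(expected, given)
}
```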