OrbitDesk

Multi-tenant omnichannel CRM platform with AI-powered automation, visual flow builder, and real-time messaging across WhatsApp, Telegram, Messenger, Instagram, Email, SMS & Live Chat.

NestJS React 19 FastAPI PostgreSQL MongoDB Redis Socket.IO Stripe Razorpay Twilio OpenAI Realtime Telegram Messenger Instagram DM

Project Overview

OrbitDesk is a full-featured, multi-tenant SaaS platform that enables businesses to manage customer communications across 7 messaging channels, automate interactions with visual flow builders, and leverage AI chatbots for customer support. Channels include WhatsApp (Cloud API + Baileys QR), Telegram, Facebook Messenger, Instagram DM, Email (SMTP/IMAP), Twilio SMS, and an embeddable Live Chat widget.

Key Capabilities

Omnichannel Inbox

Unified real-time inbox for 7 channels: WhatsApp, Telegram, Messenger, Instagram DM, Email, SMS & Live Chat. Agent assignment, conversation status, read receipts, and channel filter.

Visual Flow Builder

Drag-and-drop automation builder with 13+ node types: messaging, conditions, delays, HTTP calls, AI, and more.

AI Chatbots (RAG)

Orbit-powered AI agents with knowledge base (documents, URLs, text). Auto-handoff to human agents.

👥

Multi-Tenant

Complete tenant isolation with role-based access control (Admin, Manager, Agent, Superadmin).

🔌

Plugin Ecosystem

Modular plugins for Google Sheets, Twilio (SMS/Voice), Email (SMTP), and WhatsApp QR pairing.

📈

Dashboard Analytics

Real-time metrics: message volume, active conversations, contact growth, and 7-day trend charts.

💳

Subscriptions & Billing

Built-in subscription system with Stripe and Razorpay integration. Plan-based resource limits, usage tracking, and automated billing lifecycle.

🔑

REST API Access

Scoped API keys for external integrations. Connect Zapier, CRMs, e-commerce platforms, and custom apps with fine-grained permission control and rate limiting.

📞

AI Voice Calling

Outbound voice campaigns with AI agents powered by OpenAI Realtime API & Twilio. Real-time transcription, tool calling, sentiment analysis, and lead qualification.

Architecture

OrbitDesk follows a monorepo structure managed by Turborepo + pnpm workspaces. The platform consists of three application services backed by three data stores.

┌───────────────┐ │ Nginx │ │ (SSL / Proxy) │ │ :80 / :443 │ └──────┬────────┘ │ ┌────────────┼────────────┐ │ │ │ ┌────┬────┐ ┌───┬────┐ ┌───┬────┐ │ Web │ │ API │ │ AI │ │ React │ │ NestJS │ │ FastAPI │ │ :3000 │ │ :3002 │ │ :8000 │ └─────────┘ └────┬────┘ └─────────┘ │ ┌────────────┼────────────┐ │ │ │ ┌────┬────┐ ┌───┬────┐ ┌───┬────┐ │Postgres │ │ MongoDB │ │ Redis │ │ :5432 │ │ :27017 │ │ :6379 │ └─────────┘ └─────────┘ └─────────┘
ServiceTechnologyPortPurpose
WebReact 19 + Vite 7 + Tailwind v43000Frontend SPA (Nginx in production)
APINestJS 11 + Prisma + Mongoose3002REST API + WebSocket gateway + Voice bridge
AIFastAPI + Uvicorn8000LLM completions & RAG queries
PostgreSQLPostgreSQL 165432Primary relational data (Prisma)
MongoDBMongoDB 727017Conversations, messages, flow sessions
RedisRedis 76379Cache, pub/sub, rate limiting

Tech Stack

Backend

  • NestJS 11 — Framework
  • Prisma 6 — PostgreSQL ORM
  • Mongoose 8 — MongoDB ODM
  • ioredis — Redis client
  • Socket.IO 4 — WebSockets
  • Passport + JWT + API Key — Authentication
  • @nestjs/throttler — Rate limiting
  • Swagger — API docs (/api/docs)
  • Baileys — WhatsApp Web API
  • Twilio SDK — Voice calling & SMS
  • ws — Raw WebSocket (Twilio media stream)
  • node-mulaw — Audio transcoding (μ-law ↔ PCM)
  • AWS SDK v3 — S3 storage (optional)

Frontend

  • React 19 — UI framework
  • Vite 7 — Build tool
  • Tailwind CSS v4 — Styling
  • TanStack Query — Data fetching
  • TanStack Table — Data tables
  • React Router 7 — Routing
  • React Hook Form + Yup — Forms
  • FlowGram.ai — Visual flow editor
  • ApexCharts — Dashboard charts
  • wavesurfer.js — Audio waveform player

AI Service

  • FastAPI — Python web framework
  • Uvicorn — ASGI server
  • OpenAI SDK — LLM provider
  • Multi-provider: OpenAI, Anthropic, Gemini, Mistral, Deepseek

Infrastructure

  • Turborepo — Monorepo orchestration
  • pnpm 9 — Package manager
  • Docker — Containerization
  • Nginx — Reverse proxy + SSL
  • PM2 — Process manager (bare metal)
  • GitHub Actions — CI/CD

Messaging & Inbox

The real-time inbox is the core of OrbitDesk. Agents can manage customer conversations across all connected channels, send messages, assign chats, and track delivery status — all in real time via WebSockets.

WhatsApp Connection Types

Cloud API (Official)

Meta Business Platform integration. Requires Phone Number ID, Business Account ID, and permanent access token. Supports all message types, webhooks, and read receipts.

Production Ready Webhook-based

Baileys QR (Personal)

Connect any WhatsApp number by scanning a QR code. Uses the Baileys library. Requires the whatsapp-qr plugin installed. QR delivered via WebSocket.

Plugin Required Socket-based

Message Types

Text Image Video Audio Document Template Interactive Location Contacts Sticker Reaction

Conversation Lifecycle

Open
Assigned
Pending
Resolved
Closed

Channel Pipeline (all channels)

Inbound Event
Channel Adapter
Normalizer
Contact Upsert
Conversation Upsert
Store (MongoDB)
WebSocket Emit
Flow Trigger

Channel Adapters

OrbitDesk ships with a unified ChannelsModule that abstracts every messaging channel behind a common IMessageChannel interface. All adapters share the same pipeline: inbound normalisation → contact upsert → conversation upsert → MongoDB store → WebSocket emit → flow trigger.

Supported Channels

ChannelInbound MethodOutboundNotes
WhatsApp Cloud API Meta webhook (POST /whatsapp/webhook) Meta Graph API HMAC-SHA256 signature verification, template support
WhatsApp QR (Baileys) Baileys socket events Baileys socket Requires whatsapp-qr plugin; QR via WebSocket
Telegram Bot API webhook (POST /channels/telegram/:token/webhook) Telegram Bot API Text, photo, document, voice, sticker messages
Facebook Messenger Meta Graph webhook (POST /channels/meta/webhook) Meta Graph API HMAC-SHA256 verified; text & attachments
Instagram DM Meta Graph webhook (shared with Messenger) Meta Graph API Messaging product: instagram
Email (SMTP / IMAP) IMAP polling (ImapPollerService) Nodemailer SMTP Configurable poll interval; supports HTML & plain text
Twilio SMS Twilio webhook (POST /channels/twilio-sms/webhook) Twilio REST API Two-way SMS; E.164 phone numbers
Live Chat Widget REST endpoint (POST /channels/live-chat/message) WebSocket push to widget Embeddable JS widget; JWT-authenticated visitor sessions

Architecture

Each adapter implements IMessageChannel and self-registers via ChannelRouterService on module init. Inbound messages are normalised to a common NormalizedMessage shape by a per-channel Normalizer before entering the shared pipeline.

ServiceResponsibility
ChannelRouterServiceRegistry of active adapters; routes outbound sends to the correct adapter by channel type
ChannelPipelineServiceShared inbound processing: contact upsert → conversation upsert → store → emit → flow trigger
ChannelConfigServiceReads per-tenant channel credentials from encrypted integration records
*NormalizerPer-channel class that converts raw provider payloads to NormalizedMessage

Adding a Custom Channel

Implement the IMessageChannel interface, create a Normalizer, register the adapter in ChannelsModule, and add a controller for the inbound webhook. No changes needed to the pipeline, flow engine, or inbox.

Contact Management

Contacts are the CRM backbone. Each contact is scoped to a tenant and identified by phone number. Contacts support custom fields and tags for segmentation.

FieldTypeDescription
phoneString (unique/tenant)WhatsApp phone number (E.164)
nameStringContact display name
emailString (optional)Email address
tagsString[]Tags for segmentation (e.g., "VIP", "Lead")
customFieldsJSONFlexible key-value store (e.g., company, city)
optedInBooleanOpted into communications
avatarUrlString (optional)Profile picture URL

Flow Automation

The visual flow builder (powered by FlowGram.ai) lets users design automation workflows with a drag-and-drop canvas. Flows are triggered by events and execute a graph of nodes.

Flow Lifecycle

Draft
Active
Paused
Archived

Available Node Types (13)

Trigger

Entry point. Fires on: message received, keyword match, contact created, tag added, manual trigger, or external webhook.

Send Message

Send WhatsApp text, template, or media messages to the contact.

Condition

If/else branching based on variables, message content, or contact fields.

Delay

Wait for a specified duration (seconds, minutes, hours) before continuing.

AI Response

Call an LLM (OpenAI, Anthropic, etc.) with system prompt and dynamic context.

HTTP Request

Call external APIs (GET/POST/PUT/DELETE) with custom headers and body.

Set Variable

Create or update flow variables for use in subsequent nodes.

Send Email

Send emails via SMTP. Requires the email plugin.

Send SMS

Send SMS via Twilio. Requires the twilio plugin.

Voice Call

Initiate voice calls via Twilio. Requires the twilio plugin.

Google Sheets

Append data to a Google Sheets spreadsheet. Requires google-sheets plugin + OAuth.

Loop

Iterate over array variables and execute child nodes for each item.

End

Terminal node. Ends flow execution and marks session as completed.

Trigger Types

TriggerDescription
message_receivedAny incoming WhatsApp message
keywordMessage contains a specific keyword
contact_createdNew contact is created
tag_addedA tag is added to a contact
manualManually triggered by a user
webhookExternal HTTP request to a unique webhook URL (details below)

Flow Execution Engine

The backend flow engine (FlowExecutorService) traverses the graph from the trigger node, executing each node via the NodeExecutorFactory. Flow sessions (state, variables, execution log) are stored in MongoDB. Webhook-triggered flows run asynchronously — the HTTP response is returned immediately while the flow executes in the background.

Webhook Integration

External services can trigger flow automations by sending HTTP requests to unique webhook URLs. This enables push-based integrations with e-commerce platforms, form builders, payment gateways, and any system that supports outgoing webhooks.

How It Works

External Service
POST /api/v1/webhooks/:token
Verify Signature
Start Flow
Execute Nodes

When a flow’s trigger type is set to webhook, the system generates a unique UUID token. External services send a POST request to /api/v1/webhooks/{token}. The system validates the request, triggers the flow asynchronously, and returns an immediate 200 response.

Webhook URL

POST https://<your-domain>/api/v1/webhooks/<webhook-token>

The webhook token is a UUID generated automatically when a flow’s trigger type is set to webhook. It is globally unique and serves as both the identifier and access key.

Note: The webhook URL is shown in the flow editor once the flow is saved with a webhook trigger. You can copy it directly from the trigger node configuration panel.

Authentication & Signature Verification

Webhook endpoints are public (no Bearer token required) — access is controlled by the unique token in the URL. For additional security, you can configure an optional HMAC-SHA256 secret.

HeaderDescription
x-hub-signature-256GitHub-style HMAC signature (preferred)
x-webhook-signatureAlternative signature header
x-signatureGeneric signature header

If a webhook secret is configured, incoming requests must include a valid signature in one of the headers above. The expected format is:

sha256=<hex-encoded HMAC-SHA256 of raw request body>

The system uses timing-safe comparison to prevent timing attacks. Requests with invalid or missing signatures receive a 403 Forbidden response.

Request Format

POST /api/v1/webhooks/a1b2c3d4-e5f6-7890-abcd-ef1234567890?source=shopify HTTP/1.1
Host: app.example.com
Content-Type: application/json
X-Hub-Signature-256: sha256=abcdef1234567890...

{
  "event": "order.created",
  "data": {
    "orderId": "12345",
    "amount": 99.99,
    "customer": {
      "email": "user@example.com",
      "name": "John Doe"
    }
  }
}

The endpoint accepts any valid JSON body. Query parameters, headers, and the full body are all captured and made available as flow variables.

Response

HTTP/1.1 200 OK
Content-Type: application/json

{ "success": true, "message": "Webhook received" }
Asynchronous execution: The response is returned immediately. The flow runs in the background. Long-running flows do not block the webhook response.

Available Variables

Webhook data is accessible in all subsequent flow nodes using template syntax:

VariableDescription
{{webhook.body}}Full JSON request body
{{webhook.body.event}}Nested field access (dot notation)
{{webhook.headers}}Request headers (sanitized — auth/cookie headers removed)
{{webhook.query}}URL query parameters
{{webhook.method}}HTTP method (always POST)
{{webhook.receivedAt}}ISO 8601 timestamp when the webhook was received
{{webhook.token}}The webhook token used

Example — Using webhook data in a Send Message node:

Dear {{webhook.body.data.customer.name}},

Your order #{{webhook.body.data.orderId}} totaling ${{webhook.body.data.amount}}
has been received. We'll process it shortly.

Token Management

ActionEndpointDescription
Auto-generatedToken created automatically when trigger type is set to webhook
Regenerate token POST /flows/:id/regenerate-webhook-token Generates a new UUID. The old URL becomes invalid immediately. Use if a token is compromised.
Set secret PATCH /flows/:id/webhook-secret Sets HMAC-SHA256 secret for signature verification. Pass null to disable.

Verification Endpoint

GET /api/v1/webhooks/:token

Returns the flow name and status. Use this to verify connectivity from your external service before sending events.

Security

Error Responses

StatusReason
404 Not FoundInvalid webhook token (no matching flow)
403 ForbiddenFlow is not active, or signature verification failed
200 OKWebhook received and flow triggered successfully

cURL Example

# Without signature verification
curl -X POST https://app.example.com/api/v1/webhooks/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{"event": "order.created", "orderId": "12345"}'

# With HMAC-SHA256 signature
BODY='{"event": "order.created", "orderId": "12345"}'
SECRET="your-webhook-secret"
SIGNATURE="sha256=$(echo -n "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | cut -d' ' -f2)"

curl -X POST https://app.example.com/api/v1/webhooks/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -H "X-Hub-Signature-256: $SIGNATURE" \
  -d "$BODY"

Common Use Cases

🛒

E-commerce Orders

Shopify, WooCommerce, or custom storefronts send order events to trigger order confirmation and shipping notifications.

📋

Form Submissions

Typeform, Google Forms, or landing pages trigger lead nurturing flows when a form is submitted.

💰

Payment Events

Stripe, Razorpay, or PayPal send payment success/failure events to trigger customer notifications.

🔧

Custom Integrations

Any system with outgoing webhook support (CRMs, ERPs, monitoring tools) can trigger automated workflows.

Important: Webhook-triggered flows do not have an associated contact or conversation. Nodes that require a contact (like Send Message) need the contact phone number or ID to be passed in the webhook payload and used via variables.

REST API Access

OrbitDesk provides scoped API key authentication for external integrations. API keys allow third-party services, automation platforms, and custom applications to access the REST API without requiring user login credentials.

How It Works

Create API Key
Select Scopes
Copy Key (shown once)
Use X-Api-Key Header
Access Scoped Endpoints

API keys are managed from Settings > API Keys (admin-only). Each key is generated with a wcrm_ prefix, and only the SHA-256 hash is stored in the database. The raw key is displayed only once at creation time.

Authentication

API key authentication works alongside existing JWT authentication. Any protected endpoint accepts either a Bearer token or an X-Api-Key header. The system tries JWT first, then falls back to API key validation.

# Authenticate with API key
curl -H "X-Api-Key: wcrm_abc123..." \
  https://your-domain.com/api/v1/contacts

# JWT authentication still works as before
curl -H "Authorization: Bearer eyJhb..." \
  https://your-domain.com/api/v1/contacts

API key requests are automatically scoped to the tenant that created the key via the TenantScopeInterceptor — no manual tenant ID required.

Available Scopes

Each API key is granted specific scopes that control which endpoints it can access. JWT-authenticated users bypass scope checks entirely (full access as before).

ScopeGrants Access To
contacts:readList and view contacts
contacts:writeCreate, update, and delete contacts
conversations:readList and view conversations
messages:readList and view messages
messages:sendSend messages
flows:readList and view automation flows
flows:triggerTrigger flows programmatically
templates:readList and view message templates
Scope enforcement: API keys cannot access admin-only endpoints (tenant settings, user management, API key management). These are restricted to JWT-authenticated admin/superadmin users.

Key Management Endpoints

These endpoints require JWT authentication with admin or superadmin role:

POST/api-keysCreate API key (returns raw key once)
GET/api-keysList all API keys for tenant
DELETE/api-keys/:id/revokeRevoke (deactivate) an API key
DELETE/api-keys/:idPermanently delete an API key

Create Key Request

POST /api/v1/api-keys
Authorization: Bearer <admin-jwt-token>
Content-Type: application/json

{
  "name": "Zapier Integration",
  "scopes": ["contacts:read", "contacts:write", "messages:send"],
  "expiresAt": "2027-01-01T00:00:00Z"   // optional, null = never expires
}

Create Key Response

HTTP/1.1 201 Created

{
  "id": "a1b2c3d4-...",
  "name": "Zapier Integration",
  "key": "wcrm_7f3a9b2c4d5e6f...",       // ⚠ shown ONLY once
  "scopes": ["contacts:read", "contacts:write", "messages:send"],
  "expiresAt": "2027-01-01T00:00:00.000Z",
  "createdAt": "2026-02-19T10:30:00.000Z",
  "isActive": true
}
Important: The raw API key (wcrm_...) is only returned in the creation response. It is not stored — only its SHA-256 hash is saved. Copy and securely store the key immediately after creation.

Rate Limiting

All API requests are rate-limited to 100 requests per minute. API key requests are tracked per key ID, while JWT requests are tracked per IP address. Rate limit information is included in response headers:

HeaderDescription
X-RateLimit-LimitMaximum requests per window
X-RateLimit-RemainingRemaining requests in current window
Retry-AfterSeconds to wait when rate limited (429 response)

Use Cases

🔄

CRM Sync

Bidirectional contact sync with HubSpot, Salesforce, or Zoho. Use contacts:read + contacts:write scopes.

No-Code Automations

Connect Zapier, Make, or n8n workflows. Example: Shopify order triggers WhatsApp message via messages:send.

🛒

E-Commerce Notifications

Shopify/WooCommerce sends order confirmations and shipping updates via messages:send + contacts:read.

📊

Custom Dashboards & BI

Pull data into Metabase or Google Data Studio using conversations:read + messages:read.

📋

Lead Capture Forms

Website forms create contacts directly via contacts:write and trigger welcome flows via flows:trigger.

📚

Bulk Contact Import

Scripts or microservices batch-import contacts from CSV or external databases using contacts:write.

Security

Error Responses

StatusReason
401 UnauthorizedInvalid, revoked, or expired API key
403 ForbiddenAPI key does not have required scope
429 Too Many RequestsRate limit exceeded

cURL Examples

# List contacts with API key
curl -H "X-Api-Key: wcrm_7f3a9b2c4d5e6f..." \
  https://your-domain.com/api/v1/contacts

# Create a contact
curl -X POST -H "X-Api-Key: wcrm_7f3a9b2c4d5e6f..." \
  -H "Content-Type: application/json" \
  -d '{"phone": "+1234567890", "name": "John Doe", "email": "john@example.com"}' \
  https://your-domain.com/api/v1/contacts

# Send a message
curl -X POST -H "X-Api-Key: wcrm_7f3a9b2c4d5e6f..." \
  -H "Content-Type: application/json" \
  -d '{"conversationId": "conv-uuid", "type": "text", "content": "Hello!"}' \
  https://your-domain.com/api/v1/messages

Templates & Campaigns

WhatsApp Message Templates

Templates are pre-approved message formats required by Meta for outbound messaging outside the 24-hour window. They go through a review process before they can be used.

Draft
Pending (Meta Review)
Approved
or
Rejected

Template components: Header (text/image/video), Body (text with variables), Footer (text), Buttons (quick reply / URL). Sync with Meta Business Account via API.

Campaigns (Bulk Messaging)

Campaigns enable sending templates to a list of contacts at scale. Campaign statuses: Draft Scheduled Running Completed Failed

Each campaign tracks totalRecipients, sentCount, and failedCount for delivery reporting.

AI Chatbots

OrbitDesk integrates with agents.fastlab.ai to provide AI-powered chatbots with Retrieval-Augmented Generation (RAG) and optional AI Function Calling. Each tenant can create multiple chat agents with their own knowledge bases and tool capabilities.

How It Works

  1. Create a Chat Agent — Define name, description, and behavior in Chat Agents
  2. Upload Datasources — Files (PDF, text), URLs, or raw text for RAG indexing
  3. Enable AI Function Calling — Optionally give the agent tools to take actions (CRM lookups, bookings, flow triggers)
  4. Assign to Conversations — Bot automatically responds to incoming messages
  5. Human Handoff — Configurable keywords trigger transfer to a live agent

Two Agent Modes

📚

RAG Mode (default)

The agent answers questions using a knowledge base (PDFs, URLs, text). Queries are embedded and matched via vector search against the tenant's indexed datasources. Best for Q&A and support bots.

Tool Calling Mode

The agent runs a multi-step function-calling loop — it can look up contacts, update CRM fields, book appointments, trigger automation flows, and update conversation status before sending a final reply.

AI Function Calling

When AI Function Calling is enabled on a chat agent, incoming messages are routed to the Orbit FastAPI AI service instead of the standard RAG path. The service runs an autonomous tool-calling loop:

WhatsApp Message │ ▼ NestJS: agent-chat.service.ts │ checks chatAgent.toolsEnabled │ ├─ false ──▶ Orbit RAG path (knowledge base Q&A) │ └─ true ──▶ Orbit FastAPI POST /api/v1/agent/chat │ ├─ LLM call with tools defined │ ├─ tool_call returned │ └─ calls back to NestJS REST API │ (contacts, conversations, webhooks) │ ├─ feed result back to LLM → repeat │ └─ finish_reason = "stop" │ ▼ Final text ──▶ WhatsApp reply

Available Tools (5)

ToolWhat it doesHost app endpoint
lookup_crm Search contacts by phone, name, or email GET /contacts?search=...
update_contact Update contact name, email, notes, tags, or custom fields PATCH /contacts/:id
book_appointment Save a structured appointment note on a contact record GET + PATCH /contacts/:id
trigger_flow Trigger an automation flow via its webhook token POST /webhooks/:token
update_conversation Change conversation status or tags PATCH /conversations/:id

Configuring Tool Calling via the UI

Tool calling is configured in Chat Agents — no API calls or seeders required.

  1. Open Chat Agents and click Add Agent or the edit icon on an existing agent.
  2. Scroll to the AI Function Calling section and toggle it on.
  3. Configure the fields:
FieldDescription
Provideropenai or anthropic
Modele.g. gpt-4o-mini, gpt-4o, claude-3-5-haiku
System PromptOptional custom instructions for the tool-calling agent
Enabled ToolsCheckboxes for each of the 5 built-in tools
Max IterationsTool-calling rounds before fallback (1–20, default 10)

The configuration is stored on the ChatAgent record in PostgreSQL as toolsEnabled (boolean) and toolsConfig (JSON).

Fallback behaviour: If max_iterations is reached or the LLM provider throws a fatal error, the agent replies with a polite fallback message and sets fallback_used: true in the response log. The conversation is never silently dropped.

AI Service (FastAPI)

The standalone Python microservice at services/ai/ handles both RAG and tool-calling paths:

The service is stateless — all context (contact info, conversation ID, tool config) is passed per-request by NestJS. Authentication uses short-lived HS256 JWTs (INTERNAL_JWT_SECRET) that expire after 2 minutes. See services/ai/REQUIREMENTS.md for the full API spec.

AI Voice Calling

OrbitDesk includes a full AI-powered voice calling system built on Twilio for telephony and OpenAI Realtime API for conversational AI. AI voice agents can make outbound calls, converse naturally with contacts, execute tool calls, and qualify leads — all autonomously.

Architecture: NestJS handles telephony orchestration and WebSocket bridging. Twilio sends raw audio via WebSocket, which NestJS bridges to OpenAI Realtime API with real-time μ-law ↔ PCM16 transcoding. All AI inference for post-call processing (summarization, sentiment, transcription) is handled by the Orbit FastAPI microservice.

Voice Call Flow

Initiate Call
Twilio Dials
Media Stream WS
OpenAI Realtime
AI Converses
Post-Call Analysis
Contact's Phone OrbitDesk (NestJS) OpenAI | | | | <-- Twilio Call --> | | | Audio (μ-law 8kHz) ------> | MediaStreamGateway | | | Transcode μ-law → PCM16 ------> Realtime API | | Transcode PCM16 ← μ-law <------ (AI Response) | Audio (μ-law 8kHz) <------ | | | | Tool calls ────────> ToolExecutor | | Transcript ────────> ChatGateway (Socket.IO) | | | | Call Ends | Post-call: summarize, sentiment, qualify | | ───────> Orbit FastAPI |

Core Components

🤖

Voice Agents

Configure AI agents with custom system prompts, voice selection (OpenAI or ElevenLabs), tool capabilities, knowledge base integration, and conversation flow design.

📣

Voice Campaigns

Schedule bulk outbound call campaigns. Assign an agent, contact list, phone number, concurrency limits, and retry configuration. DNC contacts are automatically skipped.

📖

Knowledge Bases

Upload documents (PDF, text), URLs, or raw text. Content is chunked and embedded with pgvector for real-time RAG retrieval during live calls.

📄

Call Logs & Transcripts

Full call history with audio playback (wavesurfer.js), live transcription viewer, sentiment analysis, lead qualification, and tool call logs.

📞

Phone Numbers

Browse and purchase Twilio phone numbers, assign them to voice agents, and manage your number inventory per tenant.

📈

Voice Analytics

Dashboard with call volume trends, outcome distribution, sentiment breakdown, agent performance, campaign comparison, cost analysis, lead funnel, and hourly heatmap.

Voice Agent Configuration

FieldTypeDescription
nameStringAgent display name
systemPromptTextInstructions and persona for the AI (supports dynamic variables)
voiceProviderOPENAI / ELEVENLABSTTS voice provider
voiceIdStringVoice selection (e.g., "alloy", "echo", "shimmer" for OpenAI)
modelStringOpenAI model (default: gpt-4o-realtime-preview)
temperatureFloatAI creativity (0.0 – 1.0, default 0.7)
maxCallDurationIntAuto-hangup after N seconds (default 600)
greetingMessageStringFirst message the agent speaks when the call is answered
toolsConfigJSONArray of callable tools (book_appointment, transfer_call, lookup_crm, etc.)
flowConfigJSONVisual call flow (optional alternative to free-form system prompt)
knowledgeBaseIdUUIDLinked knowledge base for RAG retrieval during calls

Available Agent Tools

ToolDescription
book_appointmentSchedule an appointment (date, time, notes)
transfer_callTransfer to a live agent or external number
send_smsSend an SMS to the contact during the call
lookup_crmLook up contact details from the CRM
search_knowledge_baseSearch the knowledge base for relevant information
save_lead_infoSave captured lead data (name, email, interests)
end_callGracefully end the conversation
custom_webhookPOST to an external URL with call context

Voice Campaigns

Voice campaigns enable automated bulk outbound calling with AI agents. Campaigns support scheduling, concurrency control, retry logic, and DNC compliance.

Draft
Scheduled
Running
Completed
SettingDescription
AgentWhich voice agent handles the calls
Contact ListTarget contacts (DNC contacts automatically excluded)
Phone NumberTwilio number to call from
ConcurrencyMax simultaneous calls (1–10)
ScheduleStart time, end time, timezone, active days
Retry ConfigMax retries, delay between retries, retry on which statuses
DNC Compliance: Contacts flagged as isOnDncList = true are automatically skipped by the campaign scheduler. The DNC list is managed from Settings > DNC List where you can add numbers individually, upload a CSV, or search and remove entries.

Call Lifecycle

QUEUED
RINGING
IN_PROGRESS
COMPLETED

Other terminal states: FAILED NO_ANSWER BUSY CANCELED

Post-Call Processing

After each call ends, a Bull queue pipeline performs:

  1. Transcription — Full conversation transcript stored as JSON
  2. Summarization — AI-generated call summary via Orbit FastAPI
  3. Sentiment Analysis — POSITIVE / NEUTRAL / NEGATIVE / MIXED / UNKNOWN
  4. Lead Qualification — HOT / WARM / COLD with score, interests, and next steps
  5. Cost Calculation — Per-minute, TTS, STT, and LLM token costs tracked
  6. Webhook Dispatchcall.completed, lead.qualified events sent to tenant webhooks

Knowledge Bases

Voice knowledge bases provide real-time information retrieval during live calls. When the AI agent needs specific information, it queries the knowledge base via pgvector semantic search.

Source TypeDescription
FILEUpload PDF or text documents (chunked and embedded)
URLCrawl a web page, extract text, chunk and embed
TEXTPaste raw text content directly

Embeddings are generated via OpenAI text-embedding-3-small (1536 dimensions) and stored with pgvector for fast similarity search.

Voice Analytics Dashboard

The analytics dashboard provides 8 visualization panels:

KPI Cards

Total calls, total duration, total cost, and success rate.

Call Volume

30-day area chart with daily calls broken down by status.

Outcomes & Sentiment

Donut charts showing call outcome distribution and sentiment breakdown.

Agent Performance

Table comparing agents by total calls, success rate, avg duration, and cost.

Campaign Comparison

Bar chart comparing campaigns by total, completed, and failed calls.

Lead Funnel

Total → Connected → Completed → Qualified conversion funnel.

Cost Breakdown

Daily cost trend with per-type breakdown (call minutes, TTS, STT, LLM tokens).

Hourly Heatmap

7×24 heatmap showing call volume by day of week and hour.

Per-Tenant API Key Isolation

Each tenant configures their own API keys for voice services in Settings > Integrations:

Keys are resolved per-request: tenant-specific key first, with global environment variable as fallback.

Plugin System

OrbitDesk has a modular plugin architecture. Plugins extend functionality and are installed per-tenant from the marketplace.

PluginKeyCategoryCapabilities
WhatsApp QRwhatsapp-qrMessagingEnable Baileys QR code pairing
Google Sheetsgoogle-sheetsGoogleWrite flow data to spreadsheets
TwiliotwilioCommunicationSMS sending, voice calls
Email (SMTP)emailCommunicationSend emails from flows

Plugins can be FREE or PAID, and have visibility: GLOBAL (all tenants) or TENANT (specific tenant). Custom plugins can be uploaded as .tgz packages.

Multi-Tenancy & Authentication

Tenant Isolation

Every resource (contacts, conversations, flows, templates, etc.) is scoped to a tenantId. The API extracts tenant context from the JWT token, ensuring complete data isolation between organizations.

Authentication Methods

OrbitDesk supports two authentication methods. Both extract tenant context automatically:

JWT (Dashboard Users)

Login
JWT (15m)
Bearer Header

Full access based on user role (admin/manager/agent).

API Key (External Integrations)

Create Key
X-Api-Key Header
Scoped Access

Limited to assigned scopes. No admin access. Details →

Role-Based Access Control

RoleScopeCapabilities
SUPERADMINPlatform-wideManage all tenants, approve registrations, platform settings
ADMINTenantFull tenant management: users, settings, plugins, integrations, flows
MANAGERTenantManage team, view analytics, manage flows and templates
AGENTTenantHandle conversations, send messages, view contacts

Tenant Onboarding

Register
Status: Pending
Superadmin Approves
Status: Active
Trial Subscription

Subscription Plans

Each plan defines limits: maxAgents, maxContacts, maxMessagesPerMonth, plus a flexible features JSON for feature flags. Plans can be priced and sorted in the marketplace.

Dashboard & Analytics

The home dashboard provides a real-time overview of key metrics, cached in Redis for 5 minutes.

Total Contacts

Count of all contacts in the tenant's CRM.

Messages Today

Total inbound + outbound messages sent today.

Active Conversations

Conversations with status "open" or "assigned".

Active Flows

Number of flows with status "ACTIVE".

7-Day Message Volume Chart — Line/bar chart showing inbound vs outbound messages over the past week, powered by ApexCharts.

Subscriptions & Billing

OrbitDesk includes a complete subscription and billing system with support for multiple payment gateways. Tenants can subscribe to plans with resource limits, and superadmins can manage plans, configure payment providers, and manually assign plans.

Payment Gateways

Stripe

Full integration with Stripe Checkout for subscription payments. Supports automatic webhook handling for activation, renewal, payment failure, and cancellation events.

Checkout Sessions Webhooks

Razorpay

Native Razorpay subscription integration with dynamic plan creation. Handles subscription activation, charging, halting, and cancellation via webhooks.

Subscriptions API Webhooks

Plan Management

Superadmins can create and manage subscription plans with configurable resource limits:

Resource Limits

Each plan defines limits for: agents, contacts, messages/month, flows, chat agents, templates, and campaigns.

Limit Enforcement

Resource creation is automatically blocked when a tenant exceeds their plan limits. Usage tracked in real time with Redis caching (60–120s TTL).

Gateway Integration

Plans can be linked to Stripe price IDs and Razorpay plan IDs for automatic billing. Dynamic pricing creation is also supported.

Manual Assignment

Superadmins can manually assign plans to tenants without payment, useful for enterprise deals, trials, or internal accounts.

Subscription Lifecycle

Select Plan Choose Provider Checkout Redirect Payment Webhook Activation Subscription Active
Subscription Statuses Subscriptions can be in one of four states: ACTIVE, TRIALING, PAST_DUE (payment failed), or CANCELED. Cancellations take effect at the end of the current billing period.

Webhook Events

ProviderEventAction
Stripecheckout.session.completedActivate subscription
Stripeinvoice.payment_succeededRenew subscription period
Stripeinvoice.payment_failedMark as PAST_DUE
Stripecustomer.subscription.deletedMark as CANCELED
Razorpaysubscription.activatedActivate subscription
Razorpaysubscription.chargedRenew subscription period
Razorpaysubscription.pending / haltedMark as PAST_DUE
Razorpaysubscription.cancelledMark as CANCELED

Usage Tracking

The PlanLimitsService tracks resource usage across seven dimensions and enforces limits automatically:

ResourcePlan Limit FieldSource
AgentsmaxAgentsPostgreSQL (User count)
ContactsmaxContactsPostgreSQL (Contact count)
Messages/monthmaxMessagesPerMonthMongoDB (current month)
FlowsmaxFlowsPostgreSQL (Flow count)
Chat AgentsmaxChatAgentsPostgreSQL (ChatAgent count)
TemplatesmaxTemplatesPostgreSQL (Template count)
CampaignsmaxCampaignsPostgreSQL (Campaign count)
Voice AgentsmaxVoiceAgentsPostgreSQL (VoiceAgent count)
Calls/monthmaxCallsPerMonthPostgreSQL (Call count, current month)
Knowledge BasesmaxKnowledgeBasesPostgreSQL (VoiceKnowledgeBase count)

Payment Provider Configuration

Payment providers are configured via the Superadmin panel. Credentials are stored encrypted using AES encryption and masked in API responses (only last 4 characters shown). Only one provider can be active at a time.

Stripe Credentials
  • Publishable Key (pk_live_...)
  • Secret Key (sk_live_...)
  • Webhook Secret (whsec_...)
Razorpay Credentials
  • Key ID (rzp_live_...)
  • Key Secret
  • Webhook Secret

API Reference

All API endpoints are prefixed with /api/v1. Interactive Swagger docs are available at /api/docs in development.

Authentication All endpoints (except auth and webhook) require authentication via either a Bearer token in the Authorization header or an API key in the X-Api-Key header. Tenant context is extracted automatically from both JWT and API key. See REST API Access for API key details.

Auth

POST/auth/registerRegister tenant + admin user
POST/auth/loginLogin, returns access + refresh tokens
POST/auth/refreshRefresh access token
POST/auth/logoutInvalidate token
GET/auth/meCurrent user profile

Conversations

GET/conversationsList conversations (paginated, filterable)
GET/conversations/:idGet conversation with messages
PATCH/conversations/:id/assignAssign agent
PATCH/conversations/:id/statusUpdate status
POST/conversations/:id/readMark as read
PATCH/conversations/:id/tagsUpdate tags
PATCH/conversations/:id/assign-botAssign AI chat agent

Messages

GET/messages?conversationId=...List messages (cursor pagination)
POST/messagesSend message
PATCH/messages/:idEdit outbound text message
DELETE/messages/:idDelete outbound message

Contacts

GET/contactsList contacts (paginated)
POST/contactsCreate contact
PATCH/contacts/:idUpdate contact
DELETE/contacts/:idDelete contact

Flows

GET/flowsList flows
POST/flowsCreate flow
PATCH/flows/:idUpdate flow graph
POST/flows/:id/activateActivate flow
POST/flows/:id/deactivateDeactivate flow
DELETE/flows/:idDelete flow
POST/flows/:id/regenerate-webhook-tokenRegenerate webhook token
PATCH/flows/:id/webhook-secretSet/clear webhook secret

Webhooks (Flow Triggers)

GET/webhooks/:tokenVerify webhook connectivity
POST/webhooks/:tokenTrigger flow via webhook (public)

Templates

GET/templatesList templates
POST/templatesCreate template
PATCH/templates/:idUpdate template
POST/templates/syncSync from Meta

Chat Agents

GET/chat-agentsList AI agents
POST/chat-agentsCreate agent
POST/chat-agents/:id/datasourcesUpload datasource (RAG)
DELETE/chat-agents/:id/datasources/:dsIdDelete datasource

WhatsApp

GET/whatsapp/webhookMeta webhook verification
POST/whatsapp/webhookReceive webhook events
POST/whatsapp/sendSend WhatsApp message
POST/whatsapp/baileys/startStart QR session

Channel Adapters (public inbound)

POST/channels/telegram/:token/webhookTelegram Bot webhook receiver
GET/channels/meta/webhookMeta (Messenger/Instagram) webhook verification
POST/channels/meta/webhookMeta (Messenger/Instagram) inbound events
POST/channels/twilio-sms/webhookTwilio SMS inbound webhook
POST/channels/live-chat/messageLive Chat widget inbound message (public)
POST/channels/email/inboundEmail inbound (forwarded by IMAP poller)

Plans & Subscriptions

GET/plansList active plans (public)
GET/plans/subscriptionCurrent subscription + usage stats
POST/subscriptions/checkoutCreate checkout session (Stripe/Razorpay)
POST/subscriptions/cancelCancel subscription at period end
GET/subscriptions/providersAvailable payment providers + public keys

Webhooks (Payment)

POST/webhooks/stripeStripe webhook events
POST/webhooks/razorpayRazorpay webhook events

Superadmin — Plans

GET/superadmin/plansList all plans (incl. inactive)
POST/superadmin/plansCreate plan
PATCH/superadmin/plans/:idUpdate plan
DELETE/superadmin/plans/:idSoft-delete (deactivate) plan
POST/superadmin/tenants/:tenantId/assign-planManually assign plan to tenant

Superadmin — Payment Providers

GET/superadmin/payment-providersList provider configs (masked)
POST/superadmin/payment-providersCreate/update provider config
PATCH/superadmin/payment-providers/:provider/toggleToggle provider active status
DELETE/superadmin/payment-providers/:providerDelete provider config

API Keys

POST/api-keysCreate API key (admin only)
GET/api-keysList API keys (admin only)
DELETE/api-keys/:id/revokeRevoke API key
DELETE/api-keys/:idDelete API key permanently

Voice Agents

GET/voice-agentsList voice agents
POST/voice-agentsCreate voice agent
GET/voice-agents/:idGet voice agent details
PATCH/voice-agents/:idUpdate voice agent
DELETE/voice-agents/:idDelete voice agent
POST/voice-agents/:id/test-callInitiate a test call

Voice Knowledge Bases

GET/voice-knowledge-basesList knowledge bases
POST/voice-knowledge-basesCreate knowledge base
GET/voice-knowledge-bases/:idGet knowledge base with sources
DELETE/voice-knowledge-bases/:idDelete knowledge base
POST/voice-knowledge-bases/:id/sourcesAdd source (file upload, URL, or text)
DELETE/voice-knowledge-bases/:id/sources/:sourceIdRemove source

Voice Campaigns

GET/voice-campaignsList voice campaigns
POST/voice-campaignsCreate campaign
GET/voice-campaigns/:idGet campaign details with progress
POST/voice-campaigns/:id/startStart campaign
POST/voice-campaigns/:id/pausePause running campaign
POST/voice-campaigns/:id/resumeResume paused campaign
POST/voice-campaigns/:id/cancelCancel campaign

Calls

GET/callsList calls (paginated, filterable by status, agent, campaign, sentiment, date)
GET/calls/:idGet call details (includes summary, transcript, lead qualification)
GET/calls/:id/transcriptGet call transcript
GET/calls/:id/eventsGet call events timeline
GET/calls/export/csvExport calls as CSV

Phone Numbers

GET/phone-numbersList purchased phone numbers
GET/phone-numbers/availableSearch available Twilio numbers
POST/phone-numbers/buyPurchase a phone number
PATCH/phone-numbers/:id/assignAssign number to voice agent
DELETE/phone-numbers/:id/releaseRelease phone number

Voice Analytics

GET/voice-analytics/dashboardKPI summary (total calls, duration, cost, success rate)
GET/voice-analytics/call-volumeDaily call volume by status (30 days)
GET/voice-analytics/sentimentSentiment distribution
GET/voice-analytics/agentsPer-agent performance stats
GET/voice-analytics/campaignsPer-campaign comparison stats
GET/voice-analytics/costsDaily cost breakdown by type
GET/voice-analytics/lead-funnelLead qualification funnel
GET/voice-analytics/heatmapCall volume heatmap (day × hour)

Twilio Webhooks (Voice)

POST/twilio-voice/incomingHandle inbound Twilio call (returns TwiML)
POST/twilio-voice/statusTwilio call status callback
POST/twilio-voice/recordingRecording status callback

DNC (Do Not Call)

GET/contacts?isOnDncList=trueList DNC contacts
PATCH/contacts/:idSet isOnDncList flag on contact
POST/contacts/dnc/bulk-addBulk add phone numbers to DNC list

Other

GET/dashboard/statsDashboard analytics
POST/media/uploadUpload media (16MB limit)
GET/plugins/marketplaceAvailable plugins

Data Models

PostgreSQL (Prisma ORM)

Relational data for entities that need ACID guarantees, foreign keys, and complex queries.

ModelKey FieldsPurpose
Tenantname, slug, connectionType, status, settingsOrganization / workspace
Useremail, name, role, tenantId, isActiveTeam members with RBAC
Contactphone, name, email, tags, customFieldsCRM contacts
Flowname, graphJson, status, triggerTypeAutomation workflows
Templatename, language, category, status, componentsWhatsApp message templates
Campaignname, templateId, status, sentCountBulk messaging campaigns
Planname, price, maxAgents, maxContacts, maxMessagesPerMonth, maxFlows, maxChatAgents, maxTemplates, maxCampaigns, stripePriceId, razorpayPlanId, featuresSubscription tiers with resource limits
SubscriptiontenantId, planId, status, paymentProvider, currentPeriodStart/End, cancelAtPeriodEnd, stripeSubscriptionId, razorpaySubscriptionIdActive subscriptions with payment tracking
PaymentProviderConfigprovider (unique), displayName, isActive, credentials (encrypted JSON)Payment gateway configurations (Stripe/Razorpay)
Pluginkey, name, billing, visibility, typeInstallable extensions
TenantPlugintenantId, pluginId, status, configPlugin installations
ChatAgentname, orbitChatbotId, handoffKeywordsAI chatbot configs
ApiKeyname, keyHash, scopes[], expiresAt, lastUsedAt, isActiveScoped API keys for external integrations
VoiceAgentname, systemPrompt, voiceProvider, voiceId, model, temperature, maxCallDuration, toolsConfig, flowConfig, knowledgeBaseIdAI voice agent configurations
VoiceKnowledgeBasename, description, embeddingStatus, totalChunksKnowledge bases for voice RAG
VoiceKnowledgeBaseSourcetype (FILE/URL/TEXT), fileName, url, textContent, status, chunksCountKB data sources (documents, URLs, text)
VoiceKnowledgeBaseChunkcontent, embedding (vector 1536), metadata, tokenCountEmbedded text chunks for pgvector search
TwilioPhoneNumbertwilioSid, number, friendlyName, country, capabilities, assignedAgentIdPurchased Twilio phone numbers
VoiceCampaignname, agentId, type, status, contactListId, phoneNumberId, scheduleConfig, concurrencyLimit, retryConfig, totalContacts, completedContactsBulk outbound call campaigns
VoiceCampaignContactcampaignId, contactId, status, attempts, nextRetryAt, lastCallIdPer-contact campaign progress tracking
CallagentId, contactId, twilioCallSid, direction, status, fromNumber, toNumber, durationSeconds, recordingUrl, transcript, summary, sentiment, leadQualification, toolCallsLog, costCentsIndividual call records with AI analysis
CallEventcallId, eventType, payload, timestampCall event timeline (status changes, tool calls, errors)
VoiceUsageLogtype (CALL_MINUTE/TTS_CHARACTER/STT_MINUTE/LLM_TOKEN/PHONE_NUMBER), quantity, costCents, callIdGranular voice usage tracking for billing
ContactListname, contactsCountReusable contact lists for campaigns

MongoDB (Mongoose ODM)

Document store for high-volume, time-series data that benefits from flexible schemas.

CollectionKey FieldsPurpose
ConversationtenantId, contactPhone, agentId, status, lastMessage, tagsChat conversations with metadata
MessageconversationId, direction, type, content, statusIndividual chat messages
FlowSessionflowId, contactId, currentNodeId, variables, stateActive flow execution state
BaileysAuthtenantId, encrypted auth dataQR-based WhatsApp session storage

Real-Time (WebSocket)

OrbitDesk uses Socket.IO on the /chat namespace for real-time communication. JWT authentication is required for connection.

EventDirectionDescription
join_conversationClient → ServerJoin a conversation room
leave_conversationClient → ServerLeave a conversation room
send_messageClient → ServerBroadcast message
typingClient → ServerTyping indicator
new_messageServer → ClientNew message received (all channels)
message_statusServer → ClientDelivery/read status update (all channels)
notificationServer → ClientTenant-wide notification
baileys:qrServer → ClientQR code for scanning
baileys:statusServer → ClientConnection status change
flow:sessionServer → ClientFlow execution updates
call:startedServer → ClientVoice call initiated (includes callId, agentId, contactId)
call:statusServer → ClientCall status change (RINGING, IN_PROGRESS, COMPLETED, FAILED)
call:transcriptServer → ClientReal-time transcript delta (role, text, timestamp)
call:completedServer → ClientCall ended with summary, sentiment, duration
voice-campaign:progressServer → ClientCampaign progress update (completed/total, success/fail counts)
voice-campaign:completedServer → ClientCampaign finished with final stats

Rooms: tenant:{tenantId} (tenant-wide events), conversation:{conversationId} (per-conversation events)

Voice WebSocket: The Twilio media stream uses a raw WebSocket (not Socket.IO) at /api/v1/realtime/media-stream. This is separate from the Socket.IO chat gateway and handles real-time audio bridging between Twilio and OpenAI. Nginx is configured with 1-hour timeouts for this path.

Security

JWT Authentication

Short-lived access tokens (15m) with refresh token rotation (7d). Tokens contain userId, tenantId, and role.

AES-256 Encryption

Sensitive fields (API keys, OAuth tokens, WhatsApp credentials) encrypted at rest with a 32-byte key.

Webhook Verification

All channel webhooks verified by signature: Meta (HMAC-SHA256 X-Hub-Signature-256), Twilio (request URL signature), Telegram (token in path). Live Chat uses JWT visitor tokens.

Tenant Isolation

Every query filtered by tenantId extracted from JWT. No cross-tenant data leakage.

Helmet + CORS

Security headers via Helmet. CORS restricted to configured frontend origin with credentials.

Input Validation

NestJS ValidationPipe with whitelist mode. No unrecognized properties allowed.

API Key Auth

SHA-256 hashed keys with fine-grained scopes. Raw keys shown once, never stored. Supports expiry and instant revocation.

Rate Limiting

Per-key and per-IP rate limiting via @nestjs/throttler. 100 requests/minute default. Prevents API abuse.

Project Structure

whatsapp-crm/
├── apps/
│   ├── api/                    # NestJS Backend (REST + WebSocket)
│   │   ├── prisma/             #   Prisma schema, migrations, seed
│   │   └── src/
│   │       ├── common/         #   Shared guards, decorators, schemas
│   │       ├── modules/        #   Feature modules (auth, contacts, flows,
│   │       │                   #   voice-agents, twilio-voice, voice-campaigns,
│   │       │                   #   voice-knowledge-base, voice-analytics, calls...)
│   │       └── main.ts         #   App bootstrap
│   ├── web/                    # React Frontend (SPA)
│   │   └── src/
│   │       ├── pages/          #   Page components (inbox, flows, settings,
│   │       │                   #   voice-agents, voice-campaigns, calls,
│   │       │                   #   phone-numbers, voice-analytics...)
│   │       ├── router/         #   Route definitions
│   │       ├── services/       #   API client (Axios + TanStack Query)
│   │       └── layouts/        #   App shell, sidebar, topbar
│   └── plugins/                # Installable plugin packages
│       ├── orbitdesk-plugin-email/
│       ├── orbitdesk-plugin-google-sheets/
│       ├── orbitdesk-plugin-twilio/
│       └── orbitdesk-plugin-whatsapp-qr/
├── packages/
│   ├── shared/                 # Shared types, validators, constants
│   └── ui/                     # UI component library (Tailwind-based)
├── services/
│   └── ai/                     # FastAPI Python AI service
├── docker/                     # Dockerfiles + Nginx configs
├── scripts/                    # Deployment & backup scripts
├── .github/workflows/          # CI/CD pipelines
├── turbo.json                  # Turborepo configuration
├── pnpm-workspace.yaml         # Workspace definition
├── docker-compose.prod.yml     # Production Docker Compose
└── ecosystem.config.cjs        # PM2 configuration

Environment Variables

All environment variables are defined in .env at the project root. Copy from .env.example and configure for your environment.

VariableRequiredDescription
DATABASE_URLYesPostgreSQL connection string
MONGODB_URIYesMongoDB connection string
REDIS_URLYesRedis connection string
JWT_SECRETYesSecret for signing access tokens
JWT_REFRESH_SECRETYesSecret for signing refresh tokens
ENCRYPTION_KEYYes32-byte key for AES encryption
CORS_ORIGINYesAllowed frontend origin (e.g., https://your-domain.com)
NODE_ENVYesEnvironment: development / production
OPENAI_API_KEYFor AIOpenAI API key for chat completions
AI_SERVICE_URLFor AIURL of the FastAPI AI service
GOOGLE_OAUTH_CLIENT_IDFor SheetsGoogle OAuth client ID
GOOGLE_OAUTH_CLIENT_SECRETFor SheetsGoogle OAuth client secret
SMTP_HOSTFor Email channelSMTP server hostname (outbound email)
SMTP_PORTFor Email channelSMTP port (default: 587)
SMTP_USERFor Email channelSMTP username / address
SMTP_PASSFor Email channelSMTP password / app password
TWILIO_ACCOUNT_SIDFor SMS/VoiceTwilio account SID (global fallback)
TWILIO_AUTH_TOKENFor SMS/VoiceTwilio auth token (global fallback)
TWILIO_SMS_FROMFor SMS channelTwilio phone number for outbound SMS (E.164)
ELEVENLABS_API_KEYFor VoiceElevenLabs API key for premium TTS voices (global fallback)
ORBIT_API_URLFor VoiceOrbit FastAPI service URL for AI inference
STRIPE_SECRET_KEYFor BillingStripe secret key (fallback if not configured in DB)
STRIPE_WEBHOOK_SECRETFor BillingStripe webhook signing secret
RAZORPAY_KEY_IDFor BillingRazorpay key ID (fallback if not configured in DB)
RAZORPAY_KEY_SECRETFor BillingRazorpay key secret
RAZORPAY_WEBHOOK_SECRETFor BillingRazorpay webhook signing secret
FRONTEND_URLFor BillingFrontend URL for checkout success/cancel redirects
STORAGE_TYPENolocal (default) or s3
IS_DEMONoEnable demo mode (restricts destructive actions)

Deployment

OrbitDesk supports two deployment strategies on Ubuntu servers. See DEPLOYMENT.md for the complete step-by-step guide.

Docker Deployment

All services containerized with docker-compose.prod.yml. Includes Nginx, Certbot SSL auto-renewal, and health checks.

Recommended Single Command
docker compose -f docker-compose.prod.yml up -d

Bare Metal Deployment

Direct install with PM2 process manager, system Nginx, and manual database setup. Automated setup script provided.

Manual Setup PM2 + Nginx
sudo bash scripts/deploy-bare-metal.sh

Server Requirements

ResourceMinimumRecommended
CPU2 vCPUs4 vCPUs
RAM4 GB8 GB
Storage40 GB SSD100 GB SSD
OSUbuntu 22.04 LTSUbuntu 24.04 LTS

Required Ports (Firewall)

22 (SSH) 80 (HTTP) 443 (HTTPS)

Internal ports (3000, 3002, 5432, 27017, 6379, 8000) should NOT be exposed publicly.

CI/CD Pipeline

GitHub Actions workflows for automated deployment on push to main.

Pipeline Flow

Push to main
Lint & Type Check
Build
Deploy via SSH
Run Migrations
Health Check

Required GitHub Secrets

SecretDescriptionExample
DEPLOY_HOSTServer IP or hostname203.0.113.50
DEPLOY_USERSSH usernameorbitdesk
DEPLOY_SSH_KEYSSH private keyEd25519 key contents
DEPLOY_SSH_PORTSSH port (optional)22

Workflow Files

Developer Setup

Prerequisites

Quick Start

# 1. Clone the repository
git clone https://github.com/your-org/whatsapp-crm.git
cd whatsapp-crm

# 2. Install dependencies
corepack enable
pnpm install

# 3. Start databases (Docker)
docker compose -f docker/docker-compose.yml up -d

# 4. Configure environment
cp .env.example .env
# Edit .env with your settings

# 5. Setup database
pnpm db:generate
pnpm db:migrate
pnpm db:seed

# 6. Start development servers
pnpm dev
# Web:  http://localhost:3000
# API:  http://localhost:3002
# Docs: http://localhost:3002/api/docs

# 7. (Optional) Start AI service
cd services/ai
python3.11 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload --port 8000

Useful Commands

CommandDescription
pnpm devStart all dev servers (Turborepo)
pnpm buildBuild all packages
pnpm lintLint all packages
pnpm db:migrateRun Prisma migrations
pnpm db:seedSeed database with sample data
pnpm db:studioOpen Prisma Studio (DB browser)
pnpm db:generateGenerate Prisma client

Frequently Asked Questions

What is OrbitDesk?

OrbitDesk is a multi-tenant SaaS platform for managing WhatsApp Business communications. It provides a real-time inbox, visual flow automation builder, AI-powered chatbots, template management, and a plugin ecosystem — all designed for teams to manage customer interactions at scale.

What are the two WhatsApp connection methods?

Cloud API (Official): Uses Meta's official Business Platform. Requires a verified business account, phone number ID, and access token. Best for production use with high reliability.

Baileys QR: Connects any WhatsApp number by scanning a QR code (similar to WhatsApp Web). Requires the whatsapp-qr plugin. Suitable for testing or personal numbers, but less stable than the official API.

How does multi-tenancy work?

Each organization (tenant) is completely isolated. Every database query is filtered by tenantId, extracted from the authenticated user's JWT token. Tenants cannot see or access each other's data. A Superadmin can manage all tenants from the platform admin panel.

What databases are used and why?

PostgreSQL: Primary relational database (via Prisma ORM) for structured data with referential integrity — tenants, users, contacts, flows, templates, plans, subscriptions, and plugins.

MongoDB: Document database (via Mongoose) for high-volume, time-series data — conversations, messages, and flow execution sessions. Flexible schema suits the varied message content types.

Redis: In-memory cache for dashboard analytics (5-minute TTL), session data, rate limiting, and real-time pub/sub.

How do flow automations work?

Flows are visual workflows created in the FlowGram.ai-powered editor. A flow starts with a Trigger node (e.g., message received) and chains through nodes like Send Message, Condition, Delay, AI Response, HTTP Request, etc.

When triggered, the FlowExecutorService traverses the graph, executing each node via the NodeExecutorFactory. Flow state (current node, variables, execution log) is persisted in MongoDB as a FlowSession.

How do AI chatbots (Chat Agents) work?

Chat Agents operate in two modes:

  • RAG mode (default) — Create an agent, upload datasources (PDFs, text, URLs) for RAG indexing, then assign it to conversations. The bot answers questions using the knowledge base. Configurable handoff keywords trigger transfer to a human agent.
  • Tool calling mode — Enable AI Function Calling on the agent (via Chat Agents › edit › AI Function Calling toggle). The agent runs an autonomous loop, calling tools like lookup_crm, book_appointment, trigger_flow, update_contact, and update_conversation before returning a final reply. Configure provider (OpenAI/Anthropic), model, system prompt, which tools to enable, and max iterations — all from the UI.
What LLM providers are supported?

The FastAPI AI service supports: OpenAI (GPT-4o, GPT-4, GPT-3.5), Anthropic (Claude), Google Gemini, Mistral, Deepseek, and X.ai. Provider can be configured globally or overridden per flow node.

How do I add a new plugin?

Plugins follow a standard structure in apps/plugins/. Each plugin has a orbitdesk-plugin.json manifest defining key, name, capabilities, and billing. Plugins can be uploaded as .tgz packages via the Plugin API or added to the codebase directly.

Custom plugins can be tenant-scoped (visible only to the uploading tenant) or global (visible to all tenants, requires superadmin).

What is the difference between Docker and Bare Metal deployment?

Docker: All services (app + databases + nginx) run in containers orchestrated by docker-compose.prod.yml. Easier to set up, reproducible, includes automatic SSL renewal via Certbot container. Recommended for most use cases.

Bare Metal: Services run directly on the OS with PM2 as the process manager. Databases installed as system services. Requires more manual configuration but gives full control over the stack. Good for servers that already have databases running.

How do I set up SSL/HTTPS?

Docker: The production compose includes a Certbot container that automatically renews certificates. Initial certificate obtained via certbot certonly --standalone -d your-domain.com.

Bare Metal: Run sudo certbot --nginx -d your-domain.com after setting up the Nginx config. Certbot auto-configures SSL and sets up automatic renewal via systemd timer.

How do backups work?

The scripts/backup.sh script creates compressed backups of PostgreSQL (pg_dump), MongoDB (mongodump), and the uploads directory. It runs daily via cron at 2 AM and retains backups for 7 days by default. Backups are stored in /var/www/orbitdesk/backups/.

What is Demo Mode?

Setting IS_DEMO=true restricts destructive actions: users cannot delete flows, disconnect WhatsApp, delete templates, or modify critical settings. This is useful for public demos or showcases where you want to prevent data loss.

How do I troubleshoot WebSocket connection issues?

Common causes: (1) Nginx missing WebSocket upgrade headers — ensure proxy_set_header Upgrade and Connection "upgrade" are set for /socket.io/. (2) CORS_ORIGIN not matching the frontend domain. (3) JWT token expired or missing. Check browser console for Socket.IO errors and API logs for connection attempts.

What is the Superadmin panel?

The Superadmin panel (/platform/* routes) is only accessible to users with the SUPERADMIN role. It provides: tenant management (list, approve, reject new registrations), cross-tenant user management, public site page editing (landing page, privacy policy, terms of service), subscription plan management (create/edit/deactivate plans with resource limits), payment provider configuration (Stripe & Razorpay with encrypted credential storage), and manual plan assignment to tenants.

How does the campaign system work?

Campaigns allow sending a WhatsApp template message to a batch of contacts. You select a template, target contacts (by tags or manual selection), and schedule the send. The system tracks total recipients, sent count, and failed count. Campaign statuses: Draft, Scheduled, Running, Completed, Failed, Canceled.

How do subscriptions and billing work?

OrbitDesk supports Stripe and Razorpay as payment gateways for subscription billing. Superadmins configure payment provider credentials (encrypted at rest) via the platform admin panel. Only one provider can be active at a time.

Tenants subscribe by selecting a plan and completing checkout via the active payment provider. The system handles the full subscription lifecycle (activation, renewal, payment failures, cancellation) through webhook events. Plans define resource limits (agents, contacts, messages/month, flows, chat agents, templates, campaigns) that are enforced automatically when tenants create resources.

Superadmins can also manually assign plans to tenants without requiring payment, useful for enterprise deals or trial accounts.

How do API keys work for external integrations?

API keys allow external services (Zapier, CRMs, custom apps) to access OrbitDesk's REST API without user login. Admins create keys from Settings > API Keys, selecting specific scopes (e.g., contacts:read, messages:send). The raw key is shown only once at creation.

To authenticate, send the key in the X-Api-Key header. Keys are SHA-256 hashed before storage, support optional expiry dates, and can be instantly revoked. Each key is rate-limited to 100 requests/minute. See the REST API Access section for full details.

How does AI Voice Calling work?

AI Voice Calling uses Twilio for telephony and OpenAI Realtime API for real-time conversational AI. When a call is initiated, Twilio establishes a media stream WebSocket to the NestJS server. The server bridges this to OpenAI's Realtime API, transcoding audio between Twilio's μ-law 8kHz format and OpenAI's PCM16 format.

The AI agent converses naturally, can execute tools (book appointments, look up CRM data, search knowledge bases), and qualifies leads in real time. After the call ends, the system runs post-call processing: transcription, summarization, sentiment analysis, and lead qualification via the Orbit FastAPI microservice.

What do I need to set up voice calling?

You need three things:

1. Twilio Account: Account SID + Auth Token for telephony. Purchase at least one phone number via the Phone Numbers page.

2. OpenAI API Key: Required for the Realtime API (conversational AI) and embeddings (knowledge base). Configure in Settings > Integrations.

3. (Optional) ElevenLabs API Key: For premium voice synthesis. If not configured, OpenAI's built-in voices are used.

Each tenant configures their own API keys. Keys are encrypted at rest and resolved per-request with a global fallback.

What is the DNC (Do Not Call) list?

The DNC list prevents specific contacts from being called by voice campaigns. Contacts flagged with isOnDncList = true are automatically skipped during campaign execution. Manage the DNC list from Settings > DNC List where you can add individual numbers, bulk upload via CSV, or search and remove entries. This ensures compliance with Do Not Call regulations.

How do voice knowledge bases work?

Voice knowledge bases provide real-time information retrieval during live calls. You upload documents (PDF, text), URLs, or raw text. The system chunks the content, generates embeddings using OpenAI's text-embedding-3-small model (1536 dimensions), and stores them with pgvector in PostgreSQL.

During a live call, when the AI agent needs specific information, it uses the search_knowledge_base tool to perform semantic similarity search and retrieve relevant context. This enables agents to accurately answer questions about products, policies, or services.

Can I use S3/MinIO instead of local storage for media?

Yes. Set STORAGE_TYPE=s3 in your .env and configure S3_BUCKET, S3_REGION, S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY, and optionally S3_ENDPOINT (for MinIO). The media module will automatically use S3 for uploads and generate pre-signed URLs for serving.

What are the user roles and their permissions?

SUPERADMIN: Platform-wide access — manage all tenants, approve registrations, manage platform settings and site pages.

ADMIN: Full tenant access — manage users, settings, plugins, integrations, flows, templates, and all conversations.

MANAGER: Limited admin access — manage team, view analytics, manage flows and templates, handle conversations.

AGENT: Operational access — handle assigned conversations, send messages, and view contacts.

OrbitDesk Documentation — Built with care

Node 20 • pnpm 9 • NestJS 11 • React 19 • FastAPI • PostgreSQL 16 • MongoDB 7 • Redis 7