On-Premise Deployment - DialNexa Documentation

DialNexa’s on-premise deployment option lets enterprises run the full DialNexa platform within their own infrastructure - on-premises data centers, private cloud environments, or government cloud instances. All voice processing, LLM inference routing, transcription, TTS, and call data storage operate inside your environment, with no data leaving your network.

On-premise deployment is an Enterprise-tier feature. Contact sales at [email protected] or your account manager to discuss requirements and pricing.

When to Use On-Premise

On-premise deployment is the right choice when:

Strict data sovereignty

Your data cannot legally or contractually leave your network - common in banking, defense, government, and regulated healthcare.

Air-gapped environments

Your infrastructure has no internet access or highly restricted egress. On-premise runs fully disconnected from DialNexa’s cloud.

Custom compliance requirements

Your compliance framework (ISO 27001, SOC 2, HIPAA, DPDP, etc.) requires you to control where voice data is processed and stored at the infrastructure level.

Very high call volume

Organizations processing millions of minutes per month may find on-premise deployment more cost-effective than cloud pricing at scale.

Architecture Overview

The DialNexa on-premise stack is composed of containerized services that you run on your own Kubernetes cluster:

┌─────────────────────────────────────────────────────────┐
│                    Your Infrastructure                   │
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │  Call Gateway │  │  LLM Router  │  │  TTS Engine  │  │
│  │  (SIP/PSTN)  │  │  (GPT/Claude)│  │  (Cartesia/  │  │
│  └──────┬───────┘  └──────┬───────┘  │  ElevenLabs) │  │
│         │                 │          └──────┬───────┘  │
│  ┌──────▼───────┐  ┌──────▼───────┐         │          │
│  │  Transcription│  │  Orchestrator│◄────────┘          │
│  │  (Deepgram/  │  │  (Core agent │                    │
│  │   Soniox)    │  │   engine)    │                    │
│  └──────────────┘  └──────┬───────┘                    │
│                           │                            │
│  ┌────────────────────────▼───────────────────────┐    │
│  │                Data Layer                      │    │
│  │  (Call records, recordings, transcripts, logs) │    │
│  └────────────────────────────────────────────────┘    │
│                                                         │
│  ┌──────────────────────────────────────────────────┐  │
│  │  Dashboard + API (self-hosted web application)   │  │
│  └──────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘

All components run as Docker containers orchestrated by Kubernetes. The architecture supports horizontal scaling of each component independently.

Supported Infrastructure

Container orchestration: Kubernetes 1.25+ Cloud platforms (private cloud or on-premise cloud stacks):

Amazon Web Services (EKS)
Microsoft Azure (AKS)
Google Cloud Platform (GKE)
On-premises Kubernetes (Rancher, OpenShift, vanilla k8s)

Minimum compute requirements (for a production deployment handling ~50 concurrent calls):

Component	CPU	RAM	Storage
Call Gateway	4 vCPU	8 GB	20 GB
Orchestrator	8 vCPU	16 GB	50 GB
Transcription	8 vCPU	16 GB	10 GB
TTS Engine	4 vCPU	8 GB	10 GB
LLM Router	4 vCPU	8 GB	10 GB
Data Layer (DB)	4 vCPU	16 GB	500 GB+
Dashboard + API	4 vCPU	8 GB	20 GB

Requirements scale approximately linearly with concurrency. For 200 concurrent calls, multiply the above by 4. Networking requirements:

Outbound internet access required for: LLM API calls (OpenAI, Groq, DeepSeek, Google) unless running local LLM inference; TTS API calls (Cartesia, ElevenLabs) unless using locally hosted models; License validation (periodic, can be done through a proxy)
If fully air-gapped: Requires locally hosted LLM (e.g., llama.cpp, vLLM with an approved model) and locally hosted TTS (additional setup required; contact solutions engineering)

LLM Options for On-Premise

For on-premise deployments that require fully air-gapped operation:

OpenAI API via Azure OpenAI Service: Azure OpenAI offers GPT-4o and GPT-4o Mini with data residency in Azure regions of your choice, including India. API calls stay within Azure - this is often sufficient for data residency requirements without requiring a fully local LLM.
Self-hosted open-source LLMs: For fully disconnected deployments, DialNexa’s on-premise stack supports integration with vLLM-hosted models. Model selection and quality will differ from hosted options. Contact solutions engineering for supported model list.

Telephony Integration

On-premise deployments connect to the PSTN via:

SIP trunk: Bring your own SIP trunk from any carrier. The DialNexa Call Gateway accepts SIP connections on configurable ports.
Existing carrier via SIP: Route calls from your existing Plivo account or any other SIP-capable carrier to the on-premise Call Gateway via SIP.
Legacy PBX: DialNexa on-premise can integrate with existing on-premises PBX systems (Asterisk, FreeSWITCH, Cisco UCM) via SIP.

Updates and Licensing

On-premise deployments are licensed annually per seat/concurrency tier. DialNexa provides:

Container images via a private registry (credentials provided at contract signing)
A Helm chart for Kubernetes deployment
Regular security and feature updates (applied on your schedule)
A dedicated solutions engineer for initial deployment and ongoing support

License keys must be renewed annually. License validation is performed by the DialNexa Orchestrator service. In fully air-gapped environments, offline license validation is available.

Getting Started

On-premise deployment requires a sales engagement before technical setup can begin.

Contact sales

Reach out at [email protected] or through your account manager. Provide a brief description of your requirements: call volume, infrastructure environment, compliance constraints, and timeline.

Solutions engineering call

A DialNexa solutions engineer will meet with your team to assess requirements, size the deployment, and discuss integration with your existing telephony and identity infrastructure.

Proof of concept

For qualified prospects, DialNexa provides a time-limited on-premise POC deployment to validate the platform in your environment before committing to a full deployment.

Deployment and onboarding

Upon contract signing, you receive access to the container registry, Helm charts, and deployment documentation. A dedicated solutions engineer supports the initial deployment and provides onboarding for your technical team.

Enterprise Plan - full list of Enterprise features
Data Residency - cloud-based India region (alternative to on-premise for data residency)
Access Control - RBAC and SSO for Enterprise

​When to Use On-Premise

Strict data sovereignty

Air-gapped environments

Custom compliance requirements

Very high call volume

​Architecture Overview

​Supported Infrastructure

​LLM Options for On-Premise

​Telephony Integration

​Updates and Licensing

​Getting Started

​Related Pages

When to Use On-Premise

Architecture Overview

Supported Infrastructure

LLM Options for On-Premise

Telephony Integration

Updates and Licensing

Getting Started

Related Pages