Reliability Overview - DialNexa Documentation

DialNexa is built to handle production voice AI workloads, where a dropped call or a silent agent isn’t just a bug but a broken customer interaction. This page explains how reliability is handled at the platform level and which levers you have to tune it for your deployment.

Uptime and SLA

DialNexa’s infrastructure targets 99.9% uptime for call processing and API endpoints. Platform status is published at status.dialnexa.com, where you can subscribe to incident notifications by email or webhook.
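To put the 99.9% target in concrete terms, the short calculation below converts an availability percentage into a downtime budget. The measurement window is not stated on this page, so the 30-day and 365-day windows are assumptions chosen for illustration only.

```python
# Downtime budget implied by an availability target.
# Assumption: the 30-day and 365-day windows are illustrative; this page
# does not say which window the 99.9% target is measured over.
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float, days: int) -> float:
    """Maximum downtime, in minutes, permitted over `days` at the given availability."""
    return (1 - availability_pct / 100) * days * MINUTES_PER_DAY


monthly = downtime_budget_minutes(99.9, 30)   # about 43.2 minutes
yearly = downtime_budget_minutes(99.9, 365)   # about 525.6 minutes

print(f"30-day budget:  {monthly:.1f} min")
print(f"365-day budget: {yearly / 60:.2f} h")
```

In other words, a 99.9% target leaves room for roughly 43 minutes of downtime in a 30-day window.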

Enterprise customers on custom SLA agreements may have higher uptime commitments and dedicated incident response channels. See the Enterprise Plan page for details.

The platform operates across multiple availability zones. Call processing, LLM routing, TTS synthesis, and telephony signaling are each independently resilient, so a failure in one subsystem does not cascade into a full platform outage.

What Can Go Wrong on a Call

Understanding failure modes helps you build agents that degrade gracefully. Failures on a DialNexa call fall into five categories:

Telephony failures

The PSTN or SIP layer drops the call before it reaches the agent. Causes include carrier issues, invalid numbers, regional routing problems, and rejection of the call by the destination.

Transcription errors

The transcriber fails to convert speech to text accurately, or returns an empty result. This produces silence or a nonsensical agent response.

LLM errors

The language model returns an error, times out, or produces a response that violates the agent’s configured constraints (e.g., too long, wrong format for a tool call).

TTS failures

The text-to-speech engine fails to synthesize audio. The call goes silent on the agent’s turn.

Tool errors

A tool invocation (calendar booking, custom function, transfer) fails with a non-200 response or times out. The agent may stall or respond incorrectly.

Call Failure Handling

When a recoverable error occurs mid-call, DialNexa attempts to keep the call alive:

LLM timeout: If the LLM does not respond within the configured timeout, the agent plays a fallback phrase (“Give me just a moment…”) and retries the request once. If the retry also fails, the call is ended gracefully.
TTS failure: If synthesis fails, the system retries with the same voice provider. If the retry fails, DialNexa falls back to a secondary provider to play a brief synthesized message before ending the call.
Tool timeout: Tool calls have a configurable timeout (default: 10 seconds). If a tool exceeds its timeout, the agent receives an error response and can handle it within the prompt logic.
Transcription silence: If no speech is detected for an extended period, the agent triggers the configured “no input” behaviour (either prompting the caller or ending the call).

Telephony-layer failures (carrier drops, disconnected numbers) cannot be recovered by the platform. These appear in call logs with a telephony_failed or call_not_answered status.
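
The LLM timeout behaviour described above (play a fallback phrase, retry once, then end the call) can be sketched as a small control-flow example. This is an illustration of the pattern, not DialNexa's implementation: `LLMTimeout`, `respond_with_retry`, `ask_llm`, and `speak` are hypothetical names introduced here.

```python
from typing import Callable, List

FALLBACK_PHRASE = "Give me just a moment…"


class LLMTimeout(Exception):
    """Hypothetical error raised when the LLM exceeds its configured timeout."""


def respond_with_retry(ask_llm: Callable[[], str], speak: Callable[[str], None]) -> bool:
    """Return True if the turn was answered, False if the call should end gracefully."""
    try:
        speak(ask_llm())
        return True
    except LLMTimeout:
        speak(FALLBACK_PHRASE)  # fill the silence while the request is retried
    try:
        speak(ask_llm())  # single retry, as described above
        return True
    except LLMTimeout:
        return False  # retry also failed: caller should end the call gracefully


# Usage: a stand-in LLM that times out once, then answers.
attempts = iter([LLMTimeout(), "Your appointment is confirmed."])


def flaky_llm() -> str:
    result = next(attempts)
    if isinstance(result, Exception):
        raise result
    return result


spoken: List[str] = []
answered = respond_with_retry(flaky_llm, spoken.append)
print(answered, spoken)
```

The caller hears the fallback phrase followed by the real answer. If the second attempt had also timed out, the function would return False and only the fallback phrase would have been spoken.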

Retry Logic for Outbound Calls

For batch campaigns and API-initiated outbound calls, DialNexa supports configurable retry logic:

Setting	Description	Default
Max retries	How many times to retry an unanswered or failed call	2
Retry interval	Minimum gap between retry attempts	30 minutes
Retry on no-answer	Retry if the call rings but is not picked up	Enabled
Retry on busy	Retry if the destination is busy	Enabled
Retry on voicemail	Retry if voicemail is detected	Disabled

Configure these per campaign in the Batch Calls section of the dashboard or via the API when creating a campaign.
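
To make the interaction between these settings concrete, here is a minimal sketch of the retry decision using the defaults from the table. The `RetryPolicy` class, its field names, and `next_attempt_at` are hypothetical and are not DialNexa API parameters. The outcome strings mirror the call statuses listed under Diagnosing Issues. Treating `failed` as always retryable is an assumption based on the "Max retries" description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical container for the settings in the table, with the same defaults."""
    max_retries: int = 2
    retry_interval: timedelta = timedelta(minutes=30)
    retry_on_no_answer: bool = True
    retry_on_busy: bool = True
    retry_on_voicemail: bool = False


def next_attempt_at(
    policy: RetryPolicy, outcome: str, retries_so_far: int, last_attempt: datetime
) -> Optional[datetime]:
    """Earliest time the next retry may run, or None if no retry is due."""
    retryable = {
        "no-answer": policy.retry_on_no_answer,
        "busy": policy.retry_on_busy,
        "voicemail": policy.retry_on_voicemail,
        "failed": True,  # assumption: failed calls are always eligible for retry
    }
    if retries_so_far >= policy.max_retries or not retryable.get(outcome, False):
        return None
    # The interval is a minimum gap, so this is the earliest allowed time.
    return last_attempt + policy.retry_interval


policy = RetryPolicy()
first_try = datetime(2024, 1, 1, 9, 0)
print(next_attempt_at(policy, "busy", 0, first_try))       # earliest retry at 09:30
print(next_attempt_at(policy, "voicemail", 0, first_try))  # no retry with defaults
```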

Concurrency Limits

Concurrency is the number of simultaneous active calls your account can handle at any moment. Exceeding the limit causes new call attempts to be queued or rejected, depending on your configuration. Default concurrency limits are plan-dependent. To check your current limit, go to Settings → Plan in the dashboard. For a full breakdown, see Concurrency Tiers.

If you are running a batch campaign that requires higher concurrency than your plan allows, you can request a temporary limit increase from the Support team before your campaign launch.
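
If you start calls from your own code, one way to stay under your limit is to cap in-flight calls on the client side. The sketch below uses a semaphore for this. `run_with_limit` and `fake_call` are hypothetical helpers, the limit of 3 is illustrative, and `fake_call` only simulates a call rather than using a real DialNexa client.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Sequence


async def run_with_limit(
    targets: Sequence[str],
    place_call: Callable[[str], Awaitable[str]],
    limit: int,
) -> List[str]:
    """Run place_call for every target with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(target: str) -> str:
        async with semaphore:  # waits here while `limit` calls are already active
            return await place_call(target)

    return list(await asyncio.gather(*(guarded(t) for t in targets)))


# Usage with a simulated call that records peak concurrency.
stats: Dict[str, int] = {"active": 0, "peak": 0}


async def fake_call(target: str) -> str:
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # stands in for the duration of a call
    stats["active"] -= 1
    return f"{target}: completed"


results = asyncio.run(
    run_with_limit([f"contact-{i}" for i in range(10)], fake_call, limit=3)
)
print(len(results), "calls finished; peak concurrency:", stats["peak"])
```

Capping concurrency on the client keeps a large campaign from relying on platform-side queueing or hitting rejections when the account limit is reached.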

Diagnosing Issues

The fastest path to diagnosing a call issue:

Find the call in history

Go to Monitoring → Call History in the dashboard. Filter by agent, date, or status. Click the call to open its detail page.

Check the call status

The status field tells you the high-level outcome: completed, failed, no-answer, busy, voicemail, or transferred. Failed calls also include a failure reason.

Read the transcript

The transcript shows the full conversation turn by turn, including agent responses, caller speech, and tool calls. Identify where the conversation went wrong.

Inspect tool call logs

If a tool was invoked, expand the tool call entry in the transcript. You’ll see the request payload sent to the tool, the response received, and the latency.

Check error codes

Look for error_code fields on failed turns. Common codes: llm_timeout, tts_failure, tool_error, transcription_empty. Each maps to a specific failure mode.

For a detailed walkthrough, see Debug Call Issues.
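
When triaging many calls at once, it can help to group the error codes above by failure category. The sketch below assumes each transcript turn is available as a dictionary with an optional `error_code` key. That shape is hypothetical and should be adapted to the format you actually export. The mapping of codes to categories follows the names used on this page.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

# Error codes listed above, mapped to the failure categories on this page.
FAILURE_MODES = {
    "llm_timeout": "LLM errors",
    "tts_failure": "TTS failures",
    "tool_error": "Tool errors",
    "transcription_empty": "Transcription errors",
}


def failure_summary(turns: Iterable[Mapping[str, object]]) -> Dict[str, int]:
    """Count failure categories across turns that carry an error_code."""
    counts: Counter = Counter()
    for turn in turns:
        code = turn.get("error_code")
        if code is None:
            continue  # turn completed without an error
        counts[FAILURE_MODES.get(str(code), "Unknown")] += 1
    return dict(counts)


# Hypothetical turn records for illustration only.
sample_turns = [
    {"speaker": "agent"},
    {"speaker": "agent", "error_code": "llm_timeout"},
    {"speaker": "agent", "error_code": "tool_error"},
    {"speaker": "agent", "error_code": "tool_error"},
]
print(failure_summary(sample_turns))
```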

Health Monitoring

You can monitor your DialNexa deployment’s health proactively:

Webhook events: Subscribe to call lifecycle events (call.started, call.ended, call.failed) via your agent’s webhook URL. Failed calls emit a call.failed event with a machine-readable error code.
Post-call analysis: Configure post-call analysis fields to extract quality signals from transcripts automatically (e.g., detect calls where the agent didn’t answer the caller’s question).
Dashboard metrics: The Monitoring tab shows aggregate call volume, success rate, average duration, and tool error rates across your agents.
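
As an illustration of consuming these webhook events, the sketch below parses an incoming request body and produces an alert message for failed calls. The payload field names (`event`, `call_id`, `error_code`) are assumptions, because the webhook schema is not specified on this page. Check the actual payload before relying on them.

```python
import json
from typing import Optional

# Event names taken from the list above.
LIFECYCLE_EVENTS = {"call.started", "call.ended", "call.failed"}


def handle_webhook(raw_body: bytes) -> Optional[str]:
    """Return an alert message for call.failed events, or None for other events.

    Assumption: the payload is a JSON object with `event`, `call_id`, and
    `error_code` keys. These names are hypothetical.
    """
    payload = json.loads(raw_body)
    event = payload.get("event")
    if event not in LIFECYCLE_EVENTS:
        raise ValueError(f"unexpected event type: {event!r}")
    if event != "call.failed":
        return None
    call_id = payload.get("call_id", "?")
    error_code = payload.get("error_code", "unknown")
    return f"call {call_id} failed: {error_code}"


# Usage with hand-written sample payloads.
print(handle_webhook(b'{"event": "call.ended", "call_id": "c1"}'))
print(handle_webhook(b'{"event": "call.failed", "call_id": "c2", "error_code": "tts_failure"}'))
```

Keeping the handler a pure function of the request body makes it easy to unit test and to plug into whichever web framework serves your webhook URL.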

Related Pages

Debug Call Issues - step-by-step call debugging
Fix Agent Behavior - prompt and configuration fixes
Latency - understanding and reducing call latency
Fraud Protection - protecting against malicious traffic
Prevent Abuse - rate limits, key security, and configuration hardening
