Auto language switching lets an agent detect when a caller changes languages during a call and switch the transcription pipeline to match. Without it, a transcriber configured for one language produces high error rates once the caller switches to another. Two common scenarios: a Hindi-English household where the primary caller speaks Hindi but a family member joins mid-call and continues in English, and a multilingual region where callers start in one language and shift to another based on comfort.

How it works

  1. The caller speaks in the configured primary language.
  2. DialNexa monitors the incoming audio stream for language change signals.
  3. When a language shift is detected with sufficient confidence, the transcription model switches to the detected language.
  4. The agent continues the conversation using the new transcription context.
  5. If the caller switches back, the transcription model switches again.
Language detection and switching happen continuously during the call. Each switch adds a small amount of latency (typically under 300 ms) as the transcriber reinitializes for the new language.
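
The switching behavior described above can be sketched as a small confidence-gated state machine. This is only an illustration, not DialNexa's implementation: the class name `LanguageSwitcher`, the `observe` method, the 0.8 confidence threshold, and the 2-second speech requirement are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class LanguageSwitcher:
    """Hypothetical helper that tracks the active transcription language."""
    primary: str
    candidates: frozenset
    min_confidence: float = 0.8      # assumed threshold, not a documented value
    min_speech_seconds: float = 2.0  # assumed amount of speech needed to confirm
    active: str = ""

    def __post_init__(self):
        self.active = self.primary   # calls start in the primary language
        self._pending = None         # language currently being confirmed
        self._pending_seconds = 0.0  # speech accumulated for that language

    def observe(self, detected: str, confidence: float, seconds: float) -> bool:
        """Feed one detection result; return True if the active language changed."""
        allowed = detected == self.primary or detected in self.candidates
        if detected == self.active or not allowed or confidence < self.min_confidence:
            # No credible change signal: discard any partially confirmed switch.
            self._pending, self._pending_seconds = None, 0.0
            return False
        if detected != self._pending:
            self._pending, self._pending_seconds = detected, 0.0
        self._pending_seconds += seconds
        if self._pending_seconds >= self.min_speech_seconds:
            self.active = detected
            self._pending, self._pending_seconds = None, 0.0
            return True
        return False


switcher = LanguageSwitcher(primary="hi", candidates=frozenset({"en"}))
switcher.observe("en", 0.9, 1.0)  # not enough English speech yet
switcher.observe("en", 0.9, 1.0)  # threshold reached, active becomes "en"
```

Requiring sustained speech before switching trades a short delay for fewer false switches, which is one plausible reason a detector would need a few seconds of audio in the new language.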

Prerequisites

  • The transcription model in use must support multi-language detection. Currently, this is available with Deepgram Nova-2 and select configurations.
  • Both the primary language and the target language(s) must be in the supported language list for your workspace.
  • Auto-switch is configured at the agent level, not the workspace level. Each agent has its own switch language list.
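
As a rough pre-flight check, the language prerequisite can be expressed as a short validation function. The function name `validate_auto_switch` and the language codes are hypothetical; the workspace's supported language list would have to come from wherever your team records it.

```python
def validate_auto_switch(primary, candidates, workspace_supported):
    """Return a list of problems; an empty list means the languages look valid.

    Hypothetical helper: checks only that the primary and candidate languages
    appear in the workspace's supported language list.
    """
    problems = []
    if primary not in workspace_supported:
        problems.append(f"primary language '{primary}' is not supported in this workspace")
    for lang in candidates:
        if lang not in workspace_supported:
            problems.append(f"candidate language '{lang}' is not supported in this workspace")
    return problems


# Example with made-up language codes.
issues = validate_auto_switch("hi", ["en", "pt"], {"hi", "en"})
for issue in issues:
    print(issue)
```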

Enabling auto-switch

  1. Open Speech Settings. Navigate to your agent and go to Settings > Speech > Language.
  2. Set the primary language. Select the language the agent is primarily configured for. This is the language used at call start and the one the agent’s prompt is written in.
  3. Enable Auto-Switch. Toggle Auto-Switch Languages to on.
  4. Select candidate languages. Choose the languages the agent should be able to switch to. Only select languages that are genuinely expected in your caller population; adding too many candidates reduces detection confidence.
  5. Test. Place test calls where you deliberately switch languages mid-call. Review transcription accuracy in Session History for both the pre-switch and post-switch segments.
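
One way to put a number on transcription accuracy for the pre-switch and post-switch segments is word error rate (WER): the word-level edit distance between what was actually said and what was transcribed, divided by the number of words actually said. The sketch below assumes you have written down the reference text yourself and copied the transcribed text out by hand; the function name and the sample sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: compare the segment right after a switch to what was said.
post_switch_wer = word_error_rate("book a table for two", "book table for you")
print(f"post-switch WER: {post_switch_wer:.2f}")
```

Computing WER separately for each segment makes it easier to see whether errors are concentrated in the moments just after a switch.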

Limitations

  • Not all language pairs are supported. Language detection works best for pairs with distinct phonological systems (e.g., Hindi and English). Pairs that share significant phonology (e.g., Spanish and Portuguese) may produce false detections.
  • Detection lag. The system needs at least 2 to 3 seconds of speech in the new language before it detects a switch with high confidence. The first few words after a switch may be transcribed in the wrong language.
  • Accuracy loss during the switch window. The brief period between the switch starting and the transcriber reinitializing has elevated word error rates. Design agent prompts to handle ambiguous or missing input gracefully during these moments.
  • Voice synthesis does not switch automatically. Auto-switch affects the transcription pipeline only. The agent’s voice continues speaking in whatever language its TTS voice supports. If Hinglish Map language tracking is configured, the runtime can ask the LLM to mirror English, Hindi, Romanized Hindi, or Hinglish on each turn, but the selected voice still needs to support the spoken output.
  • Hinglish is a special case. For callers who mix Hindi and English within sentences (code-switching rather than full language switching), use the hi-en language setting with the Hinglish Map instead of auto-switch. Auto-switch targets utterance-level language changes, not within-sentence mixing.
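
To decide between auto-switch and the hi-en setting, it can help to look at a sample of real calls and check whether language changes happen between utterances or inside them. The sketch below assumes you have hand-tagged each word of a few utterances with a language code. The function name, the 0.3 threshold, and the returned labels (other than hi-en) are hypothetical.

```python
def recommend_mode(utterances, mixed_share_threshold=0.3):
    """Suggest a language setting from per-word language tags.

    utterances: list of utterances, each a list of per-word language tags,
    e.g. [["hi", "hi", "en"], ["en", "en"]]. The threshold is an assumed value.
    """
    non_empty = [u for u in utterances if u]
    if not non_empty:
        raise ValueError("need at least one tagged utterance")
    # An utterance is "mixed" when it contains words from more than one language.
    mixed = sum(1 for u in non_empty if len(set(u)) > 1)
    languages = {tag for u in non_empty for tag in u}
    if mixed / len(non_empty) >= mixed_share_threshold:
        return "hi-en"            # within-sentence mixing dominates
    if len(languages) > 1:
        return "auto-switch"      # languages change between utterances
    return "single-language"      # no switching observed in the sample


sample = [["hi", "hi"], ["en", "en", "en"], ["en"]]
print(recommend_mode(sample))
```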

Common use cases

  • Multilingual households: a call center that serves both English and Hindi speakers uses auto-switch so callers can use whichever language they are more comfortable with, without requiring the agent to know which to expect
  • Bilingual regions: geographies where callers commonly alternate between two languages (e.g., Tagalog and English in the Philippines, French and English in Quebec)
  • Escalation flows: a primary agent configured for one language hands off to a segment of the call that is conducted in a different language