Precise Transcripts - DialNexa Documentation

Transcription accuracy directly affects agent behavior. A word error in the transcript propagates to the LLM, which may misinterpret the caller’s intent, extract the wrong data, or produce an incorrect response. This page covers techniques for improving transcript accuracy at every point in the pipeline.

Where transcription errors come from

Before applying fixes, identify the source of the error. The same symptom (agent responding incorrectly) can come from different causes:

Symptom	Source
Agent uses wrong word that sounds like what the caller said	Transcription error (acoustic confusion)
Agent uses wrong word with no phonetic similarity	LLM hallucination or prompt issue
Specific domain terms are consistently wrong	Missing vocabulary hints
Errors only in noisy calls	Background noise, insufficient denoising
Errors only for specific callers	Accent not well-supported by current model
Numbers/dates extracted incorrectly	Text normalization issue, not transcription

Check the Session History LLM call event to see the exact text that reached the LLM. If the transcription event shows correct text but the LLM output is wrong, the problem is in the LLM layer, not the transcriber.

Vocabulary hints

Vocabulary hints are custom words and phrases that tell the transcriber to expect specific terms. The transcriber uses hints to bias recognition toward these terms when the audio is ambiguous. Use vocabulary hints for:

Product names, brand names, and proprietary terms that do not appear in general language models (e.g., “DialNexa”, “Nexabot”, “Vexa Pro”)
Medical, legal, or technical terminology specific to your domain
Names of people, places, or services that the transcriber commonly misrecognizes
Short alphanumeric codes or identifiers that transcribers often split into separate words

Configuring vocabulary hints: Navigate to Settings > Speech > Vocabulary Hints. Add each term on a new line. For multi-word phrases, enter the full phrase. You can include phonetic spellings in parentheses for unusual terms.

DialNexa
Nexabot
appointment ID
haematology
Kovalam Beach

Keep your vocabulary hint list focused. Adding hundreds of common words does not help — the transcriber already handles common vocabulary well. Prioritize domain-specific terms with the highest call volume impact.
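One way to decide which terms deserve a slot is to count how often each candidate is missed in a sample of calls. The sketch below assumes you have hand-corrected reference text paired with the raw transcript for each sampled call. The function name and data shape are illustrative and not part of DialNexa.

```python
from collections import Counter


def rank_hint_candidates(samples, candidate_terms):
    """Count the calls where a term appears in the hand-corrected
    reference but is missing from the raw transcript."""
    misses = Counter()
    for reference, transcript in samples:
        ref_lower = reference.lower()
        hyp_lower = transcript.lower()
        for term in candidate_terms:
            needle = term.lower()
            if needle in ref_lower and needle not in hyp_lower:
                misses[term] += 1
    # Most frequently missed terms first; terms never missed are omitted.
    return misses.most_common()


# Hypothetical sample: (hand-corrected reference, raw transcript)
samples = [
    ("I use DialNexa with Nexabot", "I use dial next with Nexabot"),
    ("DialNexa is down", "dial nexa is down"),
    ("my appointment ID is ready", "my appointment ID is ready"),
]
ranked = rank_hint_candidates(samples, ["DialNexa", "Nexabot", "appointment ID"])
```

Terms that never show up in the ranking are already recognized reliably and probably do not need a hint.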

Limitations of vocabulary hints:

Hints bias recognition; they do not guarantee that a specific word is used. If the caller pronounces a term very differently from the expected pronunciation, hints may not help.
Deepgram’s vocabulary hint support varies by model. Verify that the model you are using accepts custom vocabulary.

Transcription model selection

Different Deepgram models have different accuracy profiles. Selecting the right model for your use case is often the highest-impact change you can make.

Model	Best for
Nova-2	General purpose, broad language coverage
Nova-2 (Phone Call)	Telephone audio quality (8 kHz, compression artifacts)
Nova-2 (Medical)	Medical terminology, clinical conversations
Whisper (via Deepgram)	Maximum accuracy and accented speech, at the cost of higher latency

For most DialNexa deployments, Nova-2 (Phone Call) is the right default. It is trained on telephony audio and handles the compression and bandwidth limitations of phone calls better than the general Nova-2 model. Switch to Whisper if:

Your callers have diverse accents and Nova-2 accuracy is insufficient
Response latency is less critical than transcript accuracy (Whisper is slower)
You have a high-value use case where accuracy matters more than throughput
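To judge whether one model's accuracy is insufficient, it can help to compare word error rate (WER) on the same sample of calls transcribed by each model. The following is a minimal sketch of the standard WER calculation (word-level edit distance divided by reference length). It assumes you already have a hand-corrected reference and each model's output as plain strings; it does not call any DialNexa or Deepgram API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "I want to book an appointment",
    "I want to book a department",
)
```

Here two of the six reference words are substituted, so the result is 2/6. Averaging this value over a representative sample per model gives a like-for-like comparison before you switch.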

Background noise and denoising

Background noise increases word error rates. Server-side denoising (Denoising Mode in Speech Settings) reduces noise before the audio reaches the transcriber. For the denoising configuration guide, see Handle Background Noise. Key principle: apply only as much denoising as needed. High denoising on clean audio or accented speech can attenuate phonemes alongside noise, making transcription worse rather than better.

Speaker diarization

Speaker diarization labels which portions of the transcript belong to which speaker (caller vs. agent). In DialNexa, diarization is most relevant for:

Post-call analysis that needs to distinguish caller statements from agent statements
Transcription cleanup where the caller and agent speak over each other

Enable diarization in Settings > Speech > Diarization (if available for your model). When enabled, transcript events include a speaker field (caller or agent) for each segment.

Diarization is a compute-intensive feature and adds latency to transcript delivery. Enable it only when post-call processing requires caller/agent separation.
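As an illustration of how the speaker field might be used in post-call processing, the sketch below groups transcript segments by speaker. It assumes each segment is a dictionary with a `speaker` key (as described above) and a `text` key; the `text` key name and the list-of-dictionaries shape are assumptions, so check them against your actual transcript events.

```python
from collections import defaultdict


def split_by_speaker(segments):
    """Join segment text per speaker, preserving segment order."""
    by_speaker = defaultdict(list)
    for segment in segments:
        # Segments without a speaker label are grouped under "unknown".
        speaker = segment.get("speaker", "unknown")
        by_speaker[speaker].append(segment["text"])
    return {speaker: " ".join(texts) for speaker, texts in by_speaker.items()}


# Hypothetical segments from a diarized transcript
segments = [
    {"speaker": "agent", "text": "How can I help?"},
    {"speaker": "caller", "text": "I want to book an appointment."},
    {"speaker": "agent", "text": "Sure, which day?"},
]
grouped = split_by_speaker(segments)
```

The caller-only text in `grouped["caller"]` can then be analyzed separately from the agent's statements.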

Transcript cleanup post-processing

For use cases where transcript accuracy is critical for downstream analysis (medical, legal, compliance), apply post-processing to the raw transcript before using it. Post-processing approaches:

Acronym expansion: replace “AI” with “artificial intelligence”, “OTP” with “one-time password” based on domain-specific rules
Number normalization: standardize how dates and numbers appear in the transcript for consistent post-call analysis (PCA) extraction
Filler word removal: strip “um”, “uh”, “you know” from the caller transcript before LLM processing

Implement post-processing in a webhook handler that receives the call.transcript_ready event and runs cleanup before storing or analyzing the transcript. The cleaned transcript can then be passed to your own LLM analysis pipeline.
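A cleanup step of this kind could look roughly like the sketch below, which applies filler word removal and acronym expansion with regular expressions. The acronym table, the filler list, and the `transcript` payload field are assumptions for illustration; the actual shape of the call.transcript_ready payload is not specified here.

```python
import re

# Domain-specific rules; extend these for your own use case.
ACRONYMS = {"OTP": "one-time password", "AI": "artificial intelligence"}
FILLERS = ["you know", "um", "uh"]

# Whole-word, case-insensitive filler match, plus an optional trailing comma.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(f) for f in FILLERS) + r")\b,?\s*",
    re.IGNORECASE,
)
# Acronyms are matched case-sensitively so "ai" inside words is untouched.
_ACRONYM_RE = re.compile(r"\b(" + "|".join(re.escape(a) for a in ACRONYMS) + r")\b")


def clean_transcript(text: str) -> str:
    """Strip filler words, expand acronyms, and collapse extra spaces."""
    text = _FILLER_RE.sub("", text)
    text = _ACRONYM_RE.sub(lambda m: ACRONYMS[m.group(1)], text)
    return re.sub(r"\s{2,}", " ", text).strip()


def handle_transcript_ready(payload: dict) -> str:
    """Hypothetical handler body: 'transcript' is an assumed field name."""
    return clean_transcript(payload["transcript"])


cleaned = handle_transcript_ready({"transcript": "Um I need uh a new OTP"})
```

Rule-based filler removal is blunt: a phrase such as “you know” is also removed when the caller means it literally, so review the filler list against real transcripts before relying on it.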

Identifying transcription errors vs. LLM errors in Session History

Open the session in Session History

Go to Monitor > Sessions, find the call, and open the event timeline.

Locate the problematic turn

Find the agent response that was incorrect or surprising. Identify which caller utterance preceded it.

Check the transcription event for that utterance

The transcription event shows the exact text the transcriber produced. Is the text correct? If the caller said “I want to book an appointment” but the transcription shows “I want to book a department”, that is a transcription error.

If transcription is correct, check the LLM event

If the transcription event shows the correct caller text, open the LLM call event for the agent’s response. Examine the full input context. Did the LLM receive the correct conversation history? Was the prompt intact? If the LLM input was correct but the output was wrong, the issue is in the prompt or LLM behavior, not the transcriber.

Apply the right fix

Transcription error: adjust denoising, switch transcription model, add vocabulary hints
LLM error: revise system prompt, add constraints, check for context length issues
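The decision flow in these steps can be summarized as a small helper. This is a sketch only: it assumes you have already judged the agent's response to be wrong, and that you have copied three strings out of Session History by hand (what the caller actually said, the transcription event text, and the LLM call input). The function names and return labels are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so comparisons ignore formatting."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def triage_turn(heard: str, transcribed: str, llm_input: str) -> str:
    """Classify a bad agent turn by where the caller's words were lost."""
    if normalize(heard) != normalize(transcribed):
        # The transcriber produced different words than the caller spoke.
        return "transcription"
    if normalize(transcribed) not in normalize(llm_input):
        # Correct transcript, but it never reached the LLM intact.
        return "context"
    # Correct text reached the LLM; the prompt or model is at fault.
    return "llm"


result = triage_turn(
    heard="I want to book an appointment",
    transcribed="I want to book a department",
    llm_input="User: I want to book a department",
)
```

A "transcription" result points to the denoising, model, and vocabulary hint fixes; "llm" and "context" point to the prompt and context length checks.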

Common transcription errors and fixes

Error pattern	Fix
Domain-specific terms consistently wrong	Add to vocabulary hints
Numbers read as words instead of digits	Adjust text normalization settings
First word of each utterance frequently wrong	Check endpoint detection sensitivity; the detector may be clipping the beginning of the caller's speech
All errors in calls from a specific region	Switch to Whisper for better accent coverage; if you are on general Nova-2, also try Nova-2 (Phone Call) to rule out telephony audio quality
Errors only in calls with background noise	Increase Denoising Mode (for example, from Low to High), then retest accented calls, since heavy denoising can degrade them
Agent name or product name consistently misspelled	Add to vocabulary hints with correct spelling
