Text normalization is the process of converting written tokens (numbers, dates, currency amounts, abbreviations) into their spoken equivalents before synthesis. Left to its defaults, a TTS engine may speak “123” as “one hundred twenty-three” or as “one two three” depending on context and provider behavior. Normalization settings let you control this explicitly.
What normalization affects
| Written form | Normalized (cardinal) | Normalized (digit-by-digit) |
|---|---|---|
| 123 | “one hundred twenty-three” | “one two three” |
| 2024-03-15 | “March fifteenth, twenty twenty-four” | depends on format setting |
| $49.99 | “forty-nine dollars and ninety-nine cents” | “forty-nine point nine nine dollars” |
| +1 800 555 0199 | “one eight hundred five hundred fifty-five zero one ninety-nine” | “plus one eight zero zero five five five zero one nine nine” |
| 10% | “ten percent” | “ten percent” (same) |
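The difference between the two reading styles in the table can be sketched in a few lines of Python. This is only an illustration, not DialNexa's or any provider's actual pipeline: the function names `read_digits` and `read_cardinal` are hypothetical, and the cardinal reader is deliberately limited to 0 through 999.

```python
ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"]


def read_digits(token: str) -> str:
    """Speak each ASCII digit individually; other characters are skipped."""
    return " ".join(ONES[int(ch)] for ch in token if ch in "0123456789")


def read_cardinal(n: int) -> str:
    """Speak an integer from 0 to 999 as a cardinal number (sketch only)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0-999")
    if n < 10:
        return ONES[n]
    if n < 20:
        return TEENS[n - 10]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} {read_cardinal(rest)}"


print(read_cardinal(123))  # one hundred twenty-three
print(read_digits("123"))  # one two three
```

Note that `read_digits` drops symbols such as “+”, so it would not produce the leading “plus” shown in the phone number row; a fuller implementation would map symbols to words as well.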
Default normalization behavior depends on two things: the voice provider (ElevenLabs or Cartesia) and the Normalize Text settings in DialNexa. When Normalize Text is enabled, DialNexa applies a pre-processing layer before text reaches the provider.
When to override normalization
The default “speak numbers as words” behavior is correct for most conversational contexts. Override it when:
- Phone numbers: callers expect digits read individually, not as a cardinal number. “eight hundred five five five” is ambiguous; “eight zero zero five five five zero one nine nine” is not.
- Account or reference IDs: “your booking ID is AB-1234” should be “A B one two three four”, not “A B one thousand two hundred thirty-four”.
- Confirmation codes: PIN codes, OTPs, and similar short digit strings should always be digit-by-digit.
- Years: “2024” as “twenty twenty-four” is natural; “two thousand twenty-four” is less so for conversational speech.
Configuring normalization in DialNexa
Normalization is configured per agent in Settings > Speech > Normalize Text.
Global Normalize Text toggle
When enabled, DialNexa applies its normalization pipeline before sending text to the TTS provider. When disabled, text is sent as-is and the provider’s default behavior applies.
Number reading mode
- Words: numbers are spoken as cardinal or ordinal words (“one hundred twenty-three”)
- Digits: numbers are spoken digit-by-digit (“one two three”)
- Auto: DialNexa infers the appropriate mode based on context (phone number patterns, ID patterns, year patterns)
Auto mode handles most cases well. Switch to explicit Words or Digits mode if Auto produces incorrect output for specific token types common in your use case.
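To make the idea of context-based inference concrete, here is a minimal sketch of how a token might be classified by pattern. The regular expressions, the `infer_mode` function, and the separate `"year"` result are all assumptions made for illustration; DialNexa's actual Auto heuristics are not described here and may differ.

```python
import re

# Hypothetical patterns, chosen only to illustrate the idea.
YEAR = re.compile(r"^(19|20)\d{2}$")             # e.g. 1999, 2024
ID = re.compile(r"^[A-Za-z]{1,4}-?\d+$")         # e.g. AB-1234
PHONE = re.compile(r"^\+?\d[\d\s().-]{6,}\d$")   # e.g. 1-800-555-0199


def infer_mode(token: str) -> str:
    """Return 'year', 'digits', or 'words' for a single token."""
    token = token.strip()
    if YEAR.match(token):
        return "year"      # read in pairs, e.g. "twenty twenty-four"
    if ID.match(token) or PHONE.match(token):
        return "digits"    # read digit-by-digit
    return "words"         # default: cardinal reading


for sample in ["2024", "AB-1234", "1-800-555-0199", "123"]:
    print(sample, "->", infer_mode(sample))
```

A heuristic like this will misclassify some inputs (a long order quantity can look like a phone number), which is the kind of case where an explicit Words or Digits setting is the safer choice.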
Overriding normalization in prompts
For precise control over how specific values are spoken, you can write them in the prompt using SSML-style hints or explicit phonetic spelling. For example:
- Write "1-800-555-0199" (with hyphens) instead of "18005550199"; hyphens signal a phone number pattern to the normalization engine
- Write "PIN: 4-2-7-9" with explicit separators for digit-by-digit reading
- Write "twenty twenty-four" directly if you want a year spoken a specific way regardless of normalization settings
Dynamic variable normalization
When dynamic variables are injected into agent speech (e.g., {{customer_account_number}}), the value is subject to the same normalization rules as static text. If your variable contains a phone number or ID that should be read digit-by-digit, either:
- Configure the agent’s number reading mode to Digits, or
- Format the variable value with separators before injecting it (e.g., "1-800-555-0199" instead of "18005550199")
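If you format values in your own backend before they are passed in as dynamic variables, a small helper along these lines could add the separators. This is a sketch under assumptions: `with_separators` and its `groups` parameter are hypothetical names, and how the formatted value is handed to DialNexa is left out.

```python
def with_separators(value: str, groups: tuple[int, ...] = ()) -> str:
    """Insert hyphens between digit groups.

    With no groups given, every digit is separated ("4279" -> "4-2-7-9").
    Non-digit characters in the input are discarded first.
    """
    digits = "".join(ch for ch in value if ch in "0123456789")
    if not groups:
        return "-".join(digits)
    if sum(groups) != len(digits):
        raise ValueError("group sizes must add up to the number of digits")
    parts, start = [], 0
    for size in groups:
        parts.append(digits[start:start + size])
        start += size
    return "-".join(parts)


print(with_separators("18005550199", (1, 3, 3, 4)))  # 1-800-555-0199
print(with_separators("4279"))                       # 4-2-7-9
```

Formatting at the source keeps the agent's number reading mode free for other content, so ordinary quantities in the same conversation can still be read as words.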