Start with the Transcript
Before changing anything, read the transcript of a call where the behavior occurred. Specifically:
- What did the caller say immediately before the unwanted behavior?
- What did the agent say (or not say)?
- Was a tool invoked? What did it return?
- Is the behavior deterministic (happens every time) or occasional?
Common Behavioral Problems and Fixes
Agent goes off-topic or ignores instructions
The agent is not respecting constraints in the system prompt. This is the most common issue.
Fixes:
- Move the constraint closer to the top of the system prompt. LLMs often follow instructions near the start or end of a prompt more reliably than ones buried in the middle.
- Make the constraint explicit and negative: instead of “focus on appointments”, say “Do not discuss topics unrelated to appointment booking. If asked about anything else, politely redirect the caller.”
- Add a reminder at the end of the prompt: “Always stay within your role as a booking assistant.”
- If using a multi-prompt architecture, check that the constraint is present in the specific prompt where the behavior occurs - it may not be inherited from another prompt.
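One way to apply the placement advice above is to assemble the prompt from named parts, so the constraint always comes first and the reminder always comes last. This is a minimal sketch, not a platform feature: `build_system_prompt` and the task description in `body` are hypothetical, and only the constraint and reminder wording come from the fixes above.

```python
def build_system_prompt(constraint: str, body: str, reminder: str) -> str:
    """Put the hard constraint first, the task details next, and a short reminder last."""
    sections = [constraint.strip(), body.strip(), reminder.strip()]
    # Skip empty sections so the prompt never contains stray blank blocks.
    return "\n\n".join(section for section in sections if section)


prompt = build_system_prompt(
    constraint=(
        "Do not discuss topics unrelated to appointment booking. "
        "If asked about anything else, politely redirect the caller."
    ),
    # Illustrative task description (an assumption, not from the platform docs).
    body="You help callers book, reschedule, and cancel appointments.",
    reminder="Always stay within your role as a booking assistant.",
)
print(prompt)
```

In a multi-prompt setup, the same helper could be called once per prompt so the constraint is present in each one rather than assumed to carry over.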
Agent gives incorrect information
The agent is hallucinating facts or misremembering details.
Fixes:
- Never rely on the LLM’s parametric knowledge for facts that need to be accurate (business hours, prices, addresses, policies). Pass this information explicitly in the system prompt or via a tool that fetches it dynamically.
- Use a dynamic variable (`{{business_hours}}`) to inject current facts into the prompt at call time, rather than hardcoding them.
- Add an explicit instruction: “Only state information that is explicitly provided to you. Do not guess or infer details.”
- Switch to a more capable model (for example, GPT-4o) if the information is complex and the current model is struggling.
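To illustrate what injecting a dynamic variable at call time amounts to, the sketch below fills `{{name}}` placeholders locally. The platform performs its own substitution, so `render_prompt` is purely a hypothetical stand-in, and the business hours shown are made-up sample data. The useful habit it demonstrates is failing loudly when a value is missing, so the agent never reads a raw placeholder aloud.

```python
import re

# Matches {{variable_name}}, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_prompt(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder; raise if a value was not supplied."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("no value supplied for {{%s}}" % name)
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


template = (
    "Our business hours are {{business_hours}}. "
    "Only state information that is explicitly provided to you."
)
# Sample value for illustration only.
rendered = render_prompt(template, {"business_hours": "Monday to Friday, 9am to 5pm"})
print(rendered)
```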
Agent doesn't use a tool when it should
The LLM is deciding not to invoke the tool even when the situation calls for it.
Fixes:
- Make the tool invocation condition explicit in the prompt: “When the caller wants to book an appointment, you MUST use the `book_calendar` tool. Never confirm an appointment without using this tool.”
- Review the tool’s description. The LLM decides when to use a tool based on the description. If the description is vague, the LLM won’t know when to invoke it.
- Check whether the tool is actually enabled on the agent. Go to the agent’s Tools tab and confirm the tool is listed and active.
- Add a few examples in the prompt showing the scenario where the tool should be called.
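As a rough sketch of the last fix, example turns can be appended to the prompt as a short dialogue. Everything here is illustrative: `with_examples` is a hypothetical helper, the dialogue is invented, and the bracketed `[uses book_calendar]` note is an assumed convention for marking where the tool call happens, not a platform syntax.

```python
# Invented example dialogue; replace with turns taken from your own transcripts.
EXAMPLES = [
    (
        "Yes, that's right: Maria Lopez, Tuesday at 3pm.",
        "[uses book_calendar] Thank you, Maria. You're booked for Tuesday at 3pm.",
    ),
]


def with_examples(system_prompt: str, examples: list) -> str:
    """Append caller/agent example turns to the end of a system prompt."""
    lines = [system_prompt.rstrip(), "", "Example:"]
    for caller, agent in examples:
        lines.append(f"Caller: {caller}")
        lines.append(f"Agent: {agent}")
    return "\n".join(lines)


prompt = with_examples(
    "When the caller wants to book an appointment, you MUST use the book_calendar tool.",
    EXAMPLES,
)
print(prompt)
```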
Agent uses a tool at the wrong time
The LLM is invoking a tool in situations where it shouldn’t.
Fixes:
- Add a negative condition to the tool description: “Only invoke this tool when the caller has explicitly confirmed their appointment details. Do not invoke this tool to check availability.”
- If you have two tools that the LLM confuses (e.g., `check_slots` and `book_slot`), make their descriptions sharply distinct in purpose.
- Add prompt instructions: “Before booking, always confirm the caller’s name, date, and time. Do not call `book_calendar` until all three are confirmed.”
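The "all three are confirmed" rule can also be written down as a small check, for example to run over test-call data or inside a tool handler you control. This is a sketch under assumptions: the `confirmed` dictionary shape and both function names are hypothetical, and the sample values are made up.

```python
# The three details the prompt instruction above requires before booking.
REQUIRED_FIELDS = ("name", "date", "time")


def missing_confirmations(confirmed: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not confirmed.get(field)]


def may_call_book_calendar(confirmed: dict) -> bool:
    """True only when name, date, and time have all been confirmed."""
    return not missing_confirmations(confirmed)


partial = {"name": "Maria Lopez", "date": "2025-06-03"}
print(missing_confirmations(partial))
print(may_call_book_calendar({**partial, "time": "15:00"}))
```

A check like this does not replace the prompt instruction; it only gives you a place to catch the cases where the LLM ignores it.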
Agent ends the call too early
The agent is triggering its end-call logic prematurely.
Fixes:
- Review the end-call condition in your prompt. If it says “end the call when the conversation is complete,” the LLM may interpret a natural pause as completion.
- Be specific: “Only end the call after the caller explicitly says goodbye or indicates they have no further questions.”
- In conversation flow mode, check the transition conditions on nodes that lead to the End Node. An overly broad condition may be firing too early.
Agent doesn't end the call when it should
The call goes on indefinitely because the agent doesn’t recognize it should close.
Fixes:
- Add an explicit end condition: “Once the appointment is confirmed and the caller has no further questions, thank them and end the call.”
- Use the `max_duration` setting to cap call length as a safety net.
- In conversation flow mode, ensure there is a valid path to the End Node for all happy-path scenarios.
Agent response is too long or verbose
The agent is producing overly long responses that sound unnatural in a voice context.
Fixes:
- Add an explicit length constraint: “Keep all responses under 2 sentences. Speak naturally and concisely - this is a phone call, not a written message.”
- Add: “Do not use bullet points, lists, or formatting. Speak in plain conversational sentences.”
- Reduce temperature slightly (e.g., from 0.7 to 0.5) to make responses more focused.
Agent response is too short or curt
The agent sounds robotic or dismissive.
Fixes:
- Add personality guidance: “Speak warmly and naturally. Acknowledge what the caller said before responding.”
- Provide an example of a good response in the prompt: “For example, if the caller says they want to cancel, say: ‘Of course, I can help with that. Let me pull up your appointment…’”
- Increase temperature slightly (e.g., from 0.3 to 0.6) to allow more natural variation.
Adjusting Temperature
Temperature controls how deterministic the LLM’s responses are. Lower values produce more predictable, focused output; higher values produce more varied, creative output.
| Temperature | Effect | Best for |
|---|---|---|
| 0.0 - 0.3 | Very consistent, sometimes robotic | Structured data extraction, strict compliance scenarios |
| 0.4 - 0.6 | Balanced consistency and naturalness | Most customer service use cases |
| 0.7 - 0.9 | More natural and varied, occasionally unpredictable | Conversational agents, open-ended interactions |
| 1.0+ | Highly creative, prone to hallucination | Rarely appropriate for voice agents |
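To see why lower values are more predictable, it helps to look at how temperature is conventionally applied: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The sketch below uses three made-up logits; real models score an entire vocabulary, and providers may differ in implementation details.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        # A temperature of 0 is typically handled as a special case
        # (always pick the top-scoring token) rather than by this formula.
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.3, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature drops, more of the probability mass concentrates on the top-scoring token, which is why low settings sound consistent and high settings wander.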
Improving Tool Descriptions
The LLM reads tool descriptions to decide when and how to use each tool. A poorly written description is one of the most common causes of incorrect tool behavior. A good tool description answers:
- What does this tool do?
- When should I call it (and when should I not)?
- What information do I need before calling it?
- What will I get back?
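A description that answers all four questions might look like the sketch below. The layout follows a common JSON-schema style for function tools, which is an assumption: your platform's tool format may use different field names. The return value described (a confirmation ID or an error message) and the parameter formats are likewise invented for illustration.

```python
# Hypothetical tool definition; adapt the field names to your platform's format.
book_calendar_tool = {
    "name": "book_calendar",
    "description": (
        # What does it do?
        "Books an appointment on the business calendar. "
        # When should it be called, and when not?
        "Call it only after the caller has explicitly confirmed their name, date, and time. "
        "Do not call it to check availability. "
        # What comes back? (assumed return shape)
        "Returns a confirmation ID on success, or an error message if the slot is unavailable."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            # What information is needed before calling?
            "name": {"type": "string", "description": "Caller's full name, as confirmed on the call."},
            "date": {"type": "string", "description": "Appointment date in YYYY-MM-DD format."},
            "time": {"type": "string", "description": "Start time in 24-hour HH:MM format."},
        },
        "required": ["name", "date", "time"],
    },
}

print(book_calendar_tool["description"])
```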
Fixing Conversation Flow Logic
If you are using the conversation flow (node-based) architecture, behavioral issues often come from misconfigured transition conditions.
Map the expected path
Draw or review the intended flow for the scenario where the issue occurs. Which node should the agent be on? Which node does it actually end up on?
Check transition conditions
Click each edge (connection between nodes) in the flow editor. The transition condition is evaluated after each agent turn. If it is too broad (e.g., “if the caller responds”), it may fire too early.
Check for missing transitions
If there is no valid transition condition for a caller input, the conversation flow may get stuck on the current node, causing repetitive behavior. Add a fallback transition for unexpected inputs.
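The two checks above (missing fallbacks and paths that never reach the end) can be sketched as a small graph audit. The flow below is a hand-written, hypothetical representation mapping each node to its `(condition, target)` edges; it is not a platform export format, and the literal `"fallback"` condition is an assumed naming convention.

```python
from collections import deque

# Hypothetical flow: node name -> list of (transition condition, target node).
FLOW = {
    "greeting": [("caller wants to book", "collect_details"), ("fallback", "greeting")],
    "collect_details": [("all details confirmed", "book")],
    "book": [("booking succeeded", "end")],
    "end": [],
}


def nodes_without_fallback(flow, end_node="end"):
    """Nodes where an unexpected caller input has no transition to take."""
    return [
        node
        for node, edges in flow.items()
        if node != end_node and not any(cond == "fallback" for cond, _ in edges)
    ]


def nodes_that_cannot_reach(flow, end_node="end"):
    """Nodes with no path to the end node, found by searching backwards from it."""
    reverse = {node: set() for node in flow}
    for node, edges in flow.items():
        for _, target in edges:
            reverse.setdefault(target, set()).add(node)
    seen = {end_node}
    queue = deque([end_node])
    while queue:
        current = queue.popleft()
        for previous in reverse.get(current, ()):
            if previous not in seen:
                seen.add(previous)
                queue.append(previous)
    return sorted(node for node in flow if node not in seen)


print(nodes_without_fallback(FLOW))
print(nodes_that_cannot_reach(FLOW))
```

In this sample flow every node can reach `end`, but `collect_details` and `book` have no fallback, so an unexpected caller input there would leave the conversation stuck.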
Test with the flow simulator
Use the built-in simulator to walk through the conversation step by step. Enter example caller inputs and verify the agent takes the expected path through the flow.
Using Fine-Tune Examples
For persistent behavioral issues, you can add few-shot examples directly in the system prompt to show the LLM the exact behavior you want. Format the examples as a dialogue within the prompt, with caller and agent turns.
Testing Changes Before Going Live
Always test prompt and flow changes before deploying to production:
Use the Test Call feature
From your agent’s page, use the Test Call button to make a live call to your own phone. This runs the actual voice pipeline with real TTS and transcription.
Use the Flow Simulator
For conversation flow agents, the simulator lets you test transitions and node behavior without making a real call. This allows fast iteration before you commit changes.
A/B test prompt versions
Create a duplicate agent with the modified prompt and route a small percentage of real calls to it. Compare post-call analysis metrics between versions.
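If you route calls yourself, one simple way to send a small, stable share of traffic to the duplicate agent is to hash a call identifier into a bucket. This is a sketch under assumptions: `choose_agent`, the `"control"` and `"variant"` labels, and the 10% default are all hypothetical, and it presumes you have a dispatch layer where such a decision can be made.

```python
import hashlib


def choose_agent(call_id: str, variant_share: float = 0.1) -> str:
    """Deterministically assign a call to the original or the modified agent."""
    digest = hashlib.sha256(call_id.encode("utf-8")).digest()
    # Map the first 4 bytes to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "variant" if bucket < variant_share else "control"


assignments = [choose_agent(f"call-{i}") for i in range(10_000)]
print(assignments.count("variant") / len(assignments))
```

Hashing rather than random sampling means the same identifier always lands on the same version, which keeps the comparison of post-call analysis metrics clean.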
Review post-call analysis
Configure extraction fields that flag specific behavioral issues. After running test calls, check the analysis results to verify the issue is resolved.
Related Pages
- Debug Call Issues - finding which calls have problems
- Reliability Overview - platform-level failure handling