LLM Tab - DialNexa Documentation

The LLM Tab provides detailed control over how the language model processes each turn of the conversation. It overlaps with the Engine Tab for model and temperature settings but focuses specifically on context management, token limits, and which tools the LLM can access.

The LLM Tab and Engine Tab share some settings (model selection, temperature). Changes made in either tab affect the same underlying configuration. Check both tabs when auditing an agent’s LLM configuration.

Model Settings

Model

The LLM used for all agent response generation. This is the same setting as Primary Model in the Engine Tab. Changing it here updates the Engine Tab as well. See Engine Tab for guidance on selecting the right model for your use case.

Temperature

Controls response randomness. Same as the temperature setting in the Engine Tab. Range: 0.0 to 1.0. See Engine Tab for the full explanation.

Context Window Behavior

Context Window Size

The number of recent conversation turns included in the LLM prompt at each step. Including more turns gives the model more context to work with, which improves coherence in long conversations. Including too many turns increases token consumption and latency.

Setting	Effect
Small (last 5 turns)	Low token consumption, low latency. Suitable for short, transactional calls.
Medium (last 10-15 turns)	Balanced. Good for most conversational agents.
Large (last 20+ turns or full history)	High token consumption. Suitable for complex, long-running support calls where full history is needed.

Most LLMs have a maximum context window (e.g., 128k tokens). DialNexa automatically truncates the oldest turns if the conversation exceeds the model’s context limit. Configure the context window size to stay well within this limit for your expected call length and prompt size.

System Prompt Placement

Controls where in the LLM prompt the system instructions appear relative to the conversation history:

Top: System prompt appears before the conversation history (standard for most models).
Bottom: System prompt appears after the conversation history. Some models perform better with instructions at the end. Test with your specific model if you encounter instruction-following issues.

Include Call Metadata in Context

When enabled, the LLM receives structured metadata about the call in the prompt context:

Call ID
Caller phone number (for inbound calls)
Call start time
Variables passed at call initiation

This lets the LLM reference call metadata in its responses (e.g., “I can see you’re calling from the number we have on file.”) without you needing to inject this data manually into the variables.

Tool and Function Access

Available Tools

The list of tools and functions the LLM can call during the conversation. This list is populated from what you have attached in the Tools Tab. Use the LLM Tab to control which tools are active in the LLM’s tool-use context.

Control	Action
Toggle on/off	Enable or disable a specific tool for this agent version without removing it entirely. Disabled tools are not shown to the LLM and will not be called.
Reorder	Drag tools to reorder them. Tool order may affect which tool the LLM selects when multiple tools are applicable. Place the most commonly used tools first.

Tool Call Mode

Controls how the LLM decides when to call tools:

Mode	Behavior
Auto	The LLM decides when a tool call is appropriate based on the conversation. Standard mode.
Required	The LLM must call at least one tool on every turn. Use this for agents where every caller input requires a backend lookup.
None	The LLM cannot call any tools, even if tools are attached. Use for testing prompt behavior without tool calls.

Setting tool call mode to Required on an agent with multiple tools can cause the LLM to make unnecessary tool calls on every turn. Use this mode only when every turn genuinely requires a tool lookup.

Max Tool Calls Per Turn

The maximum number of sequential tool calls the LLM can make in a single turn before it must generate a response. Prevents runaway tool call chains. Default: 5. Lower this if tool calls are taking too long and causing excessive latency. Raise it if your use case requires chaining multiple lookups before responding.

Save Changes

Click Save to save to the current draft. Publish to apply to live calls.

​Model Settings

​Model

​Temperature

​Context Window Behavior

​Context Window Size

​System Prompt Placement

​Include Call Metadata in Context

​Tool and Function Access

​Available Tools

​Tool Call Mode

​Max Tool Calls Per Turn

​Save Changes

​Related

Model Settings

Model

Temperature

Context Window Behavior

Context Window Size

System Prompt Placement

Include Call Metadata in Context

Tool and Function Access

Available Tools

Tool Call Mode

Max Tool Calls Per Turn

Save Changes

Related