Configuration Guide
Complete guide to configuring LLMs, models, providers, and pattern-specific parameters for Agent Patterns.
Overview
Agent Patterns provides flexible configuration for:
LLM Providers: OpenAI, Anthropic, and others
Model Selection: Choose models per role
Model Parameters: Temperature, tokens, etc.
Pattern Parameters: Pattern-specific settings
Environment Variables: Centralized configuration
LLM Configuration Structure
Basic Structure
llm_configs = {
"role_name": {
"provider": "openai", # or "anthropic"
"model": "gpt-4",
"temperature": 0.7,
"max_tokens": 2000
}
}
Multi-Role Configuration
Different patterns use different roles:
from agent_patterns.patterns import ReflectionAgent
llm_configs = {
"documentation": {
"provider": "openai",
"model": "gpt-3.5-turbo", # Cheaper model for generation
"temperature": 0.8, # Higher temp for creativity
"max_tokens": 2000
},
"reflection": {
"provider": "openai",
"model": "gpt-4", # Smarter model for critique
"temperature": 0.3, # Lower temp for focused analysis
"max_tokens": 1500
}
}
agent = ReflectionAgent(llm_configs=llm_configs)
Provider Configuration
OpenAI
openai_config = {
"provider": "openai",
"model": "gpt-4-turbo",
"temperature": 0.7,
"max_tokens": 2000,
# Optional parameters
"top_p": 1.0,
"frequency_penalty": 0.0,
"presence_penalty": 0.0
}
Recommended Models:
gpt-4-turbo: Best quality, most expensive
gpt-4: High quality, expensive
gpt-3.5-turbo: Good quality, affordable
gpt-3.5-turbo-16k: Longer context, affordable
Anthropic
anthropic_config = {
"provider": "anthropic",
"model": "claude-3-opus-20240229",
"temperature": 0.7,
"max_tokens": 2000
}
Recommended Models:
claude-3-opus: Highest intelligence
claude-3-sonnet: Balanced performance/cost
claude-3-haiku: Fast, affordable
Mixed Providers
mixed_configs = {
"thinking": {
"provider": "anthropic",
"model": "claude-3-opus-20240229",
"temperature": 0.5
},
"execution": {
"provider": "openai",
"model": "gpt-3.5-turbo",
"temperature": 0.7
}
}
Model Parameters
Temperature
Controls randomness/creativity:
# Low temperature (0.0-0.3): Focused, deterministic
# Good for: Analysis, factual tasks, code generation
config_focused = {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.2
}
# Medium temperature (0.4-0.7): Balanced
# Good for: General tasks, Q&A
config_balanced = {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.7
}
# High temperature (0.8-1.0): Creative, diverse
# Good for: Creative writing, brainstorming
config_creative = {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.9
}
Max Tokens
Controls output length:
# Short responses
config_brief = {
"provider": "openai",
"model": "gpt-4",
"max_tokens": 500
}
# Standard responses
config_standard = {
"provider": "openai",
"model": "gpt-4",
"max_tokens": 2000
}
# Long responses
config_long = {
"provider": "openai",
"model": "gpt-4",
"max_tokens": 4000
}
Role-Based Configuration
Role Purposes by Pattern
ReActAgent:
react_configs = {
"thinking": { # For reasoning and tool selection
"provider": "openai",
"model": "gpt-4", # Use smart model
"temperature": 0.5 # Focused reasoning
}
}
ReflectionAgent:
reflection_configs = {
"documentation": { # For generating content
"provider": "openai",
"model": "gpt-3.5-turbo", # Can use cheaper model
"temperature": 0.8 # Creative generation
},
"reflection": { # For critique
"provider": "openai",
"model": "gpt-4", # Use smart model
"temperature": 0.3 # Focused critique
}
}
SelfDiscoveryAgent:
discovery_configs = {
"thinking": { # For strategy selection and planning
"provider": "openai",
"model": "gpt-4",
"temperature": 0.5
},
"execution": { # For executing steps
"provider": "openai",
"model": "gpt-4", # Keep quality high
"temperature": 0.6
}
}
PlanAndSolveAgent:
plan_solve_configs = {
"planning": { # For creating plans
"provider": "openai",
"model": "gpt-4", # Smart planning
"temperature": 0.4 # Focused planning
},
"execution": { # For executing steps
"provider": "openai",
"model": "gpt-3.5-turbo", # Can be cheaper
"temperature": 0.7
},
"documentation": { # For final output
"provider": "openai",
"model": "gpt-3.5-turbo",
"temperature": 0.7
}
}
ReflexionAgent:
reflexion_configs = {
"execution": { # For attempting solutions
"provider": "openai",
"model": "gpt-4",
"temperature": 0.7
},
"reflection": { # For learning from failures
"provider": "openai",
"model": "gpt-4", # Keep smart
"temperature": 0.3 # Focused reflection
}
}
Pattern-Specific Parameters
ReAct
from agent_patterns.patterns import ReActAgent
agent = ReActAgent(
llm_configs=configs,
tools=tools,
max_iterations=5, # Max reasoning cycles
prompt_dir="prompts",
custom_instructions=None,
prompt_overrides=None
)
Reflection
from agent_patterns.patterns import ReflectionAgent
agent = ReflectionAgent(
llm_configs=configs,
max_reflection_cycles=2, # How many refine cycles
prompt_dir="prompts",
custom_instructions=None,
prompt_overrides=None
)
Self-Discovery
from agent_patterns.patterns import SelfDiscoveryAgent
agent = SelfDiscoveryAgent(
llm_configs=configs,
reasoning_modules=None, # Use custom modules
max_selected_modules=3, # How many strategies
prompt_dir="prompts",
custom_instructions=None,
prompt_overrides=None
)
REWOO
from agent_patterns.patterns import REWOOAgent
agent = REWOOAgent(
llm_configs=configs,
tools=tools,
prompt_dir="prompts",
custom_instructions=None,
prompt_overrides=None
)
Reflexion
from agent_patterns.patterns import ReflexionAgent
agent = ReflexionAgent(
llm_configs=configs,
max_trials=3, # Max retry attempts
prompt_dir="prompts",
custom_instructions=None,
prompt_overrides=None
)
Environment Variables
Setup
Create .env file:
# API Keys
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
# Default Models
THINKING_MODEL_PROVIDER=openai
THINKING_MODEL_NAME=gpt-4
THINKING_TEMPERATURE=0.7
THINKING_MAX_TOKENS=2000
DOCUMENTATION_MODEL_PROVIDER=openai
DOCUMENTATION_MODEL_NAME=gpt-3.5-turbo
DOCUMENTATION_TEMPERATURE=0.7
# Pattern Defaults
MAX_ITERATIONS=5
MAX_REFLECTION_CYCLES=2
MAX_TRIALS=3
Loading Environment
import os
from dotenv import load_dotenv
load_dotenv()
# Build configs from environment
llm_configs = {
"thinking": {
"provider": os.getenv("THINKING_MODEL_PROVIDER", "openai"),
"model": os.getenv("THINKING_MODEL_NAME", "gpt-4"),
"temperature": float(os.getenv("THINKING_TEMPERATURE", "0.7")),
"max_tokens": int(os.getenv("THINKING_MAX_TOKENS", "2000"))
}
}
# Pattern parameters
max_iterations = int(os.getenv("MAX_ITERATIONS", "5"))
Cost Optimization
Strategy 1: Use Cheaper Models for Simple Steps
cost_optimized_configs = {
"planning": {
"provider": "openai",
"model": "gpt-4", # Smart planning
"temperature": 0.5
},
"execution": {
"provider": "openai",
"model": "gpt-3.5-turbo", # Cheaper execution
"temperature": 0.7
},
"documentation": {
"provider": "openai",
"model": "gpt-3.5-turbo", # Cheaper writing
"temperature": 0.7
}
}
Strategy 2: Reduce Max Tokens
token_optimized = {
"thinking": {
"provider": "openai",
"model": "gpt-4",
"max_tokens": 1000 # Reduced from 2000
}
}
Strategy 3: Limit Iterations/Cycles
# Limit ReAct iterations
agent = ReActAgent(llm_configs=configs, tools=tools, max_iterations=3)
# Limit Reflection cycles
agent = ReflectionAgent(llm_configs=configs, max_reflection_cycles=1)
Strategy 4: Use REWOO for Tool-Heavy Tasks
REWOO pattern plans all tool calls upfront, avoiding iterative LLM calls.
Performance Optimization
Faster Models
fast_configs = {
"thinking": {
"provider": "anthropic",
"model": "claude-3-haiku-20240307", # Fastest
"temperature": 0.7
}
}
Parallel Execution
Some patterns (like LLM Compiler) support parallel execution:
from agent_patterns.patterns import LLMCompilerAgent
agent = LLMCompilerAgent(
llm_configs=configs,
tools=tools,
# Will execute independent tool calls in parallel
)
Quality Optimization
Use Best Models
quality_configs = {
"thinking": {
"provider": "openai",
"model": "gpt-4-turbo", # Best OpenAI model
"temperature": 0.5
},
"reflection": {
"provider": "anthropic",
"model": "claude-3-opus-20240229", # Best Anthropic model
"temperature": 0.3
}
}
Increase Iterations/Cycles
# More thorough reflection
agent = ReflectionAgent(
llm_configs=configs,
max_reflection_cycles=3 # More refinement
)
# More retry attempts
agent = ReflexionAgent(
llm_configs=configs,
max_trials=5 # More learning attempts
)
Configuration Profiles
Development Profile
dev_configs = {
"thinking": {
"provider": "openai",
"model": "gpt-3.5-turbo", # Cheap, fast
"temperature": 0.7,
"max_tokens": 1000
}
}
dev_agent = ReActAgent(
llm_configs=dev_configs,
tools=tools,
max_iterations=3 # Fewer iterations
)
Production Profile
prod_configs = {
"thinking": {
"provider": "openai",
"model": "gpt-4", # High quality
"temperature": 0.7,
"max_tokens": 2000
}
}
prod_agent = ReActAgent(
llm_configs=prod_configs,
tools=tools,
max_iterations=5
)
Testing Profile
test_configs = {
"thinking": {
"provider": "openai",
"model": "gpt-3.5-turbo",
"temperature": 0.0, # Deterministic for testing
"max_tokens": 500
}
}
Configuration Best Practices
1. Use Environment Variables for Secrets
# Never hardcode API keys
❌ config = {"provider": "openai", "api_key": "sk-..."}
# Use environment variables
✅ load_dotenv()
✅ config = {"provider": "openai"} # API key from OPENAI_API_KEY env var
2. Match Model Capability to Task Complexity
# Simple tasks: cheaper models
simple_configs = {"thinking": {"provider": "openai", "model": "gpt-3.5-turbo"}}
# Complex tasks: smarter models
complex_configs = {"thinking": {"provider": "openai", "model": "gpt-4"}}
3. Adjust Temperature by Use Case
# Factual/analytical: low temperature
analytical_config = {"temperature": 0.2}
# Creative/diverse: high temperature
creative_config = {"temperature": 0.9}
4. Set Appropriate Token Limits
# Brief answers
brief_config = {"max_tokens": 500}
# Detailed analysis
detailed_config = {"max_tokens": 3000}
5. Version Your Configurations
# configs_v1.py
V1_CONFIGS = {
"thinking": {"provider": "openai", "model": "gpt-3.5-turbo"}
}
# configs_v2.py
V2_CONFIGS = {
"thinking": {"provider": "openai", "model": "gpt-4"}
}
Troubleshooting
Issue: Rate Limiting
Solution: Add retry logic or reduce concurrency
import time
from openai import RateLimitError
def create_agent_with_retry():
max_retries = 3
for i in range(max_retries):
try:
return ReActAgent(llm_configs=configs, tools=tools)
except RateLimitError:
if i < max_retries - 1:
time.sleep(2 ** i) # Exponential backoff
else:
raise
Issue: High Costs
Solutions:
Use cheaper models where possible
Reduce max_tokens
Limit iterations/cycles
Use REWOO pattern for tool-heavy workflows
Issue: Poor Quality
Solutions:
Use smarter models (gpt-4, claude-3-opus)
Increase reflection cycles
Lower temperature for focused tasks
Add better custom instructions
Issue: Slow Performance
Solutions:
Use faster models (gpt-3.5-turbo, claude-haiku)
Reduce max_tokens
Limit iterations
Use patterns with parallel execution (LLM Compiler)
Next Steps
Review Error Handling for robust configuration
See Deployment Guide for production configs
Explore Best Practices for optimization
Check individual pattern docs for pattern-specific configs
Reference
Model Comparison
Model |
Provider |
Quality |
Speed |
Cost |
Best For |
|---|---|---|---|---|---|
gpt-4-turbo |
OpenAI |
Excellent |
Medium |
High |
Complex reasoning |
gpt-4 |
OpenAI |
Excellent |
Slow |
High |
High-quality output |
gpt-3.5-turbo |
OpenAI |
Good |
Fast |
Low |
General tasks |
claude-3-opus |
Anthropic |
Excellent |
Medium |
High |
Complex reasoning |
claude-3-sonnet |
Anthropic |
Very Good |
Fast |
Medium |
Balanced tasks |
claude-3-haiku |
Anthropic |
Good |
Very Fast |
Low |
Simple tasks |
Configuration Parameters Reference
LLM Config Parameters:
provider: “openai” or “anthropic”model: Model name/IDtemperature: 0.0-1.0 (default: 0.7)max_tokens: Integer (default: 2000)top_p: 0.0-1.0 (optional)frequency_penalty: -2.0-2.0 (optional, OpenAI)presence_penalty: -2.0-2.0 (optional, OpenAI)
Pattern Parameters:
ReAct:
max_iterationsReflection:
max_reflection_cyclesSelf-Discovery:
max_selected_modules,reasoning_modulesReflexion:
max_trialsAll patterns:
prompt_dir,custom_instructions,prompt_overrides