LLM Compiler Agent Pattern
The LLM Compiler pattern treats multi-tool workflows like a compiler, constructing a Directed Acyclic Graph (DAG) of tool calls with explicit dependencies, then executing nodes in topological order. This enables parallel execution of independent tools while respecting dependencies.
Overview
Best For: Complex multi-tool workflows with parallelizable steps
Complexity: ⭐⭐⭐ Advanced (DAG construction and execution)
Cost: $$ Medium (Efficient execution despite complexity)
When to Use LLM Compiler
Ideal Use Cases
✅ Parallel tool execution
Multiple independent tool calls
Can execute simultaneously
Respects dependencies when present
Maximizes efficiency
✅ Complex data pipelines
Multiple processing steps
Clear dependencies between steps
Benefits from parallel execution
Structured workflow
✅ Multi-source data gathering
Fetch from multiple sources
Some sources independent
Combine results systematically
Optimize execution time
✅ Workflow orchestration
Complex task dependencies
Want optimal execution order
Need to maximize parallelism
Clear input/output relationships
When NOT to Use LLM Compiler
❌ Simple sequential tasks → Use Plan & Solve ❌ Highly dynamic workflows → Use ReAct ❌ Few tools needed → Overhead not worthwhile ❌ Unknown dependencies → Hard to construct DAG upfront
How LLM Compiler Works
The DAG Construction and Execution
TASK: "Get weather in NYC and LA, calculate average temperature"
┌─────────────────────────────────────────┐
│ PHASE 1: PLANNER CREATES DAG │
│ │
│ NODE: node1 │
│ TOOL: get_weather │
│ ARGS: {"location": "NYC"} │
│ DEPENDS_ON: [] │
│ │
│ NODE: node2 │
│ TOOL: get_weather │
│ ARGS: {"location": "LA"} │
│ DEPENDS_ON: [] │
│ │
│ NODE: node3 │
│ TOOL: calculate │
│ ARGS: {"expr": "(#node1 + #node2) / 2"}│
│ DEPENDS_ON: [node1, node2] │
│ │
└─────────────────┬───────────────────────┘
↓
┌─────────────────────────────────────────┐
│ PHASE 2: EXECUTOR (Topological Order) │
│ │
│ Iteration 1: Execute ready nodes │
│ ├─ node1: weather("NYC") → "72°F" │
│ └─ node2: weather("LA") → "85°F" │
│ (Parallel execution!) │
│ │
│ Iteration 2: node1, node2 complete │
│ └─ node3: calculate("(72+85)/2") │
│ → "78.5°F" │
│ │
│ All nodes complete! │
│ │
└─────────────────┬───────────────────────┘
↓
┌─────────────────────────────────────────┐
│ PHASE 3: SYNTHESIZER │
│ │
│ Results: │
│ - node1: 72°F │
│ - node2: 85°F │
│ - node3: 78.5°F │
│ │
│ Final Answer: │
│ "NYC weather: 72°F, LA weather: 85°F, │
│ Average: 78.5°F" │
│ │
└─────────────────────────────────────────┘
Key Concepts
Directed Acyclic Graph (DAG):
Nodes represent tool calls
Edges represent dependencies
No cycles (acyclic)
Enables topological ordering
Topological Execution:
Execute nodes when dependencies satisfied
Parallel execution of independent nodes
Efficient use of resources
Guaranteed correct ordering
Dependency Resolution:
#node1in parameters references another node’s outputAutomatically resolved when node1 completes
Enables data flow through DAG
Theoretical Foundation
Based on the paper “An LLM Compiler for Parallel Function Calling”. Inspired by compiler optimization techniques.
Key principles:
Static analysis: Determine dependencies upfront
Optimization: Identify parallelizable operations
Efficient execution: Run independent operations simultaneously
Correctness: Respect all dependencies
Algorithm
def llm_compiler(task, tools):
"""Simplified LLM Compiler algorithm"""
# Phase 1: Construct DAG
dag = planner_llm_generate_graph(task, tools)
# dag = {
# "nodes": [
# {"id": "node1", "tool": "search", "args": {...}, "depends_on": []},
# {"id": "node2", "tool": "calc", "args": {"x": "#node1"}, "depends_on": ["node1"]},
# ]
# }
# Phase 2: Execute in topological order
results = {}
while not all_nodes_complete(dag, results):
# Find nodes ready to execute (dependencies satisfied)
ready_nodes = [
n for n in dag["nodes"]
if n["id"] not in results
and all(dep in results for dep in n["depends_on"])
]
# Execute ready nodes (can be parallelized)
for node in ready_nodes:
# Resolve dependencies in arguments
resolved_args = resolve_references(node["args"], results)
# Execute tool
result = tools[node["tool"]](**resolved_args)
results[node["id"]] = result
# Phase 3: Synthesize final answer
final_answer = synthesizer_llm(task, dag, results)
return final_answer
API Reference
Class: LLMCompilerAgent
from agent_patterns.patterns import LLMCompilerAgent
agent = LLMCompilerAgent(
llm_configs: Dict[str, Dict[str, Any]],
tools: Optional[Dict[str, Callable]] = None,
prompt_dir: str = "prompts",
custom_instructions: Optional[str] = None,
prompt_overrides: Optional[Dict[str, Dict[str, str]]] = None
)
Parameters
Parameter |
Type |
Required |
Description |
|---|---|---|---|
|
|
Yes |
LLM configs for “thinking” and “documentation” roles |
|
|
No |
Dictionary mapping tool names to functions |
|
|
No |
Custom prompt directory (default: “prompts”) |
|
|
No |
Instructions appended to system prompts |
|
|
No |
Override specific prompts programmatically |
LLM Roles
thinking: Used for planning (DAG generation)
documentation: Used for synthesizing final answer
Methods
run(input_data: str) -> str
Executes the LLM Compiler pattern on the given input.
Parameters:
input_data(str): The task requiring multiple tools
Returns: str - The final synthesized answer
Raises: ValueError if graph not built
build_graph() -> None
Builds the LangGraph state graph. Called automatically during initialization.
Complete Examples
Basic Usage
from agent_patterns.patterns import LLMCompilerAgent
# Define tools
def search_tool(query: str) -> str:
"""Search for information"""
# API call
return f"Search results for: {query}"
def calculate_tool(expression: str) -> float:
"""Evaluate mathematical expression"""
return eval(expression) # Use safe_eval in production
def get_price(product: str) -> float:
"""Get product price"""
prices = {"laptop": 999, "phone": 699, "tablet": 499}
return prices.get(product, 0)
# Configure LLMs
llm_configs = {
"thinking": {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.3,
},
"documentation": {
"provider": "openai",
"model": "gpt-4",
"temperature": 0.7,
}
}
# Create agent
agent = LLMCompilerAgent(
llm_configs=llm_configs,
tools={
"search": search_tool,
"calculate": calculate_tool,
"get_price": get_price,
}
)
# Execute complex workflow
result = agent.run("""
Find the prices of laptop, phone, and tablet.
Calculate the total cost if I buy one of each.
Also search for information about each product's warranty.
Provide a summary with total cost and warranty info.
""")
print(result)
# Agent will:
# 1. PLAN: Create DAG with parallel price lookups and searches
# 2. EXECUTE: Run get_price and search calls in parallel
# Then calculate total (depends on prices)
# 3. SYNTHESIZE: Combine all results into summary
With Custom Instructions
data_pipeline_instructions = """
You are orchestrating data processing pipelines.
DAG CONSTRUCTION:
- Identify all data sources (parallel)
- Identify processing steps (sequential when dependent)
- Identify aggregation steps (after all data ready)
- Maximize parallelism where safe
TOOL EXECUTION:
- Respect all dependencies
- Never execute before dependencies ready
- Handle errors gracefully
SYNTHESIS:
- Present data clearly
- Highlight key insights
- Show data lineage
"""
agent = LLMCompilerAgent(
llm_configs=llm_configs,
tools=tools,
custom_instructions=data_pipeline_instructions
)
result = agent.run("""
Analyze sales data:
1. Fetch sales from Q1, Q2, Q3, Q4 (parallel)
2. Calculate total annual sales
3. Calculate quarter-over-quarter growth rates
4. Identify best and worst performing quarters
5. Generate executive summary
""")
With Prompt Overrides
# Customize DAG planning
overrides = {
"PlanGraph": {
"system": """You are an expert at constructing execution graphs for
multi-tool workflows. Create DAGs that maximize parallelism while respecting
all dependencies.""",
"user": """Task: {task}
Available tools:
{tools}
Create a DAG (Directed Acyclic Graph) for this task.
For each node in the graph, specify:
NODE: <unique_id>
TOOL: <tool_name>
ARGS: <JSON args, use #node_id to reference other nodes>
DEPENDS_ON: <list of node_ids this depends on, or []>
Make independent operations parallelizable by having empty or non-overlapping
dependencies.
Your DAG:"""
},
"Synthesize": {
"system": "You synthesize results from complex workflows into clear answers.",
"user": """Task: {task}
Execution results:
{results}
Create a comprehensive answer that:
1. Addresses the original task completely
2. Presents information logically
3. Highlights key findings
4. Shows how results relate to each other
Your answer:"""
}
}
agent = LLMCompilerAgent(
llm_configs=llm_configs,
tools=tools,
prompt_overrides=overrides
)
Tool Definition Guidelines
Tool Function Signature
def tool_name(param1: str, param2: int = 0) -> Any:
"""
Clear description of what the tool does.
Args:
param1: Description of parameter 1
param2: Description of parameter 2 (optional)
Returns:
Result (can be any type, will be converted to string)
"""
# Tool implementation
return result
Dependency References
Tools can reference other node outputs using #node_id:
# In DAG:
# NODE: node1
# TOOL: get_data
# ARGS: {"source": "api"}
# DEPENDS_ON: []
#
# NODE: node2
# TOOL: process_data
# ARGS: {"data": "#node1"} # References node1's output
# DEPENDS_ON: [node1]
# When executing node2, #node1 is replaced with actual result
Customizing Prompts
Understanding the System Prompt Structure
Version 0.2.0 introduces enterprise-grade prompts with a comprehensive 9-section structure (150-300+ lines vs ~32 lines).
The 9-Section Structure: All prompts include Role and Identity, Core Capabilities, Process, Output Format, Decision-Making Guidelines, Quality Standards, Edge Cases, Examples, and Critical Reminders. Benefits: Better reliability and robustness.
Understanding LLM Compiler Prompts
Uses two main prompts (both now with comprehensive 9-section structure):
PlanGraph: Planner LLM creates DAG structure with detailed quality standards and edge case handling
Synthesize: Synthesizer LLM combines results with systematic process guidance
Method 1: Custom Instructions
agent = LLMCompilerAgent(
llm_configs=llm_configs,
tools=tools,
custom_instructions="""
OPTIMIZATION GOAL: Maximize parallelism
CORRECTNESS GOAL: Respect all dependencies
CLARITY GOAL: Clear, structured final answers
"""
)
Method 2: Prompt Overrides
See “With Prompt Overrides” example above.
Method 3: Custom Prompt Directory
my_prompts/
└── LLMCompilerAgent/
├── PlanGraph/
│ ├── system.md
│ └── user.md
└── Synthesize/
├── system.md
└── user.md
Setting Agent Goals
Via Task Description
# Clear task with sub-goals
agent.run("""
Goal: Compare three cloud providers (AWS, GCP, Azure)
Sub-tasks (can be parallelized):
1. Get pricing for each provider
2. Get features for each provider
3. Search for reviews of each provider
Then:
4. Create comparison matrix
5. Generate recommendation
Provide detailed comparison.
""")
Via Custom Instructions
agent = LLMCompilerAgent(
llm_configs=llm_configs,
tools=tools,
custom_instructions="""
GOAL: Efficient, parallel execution of multi-tool workflows
PLANNING:
- Identify all independent operations
- Enable maximum parallelism
- Clear dependency chains
EXECUTION:
- Respect topological order
- Handle errors without blocking entire workflow
OUTPUT:
- Comprehensive synthesis
- Clear presentation
- Actionable insights
"""
)
Advanced Usage
Parallel Execution Simulation
# Current implementation executes sequentially
# But DAG enables parallel execution in production
class ParallelLLMCompilerAgent(LLMCompilerAgent):
def _executor_dispatch(self, state):
"""Override to add parallel execution"""
import concurrent.futures
graph = state["execution_graph"]
results = state["node_results"]
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
while not self._all_complete(graph, results):
# Find ready nodes
ready_nodes = self._get_ready_nodes(graph, results)
if not ready_nodes:
break
# Submit all ready nodes to executor
futures = {
executor.submit(
self._execute_tool,
node["tool"],
node["args"],
results
): node
for node in ready_nodes
}
# Collect results
for future in concurrent.futures.as_completed(futures):
node = futures[future]
result = future.result()
results[node["id"]] = result
state["node_results"] = results
return state
agent = ParallelLLMCompilerAgent(llm_configs=llm_configs, tools=tools)
DAG Visualization
class VisualizingLLMCompilerAgent(LLMCompilerAgent):
def run(self, input_data):
"""Override to visualize DAG"""
result = super().run(input_data)
# Access DAG (would need to store during execution)
print("\n=== Execution DAG ===")
self._print_dag()
return result
def _print_dag(self):
"""Print DAG structure"""
# Implementation to visualize the execution graph
pass
agent = VisualizingLLMCompilerAgent(llm_configs=llm_configs, tools=tools)
Performance Considerations
Cost Analysis
LLM Compiler cost:
Plan DAG: 1 LLM call
Execute tools: 0 LLM calls (just tool execution)
Synthesize: 1 LLM call
Total: 2 LLM calls (like REWOO)
Efficiency gains:
Parallel execution reduces wall-clock time
Only 2 LLM calls regardless of tool count
Optimal execution order minimizes waste
When LLM Compiler Excels
✅ Many independent tools: Parallel execution shines ✅ Complex dependencies: DAG handles correctly ✅ Time-sensitive: Parallelism speeds up execution ✅ Clear structure: Can plan DAG upfront
When to Use Other Patterns
Scenario |
Better Pattern |
Reason |
|---|---|---|
Dynamic workflow |
ReAct |
Can’t plan DAG upfront |
Simple sequence |
Plan & Solve |
DAG overhead unnecessary |
No tools |
Self-Discovery, Reflection |
LLM Compiler needs tools |
Unknown dependencies |
ReAct |
Adaptive approach better |
Comparison with Other Patterns
Aspect |
LLM Compiler |
REWOO |
ReAct |
|---|---|---|---|
Planning |
DAG construction |
Linear with placeholders |
Adaptive |
Execution |
Topological order |
Sequential |
Iterative |
Parallelism |
Explicit support |
No |
No |
LLM Calls |
2 (fixed) |
2 (fixed) |
N + 1 |
Dependencies |
Explicit in DAG |
Implicit in placeholders |
Adaptive |
Best For |
Complex workflows |
Batch operations |
Dynamic exploration |
Common Pitfalls
1. Circular Dependencies
❌ Bad: Creating cycles in DAG
# NODE: node1 depends on node2
# NODE: node2 depends on node1
# → Impossible to execute!
✅ Good: Acyclic dependencies
# NODE: node1 depends on []
# NODE: node2 depends on [node1]
# NODE: node3 depends on [node1, node2]
2. Missing Dependencies
❌ Bad: Not specifying required dependencies
# NODE: node2 uses #node1 in args
# DEPENDS_ON: [] # Missing node1!
✅ Good: Explicit dependencies
# NODE: node2 uses #node1 in args
# DEPENDS_ON: [node1] # ✅ Correct
3. Over-Sequencing
❌ Bad: Making everything depend on everything
✅ Good: Only specify actual dependencies
# If node2 and node3 are independent:
# NODE: node2 DEPENDS_ON: []
# NODE: node3 DEPENDS_ON: []
# → Can execute in parallel!
4. Incorrect Reference Syntax
❌ Bad: Wrong reference format
# ARGS: {"data": "node1"} # Missing #
✅ Good: Correct reference
# ARGS: {"data": "#node1"} # ✅ Will be resolved
Troubleshooting
DAG Parsing Failures
Symptom: Can’t extract DAG from plan
Solutions:
# Strengthen PlanGraph prompt format
overrides = {
"PlanGraph": {
"user": """...
STRICT FORMAT (follow exactly):
NODE: node1
TOOL: tool_name
ARGS: {"param": "value"}
DEPENDS_ON: []
NODE: node2
TOOL: tool_name
ARGS: {"param": "#node1"}
DEPENDS_ON: [node1]
(Blank line between nodes)
Your DAG:"""
}
}
Execution Hangs
Symptom: Some nodes never execute
Solutions:
# Check for:
# 1. Circular dependencies (impossible to resolve)
# 2. Missing tools (can't execute)
# 3. Incorrect dependency specification
# Add validation:
class ValidatingLLMCompilerAgent(LLMCompilerAgent):
def _planner_generate_graph(self, state):
state = super()._planner_generate_graph(state)
# Validate DAG
if self._has_cycles(state["execution_graph"]):
state["error"] = "Circular dependencies detected"
return state
Poor Parallelism
Symptom: Nodes execute sequentially despite being independent
Solutions:
# Emphasize parallelism in planning
custom_instructions = """
DAG PLANNING:
When creating the DAG, actively look for opportunities for parallel execution.
If two nodes don't depend on each other, they should have independent DEPENDS_ON lists.
"""
Next Steps
Try the complete examples
Learn about REWOO for simpler batch execution
Explore ReAct for dynamic tool workflows
Read the original paper
References
Original paper: An LLM Compiler for Parallel Function Calling
DAG concepts: Directed Acyclic Graphs
Topological sorting: Topological Sort
Compiler optimization techniques applied to LLM workflows