Self-Discovery Agent Pattern

The Self-Discovery pattern enables agents to dynamically select and adapt reasoning strategies from a library of problem-solving heuristics, creating customized reasoning plans for each unique task.

Overview

Best For: Complex reasoning tasks requiring multiple problem-solving approaches

Complexity: ⭐⭐⭐ Advanced (Sophisticated meta-reasoning)

Cost: $$$ Higher (Multiple LLM calls for discovery, adaptation, and execution)

When to Use Self-Discovery

Ideal Use Cases

✅ Multi-faceted problems

Agent discovers relevant reasoning approaches
Adapts general strategies to specific context
Combines multiple perspectives for comprehensive solutions

✅ Novel problem domains

No predetermined approach exists
Agent selects appropriate reasoning modules
Customizes strategy based on task characteristics

✅ Complex analytical tasks

Requires diverse reasoning methods (analogical, first principles, etc.)
Benefits from structured reasoning plan
Needs systematic approach to decomposition

✅ Strategic planning

Analyzes problem from multiple angles
Selects relevant planning heuristics
Executes customized reasoning workflow

When NOT to Use Self-Discovery

❌ Simple queries → Use direct LLM or ReAct ❌ Tasks with known solutions → Use Plan & Solve ❌ Tool-based workflows → Use ReAct or REWOO ❌ Speed-critical tasks → Meta-reasoning adds overhead

How Self-Discovery Works

The Discovery-Adaptation-Execution Cycle

┌─────────────────────────────────────────┐
│                                         │
│  1. DISCOVER: Select relevant modules   │
│     Library: [break_down, analogical,   │
│              first_principles, ...]     │
│     Selected: [break_down,              │
│                first_principles]        │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  2. ADAPT: Customize for task           │
│     break_down → "Decompose the system  │
│     into: UI, API, Database layers"     │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  3. PLAN: Create reasoning sequence     │
│     Step 1: Apply break_down strategy   │
│     Step 2: Apply first_principles      │
│     Step 3: Synthesize insights         │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  4. EXECUTE: Run each reasoning step    │
│     Execute step 1... execute step 2... │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  5. SYNTHESIZE: Combine results         │
│     Integrate all reasoning outputs     │
│     into coherent final answer          │
│                                         │
└─────────────────────────────────────────┘

Theoretical Foundation

Based on the paper “Self-Discover: Large Language Models Self-Compose Reasoning Structures”. Key insights:

Meta-reasoning: Agent reasons about which reasoning approaches to use
Adaptive strategy: Different tasks benefit from different heuristics
Structured thinking: Explicit reasoning plan improves outcomes
Composability: Combines multiple reasoning modules effectively

Algorithm

def self_discovery(task, module_library, max_modules=3):
    """Simplified Self-Discovery algorithm"""

    # Stage 1: Discover relevant modules
    selected_modules = llm_select_modules(
        task=task,
        modules=module_library,
        max_select=max_modules
    )

    # Stage 2: Adapt modules to task
    adapted_modules = []
    for module in selected_modules:
        adapted = llm_adapt_module(
            task=task,
            module=module
        )
        adapted_modules.append(adapted)

    # Stage 3: Create reasoning plan
    reasoning_plan = llm_create_plan(
        task=task,
        adapted_modules=adapted_modules
    )

    # Stage 4: Execute plan steps
    step_results = []
    for step in reasoning_plan:
        result = llm_execute_step(
            task=task,
            step=step,
            previous_results=step_results
        )
        step_results.append(result)

    # Stage 5: Synthesize final answer
    final_answer = llm_synthesize(
        task=task,
        reasoning_steps=step_results
    )

    return final_answer

API Reference

Class: `SelfDiscoveryAgent`

from agent_patterns.patterns import SelfDiscoveryAgent

agent = SelfDiscoveryAgent(
    llm_configs: Dict[str, Dict[str, Any]],
    reasoning_modules: Optional[List[Dict[str, str]]] = None,
    max_selected_modules: int = 3,
    prompt_dir: str = "prompts",
    custom_instructions: Optional[str] = None,
    prompt_overrides: Optional[Dict[str, Dict[str, str]]] = None
)

Parameters

Parameter	Type	Required	Description
`llm_configs`	`Dict[str, Dict[str, Any]]`	Yes	LLM configs for “thinking” and “execution” roles
`reasoning_modules`	`List[Dict]`	No	Custom reasoning module library (uses defaults if None)
`max_selected_modules`	`int`	No	Max modules to select per task (default: 3)
`prompt_dir`	`str`	No	Custom prompt directory (default: “prompts”)
`custom_instructions`	`str`	No	Instructions appended to system prompts
`prompt_overrides`	`Dict`	No	Override specific prompts programmatically

Default Reasoning Modules

The agent includes 10 default reasoning modules:

break_down_problem: Decompose into sub-problems
identify_constraints: Analyze requirements and limitations
analogical_reasoning: Find similar problems and apply lessons
first_principles: Reason from fundamental truths
step_by_step: Proceed systematically through the problem
pros_and_cons: Evaluate different approaches
critical_analysis: Examine assumptions and evidence
pattern_recognition: Identify patterns and trends
hypothesis_testing: Form and test hypotheses
visualization: Create mental models or diagrams

LLM Roles

thinking: Used for discovery, adaptation, planning, and synthesis
execution: Used for executing each reasoning step

Methods

run(input_data: str) -> str

Executes the Self-Discovery pattern on the given input.

Parameters:
- input_data (str): The task or problem to solve
Returns: str - The synthesized final answer
Raises: ValueError if graph not built

build_graph() -> None

Builds the LangGraph state graph. Called automatically during initialization.

Complete Examples

Basic Usage

from agent_patterns.patterns import SelfDiscoveryAgent

# Configure LLMs
llm_configs = {
    "thinking": {
        "provider": "openai",
        "model": "gpt-4",
        "temperature": 0.7,
    },
    "execution": {
        "provider": "openai",
        "model": "gpt-4",
        "temperature": 0.7,
    }
}

# Create agent with default reasoning modules
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=3
)

# Solve complex reasoning problem
result = agent.run("""
Design a sustainable urban transportation system for a city of 2 million people.
Consider environmental impact, cost, efficiency, and user experience.
""")

print(result)
# Agent will:
# 1. Select relevant modules (e.g., break_down, constraints, pros_and_cons)
# 2. Adapt them to urban transportation context
# 3. Create reasoning plan
# 4. Execute each step
# 5. Synthesize comprehensive solution

With Custom Reasoning Modules

# Define domain-specific reasoning modules
custom_modules = [
    {
        "name": "stakeholder_analysis",
        "description": "Identify all stakeholders and their interests",
        "template": "For '{task}', identify key stakeholders and analyze their needs and constraints"
    },
    {
        "name": "risk_assessment",
        "description": "Evaluate potential risks and mitigation strategies",
        "template": "Identify risks in '{task}' and develop mitigation approaches"
    },
    {
        "name": "resource_optimization",
        "description": "Optimize resource allocation and utilization",
        "template": "Analyze resource constraints for '{task}' and propose optimal allocation"
    },
    {
        "name": "scalability_analysis",
        "description": "Assess how solution scales with growth",
        "template": "Evaluate scalability of '{task}' under different growth scenarios"
    },
    {
        "name": "competitive_analysis",
        "description": "Compare with alternative approaches",
        "template": "Compare different approaches to '{task}' and identify best option"
    }
]

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=custom_modules,
    max_selected_modules=3
)

result = agent.run("""
Develop a go-to-market strategy for a B2B SaaS product
targeting enterprise customers in the healthcare industry.
""")

With Custom Instructions

# Add guidance for reasoning selection and execution
business_strategy_instructions = """
You are a business strategy consultant. When selecting reasoning modules:
- Prioritize data-driven approaches
- Consider both qualitative and quantitative factors
- Balance short-term feasibility with long-term vision

When executing reasoning steps:
- Use concrete examples from real companies
- Provide actionable recommendations
- Consider both opportunities and risks
- Reference industry best practices

When synthesizing:
- Create a structured action plan
- Identify key metrics for success
- Highlight critical dependencies
"""

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=4,
    custom_instructions=business_strategy_instructions
)

result = agent.run("""
Create a digital transformation roadmap for a traditional
retail company facing competition from e-commerce.
""")

With Prompt Overrides

# Customize how modules are discovered and adapted
overrides = {
    "DiscoverModules": {
        "system": """You are an expert in selecting problem-solving strategies.
Analyze the task deeply and choose the most relevant reasoning approaches.""",
        "user": """Task: {task}

Available reasoning modules:
{modules}

Select up to {max_modules} modules that are MOST relevant for this specific task.
Focus on modules that will provide unique, valuable perspectives.

For each selected module, output:
SELECTED: <module_name>

Your selections:"""
    },
    "AdaptModules": {
        "system": "You specialize in customizing general strategies for specific contexts.",
        "user": """Task: {task}

Module: {module_name}
Description: {module_description}
Template: {module_template}

Adapt this module specifically for the task. Be concrete and specific.
How would you apply this reasoning approach to this exact problem?

Adapted strategy:"""
    }
}

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    prompt_overrides=overrides,
    max_selected_modules=3
)

Customizing Reasoning Modules

Understanding the System Prompt Structure

Version 0.2.0 introduces enterprise-grade prompts with a comprehensive 9-section structure. Each system prompt is now 150-300+ lines (vs ~32 lines), providing significantly better guidance.

The 9-Section Structure: All Self-Discovery prompts now include Role and Identity, Core Capabilities (CAN/CANNOT boundaries), Process, Output Format, Decision-Making Guidelines, Quality Standards, Edge Cases, Examples, and Critical Reminders.

Benefits: The five Self-Discovery steps (DiscoverModules, AdaptModules, PlanReasoning, ExecuteStep, SynthesizeOutput) all benefit from comprehensive prompts with increased reliability, better transparency, and improved robustness. No code changes required.

Creating Domain-Specific Modules

# Scientific research modules
research_modules = [
    {
        "name": "literature_review",
        "description": "Survey existing research and identify gaps",
        "template": "Review relevant literature for '{task}' and identify research gaps"
    },
    {
        "name": "hypothesis_formation",
        "description": "Formulate testable hypotheses",
        "template": "Generate testable hypotheses for '{task}'"
    },
    {
        "name": "experimental_design",
        "description": "Design experiments to test hypotheses",
        "template": "Design experiments to investigate '{task}'"
    },
    {
        "name": "data_analysis",
        "description": "Analyze data and draw conclusions",
        "template": "Analyze data patterns in '{task}' and draw evidence-based conclusions"
    }
]

# Engineering design modules
engineering_modules = [
    {
        "name": "requirements_analysis",
        "description": "Define functional and non-functional requirements",
        "template": "Specify detailed requirements for '{task}'"
    },
    {
        "name": "architecture_design",
        "description": "Design system architecture and components",
        "template": "Design system architecture for '{task}'"
    },
    {
        "name": "trade_off_analysis",
        "description": "Evaluate engineering trade-offs",
        "template": "Analyze trade-offs between different design choices for '{task}'"
    }
]

# Choose based on domain
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=research_modules  # or engineering_modules
)

Module Structure

Each module requires three fields:

module = {
    "name": "unique_identifier",  # Used in selection output
    "description": "What this reasoning approach does",  # Helps agent select
    "template": "How to apply it to '{task}'"  # Gets filled with actual task
}

Setting Agent Goals

Via Task Description

Provide clear problem statement:

# Well-defined task
agent.run("""
Problem: Our mobile app has 40% user churn in the first week.

Context:
- 100K downloads/month
- Average session: 3 minutes
- Main competitors: AppA (15% churn), AppB (20% churn)

Goal: Reduce churn to under 25% within 3 months

Constraints:
- Limited development resources (2 engineers)
- $50K marketing budget
- Must maintain current feature set

Question: What strategy should we implement?
""")

Via Custom Instructions

Set persistent reasoning guidelines:

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    custom_instructions="""
    GOAL: Provide comprehensive, actionable strategic recommendations

    REASONING APPROACH:
    - Select modules that complement each other
    - Ensure both analytical and creative perspectives
    - Ground reasoning in practical constraints

    OUTPUT REQUIREMENTS:
    - Specific, measurable recommendations
    - Timeline and resource estimates
    - Risk factors and mitigation strategies
    - Success metrics
    """
)

Via System Prompt Override

Configure each stage:

overrides = {
    "DiscoverModules": {
        "system": """You select reasoning strategies to maximize solution quality.
Goal: Choose modules that provide diverse, complementary perspectives."""
    },
    "ExecuteStep": {
        "system": """You execute reasoning steps with rigor and depth.
Goal: Generate insights that are specific, actionable, and evidence-based."""
    },
    "SynthesizeOutput": {
        "system": """You synthesize insights into coherent strategic recommendations.
Goal: Create an actionable plan with clear priorities and success metrics."""
    }
}

Advanced Usage

Adjusting Module Selection

# Select more modules for complex tasks
comprehensive_agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=5  # More diverse perspectives
)

# Select fewer for focused analysis
focused_agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=2  # Streamlined reasoning
)

Combining Default and Custom Modules

from agent_patterns.patterns.self_discovery_agent import DEFAULT_REASONING_MODULES

# Add custom modules to defaults
all_modules = DEFAULT_REASONING_MODULES + [
    {
        "name": "ethical_analysis",
        "description": "Evaluate ethical implications and considerations",
        "template": "Analyze ethical dimensions of '{task}'"
    },
    {
        "name": "sustainability_check",
        "description": "Assess environmental and social sustainability",
        "template": "Evaluate sustainability aspects of '{task}'"
    }
]

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=all_modules,
    max_selected_modules=4
)

Role-Specific LLM Configurations

# Use stronger model for meta-reasoning, standard for execution
llm_configs = {
    "thinking": {
        "provider": "openai",
        "model": "gpt-4",  # Stronger for discovery/planning
        "temperature": 0.7,
    },
    "execution": {
        "provider": "openai",
        "model": "gpt-3.5-turbo",  # Cheaper for execution steps
        "temperature": 0.7,
    }
}

Performance Considerations

Cost Optimization

Self-Discovery makes many LLM calls:

Per task cost:

Discover modules: 1 call
Adapt modules: N calls (where N = selected modules)
Plan reasoning: 1 call
Execute steps: M calls (where M = steps in plan)
Synthesize: 1 call
Total: ~7-12 LLM calls for typical task

Optimization strategies:

# 1. Reduce max_selected_modules
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=2  # Fewer modules = fewer calls
)

# 2. Use cheaper model for execution
llm_configs = {
    "thinking": {"provider": "openai", "model": "gpt-4"},
    "execution": {"provider": "openai", "model": "gpt-3.5-turbo"}
}

# 3. Create focused module libraries (fewer options to consider)
focused_modules = [
    # Only 4-5 most relevant modules for your domain
]

When to Use vs Other Patterns

Task Type	Best Pattern	Reason
Complex reasoning, no tools	Self-Discovery	✅ Leverages diverse reasoning
Simple reasoning	Direct LLM	❌ Self-Discovery overhead unnecessary
Tool-based workflows	ReAct, REWOO	❌ Modules don’t replace tools
Learning from failures	Reflexion	❌ Self-Discovery doesn’t maintain memory
Predetermined steps	Plan & Solve	❌ Self-Discovery for novel problems

Comparison with Other Patterns

Aspect	Self-Discovery	Plan & Solve	ReAct
Planning	Dynamic module selection	Fixed planning phase	Adaptive per iteration
Reasoning	Multi-strategy	Linear steps	Thought-action
Tools	Not supported	Not supported	Core feature
Best For	Complex reasoning	Structured tasks	Dynamic tool use
Cost	High	Medium	Medium
Flexibility	Very high	Medium	High

Common Pitfalls

1. Too Many Modules

❌ Bad: Overwhelming the agent with too many options

huge_library = [/* 20+ modules */]
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=huge_library,
    max_selected_modules=8  # Too many
)

✅ Good: Curated, relevant module library

focused_library = [/* 6-8 most relevant modules */]
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=focused_library,
    max_selected_modules=3
)

2. Generic Module Descriptions

❌ Bad: Vague descriptions

{
    "name": "analyze",
    "description": "Analyze the problem",
    "template": "Analyze '{task}'"
}

✅ Good: Specific, actionable descriptions

{
    "name": "stakeholder_analysis",
    "description": "Identify stakeholders, their goals, and potential conflicts",
    "template": "For '{task}', list key stakeholders, their objectives, and areas of alignment/conflict"
}

3. Using for Simple Tasks

❌ Bad: Over-engineering simple queries

agent.run("What is the capital of France?")  # Overkill!

✅ Good: Use for genuinely complex reasoning

agent.run("""
Design a comprehensive climate change mitigation strategy
for a mid-sized industrial city, balancing economic growth
with environmental sustainability.
""")

4. Ignoring Module Adaptation

❌ Bad: Skipping adaptation in custom implementation

# Directly using generic modules without task-specific customization

✅ Good: Let agent adapt modules to context

# Trust the adaptation phase - it makes generic modules task-specific
agent = SelfDiscoveryAgent(llm_configs=llm_configs)  # Uses adaptation

Troubleshooting

Poor Module Selection

Symptom: Agent selects irrelevant modules

Solutions:

# 1. Improve module descriptions
modules = [
    {
        "name": "financial_analysis",
        "description": "Evaluate financial viability, ROI, and cost-benefit. Use for business/financial decisions.",
        "template": "..."
    }
]

# 2. Add custom selection instructions
custom_instructions = """
When selecting modules, prioritize those that:
1. Directly address the core problem
2. Provide complementary perspectives
3. Are specific to the task domain
"""

# 3. Override DiscoverModules prompt for better selection

Weak Reasoning Steps

Symptom: Execution steps are shallow or generic

Solutions:

# 1. Add execution guidance
custom_instructions = """
When executing reasoning steps:
- Provide specific examples and evidence
- Show detailed work and calculations
- Reference concrete data points
- Explain reasoning clearly
"""

# 2. Use stronger LLM for execution
llm_configs = {
    "thinking": {"provider": "openai", "model": "gpt-4"},
    "execution": {"provider": "openai", "model": "gpt-4"}  # Not 3.5
}

Synthesis Doesn’t Integrate Well

Symptom: Final answer doesn’t coherently combine step results

Solutions:

# Override synthesis prompt
overrides = {
    "SynthesizeOutput": {
        "system": "You are an expert synthesizer. Create cohesive answers that integrate all reasoning.",
        "user": """Task: {task}

Reasoning steps and results:
{reasoning_steps}

Synthesize these into a comprehensive, well-structured answer that:
1. Integrates insights from all steps
2. Resolves any contradictions
3. Provides actionable recommendations
4. Includes supporting evidence

Your synthesis:"""
    }
}

Next Steps

Try the complete examples
Learn about Plan & Solve for simpler structured reasoning
Explore Reflexion for learning from multiple trials
Read the original paper

References

Original paper: Self-Discover: Large Language Models Self-Compose Reasoning Structures
Related work on Chain-of-Thought prompting
Reasoning strategies in cognitive science

Self-Discovery Agent Pattern

Overview

When to Use Self-Discovery

Ideal Use Cases

When NOT to Use Self-Discovery

How Self-Discovery Works

The Discovery-Adaptation-Execution Cycle

Theoretical Foundation

Algorithm

API Reference

Class: SelfDiscoveryAgent

Parameters

Default Reasoning Modules

LLM Roles

Methods

Complete Examples

Basic Usage

With Custom Reasoning Modules

With Custom Instructions

With Prompt Overrides

Customizing Reasoning Modules

Understanding the System Prompt Structure

Creating Domain-Specific Modules

Module Structure

Setting Agent Goals

Via Task Description

Via Custom Instructions

Via System Prompt Override

Advanced Usage

Adjusting Module Selection

Combining Default and Custom Modules

Role-Specific LLM Configurations

Performance Considerations

Cost Optimization

When to Use vs Other Patterns

Comparison with Other Patterns

Common Pitfalls

1. Too Many Modules

2. Generic Module Descriptions

3. Using for Simple Tasks

4. Ignoring Module Adaptation

Troubleshooting

Poor Module Selection

Weak Reasoning Steps

Synthesis Doesn’t Integrate Well

Next Steps

References

Class: `SelfDiscoveryAgent`