# Self-Discovery Agent Pattern

The **Self-Discovery** pattern enables agents to dynamically select and adapt reasoning strategies from a library of problem-solving heuristics, creating a customized reasoning plan for each task.

## Overview

**Best For**: Complex reasoning tasks requiring multiple problem-solving approaches

**Complexity**: ⭐⭐⭐ Advanced (Sophisticated meta-reasoning)

**Cost**: $$$ Higher (Multiple LLM calls for discovery, adaptation, and execution)

## When to Use Self-Discovery

### Ideal Use Cases

✅ **Multi-faceted problems**
- Agent discovers relevant reasoning approaches
- Adapts general strategies to specific context
- Combines multiple perspectives for comprehensive solutions

✅ **Novel problem domains**
- No predetermined approach exists
- Agent selects appropriate reasoning modules
- Customizes strategy based on task characteristics

✅ **Complex analytical tasks**
- Requires diverse reasoning methods (analogical, first principles, etc.)
- Benefits from structured reasoning plan
- Needs systematic approach to decomposition

✅ **Strategic planning**
- Analyzes problem from multiple angles
- Selects relevant planning heuristics
- Executes customized reasoning workflow

### When NOT to Use Self-Discovery

❌ **Simple queries** → Use direct LLM or ReAct

❌ **Tasks with known solutions** → Use Plan & Solve

❌ **Tool-based workflows** → Use ReAct or REWOO

❌ **Speed-critical tasks** → Meta-reasoning adds overhead

## How Self-Discovery Works

### The Discovery-Adaptation-Execution Cycle

```
┌─────────────────────────────────────────┐
│                                         │
│  1. DISCOVER: Select relevant modules   │
│     Library: [break_down, analogical,   │
│               first_principles, ...]    │
│     Selected: [break_down,              │
│                first_principles]        │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  2. ADAPT: Customize for task           │
│     break_down → "Decompose the system  │
│     into: UI, API, Database layers"     │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  3. PLAN: Create reasoning sequence     │
│     Step 1: Apply break_down strategy   │
│     Step 2: Apply first_principles      │
│     Step 3: Synthesize insights         │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  4. EXECUTE: Run each reasoning step    │
│     Execute step 1... execute step 2... │
│                                         │
└─────────────────┬───────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│                                         │
│  5. SYNTHESIZE: Combine results         │
│     Integrate all reasoning outputs     │
│     into coherent final answer          │
│                                         │
└─────────────────────────────────────────┘
```

### Theoretical Foundation

Based on the paper ["Self-Discover: Large Language Models Self-Compose Reasoning Structures"](https://arxiv.org/abs/2402.03620).

Key insights:

1. **Meta-reasoning**: Agent reasons about which reasoning approaches to use
2. **Adaptive strategy**: Different tasks benefit from different heuristics
3. **Structured thinking**: Explicit reasoning plan improves outcomes
4. **Composability**: Combines multiple reasoning modules effectively

### Algorithm

```python
def self_discovery(task, module_library, max_modules=3):
    """Simplified Self-Discovery algorithm"""

    # Stage 1: Discover relevant modules
    selected_modules = llm_select_modules(
        task=task,
        modules=module_library,
        max_select=max_modules
    )

    # Stage 2: Adapt modules to task
    adapted_modules = []
    for module in selected_modules:
        adapted = llm_adapt_module(
            task=task,
            module=module
        )
        adapted_modules.append(adapted)

    # Stage 3: Create reasoning plan
    reasoning_plan = llm_create_plan(
        task=task,
        adapted_modules=adapted_modules
    )

    # Stage 4: Execute plan steps
    step_results = []
    for step in reasoning_plan:
        result = llm_execute_step(
            task=task,
            step=step,
            previous_results=step_results
        )
        step_results.append(result)

    # Stage 5: Synthesize final answer
    final_answer = llm_synthesize(
        task=task,
        reasoning_steps=step_results
    )

    return final_answer
```

## API Reference

### Class: `SelfDiscoveryAgent`

```python
from typing import Any, Dict, List, Optional

from agent_patterns.patterns import SelfDiscoveryAgent

# Constructor signature, shown with type annotations for reference
# (not a literal call)
agent = SelfDiscoveryAgent(
    llm_configs: Dict[str, Dict[str, Any]],
    reasoning_modules: Optional[List[Dict[str, str]]] = None,
    max_selected_modules: int = 3,
    prompt_dir: str = "prompts",
    custom_instructions: Optional[str] = None,
    prompt_overrides: Optional[Dict[str, Dict[str, str]]] = None
)
```

#### Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `llm_configs` | `Dict[str, Dict[str, Any]]` | Yes | LLM configs for "thinking" and "execution" roles |
| `reasoning_modules` | `Optional[List[Dict[str, str]]]` | No | Custom reasoning module library (uses defaults if None) |
| `max_selected_modules` | `int` | No | Max modules to select per task (default: 3) |
| `prompt_dir` | `str` | No | Custom prompt directory (default: "prompts") |
| `custom_instructions` | `Optional[str]` | No | Instructions appended to system prompts |
| `prompt_overrides` | `Optional[Dict[str, Dict[str, str]]]` | No | Override specific prompts programmatically |

#### Default Reasoning Modules

The agent includes 10 default reasoning modules:

1. **break_down_problem**: Decompose into sub-problems
2. **identify_constraints**: Analyze requirements and limitations
3. **analogical_reasoning**: Find similar problems and apply lessons
4. **first_principles**: Reason from fundamental truths
5. **step_by_step**: Proceed systematically through the problem
6. **pros_and_cons**: Evaluate different approaches
7. **critical_analysis**: Examine assumptions and evidence
8. **pattern_recognition**: Identify patterns and trends
9. **hypothesis_testing**: Form and test hypotheses
10. **visualization**: Create mental models or diagrams

#### LLM Roles

- **thinking**: Used for discovery, adaptation, planning, and synthesis
- **execution**: Used for executing each reasoning step

#### Methods

**`run(input_data: str) -> str`**

Executes the Self-Discovery pattern on the given input.

- **Parameters**:
  - `input_data` (str): The task or problem to solve
- **Returns**: str - The synthesized final answer
- **Raises**: ValueError if graph not built

**`build_graph() -> None`**

Builds the LangGraph state graph. Called automatically during initialization.

## Complete Examples

### Basic Usage

```python
from agent_patterns.patterns import SelfDiscoveryAgent

# Configure LLMs
llm_configs = {
    "thinking": {
        "provider": "openai",
        "model": "gpt-4",
        "temperature": 0.7,
    },
    "execution": {
        "provider": "openai",
        "model": "gpt-4",
        "temperature": 0.7,
    }
}

# Create agent with default reasoning modules
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=3
)

# Solve complex reasoning problem
result = agent.run("""
Design a sustainable urban transportation system for a city of 2 million people.
Consider environmental impact, cost, efficiency, and user experience.
""")

print(result)

# Agent will:
# 1. Select relevant modules (e.g., break_down, constraints, pros_and_cons)
# 2. Adapt them to urban transportation context
# 3. Create reasoning plan
# 4. Execute each step
# 5. Synthesize comprehensive solution
```

### With Custom Reasoning Modules

```python
# Define domain-specific reasoning modules
custom_modules = [
    {
        "name": "stakeholder_analysis",
        "description": "Identify all stakeholders and their interests",
        "template": "For '{task}', identify key stakeholders and analyze their needs and constraints"
    },
    {
        "name": "risk_assessment",
        "description": "Evaluate potential risks and mitigation strategies",
        "template": "Identify risks in '{task}' and develop mitigation approaches"
    },
    {
        "name": "resource_optimization",
        "description": "Optimize resource allocation and utilization",
        "template": "Analyze resource constraints for '{task}' and propose optimal allocation"
    },
    {
        "name": "scalability_analysis",
        "description": "Assess how solution scales with growth",
        "template": "Evaluate scalability of '{task}' under different growth scenarios"
    },
    {
        "name": "competitive_analysis",
        "description": "Compare with alternative approaches",
        "template": "Compare different approaches to '{task}' and identify best option"
    }
]

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=custom_modules,
    max_selected_modules=3
)

result = agent.run("""
Develop a go-to-market strategy for a B2B SaaS product
targeting enterprise customers in the healthcare industry.
""")
```

### With Custom Instructions

```python
# Add guidance for reasoning selection and execution
business_strategy_instructions = """
You are a business strategy consultant.

When selecting reasoning modules:
- Prioritize data-driven approaches
- Consider both qualitative and quantitative factors
- Balance short-term feasibility with long-term vision

When executing reasoning steps:
- Use concrete examples from real companies
- Provide actionable recommendations
- Consider both opportunities and risks
- Reference industry best practices

When synthesizing:
- Create a structured action plan
- Identify key metrics for success
- Highlight critical dependencies
"""

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=4,
    custom_instructions=business_strategy_instructions
)

result = agent.run("""
Create a digital transformation roadmap for a traditional
retail company facing competition from e-commerce.
""")
```

### With Prompt Overrides

```python
# Customize how modules are discovered and adapted
overrides = {
    "DiscoverModules": {
        "system": """You are an expert in selecting problem-solving strategies.
Analyze the task deeply and choose the most relevant reasoning approaches.""",
        "user": """Task: {task}

Available reasoning modules:
{modules}

Select up to {max_modules} modules that are MOST relevant for this specific task.
Focus on modules that will provide unique, valuable perspectives.

For each selected module, output:
SELECTED:

Your selections:"""
    },
    "AdaptModules": {
        "system": "You specialize in customizing general strategies for specific contexts.",
        "user": """Task: {task}

Module: {module_name}
Description: {module_description}
Template: {module_template}

Adapt this module specifically for the task. Be concrete and specific.
How would you apply this reasoning approach to this exact problem?

Adapted strategy:"""
    }
}

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    prompt_overrides=overrides,
    max_selected_modules=3
)
```

## Customizing Reasoning Modules

### Understanding the System Prompt Structure

Version 0.2.0 introduces **enterprise-grade prompts** with a comprehensive 9-section structure. Each system prompt is now 150-300+ lines (up from ~32 lines), providing significantly more detailed guidance.

**The 9-Section Structure**: All Self-Discovery prompts now include Role and Identity, Core Capabilities (CAN/CANNOT boundaries), Process, Output Format, Decision-Making Guidelines, Quality Standards, Edge Cases, Examples, and Critical Reminders.

**Benefits**: The five Self-Discovery steps (DiscoverModules, AdaptModules, PlanReasoning, ExecuteStep, SynthesizeOutput) all benefit from comprehensive prompts with increased reliability, better transparency, and improved robustness. No code changes required.

### Creating Domain-Specific Modules

```python
# Scientific research modules
research_modules = [
    {
        "name": "literature_review",
        "description": "Survey existing research and identify gaps",
        "template": "Review relevant literature for '{task}' and identify research gaps"
    },
    {
        "name": "hypothesis_formation",
        "description": "Formulate testable hypotheses",
        "template": "Generate testable hypotheses for '{task}'"
    },
    {
        "name": "experimental_design",
        "description": "Design experiments to test hypotheses",
        "template": "Design experiments to investigate '{task}'"
    },
    {
        "name": "data_analysis",
        "description": "Analyze data and draw conclusions",
        "template": "Analyze data patterns in '{task}' and draw evidence-based conclusions"
    }
]

# Engineering design modules
engineering_modules = [
    {
        "name": "requirements_analysis",
        "description": "Define functional and non-functional requirements",
        "template": "Specify detailed requirements for '{task}'"
    },
    {
        "name": "architecture_design",
        "description": "Design system architecture and components",
        "template": "Design system architecture for '{task}'"
    },
    {
        "name": "trade_off_analysis",
        "description": "Evaluate engineering trade-offs",
        "template": "Analyze trade-offs between different design choices for '{task}'"
    }
]

# Choose based on domain
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=research_modules  # or engineering_modules
)
```

### Module Structure

Each module requires three fields:

```python
module = {
    "name": "unique_identifier",  # Used in selection output
    "description": "What this reasoning approach does",  # Helps agent select
    "template": "How to apply it to '{task}'"  # Gets filled with actual task
}
```

## Setting Agent Goals

### Via Task Description

Provide a clear problem statement:

```python
# Well-defined task
agent.run("""
Problem: Our mobile app has 40% user churn in the first week.

Context:
- 100K downloads/month
- Average session: 3 minutes
- Main competitors: AppA (15% churn), AppB (20% churn)

Goal: Reduce churn to under 25% within 3 months

Constraints:
- Limited development resources (2 engineers)
- $50K marketing budget
- Must maintain current feature set

Question: What strategy should we implement?
""")
```

### Via Custom Instructions

Set persistent reasoning guidelines:

```python
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    custom_instructions="""
GOAL: Provide comprehensive, actionable strategic recommendations

REASONING APPROACH:
- Select modules that complement each other
- Ensure both analytical and creative perspectives
- Ground reasoning in practical constraints

OUTPUT REQUIREMENTS:
- Specific, measurable recommendations
- Timeline and resource estimates
- Risk factors and mitigation strategies
- Success metrics
"""
)
```

### Via System Prompt Override

Configure each stage:

```python
overrides = {
    "DiscoverModules": {
        "system": """You select reasoning strategies to maximize solution quality.
Goal: Choose modules that provide diverse, complementary perspectives."""
    },
    "ExecuteStep": {
        "system": """You execute reasoning steps with rigor and depth.
Goal: Generate insights that are specific, actionable, and evidence-based."""
    },
    "SynthesizeOutput": {
        "system": """You synthesize insights into coherent strategic recommendations.
Goal: Create an actionable plan with clear priorities and success metrics."""
    }
}
```

## Advanced Usage

### Adjusting Module Selection

```python
# Select more modules for complex tasks
comprehensive_agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=5  # More diverse perspectives
)

# Select fewer for focused analysis
focused_agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=2  # Streamlined reasoning
)
```

### Combining Default and Custom Modules

```python
from agent_patterns.patterns.self_discovery_agent import DEFAULT_REASONING_MODULES

# Add custom modules to defaults
all_modules = DEFAULT_REASONING_MODULES + [
    {
        "name": "ethical_analysis",
        "description": "Evaluate ethical implications and considerations",
        "template": "Analyze ethical dimensions of '{task}'"
    },
    {
        "name": "sustainability_check",
        "description": "Assess environmental and social sustainability",
        "template": "Evaluate sustainability aspects of '{task}'"
    }
]

agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=all_modules,
    max_selected_modules=4
)
```

### Role-Specific LLM Configurations

```python
# Use stronger model for meta-reasoning, standard for execution
llm_configs = {
    "thinking": {
        "provider": "openai",
        "model": "gpt-4",  # Stronger for discovery/planning
        "temperature": 0.7,
    },
    "execution": {
        "provider": "openai",
        "model": "gpt-3.5-turbo",  # Cheaper for execution steps
        "temperature": 0.7,
    }
}
```

## Performance Considerations

### Cost Optimization

Self-Discovery makes many LLM calls.

**Per task cost**:
- Discover modules: 1 call
- Adapt modules: N calls (where N = selected modules)
- Plan reasoning: 1 call
- Execute steps: M calls (where M = steps in plan)
- Synthesize: 1 call
- **Total**: N + M + 3 calls, typically ~7-12 LLM calls per task

**Optimization strategies**:

```python
# 1. Reduce max_selected_modules
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    max_selected_modules=2  # Fewer modules = fewer calls
)

# 2. Use cheaper model for execution
llm_configs = {
    "thinking": {"provider": "openai", "model": "gpt-4"},
    "execution": {"provider": "openai", "model": "gpt-3.5-turbo"}
}

# 3. Create focused module libraries (fewer options to consider)
focused_modules = [
    # Only 4-5 most relevant modules for your domain
]
```

### When to Use vs Other Patterns

| Task Type | Best Pattern | Reason |
|-----------|-------------|---------|
| Complex reasoning, no tools | Self-Discovery | ✅ Leverages diverse reasoning |
| Simple reasoning | Direct LLM | ❌ Self-Discovery overhead unnecessary |
| Tool-based workflows | ReAct, REWOO | ❌ Modules don't replace tools |
| Learning from failures | Reflexion | ❌ Self-Discovery doesn't maintain memory |
| Predetermined steps | Plan & Solve | ❌ Self-Discovery for novel problems |

## Comparison with Other Patterns

| Aspect | Self-Discovery | Plan & Solve | ReAct |
|--------|---------------|--------------|-------|
| **Planning** | Dynamic module selection | Fixed planning phase | Adaptive per iteration |
| **Reasoning** | Multi-strategy | Linear steps | Thought-action |
| **Tools** | Not supported | Not supported | Core feature |
| **Best For** | Complex reasoning | Structured tasks | Dynamic tool use |
| **Cost** | High | Medium | Medium |
| **Flexibility** | Very high | Medium | High |

## Common Pitfalls

### 1. Too Many Modules

❌ **Bad**: Overwhelming the agent with too many options

```python
huge_library = [...]  # 20+ modules
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=huge_library,
    max_selected_modules=8  # Too many
)
```

✅ **Good**: Curated, relevant module library

```python
focused_library = [...]  # 6-8 most relevant modules
agent = SelfDiscoveryAgent(
    llm_configs=llm_configs,
    reasoning_modules=focused_library,
    max_selected_modules=3
)
```

### 2. Generic Module Descriptions

❌ **Bad**: Vague descriptions

```python
{
    "name": "analyze",
    "description": "Analyze the problem",
    "template": "Analyze '{task}'"
}
```

✅ **Good**: Specific, actionable descriptions

```python
{
    "name": "stakeholder_analysis",
    "description": "Identify stakeholders, their goals, and potential conflicts",
    "template": "For '{task}', list key stakeholders, their objectives, and areas of alignment/conflict"
}
```

### 3. Using for Simple Tasks

❌ **Bad**: Over-engineering simple queries

```python
agent.run("What is the capital of France?")  # Overkill!
```

✅ **Good**: Use for genuinely complex reasoning

```python
agent.run("""
Design a comprehensive climate change mitigation strategy
for a mid-sized industrial city, balancing economic growth
with environmental sustainability.
""")
```

### 4. Ignoring Module Adaptation

❌ **Bad**: Skipping adaptation in custom implementation

```python
# Directly using generic modules without task-specific customization
```

✅ **Good**: Let agent adapt modules to context

```python
# Trust the adaptation phase - it makes generic modules task-specific
agent = SelfDiscoveryAgent(llm_configs=llm_configs)  # Uses adaptation
```

## Troubleshooting

### Poor Module Selection

**Symptom**: Agent selects irrelevant modules

**Solutions**:

```python
# 1. Improve module descriptions
modules = [
    {
        "name": "financial_analysis",
        "description": "Evaluate financial viability, ROI, and cost-benefit. Use for business/financial decisions.",
        "template": "..."
    }
]

# 2. Add custom selection instructions
custom_instructions = """
When selecting modules, prioritize those that:
1. Directly address the core problem
2. Provide complementary perspectives
3. Are specific to the task domain
"""

# 3. Override DiscoverModules prompt for better selection
```

### Weak Reasoning Steps

**Symptom**: Execution steps are shallow or generic

**Solutions**:

```python
# 1. Add execution guidance
custom_instructions = """
When executing reasoning steps:
- Provide specific examples and evidence
- Show detailed work and calculations
- Reference concrete data points
- Explain reasoning clearly
"""

# 2. Use stronger LLM for execution
llm_configs = {
    "thinking": {"provider": "openai", "model": "gpt-4"},
    "execution": {"provider": "openai", "model": "gpt-4"}  # Not 3.5
}
```

### Synthesis Doesn't Integrate Well

**Symptom**: Final answer doesn't coherently combine step results

**Solutions**:

```python
# Override synthesis prompt
overrides = {
    "SynthesizeOutput": {
        "system": "You are an expert synthesizer. Create cohesive answers that integrate all reasoning.",
        "user": """Task: {task}

Reasoning steps and results:
{reasoning_steps}

Synthesize these into a comprehensive, well-structured answer that:
1. Integrates insights from all steps
2. Resolves any contradictions
3. Provides actionable recommendations
4. Includes supporting evidence

Your synthesis:"""
    }
}
```

## Next Steps

- Try the [complete examples](../examples/self-discovery-examples.md)
- Learn about [Plan & Solve](plan-and-solve.md) for simpler structured reasoning
- Explore [Reflexion](reflexion.md) for learning from multiple trials
- Read the [original paper](https://arxiv.org/abs/2402.03620)

## References

- Original paper: [Self-Discover: Large Language Models Self-Compose Reasoning Structures](https://arxiv.org/abs/2402.03620)
- Related work on [Chain-of-Thought prompting](https://arxiv.org/abs/2201.11903)
- [Reasoning strategies in cognitive science](https://en.wikipedia.org/wiki/Problem_solving#Techniques)
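
## Appendix: Runnable Pipeline Sketch

The five stages in the Algorithm section and the per-task call breakdown under Cost Optimization can be tried without an API key. The sketch below is not the library's implementation: `CountingStubLLM`, the stage-prefixed prompts, the `MODULES` list, and the line-based parsing of `SELECTED:` replies and plan steps are all assumptions made for illustration. It only shows how the stages chain together and that the number of LLM calls is one per selected module, one per plan step, plus three.

```python
from typing import Callable, Dict, List

# Hypothetical module library in the three-field format (name/description/template).
MODULES: List[Dict[str, str]] = [
    {"name": "break_down_problem",
     "description": "Decompose into sub-problems",
     "template": "Break '{task}' into smaller sub-problems"},
    {"name": "first_principles",
     "description": "Reason from fundamental truths",
     "template": "Reason about '{task}' from fundamental truths"},
    {"name": "pros_and_cons",
     "description": "Evaluate different approaches",
     "template": "List pros and cons of approaches to '{task}'"},
]


class CountingStubLLM:
    """Stand-in for a real model: returns canned text and counts calls."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        stage = prompt.split("\n", 1)[0]  # first prompt line names the stage
        if stage == "DISCOVER":
            return "SELECTED: break_down_problem\nSELECTED: first_principles"
        if stage == "PLAN":
            return "Apply break_down_problem\nApply first_principles\nSynthesize insights"
        return f"[{stage.lower()} output]"


def self_discovery(task: str, modules: List[Dict[str, str]],
                   llm: Callable[[str], str], max_modules: int = 3) -> str:
    # Stage 1 (1 call): ask which modules apply, then parse "SELECTED: name" lines.
    listing = "\n".join(f"{m['name']}: {m['description']}" for m in modules)
    reply = llm(f"DISCOVER\nTask: {task}\nModules:\n{listing}\nSelect up to {max_modules}.")
    by_name = {m["name"]: m for m in modules}
    names = [ln.split(":", 1)[1].strip() for ln in reply.splitlines()
             if ln.startswith("SELECTED:")]
    selected = [by_name[n] for n in names if n in by_name][:max_modules]

    # Stage 2 (N calls): adapt each selected module's template to the task.
    adapted = [llm(f"ADAPT\n{m['template'].format(task=task)}") for m in selected]

    # Stage 3 (1 call): request a plan; each non-empty reply line is one step.
    plan = llm("PLAN\nTask: " + task + "\nStrategies:\n" + "\n".join(adapted))
    steps = [s.strip() for s in plan.splitlines() if s.strip()]

    # Stage 4 (M calls): execute steps in order, passing earlier results along.
    results: List[str] = []
    for step in steps:
        context = "\n".join(results)
        results.append(llm(f"EXECUTE\nTask: {task}\nStep: {step}\nSo far:\n{context}"))

    # Stage 5 (1 call): combine all step results into the final answer.
    return llm("SYNTHESIZE\nTask: " + task + "\nResults:\n" + "\n".join(results))


llm = CountingStubLLM()
answer = self_discovery("Design a caching layer", MODULES, llm)
print(answer, llm.calls)
```

With the stub selecting two modules and returning a three-step plan, the run makes 1 + 2 + 1 + 3 + 1 = 8 calls. Swapping the stub for a real client callable would keep the same call structure, which makes this a cheap way to estimate cost before changing `max_selected_modules`.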