
Feature: AI-powered 'ask' command with flexible LLM provider support #186

@raifdmueller

Description


Problem Statement

Current workflow for LLMs working with documentation:

  1. External LLM needs to navigate documentation
  2. Must load large chunks into context (risk of overflow)
  3. Relevant information may be spread across multiple sections
  4. No intelligent iteration or consolidation
  5. Every query costs API tokens

Result: Inefficient context usage, incomplete answers, high costs

Proposed Solution

Add an ask command that uses an internal LLM to intelligently navigate documentation, iteratively build context, and provide consolidated answers with source references.

Key Innovation: Use existing CLI coding assistants (Claude Code, Aider, etc.) so queries run on the user's flatrate subscription instead of incurring additional API costs.

Architecture

Iterative Context Building

User: dacli ask "How does authentication work?"
         ↓
    1. Search relevant sections
       → security.authentication
       → security.authorization
       → api.endpoints
         ↓
    2. Iterate through sections:
       
       Iteration 1 (security.authentication):
       ┌─────────────────────────────────────┐
       │ Question: How does auth work?       │
       │ Previous findings: [none]           │
       │ Section: security.authentication    │
       │ Content: [JWT tokens, OAuth2...]    │
       └─────────────────────────────────────┘
                    ↓ claude-code
       Findings: "Uses JWT tokens, check authorization flow"
       
       Iteration 2 (security.authorization):
       ┌─────────────────────────────────────┐
       │ Question: How does auth work?       │
       │ Previous findings:                  │
       │   - Uses JWT tokens                 │
       │   - Need authorization details      │
       │ Section: security.authorization     │
       │ Content: [RBAC, permissions...]     │
       └─────────────────────────────────────┘
                    ↓ claude-code
       Findings: "JWT validated, then RBAC checks..."
       
       Iteration 3: Consolidation
       ┌─────────────────────────────────────┐
       │ All findings: [accumulated context] │
       │ Task: Provide final answer with     │
       │       source references             │
       └─────────────────────────────────────┘
                    ↓ claude-code
       Final answer with sources
         ↓
    3. Return consolidated answer

LLM Provider Support

Flexible Configuration via .dacli/config.toml

[llm]
# Provider: "claude-code", "aider", "amazon-q", "bedrock", "api", "custom"
provider = "claude-code"

# Custom command template (for any CLI tool)
# Variables: {prompt}, {prompt_file}, {model}
custom_command = "claude-code --non-interactive"

[llm.options]
max_iterations = 5          # Max sections to check
timeout_seconds = 30        # Per iteration
context_window = 8000       # Tokens per iteration
verbose = false             # Show iteration steps

[llm.api]
# Direct API fallback (if no CLI tool)
api_key = "${ANTHROPIC_API_KEY}"
model = "claude-sonnet-4"
base_url = ""  # Optional proxy

[llm.bedrock]
# AWS Bedrock (enterprise option)
region = "us-east-1"
model_id = "anthropic.claude-3-sonnet-20240229-v1:0"
profile = "default"

[llm.amazon_q]
# Amazon Q Developer
region = "us-east-1"
profile = "default"

[search]
# Search strategy for finding relevant sections
use_embeddings = false      # Requires ADR-007 SQLite index
max_sections = 10           # Max sections to consider
min_relevance = 0.3         # Minimum search score

Supported Providers

Provider        Command                          Cost Model        Use Case
Claude Code     claude-code --non-interactive    Flatrate          Recommended - uses existing subscription
Aider           aider --yes --message            Flatrate/API      Popular coding assistant
Amazon Q        q chat --no-input                AWS subscription  Enterprise, AWS-integrated
Cursor          cursor --stdin                   Subscription      IDE users
Bedrock         boto3 API                        Pay-per-token     Enterprise, compliance needs
Anthropic API   Direct API                       Pay-per-token     Fallback option
Custom          User-defined                     Varies            Any CLI LLM tool

Auto-Detection Priority

  1. Check .dacli/config.toml for configured provider
  2. Detect available CLI tools:
    • claude-code (priority 1)
    • aider (priority 2)
    • cursor (priority 3)
    • q (Amazon Q, priority 4)
  3. Check for API credentials:
    • ANTHROPIC_API_KEY
    • AWS_PROFILE / AWS credentials
  4. Fallback: Error with setup instructions

CLI Interface

Basic Usage

# Ask question about documentation
dacli ask "How does authentication work?"

# Scope to specific section
dacli ask "What are deployment options?" --section architecture

# Specify provider
dacli ask "..." --provider claude-code
dacli ask "..." --provider amazon-q
dacli ask "..." --provider bedrock

# Verbose mode (show iteration steps)
dacli ask "..." --verbose

# Custom LLM command
dacli ask "..." --llm-command "my-ai-tool --query"

# Output format
dacli ask "..." --format json
dacli ask "..." --format markdown

Configuration Management

# Initialize configuration
dacli config init
# → Detects available providers
# → Creates .dacli/config.toml with defaults
# → Prompts for provider selection

# View current config
dacli config show

# Set provider
dacli config set llm.provider claude-code
dacli config set llm.provider bedrock

# Set provider-specific options
dacli config set llm.bedrock.region us-west-2
dacli config set llm.options.max_iterations 10

# Test configuration
dacli config test
# → Runs test query
# → Reports if provider works
# → Shows estimated cost (if applicable)
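
Behind `dacli config set`, the dotted key needs to be mapped onto the nested TOML tables. A hypothetical helper (name and signature are illustrative, not dacli's actual API):

```python
def set_config_value(config: dict, dotted_key: str, value) -> None:
    """Apply `dacli config set llm.bedrock.region us-west-2` semantics
    to an in-memory config dict, creating intermediate tables as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value
```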

MCP Tool Interface

@mcp.tool()
def ask_documentation(
    question: str,
    scope: str | None = None,
    max_iterations: int = 5,
    provider: str | None = None,
) -> dict:
    """
    Ask a question about the documentation using AI reasoning.
    
    Uses configured LLM provider to intelligently search and consolidate
    information from multiple sections.
    
    Args:
        question: The question to answer
        scope: Optional section path to limit search (e.g., "architecture")
        max_iterations: Maximum sections to check (default: 5)
        provider: Override configured provider (e.g., "claude-code", "api")
    
    Returns:
        {
            "answer": "Consolidated answer text",
            "sources": [
                {"path": "security.authentication", "relevance": 0.95},
                {"path": "security.authorization", "relevance": 0.87}
            ],
            "iterations": 3,
            "provider_used": "claude-code",
            "tokens_used": 4200  # If available
        }
    """

Implementation Phases

Phase 1: Basic Infrastructure (MVP)

Goal: Simple ask command with Claude Code integration

  • Add .dacli/config.toml support
  • Implement LLMProvider base class
  • Add ClaudeCodeProvider implementation
  • Basic dacli ask command (single iteration)
  • Configuration commands (config init, config show)
  • Tests for provider detection and execution

Deliverable: dacli ask "question" works with Claude Code

Phase 2: Iterative Context Building

Goal: Smart multi-section reasoning

  • Implement context accumulation logic
  • Add iteration loop with findings consolidation
  • Prompt templates for:
    • Initial search
    • Section evaluation
    • Follow-up suggestions
    • Final consolidation
  • Early termination (when enough info found)
  • Source tracking and citation
  • Verbose mode to show iterations

Deliverable: Intelligent multi-hop reasoning

Phase 3: Multiple Provider Support

Goal: Support major LLM providers

  • AiderProvider implementation
  • AmazonQProvider implementation
  • BedrockProvider implementation (boto3)
  • AnthropicAPIProvider implementation
  • CustomProvider for arbitrary CLI tools
  • Auto-detection with priority fallback
  • Provider-specific configuration validation
  • Cost estimation (for pay-per-token providers)

Deliverable: Works with 5+ different LLM tools

Phase 4: Advanced Features

Goal: Production-ready AI assistant

  • Integration with SQLite embeddings (ADR-007)
  • Semantic search for better section selection
  • Conversation history (follow-up questions)
  • Citation mode with line numbers
  • Structured output formats (JSON, Markdown)
  • MCP tool implementation
  • Performance metrics and logging
  • Rate limiting and timeout handling

Deliverable: Enterprise-ready documentation assistant

Technical Details

Provider Interface

from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Base class for LLM provider implementations."""
    
    @abstractmethod
    def execute(self, prompt: str, context: str = "") -> str:
        """Execute prompt with accumulated context."""
        pass
    
    @abstractmethod
    def is_available(self) -> bool:
        """Check if provider is available/configured."""
        pass
    
    @property
    @abstractmethod
    def cost_model(self) -> str:
        """Return 'flatrate', 'pay-per-token', or 'free'."""
        pass

Iteration Algorithm

def ask_with_iteration(
    question: str,
    scope: str | None = None,
    max_iterations: int = 5
) -> dict:
    """
    Iteratively search documentation and build answer.
    """
    provider = load_configured_provider()
    findings = ""
    sources = []
    iterations = 0
    
    # 1. Find relevant sections
    candidate_sections = search_relevant_sections(question, scope)
    
    # 2. Iteratively evaluate sections
    for section in candidate_sections[:max_iterations]:
        iterations += 1
        prompt = build_iteration_prompt(
            question=question,
            previous_findings=findings,
            current_section=section
        )
        
        response = provider.execute(prompt, context=findings)
        
        # The response restates and extends the previous findings,
        # so replacing the accumulator keeps the context consolidated
        findings = extract_findings(response)
        sources.append({
            "path": section.path,
            "relevance": calculate_relevance(response)
        })
        
        # Stop early once the findings suffice to answer the question
        if should_terminate(response, findings):
            break
    
    # 3. Final consolidation
    final_answer = provider.execute(
        build_consolidation_prompt(question, findings),
        context=findings
    )
    
    return {
        "answer": final_answer,
        "sources": sources,
        "iterations": iterations,
        "provider_used": provider.name
    }

Prompt Templates

ITERATION_PROMPT = """
Question: {question}

Previous findings:
{previous_findings}

Current section: {section_path}
{section_content}

Task:
1. Does this section contain information relevant to the question?
2. If yes, extract key points
3. Identify what information is still missing
4. Suggest which sections to check next (if any)

Respond in this format:
RELEVANT: Yes/No
KEY_POINTS: [bullet list]
MISSING: [what's still needed]
NEXT_SECTIONS: [suggested sections or "DONE" if complete]
"""

CONSOLIDATION_PROMPT = """
Question: {question}

All findings from documentation:
{accumulated_findings}

Task: Provide a final, consolidated answer that:
1. Directly answers the question
2. Synthesizes information from all sections
3. Includes source references (section paths)
4. Is clear and well-structured

Format:
ANSWER: [your answer]
SOURCES: [list of section paths used]
"""

Use Cases

Use Case 1: External LLM Needs Answer

Scenario: Claude (in Claude.ai) needs to know how authentication works in a project.

Current workflow:

Claude.ai → MCP get_structure → MCP get_section "security" → 
Read 50KB content → Try to answer from context → Maybe miss details

With ask command:

Claude.ai → MCP ask_documentation "How does authentication work?" →
dacli internally:
  - Finds relevant sections (security.auth, api.endpoints, config)
  - Iterates through them with Claude Code (flatrate!)
  - Consolidates findings
  → Returns complete answer with sources
Claude.ai → Uses answer in conversation

Benefit: Better answer, lower cost (flatrate), less context usage

Use Case 2: Developer Onboarding

Scenario: New developer exploring codebase documentation.

$ dacli ask "What's the deployment process?"
Searching documentation...
Checking: deployment.process ✓
Checking: deployment.docker ✓
Checking: ci-cd.pipeline ✓

Answer:
The deployment process involves 3 steps:
1. Build Docker image (deployment.docker)
2. Run tests in CI pipeline (ci-cd.pipeline)
3. Deploy to staging, then production (deployment.process)

For manual deployment, use: docker-compose up

Sources:
- deployment.process (line 45)
- deployment.docker (line 12)
- ci-cd.pipeline (line 78)

Use Case 3: API Documentation Query

Scenario: Finding all authentication-related endpoints.

$ dacli ask "List all endpoints that require authentication" --format json

{
  "answer": "4 endpoints require authentication",
  "details": [
    "/api/users/* - All user management endpoints",
    "/api/admin/* - Admin panel endpoints",
    "/api/profile - User profile endpoint",
    "/api/settings - User settings endpoint"
  ],
  "sources": [
    {"path": "api.authentication", "line": 23},
    {"path": "api.endpoints.users", "line": 45},
    {"path": "api.endpoints.admin", "line": 89}
  ]
}

Cost Analysis

Without dacli ask (External LLM)

  • Load full documentation: ~200k tokens
  • Claude API cost: ~$0.60 per query
  • Multiple queries needed: $2-5 per session
  • Monthly cost (100 queries): $60-500

With dacli ask (Claude Code CLI)

  • Uses existing Claude Pro subscription ($20/month)
  • 5 iterations × 4k tokens = 20k tokens
  • Covered by flatrate
  • Additional cost: $0

With dacli ask (Bedrock)

  • 5 iterations × 4k tokens = 20k tokens
  • Claude Sonnet on Bedrock: ~$0.06
  • Cost per query: $0.06 (90% cheaper than loading full docs)
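
The per-query arithmetic above can be captured in a small estimator, using the example figures from this issue ($0.003 per 1k tokens, 5 iterations of ~4k tokens) as assumptions:

```python
# Example pricing figure used in this issue, not a live rate
PRICE_PER_1K_TOKENS = 0.003


def estimate_cost(iterations: int = 5, tokens_per_iteration: int = 4_000) -> float:
    """Estimate pay-per-token cost for one `dacli ask` query."""
    total_tokens = iterations * tokens_per_iteration
    return total_tokens / 1_000 * PRICE_PER_1K_TOKENS


# 5 iterations x 4k tokens = 20k tokens -> about $0.06 per query
```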

Security Considerations

  • API keys stored in environment variables (not in config file)
  • .dacli/ added to .gitignore automatically
  • Prompts sanitized to mitigate prompt-injection attacks
  • Timeout limits to prevent infinite loops
  • Optional cost estimation before execution (for pay-per-token)

Configuration Example

$ dacli config init

╭─────────────────────────────────────────╮
│  dacli Configuration Setup              │
╰─────────────────────────────────────────╯

Detecting available LLM providers...

✓ Claude Code (claude-code) - FOUND
  Status: Ready to use
  Cost: Flatrate (recommended)

✗ Aider (aider) - NOT FOUND
  Install: pip install aider-chat

✓ Anthropic API - FOUND
  Status: ANTHROPIC_API_KEY detected
  Cost: Pay-per-token ($0.003/1k tokens)

✓ AWS Bedrock - FOUND
  Status: AWS credentials detected
  Cost: Pay-per-token ($0.003/1k tokens)
  
Select default provider:
  [1] claude-code (recommended - uses your flatrate)
  [2] api (Anthropic API)
  [3] bedrock (AWS Bedrock)
  [4] custom (specify command)
> 1

✓ Configuration saved to .dacli/config.toml
✓ Created .dacli/.gitignore

Test configuration? [Y/n] y
Running test query...
✓ Provider works! Response received in 2.3s

Setup complete! Try:
  dacli ask "How does authentication work?"

Documentation Updates

  • Add src/docs/50-user-manual/30-ai-assistant.adoc - Full guide for ask command
  • Update CLI spec (06_cli_specification.adoc) with ask command
  • Update architecture (05_building_block_view.adoc) with LLM integration
  • Add ADR-008 for LLM provider architecture decision

Acceptance Criteria

  • dacli ask "question" works with Claude Code
  • Iterative context building with 3+ sections
  • Support for 3+ different LLM providers (Claude Code, API, Bedrock)
  • Configuration via .dacli/config.toml
  • Auto-detection of available providers
  • Source citations in answers (section paths + lines)
  • Verbose mode shows iteration steps
  • MCP tool implementation
  • Cost estimation for pay-per-token providers
  • Tests for all provider implementations
  • Documentation for setup and usage

Related Issues

Labels

enhancement, feature, ai, llm, search, high-priority

Priority

High - Transforms dacli from navigation tool to intelligent documentation assistant. Leverages existing flatrate subscriptions for zero additional cost.
