Commit 011120a

Merge pull request #1785 from dandi/enh-lad-framework
Add LLM-Assisted Development (LAD) test quality framework
2 parents 0a09670 + c3370e1 commit 011120a

39 files changed: 8508 additions & 0 deletions

.lad/.copilot-instructions.md

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
1+
# Global Copilot Instructions
2+
3+
* Prioritize **minimal scope**: only edit code directly implicated by the failing test.
4+
* Protect existing functionality: do **not** delete or refactor code outside the immediate test context.
5+
* Before deleting any code, follow the "Coverage & Code Safety" guidelines below.
6+
7+
Copilot, do not modify any files under .lad/.
8+
All edits must occur outside .lad/, or in prompts/ when explicitly updating LAD itself.
9+
10+
Coding & formatting
11+
* Follow PEP 8; run Black.
12+
* Use type hints everywhere.
13+
* External dependencies limited to numpy, pandas, requests.
14+
* Target Python 3.11.
15+
16+
Testing & linting
17+
* Write tests using component-appropriate strategy (see Testing Strategy below).
18+
* Run flake8 with `--max-complexity=10`; keep complexity ≤ 10.
19+
* Every function/class **must** include a **NumPy-style docstring** (Sections: Parameters, Returns, Raises, Examples).
20+
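To illustrate the type-hint and docstring rules above, here is a minimal sketch. The function `clip_value` is hypothetical and exists only to show the four required docstring sections (Parameters, Returns, Raises, Examples) alongside full type hints.

```python
def clip_value(value: float, lower: float, upper: float) -> float:
    """Clip a value to a closed interval.

    Parameters
    ----------
    value : float
        Number to clip.
    lower : float
        Lower bound of the interval.
    upper : float
        Upper bound of the interval.

    Returns
    -------
    float
        ``value`` limited to the range ``[lower, upper]``.

    Raises
    ------
    ValueError
        If ``lower`` is greater than ``upper``.

    Examples
    --------
    >>> clip_value(12.0, 0.0, 10.0)
    10.0
    >>> clip_value(-3.0, 0.0, 10.0)
    0.0
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    # Apply the upper bound first, then the lower bound.
    return max(lower, min(value, upper))
```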
21+
## Testing Strategy by Component Type
22+
23+
**API Endpoints & Web Services:**
24+
* Use **integration testing** - import the real FastAPI/Django/Flask app
25+
* Mock only external dependencies (databases, external APIs, file systems)
26+
* Test actual HTTP routing, validation, serialization, and error handling
27+
* Verify real request/response behavior and framework integration
28+
29+
**Business Logic & Algorithms:**
30+
* Use **unit testing** - mock all dependencies completely
31+
* Test logic in complete isolation, focus on edge cases
32+
* Maximize test speed and reliability
33+
* Test pure business logic without framework concerns
34+
35+
**Data Processing & Utilities:**
36+
* Use **unit testing** with minimal dependencies
37+
* Use test data fixtures for predictable inputs
38+
* Focus on input/output correctness and error handling
39+
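The "Business Logic & Algorithms" strategy above could look roughly like the following sketch. The function `apply_discount` and the `rate_provider` collaborator are hypothetical names invented for this illustration; the only dependency is replaced by a `unittest.mock.Mock`, so the logic is tested in isolation, including an edge case.

```python
from unittest.mock import Mock


def apply_discount(order_total: float, rate_provider) -> float:
    """Return ``order_total`` reduced by the rate from ``rate_provider``."""
    rate = rate_provider.get_rate()
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"discount rate out of range: {rate}")
    return round(order_total * (1.0 - rate), 2)


def test_apply_discount_uses_provider_rate() -> None:
    # The dependency is fully mocked: no database or network is touched.
    provider = Mock()
    provider.get_rate.return_value = 0.25
    assert apply_discount(200.0, provider) == 150.0
    provider.get_rate.assert_called_once_with()


def test_apply_discount_rejects_invalid_rate() -> None:
    # Edge case: a rate above 1.0 must be rejected.
    provider = Mock()
    provider.get_rate.return_value = 1.5
    try:
        apply_discount(200.0, provider)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Both test functions follow pytest naming conventions, so `pytest -q` would collect them if they were placed in a test module.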
40+
## Regression Prevention
41+
42+
**Before making changes:**
43+
* Run full test suite to establish baseline: `pytest -q --tb=short`
44+
* Identify dependencies: `grep -r "function_name" . --include="*.py"`
45+
* Understand impact scope before modifications
46+
47+
**During development:**
48+
* Run affected tests after each change: `pytest -q tests/test_modified_module.py`
49+
* Preserve public API interfaces or update all callers
50+
* Make minimal changes focused on the failing test
51+
52+
**Before commit:**
53+
* Run full test suite: `pytest -q --tb=short`
54+
* Verify no regressions introduced
55+
* Ensure test coverage maintained or improved
56+
57+
## Code Quality Setup (One-time per project)
58+
59+
**1. Install quality tools:**
60+
```bash
61+
pip install flake8 pytest pytest-cov coverage radon flake8-radon black
62+
```
63+
64+
**2. Configure .flake8 file in project root:**
65+
```ini
66+
[flake8]
67+
max-complexity = 10
68+
radon-max-cc = 10
69+
exclude =
70+
__pycache__,
71+
.git,
72+
.lad,
73+
.venv,
74+
venv,
75+
build,
76+
dist
77+
```
78+
79+
**3. Configure .coveragerc file (see kickoff prompt for template)**
80+
81+
**4. Verify setup:**
82+
```bash
83+
flake8 --version # Should show flake8-radon plugin
84+
radon --version # Confirm radon installation
85+
pytest --cov=. --version # Confirm coverage plugin
86+
```
87+
88+
## Installing & Configuring Radon
89+
90+
**Install Radon and its Flake8 plugin:**
91+
```bash
92+
pip install radon flake8-radon
93+
```
94+
This installs Radon's CLI and enables the `--radon-max-cc` option in Flake8.
95+
96+
**Enable Radon in Flake8** by adding to `.flake8` or `setup.cfg`:
97+
```ini
98+
[flake8]
99+
max-complexity = 10
100+
radon-max-cc = 10
101+
```
102+
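Functions whose cyclomatic complexity exceeds 10 are then flagged as errors: C901 comes from Flake8's built-in mccabe check (`max-complexity`), and R701 from the Radon plugin (`radon-max-cc`).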
103+
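To make the complexity limit of 10 more concrete, the sketch below counts decision points with the standard `ast` module. It is a rough approximation written for illustration only, not the algorithm Flake8 or Radon actually use, and `approx_complexity` is a hypothetical name; real tools may count some constructs differently.

```python
import ast

# Each of these nodes adds one branch in this simplified count (an assumption).
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def approx_complexity(source: str) -> dict[str, int]:
    """Map each function name in ``source`` to an approximate complexity."""
    scores: dict[str, int] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        score = 1  # One path through a function with no branches.
        for child in ast.walk(node):
            if isinstance(child, _BRANCH_NODES):
                score += 1
            elif isinstance(child, ast.BoolOp):
                # "a and b and c" adds two extra short-circuit paths.
                score += len(child.values) - 1
        scores[node.name] = score
    return scores
```

Note that branches inside nested functions are also counted towards the enclosing function in this sketch, so it tends to overestimate for code with inner helpers.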
104+
**Verify Radon raw metrics:**
105+
```bash
106+
radon raw path/to/your/module.py
107+
```
108+
Outputs LOC, LLOC, comments, blank lines—helping you spot oversized modules quickly.
109+
110+
**(Optional) Measure Maintainability Index:**
111+
```bash
112+
radon mi path/to/your/module.py
113+
```
114+
Reports a maintainability rank (A, B, or C) per file; add `-s` to also print the underlying 0–100 Maintainability Index score.
115+
116+
Coverage & Code Safety
117+
* For safety checks, do **not** run coverage inside VS Code.
118+
Instead, ask the user:
119+
> "Please run in your terminal:
120+
> ```bash
121+
> coverage run -m pytest [test_files] -q && coverage html
122+
> ```
123+
> then reply **coverage complete**."
124+
125+
* Before deleting code, verify:
126+
1. 0% coverage via `coverage report --show-missing`
127+
2. Absence from Level-2 API docs
128+
If both hold, prompt:
129+
130+
Delete <name>? (y/n)
131+
Reason: 0% covered and not documented.
132+
(Tip: use VS Code "Find All References" on <name>.)
133+
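The first deletion check above (0% coverage) can also be done programmatically. The sketch below assumes the user has additionally run `coverage json`, and that the resulting report stores per-file `executed_lines` and `missing_lines` lists under a top-level `files` key. The helper names and the default `coverage.json` path are assumptions for illustration.

```python
import json
from pathlib import Path


def load_report(path: str = "coverage.json") -> dict:
    """Load a report written by ``coverage json`` (default path assumed)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def is_body_uncovered(report: dict, filename: str, first: int, last: int) -> bool:
    """Return True if no line in ``first..last`` ran and at least one was missed.

    Pass the line range of the function *body*: the ``def`` line itself is
    executed when the module is imported, so it usually counts as covered.
    """
    data = report["files"][filename]
    executed = [n for n in data["executed_lines"] if first <= n <= last]
    missing = [n for n in data["missing_lines"] if first <= n <= last]
    return bool(missing) and not executed
```

A `True` result covers only the coverage half of the rule; the Level-2 API docs check and the `Delete <name>? (y/n)` prompt still apply before anything is removed.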
134+
Commits
135+
* Use Conventional Commits. Example:
136+
`feat(pipeline-filter): add ROI masking helper`
137+
* Keep body as bullet list of sub-tasks completed.
138+
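A commit header such as the example above can be checked mechanically. The following is a sketch, not an official validator: the list of allowed types is an assumption (the Conventional Commits specification itself only defines `feat` and `fix` and permits others), and `parse_commit_header` is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed type list; adjust to the project's own convention.
_TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
          "test", "build", "ci", "chore", "revert")

_HEADER = re.compile(
    r"^(?P<type>" + "|".join(_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"   # optional "(scope)"
    r"(?P<breaking>!)?"             # optional breaking-change marker
    r": (?P<subject>\S.*)$"         # colon, space, non-empty subject
)


def parse_commit_header(header: str) -> Optional[dict[str, object]]:
    """Split a commit header into its parts, or return None if it is invalid."""
    match = _HEADER.match(header)
    if match is None:
        return None
    return {
        "type": match.group("type"),
        "scope": match.group("scope"),
        "breaking": match.group("breaking") is not None,
        "subject": match.group("subject"),
    }
```

For the example header, `parse_commit_header("feat(pipeline-filter): add ROI masking helper")` yields type `feat`, scope `pipeline-filter`, and subject `add ROI masking helper`.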
139+
Docs
140+
* High-level docs live under the target project's `docs/` and are organised in three nested levels using `<details>` tags.
141+
142+
* After completing each **main task** (top-level checklist item), run:
143+
`flake8 {{PROJECT_NAME}} --max-complexity=10`
144+
`python -m pytest --cov={{PROJECT_NAME}} --cov-context=test -q --maxfail=1`
145+
If either step fails, pause for user guidance.
146+
147+
* **Radon checks:** Use `radon raw <file>` to get LOC and SLOC; use `radon mi -s <file>` to get the Maintainability Index score. If LOC > 500 or MI < 65, propose splitting the module.
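The "run both steps, pause if either fails" rule for main tasks can be sketched as a small runner. This is an illustration under assumptions: `run_quality_gate` is a hypothetical helper, and `my_package` in the comment stands in for the `{{PROJECT_NAME}}` placeholder.

```python
import subprocess
import sys
from collections.abc import Sequence


def run_quality_gate(commands: Sequence[Sequence[str]]) -> bool:
    """Run each command in order; stop and return False at the first failure."""
    for command in commands:
        completed = subprocess.run(list(command), check=False)
        if completed.returncode != 0:
            # Do not continue: the instructions call for user guidance here.
            print(
                f"Step failed: {' '.join(command)}; pausing for user guidance.",
                file=sys.stderr,
            )
            return False
    return True


# Example invocation (not executed here; "my_package" is a placeholder):
# run_quality_gate([
#     ["flake8", "my_package", "--max-complexity=10"],
#     [sys.executable, "-m", "pytest", "--cov=my_package",
#      "--cov-context=test", "-q", "--maxfail=1"],
# ])
```

Stopping at the first failure keeps the output short and matches the intent that a failed lint step should be resolved before the test run is interpreted.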

.lad/.vscode/extensions.json

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
1+
{
2+
"recommendations": [
3+
"github.copilot",
4+
"github.copilot-chat",
5+
"ms-python.python",
6+
"ms-python.vscode-pylance",
7+
"hbenl.vscode-test-explorer",
8+
"ryanluker.vscode-coverage-gutters",
9+
"ms-python.flake8"
10+
]
11+
}

.lad/.vscode/settings.json

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
{
2+
"python.testing.pytestEnabled": true,
3+
"python.testing.autoTestDiscoverOnSaveEnabled": true,
4+
"python.testing.pytestArgs": ["-q"],
5+
"coverage-gutters.xmlPath": "coverage.xml",
6+
"python.linting.flake8Enabled": true,
7+
"python.linting.flake8Args": ["--max-complexity=10"]
8+
}

.lad/CLAUDE.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
1+
# Project Context for Claude Code LAD Framework
2+
3+
## Architecture Overview
4+
*Auto-updated by LAD workflows - current system understanding*
5+
6+
## Code Style Requirements
7+
- **Docstrings**: NumPy-style required for all functions/classes
8+
- **Linting**: Flake8 compliance (max-complexity 10)
9+
- **Testing**: TDD approach, component-aware strategies
10+
- **Coverage**: 90%+ target for new code
11+
12+
## Communication Guidelines
13+
**Objective, European-Style Communication**:
14+
- **Avoid excessive enthusiasm**: Replace "brilliant!", "excellent!", "perfect!" with measured language
15+
- **Scientific tone**: "This approach has merit" instead of "That's a great idea!"
16+
- **Honest criticism**: State problems directly - "This approach has significant limitations" vs hedging
17+
- **Acknowledge uncertainty**: "I cannot verify this will work" vs "This should work fine"
18+
- **Balanced perspectives**: Present trade-offs rather than unqualified endorsements
19+
- **Focus on accuracy**: Prioritize correctness over making user feel good about ideas
20+
21+
## Maintenance Integration Protocol
22+
**Technical Debt Management**:
23+
- **Boy Scout Rule**: Leave code cleaner than found when possible
24+
- **Maintenance Registry**: Track and prioritize technical debt systematically
25+
- **Impact-based cleanup**: Focus on functional issues before cosmetic ones
26+
- **Progress tracking**: Update both TodoWrite and plan.md files consistently
27+
28+
## Testing Strategy Guidelines
29+
- **API Endpoints**: Integration testing (real app + mocked external deps)
30+
- **Business Logic**: Unit testing (complete isolation + mocks)
31+
- **Data Processing**: Unit testing (minimal deps + test fixtures)
32+
33+
## Project Structure Patterns
34+
*Learned from exploration - common patterns and conventions*
35+
36+
## Current Feature Progress
37+
*TodoWrite integration status and cross-session state*
38+
39+
## Quality Metrics Baseline
40+
- Test count: *tracked across sessions*
41+
- Coverage: *baseline and current*
42+
- Complexity: *monitored for regression*
43+
44+
## Common Gotchas & Solutions
45+
*Accumulated from previous implementations*
46+
47+
### Token Optimization for Large Codebases
48+
**Standard test commands:**
49+
- **Large test suites**: Use `2>&1 | tail -n 100` for pytest commands to capture only final results/failures
50+
- **Coverage reports**: Use `tail -n 150` for comprehensive coverage output to include summary
51+
- **Keep targeted tests unchanged**: Single test runs (`pytest -xvs`) don't need redirection
52+
53+
**Long-running commands (>2 minutes):**
54+
- **Pattern**: `<command> 2>&1 | tee full_output.txt | grep -iE "(warning|error|failed|exception|fatal|critical)" | tail -n 30; echo "--- FINAL OUTPUT ---"; tail -n 100 full_output.txt`
55+
- **Use cases**: Package installs, builds, data processing, comprehensive test suites, long compilation
56+
- **Benefits**: Captures warnings/errors from anywhere in output, saves full output for detailed review, prevents token explosion
57+
- **Case-insensitive**: Catches `ERROR`, `Error`, `error`, `WARNING`, `Warning`, `warning`, etc.
58+
59+
**Rationale**: Large codebases can generate large volumes of output that consume a significant share of the Claude Pro allowance. The enhanced pattern keeps critical information visible while reducing token usage.
60+
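The long-running command pattern above (`tee`, then `grep -iE`, then `tail`) can be mirrored in Python when output has already been captured as a string. This is a sketch of the same filtering idea; `summarize_output` is a hypothetical helper and its defaults simply copy the 30 and 100 line limits from the shell pattern.

```python
import re

# Same keywords as the grep pattern, matched case-insensitively.
_PROBLEM = re.compile(
    r"warning|error|failed|exception|fatal|critical", re.IGNORECASE
)


def summarize_output(text: str, max_matches: int = 30, tail: int = 100) -> str:
    """Return the last matching problem lines followed by the output tail."""
    lines = text.splitlines()
    matches = [line for line in lines if _PROBLEM.search(line)]
    parts = (
        matches[-max_matches:]        # like: grep ... | tail -n 30
        + ["--- FINAL OUTPUT ---"]    # like: echo "--- FINAL OUTPUT ---"
        + lines[-tail:]               # like: tail -n 100 full_output.txt
    )
    return "\n".join(parts)
```

As in the shell version, a line that is both a problem line and within the final tail appears twice; that duplication is accepted in exchange for never missing an early warning.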
61+
## Integration Patterns
62+
*How components typically connect in this codebase*
63+
64+
## Cross-Session Integration Tracking
65+
*Maintained across LAD sessions to prevent duplicate implementations*
66+
67+
### Active Implementations
68+
*Current state of system components and their integration readiness*
69+
70+
| Component | Status | Integration Points | Last Updated |
71+
|-----------|--------|--------------------|--------------|
72+
| *No active implementations tracked* | - | - | - |
73+
74+
### Integration Decisions Log
75+
*Historical decisions to guide future development*
76+
77+
| Feature | Decision | Strategy | Rationale | Session Date | Outcome |
78+
|---------|----------|----------|-----------|--------------|---------|
79+
| *No decisions logged* | - | - | - | - | - |
80+
81+
### Pending Integration Tasks
82+
*Cross-session work that needs completion*
83+
84+
- *No pending integration tasks*
85+
86+
### Architecture Evolution Notes
87+
*Key architectural changes that affect future integration decisions*
88+
89+
- *No architectural changes logged*
90+
91+
### Integration Anti-Patterns Avoided
92+
*Documentation of duplicate implementations prevented*
93+
94+
- *No anti-patterns logged*
95+
96+
---
97+
*Last updated by Claude Code LAD Framework*
