docs: Add development setup guide and contribution guidelines #17
Open
Description
Problem Statement
The current README is minimal and lacks critical information for new contributors and developers who want to set up the project locally. Without clear setup instructions, contribution guidelines, and project structure documentation, potential GSoC contributors and community members will face roadblocks when trying to:
- Install and run the project locally
- Understand the project architecture
- Start contributing to the codebase
- Set up required dependencies and environment variables
Objective
Create comprehensive documentation that helps new developers quickly onboard to GA4GH-RegBot and understand how to contribute effectively.
Acceptance Criteria
1. A new SETUP.md, or an updated README.md, with:
- Python version requirements (specify version, e.g., Python 3.8+)
- Virtual environment setup instructions (venv or conda)
- Step-by-step installation guide
- Clone repository
- Create and activate virtual environment
- Install dependencies: `pip install -r requirements.txt`
- Verify installation with a simple test
- How to run the project locally
- Environment variables setup (.env template, required API keys, examples)
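The "verify installation" and environment-variable items could be backed by a small check script. The following is a minimal sketch, not part of the current codebase: the function name `check_environment` and the variable name `LLM_API_KEY` are hypothetical placeholders, and the 3.8 minimum simply mirrors the example version given above.

```python
import os
import sys


def check_environment(required_vars, environ=None, min_version=(3, 8)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    environ = os.environ if environ is None else environ
    problems = []

    # Compare only (major, minor) against the required minimum.
    found = tuple(sys.version_info[:2])
    if found < tuple(min_version):
        problems.append(
            "Python %d.%d+ is required, found %d.%d" % (tuple(min_version) + found)
        )

    # A variable that is missing or set to an empty string counts as unset.
    for name in required_vars:
        if not environ.get(name):
            problems.append("Environment variable %s is not set" % name)

    return problems


# Example: LLM_API_KEY is a placeholder, not a name taken from the project.
for problem in check_environment(["LLM_API_KEY"]):
    print("FAIL:", problem)
```

Returning a list instead of raising on the first failure lets a new contributor see every missing piece in one run.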
2. PROJECT STRUCTURE documentation:
- Directory tree with descriptions
- Explanation of each module (src/ contents)
- Key files and their purpose (main.py, requirements.txt, etc.)
- Architecture overview (how LLM, RAG, and vector store fit together)
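For the architecture overview, a toy walk-through of the retrieve-then-prompt flow may help readers before they open the source. The sketch below is illustrative only and makes several assumptions: it uses a bag-of-words stand-in for real embeddings, an in-memory list in place of a vector store such as ChromaDB, and stops at building the prompt instead of calling an LLM. All names (`embed`, `ToyVectorStore`, `build_prompt`) and the sample chunks are hypothetical and do not describe the project's actual modules or documents.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store."""

    def __init__(self):
        self._items = []  # list of (chunk_text, vector) pairs

    def add(self, chunk):
        self._items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(
            self._items, key=lambda item: cosine(query_vec, item[1]), reverse=True
        )
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, store, k=2):
    """Retrieve the top-k chunks and place them in the prompt an LLM would receive."""
    context = "\n".join("- " + chunk for chunk in store.search(question, k))
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + question


# Made-up chunks standing in for ingested policy text.
store = ToyVectorStore()
store.add("Data sharing requires informed consent from participants.")
store.add("Genomic data must be stored with access controls.")
store.add("Researchers should report breaches promptly.")
print(build_prompt("What consent is needed for data sharing?", store, k=1))
```

The three stages map onto the components named above: ingestion fills the store, retrieval ranks chunks against the query, and the LLM only ever sees the retrieved context.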
3. CONTRIBUTING.md with:
- How to fork and clone the repository
- Development workflow (branch naming, commit conventions)
- Setting up development tools (linting, formatting, testing)
- How to run tests (once test suite exists - reference [docs] Add development setup guide and contribution guidelines #18 when available)
- Pull request process
- Code style guidelines (PEP 8 for Python)
- How to report issues
4. Dependencies documentation:
- Update `requirements.txt` with version pins and comments explaining each package
- Document why each dependency is needed (LangChain, ChromaDB, etc.)
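One way to keep the version pins from drifting is a small check that flags any requirement without an exact `==` pin. This is a rough sketch under simplifying assumptions: the helper name `find_unpinned` is hypothetical, environment markers and URL-based requirements are not handled, and the versions in the sample are placeholders, not the project's actual pins.

```python
import re


def find_unpinned(lines):
    """Return requirement lines that lack an exact '==' version pin."""
    unpinned = []
    for raw in lines:
        # Strip comments: a '#' at line start or preceded by whitespace.
        line = re.sub(r"(^|\s+)#.*$", "", raw).strip()
        # Skip blank lines and pip options such as -r or -e.
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = [
    "langchain==0.1.0  # LLM orchestration (placeholder version)",
    "chromadb>=0.4     # vector store (range, not an exact pin)",
    "",
    "# development tools",
    "pytest",
]
print(find_unpinned(sample))  # → ['chromadb>=0.4', 'pytest']
```

The same function could be pointed at the real file with `find_unpinned(open("requirements.txt"))` and wired into CI once the project has one.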
5. Quick Start guide (in README):
- 5-minute quick start for impatient developers
- Link to full SETUP.md for detailed instructions
Success Metrics
- ✅ New developers can set up and run the project within 15 minutes
- ✅ Architecture is clear without reading source code
- ✅ Contribution workflow is obvious to first-time contributors
- ✅ All documentation is accurate and up-to-date
Related Issues
- Blocked by: None
- Relates to: feat: Implement document ingestion pipeline - load, chunk, embed, vectorstore #9, Build document ingestion pipeline for GA4GH policy documents #11, feat: hybrid retrieval engine - semantic search + BM25 + category filtering #13, feat: compliance checker - study type detection, single LLM call, citation grounding #15 (all feature issues will benefit from this documentation)
Notes
- This is a documentation improvement (low-code, high-value contribution)
- A good first task for GSoC contributors who want to understand the project before working on features
- Once complete, this should reduce onboarding time for future contributors