Welcome to the MCP Apache Spark History Server project. We'd love to accept your patches and contributions to this project. For detailed information about how to contribute to Kubeflow, please refer to Contributing to Kubeflow.
Thank you for your interest in contributing! This guide will help you get started with contributing to the Spark History Server MCP project.
- 🐍 Python 3.12+
- ⚡ uv package manager
- 🔥 Docker (for local testing with Spark History Server)
- 📦 Node.js (for MCP Inspector testing)
- 🍴 Fork and clone the repository

```bash
git clone https://github.com/YOUR_USERNAME/mcp-apache-spark-history-server.git
cd mcp-apache-spark-history-server
```

- 📦 Install dependencies

```bash
uv sync --group dev --frozen
```

- 🔧 Install pre-commit hooks

```bash
uv run pre-commit install
```

- 🧪 Run tests to verify setup

```bash
uv run pytest
```

Local testing with the MCP Inspector:

```bash
# Terminal 1: Start Spark History Server with sample data
./start_local_spark_history.sh

# Terminal 2: Test your changes
npx @modelcontextprotocol/inspector uv run -m spark_history_mcp.core.main
# Opens browser at http://localhost:6274 for interactive testing
```

Running tests:

```bash
# Run all tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=. --cov-report=html

# Run specific test file
uv run pytest test_tools.py -v
```

Code quality checks:

```bash
# Lint and format (runs automatically on commit)
uv run ruff check --fix
uv run ruff format

# Type checking
uv run mypy .

# Security scanning
uv run bandit -r . -f json -o bandit-report.json
```

- New MCP Tools: Additional Spark analysis tools
- Performance Improvements: Optimize API calls and data processing
- Error Handling: Better error messages and recovery
- Documentation: Examples, tutorials, and guides
- Testing: More comprehensive test coverage
- Monitoring: Metrics and observability features
- Configuration: More flexible configuration options
- CI/CD: GitHub Actions improvements
- AI Agent Examples: New integration patterns
- Deployment: Additional deployment methods
- Analytics: Advanced Spark job analysis tools
- 🌿 Create a feature branch

```bash
git checkout -b feature/your-new-feature
git checkout -b fix/bug-description
git checkout -b docs/improve-readme
```

- 💻 Make your changes
- Follow existing code style and patterns
- Add tests for new functionality
- Update documentation as needed
- Ensure all pre-commit hooks pass
- ✅ Test thoroughly
```bash
# Run full test suite
uv run pytest

# Test with MCP Inspector
npx @modelcontextprotocol/inspector uv run -m spark_history_mcp.core.main

# Test with real Spark data if possible
```

- 📤 Submit a pull request
- Use descriptive commit messages
- Reference any related issues
- Include screenshots for UI changes
- Update CHANGELOG.md if applicable
We use Ruff for linting and formatting (automatically enforced by pre-commit):
- Line length: 88 characters
- Target: Python 3.12+
- Import sorting: Automatic with Ruff
- Type hints: Encouraged but not required for all functions
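As a quick illustration of these conventions, here is a hypothetical helper (the function name and behavior are invented for this example, not part of the project's API):

```python
from typing import Optional


def format_app_label(app_id: str, server: Optional[str] = None) -> str:
    """Build a short display label for a Spark application.

    Type hints and a docstring, with every line under 88 characters.
    """
    if server is None:
        return app_id
    return f"{server}/{app_id}"


print(format_app_label("spark-app-123", "prod"))  # prod/spark-app-123
```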
When adding new tools, follow this pattern:
```python
from typing import Optional


@mcp.tool()
def your_new_tool(
    app_id: str,
    server: Optional[str] = None,
    # other parameters
) -> YourReturnType:
    """
    Brief description of what this tool does.

    Args:
        app_id: The Spark application ID
        server: Optional server name to use

    Returns:
        Description of return value
    """
    ctx = mcp.get_context()
    client = get_client_or_default(ctx, server)
    # Your implementation here
    return client.your_method(app_id)
```

Don't forget to add tests:
```python
from unittest.mock import MagicMock, patch


@patch("tools.get_client_or_default")
def test_your_new_tool(self, mock_get_client):
    """Test your new tool functionality"""
    # Setup mocks
    mock_client = MagicMock()
    mock_client.your_method.return_value = expected_result
    mock_get_client.return_value = mock_client

    # Call the tool
    result = your_new_tool("spark-app-123")

    # Verify results
    self.assertEqual(result, expected_result)
    mock_client.your_method.assert_called_once_with("spark-app-123")
```

When reporting a bug, include:
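The mocking pattern can also be exercised in isolation. This self-contained sketch (the names are invented for illustration, not the project's real API) shows how `MagicMock` both stubs a return value and records calls for verification:

```python
from unittest.mock import MagicMock


def fetch_app_name(app_id: str, client) -> str:
    """Stand-in for a tool that delegates to an injected client."""
    return client.get_app_name(app_id)


# The mock replaces a real Spark History Server client.
mock_client = MagicMock()
mock_client.get_app_name.return_value = "ETL pipeline"

result = fetch_app_name("spark-app-123", mock_client)
assert result == "ETL pipeline"
# Verify the tool called the client exactly once, with the right argument.
mock_client.get_app_name.assert_called_once_with("spark-app-123")
```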
- Environment: Python version, OS, uv version
- Steps to reproduce: Clear step-by-step instructions
- Expected vs actual behavior: What should happen vs what happens
- Logs: Relevant error messages or logs
- Sample data: Spark application IDs that reproduce the issue (if possible)
When requesting a feature, include:
- Use case: Why is this feature needed?
- Proposed solution: How should it work?
- Alternatives: Other approaches considered
- Examples: Sample usage or screenshots
- README.md: Main project overview and quick start
- TESTING.md: Comprehensive testing guide
- examples/integrations/: AI agent integration examples
- Code comments: Inline documentation for complex logic
- Use emojis consistently for visual appeal
- Include code examples for all features
- Provide screenshots for UI elements
- Keep language clear and beginner-friendly
Contributors are recognized in:
- GitHub Contributors section
- Release notes for significant contributions
- Project documentation for major features
- 🐛 Issues: GitHub Issues
- 📚 Documentation: Check existing docs first
All submissions, including submissions by project members, require review. We use GitHub pull requests for this purpose. Consult GitHub Help for more information on using pull requests.