This project implements a flexible, configuration-driven pipeline for importing HIV/TB health indicators from various Excel/CSV formats into DHIS2. Built on OpenFN and Instant OpenHIE v2, it supports multiple data sources and SFTP-based file uploads.
- Multi-Format Support: Processes CSV/XLSX files with configuration-based column mapping
- Flexible Data Sources: Google Sheets API and SFTP file monitoring
- Automated Processing: Scheduled (cron) and event-driven (webhook) workflows
- Data Validation: Built-in validation rules and transformation capabilities
- Time-Based Protection: Configurable update windows to prevent accidental overwrites
- Docker Swarm Deployment: Production-ready containerized architecture
- Quick Start Guide - Deploy in 15 minutes
- Environment Setup - Full installation guide
- Production Deployment - Government instance setup
- Credential Configuration - DHIS2 & SFTP credentials
- Technical Specifications - SDD artifacts (spec, plan, tasks)
- Development Guide - Workflow development
- OpenFN Design Patterns - Best practices
- Testing Strategy - Testing approach
The project includes automated CI testing via GitHub Actions:
- Environment Setup CI: Tests the complete instant OpenHIE deployment
- Workflow Tests CI: Validates OpenFN workflows using CLI testing framework
View CI Documentation for details.
The run-ci-locally.sh script uses Docker to run the GitHub Actions workflows locally (no separate installation of act required):
# Prerequisites: Docker must be running
./scripts/run-ci-locally.sh # Run all CI workflows
./scripts/run-ci-locally.sh --env-setup # Environment setup only
./scripts/run-ci-locally.sh --workflow-tests # Workflow tests only
./scripts/run-ci-locally.sh --list # List available workflows
./scripts/run-ci-locally.sh --verbose # Enable verbose output
./scripts/run-ci-locally.sh --help # Show all options
Note: The script automatically builds a Docker image with act on first run. The first run may take a minute; subsequent runs are fast.
To run basic checks before pushing:
# Enable the pre-push hook
git config core.hooksPath .githooks
# To disable it later
git config --unset core.hooksPath
📖 For detailed setup instructions, see the Environment Setup Guide.
- Docker 20.10+ with Swarm mode
- Node.js 18+ and npm
- Git 2.25+
- Ubuntu 20.04+ or similar Linux
- 4GB RAM, 20GB disk space
git clone https://github.com/your-org/malawi-dhis2-pipeline.git
cd malawi-dhis2-pipeline
cp .env.example .env
# Edit .env with your settings
./get-cli.sh linux
# Verify: ./instant --version
# Build custom Docker images
./build-custom-images.sh all
# Initialize and start all services
./build-image.sh
./instant project up --env-file .env
Note: See mk.sh for examples of other useful instant CLI commands.
After ~5 minutes for initialization:
- OpenFN: http://localhost:4000 (root@openhim.org / instant101)
- DHIS2: http://localhost:8080 (admin / district)
- SFTP: sftp://localhost:2225 (openfn / instant101)
The pipeline requires two credentials in OpenFN for production deployment.
Configure in OpenFN UI → Projects → Credentials → dhis2-credential:
- Host URL: https://your-dhis2-instance.gov.mw (no trailing slash)
- Username: DHIS2 admin account
- Password: Admin password
Configure in OpenFN UI → Credentials → combined-sftp-dhis2-credential:
- SFTP Host: Your SFTP server IP/hostname
- SFTP Port: 2225 (default)
- SFTP Username: openfn
- DHIS2 Host URL: Same as above
- DHIS2 Username: openfn_integration (service account)
- DHIS2 Password: Integration user password
The openfn_integration user in DHIS2 needs:
- Roles: Data Entry (minimum) or ALL authority
- Org Units: Assigned to all facilities data will be imported to
See quickstart.md for step-by-step credential setup.
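The "no trailing slash" rule and the username/password pair can be illustrated with a minimal sketch. It assumes the credential is used for HTTP Basic authentication against the DHIS2 Web API. The `credential` object, its field names, and the helper functions are hypothetical and are not taken from the OpenFN adaptor.

```javascript
// Strip trailing slashes so that joining with "/api/..." never yields "//".
function normalizeHostUrl(hostUrl) {
  return hostUrl.trim().replace(/\/+$/, "");
}

// Build an HTTP Basic Authorization header value from a username and password.
function basicAuthHeader(username, password) {
  const token = Buffer.from(`${username}:${password}`, "utf8").toString("base64");
  return `Basic ${token}`;
}

// Hypothetical credential object; the field names are illustrative only.
const credential = {
  hostUrl: "https://your-dhis2-instance.gov.mw/",
  username: "openfn_integration",
  password: "example-password",
};

const baseUrl = normalizeHostUrl(credential.hostUrl);
console.log(`${baseUrl}/api/me`); // endpoint commonly used to check a DHIS2 login
console.log(basicAuthHeader(credential.username, credential.password).startsWith("Basic "));
```

Normalizing the host once, at the point where the credential is read, avoids having every request builder guard against a doubled slash.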
- Environment Setup Guide - Detailed installation and configuration
- Quick Start Tutorial - Get running in 15 minutes
- Testing Guide - Comprehensive testing procedures
- Troubleshooting Guide - Common issues and solutions
- CSV/XLSX Integration Guide - File format configuration
- OpenFN Workflow Sync - Workflow development and sync
- Google Sheets Setup - Google Sheets API configuration
- SFTP Excel Integration - SFTP file processing
- OpenFN Workflows - Workflow definitions and testing
- Testing Framework - Automated test suite
- Package Documentation - Individual service documentation
- Deliverables - Project requirements and milestones
- Migration Guide - PostgreSQL to Google Sheets migration
The pipeline automatically detects and processes these file types:
- ART Data: *ART*data*long*.xlsx - ART supervision with age/gender disaggregation
- DQ Sites: *Q*FY*DQ*sites*.xlsx - Data quality reports with completeness scores
- Direct Queries: *Direct*Queries*.xlsx - MoH quarterly reports with multi-sheet support
File type configurations are defined inline in FILE_TYPE_CONFIGS within jobs/00-scan-sftp-for-changes.js. See data-model.md for configuration structure.
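As a rough illustration of pattern-based detection, the sketch below converts each glob into a regular expression and returns the first match. It assumes that only `*` wildcards are used and that matching is case-sensitive. The type keys (`art_data`, `dq_sites`, `direct_queries`) and function names are hypothetical; the actual logic in jobs/00-scan-sftp-for-changes.js may differ.

```javascript
// Convert a simple glob (only `*` wildcards) into an anchored, case-sensitive RegExp.
function globToRegExp(glob) {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Patterns copied from the list above; the type keys are hypothetical labels.
const FILE_PATTERNS = {
  art_data: "*ART*data*long*.xlsx",
  dq_sites: "*Q*FY*DQ*sites*.xlsx",
  direct_queries: "*Direct*Queries*.xlsx",
};

// Return the first type whose pattern matches the file name, or null if none does.
function detectFileType(fileName) {
  for (const [type, pattern] of Object.entries(FILE_PATTERNS)) {
    if (globToRegExp(pattern).test(fileName)) return type;
  }
  return null;
}

console.log(detectFileType("ART_data_long_format.xlsx"));
console.log(detectFileType("Q2FY25_DQ_253_sites.xlsx"));
console.log(detectFileType("notes.txt"));
```

Because the first matching pattern wins, the order of the entries matters whenever two patterns could match the same file name.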
# Check sync status
./packages/openfn/instant-workflow-sync.sh status
# Download workflows from UI
./packages/openfn/instant-workflow-sync.sh download
# Upload workflows to UI
./packages/openfn/instant-workflow-sync.sh upload
# Enable auto-sync watch mode
./packages/openfn/instant-workflow-sync.sh watch
- Bidirectional Sync: Download from UI or upload from code
- Version Management: Track changes with lock_version support
- Conflict Resolution: Automatic or manual conflict handling
- Snapshot System: Automatic backups before changes
- Watch Mode: Auto-sync on file changes
OPENFN_SYNC_MODE=manual # manual|auto-download|auto-upload
OPENFN_CONFLICT_RESOLUTION=prompt # prompt|local-wins|remote-wins
OPENFN_ENABLE_AUTO_SNAPSHOT=true # Auto-create snapshots
See Workflow Sync Documentation for full details.
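A small validation step can catch mistyped values for these settings before a sync runs. The sketch below is an assumption about how such a check might look, not the sync script's own behavior. The allowed values come from the comments above, except `false` for OPENFN_ENABLE_AUTO_SNAPSHOT, which is assumed. `readSyncSettings` is a hypothetical helper.

```javascript
// Allowed values per setting, taken from the inline comments above.
// ASSUMPTION: "false" is accepted for OPENFN_ENABLE_AUTO_SNAPSHOT.
const ALLOWED = {
  OPENFN_SYNC_MODE: ["manual", "auto-download", "auto-upload"],
  OPENFN_CONFLICT_RESOLUTION: ["prompt", "local-wins", "remote-wins"],
  OPENFN_ENABLE_AUTO_SNAPSHOT: ["true", "false"],
};

// Defaults mirror the example values shown above.
const DEFAULTS = {
  OPENFN_SYNC_MODE: "manual",
  OPENFN_CONFLICT_RESOLUTION: "prompt",
  OPENFN_ENABLE_AUTO_SNAPSHOT: "true",
};

// Read settings from an env-like object, falling back to defaults and
// throwing a descriptive error on any value outside the allowed set.
function readSyncSettings(env) {
  const settings = {};
  for (const [name, allowed] of Object.entries(ALLOWED)) {
    const value = env[name] ?? DEFAULTS[name];
    if (!allowed.includes(value)) {
      throw new Error(`${name} must be one of ${allowed.join("|")}, got "${value}"`);
    }
    settings[name] = value;
  }
  return settings;
}

console.log(readSyncSettings({ OPENFN_SYNC_MODE: "auto-download" }));
```

In a real script the argument would typically be `process.env`; an explicit object is used here so the example behaves the same on any machine.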
- Build Workflow Image: ./build-custom-images.sh openfn-workflows
  - Packages workflow files into Docker image
  - Includes YAML configurations and job definitions
- Deploy with Workflow Loading: ./mk.sh
  - Sets OPENFN_LOAD_WORKFLOWS_ON_STARTUP=true
  - Deploys workflow-loader service that reads from /app/workflows/
  - Uses OpenFN CLI to deploy via provisioning API
- Verify Deployment:
  # Test workflow loading
  cd projects/indicator_workflow_testing
  ./run-tests.sh --workflows
  # Check OpenFN UI
  # Navigate to http://localhost:4000
Workflows are defined in projects/openfn-workflows/workflows/upload-indicator-files-to-dhis2/:
- project.yaml - Project configuration with workflows, jobs, triggers
- jobs/ - Individual job definitions (.js files)
- .versions/ - Downloaded workflow versions (auto-created)
- .snapshots/ - Workflow snapshots (auto-created)
- README.md - Workflow documentation
Option 1: UI-First Development
- Make changes in OpenFN UI
- Test workflows in UI
- Download to code: ./packages/openfn/instant-workflow-sync.sh download
- Commit changes to git
Option 2: Code-First Development
- Edit workflow files locally
- Upload to test: ./packages/openfn/instant-workflow-sync.sh upload
- Test in UI
- Commit changes to git
- OPENFN_LOAD_WORKFLOWS_ON_STARTUP=true - Enables automatic loading
- OPENFN_WORKFLOW_MANUAL_CLI=false - Uses packaged workflows (not external files)
- OPENFN_SYNC_MODE=manual - Workflow sync mode
- OPENFN_CONFLICT_RESOLUTION=prompt - How to handle conflicts
For workflow changes without full rebuild:
# Quick sync and redeploy
./packages/openfn/instant-workflow-sync.sh upload
./instant package up -n openfn -d
# Full rebuild (if workflow structure changed)
./mk.sh
File type configurations are defined inline in FILE_TYPE_CONFIGS within jobs/00-scan-sftp-for-changes.js.
Each configuration specifies:
- File patterns (glob matching)
- Column mappings (source → DHIS2 fields)
- Period format and extraction rules
- Category configurations for disaggregation
Example structure:
{
  fileType: 'pepfar_tx_curr_csv',
  filePatterns: ['PEPFAR_TxCURR*.csv'],
  periodFormat: 'YYYY-Qx',
  columnMappings: {
    facility: ['site_id', 'facility_name'],
    indicator: 'TX_CURR',
    value: ['value', 'count']
  },
  periodExtraction: 'filename'
}
To add a new file type:
- Edit FILE_TYPE_CONFIGS in projects/openfn-workflows/workflows/upload-indicator-files-to-dhis2/jobs/00-scan-sftp-for-changes.js
- Add sample file to projects/sftp/data/samples/ for testing
- Rebuild workflow image: ./build-custom-images.sh openfn-workflows
- Deploy: ./mk.sh
See data-model.md for detailed configuration documentation.
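To make the example structure more concrete, here is a minimal sketch of how such a configuration might be applied to one parsed row and a file name. It rests on two assumptions not confirmed by the source: an array in `columnMappings` lists candidate header names (the first one present in the row wins) while a plain string is a constant, and `periodFormat: 'YYYY-Qx'` means a token such as `2025-Q1` appears in the file name. The functions `mapRow` and `periodFromFileName`, the sample row, and the sample file name are hypothetical.

```javascript
// Copied from the example structure above.
const config = {
  fileType: "pepfar_tx_curr_csv",
  filePatterns: ["PEPFAR_TxCURR*.csv"],
  periodFormat: "YYYY-Qx",
  columnMappings: {
    facility: ["site_id", "facility_name"],
    indicator: "TX_CURR",
    value: ["value", "count"],
  },
  periodExtraction: "filename",
};

// ASSUMPTION: arrays are candidate header names (first present wins);
// plain strings are constants applied to every row.
function mapRow(row, columnMappings) {
  const out = {};
  for (const [field, mapping] of Object.entries(columnMappings)) {
    if (Array.isArray(mapping)) {
      const header = mapping.find((name) => name in row);
      out[field] = header === undefined ? null : row[header];
    } else {
      out[field] = mapping;
    }
  }
  return out;
}

// ASSUMPTION: the file name carries a token like "2025-Q1".
// DHIS2 writes quarterly periods without the hyphen, e.g. "2025Q1".
function periodFromFileName(fileName) {
  const match = /(\d{4})-Q([1-4])/.exec(fileName);
  return match ? `${match[1]}Q${match[2]}` : null;
}

// Hypothetical row and file name for illustration.
const row = { facility_name: "Example Clinic", count: "120" };
console.log(mapRow(row, config.columnMappings));
console.log(periodFromFileName("PEPFAR_TxCURR_2025-Q1.csv"));
```

Returning `null` for an unresolved column or period, rather than throwing, lets a later validation step report all problems in a file at once.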
The project includes a comprehensive testing framework for validating workflows and API functionality:
# Run all tests
./projects/indicator_workflow_testing/run-tests.sh
# Run specific test suites
./projects/indicator_workflow_testing/run-tests.sh --api # API connectivity tests
./projects/indicator_workflow_testing/run-tests.sh --excel # Excel parsing tests
./projects/indicator_workflow_testing/run-tests.sh --sftp # SFTP integration tests
./projects/indicator_workflow_testing/run-tests.sh --integration # End-to-end tests
- Unit Tests: npm test
- Integration Tests: See Testing Guide
- Validation: npm run validate-sheets (for Google Sheets)
- Automated Testing Suite - Comprehensive test framework
- API Tests: Health checks, authentication, workflow validation
- Excel Tests: Multi-sheet parsing and data transformation validation
- SFTP Tests: File transfer and workflow integration
- Integration Tests: End-to-end workflow execution with sample data
- OpenFN Dashboard: Workflow execution status
- DHIS2 Import Summary: Data import results
- Docker Service Logs: docker service logs <service_name>
For detailed troubleshooting, see the Environment Setup Guide or the Troubleshooting Guide.
Common quick fixes:
| Issue | Quick Solution |
|---|---|
| Services not starting | Check logs: docker service logs <service_name> |
| Workflows not loading | Run: docker service update --force openfn-workflows_workflow-loader |
| DHIS2 not accessible | Wait 2-5 minutes for initialization |
| Port conflicts | Change ports in .env file |
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Update documentation
- Submit a pull request
[License information]
- Issues: GitHub Issues
- Documentation: See Documentation section above
- instant v2: https://github.com/openhie/instant-v2
- OpenFN Community: https://community.openfn.org/
- OpenFN Lightning v2.8+ with web UI and API
- Custom Docker images with working SFTP adaptor
- Automated workflow execution and monitoring
- Secure file upload endpoint for partners
- Pre-loaded with sample Excel files for testing
- Automated file monitoring and processing
- DHIS2 v2.39+ configured for Malawi health programs
- Pre-configured metadata for HIV/TB indicators
- RESTful API for data import/export
- Comprehensive test suite for all workflows
- Docker-based testing environment
- CLI and integration testing tools
All workflow documentation has been consolidated in docs/:
- Overview - Project architecture and quick start
- Development Guide - How to create and modify workflows
- Testing Strategy - Comprehensive testing approach
- SFTP to DHIS2 Testing Plan - Detailed testing plan
- Docker Environment - Docker setup and configuration
- Troubleshooting Guide - Common issues and solutions
- OpenFN Design Compliance - Design patterns and best practices
- DHIS2 Pattern Examples - DHIS2 integration patterns
For the complete testing framework documentation, see:
- docs/Deliverables.md - Project deliverables and milestones
- docs/MCP-SERVERS.md - MCP server integration
- instant v2 Documentation - Infrastructure orchestration platform
- Docker Swarm Guide - Container orchestration
- OpenFN Platform Docs - Workflow automation platform
- DHIS2 Documentation - Health information system
| Issue | Solution |
|---|---|
| Docker permission denied | Run sudo usermod -aG docker $USER and logout/login |
| instant CLI not found | Ensure /usr/local/bin is in your PATH or use ./instant |
| Services failing to start | Check logs: docker service logs <service_name> |
| Port already in use | Change port mappings in .env file |
| Out of disk space | Run docker system prune -a to clean up |
# Check workflow loader logs
docker service logs openfn-workflows_workflow-loader
# Manually trigger workflow loading
cd packages/openfn/importer/workflows
docker service update --force openfn-workflows_workflow-loader
# Check if service is running
docker service ps dhis2-instance_dhis2
# Wait for initialization (can take 2-5 minutes)
docker service logs -f dhis2-instance_dhis2 | grep "Server startup"
# Verify SFTP service has bundled files
docker exec $(docker ps -q -f name=sftp-server) ls -la /data/excel-files/
# Should see:
# - ART_data_long_format.xlsx
# - Direct Queries - Q1 2025 MoH Reports.xlsx
# - Q2FY25_DQ_253_sites.xlsx
See the Production Deployment Guide for deploying to government DHIS2 instances.