Extract upcoming & historical odds across 8 sports, 100+ leagues, and dozens of betting markets.
Powered by Playwright browser automation. Output to JSON, CSV, or S3.
# Install
pip install oddsharvester
# Or clone & setup with uv
git clone https://github.com/jordantete/OddsHarvester.git && cd OddsHarvester
pip install uv && uv sync
# Scrape upcoming football matches
oddsharvester upcoming -s football -d 20250301 -m 1x2 --headless
# Scrape historical Premier League odds
oddsharvester historic -s football -l england-premier-league --season 2024-2025 -m 1x2 --headless| Feature | Description | |
|---|---|---|
| Upcoming | Scrape upcoming matches | Fetch odds and event details for upcoming sports matches by date or league |
| Historic | Scrape historical odds | Retrieve past odds and match results for any season |
| Multi-market | Advanced parsing | Structured data: dates, teams, scores, venues, and per-bookmaker odds |
| Storage | Flexible output | JSON, CSV (local), or direct upload to AWS S3 |
| Docker | Container-ready | Run seamlessly in Docker with environment variable configuration |
| Proxy | Proxy support | Route through SOCKS/HTTP proxies for geolocation and anti-blocking |
| Sport | Markets |
|---|---|
| β½ Football | 1x2 btts double_chance draw_no_bet over/under european_handicap asian_handicap |
| πΎ Tennis | match_winner total_sets_over/under total_games_over/under asian_handicap exact_score |
| π Basketball | 1x2 moneyline asian_handicap over/under |
| π Rugby League | 1x2 home_away double_chance draw_no_bet over/under handicap |
| π Rugby Union | 1x2 home_away double_chance draw_no_bet over/under handicap |
| π Ice Hockey | 1x2 home_away double_chance draw_no_bet btts over/under |
| βΎ Baseball | moneyline over/under |
| π American Football | 1x2 moneyline over/under asian_handicap |
100+ leagues supported across all sports β Premier League, La Liga, Serie A, NBA, NFL, MLB, NHL, ATP/WTA Grand Slams, and many more.
OddsHarvester has two main commands: upcoming and historic. They share most options, with a few command-specific ones.
Scrape odds for upcoming matches β by date, by league, or by specific match URL.
# By date
oddsharvester upcoming -s football -d 20250301 -m 1x2 --headless
# By league (scrapes all upcoming matches for that league)
oddsharvester upcoming -s football -l england-premier-league -m 1x2,btts --headless
# Multiple leagues
oddsharvester upcoming -s football -l england-premier-league,spain-laliga -m 1x2 --headless
# Specific match URLs
oddsharvester upcoming -s football --match-link "https://www.oddsportal.com/football/..." -m 1x2
# Preview mode (faster β average odds only, no individual bookmakers)
oddsharvester upcoming -s football -d 20250301 -m over_under --preview-only --headlessScrape historical odds and results for past seasons.
# Single league & season
oddsharvester historic -s football -l england-premier-league --season 2022-2023 -m 1x2 --headless
# Current season
oddsharvester historic -s football -l england-premier-league --season current -m 1x2 --headless
# Limit pagination
oddsharvester historic -s football -l england-premier-league --season 2022-2023 -m 1x2 --max-pages 3 --headless
# Output as CSV
oddsharvester historic -s football -l england-premier-league --season 2024-2025 -m 1x2 -f csv -o premier_league_odds --headless| Option | Short | Description | Default |
|---|---|---|---|
--sport |
-s |
Sport to scrape (football, tennis, basketball, etc.) |
required |
--date |
-d |
Target date in YYYYMMDD format |
β |
--league |
-l |
Comma-separated league slugs (e.g. england-premier-league) |
β |
--market |
-m |
Comma-separated markets (e.g. 1x2,btts) |
β |
--match-link |
Specific match URL (repeatable). Overrides --sport, --date, --league |
β |
upcoming only: --date is required unless --league or --match-link is provided. When --league is set, --date is ignored.
historic only:
| Option | Description | Default |
|---|---|---|
--season |
Season: YYYY, YYYY-YYYY, or current |
required |
--max-pages |
Max number of result pages to scrape | unlimited |
| Option | Short | Description | Default |
|---|---|---|---|
--storage |
local or remote (S3) |
local |
|
--format |
-f |
json or csv |
json |
--output |
-o |
Output file path | scraped_data |
| Option | Short | Description | Default |
|---|---|---|---|
--headless |
Run browser in headless mode | False |
|
--concurrency |
-c |
Concurrent scraping tasks | 3 |
--request-delay |
Delay (sec) between match requests | 1.0 |
|
--user-agent |
Custom browser user agent | β | |
--locale |
Browser locale (e.g. fr-BE) |
β | |
--timezone |
Browser timezone (e.g. Europe/Brussels) |
β |
| Option | Description |
|---|---|
--proxy-url |
Proxy URL (http://... or socks5://...) |
--proxy-user |
Proxy username |
--proxy-pass |
Proxy password |
Tip: For best results, match
--localeand--timezoneto your proxy's region.
| Option | Description | Default |
|---|---|---|
--target-bookmaker |
Filter odds for a specific bookmaker | β |
--odds-history |
Include historical odds movement per match | False |
--odds-format |
Odds display format | Decimal Odds |
--preview-only |
Fast mode β average odds only, no bookmaker details | False |
--bookies-filter |
Bookmaker filter: all, classic, or crypto |
all |
--period |
Match period (sport-specific: full-time, halves, etc.) | sport default |
Preview Mode vs Full Mode
| Aspect | Full Mode | Preview Mode |
|---|---|---|
| Speed | Slower (interactive) | Faster (passive) |
| Data | All submarkets + bookmakers | Visible submarkets + avg odds |
| Bookmakers | Individual bookmaker odds | Average odds only |
| Odds History | Available | Not available |
| Structure | By bookmaker | By submarket (avg odds) |
Preview mode (--preview-only) is useful for quick exploration, testing data format, or light monitoring with reduced resource usage.
All CLI options can be set via environment variables β useful for Docker or CI/CD.
View all environment variables
| Variable | CLI Option | Description |
|---|---|---|
OH_SPORT |
--sport |
Sport to scrape |
OH_LEAGUES |
--league |
Comma-separated leagues |
OH_MARKETS |
--market |
Comma-separated markets |
OH_STORAGE |
--storage |
Storage type (local/remote) |
OH_FORMAT |
--format |
Output format (json/csv) |
OH_FILE_PATH |
--output |
Output file path |
OH_HEADLESS |
--headless |
Run in headless mode |
OH_CONCURRENCY |
--concurrency |
Number of concurrent tasks |
OH_REQUEST_DELAY |
--request-delay |
Delay between requests (sec) |
OH_PROXY_URL |
--proxy-url |
Proxy server URL |
OH_PROXY_USER |
--proxy-user |
Proxy username |
OH_PROXY_PASS |
--proxy-pass |
Proxy password |
OH_USER_AGENT |
--user-agent |
Custom browser user agent |
OH_LOCALE |
--locale |
Browser locale |
OH_TIMEZONE |
--timezone |
Browser timezone ID |
export OH_SPORT=football
export OH_HEADLESS=true
export OH_PROXY_URL=http://proxy.example.com:8080
oddsharvester upcoming -d 20250301 -m 1x2pip install oddsharvestergit clone https://github.com/jordantete/OddsHarvester.git
cd OddsHarvester
pip install uv
uv syncManual setup (venv + pip or poetry)
python3 -m venv .venv
source .venv/bin/activate # Unix/macOS
# .venv\Scripts\activate # Windows
pip install . --use-pep517
# or: poetry installVerify installation:
oddsharvester --help# Build
docker build -t odds-harvester:local --target local-dev .
# Run
docker run --rm odds-harvester:local \
python3 -m oddsharvester upcoming -s football -d 20250301 -m 1x2 --headless
# Or with environment variables
docker run --rm \
-e OH_SPORT=football \
-e OH_HEADLESS=true \
odds-harvester:local python3 -m oddsharvester upcoming -d 20250301 -m 1x2Cloud Deployment (AWS Lambda + Serverless)
OddsHarvester can be deployed on AWS Lambda using the Serverless Framework with a Docker image (Playwright exceeds Lambda's 50MB deployment limit).
Setup:
- Build the Docker image and push to ECR
- Configure
serverless.yamlat the project root:- Set your AWS region, S3 bucket ARN, and IAM permissions
- Default function:
scanAndStoreOddsPortalDataV2(2048MB, 360s timeout) - Triggers via EventBridge every 2 hours by default
- Deploy:
sls deployRefer to the Serverless Framework docs for detailed setup instructions.
Contributions are welcome! Submit an issue or pull request. Please follow the project's coding standards and include clear descriptions for any changes.
This package is intended for educational purposes only. The author is not affiliated with or endorsed by oddsportal.com. Use responsibly and ensure compliance with their terms of service and applicable laws.