🎟️ WTW Cinemas Calendar Scraper

Scrapes upcoming film releases from WTW Cinemas across multiple Cornwall locations and publishes per-cinema iCalendar feeds plus a GitHub Pages index for easy subscription.

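Links: Live calendar page (https://evenwebb.github.io/wtw-cinemas-calendar/) · Repository (https://github.com/evenwebb/wtw-cinemas-calendar)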

📚 Table of Contents

⚡ Quick Start
✨ Features
📦 Installation
🚀 Usage
⚙️ Configuration
🤖 GitHub Actions Automation
🌐 GitHub Pages Setup
🧩 Dependencies
🛠️ Troubleshooting
⚠️ Known Limitations
📄 License

⚡ Quick Start

git clone https://github.com/evenwebb/wtw-cinemas-calendar.git
cd wtw-cinemas-calendar
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 cinema_scraper.py

✅ Generated output:

docs/wtw-<cinema>.ics (one per enabled cinema)
docs/index.html
cache/history files (.film_cache.json, .tmdb_cache.json, .release_history.json)
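
To confirm that a run produced usable feeds, the generated files can be inspected with a few lines of Python. This is a minimal sketch, not part of the repository: `count_events` and `summarise` are hypothetical helper names, and it assumes only that each feed is a standard iCalendar file in which every event opens with a `BEGIN:VEVENT` line.

```python
from pathlib import Path


def count_events(ics_text: str) -> int:
    """Count VEVENT blocks in the text of an iCalendar file."""
    return sum(
        1
        for line in ics_text.splitlines()
        if line.strip().upper() == "BEGIN:VEVENT"
    )


def summarise(docs_dir: str = "docs") -> dict:
    """Map each docs/wtw-*.ics file name to its event count (hypothetical helper)."""
    return {
        path.name: count_events(path.read_text(encoding="utf-8"))
        for path in sorted(Path(docs_dir).glob("wtw-*.ics"))
    }


if __name__ == "__main__":
    for name, total in summarise().items():
        print(f"{name}: {total} events")
```

A feed reporting zero events is usually worth a look at the log file before subscribing to it.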

✨ Features

Feature	Description
`🎭 Multi-Cinema Support`	Scrapes any enabled combination of St Austell, Newquay, Wadebridge, and Truro.
`📝 Rich Event Details`	Adds runtime, synopsis, cast, and booking URLs where available.
`💾 Smart Caching`	Uses local film/TMDb caches to reduce unnecessary repeat scraping and API usage.
`🔔 Configurable Notifications`	Optional calendar reminders (day-before, same-day, weekly, or custom time).
`📅 Per-Cinema iCal Feeds`	Generates separate `.ics` files for each cinema with stable deduplicated events.
`🧰 Robust Parsing`	Handles multiple date formats and WTW page structures with retry/backoff requests.
`🌐 GitHub Pages Output`	Builds `docs/index.html` with subscribe links and publishes via Pages.
`🤖 Automated Workflow`	Daily GitHub Actions run with retries, conditional commits, and optional failure issue creation.
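
The "stable deduplicated events" idea can be illustrated with a deterministic event UID: if the same cinema, title, and release date always map to the same UID, calendar clients update an existing entry instead of creating a duplicate. The sketch below is an assumption about how such a scheme might look, not the scraper's actual implementation; `stable_uid`, `dedupe`, and the `@wtw-cinemas-calendar` suffix are hypothetical.

```python
import hashlib
from datetime import date


def stable_uid(cinema: str, title: str, release: date) -> str:
    """Derive a deterministic UID from normalised event fields (hypothetical scheme)."""
    # Normalise case and whitespace so cosmetic differences do not change the UID.
    normalised_title = " ".join(title.split()).lower()
    key = f"{cinema.strip().lower()}|{normalised_title}|{release.isoformat()}"
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()[:16]
    return f"{digest}@wtw-cinemas-calendar"


def dedupe(events):
    """Keep the first event seen for each UID, preserving input order."""
    unique = {}
    for cinema, title, release in events:
        unique.setdefault(stable_uid(cinema, title, release), (cinema, title, release))
    return unique
```

Hashing normalised fields, rather than using a random UUID per run, keeps UIDs identical across daily runs so subscribers do not see repeated events.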

📦 Installation

git clone https://github.com/evenwebb/wtw-cinemas-calendar.git
cd wtw-cinemas-calendar
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

🚀 Usage

python3 cinema_scraper.py

The script fetches upcoming releases for each enabled cinema and regenerates the output in docs/, which can be used locally or published through GitHub Pages.

⚙️ Configuration

Primary settings are in cinema_scraper.py.

Option	Default	Description
`CINEMAS`	all 4 enabled	Cinema locations to scrape (enable/disable individually).
`NOTIFICATION_TIME`	`09:00`	Default reminder time for notifications.
`NOTIFICATIONS`	disabled	Optional VALARM rules in calendar events.
`CACHE_FILE`	`.film_cache.json`	Film details cache file.
`CACHE_EXPIRY_DAYS`	`7`	Film cache retention in days.
`TMDB_CACHE_FILE`	`.tmdb_cache.json`	TMDb enrichment cache file.
`TMDB_CACHE_DAYS`	`30`	TMDb cache retention in days.
`CALENDAR_TIMEZONE` (env)	`Europe/London`	Timezone for generated calendar events.
`TMDB_API_KEY` (env/secret)	unset	Enables TMDb enrichment when set.
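
The environment-driven options and the cache retention setting in the table above can be sketched as follows. This is illustrative only and assumes behaviour the table implies: `load_env_settings` and `is_fresh` are hypothetical names, and the real logic in cinema_scraper.py may differ.

```python
import os
from datetime import datetime, timedelta, timezone

CACHE_EXPIRY_DAYS = 7  # default from the table above


def load_env_settings(env=os.environ) -> dict:
    """Read the two environment-driven options, applying the documented defaults."""
    tz_name = env.get("CALENDAR_TIMEZONE", "Europe/London")
    api_key = env.get("TMDB_API_KEY") or None  # treat an empty string as unset
    return {"timezone": tz_name, "tmdb_enabled": api_key is not None}


def is_fresh(cached_at: datetime, now: datetime = None,
             max_age_days: int = CACHE_EXPIRY_DAYS) -> bool:
    """Return True while a cache entry is younger than the retention window."""
    now = now or datetime.now(timezone.utc)
    return now - cached_at < timedelta(days=max_age_days)
```

For a one-off local run, the variables can be set inline, for example `CALENDAR_TIMEZONE=UTC python3 cinema_scraper.py`.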

🤖 GitHub Actions Automation

This repo includes .github/workflows/scrape_cinema.yml:

⏰ Runs daily at 09:00 UTC
🖱️ Supports manual runs (workflow_dispatch)
🔁 Retries scraper runs before failing (SCRAPER_RUN_ATTEMPTS, default 2)
📝 Commits only changed output/cache/history files
🚨 Optionally opens or updates a GitHub issue on failure (CREATE_FAILURE_ISSUE=true)

Recommended repository secrets:

TMDB_API_KEY (optional)
SCRAPER_RUN_ATTEMPTS (integer)
CREATE_FAILURE_ISSUE (true/false)
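
The retry behaviour described above can be pictured as a small loop around the scraper run. The workflow itself is defined in YAML, so the Python below is only a sketch of the idea; `run_with_attempts` is a hypothetical name, and it assumes that SCRAPER_RUN_ATTEMPTS counts total attempts (so the default of 2 means one retry).

```python
import os
import time


def run_with_attempts(task, attempts=None, delay_seconds=0.0, env=os.environ):
    """Call task() until it succeeds or the attempt budget is exhausted."""
    if attempts is None:
        # Assumption: the variable holds the total number of attempts.
        attempts = int(env.get("SCRAPER_RUN_ATTEMPTS", "2"))
    attempts = max(1, attempts)
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Every attempt failed; surface the last error to the caller.
    raise last_error
```

Only after the final attempt fails would a step such as failure issue creation need to run.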

🌐 GitHub Pages Setup

Open Settings -> Pages in GitHub.
Choose Deploy from a branch.
Select branch main and folder /docs.
Save.

Published index page:

https://evenwebb.github.io/wtw-cinemas-calendar/

🧩 Dependencies

Package	Purpose
`requests`	HTTP requests for listings/details/TMDb
`beautifulsoup4`	HTML parsing for listing and detail extraction

🛠️ Troubleshooting

🧱 If no films appear, check whether the WTW page structure has changed.
🔑 If TMDb metadata is missing, check TMDB_API_KEY and quota status.
📜 Review cinema_log.txt for parsing/runtime errors.
🔁 Increase SCRAPER_RUN_ATTEMPTS if failures are intermittent.

⚠️ Known Limitations

🌐 Scraping depends on current WTW site markup and wording.
🎯 TMDb matching is best-effort and may occasionally choose imperfect results.

📄 License

This project is provided as-is for personal use. Please respect the source website terms of service.
