A production-style movie recommender built with open movie datasets, hybrid ranking, a FastAPI backend, and a Streamlit interface. The system combines content similarity, collaborative filtering, audience-review signals, and lightweight natural-language query understanding so recommendations feel targeted instead of generic.
- Audience-aware hybrid ranking instead of genre-only matching
- Natural-language search with typed or selected movie seeds
- Content-based, collaborative, and neural recommenders in one pipeline
- FastAPI inference service plus Streamlit product UI
- Open-source LLM support through Ollama or Hugging Face Transformers
- MLflow-ready training and evaluation workflow
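The hybrid ranking idea in the list above can be illustrated with a small weighted blend. This is a minimal sketch, not the project's actual ranker: the function names (`min_max_normalize`, `blend_scores`, `top_k`), the signal names, and the choice of min-max normalization are all assumptions made for illustration.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so signals on different scales can be blended."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # A constant signal carries no ranking information.
        return {movie: 0.0 for movie in scores}
    return {movie: (s - lo) / (hi - lo) for movie, s in scores.items()}


def blend_scores(
    signals: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Weighted sum of normalized per-signal scores (hypothetical blend)."""
    total = sum(weights.get(name, 0.0) for name in signals)
    if total <= 0:
        raise ValueError("at least one signal needs a positive weight")
    blended: Dict[str, float] = {}
    for name, scores in signals.items():
        w = weights.get(name, 0.0) / total
        for movie, s in min_max_normalize(scores).items():
            blended[movie] = blended.get(movie, 0.0) + w * s
    return blended


def top_k(blended: Dict[str, float], k: int) -> List[str]:
    """Highest blended score first; ties are broken by title for stability."""
    return sorted(blended, key=lambda m: (-blended[m], m))[:k]


signals = {
    "content": {"A": 0.9, "B": 0.1},
    "collaborative": {"A": 2.0, "B": 4.0, "C": 3.0},
}
weights = {"content": 0.5, "collaborative": 0.5}
ranking = top_k(blend_scores(signals, weights), k=2)
```

A movie missing from one signal simply contributes nothing for that signal, which is one simple way to handle titles that have metadata but no interaction history.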
flowchart LR
U[User] --> UI[Streamlit UI]
UI --> API[FastAPI API]
API --> SERVICE[MovieRecommenderService]
SERVICE --> HYBRID[Hybrid Ranker]
HYBRID --> CONTENT[Content Recommender]
HYBRID --> AUDIENCE[Audience Signal Ranker]
HYBRID --> COLLAB[Collaborative Models]
CONTENT --> BUNDLE[(Model Bundle)]
AUDIENCE --> BUNDLE
COLLAB --> BUNDLE
flowchart TD
ML[MovieLens ratings and tags] --> PREP[Preprocess and feature engineering]
META[IMDb and metadata catalog] --> PREP
PREP --> FEATURES[Content and audience features]
FEATURES --> TRAIN[Train hybrid models]
TRAIN --> EVAL[Evaluate and tune weights]
EVAL --> BUNDLE[(Saved model bundle)]
BUNDLE --> API[FastAPI inference]
BUNDLE --> UI[Streamlit recommendations]
More detail lives in docs/architecture.md.
src/movie_recommender/
  api/           FastAPI schemas and app wiring
  cli/           Training and serving commands
  config/        Project settings
  data/          Downloading, catalog prep, preprocessing
  features/      Vectorization and content feature stores
  llm/           Query parsing and explanation backends
  models/        Matrix factorization and autoencoder models
  ranking/       Audience-aware ranking logic
  recommenders/  Content, popularity, SVD, and hybrid ranking
  services/      Training, evaluation, inference orchestration
apps/            Streamlit application
api/             API entrypoint
tests/           Regression and integration tests
docs/            Deployment notes, architecture, and assets
- Create a virtual environment and install dependencies:
$ python3 -m venv .venv
$ source .venv/bin/activate
$ pip install -e ".[dev]"
- Prepare a free movie catalog and optional metadata:
$ movie-recommender prepare-data --dataset starter
This expands the recommendation catalog with a broader external movie list by default. For a lighter local setup, add --catalog-source none.
- Train the models and save the bundle:
$ movie-recommender train --dataset starter
- Evaluate the bundle:
$ movie-recommender evaluate
- Run the API:
$ movie-recommender serve-api
- Run the Streamlit app:
$ movie-recommender serve-ui
The downloader supports these built-in free presets:
- starter: quickest local iteration and testing
- expanded: larger catalog for broader experiments
- benchmark: larger benchmark-scale training
- classic: compact classic benchmark
You can also pass a direct HTTPS zip URL or a local .zip catalog path with --dataset.
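The three accepted forms of --dataset (preset name, HTTPS zip URL, local .zip path) could be told apart roughly as follows. This is a sketch under stated assumptions: `classify_dataset_arg` is a hypothetical helper, and the real CLI may validate its input differently.

```python
from urllib.parse import urlparse

# Preset names as listed in this README.
PRESETS = ("starter", "expanded", "benchmark", "classic")


def classify_dataset_arg(value: str) -> str:
    """Return 'preset', 'url', or 'local' for a --dataset value (hypothetical)."""
    if value in PRESETS:
        return "preset"
    if "://" in value:
        parsed = urlparse(value)
        # Only HTTPS zip URLs are accepted in this sketch.
        if parsed.scheme == "https" and parsed.path.lower().endswith(".zip"):
            return "url"
        raise ValueError(f"unsupported dataset URL: {value}")
    if value.lower().endswith(".zip"):
        return "local"
    raise ValueError(f"unknown dataset: {value}")
```

Checking for `://` before parsing keeps Windows-style paths such as `C:\data\movies.zip` from being mistaken for URLs with a one-letter scheme.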
The app keeps interaction data for personalization, then expands the title catalog with a broader external source so you can search and recommend far more movies than the starter interaction set alone.
Defaults:
- catalog source: imdb
- catalog limit: 250000
- minimum votes: 25
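To make the defaults concrete, here is a small sketch of how a vote threshold and a catalog limit might be applied to external titles. It rests on two assumptions that the text does not confirm: that a limit of 0 means "no limit", and that titles are ordered by vote count before truncation. `Title` and `filter_catalog` are hypothetical names.

```python
from typing import Iterable, List, NamedTuple


class Title(NamedTuple):
    title: str
    votes: int


def filter_catalog(
    titles: Iterable[Title],
    min_votes: int = 25,
    limit: int = 250_000,
) -> List[Title]:
    """Drop low-vote titles, then keep at most `limit` of the most-voted ones."""
    kept = [t for t in titles if t.votes >= min_votes]
    kept.sort(key=lambda t: t.votes, reverse=True)
    # Assumption: limit == 0 disables truncation.
    return kept if limit == 0 else kept[:limit]


catalog = [Title("A", 10), Title("B", 30), Title("C", 500)]
default_view = filter_catalog(catalog)
```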
Use a lighter setup:
$ movie-recommender prepare-data --dataset starter --catalog-source none
Use a broader setup:
$ movie-recommender prepare-data --dataset starter --catalog-source imdb --catalog-limit 0
The system supports two open-source LLM adapters:
- ollama: connects to a local Ollama server and can use models like llama3
- transformers: loads a local Hugging Face model such as google/flan-t5-small
Both backends are optional. If neither is available, the system falls back to deterministic query parsing and template-based explanations.
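A deterministic fallback parser of this kind can be as simple as keyword and regex matching. The sketch below is illustrative only: `ParsedQuery`, `parse_query`, and the genre vocabulary are assumptions, not the project's actual parser.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical genre vocabulary for the sketch.
KNOWN_GENRES = ("comedy", "drama", "horror", "thriller", "romance", "sci-fi")


@dataclass
class ParsedQuery:
    genres: List[str] = field(default_factory=list)
    year_from: Optional[int] = None
    year_to: Optional[int] = None


def parse_query(text: str) -> ParsedQuery:
    """Extract genres and a four-digit decade such as '1990s' without an LLM."""
    lowered = text.lower()
    genres = [
        g for g in KNOWN_GENRES
        if re.search(rf"\b{re.escape(g)}\b", lowered)
    ]
    parsed = ParsedQuery(genres=genres)
    decade = re.search(r"\b((?:19|20)\d0)s\b", lowered)
    if decade:
        parsed.year_from = int(decade.group(1))
        parsed.year_to = parsed.year_from + 9
    return parsed


query = parse_query("A dark sci-fi thriller from the 1990s")
```

Because the rules are fixed, the same query always yields the same filters, which keeps the fallback path easy to cover with regression tests.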
The dataset downloader uses streamed HTTPS downloads and automatically falls back to curl when Python SSL verification fails on local proxy or custom-certificate setups.
If your machine uses a custom root certificate, run:
$ movie-recommender prepare-data --ca-bundle /path/to/certificate.pem
You can also set:
$ export MOVIE_RECOMMENDER_CA_BUNDLE=/path/to/certificate.pem
As a last resort on a trusted network:
$ movie-recommender prepare-data --insecure-download
If you prefer to download the archive yourself in a browser or with another tool:
$ movie-recommender prepare-data --dataset ~/Downloads/starter.zip
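The download behaviour described in this section (streamed HTTPS first, curl when Python's SSL verification fails, optional CA bundle) might look roughly like the sketch below. It is not the project's downloader: the helper names are hypothetical, and the precedence of the --ca-bundle flag over MOVIE_RECOMMENDER_CA_BUNDLE is an assumption.

```python
import os
import shutil
import ssl
import subprocess
import urllib.error
import urllib.request
from typing import List, Mapping, Optional


def resolve_ca_bundle(cli_value: Optional[str],
                      env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Assumed precedence: CLI flag, then environment variable, then system default."""
    env = os.environ if env is None else env
    return cli_value or env.get("MOVIE_RECOMMENDER_CA_BUNDLE") or None


def build_curl_command(url: str, dest: str, ca_bundle: Optional[str] = None,
                       insecure: bool = False) -> List[str]:
    """Build the argument list for the curl fallback."""
    cmd = ["curl", "--fail", "--location", "--silent", "--show-error",
           "--output", dest]
    if insecure:
        cmd.append("--insecure")
    elif ca_bundle:
        cmd += ["--cacert", ca_bundle]
    cmd.append(url)
    return cmd


def download(url: str, dest: str, ca_bundle: Optional[str] = None,
             insecure: bool = False, chunk_size: int = 1 << 20) -> None:
    """Stream the archive to disk; retry with curl only on SSL failures."""
    context = ssl.create_default_context(cafile=ca_bundle)
    if insecure:
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, context=context) as resp, \
                open(dest, "wb") as out:
            shutil.copyfileobj(resp, out, chunk_size)
    except (ssl.SSLError, urllib.error.URLError) as exc:
        # urllib wraps handshake failures in URLError; inspect the cause.
        reason = getattr(exc, "reason", exc)
        if not isinstance(reason, ssl.SSLError):
            raise
        subprocess.run(build_curl_command(url, dest, ca_bundle, insecure),
                       check=True)
```

Only SSL-related failures trigger the fallback here, so ordinary errors such as a 404 or a DNS failure still surface directly instead of being retried through curl.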