anibridge-mappings is a dataset and pipeline for mapping episode-level relationships between anime entries across various databases, including AniDB, AniList, MAL, TMDB, TVDB, and IMDB. The schema was designed for use in the AniBridge project, but the dataset is open for anyone to use and contribute to.
The mapping payload is generated by a Python pipeline that merges, validates, and serializes data from multiple trusted sources.
Releases are updated daily and you can explore the dataset interactively at https://mappings.anibridge.eliasbenb.dev.
A huge thank you to the primary mappings maintainer, @LuceoEtzio, for contributing over 4,000 mapping edits! ❤️
The latest mappings can be downloaded from the releases page. The release assets include:
- `mappings.json`: the main dataset in JSON format.
- `mappings.json.zst`: the main dataset compressed with zstd.
- `mappings.min.json`: the main dataset in minified JSON format.
- `provenance.zip`: a zip file containing provenance data for each mapping entry (for internal use at https://mappings.anibridge.eliasbenb.dev).
- `stats.json`: summary counts about the dataset.
Note: releases are updated daily and tagged with a v{major} version. Breaking changes to the schema or pipeline increment the major version; patch releases within the same major version fix mapping errors or make minor, non-breaking schema adjustments.
- Fetch sources: Download upstream datasets and metadata feeds.
- Build ID graph: Collect cross-database ID links from sources.
- Collect metadata: Fetch relevant metadata (episode counts, durations, season info, etc.) from sources.
- Build episode graph: Normalize and merge episode mappings from all sources.
- Infer mappings: Use techniques like transitive closure and metadata alignment to infer missing episode mappings.
- Apply edits: Overlay mapping overrides from mappings.edits.yaml onto the aggregated data.
- Validate & prune: Validate episode ranges against metadata and remove invalid, overlapping, or inconsistent mappings.
- Emit schema: Serialize to the mappings.schema.json format.
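As a rough illustration of the transitive-closure idea mentioned in the "Infer mappings" step, the sketch below groups descriptors that are linked directly or indirectly. It is not the pipeline's actual code: the function name `connected_components`, the sample links, and the choice to treat ID links as undirected are all assumptions made for the example.

```python
from collections import defaultdict


def connected_components(links):
    """Group descriptors that are linked directly or transitively (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving keeps trees shallow
            node = parent[node]
        return node

    # Merge the two groups on either side of every link.
    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    groups = defaultdict(set)
    for node in parent:
        groups[find(node)].add(node)
    return list(groups.values())


# Hypothetical ID links: anidb <-> mal and mal <-> anilist imply anidb <-> anilist.
links = [
    ("anidb:1:R", "mal:10"),
    ("mal:10", "anilist:100"),
    ("tvdb_show:7:s1", "tmdb_show:9:s1"),
]
components = connected_components(links)
```

Here `components` contains two groups: one holding the AniDB, MAL, and AniList descriptors, and one holding the TVDB and TMDB descriptors.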
| Source | Metadata | ID Mappings | Episode Mappings | Providers |
|---|---|---|---|---|
| Anime-Lists/anime-lists | No | Yes | Yes | AniDB, IMDB, TMDB, TVDB |
| manami-project/anime-offline-database | Not Yet | Yes | No | AniDB, AniList, MAL |
| notseteve/AnimeAggregations | Yes | Yes | No | AniDB, IMDB, MAL, TMDB |
| varoOP/shinkro-mapping | No | Yes | Yes | MAL, TMDB, TVDB |
| QLever | Yes | Yes | No | AniDB, AniList, IMDB, MAL, TMDB, TVDB |
| AniList GraphQL | Yes | Not Yet | No | AniList |
| MyAnimeList API | Yes | No | No | MAL |
| TMDB API | Yes | No | No | TMDB |
| TVDB API | Yes | No | No | TVDB |
Note: "Not Yet" indicates potential future work.
The output is a JSON object where each key is a source descriptor and each value is a map of target descriptors. Mappings are unidirectional: a mapping from A -> B does not imply B -> A, so reverse lookups require their own explicit entries. Descriptors use the format:
provider:id[:scope]
- `provider`: one of `anidb`, `anilist`, `imdb_movie`, `imdb_show`, `mal`, `tmdb_show`, `tmdb_movie`, `tvdb_show`, `tvdb_movie`.
- `id`: the provider-specific identifier (e.g. AniDB ID `1234` or IMDB ID `tt1234567`).
- `scope`: optional; denotes a subset of the entry. The notation is flexible and differs by provider:
  - `imdb_show` | `tmdb_show` | `tvdb_show`: require a scope denoting the season, in the format `s{season_number}` (e.g. `s1`, `s0`).
  - `anidb`: requires a scope denoting the episode type: `R` (regular), `S` (specials), `O` (other), `C` (credits), `T` (trailers), `P` (parodies).
  - `anilist` | `mal`: omit scopes, as they have no comparable concept of seasons or episode types (e.g. `anilist:12345`).
  - `imdb_movie` | `tmdb_movie` | `tvdb_movie`: omit scopes, since movies don't have seasons or episode types (e.g. `imdb_movie:tt1234567`).
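The descriptor format can be split mechanically. The following is a minimal sketch, assuming the scope rules described above; `Descriptor`, `parse_descriptor`, and `SCOPED_PROVIDERS` are hypothetical names for this example, not part of the project's API.

```python
from typing import NamedTuple, Optional

# Providers that require a scope, per the rules described above.
SCOPED_PROVIDERS = {"anidb", "imdb_show", "tmdb_show", "tvdb_show"}


class Descriptor(NamedTuple):
    provider: str
    id: str
    scope: Optional[str]


def parse_descriptor(text: str) -> Descriptor:
    """Split 'provider:id[:scope]' into its parts and check scope presence."""
    parts = text.split(":")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"invalid descriptor: {text!r}")
    provider, entry_id = parts[0], parts[1]
    scope = parts[2] if len(parts) == 3 else None
    # A scope must be present exactly when the provider is a scoped one.
    if (provider in SCOPED_PROVIDERS) != (scope is not None):
        raise ValueError(f"unexpected scope usage for provider {provider!r}: {text!r}")
    return Descriptor(provider, entry_id, scope)


show = parse_descriptor("tvdb_show:98765:s1")
entry = parse_descriptor("anilist:12345")
```

With these inputs, `show.scope` is `"s1"` and `entry.scope` is `None`; a descriptor such as `tvdb_show:98765` (missing its season scope) raises `ValueError`.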
Each target descriptor maps to an object whose keys are source ranges and whose values are the corresponding target ranges. In this dataset, both keys and values denote episode ranges.
Source ranges must be a single contiguous range:
x[-y]
Target ranges support comma-separated segments and an optional trailing ratio:
x[-y][,x2[-y2]...][|ratio]
- `x`: starting episode number (1-based).
- `y`: optional ending episode number (inclusive). If it is omitted after the hyphen (e.g. `14-`), the range is open-ended.
- `ratio`: optional ratio indicating the 'weight' of each episode in a target mapping. A positive ratio `n` indicates each source episode spans `n` target episodes. A negative ratio `-n` indicates each source episode spans `1/n` target episodes.
- Multiple ranges can be comma-separated to denote non-contiguous mappings. Note: non-contiguous ranges are only supported on the target side.
- The ratio must appear at the end of the target range string.
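To make the range grammar concrete, here is a small sketch that parses a target range string and resolves one source episode to its target episodes. It reflects one plausible reading of the rules above, not the pipeline's implementation. The helper names (`parse_segment`, `parse_target`, `nth_target`, `map_episode`) are hypothetical, and treating a bare `x` as a single episode is an assumption.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, Optional[int]]  # (start, end); end=None means open-ended


def parse_segment(text: str) -> Segment:
    """Parse 'x', 'x-y', or 'x-' into (start, end)."""
    start, sep, end = text.partition("-")
    if not sep:
        return int(start), int(start)  # assumption: bare 'x' is a single episode
    return int(start), (int(end) if end else None)


def parse_target(text: str) -> Tuple[List[Segment], Optional[int]]:
    """Parse 'x[-y][,x2[-y2]...][|ratio]' into (segments, ratio)."""
    ranges, sep, ratio = text.partition("|")
    segments = [parse_segment(part) for part in ranges.split(",")]
    return segments, (int(ratio) if sep else None)


def nth_target(segments: List[Segment], index: int) -> Optional[int]:
    """Return the index-th (0-based) episode across the flattened segments."""
    for start, end in segments:
        if end is None:
            return start + index
        length = end - start + 1
        if index < length:
            return start + index
        index -= length
    return None


def map_episode(source_range: str, target: str, episode: int) -> List[int]:
    """Resolve one source episode to its target episode(s)."""
    src_start, src_end = parse_segment(source_range)
    if episode < src_start or (src_end is not None and episode > src_end):
        return []
    segments, ratio = parse_target(target)
    ratio = ratio or 1
    offset = episode - src_start
    if ratio > 0:
        wanted = range(offset * ratio, (offset + 1) * ratio)  # n targets per source
    else:
        wanted = [offset // -ratio]  # n sources share one target
    found = (nth_target(segments, i) for i in wanted)
    return [ep for ep in found if ep is not None]
```

Under these assumptions, `map_episode("1-12", "1-6,8-13", 7)` yields `[8]` and `map_episode("13-", "14-|2", 14)` yields `[16, 17]`.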
    {
      "tmdb_show:500:s1": {
        "anilist:1003": {
          "1-12": "1-12", // source episodes 1-12 map to target episodes 1-12
          "14-": "13-", // source episodes 14 and onward map to target episodes 13 and onward
        },
        "mal:2003": {
          "1-12": "1-6,8-13", // source episodes 1-12 map to target episodes 1-6 and 8-13 (skipping 7)
          "13-": "14-|2", // source episodes 13 and onward map to target episodes 14 and onward at double granularity
        },
      },
    }

Mapping overrides live in mappings.edits.yaml. The format mirrors the schema structure: a source descriptor maps to target descriptors, which in turn map source ranges to target ranges.
Example:
    anilist:12345: # Some comment about this mapping
      tvdb_show:98765:s1:
        "1-12": "1-12"
      tmdb_show:54321:s1:
        "1-12": "1-12"

When the pipeline runs, it removes any existing mappings between the specified source and target scopes and replaces them with your entries.
The CLI entrypoint is main.py. Typical usage:
    uv run ./main.py

Options:
- `--out`: output file path (default: `data/out/mappings.json`)
- `--edits`: path to the edits file (default: `mappings.edits.yaml`)
- `--compress`: emit minified and zstd-compressed outputs to `data/out/`
- `--stats`: emit `stats.json` to `data/out/`
- `--provenance`: emit `provenance.zip` containing `manifest.json`, `descriptor-index.json`, and `descriptors/*.json` files
- `--log-level`: set logging verbosity (default: `INFO`)
Note: fetching TMDB and TVDB metadata requires API keys, provided via TMDB_API_KEY and TVDB_API_KEY respectively. MAL ranking metadata needs only MAL_CLIENT_ID and falls back to the public client ID baked into the source when it is unset.
The best way to contribute is by fixing or adding mappings in mappings.edits.yaml. If you need to reference why a mapping was changed, include a comment inside the mapping entry (not at the root level), so the formatter can preserve it.
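When writing an edit, it helps to remember that an entry replaces the whole range map for its source and target pair rather than merging into it. The sketch below illustrates that replace behavior on plain dictionaries. It assumes the edits file has already been loaded into a dict, and `apply_edits` plus the sample data are hypothetical; the real pipeline may do more.

```python
import copy


def apply_edits(mappings: dict, edits: dict) -> dict:
    """Return a copy of mappings where each edited source -> target pair is replaced."""
    result = copy.deepcopy(mappings)
    for source, targets in edits.items():
        for target, ranges in targets.items():
            # Drop whatever was there for this pair and use the edit as-is.
            result.setdefault(source, {})[target] = dict(ranges)
    return result


# Hypothetical aggregated data before edits are applied.
mappings = {
    "anilist:12345": {
        "tvdb_show:98765:s1": {"1-13": "1-13"},
        "mal:1": {"1-": "1-"},
    }
}
# Hypothetical edits, shaped like mappings.edits.yaml after loading.
edits = {
    "anilist:12345": {
        "tvdb_show:98765:s1": {"1-12": "1-12"},
        "tmdb_show:54321:s1": {"1-12": "1-12"},
    }
}
merged = apply_edits(mappings, edits)
```

In `merged`, the `tvdb_show:98765:s1` entry holds only `"1-12": "1-12"`, the new `tmdb_show:54321:s1` entry is added, and the untouched `mal:1` target is preserved.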
For reference, the top-level structure of the output maps each source descriptor to its target descriptors:

    {
      "tvdb_show:2:s0": {
        "anilist:1001": {}, // from tvdb id 2, season 0 to anilist id 1001
        "mal:2001": {}, // from tvdb id 2, season 0 to mal id 2001
      },
      "tmdb_show:3:s1": {
        "anilist:1002": {}, // from tmdb id 3, season 1 to anilist id 1002
      },
    }
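Because mappings are unidirectional, a lookup only succeeds in the direction that has an explicit entry. The following sketch shows this with an inline copy of the structure (comments removed so it is valid JSON); the `lookup` helper is a hypothetical name for this example.

```python
import json

raw = """
{
  "tvdb_show:2:s0": {"anilist:1001": {}, "mal:2001": {}},
  "tmdb_show:3:s1": {"anilist:1002": {}}
}
"""
mappings = json.loads(raw)


def lookup(mappings: dict, source: str, target: str):
    """Return the range map for source -> target, or None if no explicit entry exists."""
    return mappings.get(source, {}).get(target)


forward = lookup(mappings, "tvdb_show:2:s0", "anilist:1001")
reverse = lookup(mappings, "anilist:1001", "tvdb_show:2:s0")
```

`forward` is the (empty) range map stored for that pair, while `reverse` is `None` because the sample data has no `anilist:1001` source entry.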