multilingual-image-recipe-retrieval

Image-recipe data for "Mitigating Cross-modal Representation Bias for Multicultural Image-to-Recipe Retrieval" (MM 2025).

arXiv and paper

Data preparation

The training files can be downloaded from here. There are five directories, one per culture. Taking the indonesia directory as an example, id_train.pkl, id_val.pkl and id_test.pkl contain the recipe information in dictionary format, where the recipe id is the key and the recipe info (title, ingredients, instructions, dish image) is the value.
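
As a minimal sketch of how one of these split files might be inspected, the snippet below loads a pickle and prints the fields of the first recipe. The helper name load_recipes and the field names in the stand-in record are assumptions for illustration only; check the real keys after loading the downloaded files.

```python
import pickle
import tempfile
from pathlib import Path


def load_recipes(path):
    """Load one split file (e.g. id_train.pkl) as a dict keyed by recipe id."""
    with open(path, "rb") as f:
        recipes = pickle.load(f)
    if not isinstance(recipes, dict):
        raise TypeError(f"expected a dict, got {type(recipes).__name__}")
    return recipes


# Stand-in file so the sketch runs without the real download.
# The field names below are assumptions, not the dataset's confirmed keys.
demo = {
    "r1": {
        "title": "Nasi goreng",
        "ingredients": ["nasi", "kecap manis"],
        "instructions": ["Tumis bumbu, masukkan nasi."],
        "image": "r1.jpg",
    }
}
demo_path = Path(tempfile.mkdtemp()) / "id_train.pkl"
with open(demo_path, "wb") as f:
    pickle.dump(demo, f)

recipes = load_recipes(demo_path)
first_id = next(iter(recipes))
print(len(recipes), first_id, sorted(recipes[first_id]))
```

Pickle files can execute arbitrary code when loaded, so only unpickle files obtained from a source you trust.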

The ingredient/top100 directory contains the ingredient information.

ing2label.pkl contains the 100 most frequent ingredients of Indonesian cuisine. It is a dictionary with the original ingredient name as key and the corresponding ingredient label as value.

id2labels_train.pkl, id2labels_val.pkl and id2labels_test.pkl contain the ingredient labels of each recipe in the training, validation and test splits. Each is a dictionary with the recipe id as key and the list of ingredient labels as value.
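
One common way to consume per-recipe label lists in multi-label classification is to turn each list into a multi-hot vector. The sketch below assumes the labels are integer indices in the range 0 to 99; that is an assumption about the file contents, and the helper to_multi_hot and the toy data are hypothetical rather than part of this repository.

```python
def to_multi_hot(labels, num_labels=100):
    """Turn a list of integer ingredient labels into a 0/1 vector."""
    vec = [0] * num_labels
    for label in labels:
        if not 0 <= label < num_labels:
            raise ValueError(f"label {label} outside [0, {num_labels})")
        vec[label] = 1
    return vec


# Toy stand-in for id2labels_train.pkl (assumed: recipe id -> list of ints).
id2labels = {"r1": [0, 2], "r2": [1]}

# A tiny label space of 3 keeps the example readable; the files use 100.
targets = {rid: to_multi_hot(labels, num_labels=3)
           for rid, labels in id2labels.items()}
print(targets)
```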

More details about the dataset can be found on Hugging Face.

Training the baseline model OpenCLIP

python -m open_clip_train.main_combined_all --lang id_malaysia_thailand_vietnam_india --lr 1e-5 --batch-size 8 --epochs 100

The training log can be found in ./logs/s1_bs8_e100.log. The checkpoint can be downloaded from here.

To evaluate the model:

python -m open_clip_train.main_combined_all --lang id_malaysia_thailand_vietnam_india --lr 1e-5 --batch-size 8 --epochs 100 --eval

Create the ingredient dictionary

There are five regions, and an ingredient dictionary must be created for each culture so that the proposed debiasing method can use it for ingredient debiasing.

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang id

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang malaysia

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang thailand

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang vietnam

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang india
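
The five commands above differ only in the value of --lang, so they can be generated in a loop. The following is a small sketch that builds the same command lines; the helper name zeroshot_command is hypothetical, and the flags are copied from the commands above. The actual launch is left commented out so the snippet can be run safely anywhere.

```python
import shlex
import subprocess  # only needed if the run line below is uncommented

LANGS = ["id", "malaysia", "thailand", "vietnam", "india"]


def zeroshot_command(lang, ingredient_num=100, dataset="Cookpad"):
    """Build the argv list for one ingredient-dictionary run."""
    return [
        "python", "-m", "open_clip_train.main_zeroshot",
        "--dataset", dataset,
        "--ingredient_num", str(ingredient_num),
        "--lang", lang,
    ]


commands = [zeroshot_command(lang) for lang in LANGS]
for cmd in commands:
    print(shlex.join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually launch the run
```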

The dictionaries are available here.

Training the HT with Ingredient Debiasing

python -m open_clip_train.main_combined_all_debiasing --lang id_malaysia_thailand_vietnam_india --epochs 100 --lr 1e-5 --batch-size 8 --ingredient_num 100 --ingredient_weight 1e-4

The training log can be found in ./logs/s2_bs8_e100.log. The checkpoint can be downloaded from here.

To evaluate the model:

python -m open_clip_train.main_combined_all_debiasing --lang id_malaysia_thailand_vietnam_india --epochs 100 --lr 1e-5 --batch-size 8 --ingredient_num 100 --ingredient_weight 1e-4 --eval

Acknowledgement

My implementation is based on OpenCLIP and query2labels. Many thanks for their contributions.
