GZWQ/multilingual-image-recipe-retrieval

multilingual-image-recipe-retrieval

Image-recipe data for "Mitigating Cross-modal Representation Bias for Multicultural Image-to-Recipe Retrieval" (MM 2025).

arXiv and paper

Data preparation

The training files can be downloaded from here. There are five directories, one per culture. Taking the indonesia directory as an example, id_train.pkl, id_val.pkl and id_test.pkl contain the recipe information in dictionary format, where the key is the recipe id and the value is the recipe info (title, ingredients, instructions, dish image).
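As a minimal sketch of reading one of these split files, the snippet below loads a pickled dictionary and iterates over its recipes. To stay runnable without the download, it first writes a toy file; the recipe id `r001` and the field names `title`, `ingredients`, `instructions` and `image` are assumptions for illustration and may differ from the keys used in the released files.

```python
import pickle
import tempfile
from pathlib import Path


def load_split(path):
    """Load one split file (e.g. id_train.pkl) into a dict keyed by recipe id."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy stand-in for id_train.pkl. The field names are assumptions;
# inspect the real files to see the actual keys.
toy = {
    "r001": {
        "title": "Nasi goreng",
        "ingredients": ["rice", "egg", "shallot"],
        "instructions": ["Fry the shallot.", "Add the rice and egg."],
        "image": "r001.jpg",
    },
}

toy_path = Path(tempfile.mkdtemp()) / "id_train.pkl"
with open(toy_path, "wb") as f:
    pickle.dump(toy, f)

recipes = load_split(toy_path)
for recipe_id, info in recipes.items():
    # Print the recipe id and the fields stored for it.
    print(recipe_id, sorted(info.keys()))
```

With the real data, point `load_split` at a downloaded file such as id_train.pkl in the indonesia directory and print one entry to check the actual field names. Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.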

The ingredient information is under the ingredient/top100 directory.

ing2label.pkl contains the 100 most frequent ingredients of Indonesian cuisine. It is a dictionary whose keys are the original ingredient names and whose values are the corresponding ingredient labels.

id2labels_train.pkl, id2labels_val.pkl and id2labels_test.pkl contain the ingredient labels of each recipe for the training, validation and test sets. Each is a dictionary with the recipe id as key and the list of ingredient labels as value.
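One common way to use such per-recipe label lists is to turn them into multi-hot target vectors. The sketch below does this with toy data and assumes that the labels are integers in the range 0 to 99; the actual label type in the released files may differ, and the function name `to_multi_hot` is hypothetical, not part of this repository.

```python
def to_multi_hot(labels, num_classes=100):
    """Turn a list of ingredient labels into a multi-hot vector.

    Assumes each label is an integer index in [0, num_classes).
    """
    vec = [0.0] * num_classes
    for label in labels:
        vec[label] = 1.0
    return vec


# Toy stand-ins for ing2label.pkl and id2labels_train.pkl (assumed shapes).
ing2label = {"rice": 0, "egg": 1, "shallot": 2}
id2labels_train = {"r001": [0, 1], "r002": [2]}

# Build one multi-hot target per recipe id.
targets = {rid: to_multi_hot(labels) for rid, labels in id2labels_train.items()}

print(len(targets["r001"]), sum(targets["r001"]))
```

Duplicate labels in a list are harmless here, because setting the same position to 1.0 twice leaves the vector unchanged.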

More details about the dataset can be found on Hugging Face.

Training the baseline model OpenCLIP

python -m open_clip_train.main_combined_all --lang id_malaysia_thailand_vietnam_india --lr 1e-5 --batch-size 8 --epochs 100 

The training log can be found in ./logs/s1_bs8_e100.log. The checkpoint can be downloaded from here.

To evaluate the model:

python -m open_clip_train.main_combined_all --lang id_malaysia_thailand_vietnam_india --lr 1e-5 --batch-size 8 --epochs 100 --eval

Create the ingredient dictionary

The data covers five cultures. Create an ingredient dictionary for each one so that the proposed debiasing method can use it for ingredient debiasing.

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang id

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang malaysia

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang thailand

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang vietnam

python -m open_clip_train.main_zeroshot --dataset Cookpad --ingredient_num 100 --lang india

The dictionaries are available here.
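The five commands above differ only in the `--lang` value, so they can be generated in a loop. The sketch below only builds and prints the command lines, using the module name and flags exactly as listed above; the variable names are illustrative, and actually running the commands (for example with `subprocess.run`) is left out because it requires the repository and data to be set up.

```python
import shlex

# Language codes taken from the commands listed above.
LANGS = ["id", "malaysia", "thailand", "vietnam", "india"]

commands = []
for lang in LANGS:
    # Same arguments as the documented commands; only --lang changes.
    argv = [
        "python", "-m", "open_clip_train.main_zeroshot",
        "--dataset", "Cookpad",
        "--ingredient_num", "100",
        "--lang", lang,
    ]
    commands.append(argv)

for argv in commands:
    # shlex.join produces a shell-safe command line for copy and paste.
    print(shlex.join(argv))
```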

Training the HT with Ingredient Debiasing

python -m open_clip_train.main_combined_all_debiasing --lang id_malaysia_thailand_vietnam_india --epochs 100 --lr 1e-5 --batch-size 8 --ingredient_num 100 --ingredient_weight 1e-4

The training log can be found in ./logs/s2_bs8_e100.log. The checkpoint can be downloaded from here.

To evaluate the model:

python -m open_clip_train.main_combined_all_debiasing --lang id_malaysia_thailand_vietnam_india --epochs 100 --lr 1e-5 --batch-size 8 --ingredient_num 100 --ingredient_weight 1e-4 --eval

Acknowledgement

My implementation is based on OpenCLIP and query2labels. Many thanks to their authors for their contributions.
