Code for the IEEE TMM paper "Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning".
- Get the dataset. Follow the steps in `data/README.md`; these include the steps to get the pretrained BERT embeddings and the parsed results of sentences.
- Install CUDA 10.0 if it is not available already.
- Install Anaconda if it is not available already, and create a new environment. You need to install a few packages, namely PyTorch 1.1.0, torchvision, and AllenNLP:
```bash
wget https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh
conda update -n base -c defaults conda
conda create --name CMR python=3.6
source activate CMR
conda install numpy pyyaml setuptools cmake cffi tqdm scipy ipython mkl mkl-include cython typing h5py pandas nltk spacy numpydoc scikit-learn jpeg
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
pip install -r allennlp-requirements.txt
pip install --no-deps allennlp==0.8.0
python -m spacy download en_core_web_sm

# this one is optional, but it should help make things faster
pip uninstall pillow && CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
```
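After installation, it is worth sanity-checking that the pinned packages import with the expected versions before training. The helper below is a minimal illustrative sketch (the function names are not part of this repo; the version pins are the ones listed above):

```python
import importlib

# Packages pinned in the setup steps above and the versions this repo expects.
REQUIRED = {"torch": "1.1.0", "torchvision": "0.3.0", "allennlp": "0.8.0"}

def parse_version(v):
    """Turn a version string like '1.1.0' or '0.3.0+cu100' into a comparable tuple."""
    return tuple(int(part) for part in v.split("+")[0].split(".")[:3])

def check_environment(required=REQUIRED):
    """Return {package: (installed_version, meets_requirement)} for each pin."""
    report = {}
    for name, want in required.items():
        try:
            mod = importlib.import_module(name)
            have = getattr(mod, "__version__", None)
            ok = have is not None and parse_version(have) >= parse_version(want)
        except ImportError:
            have, ok = None, False
        report[name] = (have, ok)
    return report

if __name__ == "__main__":
    for name, (have, ok) in check_environment().items():
        print(f"{name}: {have} {'OK' if ok else 'MISSING/MISMATCH'}")
```

Run it inside the `CMR` environment; any `MISSING/MISMATCH` line means the corresponding install step above needs to be repeated.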
- That's it! To activate the environment later, run `source activate CMR`.
- For the next steps, please refer to `models/README.md`.
If you make use of this repository for your research, please cite the following paper:
```bibtex
@article{zhang2021explicit,
  title={Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning},
  author={Zhang, Xi and Zhang, Feifei and Xu, Changsheng},
  journal={IEEE Transactions on Multimedia},
  year={2021},
  publisher={IEEE}
}
```