FastDiSS

The official codebase for FastDiSS: Few-step Match Many-step Diffusion Language Model on Sequence-to-Sequence Generation (Accepted to ACL Findings 2026).

Getting started

Our implementation is based on Difformer, with small modifications to the training process. The following commands create a Conda environment (Python 3.9) and install the dependencies and this package:

conda create -n fastdiss python=3.9 pip=23.0
conda activate fastdiss
pip install -r requirements.txt
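
After installation, a quick sanity check can confirm that the tools used in the later sections are reachable from the activated environment. The snippet below is a minimal sketch and not part of this repository: `require_cmd` is a hypothetical helper, and the example call assumes that `requirements.txt` makes `fairseq-preprocess` available on `PATH`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): succeed only if every
# named command is found on PATH, and report the missing ones on stderr.
require_cmd() {
    local missing=0
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run inside the activated fastdiss environment):
#   require_cmd python pip fairseq-preprocess && python --version
```

If the check reports a missing command, re-activate the environment or re-run the install step before continuing.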

Data preparation

We follow the Fairseq instructions to preprocess the translation datasets. To binarize the distilled and tokenized datasets, run the following command (using the IWSLT14 De-En dataset as an example):

fairseq-preprocess \
    --source-lang de --target-lang en \
    --trainpref {PATH-TO-YOUR-DATASET}/train \
    --validpref {PATH-TO-YOUR-DATASET}/valid \
    --testpref {PATH-TO-YOUR-DATASET}/test \
    --destdir data-bin/iwslt14_de_en \
    --workers 20
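
Because `fairseq-preprocess` appends the language code to each `--trainpref`, `--validpref`, and `--testpref` prefix, the command above expects files such as `train.de` and `train.en` in the dataset directory. The following pre-flight check is a sketch under that assumption; `check_parallel_files` is a hypothetical helper, not part of this repository. It verifies that all six files exist and that each source file has the same number of lines as its target file.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of this repository).
# Usage: check_parallel_files DATA_DIR SRC_LANG TGT_LANG
check_parallel_files() {
    local dir="$1"
    local src="$2"
    local tgt="$3"
    local split src_file tgt_file src_n tgt_n
    for split in train valid test; do
        # fairseq-preprocess looks for "<prefix>.<lang>" for each split.
        src_file="$dir/$split.$src"
        tgt_file="$dir/$split.$tgt"
        if [ ! -f "$src_file" ] || [ ! -f "$tgt_file" ]; then
            echo "missing: $src_file or $tgt_file" >&2
            return 1
        fi
        # A parallel corpus needs one target line per source line.
        src_n=$(wc -l < "$src_file")
        tgt_n=$(wc -l < "$tgt_file")
        if [ "$src_n" -ne "$tgt_n" ]; then
            echo "line count mismatch in $split: $src_n vs $tgt_n" >&2
            return 1
        fi
    done
    echo "ok: $dir"
}

# Example: check_parallel_files {PATH-TO-YOUR-DATASET} de en
```

Running the check before binarization catches a misplaced or truncated split early, instead of partway through preprocessing.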

We provide the pre-processed datasets here: Kaggle

Training

All training and evaluation scripts are in the ./scripts directory. For example, to train the model on the IWSLT14 De-En dataset, set the save path and data path in the script, then run:

bash scripts/iwslt14_de_en/train.sh
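
Training runs are long, so it can help to keep the full output of each run on disk. The wrapper below is a generic sketch and not part of this repository: `run_logged` is a hypothetical helper that runs any command, writes its combined stdout and stderr to a timestamped file in a log directory of your choice, and returns the command's exit status.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of this repository).
# Usage: run_logged LOG_DIR COMMAND [ARGS...]
run_logged() {
    local log_dir="$1"
    shift
    mkdir -p "$log_dir" || return 1
    local log_file
    log_file="$log_dir/$(date +%Y%m%d-%H%M%S).log"
    local status=0
    # Capture stdout and stderr together; remember a non-zero exit status.
    "$@" > "$log_file" 2>&1 || status=$?
    echo "log written to $log_file (exit status $status)"
    return "$status"
}

# Example: run_logged logs bash scripts/iwslt14_de_en/train.sh
```

While a run is in progress, `tail -f` on the log file from a second terminal shows the output as it is written.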

Decoding and evaluation

We do not apply checkpoint averaging for evaluation. To evaluate FastDiSS on the IWSLT14 De-En dataset, set the model path and generation path in the script, then run:

bash scripts/iwslt14_de_en/evaluate.sh

License

All material, including source code and pre-trained models, is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Acknowledgements

This work builds heavily on code from difformer, improved-diffusion, and Fairseq.

Citation

@inproceedings{nichol2021improved,
  title={Improved denoising diffusion probabilistic models},
  author={Nichol, Alexander Quinn and Dhariwal, Prafulla},
  booktitle={International conference on machine learning},
  pages={8162--8171},
  year={2021},
  organization={PMLR}
}
@article{gao2022difformer,
  title={Empowering Diffusion Model on Embedding Space for Text Generation},
  author={Gao, Zhujin and Guo, Junliang and Tan, Xu and Zhu, Yongxin and Zhang, Fang and Bian, Jiang and Xu, Linli},
  journal={arXiv preprint arXiv:2212.09412},
  year={2022}
}

Development

This is a research reference implementation and is treated as a one-time code drop. As such, we do not accept outside code contributions in the form of pull requests.
