Fsoft-AIC/FastDiSS

FastDiSS

The official codebase for FastDiSS: Few-step Match Many-step Diffusion Language Model on Sequence-to-Sequence Generation (Accepted to ACL Findings 2026).

Overview

Getting started

Our implementation is based on Difformer, with small modifications to the training process. The following commands install the dependencies and this package in a Conda environment (Python 3.9):

conda create -n fastdiss python=3.9 pip=23.0
conda activate fastdiss
pip install -r requirements.txt
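After installation, a quick sanity check can confirm that the activated environment is the one the later steps expect. The sketch below is not part of the repository: `check_env` is a hypothetical helper, and it assumes that `requirements.txt` puts the `fairseq-preprocess` command (used in the data preparation step) on the `PATH`.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check; not shipped with the repository.
check_env() {
    local expected="${1:-3.9}"
    local actual
    # Ask the interpreter on PATH for its major.minor version.
    actual=$(python -c 'import sys; print("%d.%d" % sys.version_info[:2])') || return 1
    if [ "$actual" != "$expected" ]; then
        echo "expected Python ${expected}, found ${actual}" >&2
        return 1
    fi
    # Assumption: requirements.txt installs Fairseq and its command-line tools.
    if ! command -v fairseq-preprocess >/dev/null 2>&1; then
        echo "fairseq-preprocess not found on PATH" >&2
        return 1
    fi
    return 0
}
```

Running `check_env 3.9` inside the activated `fastdiss` environment should exit silently with status 0; any message on stderr points at the step that needs attention.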

Data preparation

We follow the Fairseq instructions to preprocess the translation datasets. To binarize the distilled and tokenized datasets, run the following command (using the IWSLT14 De-En dataset as an example):

fairseq-preprocess \
    --source-lang de --target-lang en \
    --trainpref {PATH-TO-YOUR-DATASET}/train \
    --validpref {PATH-TO-YOUR-DATASET}/valid \
    --testpref {PATH-TO-YOUR-DATASET}/test \
    --destdir data-bin/iwslt14_de_en \
    --workers 20
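With `--source-lang de --target-lang en`, each prefix passed to `fairseq-preprocess` is expected to resolve to a pair of files such as `train.de` and `train.en`. The sketch below checks that these files exist and that each pair has the same number of lines before binarizing. `check_parallel_files` is a hypothetical helper written for illustration, not part of the repository or of Fairseq.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with the repository or with Fairseq.
# Usage: check_parallel_files <dataset-dir> <source-lang> <target-lang>
check_parallel_files() {
    local data_dir="$1"
    local src="$2"
    local tgt="$3"
    local status=0
    local split src_file tgt_file n_src n_tgt
    for split in train valid test; do
        src_file="${data_dir}/${split}.${src}"
        tgt_file="${data_dir}/${split}.${tgt}"
        # Both sides of the parallel corpus must be present.
        if [ ! -f "$src_file" ] || [ ! -f "$tgt_file" ]; then
            echo "missing: ${split}.${src} or ${split}.${tgt}" >&2
            status=1
            continue
        fi
        # Arithmetic expansion strips the padding some wc implementations emit.
        n_src=$(( $(wc -l < "$src_file") ))
        n_tgt=$(( $(wc -l < "$tgt_file") ))
        if [ "$n_src" -ne "$n_tgt" ]; then
            echo "line count mismatch in ${split}: ${n_src} vs ${n_tgt}" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_parallel_files {PATH-TO-YOUR-DATASET} de en` returns a non-zero status and reports the affected split on stderr if any file is missing or misaligned.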

The pre-processed datasets are available on Kaggle.

Training

All training and evaluation scripts are in the ./scripts directory. For example, to train FastDiSS on the IWSLT14 De-En dataset, set the save path and data path in the script, then run:

bash scripts/iwslt14_de_en/train.sh

Decoding and evaluation

We do not apply checkpoint averaging for evaluation. To evaluate FastDiSS on the IWSLT14 De-En dataset, set the model path and gen path in the script, then run:

bash scripts/iwslt14_de_en/evaluate.sh

License

Copyright © 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

All material, including source code and pre-trained models, is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Acknowledgements

This work builds heavily on code from Difformer, improved-diffusion, and Fairseq.

Citation

@inproceedings{nichol2021improved,
  title={Improved denoising diffusion probabilistic models},
  author={Nichol, Alexander Quinn and Dhariwal, Prafulla},
  booktitle={International conference on machine learning},
  pages={8162--8171},
  year={2021},
  organization={PMLR}
}
@article{gao2022difformer,
  title={Empowering Diffusion Model on Embedding Space for Text Generation},
  author={Gao, Zhujin and Guo, Junliang and Tan, Xu and Zhu, Yongxin and Zhang, Fang and Bian, Jiang and Xu, Linli},
  journal={arXiv preprint arXiv:2212.09412},
  year={2022}
}

Development

This is a research reference implementation and is treated as a one-time code drop. As such, we do not accept outside code contributions in the form of pull requests.
