A Framework for Fine-Tuning LLMs using Heterogeneous Feedback

This repository contains the code and data associated with our RANLP 2025 paper, A Framework for Fine-Tuning LLMs using Heterogeneous Feedback, by Ryan Aponte, Ryan A. Rossi, Shunan Guo, Franck Dernoncourt, Tong Yu, Xiang Chen, Subrata Mitra, and Nedim Lipka.

Fine-tuning was performed on 8x A100-80GB GPUs using Python 3.7. The fine-tuning pipeline is based on Stack-LLaMA, and the entire process took under 24 hours per model.
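
As a rough orientation only, the sketch below shows what a minimal supervised fine-tuning run over the converted weights could look like with Hugging Face transformers. The paths (7B_huggingface, finetune_llama), the train.jsonl dataset, and all hyperparameters are illustrative assumptions for this README, not the authors' exact Stack-LLaMA pipeline.

```python
# Minimal SFT sketch (illustrative; not the authors' exact pipeline).
# Assumes the converted weights live in ./7B_huggingface (see directory list).
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_DIR = "7B_huggingface"  # HF-format LLaMA weights

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype=torch.float16)

# Hypothetical training file; swap in the paper's heterogeneous feedback data.
dataset = load_dataset("json", data_files="train.jsonl")["train"]

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="finetune_llama",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        fp16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```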

Explanation of Directories

  1. 7B_huggingface - the LLaMA weights in Hugging Face format
  2. evaluation - scripts to compute results and directories that hold the results
  3. finetune_llama - fine-tuned model weights; a loading sketch appears after this list
  4. generative_task - the generative task described in Appendix E.3
  5. instruction_following_eval - script to generate the dataset for IFEval
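
A hedged sketch for loading a fine-tuned checkpoint from finetune_llama and generating text with transformers; the prompt and generation settings here are illustrative assumptions, not part of the repository:

```python
# Illustrative sketch: load a fine-tuned checkpoint and generate text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CKPT = "finetune_llama"  # fine-tuned weights directory (item 3 above)

tokenizer = AutoTokenizer.from_pretrained(CKPT)
model = AutoModelForCausalLM.from_pretrained(
    CKPT, torch_dtype=torch.float16, device_map="auto"  # device_map needs accelerate
)

prompt = "Summarize the idea of fine-tuning with heterogeneous feedback."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```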

Citation

If you use this repository, please cite our paper:

@misc{aponte2024frameworkfinetuningllmsusing,
      title={A Framework for Fine-Tuning LLMs using Heterogeneous Feedback}, 
      author={Ryan Aponte and Ryan A. Rossi and Shunan Guo and Franck Dernoncourt and Tong Yu and Xiang Chen and Subrata Mitra and Nedim Lipka},
      year={2024},
      eprint={2408.02861},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.02861}, 
}

License

The evaluation code and needle set data are licensed under the Adobe Research License, which prohibits commercial use and allows non-commercial research use.
