CSM Finetuning: Expresso proof of concept

Warning

This repo is a work in progress and is not ready for general use. I've verified that training works for the Expresso dataset as shown here, but essential features such as a proper data cleaning script, an LR scheduler, LoRA, checkpointing, and a schema are still missing.

Do NOT use this repo unless you are a developer who is comfortable doing some heavy editing to make it work with your dataset. We will eventually provide a proper script and set of instructions. No support will be provided for now unless you are a contributor.

Usage

Clone the repo and use uv:

uv sync
uv pip install -e .

Then create your dataset for Expresso:

cd data_pipeline
# Should be quick: roughly 2-3 min on an RTX 4090
uv run convert_expresso.py

Run the data_pipeline/tokenize_expresso.ipynb notebook.

Finally, edit the config/train_expresso.toml file with your training requirements, then run:

cd training_harness
uv run main.py

If you want to try out your model, use test_expresso.py:

# --speaker_id: Speakers 1-4 from Expresso dataset
# --style supports: default, happy, laughing, sad, whisper, emphasis, enunciated, confused
uv run test_expresso.py \
    --speaker_id 0 \
    --style default \
    --text "There'll be a funnel cloud Monday, but it'll be mostly sunny Tuesday."
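
To compare styles side by side, the same command can be repeated once per style. The sketch below is a hypothetical helper, not something shipped with the repo. It uses only the three flags shown above and the style names listed in the comment, and it defaults to a dry run that prints the commands without executing them. The names build_commands and run_sweep are made up for illustration.

```python
import shlex
import subprocess

# Style names copied from the comment in the command above.
STYLES = ["default", "happy", "laughing", "sad",
          "whisper", "emphasis", "enunciated", "confused"]


def build_commands(text, speaker_id=0, styles=STYLES):
    """Return one argv list per style, using only the flags shown above."""
    return [
        ["uv", "run", "test_expresso.py",
         "--speaker_id", str(speaker_id),
         "--style", style,
         "--text", text]
        for style in styles
    ]


def run_sweep(text, speaker_id=0, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands(text, speaker_id):
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing run.
            subprocess.run(cmd, check=True)


# Dry run: nothing is executed, the commands are only printed.
run_sweep("There'll be a funnel cloud Monday, but it'll be mostly sunny Tuesday.")
```

Passing dry_run=False from the repo root runs the commands for real. Whether test_expresso.py overwrites its output between runs is not stated here, so check that before sweeping.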

Thanks to nytopop for the repo base.
