@@ -22,6 +22,9 @@ The Mixtral implementation natively supports the following TransformerEngine-pro
 
 ### Quick start: convert and run
 
+> **Note:** The snippets below use bare imports (e.g., `from convert import ...`). Run them from the
+> `bionemo-recipes/models/mixtral` directory, or install dependencies first with `pip install -r requirements.txt`.
+
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -44,7 +47,7 @@ inputs = tokenizer("The quick brown fox", return_tensors="pt")
 inputs = {k: v.to("cuda") for k, v in inputs.items()}
 
 with torch.no_grad():
-    output_ids = model_te.generate(**inputs, max_new_tokens=16, use_cache=False)
+    output_ids = model_te.generate(**inputs, max_new_tokens=16)
 
 print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
 ```
@@ -57,6 +60,9 @@ inference, and back to Hugging Face Transformers format for sharing and deployme
 
 ### Converting from HF Transformers to TE
 
+> **Note:** Run from the `bionemo-recipes/models/mixtral` directory, or install dependencies first with
+> `pip install -r requirements.txt`.
+
 ```python
 from transformers import AutoModelForCausalLM
 
@@ -69,6 +75,9 @@ model_te.save_pretrained("/path/to/te_checkpoint")
 
 ### Converting from TE back to HF Transformers
 
+> **Note:** Run from the `bionemo-recipes/models/mixtral` directory, or install dependencies first with
+> `pip install -r requirements.txt`.
+
 ```python
 from convert import convert_mixtral_te_to_hf
 from modeling_mixtral_te import NVMixtralForCausalLM
@@ -80,9 +89,18 @@ model_hf.save_pretrained("/path/to/hf_checkpoint")
 
 ### Validating Converted Models
 
-To validate the converted models, refer to the commands in [Inference Examples](#inference-examples) above to load and
-test both the original and converted models to ensure loss and logit values are similar. Additionally, refer to the
-golden value tests in [test_modeling_mixtral.py](tests/test_modeling_mixtral.py).
+The golden value tests in [test_modeling_mixtral.py](tests/test_modeling_mixtral.py) verify that the converted TE model
+produces numerically equivalent outputs to the original Hugging Face model. Specifically:
+
+- `test_golden_values_bshd`: loads both models, runs a forward pass on the same input, and asserts that logits and
+  loss match within tolerance.
+- `test_round_trip_conversion`: converts HF → TE → HF and verifies the round-tripped model produces identical outputs.
+
+To run these tests locally:
+
+```bash
+./ci/scripts/recipes_local_test.py bionemo-recipes/models/mixtral/
+```
 
 ## Developer Guide
 
@@ -94,6 +112,18 @@ To run tests locally, run `recipes_local_test.py` from the repository root with
 ./ci/scripts/recipes_local_test.py bionemo-recipes/models/mixtral/
 ```
 
+### Exporting to Hugging Face Hub
+
+The model directory includes an `export.py` script that bundles all files needed for Hugging Face Hub distribution. To
+create the export bundle, run from the model directory:
+
+```bash
+python export.py
+```
+
+Before publishing, validate the export by running the local test suite via
+[recipes_local_test.py](../../ci/scripts/recipes_local_test.py).
+
 ### Development container
 
 To use the provided devcontainer, use "Dev Containers: Reopen in Container" from the VSCode menu, and choose the