Referring to the active learning for text classification example given here.
In the given example, we have:
transformer_model_name = 'bert-base-uncased'
transformer_model = TransformerModelArguments(transformer_model_name)
clf_factory = TransformerBasedClassificationFactory(
    transformer_model,
    num_classes,
    kwargs=dict({'device': 'cuda', 'mini_batch_size': 32,
                 'class_weight': 'balanced'}))
In my case, I would like to use the language model meta-llama/Llama-2-7b-chat-hf as a sequence classifier by calling it as
from transformers import AutoModelForSequenceClassification

base_model = AutoModelForSequenceClassification.from_pretrained(
    pretrained_model_name_or_path="meta-llama/Llama-2-7b-chat-hf",
    num_labels=1,
)
Then, I would like to perform supervised training with active learning of this Llama sequence classifier on the dataset Birchlabs/openai-prm800k-stepwise-critic.
Questions:
- How do I modify the example in the repository to get a clf_factory that uses the above base_model instead of providing TransformerModelArguments?
- How do I use small-text to handle the large model size of Llama and potentially distribute its training over multiple GPUs?
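For the first question, the shape I am after is roughly the following. This is a plain-Python sketch only: PrebuiltModelFactory is a name I made up, not actual small-text API, and I am assuming the factory mainly needs to expose something like the new() method that the existing factories provide.

```python
class PrebuiltModelFactory:
    """Hypothetical factory (not real small-text API) that would hand back
    a classifier wrapping an already-instantiated Hugging Face model,
    instead of building one from TransformerModelArguments."""

    def __init__(self, base_model, num_classes, kwargs=None):
        self.base_model = base_model    # e.g. the Llama classifier above
        self.num_classes = num_classes
        self.kwargs = kwargs or {}      # e.g. {'device': 'cuda', ...}

    def new(self):
        # small-text factories create a fresh classifier for each
        # active-learning iteration; here the returned classifier would
        # need to reuse (or reload) self.base_model -- this is the part
        # I do not know how to wire up.
        raise NotImplementedError("how should this wrap base_model?")
```

In other words: is there a supported hook for injecting a pre-instantiated model, or do I have to subclass one of the existing factories/classifiers?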