This project provides a helper script to transcribe audio and video files using the original Whisper model from OpenAI.

    python -m venv venv
    source venv/bin/activate
    pip install -r requirements_lock.txt

Transcribe a file with automatic language detection:

    ./transcribe.sh audio_or_video.mp4

Use the -l or --language flag to specify the language code (e.g., ja for Japanese):

    ./transcribe.sh japanese.mp3 --language ja

Use the -m or --max-words flag to limit the number of words per output line:

    ./transcribe.sh audio_or_video.mp4 --language en --max-words 7

Note: the word-limited output is available only in timestamped formats such as .srt and .vtt.

Use the -t or --translate flag to transcribe the audio and translate it into English:

    ./transcribe.sh korean.mp3 --language ko --translate

Use the --debug flag to print the exact shell command that would be executed, without running it:

    ./transcribe.sh file.mp4 --debug

| Flag | Short | Description | Default |
|---|---|---|---|
| --language | -l | Specify the language code (e.g., en, fr) | Auto-detect |
| --max-words | -m | Set the maximum number of words per line | Auto |
| --translate | -t | Instruct Whisper to translate the audio into English | Off |
| --debug | -d | Print the underlying command without executing it | Off |
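
As a rough illustration of how a wrapper like this could translate the flags above into a Whisper command line, the sketch below assembles the command in a Bash array and, when --debug is given, prints it instead of running it. It is not the project's actual transcribe.sh: the names build_whisper_cmd, transcribe_sketch, WHISPER_CMD, and WHISPER_DEBUG are hypothetical, and the mapping onto the openai-whisper CLI options (--task translate, --word_timestamps True, --max_words_per_line) is an assumption about how the script might work.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the project's actual transcribe.sh.
# Assumes the openai-whisper CLI ("whisper") is what ultimately runs.

WHISPER_CMD=()   # command line assembled by build_whisper_cmd
WHISPER_DEBUG=0  # 1 when -d/--debug was passed

build_whisper_cmd() {
  local input="" language="" max_words="" translate=0
  WHISPER_DEBUG=0
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -l|--language|-m|--max-words)
        # These flags take a value; fail early if it is missing.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 2; }
        case "$1" in
          -l|--language) language="$2" ;;
          *)             max_words="$2" ;;
        esac
        shift 2 ;;
      -t|--translate) translate=1; shift ;;
      -d|--debug)     WHISPER_DEBUG=1; shift ;;
      *)              input="$1"; shift ;;
    esac
  done
  [ -n "$input" ] || { echo "no input file given" >&2; return 2; }

  WHISPER_CMD=(whisper "$input")
  # Without --language, Whisper falls back to auto-detection.
  if [ -n "$language" ]; then WHISPER_CMD+=(--language "$language"); fi
  if [ "$translate" -eq 1 ]; then WHISPER_CMD+=(--task translate); fi
  if [ -n "$max_words" ]; then
    # Assumption: the word limit maps to Whisper's per-line option,
    # which only takes effect when word timestamps are enabled.
    WHISPER_CMD+=(--word_timestamps True --max_words_per_line "$max_words")
  fi
}

transcribe_sketch() {
  build_whisper_cmd "$@" || return
  if [ "$WHISPER_DEBUG" -eq 1 ]; then
    # Print the command instead of running it (unquoted, for display only).
    printf '%s\n' "${WHISPER_CMD[*]}"
  else
    "${WHISPER_CMD[@]}"
  fi
}
```

Under these assumptions, `transcribe_sketch korean.mp3 --language ko --translate --debug` prints `whisper korean.mp3 --language ko --task translate`. Building the command as an array keeps file names with spaces intact when the command is actually executed, and pairing the word limit with word timestamps would be consistent with the limit applying only to timestamped outputs such as .srt and .vtt.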