AutoShorts AI is a fully automated Python pipeline that creates viral-style "Faceless" YouTube Shorts and TikToks from a single topic. It handles the entire production chain: researching, scriptwriting, voiceover generation, stock footage sourcing, and advanced video editing with transitions and avatar injection.
- 🧠 Intelligent Scriptwriting: Uses Google Gemini 2.0 Flash to write engaging "Edutainment"-style scripts (Vox/Kurzgesagt style) with a strict storytelling structure (Hook → Context → Mechanism → Twist).
- 🗣️ Human-Like Voiceovers: Integrated with Suno Bark (via Google Colab/Ngrok) for high-quality, expressive AI narration. Includes an "Influencer Mode" for dynamic intonation.
- 🎞️ Dual-Visual System: Automatically searches and downloads two distinct stock videos per scene from Pexels, creating a dynamic "A/B Split" visual style to maximize viewer retention.
- ✂️ Advanced FFmpeg Editing:
  - Smart Trimming: Syncs video precisely to audio duration.
  - A/B Splitting: Cuts every scene in half, switching visuals mid-sentence.
  - Pro Transitions: Randomly applies `xfade` transitions (fade, slide, wipes) between scenes.
  - Silence Removal: Automatically trims dead air from the AI voice output.
- 🤖 Random Avatar Injection: Automatically inserts a custom "Avatar/Mascot" video into a random middle scene to build channel brand identity.
- 🪟 Windows Ready: Includes specific FFmpeg flags (`yuv420p`, `faststart`) to prevent corruption errors (0x80004005) in Windows Media Player.
```
Automated-YT-Shorts-AI/
│
├── assets/              # Stores all media files
│   ├── audio_clips/     # Generated voiceovers (.wav)
│   ├── video_clips/     # Downloaded stock footage (.mp4)
│   ├── temp/            # Intermediate processing files
│   ├── final/           # 🚀 The final output video lives here
│   └── avatar/          # ⚠️ PUT YOUR AVATAR VIDEO HERE
│       └── Professional_Girl_Animation_Video_Generation.mp4
│
├── modules/             # Core logic modules
│   ├── brain.py         # AI scriptwriter (Gemini)
│   ├── audio.py         # Voice generator (Bark client)
│   ├── asset_manager.py # Pexels downloader (dual-visual logic)
│   └── composer.py      # FFmpeg video editor (stitching & transitions)
│
├── main.py              # Entry point (orchestrator)
├── test_audio.py        # Diagnostic tool for the Bark connection
└── requirements.txt     # Python dependencies
```
- Python 3.10+ installed.
- FFmpeg installed and added to your system PATH.
  - Windows: `winget install ffmpeg` (or download from ffmpeg.org).
  - Verify: run `ffmpeg -version` in your terminal.
- API Keys:
  - Google Gemini API key (free tier available).
  - Pexels API key (free).
  - Ngrok auth token (if running Bark on Colab).
```bash
git clone https://github.com/yourusername/AutoShorts-AI.git
cd AutoShorts-AI
pip install -r requirements.txt
```

(If `requirements.txt` is missing, install manually: `pip install google-generativeai requests ffmpeg-python mutagen colorama`)
Create the required folders and add your avatar:
- Create the folder `assets/avatar/`.
- Place your avatar video inside and name it `Professional_Girl_Animation_Video_Generation.mp4`.
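Before a full run, it can help to confirm the avatar file is where the pipeline expects it. A minimal sketch (the `avatar_ready` helper is not part of the project):

```python
from pathlib import Path

# Expected avatar location, per the setup step above.
AVATAR = Path("assets/avatar/Professional_Girl_Animation_Video_Generation.mp4")

def avatar_ready(path: Path = AVATAR) -> bool:
    """Return True if the avatar video exists at the expected path."""
    return path.is_file()
```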
You can set them in your environment variables or hardcode them (temporarily) in the modules:
- `modules/brain.py` → `genai.configure(api_key="YOUR_GEMINI_KEY")`
- `modules/asset_manager.py` → `self.api_key = "YOUR_PEXELS_KEY"`
- `modules/audio.py` → update `raw_url` with your active Ngrok/Colab link.
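If you prefer environment variables, a small helper keeps keys out of the source. The variable names below are suggestions, not names the project defines:

```python
import os

def get_key(name: str, fallback: str) -> str:
    """Read an API key from the environment, falling back to a placeholder."""
    return os.environ.get(name, fallback)

# Suggested variable names (assumptions, not defined by the project):
GEMINI_API_KEY = get_key("GEMINI_API_KEY", "YOUR_GEMINI_KEY")
PEXELS_API_KEY = get_key("PEXELS_API_KEY", "YOUR_PEXELS_KEY")
```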
Since Bark requires a GPU, we run it on Google Colab.
- Open the Colab notebook provided for this project.
- Paste your Ngrok token.
- Run the cell.
- Copy the `https://xxxx.ngrok-free.app` URL.
- Paste this URL into `modules/audio.py` inside the `AudioEngine` class.
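The client side of this handshake can be as simple as a POST to the tunnel. The `/generate` endpoint name and the payload field names below are assumptions; match them to whatever route the Colab server actually exposes:

```python
import requests

def build_payload(text: str, text_temp: float = 0.7) -> dict:
    """Request body sent to the Bark server (field names are assumptions)."""
    return {"text": text, "text_temp": text_temp}

def generate_voiceover(base_url: str, text: str, out_path: str) -> None:
    """POST text to the Ngrok tunnel and save the returned WAV bytes."""
    resp = requests.post(f"{base_url}/generate",
                         json=build_payload(text), timeout=300)
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)
```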
Run the test script to ensure your local machine can talk to the cloud GPU:

```bash
python test_audio.py
```

If you see ✅ SUCCESS, you are ready.
Run the main script:

```bash
python main.py
```

- Enter a topic (e.g., "The Mystery of the Pyramids").
- Wait for the AI to write the script, generate the audio, download stock footage, and edit the video.
- The final video will be saved to `assets/final/final_short.mp4`.
`modules/brain.py` (Scriptwriter)
- Input: Topic string.
- Logic: Prompts Gemini to create an 8-9 scene JSON script, asking for two visual keywords per scene (`visual_1`, `visual_2`) to enable the A/B split effect.
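A single scene in the returned JSON might look like this; the `visual_1`/`visual_2` keys come from the description above, while the other field name (`narration`) and the sample values are illustrative assumptions:

```python
import json

# Hypothetical one-scene excerpt of the script Gemini returns.
sample_script = json.loads("""
[
  {"narration": "What if the pyramids were never tombs at all?",
   "visual_1": "pyramids aerial", "visual_2": "desert sandstorm"}
]
""")

def visual_queries(scene: dict) -> tuple:
    """Return the pair of Pexels search terms that drive the A/B split."""
    return scene["visual_1"], scene["visual_2"]
```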
`modules/audio.py` (Voice Generator)
- Input: Text script.
- Logic: Sends text to the Colab server. Includes a "confidence" setting (`text_temp=0.7`) to make the voice sound like an influencer.
- Post-Processing: Uses FFmpeg to trim silence and boost volume (2x).
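The silence trim and 2x boost can be expressed as a single FFmpeg audio filter. The -40 dB threshold below is an illustrative value, not necessarily the project's exact setting:

```python
def cleanup_cmd(src: str, dst: str) -> list:
    """FFmpeg command: trim leading/trailing silence, then double the volume."""
    af = (
        "silenceremove=start_periods=1:start_threshold=-40dB,"  # trim head
        "areverse,"
        "silenceremove=start_periods=1:start_threshold=-40dB,"  # trim tail
        "areverse,"
        "volume=2.0"                                            # 2x boost
    )
    return ["ffmpeg", "-y", "-i", src, "-af", af, dst]
```

The `areverse` trick runs the head-trimming filter a second time on the reversed stream, which removes trailing silence without needing `stop_periods`.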
`modules/asset_manager.py` (Pexels Downloader)
- Input: Visual keywords.
- Logic: Searches Pexels for portrait (9:16) videos and downloads a pair of videos for every scene. Handles fallbacks (if Video B is missing, Video A is reused).
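The Pexels video search itself is one authenticated GET request (endpoint and parameters per the Pexels API docs), and the fallback rule from the bullet above is a one-liner. A sketch, not the project's exact code:

```python
import requests

def search_portrait_video(query: str, api_key: str):
    """Return the download link of the first portrait video for `query`."""
    resp = requests.get(
        "https://api.pexels.com/videos/search",
        headers={"Authorization": api_key},
        params={"query": query, "orientation": "portrait", "per_page": 1},
        timeout=30,
    )
    resp.raise_for_status()
    videos = resp.json().get("videos", [])
    return videos[0]["video_files"][0]["link"] if videos else None

def pick_pair(url_a, url_b):
    """If Video B is missing, reuse Video A (the fallback described above)."""
    return url_a, (url_b or url_a)
```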
`modules/composer.py` (Video Editor)
- Input: Audio files + video files.
- Logic:
  - Scene Processing: Cuts each scene's duration in half. Plays Video A for the first half, Video B for the second.
  - Avatar Injection: Identifies a random "middle" scene (not the hook or outro) and replaces its stock footage with your avatar loop.
  - Stitching: Merges all scenes using `xfade` transitions (wipes, slides).
  - Rendering: Exports a `yuv420p` H.264 MP4 with the `faststart` flag for maximum compatibility.
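The stitching step can be sketched as building an `xfade` filter chain; the 0.5 s transition duration and the particular transition names below are illustrative choices, not the project's exact ones:

```python
import random

TRANSITIONS = ["fade", "slideleft", "wipeleft", "circleopen"]  # valid xfade names

def xfade_filter(durations, xdur=0.5):
    """Build a filter_complex string chaining N clips with random xfades.

    `durations` are the clip lengths in seconds, in input order.
    """
    parts, prev, offset = [], "[0:v]", 0.0
    for i, d in enumerate(durations[:-1]):
        offset += d - xdur  # each transition starts xdur before the cut
        trans = random.choice(TRANSITIONS)
        out = f"[v{i + 1}]"
        parts.append(
            f"{prev}[{i + 1}:v]xfade=transition={trans}:"
            f"duration={xdur}:offset={offset:.2f}{out}"
        )
        prev = out
    return ";".join(parts)
```

The resulting string is passed to `ffmpeg -filter_complex`, and the final render adds `-pix_fmt yuv420p -movflags +faststart`, as described above.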
Q: The video is black or corrupt (0x80004005 error).
- Fix: This is usually a Windows codec issue. The updated `composer.py` forces `pix_fmt='yuv420p'`. Try opening the file with VLC Media Player.
Q: "Avatar file missing" error.
- Fix: Although the avatar is optional, ensure your folder structure is exactly `assets/avatar/avatar.mp4`.
Q: The audio is silent or fails.
- Fix: Your Ngrok tunnel likely expired. Restart the Colab cell and update the URL in `audio.py`.
Q: FFmpeg error "Exec format error" or "not found".
- Fix: Ensure FFmpeg is installed and accessible from your command line.
This project is open-source. Feel free to modify and build your own automation empire!