The SaaS Intelligence Bot is a fully automated, multi-user Telegram chatbot that accepts a ZIP file of raw SaaS subscription data, cleans and analyses it, generates AI-powered strategic insights, and delivers a professional PDF intelligence report — all without any manual intervention.
User experience in 3 steps:
- User sends a
.zipfile containing 3 months of SaaS CSV data to the Telegram bot - Bot automatically processes everything in the background (30–90 seconds)
- Bot sends back a complete 4-page PDF intelligence report
| Layer | Tool | Purpose |
|---|---|---|
| Automation Platform | n8n (self-hosted, v1.122+) | Workflow orchestration — all 32 nodes |
| Messaging Interface | Telegram Bot API | User input/output channel |
| AI Provider | Groq API (llama-3.1-8b-instant) |
Strategic analysis + health score narrative |
| Data Processing | JavaScript (n8n Code nodes) | KPI engine, data cleaning, deduplication |
| Chart Generation | Python + Matplotlib | MRR trend, customer growth, churn rate charts |
| PDF Generation | Python + wkhtmltopdf | HTML → professional PDF report |
| File Handling | Python (zipfile, base64, subprocess) | ZIP extraction, script delivery |
| Runtime | Windows + Node.js v24 | n8n host environment |
USER (Telegram)
│
│ sends ZIP file
▼
┌─────────────────────────────────────────────────────────────┐
│ n8n WORKFLOW (32 nodes) │
│ │
│ STAGE 1 — INPUT VALIDATION │
│ Telegram Trigger → Set Bot Config → Has Document? │
│ → Is Valid ZIP? → Get File Path → Download ZIP │
│ → Save to Disk → Notify User │
│ │
│ STAGE 2 — DATA CLEANING & KPI ENGINE │
│ Read CSVs from ZIP (Python) │
│ → Compute KPIs (JavaScript) │
│ • Normalise dates (4 format variants) │
│ • Strip MRR formatting ($, commas, spaces) │
│ • Map status synonyms (new/trial → active, etc.) │
│ • Deduplicate rows (source_month|date|customer_id key) │
│ • Calculate MRR, churn rate, ARPU, growth rate │
│ │
│ STAGE 3 — DATASET INTELLIGENCE │
│ Compute Dataset Intelligence (JavaScript) │
│ • Files processed, rows analysed, unique customers │
│ • Date range, duplicates removed │
│ • Programmatic Health Score (4-component formula) │
│ │
│ STAGE 4 — CHART GENERATION │
│ Save KPIs → Generate Charts (Python/Matplotlib) │
│ • MRR Trend line chart │
│ • Active Customer Growth bar chart │
│ • Monthly Churn Rate bar chart (with 5% risk line) │
│ │
│ STAGE 5 — AI ANALYSIS (2 Groq API calls) │
│ Build Groq Payload 1 → Call Groq AI 1 │
│ • Health score narrative (focused, 300 tokens) │
│ Parse Response 1 → Build Groq Payload 2 → Call Groq AI 2 │
│ • Executive summary, KPI interpretation, churn risk │
│ • Growth opportunities, strategic + actionable recs │
│ │
│ STAGE 6 — PDF REPORT GENERATION │
│ Prepare Report Data → Write Script (3 parts) → Run │
│ • Python generates styled HTML (table-based layout) │
│ • wkhtmltopdf converts to A4 PDF │
│ │
│ STAGE 7 — DELIVERY & CLEANUP │
│ Read PDF Binary → Send PDF to User (Telegram) │
│ → Cleanup all temp files │
└─────────────────────────────────────────────────────────────┘
│
│ sends PDF report
▼
USER (Telegram)
- Handles 4 date formats:
YYYY-MM-DD,MM/DD/YYYY,DD-MM-YYYY,YYYY/MM/DD - Strips MRR formatting:
$52.00,$1,234.00,75(trailing spaces) - Normalises status synonyms:
new/trial/Active/ACTIVE → active,cancelled/inactive/expired → churned - Normalises plan casing:
basic/BASIC/Basic Plan → Basic - Deduplicates rows using composite key (
source_month|date|customer_id)
Formula with 4 weighted components:
- Churn Control — 35 pts (0% churn = 35, 20%+ churn = 0, linear)
- Growth Rate — 35 pts (≥20% growth = 35, negative scales down)
- ARPU Stability — 15 pts (improving = 15, declining scales down)
- Customer Acquisition — 15 pts (relative new customer rate)
Score is colour-coded: 🟢 Healthy (75–100) | 🟠 Moderate (50–74) | 🔴 At Risk (0–49)
Two separate Groq API calls for separation of concerns:
- Call 1 — Focused health score narrative (300 tokens, precise)
- Call 2 — Full strategic analysis returning structured JSON with 6 keys (2000 tokens)
Sections in order:
- Dataset Intelligence Summary
- Business Health Score (badge + breakdown bars + AI narrative)
- Executive KPIs Dashboard (6 metric cards)
- Monthly Trend Breakdown (table)
- Performance Charts (3 charts)
- AI Executive Analysis (5 subsections)
- Actionable Recommendations (highlighted box)
All temporary files are prefixed with chat_id — 100 concurrent users can run simultaneously with zero file collisions.
- Invalid file type → user-friendly error message
- Missing CSV files in ZIP → specific error message
- Chart generation failure → error message
- All AI responses safely parsed with fallback values
| File | Description |
|---|---|
saas_bot_final.json |
Complete n8n workflow — import this into n8n |
SaaS-Intelligence.zip |
Clean sample dataset for testing |
SaaS-Intelligence-DIRTY.zip |
Dirty dataset to test data cleaning capabilities |
552793721_saas_report.pdf |
Sample output PDF report |
- n8n self-hosted (v1.x)
- Python 3.8+ with
matplotlibinstalled (pip install matplotlib) - wkhtmltopdf installed and on PATH → https://wkhtmltopdf.org
- Telegram Bot Token (from @BotFather)
- Groq API Key (free tier) → https://console.groq.com
- Folder
C:\n8n-workspace\created on the host machine
- Import
saas_bot_final.jsoninto n8n - Open
Set Bot Confignode → replaceYOUR_TELEGRAM_BOT_TOKEN_HEREandgsk_YOUR_GROQ_KEY_HERE - Select your saved Telegram credential in every Telegram node
- Activate the workflow
- Send
SaaS-Intelligence.zip(orSaaS-Intelligence-DIRTY.zip) to your bot on Telegram
ZIP must contain month1.csv, month2.csv, month3.csv at the root level.
Each CSV requires columns: date, customer_id, mrr, plan, status
The project includes two test datasets:
- Clean dataset (
SaaS-Intelligence.zip) — 60 rows per month, standard format - Dirty dataset (
SaaS-Intelligence-DIRTY.zip) — 73 rows per month with injected issues:- Mixed date formats, MRR formatting variants, status synonyms, plan casing
- Exact duplicates, near-duplicates, missing fields, bad/negative values
Built with n8n + Groq AI + Python | SaaS Analytics Automation