SaaS Intelligence Bot

What the Project Does

The SaaS Intelligence Bot is a fully automated, multi-user Telegram chatbot that accepts a ZIP file of raw SaaS subscription data, cleans and analyses it, generates AI-powered strategic insights, and delivers a professional PDF intelligence report — all without any manual intervention.

User experience in 3 steps:

User sends a .zip file containing 3 months of SaaS CSV data to the Telegram bot
Bot automatically processes everything in the background (30–90 seconds)
Bot sends back a complete 4-page PDF intelligence report

Tools & Technologies Used

Layer	Tool	Purpose
Automation Platform	n8n (self-hosted, v1.122+)	Workflow orchestration — all 32 nodes
Messaging Interface	Telegram Bot API	User input/output channel
AI Provider	Groq API (`llama-3.1-8b-instant`)	Strategic analysis + health score narrative
Data Processing	JavaScript (n8n Code nodes)	KPI engine, data cleaning, deduplication
Chart Generation	Python + Matplotlib	MRR trend, customer growth, churn rate charts
PDF Generation	Python + wkhtmltopdf	HTML → professional PDF report
File Handling	Python (zipfile, base64, subprocess)	ZIP extraction, script delivery
Runtime	Windows + Node.js v24	n8n host environment

Automation Flow / Architecture

USER (Telegram)
      │
      │  sends ZIP file
      ▼
┌─────────────────────────────────────────────────────────────┐
│                    n8n WORKFLOW (32 nodes)                   │
│                                                             │
│  STAGE 1 — INPUT VALIDATION                                 │
│  Telegram Trigger → Set Bot Config → Has Document?          │
│  → Is Valid ZIP? → Get File Path → Download ZIP             │
│  → Save to Disk → Notify User                               │
│                                                             │
│  STAGE 2 — DATA CLEANING & KPI ENGINE                       │
│  Read CSVs from ZIP (Python)                                │
│  → Compute KPIs (JavaScript)                                │
│    • Normalise dates (4 format variants)                    │
│    • Strip MRR formatting ($, commas, spaces)               │
│    • Map status synonyms (new/trial → active, etc.)         │
│    • Deduplicate rows (source_month|date|customer_id key)   │
│    • Calculate MRR, churn rate, ARPU, growth rate           │
│                                                             │
│  STAGE 3 — DATASET INTELLIGENCE                             │
│  Compute Dataset Intelligence (JavaScript)                  │
│    • Files processed, rows analysed, unique customers       │
│    • Date range, duplicates removed                         │
│    • Programmatic Health Score (4-component formula)        │
│                                                             │
│  STAGE 4 — CHART GENERATION                                 │
│  Save KPIs → Generate Charts (Python/Matplotlib)            │
│    • MRR Trend line chart                                   │
│    • Active Customer Growth bar chart                       │
│    • Monthly Churn Rate bar chart (with 5% risk line)       │
│                                                             │
│  STAGE 5 — AI ANALYSIS (2 Groq API calls)                  │
│  Build Groq Payload 1 → Call Groq AI 1                      │
│    • Health score narrative (focused, 300 tokens)           │
│  Parse Response 1 → Build Groq Payload 2 → Call Groq AI 2  │
│    • Executive summary, KPI interpretation, churn risk      │
│    • Growth opportunities, strategic + actionable recs      │
│                                                             │
│  STAGE 6 — PDF REPORT GENERATION                           │
│  Prepare Report Data → Write Script (3 parts) → Run        │
│    • Python generates styled HTML (table-based layout)      │
│    • wkhtmltopdf converts to A4 PDF                         │
│                                                             │
│  STAGE 7 — DELIVERY & CLEANUP                              │
│  Read PDF Binary → Send PDF to User (Telegram)              │
│  → Cleanup all temp files                                   │
└─────────────────────────────────────────────────────────────┘
      │
      │  sends PDF report
      ▼
USER (Telegram)

Key Features

1. Intelligent Data Cleaning

Handles 4 date formats: YYYY-MM-DD, MM/DD/YYYY, DD-MM-YYYY, YYYY/MM/DD
Strips MRR formatting: $52.00, $1,234.00, 75 (trailing spaces)
Normalises status synonyms: new/trial/Active/ACTIVE → active, cancelled/inactive/expired → churned
Normalises plan casing: basic/BASIC/Basic Plan → Basic
Deduplicates rows using composite key (source_month|date|customer_id)

2. Programmatic Business Health Score (1–100)

Formula with 4 weighted components:

Churn Control — 35 pts (0% churn = 35, 20%+ churn = 0, linear)
Growth Rate — 35 pts (≥20% growth = 35, negative scales down)
ARPU Stability — 15 pts (improving = 15, declining scales down)
Customer Acquisition — 15 pts (relative new customer rate)

Score is colour-coded: 🟢 Healthy (75–100) | 🟠 Moderate (50–74) | 🔴 At Risk (0–49)

3. Dual AI Analysis Pipeline

Two separate Groq API calls for separation of concerns:

Call 1 — Focused health score narrative (300 tokens, precise)
Call 2 — Full strategic analysis returning structured JSON with 6 keys (2000 tokens)

4. Professional PDF Report (4 pages)

Sections in order:

Dataset Intelligence Summary
Business Health Score (badge + breakdown bars + AI narrative)
Executive KPIs Dashboard (6 metric cards)
Monthly Trend Breakdown (table)
Performance Charts (3 charts)
AI Executive Analysis (5 subsections)
Actionable Recommendations (highlighted box)

5. Multi-User Support

All temporary files are prefixed with chat_id — 100 concurrent users can run simultaneously with zero file collisions.

6. Robust Error Handling

Invalid file type → user-friendly error message
Missing CSV files in ZIP → specific error message
Chart generation failure → error message
All AI responses safely parsed with fallback values

Project Files

File	Description
`saas_bot_final.json`	Complete n8n workflow — import this into n8n
`SaaS-Intelligence.zip`	Clean sample dataset for testing
`SaaS-Intelligence-DIRTY.zip`	Dirty dataset to test data cleaning capabilities
`552793721_saas_report.pdf`	Sample output PDF report

Setup Instructions

Prerequisites

n8n self-hosted (v1.x)
Python 3.8+ with matplotlib installed (pip install matplotlib)
wkhtmltopdf installed and on PATH → https://wkhtmltopdf.org
Telegram Bot Token (from @BotFather)
Groq API Key (free tier) → https://console.groq.com
Folder C:\n8n-workspace\ created on the host machine

Steps

Import saas_bot_final.json into n8n
Open Set Bot Config node → replace YOUR_TELEGRAM_BOT_TOKEN_HERE and gsk_YOUR_GROQ_KEY_HERE
Select your saved Telegram credential in every Telegram node
Activate the workflow
Send SaaS-Intelligence.zip (or SaaS-Intelligence-DIRTY.zip) to your bot on Telegram

Expected CSV Format

ZIP must contain month1.csv, month2.csv, month3.csv at the root level. Each CSV requires columns: date, customer_id, mrr, plan, status

Sample Dataset

The project includes two test datasets:

Clean dataset (SaaS-Intelligence.zip) — 60 rows per month, standard format
Dirty dataset (SaaS-Intelligence-DIRTY.zip) — 73 rows per month with injected issues:
- Mixed date formats, MRR formatting variants, status synonyms, plan casing
- Exact duplicates, near-duplicates, missing fields, bad/negative values

Built with n8n + Groq AI + Python | SaaS Analytics Automation

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
Visuals		Visuals
sample-data		sample-data
sample-output		sample-output
workflow		workflow
PROJECT_DESCRIPTION.md		PROJECT_DESCRIPTION.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SaaS Intelligence Bot

What the Project Does

Tools & Technologies Used

Automation Flow / Architecture

Key Features

1. Intelligent Data Cleaning

2. Programmatic Business Health Score (1–100)

3. Dual AI Analysis Pipeline

4. Professional PDF Report (4 pages)

5. Multi-User Support

6. Robust Error Handling

Project Files

Setup Instructions

Prerequisites

Steps

Expected CSV Format

Sample Dataset

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

SaaS Intelligence Bot

What the Project Does

Tools & Technologies Used

Automation Flow / Architecture

Key Features

1. Intelligent Data Cleaning

2. Programmatic Business Health Score (1–100)

3. Dual AI Analysis Pipeline

4. Professional PDF Report (4 pages)

5. Multi-User Support

6. Robust Error Handling

Project Files

Setup Instructions

Prerequisites

Steps

Expected CSV Format

Sample Dataset

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages