Skip to content

o-danalyst/Saas-Analytics-bot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SaaS Intelligence Bot


What the Project Does

The SaaS Intelligence Bot is a fully automated, multi-user Telegram chatbot that accepts a ZIP file of raw SaaS subscription data, cleans and analyses it, generates AI-powered strategic insights, and delivers a professional PDF intelligence report — all without any manual intervention.

User experience in 3 steps:

  1. User sends a .zip file containing 3 months of SaaS CSV data to the Telegram bot
  2. Bot automatically processes everything in the background (30–90 seconds)
  3. Bot sends back a complete 4-page PDF intelligence report

Tools & Technologies Used

Layer Tool Purpose
Automation Platform n8n (self-hosted, v1.122+) Workflow orchestration — all 32 nodes
Messaging Interface Telegram Bot API User input/output channel
AI Provider Groq API (llama-3.1-8b-instant) Strategic analysis + health score narrative
Data Processing JavaScript (n8n Code nodes) KPI engine, data cleaning, deduplication
Chart Generation Python + Matplotlib MRR trend, customer growth, churn rate charts
PDF Generation Python + wkhtmltopdf HTML → professional PDF report
File Handling Python (zipfile, base64, subprocess) ZIP extraction, script delivery
Runtime Windows + Node.js v24 n8n host environment

Automation Flow / Architecture

USER (Telegram)
      │
      │  sends ZIP file
      ▼
┌─────────────────────────────────────────────────────────────┐
│                    n8n WORKFLOW (32 nodes)                   │
│                                                             │
│  STAGE 1 — INPUT VALIDATION                                 │
│  Telegram Trigger → Set Bot Config → Has Document?          │
│  → Is Valid ZIP? → Get File Path → Download ZIP             │
│  → Save to Disk → Notify User                               │
│                                                             │
│  STAGE 2 — DATA CLEANING & KPI ENGINE                       │
│  Read CSVs from ZIP (Python)                                │
│  → Compute KPIs (JavaScript)                                │
│    • Normalise dates (4 format variants)                    │
│    • Strip MRR formatting ($, commas, spaces)               │
│    • Map status synonyms (new/trial → active, etc.)         │
│    • Deduplicate rows (source_month|date|customer_id key)   │
│    • Calculate MRR, churn rate, ARPU, growth rate           │
│                                                             │
│  STAGE 3 — DATASET INTELLIGENCE                             │
│  Compute Dataset Intelligence (JavaScript)                  │
│    • Files processed, rows analysed, unique customers       │
│    • Date range, duplicates removed                         │
│    • Programmatic Health Score (4-component formula)        │
│                                                             │
│  STAGE 4 — CHART GENERATION                                 │
│  Save KPIs → Generate Charts (Python/Matplotlib)            │
│    • MRR Trend line chart                                   │
│    • Active Customer Growth bar chart                       │
│    • Monthly Churn Rate bar chart (with 5% risk line)       │
│                                                             │
│  STAGE 5 — AI ANALYSIS (2 Groq API calls)                  │
│  Build Groq Payload 1 → Call Groq AI 1                      │
│    • Health score narrative (focused, 300 tokens)           │
│  Parse Response 1 → Build Groq Payload 2 → Call Groq AI 2  │
│    • Executive summary, KPI interpretation, churn risk      │
│    • Growth opportunities, strategic + actionable recs      │
│                                                             │
│  STAGE 6 — PDF REPORT GENERATION                           │
│  Prepare Report Data → Write Script (3 parts) → Run        │
│    • Python generates styled HTML (table-based layout)      │
│    • wkhtmltopdf converts to A4 PDF                         │
│                                                             │
│  STAGE 7 — DELIVERY & CLEANUP                              │
│  Read PDF Binary → Send PDF to User (Telegram)              │
│  → Cleanup all temp files                                   │
└─────────────────────────────────────────────────────────────┘
      │
      │  sends PDF report
      ▼
USER (Telegram)

Key Features

1. Intelligent Data Cleaning

  • Handles 4 date formats: YYYY-MM-DD, MM/DD/YYYY, DD-MM-YYYY, YYYY/MM/DD
  • Strips MRR formatting: $52.00, $1,234.00, 75 (trailing spaces)
  • Normalises status synonyms: new/trial/Active/ACTIVE → active, cancelled/inactive/expired → churned
  • Normalises plan casing: basic/BASIC/Basic Plan → Basic
  • Deduplicates rows using composite key (source_month|date|customer_id)

2. Programmatic Business Health Score (1–100)

Formula with 4 weighted components:

  • Churn Control — 35 pts (0% churn = 35, 20%+ churn = 0, linear)
  • Growth Rate — 35 pts (≥20% growth = 35, negative scales down)
  • ARPU Stability — 15 pts (improving = 15, declining scales down)
  • Customer Acquisition — 15 pts (relative new customer rate)

Score is colour-coded: 🟢 Healthy (75–100) | 🟠 Moderate (50–74) | 🔴 At Risk (0–49)

3. Dual AI Analysis Pipeline

Two separate Groq API calls for separation of concerns:

  • Call 1 — Focused health score narrative (300 tokens, precise)
  • Call 2 — Full strategic analysis returning structured JSON with 6 keys (2000 tokens)

4. Professional PDF Report (4 pages)

Sections in order:

  1. Dataset Intelligence Summary
  2. Business Health Score (badge + breakdown bars + AI narrative)
  3. Executive KPIs Dashboard (6 metric cards)
  4. Monthly Trend Breakdown (table)
  5. Performance Charts (3 charts)
  6. AI Executive Analysis (5 subsections)
  7. Actionable Recommendations (highlighted box)

5. Multi-User Support

All temporary files are prefixed with chat_id — 100 concurrent users can run simultaneously with zero file collisions.

6. Robust Error Handling

  • Invalid file type → user-friendly error message
  • Missing CSV files in ZIP → specific error message
  • Chart generation failure → error message
  • All AI responses safely parsed with fallback values

Project Files

File Description
saas_bot_final.json Complete n8n workflow — import this into n8n
SaaS-Intelligence.zip Clean sample dataset for testing
SaaS-Intelligence-DIRTY.zip Dirty dataset to test data cleaning capabilities
552793721_saas_report.pdf Sample output PDF report

Setup Instructions

Prerequisites

  • n8n self-hosted (v1.x)
  • Python 3.8+ with matplotlib installed (pip install matplotlib)
  • wkhtmltopdf installed and on PATH → https://wkhtmltopdf.org
  • Telegram Bot Token (from @BotFather)
  • Groq API Key (free tier) → https://console.groq.com
  • Folder C:\n8n-workspace\ created on the host machine

Steps

  1. Import saas_bot_final.json into n8n
  2. Open Set Bot Config node → replace YOUR_TELEGRAM_BOT_TOKEN_HERE and gsk_YOUR_GROQ_KEY_HERE
  3. Select your saved Telegram credential in every Telegram node
  4. Activate the workflow
  5. Send SaaS-Intelligence.zip (or SaaS-Intelligence-DIRTY.zip) to your bot on Telegram

Expected CSV Format

ZIP must contain month1.csv, month2.csv, month3.csv at the root level. Each CSV requires columns: date, customer_id, mrr, plan, status


Sample Dataset

The project includes two test datasets:

  • Clean dataset (SaaS-Intelligence.zip) — 60 rows per month, standard format
  • Dirty dataset (SaaS-Intelligence-DIRTY.zip) — 73 rows per month with injected issues:
    • Mixed date formats, MRR formatting variants, status synonyms, plan casing
    • Exact duplicates, near-duplicates, missing fields, bad/negative values

Built with n8n + Groq AI + Python | SaaS Analytics Automation

About

n8n workflow automation bot — processes raw SaaS CSV data and returns a full AI-generated PDF intelligence report via Telegram. Powered by Groq AI + Python + wkhtmltopdf.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors