A job aggregation pipeline that collects postings from multiple sources, scrapes job descriptions, uses Claude AI to extract structured fields, and syncs everything to an Airtable dashboard.
```
Sources                  Pipeline                              Output
─────────                ─────────                             ──────
Google Sheets CSV ─┐
GitHub (Simplify) ─┤     Step 1: Fetch Raw Jobs
Y Combinator      ─┼───► Step 2: Scrape Job Descriptions ───► Airtable
JSearch API       ─┤     Step 3: AI Processing (Claude)       Dashboard
Ashby Job Boards  ─┘     Step 4: Sync to Airtable
```
| Step | What it does |
|---|---|
| 1. Fetch | Pulls raw job listings from the selected source |
| 2. Scrape | Visits each apply link and extracts the full job description |
| 3. AI Process | Claude extracts work model, industry, H1B status, qualifications, and tags |
| 4. Sync | Pushes enriched jobs to Airtable in batches |
| Source | Auth | Description |
|---|---|---|
| Google Sheets | None | CSV export from a public Google Sheet |
| GitHub | None | Parses SimplifyJobs/New-Grad-Positions README |
| Y Combinator | RapidAPI key | YC company job listings |
| JSearch | RapidAPI key | Broad job search API (Indeed, LinkedIn, etc.) |
| Ashby | None | Uses the Ashby public posting API; the backend then filters by keywords and `publishedAt` freshness |
- Node.js 20+
- pnpm 9+
- Anthropic API key
- Airtable account + API key
```bash
pnpm install
cp .env.example .env
# Edit .env with your API keys
```

| Variable | Required | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | Claude API key for AI processing |
| `AIRTABLE_API_KEY` | Yes | Airtable personal access token |
| `AIRTABLE_BASE_ID` | Yes | Your Airtable base ID |
| `RAPIDAPI_KEY` | For YC/JSearch | RapidAPI key for the YC and JSearch sources |
| `CLAUDE_MODEL` | No | Defaults to `claude-3-5-haiku-20241022` |
| `JOB_COUNT` | No | Default number of jobs to fetch (default: `10`) |
| `ASHBY_KEYWORDS` | No | Comma-separated keywords applied in backend filtering (default: `early career,sde,robotics`) |
| `ASHBY_PUBLISHED_WITHIN_HOURS` | No | Keep Ashby jobs with `publishedAt` in the last N hours (default: `24`) |
| `ASHBY_INCLUDE_COMPENSATION` | No | Calls Ashby with `includeCompensation=true` when enabled (default: `true`) |
| `ASHBY_REQUEST_DELAY_MS` | No | Delay in ms between Ashby company requests to stay within fair use (default: `1000`) |
```bash
# Start both backend and frontend in dev mode
pnpm dev
```

- Backend: http://localhost:3001
- Frontend: http://localhost:5173
```
JobsList/
├── backend/
│   └── src/
│       ├── services/    # Fetchers (CSV, GitHub, YC, JSearch, Ashby public API),
│       │                # Ashby filter logic, scraper, AI processor, Airtable sync
│       ├── routes/      # Pipeline + data API routes
│       ├── constants/   # Industry categories
│       ├── config.ts    # Environment config
│       ├── store.ts     # JSON file persistence
│       └── index.ts     # Fastify server
├── frontend/
│   └── src/
│       ├── components/  # Pipeline UI (StepPanel, StepButton, etc.)
│       └── hooks/       # API hooks
├── render.yaml          # Render deployment config
└── docs/
    └── api.md           # API documentation
```
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check |
| `/api/pipeline/step1?source=<src>` | POST | Fetch raw jobs (Ashby supports `companies`, `keywords`, `postedToday`, `publishedWithinHours`, `limit`) |
| `/api/pipeline/step2` | POST | Scrape job descriptions |
| `/api/pipeline/step3` | POST | AI process with Claude |
| `/api/pipeline/step4` | POST | Sync to Airtable |
| `/api/pipeline/status` | GET | Get all step statuses |
| `/api/pipeline/reset` | POST | Clear all data |
| `/api/data/:step` | GET | Get output from steps 1–4 |
| `/api/logs/stream` | GET | SSE real-time log stream |
```bash
# Fetch the newest 25 jobs posted today that match keywords from selected Ashby companies
curl -X POST "http://localhost:3001/api/pipeline/step1?source=ashby&companies=openai,notion,cursor&keywords=sde,robotics&postedToday=true&limit=25"
```

- `companies`: comma-separated Ashby slugs to search (omit to search all configured companies)
- `keywords`: comma-separated keywords (omit to use `ASHBY_KEYWORDS`)
- `postedToday=true`: keep only jobs whose `publishedAt` date is today (UTC)
- `publishedWithinHours`: optional alternative to `postedToday` (e.g., `24`)
- No API key is required for Ashby public job board ingestion.
Use the helper script to maintain or verify slug candidates:
```bash
python scripts/ashby_slugs_verified.py
python scripts/ashby_slugs_verified.py --verify
```

The backend now includes a larger built-in Ashby slug pool and also supports request-level company selection via `companies=<slug1,slug2,...>`.
Claude (Haiku) extracts from each job:
- Work Model — Remote / Hybrid / Onsite
- Industry — 22 categories (Software Engineering, ML/AI, Finance, etc.)
- H1B Sponsorship — explicit mention required
- Qualifications — key requirements summary
- Tags — auto-detected: FAANG+, Quant, Fortune 500, Unicorn, YC, Crypto/Web3
- Job Board — Lever, Ashby, Greenhouse, LinkedIn, Workday
The app deploys as a single service on Render's free tier:
- Push to GitHub
- Go to dashboard.render.com → New → Blueprint
- Connect your repo — Render auto-detects `render.yaml`
- Set secret env vars: `ANTHROPIC_API_KEY`, `AIRTABLE_API_KEY`, `AIRTABLE_BASE_ID`, `RAPIDAPI_KEY`
- Deploy
In production, the backend serves the frontend static build, so everything runs from a single URL.
MIT
