EasyPaper is a Human-in-the-Loop AI tool designed for large-scale literature review across fields or topics. It offers:
- A CS database with 80K top-tier papers from recent years, plus the whole of arXiv.
- Accurate RAG using vector-less tree-based indexing and AI.
- Deep Research and Chat agents built with LangChain and DeepAgents.
The system works as a self-hosted React/FastAPI web application.
Make sure you have the following:
- Node.js for the frontend
- Python 3.12+ for the backend
- Marker API for PDF OCR
- A text embedding API key (paid API or Ollama) for RAG
- An AI API key (paid API or Ollama) for Chat and Deep Research
- Clone the repo

```shell
git clone https://github.com/sucv/EasyPaper.git
```

- Install NPM packages

```shell
# under EasyPaper/frontend/
npm install
```

- Install Python packages

```shell
# under EasyPaper/backend/
pip install -r requirements.txt
```

- (Optional) Download the database (containing 80K top-tier CS papers) from Release, and put it into `EasyPaper/chroma_db/`

```shell
# under EasyPaper/
mkdir chroma_db
# Then put the downloaded chroma.sqlite3 into EasyPaper/chroma_db
# Note that the database is indexed using qwen3-embedding:8b from Ollama.
# If you use the database, you have to specify qwen3-embedding:8b for embedding_config in your config.yaml
```

- Provide your API keys and/or Ollama base URL

```shell
# under EasyPaper/
cp .env.example .env
# Then put your API keys or Ollama base URL there
# Add LangSmith for free and accurate token usage and pricing tracking
```

- Edit `EasyPaper/config.yaml` to specify your providers, models, and configurations
- Launch the app

```shell
# under EasyPaper/
python start.py
# Then visit http://localhost:5173/
```
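If you use the prebuilt database, your embedding settings must match the model it was indexed with. The fragment below is a hypothetical sketch of the relevant part of `config.yaml`; apart from `embedding_config` and the `qwen3-embedding:8b` model (both mentioned above), the key names are illustrative, so check the shipped `config.yaml` for the actual schema.

```yaml
# Hypothetical sketch — verify key names against the shipped config.yaml
embedding_config:
  provider: ollama           # illustrative key
  model: qwen3-embedding:8b  # required when using the prebuilt database
```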
EasyPaper starts with creating a Project. You may also continue from an existing Project. A Project can have multiple Ideas, and each Idea can have multiple reports covering different aspects.
EasyPaper supports three data sources:
- A sqlite3 database containing 80K top-tier CS papers collected with paperCrawler.
  - About 80% are downloadable (top CS conferences).
  - The remaining 20% contain only title, author, year, and publisher (mostly top CS journals and a few conferences).
- The arXiv API
  - Follow the official example or ask an AI.
- Your own PDFs
  - These could be top journal papers you downloaded manually.
  - Useful if you are outside CS and have PDFs to analyze, much like PaperQA.
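For the arXiv data source, queries go through the official arXiv API, which takes field prefixes such as `ti:` (title) and `au:` (author) in its `search_query` parameter. The sketch below (not EasyPaper code, just an illustration of the API's query format) builds such a request URL:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_arxiv_query(title=None, author=None, max_results=10):
    """Build an arXiv API query URL using the official field
    prefixes: ti: for title, au: for author."""
    terms = []
    if title:
        terms.append(f'ti:"{title}"')
    if author:
        terms.append(f'au:"{author}"')
    params = {
        "search_query": " AND ".join(terms),
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"

url = build_arxiv_query(title="attention is all you need", author="Vaswani")
print(url)
```

Fetching the resulting URL returns an Atom feed of matching papers, which any feed or XML parser can consume.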
Choose your query method from Vector, Boolean Expression, or arXiv, select the Year and Venue, then click Search. Select the papers of interest in the results or check their citation counts.
- Both `Vector` and `Boolean Expression` query against the `Title` only. Currently there is no way to search other fields. See Example for the `Boolean Expression`.
- For `arXiv`, the official API is used, which supports `Title`, `Author`, and many more fields. See the [official example].
If you are interested in adding more venues or years to the database, please refer to paperCrawler and crawl by yourself. Then generate the Chroma vectorstore by running:
```shell
# under EasyPaper/
python populate_chroma.py your_crawl.csv
```
Add the selected papers to the Cart (you can also load your local PDFs into the cart), then select or create an Idea and send the cart papers to its Idea Panel.
The Idea serves as the prompt, guiding the AI to find relevant information from the papers in an Idea Panel. An Idea can be a single word or full sentences. It is in your best interest to think your Idea through.
Content in the Idea Panel is preserved on your file system.
```shell
# Example 1 for getting the methodology details:
methodology details

# Example 2 for getting methodology and experiment results:
method and experiment

# Example 3 for getting the related works and references:
relevant works of the proposed method and the reference section
```

Click Download, followed by Retrieve. EasyPaper will download the papers, run the OCR, run the vector-less indexing, and retrieve the relevant nodes using AI.
Before Retrieval, EasyPaper shows the page count of each PDF and the estimated cost. You may choose a page range for lengthy ones. PDFs that have already been downloaded or retrieved are skipped for that step.
Once the relevant segments are retrieved, you can view or export them from the Column Action.
Choose the task and LLM, then click Run the Task. Once completed, you may view or export the report. The research works by delegating per-paper tasks to the worker LLM, aggregating the results, and finally adding a summary.
You may add custom tasks by defining a yaml file in EasyPaper/tasks. Once a new yaml is added, EasyPaper will display it in the dropdown menu before you Run the Task.
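A custom task file might look like the sketch below. The field names here are hypothetical, since the task schema is not documented in this README; copy an existing yaml from `EasyPaper/tasks/` and adapt it rather than relying on these keys.

```yaml
# Hypothetical EasyPaper/tasks/limitations.yaml — field names are
# illustrative; mirror the schema of the shipped task files.
name: Limitations Review
description: Summarize stated limitations and future work
prompt: >
  For each paper, summarize the limitations acknowledged by the
  authors and the future work they propose.
```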
More than one Research run can be conducted for an Idea, sequentially.
In the Paper QA panel, you may create or continue a chat, and also choose the Scope of the conversation to one or all Idea(s).
The Chat agent has tools to list all papers and to inspect a paper's structure (i.e., the tree index). The agent delegates per-paper tasks to subagents, followed by running the RAG.
The chat history is preserved on your file system.
In the top panel, EasyPaper shows an approximation of the cost so far. For accurate cost tracking, provide your LangSmith API key as `LANGCHAIN_API_KEY` in your `.env` and set `langsmith_enabled` to true. You will then be able to see the detailed usage, content, and cost via LangSmith's console.
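Concretely, enabling LangSmith tracking touches two files (the key value below is a placeholder):

```shell
# In EasyPaper/.env
LANGCHAIN_API_KEY=<your-langsmith-api-key>
```

```yaml
# In EasyPaper/config.yaml
langsmith_enabled: true
```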
| Type | Venue | Year |
|---|---|---|
| Conference | CVPR, ICCV, ECCV, ICLR, ICML, NeurIPS, AAAI, IJCAI, MM, KDD, WWW, ACL, EMNLP, NAACL, Interspeech, ICASSP | Recent 5 years |
| Journal | Nature, PNAS, Nat. Mach. Intell., TPAMI, IJCV, Proc. IEEE, ACM Comput. Surv., J. ACM, Info. Fusion, SIGGRAPH, TNNLS, etc. | Recent 10 years |
All journal papers and some conference papers (MM, KDD, WWW) are not downloadable (up to 20% of the whole database). These entries have only title, venue, year, and author, without abstract or PDF URL. Users need to download them manually should they match the query and interest.
EasyPaper degrades to a GUI version of PaperQA for non-CS papers, as it can then rely only on user-provided papers or arXiv papers.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- The RAG chunking and indexing are inspired by PageIndex.
- The surprisingly accurate OCR is powered by Marker.
- The 80K papers are crawled using paperCrawler.
- The tool is made using Claude AI.