Literature Review Pipeline for LLM-Powered Wargames

A comprehensive pipeline for conducting systematic literature reviews on LLM-powered wargaming research. This tool automates paper discovery, screening, information extraction, and analysis.

Overview

This pipeline implements the systematic review protocol defined in review_protocol_v0_3.md for studying LLM-powered wargames. It provides:

  • Multi-source paper harvesting from Google Scholar, arXiv, Semantic Scholar, and Crossref
  • Intelligent deduplication using DOI matching and fuzzy title comparison (see the sketch after this list)
  • PDF fetching with fallback strategies including Sci-Hub
  • LLM-powered extraction of key information using OpenAI GPT-4
  • Failure mode detection using regex patterns
  • Visualization generation for publication trends and analysis
  • Export packaging with Zenodo integration
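For illustration, here is a minimal sketch of DOI-plus-fuzzy-title deduplication. It assumes the rapidfuzz library and a 95-point similarity threshold; the pipeline's actual matcher may differ.

import re
from rapidfuzz import fuzz  # assumed fuzzy-matching dependency

def is_duplicate(paper_a: dict, paper_b: dict, threshold: int = 95) -> bool:
    """Treat two records as duplicates if DOIs match or titles are near-identical."""
    doi_a, doi_b = paper_a.get("doi"), paper_b.get("doi")
    if doi_a and doi_b:
        return doi_a.lower() == doi_b.lower()

    def normalize(title: str) -> str:
        # Lowercase and collapse punctuation so formatting differences don't matter
        return re.sub(r"\W+", " ", title.lower()).strip()

    score = fuzz.token_set_ratio(normalize(paper_a["title"]), normalize(paper_b["title"]))
    return score >= threshold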

Project Structure

WordplayWorkshop2025/
├── src/lit_review/
│   ├── harvesters/       # Paper discovery modules
│   ├── processing/       # Data cleaning and PDF handling
│   ├── extraction/       # LLM extraction and tagging
│   ├── analysis/         # Failure detection and metrics
│   ├── visualization/    # Chart generation
│   └── utils/           # Configuration and utilities
├── data/
│   ├── raw/             # Harvested papers
│   ├── processed/       # Screening progress
│   ├── extracted/       # Extraction results
│   └── templates/       # Data structure templates
├── outputs/             # Visualizations and exports
├── pdf_cache/           # Downloaded PDFs
├── logs/                # SQLite logs
├── tests/               # Comprehensive test suite
├── notebooks/           # Jupyter notebooks
├── scripts/             # Utility scripts
├── config/              # Configuration files
└── run.py              # CLI interface

Installation

Prerequisites

  • Python 3.13+
  • UV package manager
  • OpenAI API key (for LLM extraction)
  • Optional: Semantic Scholar API key

Setup

  1. Clone the repository:
git clone <repository-url>
cd WordplayWorkshop2025
  2. Create a virtual environment with UV:
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies:
uv pip install -e .
  4. Copy and configure settings:
cp config/config.yaml.example config/config.yaml
# Edit config/config.yaml with your API keys and preferences

Quick Start

⚠️ Important: Always use run.py as the main entry point. Do NOT use scripts in the scripts/ directory for pipeline execution; they are deprecated and may use incorrect settings (such as limiting sources).

1. Harvest Papers

Search for papers from ALL configured sources (arXiv, Semantic Scholar, Google Scholar, CrossRef):

python run.py harvest

Or specify sources explicitly:

python run.py harvest --sources arxiv semantic_scholar google_scholar crossref

Options:

  • --query: Use a preset query or provide a custom search string (default: the query from config)
  • --sources: Specify sources (default: ALL configured sources)
  • --max-results: Maximum results per source (default: 100)
  • --parallel/--no-parallel: Enable or disable parallel searching

Note: The harvest command automatically saves a snapshot of your configuration alongside the results for reproducibility.
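The snapshot itself amounts to copying the active config file next to the harvest output. A minimal sketch of that idea, with hypothetical file names (not necessarily how the pipeline implements it):

import shutil
from datetime import datetime
from pathlib import Path

def snapshot_config(config_path: str, output_dir: str) -> Path:
    """Copy the active config next to the harvest results, stamped for traceability."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dest = Path(output_dir) / f"config_snapshot_{stamp}.yaml"
    shutil.copy(config_path, dest)
    return dest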

2. Prepare Screening Sheet

Generate Excel file for manual screening:

python run.py prepare-screen --input data/raw/papers_raw.csv

The Excel file includes:

  • Paper metadata and abstracts
  • Screening decision columns
  • Data validation for include/exclude decisions (see the sketch below)
  • Statistics and instructions sheets
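A minimal sketch of how such a dropdown validation can be enforced with openpyxl; the column letter, sheet name, and row range are assumptions rather than the pipeline's exact layout:

from openpyxl import Workbook
from openpyxl.worksheet.datavalidation import DataValidation

wb = Workbook()
ws = wb.active
ws.title = "Screening"
ws.append(["title", "abstract", "decision"])

# Restrict the decision column to a fixed set of choices
dv = DataValidation(type="list", formula1='"include,exclude,maybe"', allow_blank=True)
ws.add_data_validation(dv)
dv.add("C2:C1000")  # hypothetical decision-column range

wb.save("screening_sheet.xlsx")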

3. Extract Information

Use LLM to extract structured information:

python run.py extract --input data/processed/screening_progress.csv

Extracts:

  • Venue type (conference, journal, workshop, tech-report)
  • Game type (seminar, matrix, digital, hybrid)
  • Open-ended vs quantitative classification
  • LLM family and role
  • Evaluation metrics
  • Failure modes
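Under the hood this is a structured prompt to the configured model. A minimal sketch of that kind of call using the openai client; the prompt wording and field names are illustrative, not the pipeline's actual schema:

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def extract_fields(abstract: str) -> dict:
    """Ask the model to return the review's structured fields as JSON."""
    prompt = (
        "Extract venue_type, game_type, llm_family, llm_role, and "
        "evaluation_metrics from this abstract. Reply with JSON only.\n\n" + abstract
    )
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0.3,
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)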

4. Generate Visualizations

Create charts and analysis:

python run.py visualise --input data/extracted/extraction.csv

Generates:

  • Publication timeline
  • Venue distribution
  • Failure modes frequency
  • LLM families usage
  • Game types distribution
  • Creative-Analytical Scale distribution (1-7)
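As an illustration, the publication timeline reduces to a count of papers per year. A minimal sketch with pandas and matplotlib; the "year" column name is an assumption about the extraction CSV:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data/extracted/extraction.csv")
counts = df["year"].value_counts().sort_index()  # "year" column assumed

fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)
ax.set_xlabel("Publication year")
ax.set_ylabel("Number of papers")
ax.set_title("Publication timeline")
fig.savefig("outputs/publication_timeline.png", bbox_inches="tight")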

5. Export Dataset

Package results for sharing:

python run.py export \
    --papers data/raw/papers_raw.csv \
    --extraction data/extracted/extraction.csv

Creates a ZIP package with:

  • All data files (CSV format)
  • Visualizations
  • README and metadata
  • Optional: Upload to Zenodo for DOI
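The packaging step itself is plain ZIP assembly. A minimal sketch with the standard zipfile module; the file list and archive name are illustrative:

import zipfile
from pathlib import Path

def build_package(archive: str, files: list[str]) -> None:
    """Bundle data files and figures into a single shareable archive."""
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            zf.write(path, arcname=Path(path).name)

build_package(
    "outputs/review_package.zip",
    ["data/raw/papers_raw.csv", "data/extracted/extraction.csv"],
)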

Configuration

Edit config/config.yaml to customize:

Search Settings

search:
  queries:
    preset1: '"LLM" AND ("wargaming" OR "wargame")'
    preset2: '"Large Language Model" AND "strategic game"'
  sources:
    google_scholar:
      enabled: true
      max_results: 100

API Keys

api_keys:
  openai: ${OPENAI_API_KEY}  # Can use environment variables
  semantic_scholar: your-key-here
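One common way such ${VAR} placeholders are expanded at load time; a sketch, not necessarily how this pipeline's loader works:

import os
import re
import yaml

def load_config(path: str) -> dict:
    """Load YAML and substitute ${VAR} placeholders from the environment."""
    with open(path) as f:
        text = f.read()
    text = re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), text)
    return yaml.safe_load(text)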

Failure Vocabularies

failure_vocabularies:
  escalation: [escalation, nuclear, brinkmanship]
  bias: [bias, biased, unfair, skew]
  hallucination: [hallucination, confabulate, fabricate]
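Each vocabulary can be compiled into a single case-insensitive pattern and matched against extracted text. A minimal sketch of that approach; the pipeline's actual matcher may differ:

import re

failure_vocabularies = {
    "escalation": ["escalation", "nuclear", "brinkmanship"],
    "bias": ["bias", "biased", "unfair", "skew"],
    "hallucination": ["hallucination", "confabulate", "fabricate"],
}

# One compiled, case-insensitive pattern per failure mode
patterns = {
    mode: re.compile(r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE)
    for mode, terms in failure_vocabularies.items()
}

def detect_failure_modes(text: str) -> list[str]:
    """Return every failure mode whose vocabulary appears in the text."""
    return [mode for mode, pat in patterns.items() if pat.search(text)]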

Important: Avoid Using Individual Scripts

The scripts/ directory contains various utility and test scripts, but DO NOT use them for running the pipeline. These scripts may:

  • Use hardcoded source lists (e.g., only arxiv + semantic_scholar)
  • Skip important sources like Google Scholar and CrossRef
  • Not save configuration for reproducibility
  • Create inconsistent results

Always use the main run.py CLI or python -m src.lit_review for all pipeline operations.

Advanced Usage

Using Different LLM Models

Configure in config/config.yaml:

extraction:
  model: gpt-4  # or gpt-3.5-turbo
  temperature: 0.3
  max_tokens: 4000

Custom Search Queries

python run.py harvest --query '"transformer model" AND "military simulation"'

Selective Source Harvesting

python run.py harvest --sources arxiv crossref --max-results 50

Monitoring Pipeline Status

python run.py status

Shows:

  • Log summary by level
  • Recent activity
  • Error tracking

Running Tests

# Run all tests
./scripts/run_tests.sh

# Run specific test categories
./scripts/run_tests.sh unit
./scripts/run_tests.sh fast

Development

Running Tests with Make

# Run all tests
make test

# Run tests with coverage
make test-coverage

# Run tests verbosely
make test-verbose

Code Quality

# Format code
make format

# Run linting
make lint

# Run type checking
make type-check

# Run all quality checks
make all

Pre-commit Hooks

Pre-commit hooks are configured to run automatically before each commit:

make pre-commit

Troubleshooting

Common Issues

  1. Import errors: Ensure you're in the virtual environment
  2. API rate limits: Configure rate limits in config/config.yaml
  3. PDF download failures: Check internet connection and try Sci-Hub mirrors
  4. LLM extraction errors: Verify OpenAI API key and quota

Debug Mode

Enable detailed logging:

python run.py --debug harvest

Database Logs

Query the SQLite log database:

from lit_review.utils import LoggingDatabase

# Open the SQLite log database and pull all ERROR-level entries
db = LoggingDatabase('logs/logging.db')
errors = db.query_logs(level='ERROR')

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Run tests: ./scripts/run_tests.sh
  4. Submit pull request

Citation

If you use this pipeline in your research, please cite:

@software{llm_wargame_review,
  title = {Literature Review Pipeline for LLM-Powered Wargames},
  author = {Your Name},
  year = {2024},
  url = {repository-url}
}

License

[Specify your license here]

Acknowledgments

This pipeline was developed to support systematic reviews of LLM-powered wargaming research. Special thanks to the developers of the scholarly, arxiv, and other libraries that make this tool possible.
