Sales RL Agent

This repository contains a script-driven implementation of a risk-sensitive distributional actor-critic for sales dialogue control, together with the paper, synthetic simulator, benchmark tooling, and inference utilities.
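As a rough illustration of what "risk-sensitive" can mean for a distributional critic, the sketch below scores an action by the mean of its worst predicted outcomes (conditional value at risk, CVaR) rather than by its overall mean. This is an assumption made for illustration only: the risk measure, the quantile representation, and the names `cvar` and `alpha` are hypothetical and are not taken from `sales_rl_core.py` or the paper.

```python
import math


def cvar(quantiles, alpha=0.25):
    """Mean of the worst alpha-fraction of equally weighted return quantiles."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(quantiles)
    # Keep at least one quantile so the tail is never empty.
    k = max(1, math.ceil(alpha * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Two hypothetical actions with the same mean return but different spread.
safe = [0.8, 0.9, 1.0, 1.1, 1.2]
risky = [-2.0, 0.0, 1.0, 2.0, 4.0]

print(cvar(safe, alpha=0.4), cvar(risky, alpha=0.4))
```

Both lists average to the same return, so a scalar critic would rank them equally, while a tail-focused score prefers the narrower distribution.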

This project is intentionally kept separate from the presentation repository. The nested bantr-presentation/ directory is ignored here and remains its own GitHub repository.

Repository layout

paper.tex and paper.pdf: research paper
sales_rl_core.py: simulator, models, training loops, evaluation, plotting, checkpoint helpers
train_sales_rl_agent.py: train scalar or distributional controllers and save checkpoints
use_sales_rl_agent.py: load a checkpoint, score a manual state, or run simulator test rollouts
run_sales_benchmark.py: reproduce the benchmark figures used in the paper
generate_architecture_figure.py: regenerate the external architecture diagram used in the paper
sample_state.json: example state input for the inference script
figures/: paper figures
artifacts/: benchmark summaries and metrics

Environment setup

For local CPU work:

python -m pip install torch
python -m pip install -r requirements.txt

For an NVIDIA A100:

python -m pip install --upgrade pip
python -m pip install --index-url https://download.pytorch.org/whl/cu128 torch torchvision torchaudio
python -m pip install -r requirements.txt
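After installing, it can help to confirm which device PyTorch actually sees before passing `--device` to the scripts. The following is a minimal sketch; the helper name `pick_device` is hypothetical, and it assumes the scripts accept `cpu` as a `--device` value, which this README does not show explicitly.

```python
import importlib.util


def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed yet; nothing can run on a GPU.
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```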

Train the agent

Train the distributional agent on GPU and save a checkpoint:

python train_sales_rl_agent.py \
  --algorithm distributional_a2c \
  --device cuda \
  --batch-envs 256 \
  --hidden-dim 256 \
  --total-updates 480 \
  --evaluate-episodes 512

Train both scalar and distributional baselines and regenerate comparison figures:

python train_sales_rl_agent.py \
  --algorithm both \
  --device cuda \
  --batch-envs 256 \
  --hidden-dim 256 \
  --total-updates 480 \
  --evaluate-episodes 512
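When launching several runs, the command line above can be assembled from Python instead of being retyped. This is a sketch only: `train_command` is a hypothetical helper, and it uses just the flags and values that appear in the two commands above.

```python
import shlex


def train_command(algorithm, device="cuda", batch_envs=256, hidden_dim=256,
                  total_updates=480, evaluate_episodes=512):
    """Build the argument list for train_sales_rl_agent.py."""
    return [
        "python", "train_sales_rl_agent.py",
        "--algorithm", algorithm,
        "--device", device,
        "--batch-envs", str(batch_envs),
        "--hidden-dim", str(hidden_dim),
        "--total-updates", str(total_updates),
        "--evaluate-episodes", str(evaluate_episodes),
    ]


cmd = train_command("distributional_a2c")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.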

Use the trained agent

Score a single manually specified sales state:

python use_sales_rl_agent.py \
  --checkpoint checkpoints/distributional_a2c.pt \
  --device cuda \
  --state-file sample_state.json

Run greedy simulator tests with the saved checkpoint:

python use_sales_rl_agent.py \
  --checkpoint checkpoints/distributional_a2c.pt \
  --device cuda \
  --simulate-episodes 5

Reproduce paper assets

Regenerate the architecture diagram and the benchmark figures, then compile the paper. pdflatex runs twice so that cross-references resolve:

python generate_architecture_figure.py
python run_sales_benchmark.py
pdflatex -interaction=nonstopmode -halt-on-error paper.tex
pdflatex -interaction=nonstopmode -halt-on-error paper.tex
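The same sequence can be driven from Python so that a failing step stops the rest. This is a sketch under the assumption that the commands are run from the repository root; `run_steps` and its `runner` parameter are hypothetical names, and the second pdflatex pass is kept because LaTeX needs it to resolve cross-references.

```python
import subprocess

# Commands copied from the steps above; pdflatex appears twice on purpose.
PDFLATEX = ["pdflatex", "-interaction=nonstopmode", "-halt-on-error", "paper.tex"]
STEPS = [
    ["python", "generate_architecture_figure.py"],
    ["python", "run_sales_benchmark.py"],
    PDFLATEX,
    PDFLATEX,
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; check=True raises on the first failure."""
    completed = []
    for cmd in steps:
        runner(cmd, check=True)
        completed.append(cmd)
    return completed
```

Calling `run_steps(STEPS)` executes the four commands in order and raises `subprocess.CalledProcessError` as soon as one of them exits with a non-zero status.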
