Stateless LLM as Zero-Shot Planners with Trajectory-Aware Prompt and Querying Human Feedback

Paper: Overleaf Report | Author: Chi-Hui Lin

Key Takeaways

This research investigates how reliable stateless LLMs (APIs with no memory between calls) are for online, zero-shot household task planning, and proposes three simple, training-free mechanisms to substantially improve their performance.

Problem: Three Dominant Failure Modes of Stateless LLMs

A naive stateless baseline — prompting the LLM at each step with only the goal, current state, and available actions — consistently fails in three ways:

Error Mode	Description
Repeated strategies	The LLM cycles through the same action or short action sequence in a loop
Non-meaningful actions	The LLM selects actions that leave the environment state unchanged
Over-explanation	The model interleaves natural language with its action output, producing an invalid response

These failures cause unnecessary consumption of planning steps and unexpected episode termination.

Solution: Three Training-Free Mechanisms

Trajectory-aware prompting — Appends the history of executed actions to each prompt, giving the LLM context about what has already been tried.
LLM-initiated queries — Instructs the agent to ask a clarifying question instead of guessing when uncertain; the resulting Q&A pairs are replayed in subsequent prompts.
Post-response check with a second chance — Detects invalid LLM outputs, sends a short corrective feedback message, and lets the model retry before terminating.

Results

Experiments on six household tasks with three frontier LLMs (Gemini-3-Pro, GPT-5.1, Grok-4.1-Fast) in the TextHouse-Gym benchmark show

Gemini-3-Pro: success rate improved from ~10% → 66%
GPT-5.1 and Grok-4.1-Fast: raised to near-perfect performance

Core Insight

Three lightweight, prompt-level interventions — trajectory history, on-demand human queries, and output validation — are sufficient to dramatically improve stateless LLM planning without any fine-tuning or internal memory.

Installation

1. Clone the repository

git clone https://github.com/Ttopiac/llm_world.git
cd llm_world

2. Create and activate the conda environment

conda create -n llm_world python=3.12.12 -y
conda activate llm_world

This gives you an isolated environment with the exact Python version you expect.

3. Install required Python packages

From the repo root (v directory):

pip install "numpy==2.3.5" "pandas==2.3.3" "openai==2.8.1" "gymnasium==1.2.2"

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
GPT-Planner		GPT-Planner
envs/pddlgym_envs		envs/pddlgym_envs
pddlsim		pddlsim
results		results
utils		utils
.gitignore		.gitignore
error_modes.png		error_modes.png
llm_zs_exps.py		llm_zs_exps.py
readme.md		readme.md
show_results.py		show_results.py
success_rate.png		success_rate.png
transition_graph.py		transition_graph.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Stateless LLM as Zero-Shot Planners with Trajectory-Aware Prompt and Querying Human Feedback

Key Takeaways

Problem: Three Dominant Failure Modes of Stateless LLMs

Solution: Three Training-Free Mechanisms

Results

Core Insight

Installation

1. Clone the repository

2. Create and activate the conda environment

3. Install required Python packages

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Stateless LLM as Zero-Shot Planners with Trajectory-Aware Prompt and Querying Human Feedback

Key Takeaways

Problem: Three Dominant Failure Modes of Stateless LLMs

Solution: Three Training-Free Mechanisms

Results

Core Insight

Installation

1. Clone the repository

2. Create and activate the conda environment

3. Install required Python packages

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages