Generative Model of Policies: Exploring the Latent Space with Human Feedback

Installation

This project requires Python 3.10 or later and a working JAX installation. To install JAX, refer to the official JAX installation instructions.

pip install --upgrade pip
pip install -r requirements.txt

Overview

There are three main scripts: train.py, humanfeedback.py and interpolation.py. Each has a number of command-line arguments, which can be listed by running: python <script_name>.py --help.

Training

To start a training run, use the train.py script. This creates a folder in the results/ directory that contains a config file. At the end of training, a tasks.png visualization should also be created automatically. See --help for more information on the hyperparameters.

Human Feedback

To optimize in the latent space with human feedback, run the humanfeedback.py script. You can specify the run folder with --run_path or the environment with --env. See --help for more information.

At the end, a pathhf.npy file should be created, along with a plot of the path through the latent space.
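
The saved path can be inspected with NumPy. The snippet below is only a sketch: it assumes the file stores one latent vector per row, with shape (num_steps, latent_dim), which this README does not confirm. The summarize_path helper is hypothetical and not part of the repository.

```python
import numpy as np


def summarize_path(path_file):
    """Load a saved latent path and report its basic geometry."""
    # Assumption: one latent vector per row, shape (num_steps, latent_dim).
    path = np.atleast_2d(np.load(path_file))
    # Euclidean distance between consecutive points on the path.
    step_lengths = np.linalg.norm(np.diff(path, axis=0), axis=1)
    return {
        "num_steps": path.shape[0],
        "latent_dim": path.shape[1],
        "total_length": float(step_lengths.sum()),
    }
```

For example, summarize_path("pathhf.npy") would return the number of points, the latent dimension and the total distance travelled, provided the layout assumption holds.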

Interpolation between behaviors

To linearly interpolate between behaviors, run the interpolation.py script. It loads the successes.npz file created after training the agent, computes the barycenter of each task in the latent space and starts the visualization. Use the slider to move between behaviors.
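
The idea behind the interpolation can be illustrated with a minimal sketch. It is not the code of interpolation.py: the toy arrays stand in for whatever successes.npz actually contains, and the barycenters and interpolate helpers are hypothetical names chosen for this example.

```python
import numpy as np


def barycenters(latents, task_ids):
    """Return the mean latent vector (barycenter) for each task id."""
    latents = np.asarray(latents, dtype=float)
    task_ids = np.asarray(task_ids)
    return {int(t): latents[task_ids == t].mean(axis=0) for t in np.unique(task_ids)}


def interpolate(z_start, z_end, alpha):
    """Linear interpolation: alpha=0 gives z_start, alpha=1 gives z_end."""
    return (1.0 - alpha) * z_start + alpha * z_end


# Toy data (assumption): latent codes of successful episodes and their task labels.
latents = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]])
task_ids = np.array([0, 0, 1, 1])

centers = barycenters(latents, task_ids)
# Halfway between the barycenter of task 0 and the barycenter of task 1.
midpoint = interpolate(centers[0], centers[1], 0.5)
```

In this picture, the slider position plays the role of alpha: sliding from 0 to 1 moves the latent code along the straight line between the two barycenters.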
