RL-X

Documentation

Repository: https://github.com/nico-bohlinger/RL-X

READMEs

Most documentation is available in the README.md files of the respective directories.

Detailed Installation Guide

1. Conda

For Linux, MacOS and Windows, a conda environment is recommended.
All code was tested with Python 3.11.4; other versions might work as well.

conda create -n rlx python=3.11.4
conda activate rlx

2. RL-X

For Linux, MacOS and Windows, clone the RL-X repository.

git clone git@github.com:nico-bohlinger/RL-X.git
cd RL-X

3. Dependencies

For Linux, all dependencies can be installed with the following command:

pip install -e .[all]

For MacOS and Windows, EnvPool is currently not supported. Therefore, the following command has to be used:

pip install -e .

To keep linting support when registering algorithms or environments outside of RL-X, add the editable_mode=compat argument, e.g.:

pip install -e .[all] --config-settings editable_mode=compat

4. PyTorch

For Linux, MacOS and Windows, PyTorch has to be installed separately with the CUDA 11.8 build to avoid conflicts with JAX. If PyTorch was previously installed with CUDA 12.X (potentially even through pip install -e .), the related packages have to be uninstalled first.

pip uninstall $(pip freeze | grep -i '\-cu12' | cut -d '=' -f 1) -y
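The pipeline above lists every installed package whose name contains -cu12 and uninstalls it. Its filtering logic can be sketched in Python (the package names below are illustrative examples, not a real installation's output):

```python
def cuda12_packages(freeze_lines):
    """Mimic `pip freeze | grep -i '\\-cu12' | cut -d '=' -f 1`:
    keep lines containing '-cu12' (case-insensitive) and strip the
    '==version' suffix to get bare package names."""
    return [
        line.split("=")[0]
        for line in freeze_lines
        if "-cu12" in line.lower()
    ]

# Illustrative `pip freeze` output; the actual names depend on the installation.
freeze = [
    "nvidia-cublas-cu12==12.1.3.1",
    "nvidia-cudnn-cu12==8.9.2.26",
    "torch==2.4.1",
    "numpy==1.26.4",
]
print(cuda12_packages(freeze))  # only the CUDA 12 packages are selected
```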

Afterwards, PyTorch can be installed with the following command:

pip install "torch>=2.4.1" --index-url https://download.pytorch.org/whl/cu118 --upgrade

5. JAX

For Linux, JAX with GPU support can be installed with the following command:

pip install -U "jax[cuda12]"

For MacOS and Windows, JAX with GPU support is not supported out-of-the-box. However, it can be done with some extra effort (see the official JAX installation documentation for more information).

Google Colab

To run experiments in Google Colab, take a look at experiments/colab_experiment.ipynb or open it directly in Google Colab.

Run custom MJX environment

python experiment.py --algorithm.name=ppo.flax --environment.name=custom_mujoco.ant_mjx --runner.track_console=True --environment.nr_envs=4000 --algorithm.nr_steps=10 --algorithm.minibatch_size=1000 --algorithm.nr_epochs=5 --algorithm.evaluation_frequency=-1

Asynchronous vectorized environments with skipping

The gymnasium, custom MuJoCo and custom interface environments support parallel asynchronous vectorized environments with skipping.

When using many parallel environments, some environments can be faster than others at a given time step. With the default AsyncVectorEnv wrapper from gymnasium, a combined step is only completed once all environments have finished their step, which can lead to a lot of idle waiting time.
Therefore, the AsyncVectorEnvWithSkipping wrapper allows skipping up to the slowest x% of environments, sending dummy values for the skipped environments to the algorithm instead. Be careful: this can degrade learning performance, depending on how many environments are skipped and how well the dummy values align with the environment.
Even when no environment should be skipped, the AsyncVectorEnvWithSkipping wrapper can still improve runtime compared to the default gymnasium wrapper: the latter waits sequentially for each environment to finish its step, while the former keeps looping over all environments until they are all finished, so it can already collect data from some environments while the others are still running their step.
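The skipping idea can be sketched roughly as follows. This is a simplified thread-based illustration, not RL-X's actual implementation (the real wrapper works with separate processes and real gymnasium environments; step_env, DUMMY_OBS, and the latencies here are made up for the example):

```python
import time
from concurrent.futures import ThreadPoolExecutor

DUMMY_OBS = None  # stand-in value sent to the algorithm for skipped envs

def step_env(env_id, latency):
    """Simulated environment step that takes `latency` seconds."""
    time.sleep(latency)
    return f"obs_{env_id}"

def async_step_with_skipping(latencies, skip_percentage):
    n = len(latencies)
    must_finish = n - int(n * skip_percentage)  # envs we always wait for
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(step_env, i, lat) for i, lat in enumerate(latencies)]
        # Keep looping over all environments, collecting results as they
        # finish, until enough of them are done.
        while sum(f.done() for f in futures) < must_finish:
            time.sleep(0.001)
        # Stragglers beyond the allowed percentage get dummy values.
        return [f.result() if f.done() else DUMMY_OBS for f in futures]

# Three fast environments and one slow one; up to 25% may be skipped,
# so the slow environment's observation is replaced by the dummy value.
obs = async_step_with_skipping([0.01, 0.01, 0.01, 0.5], skip_percentage=0.25)
print(obs)
```

Note that with skip_percentage=0.0 the loop simply waits for all environments, which still collects results as they arrive instead of polling each environment in a fixed order.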

To set the maximum percentage of environments that can be skipped, set the corresponding command line argument:

No environment is skipped:

--environment.async_skip_percentage=0.0

Up to 25% of the environments can be skipped:

--environment.async_skip_percentage=0.25

Up to 100% of the environments can be skipped:

--environment.async_skip_percentage=1.0
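The percentage translates into a per-step budget of skippable environments. Assuming the budget is simply the floor of nr_envs times the percentage (the exact rounding used by RL-X may differ), the three settings above work out as follows for the 4000 environments used in the MJX example:

```python
def max_skippable(nr_envs, async_skip_percentage):
    # Hypothetical helper, assuming a simple floor of nr_envs * percentage;
    # the actual rounding used by RL-X may differ.
    return int(nr_envs * async_skip_percentage)

for pct in (0.0, 0.25, 1.0):
    print(pct, max_skippable(4000, pct))  # 0, 1000, and 4000 skippable envs
```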