XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library


XuanCe is an open-source ensemble of Deep Reinforcement Learning (DRL) algorithm implementations. The name “XuanCe” (玄策) comes from two Chinese characters:

  • “Xuan” (玄) means incredible, mysterious, or profound.

  • “Ce” (策) means policy or strategy.

Together, XuanCe represents “incredible policies”, reflecting the goal of discovering optimal policies through DRL.

DRL algorithms are sensitive to hyperparameter tuning, vary in performance depending on the implementation tricks used, and often suffer from unstable training. As a result, they can sometimes seem elusive and “Xuan”. That is why this project exists: to provide clean, easy-to-understand implementations of DRL algorithms. We hope it can help uncover some of the “magic” behind DRL and make it a bit less mysterious.

We are also working to make XuanCe compatible with popular deep learning frameworks like PyTorch (torch), TensorFlow (tensorflow), and MindSpore (mindspore). Our goal is to turn it into a full-fledged DRL “zoo” where you can explore and experiment with a wide variety of algorithms.

Why XuanCe?

XuanCe is designed to streamline the implementation and development of deep reinforcement learning algorithms. It empowers researchers to quickly grasp fundamental principles, making it easier to dive into algorithm design and development. Here are its key features:

  • Highly Modular: Designed with a modular structure to enhance flexibility and scalability.

  • User-Friendly: Easy to learn, install, and use, making it accessible for users of all levels.

  • Flexible Model Integration: Supports seamless combination and customization of models.

  • Diverse Algorithms: Offers a rich collection of algorithms catering to various tasks.

  • Versatile Task Support: Handles both deep reinforcement learning (DRL) and multi-agent reinforcement learning (MARL) scenarios.

  • Broad Compatibility: Supports PyTorch, TensorFlow, and MindSpore, and runs on both CPU and GPU across Linux, Windows, and macOS.

  • High Performance: Delivers fast execution speeds, leveraging vectorized environments for efficiency.

  • Distributed Training: Enables multi-GPU training for scaling up experiments.

  • Hyperparameter Tuning: Supports automatic hyperparameter tuning.

  • Enhanced Visualization: Provides intuitive and comprehensive visualization with tools like TensorBoard and Weights & Biases (wandb).
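
The "High Performance" item above mentions vectorized environments. The sketch below illustrates the general idea only and is not XuanCe's implementation: several copies of an environment are stepped in lockstep so that a policy can act on a whole batch of observations at once. The names `CoinFlipEnv` and `ToyVectorEnv` are hypothetical and exist only for this example.

```python
import random
from typing import List, Tuple


class CoinFlipEnv:
    """Hypothetical toy environment: guess the outcome of a biased coin."""

    def __init__(self, seed: int, horizon: int = 5):
        self.rng = random.Random(seed)
        self.horizon = horizon
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return 0  # a single dummy observation

    def step(self, action: int) -> Tuple[int, float, bool]:
        self.t += 1
        coin = 1 if self.rng.random() < 0.7 else 0
        reward = 1.0 if action == coin else 0.0
        done = self.t >= self.horizon
        return 0, reward, done


class ToyVectorEnv:
    """Hypothetical wrapper that steps several environment copies in lockstep."""

    def __init__(self, envs: List[CoinFlipEnv]):
        self.envs = envs

    def reset(self) -> List[int]:
        return [env.reset() for env in self.envs]

    def step(self, actions: List[int]):
        obs, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                o = env.reset()  # auto-reset so the batch never stalls
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return obs, rewards, dones


# Four parallel copies: one call returns one batch of transitions.
vec_env = ToyVectorEnv([CoinFlipEnv(seed=i) for i in range(4)])
batch_obs = vec_env.reset()
batch_obs, batch_rewards, batch_dones = vec_env.step([1, 1, 1, 1])
```

In a real library the per-environment loop is typically replaced by array operations or subprocess workers, and the batched observations feed a single forward pass of the policy network.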

List of Algorithms

        flowchart LR

CORE["Unified Framework (Modularized) <br/>Representation + Policy + Communication (for MARL) + Learner + Agent"]
CORE --> Value[Value-based]
CORE --> Policy[Policy-based]
CORE --> MARL[MARL]
CORE --> Model[Model-based]
CORE --> Contrastive[Contrastive RL]
CORE --> Offline[Offline RL]

Value --> DQN[DQN/DDQN/DuelDQN...]
Policy --> ON[On-policy: PG/A2C/PPO...]
Policy --> OFF[Off-policy: DDPG/SAC/TD3...]

MARL --> ONMA[On-policy: VDAC/COMA/IPPO/MAPPO...]
MARL --> OFFMA[Off-policy: VDN/QMIX/MADDPG/MASAC...]
MARL --> COMMMA[Communication: CommNet/IC3Net/TarMAC...]

Model --> MBRL[DreamerV2/DreamerV3/HarmonyDreamer...]

Contrastive --> CRL[CURL/DrQ/SPR...]

Offline --> OFFLINERL[TD3BC...]
    

The Framework of XuanCe

The overall framework of XuanCe is shown below.

_images/XuanCe-Framework.png

XuanCe contains four main parts:

  • Part I: Configs. The configurations of hyper-parameters, environments, models, etc.

  • Part II: Common tools. Reusable tools that are independent of the choice of DL backend.

  • Part III: Environments. The supported simulated environments.

  • Part IV: Algorithms. The core components for building DRL algorithms.
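
To make the four-part split more concrete, here is a minimal, self-contained sketch of how such parts might fit together. It is an illustration under stated assumptions, not XuanCe's code: `Config`, `ReplayBuffer`, `TwoArmedBandit`, `BanditAgent`, and `train` are all hypothetical names, and the "algorithm" is a simple epsilon-greedy bandit learner rather than a deep RL method.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Config:
    """Part I (configs): hyperparameters collected in one place."""
    buffer_size: int = 100
    batch_size: int = 4
    epsilon: float = 0.1
    learning_rate: float = 0.1
    seed: int = 0


class ReplayBuffer:
    """Part II (common tools): does not depend on any DL backend."""

    def __init__(self, capacity: int, rng: random.Random):
        self.data = deque(maxlen=capacity)
        self.rng = rng

    def store(self, transition):
        self.data.append(transition)

    def sample(self, n: int):
        return self.rng.sample(list(self.data), min(n, len(self.data)))


class TwoArmedBandit:
    """Part III (environments): arm 1 pays off more often than arm 0."""

    def __init__(self, rng: random.Random):
        self.rng = rng

    def step(self, action: int) -> float:
        p = 0.8 if action == 1 else 0.2
        return 1.0 if self.rng.random() < p else 0.0


class BanditAgent:
    """Part IV (algorithms): epsilon-greedy action-value learner."""

    def __init__(self, config: Config, rng: random.Random):
        self.config = config
        self.rng = rng
        self.q = [0.0, 0.0]
        self.buffer = ReplayBuffer(config.buffer_size, rng)

    def act(self) -> int:
        if self.rng.random() < self.config.epsilon:
            return self.rng.randrange(2)  # explore
        return max(range(2), key=lambda a: self.q[a])  # exploit

    def update(self):
        # Move each sampled action's value estimate toward its observed reward.
        for action, reward in self.buffer.sample(self.config.batch_size):
            self.q[action] += self.config.learning_rate * (reward - self.q[action])


def train(config: Config, steps: int = 500) -> BanditAgent:
    rng = random.Random(config.seed)
    env = TwoArmedBandit(rng)
    agent = BanditAgent(config, rng)
    for _ in range(steps):
        action = agent.act()
        reward = env.step(action)
        agent.buffer.store((action, reward))
        agent.update()
    return agent


agent = train(Config())
```

The point of the split is that each part can change independently: a different `Config` changes behavior without touching the agent, and the buffer can be reused by any algorithm regardless of backend.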

Who Is XuanCe For?

XuanCe is designed for a wide range of users, including:

  • Researchers exploring new reinforcement learning methods

  • Developers building DRL-based applications

  • Students and beginners learning about intelligent decision-making

  • AI practitioners interested in single-agent and multi-agent systems


