Documentation | Installation | Examples | Release Notes
ChainerMN is an additional package for Chainer, a flexible deep learning framework. ChainerMN enables multi-node distributed deep learning with the following features:
- Scalable --- it makes full use of the latest technologies such as NVIDIA NCCL and CUDA-Aware MPI,
- Flexible --- even dynamic neural networks can be trained in parallel thanks to Chainer's flexibility, and
- Easy --- minimal changes to existing user code are required.
This blog post provides our benchmark results using up to 128 GPUs.
ChainerMN can be used in both intra-node settings (i.e., multiple GPUs inside a node) and inter-node settings. For inter-node settings, we highly recommend using a high-speed interconnect such as InfiniBand.
In addition to Chainer, ChainerMN depends on the following software libraries: CUDA-Aware MPI, NVIDIA NCCL, and a few Python packages. Once these are set up, ChainerMN can be installed from PyPI:
pip install chainermn
Please refer to the installation guide for more information.
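As a quick sanity check, you can confirm that ChainerMN and its MPI binding import correctly on each process (the process count here is illustrative):
mpiexec -n 2 python -c "import chainermn"
If this exits without errors, the installation is ready for the examples below.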
You can run the MNIST example with four workers using the following command:
mpiexec -n 4 python examples/mnist/train_mnist.py
- Chainer Tutorial --- If you are new to Chainer, we recommend starting here.
- ChainerMN Tutorial --- This tutorial explains, step by step, how to modify your existing Chainer code to enable distributed training with ChainerMN; a minimal sketch of the typical changes follows this list.
- Examples --- These are based on the official Chainer examples, with the differences highlighted.
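To give a concrete picture of what the tutorial covers, here is a minimal sketch of a distributed training script in the style of the MNIST example above. The model definition and hyperparameters are illustrative, and the script assumes it is launched with mpiexec with one GPU available per MPI process:

import chainer
import chainer.functions as F
import chainer.links as L
import chainermn
from chainer import training


class MLP(chainer.Chain):
    # A small illustrative model; any existing Chainer model works the same way.
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(784, n_units)
            self.l2 = L.Linear(n_units, n_out)

    def __call__(self, x):
        return self.l2(F.relu(self.l1(x)))


# Create a communicator; each MPI process drives one GPU, chosen by its
# rank within the node.
comm = chainermn.create_communicator()
device = comm.intra_rank

model = L.Classifier(MLP(1000, 10))
chainer.cuda.get_device_from_id(device).use()
model.to_gpu()

# Wrapping the optimizer is the key ChainerMN change: gradients are
# all-reduced across workers before each parameter update.
optimizer = chainermn.create_multi_node_optimizer(
    chainer.optimizers.Adam(), comm)
optimizer.setup(model)

# Scatter the dataset so that each worker trains on its own shard.
train, test = chainer.datasets.get_mnist()
train = chainermn.scatter_dataset(train, comm)

train_iter = chainer.iterators.SerialIterator(train, batch_size=100)
updater = training.StandardUpdater(train_iter, optimizer, device=device)
trainer = training.Trainer(updater, (10, 'epoch'))
trainer.run()

Launched with mpiexec as shown above, every process runs this same script; the communicator, the wrapped optimizer, and the scattered dataset are the only differences from a single-GPU Chainer script.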
Any contribution to ChainerMN would be highly appreciated. Please refer to the Chainer Contribution Guide.