Installation Guide¶
Requirements¶
In addition to Chainer, ChainerMN depends on the following software libraries: CUDA-Aware MPI, NVIDIA NCCL, and a few Python packages including MPI4py and cupy.
Chainer¶
ChainerMN adds distributed training features to Chainer, so Chainer must be installed first. Please refer to the official instructions to install it.
CUDA-Aware MPI¶
ChainerMN relies on MPI. In particular, for efficient communication between GPUs, it uses CUDA-aware MPI. For details about CUDA-aware MPI, see this introduction article. (If you use only the CPU mode, MPI does not need to be CUDA-Aware. See Non-GPU environments for more details.)
CUDA-aware support depends on the MPI implementation, which must be configured and built with CUDA enabled. The following are example build commands for Open MPI and MVAPICH.
Open MPI (for details, see the official instructions):
$ ./configure --with-cuda
$ make -j4
$ sudo make install
MVAPICH (for details, see the official instructions):
$ ./configure --enable-cuda
$ make -j4
$ sudo make install
$ export MV2_USE_CUDA=1 # Should be set all the time when using ChainerMN
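After building, it can help to confirm that the MPI you will actually run is CUDA-aware. The following is a minimal sketch, not part of ChainerMN: the function names are hypothetical, it assumes Python 3.5 or later, and it assumes an Open MPI version recent enough to report the `mpi_built_with_cuda_support` parameter through `ompi_info` (older releases may not list it, in which case the check returns `None`).

```python
import os
import shutil
import subprocess


def parse_cuda_support(ompi_info_output):
    """Return True/False from ompi_info output, or None if the key is absent."""
    for line in ompi_info_output.splitlines():
        if "mpi_built_with_cuda_support:value" in line:
            # Parsable lines end with ":true" or ":false".
            return line.rsplit(":", 1)[-1].strip() == "true"
    return None


def open_mpi_is_cuda_aware():
    """Ask ompi_info whether Open MPI was built with CUDA support."""
    if shutil.which("ompi_info") is None:
        return None  # Open MPI tools are not on PATH
    result = subprocess.run(
        ["ompi_info", "--parsable", "--all"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return parse_cuda_support(result.stdout)


def mvapich_cuda_enabled(environ=os.environ):
    """Check that MV2_USE_CUDA=1 is exported, as this guide asks for MVAPICH."""
    return environ.get("MV2_USE_CUDA") == "1"


if __name__ == "__main__":
    print("Open MPI built with CUDA:", open_mpi_is_cuda_aware())
    print("MV2_USE_CUDA=1 exported:", mvapich_cuda_enabled())
```

A result of `None` from the Open MPI check means the answer could not be determined, not that CUDA support is missing.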
NCCL¶
To enable efficient intra-node GPU-to-GPU communication, ChainerMN uses the NVIDIA Collective Communications Library (NCCL). See the official instructions for installation.
ChainerMN requires NCCL even if you have only one GPU per node. The only exception is when you run ChainerMN on CPU-only environments. See Non-GPU environments for more details.
Note
We recommend NCCL 2, but NCCL 1 can also be used.
If you use CUDA 7.0 or 7.5, please install NCCL 1, because NCCL 2 does not support those CUDA versions.
Note, however, that PureNcclCommunicator is not supported in ChainerMN with NCCL 1.
If you use NCCL 1, please configure the environment variables so that NCCL is visible both when you install and when you use ChainerMN.
A typical configuration looks like the following:
export NCCL_ROOT=<path to NCCL directory>
export CPATH=$NCCL_ROOT/include:$CPATH
export LD_LIBRARY_PATH=$NCCL_ROOT/lib/:$LD_LIBRARY_PATH
export LIBRARY_PATH=$NCCL_ROOT/lib/:$LIBRARY_PATH
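Because these variables must be present both at install time and at run time, a quick check before either step can save a failed build. The sketch below is illustrative only: `missing_nccl_paths` is a hypothetical helper, and it assumes the `include` and `lib` layout under `NCCL_ROOT` shown in the exports above.

```python
import os


def missing_nccl_paths(environ=os.environ):
    """Return the names of variables that lack the expected NCCL path."""
    root = environ.get("NCCL_ROOT")
    if not root:
        return ["NCCL_ROOT"]
    # Expected entries, mirroring the export lines in this guide.
    expected = {
        "CPATH": os.path.join(root, "include"),
        "LD_LIBRARY_PATH": os.path.join(root, "lib"),
        "LIBRARY_PATH": os.path.join(root, "lib"),
    }
    missing = []
    for name, path in expected.items():
        # Normalize entries so a trailing slash does not cause a mismatch.
        entries = [
            os.path.normpath(entry)
            for entry in environ.get(name, "").split(os.pathsep)
            if entry
        ]
        if os.path.normpath(path) not in entries:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    problems = missing_nccl_paths()
    print("NCCL variables look fine" if not problems else "Check: " + ", ".join(problems))
```

The check only compares strings in the environment; it does not verify that the NCCL headers or libraries actually exist at those paths.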
MPI4py¶
ChainerMN depends on a few Python packages, which are automatically installed when you install ChainerMN.
Among them, MPI4py needs some care. It links to MPI at installation time, so please configure your environment variables so that MPI is available when MPI4py is installed. In particular, if you have multiple MPI implementations in your environment, please expose the implementation that you want to use both when you install and when you use ChainerMN.
In addition, Cython may not be installed automatically. It can be installed manually via pip:
$ pip install cython
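When several MPI implementations are installed, it is easy to build MPI4py against one and launch with another. As a minimal sketch (the helper name `linked_mpi` is hypothetical), the following asks MPI4py which MPI it was built against and shows which `mpiexec` is first on `PATH`, so the two can be compared by eye. It returns `None` if MPI4py cannot be imported.

```python
import shutil


def linked_mpi():
    """Return (vendor, version) of the MPI behind mpi4py, or None if unavailable."""
    try:
        from mpi4py import MPI  # importing this module initializes MPI
    except ImportError:
        return None
    # get_vendor() reports, for example, ('Open MPI', (2, 1, 1)).
    return MPI.get_vendor()


if __name__ == "__main__":
    print("mpi4py is linked against:", linked_mpi())
    print("mpiexec found on PATH at:", shutil.which("mpiexec"))
```

If the vendor reported by MPI4py does not match the `mpiexec` on `PATH`, reinstalling MPI4py with the intended MPI exposed in the environment is the usual remedy.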
Tested Environments¶
We tested ChainerMN on all the following environments.
- OS
  - Ubuntu 14.04 LTS 64bit
- Python 2.7.13, 3.5.1, 3.6.1
- Chainer 3.1.0, 3.2.0
- MPI
  - openmpi 1.6.5, 1.10.3, 2.1.1
  - mvapich 2.2
- MPI4py 2.0.0
- NCCL 1.3.4, 2.0.4
Installation¶
Install via pip¶
We recommend installing ChainerMN via pip:
$ pip install chainermn
NOTE: If you need sudo to use pip, be careful about environment variables. The sudo command DOES NOT inherit the environment, so you need to pass the variables explicitly on the command line:
$ sudo CPATH=${CPATH} LIBRARY_PATH=${LIBRARY_PATH} pip install chainermn
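If the installation is driven from a script, the same idea applies: the variables have to be spelled out on the sudo command line. The sketch below builds such a command as an argument list. `sudo_pip_install` is a hypothetical helper, and it invokes pip as `python -m pip` through the current interpreter, which is an assumption rather than what the command above does.

```python
import os
import sys


def sudo_pip_install(package, forward=("CPATH", "LIBRARY_PATH"), environ=os.environ):
    """Build a sudo command line that passes selected variables explicitly."""
    # Each VAR=value argument is handed to sudo ahead of the command itself.
    assignments = ["{}={}".format(name, environ.get(name, "")) for name in forward]
    return ["sudo"] + assignments + [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(sudo_pip_install("chainermn")))
```

To run the command rather than print it, the list can be passed to `subprocess.check_call`.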
Install from Source¶
You can use setup.py to install ChainerMN from source:
$ tar zxf chainermn-x.y.z.tar.gz
$ cd chainermn-x.y.z
$ python setup.py install
Non-GPU environments¶
For users who want to try ChainerMN in a CPU-only environment, typically for testing or debugging purposes, ChainerMN can be built with the --no-nccl flag:
$ python setup.py install --no-nccl
In this case, MPI does not have to be CUDA-aware.
Only the naive communicator works in CPU mode.
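In a training script, this means the communicator name has to be chosen according to whether GPUs are used. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and `hierarchical` is assumed as the communicator for the GPU case, while `naive` is the one this guide names for CPU mode.

```python
def communicator_name(use_gpu):
    """Pick a communicator name: 'naive' is the only option in CPU mode."""
    # 'hierarchical' for GPU runs is an assumption made for this sketch.
    return "hierarchical" if use_gpu else "naive"


def create_communicator(use_gpu=False):
    """Create a ChainerMN communicator that suits the current environment."""
    import chainermn  # imported lazily so communicator_name works without it
    return chainermn.create_communicator(communicator_name(use_gpu))


if __name__ == "__main__":
    print("CPU-only run would use:", communicator_name(False))
```

A script using `create_communicator()` is launched through MPI as usual, for example with `mpiexec -n 2 python train.py`, where `train.py` stands in for your own script.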