@nrailgun 2016-10-31T20:24:28.000000Z Word count: 2341 Views: 1951

MXNet

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

Powerful Software


MXNet is a machine learning library to ease the development of ML algorithms, especially for deep neural networks. MXNet is computation and memory efficient and runs on various heterogeneous systems.

1. Introduction

The scale and complexity of machine learning algorithms are becoming increasingly large. Almost all recent ImageNet challenge winners employ very deep neural networks, requiring billions of floating-point operations to process a single sample. The rise in computational complexity poses interesting challenges to ML system design and implementation.

How the computation is carried out:

Comparison with other popular open-source ML libraries:

System      Core language  Devices           Distributed
Caffe       C++            CPU / GPU
Torch       Lua            CPU / GPU / FPGA
TensorFlow  C++            CPU / GPU
MXNet       C++            CPU / GPU

2. Programming Interface

2.1 Symbol: Declarative Symbolic Expressions

2.2 NDArray: Imperative Tensor Computation

2.3 KVStore: Data Synchronization Over Devices

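The KVStore is a distributed key-value store for data synchronization over multiple devices (machines, GPUs). It supports two primitives: push, which sends a key-value pair from a device to the store, and pull, which retrieves the value for a key from the store.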

The following example implements distributed gradient descent with data parallelism.

    while (1) {
        kv.pull(net.w);
        net.forward_backward();
        kv.push(net.g);
    }

Here the weight-updating function is registered with the KVStore, and each worker repeatedly pulls the newest weight from the store and then pushes out its locally computed gradient.

The mixed implementation above performs the same as a single declarative program, because the actual data push and pull are lazily evaluated and scheduled by the backend engine just like any other operation.
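To make the pull / compute / push loop concrete, here is a minimal Python sketch. It is not MXNet's API: `ToyKVStore`, `sgd_updater`, `local_gradient`, and `train` are hypothetical names invented for illustration, the store lives in a single process, and the "workers" are simulated by iterating over data shards one after another rather than running concurrently or lazily. The model is a single scalar weight fitted to y = 3x by least squares.

```python
class ToyKVStore:
    """In-memory stand-in for a key-value store (hypothetical, not MXNet's API)."""

    def __init__(self, updater):
        self._values = {}
        # updater(key, pushed_value, stored_value) -> new stored value
        self._updater = updater

    def init(self, key, value):
        self._values[key] = value

    def push(self, key, grad):
        # The registered updater merges the pushed gradient into the stored weight.
        self._values[key] = self._updater(key, grad, self._values[key])

    def pull(self, key):
        return self._values[key]


def sgd_updater(lr):
    """Build a plain SGD update rule with learning rate lr."""
    def update(key, grad, weight):
        return weight - lr * grad
    return update


def local_gradient(w, shard):
    # Gradient of the mean of 0.5 * (w * x - y)^2 over one worker's shard.
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def train(shards, steps=200, lr=0.05):
    kv = ToyKVStore(sgd_updater(lr))
    kv.init("w", 0.0)
    for _ in range(steps):
        for shard in shards:              # each shard plays the role of one worker
            w = kv.pull("w")              # pull the newest weight
            g = local_gradient(w, shard)  # forward/backward on local data
            kv.push("w", g)               # push the locally computed gradient
    return kv.pull("w")


# Two simulated workers, each holding a shard of points on the line y = 3x.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w = train(shards)
```

With these shards and this learning rate, `w` converges to a value very close to 3. The point of the design is that workers never apply updates themselves: they only push gradients, and the update rule registered with the store decides how the weight changes.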

3. Implementation

3.1 Computation Graph

3.2 Dependency Engine

3.3 Data communication

We implemented the KVStore on top of the parameter server. It differs from previous work in two aspects. First, we use the engine to schedule the KVStore operations and manage data consistency. Second, we adopt a two-level structure: a level-1 server manages data synchronization between the devices within a single machine, while a level-2 server manages inter-machine synchronization.
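The two-level structure can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names, the cluster layout, and the choice of summation as the aggregation step are hypothetical and do not describe MXNet's actual implementation. The sketch shows one plausible benefit of the hierarchy: if each machine combines its devices' gradients locally first, fewer values need to cross the network.

```python
def level1_aggregate(device_grads):
    """Level 1: combine the gradients from the devices inside one machine."""
    return sum(device_grads)


def level2_aggregate(machine_grads):
    """Level 2: combine the per-machine results across machines."""
    return sum(machine_grads)


# Hypothetical cluster: two machines with four devices each,
# every device holding one scalar gradient.
cluster = {
    "machine0": [1.0, 2.0, 3.0, 4.0],
    "machine1": [5.0, 6.0, 7.0, 8.0],
}

# Each machine reduces its own devices' gradients locally.
per_machine = {name: level1_aggregate(grads) for name, grads in cluster.items()}

# Only one aggregated value per machine is sent to the second level.
total = level2_aggregate(per_machine.values())

# Values crossing machine boundaries with and without local aggregation.
inter_machine_messages = len(per_machine)
flat_messages = sum(len(grads) for grads in cluster.values())
```

In this toy setup the second level receives two values instead of eight, while the final sum is the same as summing all device gradients directly.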
