@nrailgun
2016-10-31T20:24:28.000000Z
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
Powerful Software
MXNet is a machine learning library that eases the development of ML algorithms, especially deep neural networks. It is computation- and memory-efficient and runs on a variety of heterogeneous systems.
The scale and complexity of machine learning algorithms are growing rapidly. Almost all recent ImageNet challenge winners employ very deep neural networks, which require billions of floating-point operations to process a single sample. This rise in computational complexity poses interesting challenges for ML system design and implementation.
How the computation is carried out:
Comparison with other popular open-source ML libraries:
System | Core language | Devices | Distributed |
---|---|---|---|
Caffe | C++ | CPU / GPU | |
Torch | Lua | CPU / GPU / FPGA | |
TensorFlow | C++ | CPU / GPU | |
MXNet | C++ | CPU / GPU | |
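The KVStore is a distributed key-value store for synchronizing data across multiple devices (machines, GPUs). It supports two primitives: push, which sends a key-value pair from a device to the store, and pull, which retrieves the value for a key from the store.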
The following example implements distributed gradient descent with data parallelism:
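    while (1) {
        kv.pull(net.w);          // fetch the newest weights from the store
        net.forward_backward();  // compute gradients on the local data
        kv.push(net.g);          // send the local gradients to the store
    }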
where the weight-update function is registered with the KVStore, and each worker repeatedly pulls the newest weights from the store and then pushes out its locally computed gradients.
The mixed implementation above performs as well as a single declarative program, because the actual data push and pull are evaluated lazily and are scheduled by the backend engine just like any other operation.
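The pull, compute, push cycle and the registered update function can be sketched in plain Python. This is a minimal single-process illustration, not MXNet's API: `ToyKVStore`, `set_updater`, `gradient`, and `train` are hypothetical names, the store runs push and pull eagerly rather than through a scheduling engine, and the simulated workers take turns instead of running in parallel.

```python
class ToyKVStore:
    """Hypothetical in-process stand-in for a distributed key-value store."""

    def __init__(self):
        self._values = {}
        self._updater = None

    def set_updater(self, updater):
        # updater(stored_value, pushed_value) -> new stored value
        self._updater = updater

    def init(self, key, value):
        self._values[key] = list(value)

    def push(self, key, value):
        # Merge the pushed value into the store with the registered updater.
        self._values[key] = self._updater(self._values[key], value)

    def pull(self, key):
        # Return a copy so workers cannot mutate the stored value directly.
        return list(self._values[key])


def gradient(w, shard):
    """Gradient of the mean squared error for y ~ w[0] * x + w[1] on one shard."""
    g0 = g1 = 0.0
    for x, y in shard:
        err = w[0] * x + w[1] - y
        g0 += 2.0 * err * x
        g1 += 2.0 * err
    n = len(shard)
    return [g0 / n, g1 / n]


def train(shards, steps=1000, lr=0.05):
    kv = ToyKVStore()
    # Registered update rule: a plain SGD step applied on the store side.
    kv.set_updater(lambda w, g: [wi - lr * gi for wi, gi in zip(w, g)])
    kv.init("w", [0.0, 0.0])
    for _ in range(steps):
        for shard in shards:        # each shard plays the role of one worker
            w = kv.pull("w")        # corresponds to kv.pull(net.w)
            g = gradient(w, shard)  # corresponds to net.forward_backward()
            kv.push("w", g)         # corresponds to kv.push(net.g)
    return kv.pull("w")


# Data generated from y = 2x + 1, split across two simulated workers.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(10)]
weights = train([data[:5], data[5:]])
```

Because the update rule lives in the store, the workers never apply the SGD step themselves; they only exchange weights and gradients, which is the division of labor the pseudocode describes.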
We implemented KVStore on top of the parameter server. It differs from previous work in two respects. First, we use the engine to schedule KVStore operations and manage data consistency. Second, we adopt a two-level structure: a level-1 server manages data synchronization between the devices within a single machine, while a level-2 server manages inter-machine synchronization.
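The two-level structure can be illustrated with a small aggregation sketch. It assumes that synchronization means summing gradient vectors, and every name in it (`level1_aggregate`, `level2_aggregate`, `two_level_sum`, `cluster`) is hypothetical; it only shows how aggregating inside each machine first reduces the number of vectors that need to cross the network.

```python
def level1_aggregate(device_grads):
    """Sum gradients from all devices inside one machine (level-1 server)."""
    return [sum(vals) for vals in zip(*device_grads)]


def level2_aggregate(machine_grads):
    """Sum the per-machine results across machines (level-2 server)."""
    return [sum(vals) for vals in zip(*machine_grads)]


def two_level_sum(cluster):
    """cluster: list of machines, each a list of per-device gradient vectors."""
    per_machine = [level1_aggregate(devices) for devices in cluster]
    # Only one aggregated vector per machine is sent between machines.
    inter_machine_messages = len(per_machine)
    return level2_aggregate(per_machine), inter_machine_messages


cluster = [
    [[1.0, 2.0], [3.0, 4.0]],                # machine 0: two devices
    [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]],   # machine 1: three devices
]
total, messages = two_level_sum(cluster)

# A flat, single-level design would send one vector per device instead.
flat_messages = sum(len(devices) for devices in cluster)
```

In this toy cluster the two-level scheme sends two vectors between machines where a flat scheme would send five, while the final sum is the same.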