Kavarakuntla, Tulasi, Han, Liangxiu ORCID: https://orcid.org/0000-0003-2491-7473, Lloyd, Huw ORCID: https://orcid.org/0000-0001-6537-4036, Latham, Annabel ORCID: https://orcid.org/0000-0002-8410-7950 and Akintoye, Samson B (2022) Performance Analysis of Distributed Deep Learning Frameworks in a Multi-GPU Environment. In: 2021 20th International Conference on Ubiquitous Computing and Communications, 20 December 2021 - 22 December 2021, London, UK.
Abstract
Deep learning frameworks, such as TensorFlow, MXNet, and Chainer, provide many basic building blocks for designing effective neural network models for various applications (e.g. computer vision, speech recognition, natural language processing). However, the run-time performance of these frameworks varies significantly, even when training identical deep network models on the same GPUs. This study presents an experimental analysis and performance model for assessing deep learning models (Convolutional Neural Networks (CNNs), Multilayer Perceptrons (MLPs), and Autoencoders) on three frameworks, TensorFlow, MXNet, and Chainer, in a multi-GPU environment. We analyse the factors that influence each framework's performance by computing its running time under our proposed model, taking the load imbalance factor into account. The evaluation results highlight significant differences in the scalability of the frameworks and the importance of load balancing in parallel distributed deep learning.
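The load imbalance factor mentioned in the abstract can be illustrated with a minimal sketch. This is an assumption for illustration only, not the paper's actual performance model: one common way to quantify imbalance is the ratio of the slowest GPU's measured per-iteration time to the mean across GPUs, where a value of 1.0 indicates perfect balance.

    # Illustrative sketch only (hypothetical, not the authors' model):
    # quantify load imbalance from measured per-GPU iteration times.
    from typing import Sequence

    def load_imbalance_factor(per_gpu_times: Sequence[float]) -> float:
        """Ratio of the slowest GPU's time to the mean time; 1.0 = perfectly balanced."""
        mean_time = sum(per_gpu_times) / len(per_gpu_times)
        return max(per_gpu_times) / mean_time

    # Example: four GPUs with one straggler
    print(load_imbalance_factor([1.00, 1.02, 0.98, 1.30]))  # ~1.21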