A US-based semiconductor company asked Auriga to analyze and optimize the performance of machine learning (ML) and deep learning (DL) libraries on its new processors.
- Researched and optimized the performance of ML/DL libraries: TensorFlow, Caffe, MXNet, scikit-learn, etc.
- Performed low-level analysis of bottlenecks in basic mathematical algorithms (vector/matrix multiplication); a micro-benchmark sketch follows this list.
- Analyzed the libraries' parallel computation features.
- Compared results against state-of-the-art benchmarks from the leading hardware manufacturers.
- Revealed the neural network libraries' bottlenecks through detailed performance analysis.
- Maximized performance by finding optimal hardware/software configurations.
- Developed benchmarks demonstrating 20-30% higher performance for deep neural network training on the new processors compared to competing platforms.
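
The low-level analysis of basic mathematical kernels typically starts from micro-benchmarks of dense matrix multiplication. The snippet below is a minimal sketch of such a measurement, assuming NumPy linked against the platform's BLAS; the matrix sizes and the GFLOP/s reporting are illustrative only and are not the actual benchmarks developed for the client. Comparing the measured rate against the processor's theoretical peak indicates whether the math library or threading configuration is the bottleneck.

```python
import time
import numpy as np

def time_gemm(n, repeats=10):
    """Time an n x n single-precision matrix multiplication and report GFLOP/s."""
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    np.dot(a, b)  # warm-up run so one-time initialization does not skew the timing
    start = time.perf_counter()
    for _ in range(repeats):
        np.dot(a, b)
    elapsed = (time.perf_counter() - start) / repeats
    gflops = 2.0 * n ** 3 / elapsed / 1e9  # dense GEMM costs roughly 2*n^3 FLOPs
    return elapsed, gflops

if __name__ == "__main__":
    # Sweep problem sizes to see where the measured rate falls off
    # relative to the processor's theoretical peak.
    for n in (512, 1024, 2048, 4096):
        sec, gflops = time_gemm(n)
        print(f"n={n:5d}  {sec * 1e3:8.2f} ms  {gflops:8.1f} GFLOP/s")
```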