Proceedings of the Southwest State University

ALGORITHMIC OPTIMIZATION OF SOFTWARE IMPLEMENTATION OF ALGORITHMS FOR MULTIPLYING DENSE REAL MATRICES ON GRAPHICS PROCESSORS WITH OPENGL TECHNOLOGY SUPPORT

https://doi.org/10.21869/2223-1560-2017-21-5-06-15

Abstract

The article states the problem of matrix multiplication. It is shown that the problem is simple to formulate, but solving it efficiently may require both heuristic methods and a set of algorithmic modifications, relating to algorithmic and high-level software optimization, that take the particular problem into account and increase multiplication performance. The work includes a comparative analysis of performance with and without GPU-specific optimizations, which showed that computations that do not optimize access to global GPU memory have low performance. Optimizing the distribution of data between the GPU's global and local memory reduces computation time and increases real performance. To compare the developed software implementations for OpenGL and CUDA, identical computations were performed on identical GPUs; they showed higher real performance when using CUDA. Specific performance values measured for the multi-threaded GPU implementation are given for each of the described optimizations. The most effective approach is shown to be caching sub-blocks of the matrices (tiles) in the GPU's on-chip local memory, which, with a specialized software implementation, provides a performance of 275.3 GFLOP/s on a GeForce GTX 960M GPU.
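The tiling idea named in the abstract can be illustrated with a minimal CPU-side sketch in Python. It is not the authors' GPU implementation: the function names, the tile size, and the conventional operation count of 2n^3 used in `gflops` are assumptions made here for illustration. Each pair of sub-blocks is loaded once and reused for a whole tile of the result, which on a GPU is the role played by on-chip local memory.

```python
def matmul_naive(a, b, n):
    """Reference triple loop: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_tiled(a, b, n, tile=4):
    """Blocked multiply of two n x n matrices using tile x tile sub-blocks.

    The tile size is a hypothetical parameter; on a GPU it would be chosen
    to fit the work-group and the available local memory.
    """
    c = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        i_end = min(i0 + tile, n)
        for j0 in range(0, n, tile):
            j_end = min(j0 + tile, n)
            for k0 in range(0, n, tile):
                k_end = min(k0 + tile, n)
                # On a GPU, the A sub-block (rows i0..i_end, cols k0..k_end)
                # and the B sub-block (rows k0..k_end, cols j0..j_end) would
                # be copied into local memory once here and then reused by
                # every thread that computes an element of this C tile.
                for i in range(i0, i_end):
                    row_a = a[i]
                    row_c = c[i]
                    for k in range(k0, k_end):
                        aik = row_a[k]
                        row_b = b[k]
                        for j in range(j0, j_end):
                            row_c[j] += aik * row_b[j]
    return c


def gflops(n, seconds):
    """Performance in GFLOP/s, assuming 2 * n^3 operations per multiply
    (n^3 multiplications plus n^3 additions)."""
    return 2.0 * n ** 3 / seconds / 1e9
```

For example, under the 2n^3 assumption, multiplying two 1000 x 1000 matrices in 2 seconds corresponds to `gflops(1000, 2.0)`, that is 1 GFLOP/s. The tiled and naive versions compute the same product; only the order in which memory is touched differs.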

About the Authors

Y. A. Zatolokin
Southwest State University
Russian Federation


E. I. Vatutin
Southwest State University
Russian Federation


V. S. Titov
Southwest State University
Russian Federation


For citations:


Zatolokin Y.A., Vatutin E.I., Titov V.S. ALGORITHMIC OPTIMIZATION OF SOFTWARE IMPLEMENTATION OF ALGORITHMS FOR MULTIPLYING DENSE REAL MATRICES ON GRAPHICS PROCESSORS WITH OPENGL TECHNOLOGY SUPPORT. Proceedings of the Southwest State University. 2017;21(5):6-15. (In Russ.) https://doi.org/10.21869/2223-1560-2017-21-5-06-15

This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2223-1560 (Print)
ISSN 2686-6757 (Online)