
Enabling the CUDA Unified Memory model in Edge, Cloud and HPC offloaded GPU kernels

M. Lapegna; G. Laccetti
2022

Abstract

The use of hardware accelerators, based on code and data offloading to overcome the limitations of CPU core counts, has been one of the main distinctive trends in high-end computing and related applications over the last decade. However, while code offloading has become a commonly used paradigm that is convenient for improving performance, memory access and management remain a source of bottlenecks due to the need to interact with different address spaces. In this regard, NVIDIA introduced the CUDA Unified Memory model to avoid explicit memory copies between the machine hosting the accelerator device and the device itself, and vice versa. This paper presents a novel design and implementation of support for CUDA Unified Memory in open-source GPGPU virtualization services. The performance evaluation demonstrates that the overhead due to virtualization and remoting is acceptable given the possibility of sharing CUDA-enabled GPUs between various heterogeneous machines hosted at the edge, in cloud infrastructures, or as accelerator nodes in an HPC scenario. A prototype implementation of the proposed solution is available as open source.
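For context, the Unified Memory model the abstract refers to replaces the classic cudaMalloc/cudaMemcpy pattern with a single managed allocation that both host and device can access. The following minimal sketch, not taken from the paper, illustrates the idea using standard CUDA runtime calls:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial kernel: scale each element of the array in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *data;

    // Unified Memory: one allocation visible to both host and device.
    // The CUDA runtime migrates pages on demand, so no explicit
    // cudaMemcpy between host and device address spaces is needed.
    cudaMallocManaged(&data, n * sizeof(float));

    for (int i = 0; i < n; ++i) data[i] = 1.0f;  // host writes directly

    scale<<<(n + 255) / 256, 256>>>(data, n, 2.0f);
    cudaDeviceSynchronize();  // ensure the kernel finished before host reads

    printf("data[0] = %f\n", data[0]);  // host reads the result directly
    cudaFree(data);
    return 0;
}
```

In a virtualized or remote-GPU setting such as the one the paper targets, intercepting and forwarding this managed-memory behavior is what the proposed design has to support, since there is no explicit copy call for the virtualization layer to hook.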
ISBN: 978-1-6654-9956-9
Enabling the CUDA Unified Memory model in Edge, Cloud and HPC offloaded GPU kernels / Montella, R.; Di Luccio, D.; De Vita, C. G.; Mellone, G.; Lapegna, M.; Laccetti, G.; Giunta, G.; Kosta, S. - (2022), pp. 834-841. (22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2022, Taormina, Italy, 16-19 May 2022) [10.1109/CCGrid54584.2022.00099].
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/886561
Citations
  • PMC: ND
  • Scopus: 6
  • Web of Science (ISI): 1