A Parallel Implementation of the Hestenes-Jacobi-One-Sides Method Using GPU-CUDA

Cuomo, Salvatore; Marcellino, Livia; Navarra, Guglielmo
2018

Abstract

In this work, we present a parallel implementation of the one-sided Hestenes-Jacobi method that exploits the CUDA environment of Graphics Processing Units (GPUs). Our approach is based on a scheme that performs multiple orthogonalization processes in parallel, across multiple rows and columns. Driven by an outer loop executed on the CPU, the algorithm configures the CUDA grid of threads and blocks so that the CUDA kernels can use shared memory and avoid repeated accesses to global memory. We use this GPU-parallel algorithm to accelerate the Singular Value Decomposition (SVD), which has a variety of applications in scientific computing, signal processing, automatic control, and many other areas. Preliminary experiments show significant performance improvements with respect to the CPU version and our previous GPU version.
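
To make the orthogonalization step named in the abstract concrete, the following is a minimal sequential sketch of a one-sided (Hestenes) Jacobi iteration in plain Python. It is not the authors' CUDA implementation: the function name, the tolerance, the sweep limit, and the row-cyclic pair ordering are all assumptions chosen for illustration. Each rotation makes one pair of columns orthogonal, and the outer loop repeats sweeps until no pair needs rotating; the singular values are then the column norms.

```python
import math


def hestenes_jacobi_singular_values(a, tol=1e-12, max_sweeps=60):
    """Illustrative one-sided Jacobi: returns singular values, largest first.

    `a` is a list of rows. `tol` and `max_sweeps` are assumed defaults,
    not values taken from the paper.
    """
    m, n = len(a), len(a[0])
    # Store the matrix by columns: each rotation touches exactly two columns.
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]

    for _ in range(max_sweeps):  # outer loop over sweeps
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(x * x for x in cols[p])
                beta = sum(y * y for y in cols[q])
                gamma = sum(x * y for x, y in zip(cols[p], cols[q]))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # this pair is already orthogonal
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for i in range(m):
                    x, y = cols[p][i], cols[q][i]
                    cols[p][i] = c * x - s * y
                    cols[q][i] = s * x + c * y
        if not rotated:
            break  # every pair is orthogonal: converged

    return sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)
```

Rotations on disjoint column pairs read and write different columns, so they are independent of one another; that independence is what makes it possible to run several orthogonalizations at the same time, as the abstract describes for the GPU version.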
Year: 2018
ISBN: 9781538649756
A Parallel Implementation of the Hestenes-Jacobi-One-Sides Method Using GPU-CUDA / Cuomo, Salvatore; Marcellino, Livia; Navarra, Guglielmo. - (2018), pp. 722-725. (Paper presented at the 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018, held in GBR in 2018) [10.1109/PDP2018.2018.00118].
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/728293