Dynamic Parallelism in reflectarray antenna analysis and synthesis on GPU

Capozzoli, Amedeo; Curcio, Claudio; Liseno, Angelo; Toso, Giovanni

Microstrip reflectarrays combine some of the appealing features of microstrip patch arrays and reflectors and have recently attracted interest for many applications, such as direct broadcast satellite (DBS) services, Earth remote sensing and micro-spacecraft missions, to mention just a few. Indeed, they are flexible like arrays and, like reflectors, they do not require cumbersome beamforming networks. They are low cost, can be faceted or conformed and can be aesthetically pleasing for commercial applications. To achieve high performance reflectarrays, as needed for example in satellite telecommunications, radiative models accounting for all the antenna degrees of freedom and relying on the minimum number of approximations are required. To deal with such models, accurate, efficient and effective design strategies, possibly based on sophisticated multistep approaches that guarantee a satisfactory degree of reliability, are required. Electrically large, high performance reflectarrays with hundreds or thousands of control parameters entail a high computational burden, so implementations on massively parallel architectures are necessary to perform large-scale, accurate computations. We discuss the computationally critical steps of the synthesis procedure and present their implementation and optimization on Graphics Processing Units (GPUs) in the Compute Unified Device Architecture (CUDA) language. Attention is focused on two points: the fast evaluation of the radiation operator and of the functional gradient, and the fast implementation of the optimization algorithms. Concerning the former, different tools are described, depending on the features of the considered reflectarray and on the model simplifications involved, such as the 'p-series' representation and the 2D Non-Uniform FFT (NUFFT) of both the NER (Non-Equispaced Results) and NED (Non-Equispaced Data) types.
The respective CUDA codes have been deeply optimized in terms of memory management and transfers, latency hiding, branch path analysis, instruction selection and use of atomic instructions, paying particular attention to the improvements achieved by one of the latest features of NVIDIA Kepler cards, namely dynamic parallelism. Dynamic parallelism is the capability of a GPU to assign tasks to itself, and it aims at simultaneously simplifying programming and improving performance. As Matlab has become a common platform for technical computing, the interfacing of these procedures to standard Matlab scripts is also detailed. Results are presented using cards from the latest NVIDIA GPU architecture family, namely the Kepler K20c.
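
To make the distinction between the two NUFFT types concrete, the following minimal sketch evaluates the corresponding non-uniform Fourier sums directly, without any fast algorithm. It is not the authors' CUDA implementation. It assumes the common convention that NED maps samples at non-equispaced points to an equispaced grid of modes, while NER maps equispaced modes to non-equispaced points. It is written in 1D rather than 2D for brevity, and the function names `ndft_ned` and `ndft_ner`, the sign of the exponent and the absence of normalization are all illustrative choices.

```python
import numpy as np


def ndft_ned(x, c, n_modes):
    """Non-Equispaced Data: samples c at arbitrary points x (radians)
    are mapped to n_modes equispaced integer modes k.
    Direct O(M*N) sum; a fast NUFFT approximates this result."""
    k = np.arange(-(n_modes // 2), (n_modes + 1) // 2)  # centered mode indices
    return np.exp(-1j * np.outer(k, x)) @ c


def ndft_ner(x, f):
    """Non-Equispaced Results: coefficients f on equispaced integer modes
    are evaluated at arbitrary points x (radians).
    With the signs chosen here, this is the adjoint of ndft_ned."""
    n_modes = len(f)
    k = np.arange(-(n_modes // 2), (n_modes + 1) // 2)
    return np.exp(1j * np.outer(x, k)) @ f


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 2.0 * np.pi, size=16)   # non-equispaced locations
    c = rng.standard_normal(16) + 1j * rng.standard_normal(16)
    modes = ndft_ned(x, c, n_modes=8)
    values = ndft_ner(x, modes)
    print(modes.shape, values.shape)
```

Direct sums like these cost O(MN) operations, so they are only practical as a reference for checking a fast NUFFT on small problems.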

Dynamic Parallelism in reflectarray antenna analysis and synthesis on GPU / Capozzoli, A., Curcio, C., Liseno, A., Toso, G. - (2013), pp. 1003-1003. (34th Progress In Electromagnetics Research Symposium, Stockholm, Sweden, August 12-15, 2013).