This paper addresses the problem of monodimensional (1D) FFT parallel computation on constant-valence multicomputers, i.e. on parallel systems made up of processing elements (PEs) which do not share memory and are connected to a bounded number of neighbours. After a qualitative analysis of several possible partitionings of the DIT FFT algorithm, a decomposition is introduced that has good scalability properties and makes it possible to use sections of sequential code based on the most common 1D-FFT algorithms. If a computing architecture with indirect binary n-cube interconnection network is used, the proposed decomposition guarantees strictly local communications and therefore requires no through-routing support. These characteristics have a positive impact on software development and also on overall performance. Furthermore, thanks to a pipelined organization of the PEs, the resulting architecture has high potentialities for real-time signal processing. As these useful features are obtained at the 'expense' of an uneven workload distribution, computing efficiency is relatively low but does not significantly change in a wide range of the number of processors. An implementation on a Transputer based system is presented along with the performance results obtained. Finally a simple analytical model of the architecture is shown, that allows the values of the main performance parameters to be obtained as a function of the number of processors used and of the elementary response times of the first stage of PEs.

Parallel 1d-fft Computation On Constant-valence Multicomputers

MAZZEO, ANTONINO;VILLANO, UMBERTO
1995

Abstract

This paper addresses the problem of monodimensional (1D) FFT parallel computation on constant-valence multicomputers, i.e. on parallel systems made up of processing elements (PEs) which do not share memory and are connected to a bounded number of neighbours. After a qualitative analysis of several possible partitionings of the DIT FFT algorithm, a decomposition is introduced that has good scalability properties and makes it possible to use sections of sequential code based on the most common 1D-FFT algorithms. If a computing architecture with indirect binary n-cube interconnection network is used, the proposed decomposition guarantees strictly local communications and therefore requires no through-routing support. These characteristics have a positive impact on software development and also on overall performance. Furthermore, thanks to a pipelined organization of the PEs, the resulting architecture has high potentialities for real-time signal processing. As these useful features are obtained at the 'expense' of an uneven workload distribution, computing efficiency is relatively low but does not significantly change in a wide range of the number of processors. An implementation on a Transputer based system is presented along with the performance results obtained. Finally a simple analytical model of the architecture is shown, that allows the values of the main performance parameters to be obtained as a function of the number of processors used and of the elementary response times of the first stage of PEs.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11588/467234
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? 2
social impact