
Autoencoders as an alternative approach to Principal Component Analysis for dimensionality reduction. An application on simulated data from psychometric models

Casella M.; Dolce P.; Ponticorvo M.; Marocco D.
2021

Abstract

Dimensionality reduction is defined as the search for a low-dimensional space that captures the “essence” of the original high-dimensional data. Principal Component Analysis (PCA) is one of the most widely used dimensionality reduction techniques in psychology and the behavioral sciences for data analysis and measure development. However, PCA can only capture linear correlations between variables and fails when this linearity assumption is violated. In recent years, a variety of nonlinear dimensionality reduction techniques have been proposed in other research fields to overcome this limitation. In this paper, we focus on the nonlinear autoencoder, a multi-layer perceptron with as many inputs as outputs and a smaller number of hidden nodes. We investigate the relation between the intrinsic dimensionality of the data and the autoencoder's internal nodes in a simulation study, comparing the performance of autoencoders and PCA in terms of reconstruction error. The evidence from this study suggests that the dimensionality reduction ability of autoencoders is very similar to that of PCA, and that there is a relation between the number of internal nodes and the dimensionality of the data.
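The comparison described in the abstract can be illustrated with a minimal sketch: simulated data with a known low intrinsic dimensionality are reconstructed both by PCA (keeping as many components as latent factors) and by an autoencoder whose bottleneck has the same number of hidden nodes, and the two reconstruction errors are compared. The data-generating settings, network size, and training hyperparameters below are illustrative assumptions, not the paper's actual simulation design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 200 observations of 10 variables driven by 3 latent factors
# (loadings and noise level are illustrative assumptions).
n, p, k = 200, 10, 3
latent = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, p))
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize, as is usual before PCA

# --- PCA reconstruction with k components (via SVD) ---
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_pca = (X @ Vt[:k].T) @ Vt[:k]           # project onto top-k components and back
pca_mse = np.mean((X - X_pca) ** 2)

# --- Autoencoder with a k-node bottleneck, trained by gradient descent ---
# tanh encoder, linear decoder; plain NumPy to keep the sketch self-contained.
W1 = 0.1 * rng.normal(size=(p, k)); b1 = np.zeros(k)
W2 = 0.1 * rng.normal(size=(k, p)); b2 = np.zeros(p)
lr = 0.01
for _ in range(5000):
    H = np.tanh(X @ W1 + b1)              # encoder: compress to k hidden nodes
    X_hat = H @ W2 + b2                   # decoder: reconstruct the p inputs
    err = X_hat - X
    # backpropagation of the mean-squared reconstruction error
    gW2 = H.T @ err / n; gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1 - H ** 2)      # tanh'(z) = 1 - tanh(z)^2
    gW1 = X.T @ dH / n; gb1 = dH.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

ae_mse = np.mean((X - (np.tanh(X @ W1 + b1) @ W2 + b2)) ** 2)
print(f"PCA reconstruction MSE: {pca_mse:.4f}")
print(f"AE  reconstruction MSE: {ae_mse:.4f}")
```

On such data both methods should reach a similarly low reconstruction error once the bottleneck matches the intrinsic dimensionality (here, 3), which is the kind of correspondence the simulation study examines.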
Files for this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/898616
Citations
  • PMC: not available
  • Scopus: 0
  • Web of Science: not available