In this work, we introduce an active learning approach for estimating chemical concentrations from spectroscopic data. Its main objective is to select training samples so as to minimize the regression error while keeping the number of training samples small, thereby reducing the cost of training sample collection. In particular, we propose two active learning strategies developed for regression approaches based on partial least squares regression, ridge regression, kernel ridge regression, and support vector regression. The first strategy uses a pool of regressors and selects the samples on which the regressors of the pool disagree most, while the second adds samples that are distant from the current training samples in the feature space. For support vector regression, a specific strategy based on selecting the samples distant from the support vectors is proposed. Experimental results on three real data sets are reported and discussed.
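
As a rough illustration of the two selection strategies named in the abstract, the following Python sketch scores unlabeled candidate samples either by the disagreement among a pool of regressors or by their distance from the current training set. It is not the authors' implementation. Several details are assumptions made only for this example: the pool is formed from ridge regressors with different penalties (`lams`), disagreement is measured as the variance of the pool's predictions, and distance is the Euclidean distance to the nearest training sample in the input space. The function names are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression with an intercept; returns (weights, bias)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def select_by_disagreement(X_train, y_train, X_pool, n_select,
                           lams=(0.01, 0.1, 1.0, 10.0)):
    """Indices of the candidates on which the pool of regressors disagrees most.

    Assumption: the pool is built by varying the ridge penalty, and
    disagreement is the variance of the predictions across the pool.
    """
    preds = []
    for lam in lams:
        w, b = fit_ridge(X_train, y_train, lam)
        preds.append(X_pool @ w + b)
    disagreement = np.var(np.stack(preds), axis=0)
    return np.argsort(disagreement)[::-1][:n_select]

def select_by_distance(X_train, X_pool, n_select):
    """Indices of the candidates farthest from their nearest training sample.

    Assumption: Euclidean distance in the input space.
    """
    diffs = X_pool[:, None, :] - X_train[None, :, :]
    nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
    return np.argsort(nearest)[::-1][:n_select]
```

In an active learning loop, the selected candidates would be labeled (for example, by a reference chemical analysis), moved from the candidate set to the training set, and the regressor retrained before the next selection round.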

Active learning for spectroscopic data regression / Douak, Fouzi; Melgani, Farid; Alajlan, Naif; Pasolli, Edoardo; Bazi, Yakoub; Benoudjit, Nabil. - In: JOURNAL OF CHEMOMETRICS. - ISSN 0886-9383. - 26:7(2012), pp. 374-383. [10.1002/cem.2443]

Active learning for spectroscopic data regression

Pasolli, Edoardo
2012


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/732804