When dealing with binary regression problems, it is common practice to first carry out variable selection and model fitting on a training set, and then to assess the model's predictive performance on a test set. In this setting, the paper proposes a feature selection algorithm for logistic regression that jointly optimizes the explanatory and predictive abilities of the available information set. To this end, a forward search is implemented within the covariate space: at each step it selects the predictor whose inclusion in the model yields the highest significant increase in the Area Under the ROC Curve (AUC) with respect to the previous step. The resulting procedure adheres to a parsimony principle and returns the relative contribution of each regressor to the prediction accuracy of the final model. The proposal is showcased with a study on financial literacy and pension planning, in the wake of the Survey on Household Income and Wealth run by the Bank of Italy in 2020. Indeed, recent literature in behavioral economics and finance highlights that boosting financial literacy among the population is a key strategy to sustain the efficacy of public policies and individual well-being in the face of population ageing and times of economic crisis.
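The forward search described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: the AUC is computed on the held-out test set, the paper's significance check is replaced by a fixed threshold (the hypothetical parameter `min_gain`), the logistic model is fitted by plain gradient ascent, and the data are synthetic. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, steps=300, lr=0.5):
    """Fit intercept and slopes by batch gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def auc(scores, y):
    """Share of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def columns(X, idx):
    return [[row[j] for j in idx] for row in X]

def forward_auc_selection(X_tr, y_tr, X_te, y_te, min_gain=0.01):
    """Greedy forward search: add the covariate with the largest test-AUC gain."""
    selected, gains, best_auc = [], [], 0.5  # 0.5 = AUC of a no-information model
    remaining = list(range(len(X_tr[0])))
    while remaining:
        trial = {}
        for j in remaining:
            idx = selected + [j]
            w = fit_logistic(columns(X_tr, idx), y_tr)
            trial[j] = auc(predict(w, columns(X_te, idx)), y_te)
        j_best = max(trial, key=trial.get)
        gain = trial[j_best] - best_auc
        if gain < min_gain:  # assumed stand-in for a formal significance test
            break
        selected.append(j_best)
        gains.append(gain)
        best_auc = trial[j_best]
        remaining.remove(j_best)
    return selected, gains, best_auc

# Synthetic data (assumption): x0 strong signal, x1 weaker signal, x2 pure noise.
rng = random.Random(0)

def make_data(n):
    X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(n)]
    y = [int(rng.random() < sigmoid(2.5 * x[0] + 0.8 * x[1])) for x in X]
    return X, y

X_tr, y_tr = make_data(300)
X_te, y_te = make_data(300)
selected, gains, final_auc = forward_auc_selection(X_tr, y_tr, X_te, y_te)
print(selected, [round(g, 3) for g in gains], round(final_auc, 3))
```

In this sketch the stepwise gains sum to the final AUC minus 0.5, so each gain can be read as one possible measure of a regressor's contribution to predictive accuracy. Stopping as soon as no candidate clears the threshold is what keeps the selected model parsimonious.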

A feature selection algorithm optimizing fitting and predictive performance of logistic regression: a case study on financial literacy and pension planning / Simone, Rosaria; Coppola, Mariarosaria. - In: ANNALS OF OPERATIONS RESEARCH. - ISSN 0254-5330. - (2025). [10.1007/s10479-025-06970-5]

A feature selection algorithm optimizing fitting and predictive performance of logistic regression: a case study on financial literacy and pension planning

Simone, Rosaria; Coppola, Mariarosaria
2025


Use this identifier to cite or link to this document: https://hdl.handle.net/11588/1020114