Please use this identifier to cite or link to this item:
https://hdl.handle.net/11000/38617
An adaptation of Random Forest to estimate convex non-parametric production technologies: an empirical illustration of efficiency measurement in education
This resource is restricted
Title: An adaptation of Random Forest to estimate convex non-parametric production technologies: an empirical illustration of efficiency measurement in education |
Authors: España Roch, Victor Javier; Aparicio, Juan; Barber i Vallés, Josep Xavier |
Editor: Wiley |
Department: UMH Departments::Statistics, Mathematics and Computer Science |
Issue Date: 2025 |
URI: https://hdl.handle.net/11000/38617 |
Abstract:
This paper presents a novel approach to the non-parametric estimation of production technologies that satisfy the basic axioms of production theory, including free disposability of inputs and outputs and convexity. The methodology adapts the machine learning techniques associated with Random Forest and makes use of splines. The new method features a piecewise linear estimator analogous to data envelopment analysis (DEA); however, it addresses DEA's overfitting and lack of robustness by randomizing the data and input variables used to construct the models. The paper underscores the virtues of employing machine learning techniques to assess the efficiency of public services, particularly educational institutions. The new approach can predict outputs from inputs, even for units not included in the observed sample. Furthermore, it enables the identification of the most relevant inputs for output production. To demonstrate the advantages of the method, the educational production function is estimated for Spanish regions using data from the Program for International Student Assessment.
|
Keywords/Subjects: data envelopment analysis; machine learning; random forest; prediction; importance of variables |
Knowledge area: UDC: Social sciences: Demography. Sociology. Statistics: Statistics; UDC: Pure and natural sciences: Mathematics: Analysis |
Type of document: info:eu-repo/semantics/article |
Access rights: info:eu-repo/semantics/closedAccess; Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI: https://doi.org/10.1111/itor.13561 |
Published in: International Transactions in Operational Research |
Appears in Collections: Articles - Statistics, Mathematics and Computer Science
|