Please use this identifier to cite or link to this item: https://hdl.handle.net/11000/38617

An adaptation of Random Forest to estimate convex non-parametric production technologies: an empirical illustration of efficiency measurement in education



Title:
An adaptation of Random Forest to estimate convex non-parametric production technologies: an empirical illustration of efficiency measurement in education
Authors:
España Roch, Victor Javier
Aparicio, Juan
Barber i Vallés, Josep Xavier
Publisher:
Wiley
Department:
UMH Departments::Statistics, Mathematics and Computer Science
Issue Date:
2025
URI:
https://hdl.handle.net/11000/38617
Abstract:
This paper presents a novel approach to the non-parametric estimation of production technologies that satisfy the basic axioms of production theory, including free disposability of inputs and outputs and convexity. The methodology adapts the machine learning techniques associated with Random Forest and makes use of splines. The new method yields a piecewise linear estimator analogous to data envelopment analysis (DEA); however, it addresses DEA's overfitting and lack of robustness by randomizing the data and the input variables used to construct the models. The paper underscores the virtues of employing machine learning techniques to assess the efficiency of public services, particularly educational institutions. The new approach can predict outputs from inputs, even for units not included in the observed sample. Furthermore, it enables the identification of the inputs most relevant to output production. To demonstrate the advantages of the method, an educational production function is estimated for Spanish regions using data from the Programme for International Student Assessment (PISA).
Keywords/Subjects:
data envelopment analysis
machine learning
random forest
prediction
importance of variables
Knowledge area:
UDC: Social sciences: Demography. Sociology. Statistics: Statistics
UDC: Pure and natural sciences: Mathematics: Analysis
Type of document:
info:eu-repo/semantics/article
Access rights:
info:eu-repo/semantics/closedAccess
Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
DOI:
https://doi.org/10.1111/itor.13561
Published in:
International Transactions in Operational Research
Appears in Collections:
Articles - Statistics, Mathematics and Computer Science


