Please use this identifier to cite or link to this item: https://hdl.handle.net/11000/35611

A new methodology based on Gradient Boosting for the estimation of best-practice frontiers


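Title:
A new methodology based on Gradient Boosting for the estimation of best-practice frontiers (original title: Una nueva metodología basada en Gradient Boosting para la estimación de fronteras de mejores prácticas)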
Author:
Guillén García, María Dolores
Supervisor:
Aparicio, Juan
Publisher:
Universidad Miguel Hernández de Elche
Department:
UMH Departments::Statistics, Mathematics and Computer Science
Publication date:
2024
URI:
https://hdl.handle.net/11000/35611
Abstract:
In econometrics and production engineering, a topic of interest is the evaluation of the technical efficiency of firms based on an estimate of the best-practice frontier, which delimits the production possibility set, or technology. By definition, a technology must satisfy a set of microeconomic postulates, and a valid estimator of a technology should satisfy the same axioms. Among non-parametric approaches, Data Envelopment Analysis (DEA) and Free Disposal Hull (FDH) stand out. Both methodologies are deterministic and satisfy the minimal extrapolation principle. As a result, they are sensitive to random and systematic measurement errors (noise) and tend to overfit the sample used to build the estimator, which limits their usefulness for inference beyond the sample. Recent literature has explored machine learning techniques to improve the estimation of production frontiers. However, boosting, a machine learning methodology based on the sequential combination of multiple weak models to improve the final prediction, has not been explored for this purpose.
This thesis develops a new methodology based on the Gradient Tree Boosting algorithm for the estimation of production frontiers. The thesis is a compendium of three published articles, gathered in Appendices A, B and C. In the first, the original algorithm is adapted so that the resulting estimator satisfies the axioms of monotonicity and free disposability (compulsory for production frontier estimators), leading to the EATBoosting algorithm. The second shows how to calculate different measures of technical efficiency using the technology generated by the new estimator. From a computational point of view, however, the new approach involves thousands of decision variables, which makes the resulting models difficult to solve; to address this issue, a heuristic approximation to the exact efficiency measures is also proposed. Finally, to facilitate the use of the new methodology by other researchers and practitioners, an R library called BoostingDEA has been developed, which includes the main functionalities of DEA, FDH, and EATBoosting.
The main advantage of the new approach lies in its ability to tackle overfitting. Unlike traditional techniques, the methodology does not systematically underestimate the true inefficiency of the Decision Making Units (DMUs), so it functions as an inferential tool rather than a merely descriptive one. This gives it greater discriminatory power and a more precise identification of inefficiencies: in the simulated scenarios it outperforms FDH in both mean squared error and bias. The approach also offers a potential remedy for the curse of dimensionality, which arises when the ratio of the number of DMUs to the number of variables is low; applying EATBoosting in these cases allows for a more robust and precise efficiency analysis.
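As background for the FDH baseline mentioned in the abstract, the following is a minimal Python sketch of the standard single-output FDH estimator and the corresponding output-oriented efficiency score. It is not taken from the thesis or from the BoostingDEA library; the function names (`fdh_frontier`, `fdh_output_efficiency`) and the toy data are illustrative assumptions. It also hints at why FDH can overfit: the frontier passes exactly through the best observed units, so noise in those observations becomes part of the estimate.

```python
import numpy as np


def fdh_frontier(X, y, x0):
    """FDH estimate of the maximum output attainable with input vector x0.

    Under free disposability, x0 can produce at least as much as any observed
    unit that uses no more of every input, so the estimate is the largest
    observed output among those units.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    usable = np.all(X <= x0, axis=1)  # units whose inputs are all <= x0
    if not usable.any():
        return float("nan")  # x0 lies outside the estimated technology
    return float(y[usable].max())


def fdh_output_efficiency(X, y):
    """Output-oriented score per unit: frontier output / observed output.

    A score of 1.0 means the unit is on the FDH frontier; larger values
    indicate the proportional output expansion needed to reach it.
    """
    return np.array([fdh_frontier(X, y, x) / yi for x, yi in zip(X, y)])


# Hypothetical toy sample: one input, one output, four DMUs.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 5.0, 4.0, 6.0])
scores = fdh_output_efficiency(X, y)
print(scores)
```

In this toy sample only the third unit is rated inefficient, because the second unit produces more output with less input; every other unit is declared fully efficient simply because no observed unit dominates it.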
Keywords/Subjects:
Artificial Intelligence
Statistical Computing
Economic Statistics
Knowledge area:
UDC: Pure and natural sciences: Mathematics
Document type:
info:eu-repo/semantics/doctoralThesis
Access rights:
info:eu-repo/semantics/openAccess
Attribution-NonCommercial-NoDerivatives 4.0 International
Appears in collections:
Doctoral theses - Sciences and Engineering



Creative Commons license: Attribution-NonCommercial-NoDerivatives 4.0 International.