Abstract:
This study develops a workflow that couples headspace solid-phase microextraction gas chromatography–mass spectrometry (HS-SPME-GC-MS) with
supervised machine learning to authenticate the botanical origin of honeys from five
floral sources: coriander, orange blossom, astragalus, rosemary, and chehelgiah. While
HS-SPME-GC-MS combined with traditional chemometrics (e.g., PCA, LDA, OPLS-DA) is
well-established for honey discrimination, the application and direct comparison of Random
Forest (RF), eXtreme Gradient Boosting (XGBoost), and Neural Network (NN) models
represent a significant advancement in multiclass prediction accuracy and model robustness.
A total of 57 honey samples were analyzed to generate detailed volatile organic compound
(VOC) profiles. Key chemotaxonomic markers were identified: anethole in coriander and
chehelgiah, thymoquinone in astragalus, p-menth-8-en-1-ol in orange blossom, and dill ester
(3,6-dimethyl-2,3,3a,4,5,7a-hexahydrobenzofuran) in rosemary. Principal component analysis (PCA) revealed clear separation across botanical classes, with PC1 and PC2 explaining 49.8% and 22.6% of the variance, respectively. Three
classification models (RF, XGBoost, and NN) were trained on standardized data with stratified sampling.
The NN model achieved the highest accuracy (90.32%), followed by XGBoost (86.69%) and
RF (83.47%); it also gave the best per-class F1-scores and near-perfect specificity (>0.95). Confusion
matrices confirmed minimal misclassification, particularly for the NN model. This work
establishes HS-SPME-GC-MS coupled with neural network classification as a rapid, sensitive, and reliable
tool for multiclass honey botanical authentication, offering strong potential for real-time
quality control, fraud detection, and premium market certification.
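
For readers less familiar with the reported metrics, the sketch below shows how overall accuracy, per-class F1-score, and per-class specificity can be derived from a multiclass confusion matrix using a one-vs-rest reading of each class. The confusion matrix is invented purely for illustration and is not the study's data; the function names `accuracy` and `per_class_metrics` are likewise hypothetical and do not reflect the authors' implementation.

```python
from typing import Dict, List

# Class labels taken from the abstract; the counts below are made up.
labels = ["coriander", "orange blossom", "astragalus", "rosemary", "chehelgiah"]

# cm[i][j] = number of samples whose true class is i and predicted class is j.
# Hypothetical values, NOT results from the study.
cm = [
    [9, 0, 0, 0, 1],
    [0, 10, 0, 0, 0],
    [0, 0, 8, 2, 0],
    [0, 1, 0, 9, 0],
    [1, 0, 0, 0, 9],
]


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_metrics(
    matrix: List[List[int]], names: List[str]
) -> Dict[str, Dict[str, float]]:
    """One-vs-rest precision, recall, F1, and specificity for each class."""
    total = sum(sum(row) for row in matrix)
    results: Dict[str, Dict[str, float]] = {}
    for k, name in enumerate(names):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                   # true class k, predicted otherwise
        fp = sum(row[k] for row in matrix) - tp    # predicted k, true class differs
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        results[name] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "specificity": specificity,
        }
    return results


metrics = per_class_metrics(cm, labels)
print(f"accuracy = {accuracy(cm):.4f}")
for name, m in metrics.items():
    print(f"{name:15s} F1 = {m['f1']:.3f}  specificity = {m['specificity']:.3f}")
```

In a five-class problem the true-negative count for any one class is large, so specificity tends to sit close to 1 even when a few samples are misclassified. Per-class F1 is therefore usually the more discriminating figure to read alongside it.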