Please use this identifier to cite or link to this item: https://hdl.handle.net/11000/30550
Full metadata record
DC Field | Value | Language
dc.contributor.author | Bosch-Romeu, Raquel | -
dc.contributor.author | Librero, Julian | -
dc.contributor.author | Senent Valero, Marina | -
dc.contributor.author | Sanfeliu-Alonso, Maria Carmen | -
dc.contributor.author | Salinas-Serrano, Jose Maria | -
dc.contributor.author | Fores Martos, Jaume | -
dc.contributor.author | Suay-Garcia, Beatriz | -
dc.contributor.author | Climent, Joan | -
dc.contributor.author | Falco, Antonio | -
dc.contributor.author | Pastor-Valero, Maria | -
dc.contributor.other | Departamentos de la UMH::Salud Pública, Historia de la Ciencia y Ginecología | es_ES
dc.date.accessioned | 2024-01-22T17:40:25Z | -
dc.date.available | 2024-01-22T17:40:25Z | -
dc.date.created | 2023-01 | -
dc.identifier.uri | https://hdl.handle.net/11000/30550 | -
dc.description.abstract | Background: A main difficulty in constructing a classification model arises when some or all of the covariates are categorical variables. Classical methods either assign labels to each output of a categorical variable or rely on summary measures (frequencies and percentages), which can be interpreted as probabilities. Methods: We adopted a novel mathematical procedure to construct a classification model from categorical variables based on a non-classical probability approach. More specifically, we codified the variables following the categorical data representation from Discriminant Correspondence Analysis before constructing a non-classical probability matrix system that represents an entangled system of dependent-independent variables. We then developed a disentanglement procedure to obtain an empirical density function for each representative class (minimum of two classes). Finally, we constructed our classification model using the density functions. Results: We applied the proposed procedure to build a classification model of the malignancy of Solitary Pulmonary Nodule (SPN) after five years of follow-up using routine clinical data. First, we constructed the classification model with 2/3 (270) of the sample of 404 patients with SPN, and then validated it with the remaining 1/3 (134). We tested the procedure’s stability by randomly repeating the analysis 1000 times. We obtained a model accuracy of 0.74, an F1 score of 0.58, a Cohen’s Kappa value of 0.41 and a Matthews Correlation Coefficient of 0.45. Finally, the area under the ROC curve was 0.86. Conclusion: The proposed procedure yields a machine learning classification model of solitary pulmonary nodule malignancy, constructed from routine clinical data composed mainly of categorical variables. Its performance is acceptable, and clinicians could use it as a tool to classify SPN malignancy in routine clinical practice. | es_ES
dc.format | application/pdf | es_ES
dc.format.extent | 27 | es_ES
dc.language.iso | eng | es_ES
dc.rights | info:eu-repo/semantics/openAccess | es_ES
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | *
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | *
dc.subject | Classification methods | es_ES
dc.subject | non classical probabilities | es_ES
dc.subject | solitary pulmonary nodule | es_ES
dc.subject.other | CDU::6 - Ciencias aplicadas::61 - Medicina | es_ES
dc.title | A novel approach to learning through categorical variables applicable to the classification of solitary pulmonary nodule malignancy | es_ES
dc.type | info:eu-repo/semantics/article | es_ES
dc.relation.publisherversion | https://doi.org/10.21203/rs.3.rs-2502360/v1 | es_ES
Appears in Collections:
Artículos Salud Pública, Historia de la Ciencia y Ginecología


View/Open:
 18-v1_covered_fb50faaf-0e35-4f7c-8796-f5461b050739 (1).pdf

416,63 kB
Adobe PDF