Please use this identifier to cite or link to this item: https://hdl.handle.net/11000/38779

An automated process for the repository-based analysis of ontology structural metrics


Title:
An automated process for the repository-based analysis of ontology structural metrics
Authors:
Bernabé-Díaz, José Antonio
Franco-Nicolás, Manuel
Vivo-Molina, Juana María
Quesada-Martínez, Manuel
Duque-Ramos, Astrid
Fernández-Breis, Jesualdo Tomás
Editor:
IEEE
Department:
Departamentos de la UMH::Estadística, Matemáticas e Informática
Issue Date:
2020-08
URI:
https://hdl.handle.net/11000/38779
Abstract:
Quantitative metrics are generally applied by scientists to measure and assess the properties of data and knowledge resources. In ontology engineering, a number of metrics have been developed in the last few years to analyse different features of ontologies. However, this community has not produced a standard framework for studying the properties of ontologies, nor sufficient knowledge about the usefulness and validity of these metrics as measurement instruments for evaluating and comparing ontologies. Recently, 19 ontology structural metrics were studied using the OBO Foundry and AgroPortal ontology repositories. That study was based on how each metric partitioned the two datasets into five groups by applying the k-means algorithm. The results suggested that using five clusters for every metric might be suboptimal. In this paper, we propose an automated process for the study of ontology structural metrics that includes the selection of an optimal number of clusters for each metric. This optimal number is obtained automatically from statistical properties of the generated clusters. Moreover, the cosine similarity is used to estimate the similarity of two repositories from the perspective of the behaviour of the same set of metrics. The results on the two datasets allow for a more realistic perspective on the behaviour of the metrics. We show and discuss the differences observed in the comparative behaviour of the metrics on the two repositories when using the optimal number of clusters rather than a predetermined number for every metric. The proposed method is not specific to ontology metrics and can therefore be applied to other types of metrics.
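The two ideas named in the abstract, choosing a number of clusters per metric and comparing repositories with the cosine similarity, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which statistical properties drive the choice of k, so the mean silhouette width is swapped in here as a stand-in criterion. The function names, the quantile-based initialisation, and the candidate range of 2 to 6 clusters are all assumptions made for illustration.

```python
import math
from statistics import mean


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's k-means on scalar values; returns one cluster label per value."""
    distinct = sorted(set(values))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct values")
    # Assumption: deterministic start from evenly spaced quantiles.
    centers = [distinct[int((i + 0.5) * len(distinct) / k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous center.
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(values, labels):
    """Mean silhouette width; stand-in criterion, not taken from the paper."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(abs(v - w) for w in own) / (len(own) - 1)
        b = min(mean(abs(v - w) for w in other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def best_k(values, k_range=range(2, 7)):
    """Pick the k with the highest mean silhouette width for one metric."""
    candidates = [k for k in k_range if k <= len(set(values))]
    # max() raises ValueError if there are fewer than two distinct values.
    return max(candidates, key=lambda k: silhouette(values, kmeans_1d(values, k)))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

In use, `best_k` would be called once per metric on that metric's values across a repository. How the per-repository vectors passed to `cosine_similarity` are built is not stated in the abstract, so any particular choice (for example, one entry per metric) remains an assumption.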
Keywords/Subjects:
Knowledge-based systems
Knowledge engineering
Clustering methods
Biomedical informatics
Biomedical ontologies
Quality metrics
Type of document:
info:eu-repo/semantics/article
Access rights:
info:eu-repo/semantics/openAccess
Attribution-NonCommercial-NoDerivatives 4.0 International
DOI:
10.1109/ACCESS.2020.3015789
Published in:
IEEE Access, vol. 8 (2020)
Appears in Collections:
Artículos - Estadística, Matemáticas e Informática


