Numerical Data Clustering Ontology Approach
Abstract
Clustering algorithms group objects described by a set of numerical properties so that the objects within a group are more similar to one another than to the objects in other groups. All clustering algorithms share common parameters, and the choice of these parameters determines how effective the clustering is. The most important of them are the metric, the number of clusters, and the cluster validity criteria. Classical clustering algorithms ignore semantic knowledge, which makes the clustering results difficult to interpret. The use of ontologies is currently developing very rapidly; an ontology provides an explicit model for structuring concepts together with their interrelationships, which makes it possible to gain knowledge about a particular data model. Building on previously obtained results of a clustering study, the author attempts to create an ontology-based concept from numerical data using similarity measures, the number of clusters, cluster validity, and other characteristic features. The scientific novelty lies in combining the approaches of classical data analysis with an ontological approach to structuring the data, which increases the efficiency of its use in engineering practice.
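As an illustration of how the three parameters named above interact, the following is a minimal sketch rather than the author's method. It assumes plain k-means as the clustering algorithm, Euclidean distance as the metric, and the mean silhouette coefficient as the validity criterion; the function names, the deterministic initialization, and the toy data are all hypothetical choices made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance, assumed here as the metric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, metric=euclidean, iterations=50):
    """Plain k-means with a deterministic start (evenly spaced input points).

    The mean-based update step matches the Euclidean metric; another metric
    would normally call for a different center update.
    """
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each object to its nearest center.
        labels = [min(range(k), key=lambda c: metric(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def silhouette(points, labels, metric=euclidean):
    """Mean silhouette coefficient, assumed here as the validity criterion."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            others = [q for j, q in enumerate(points)
                      if labels[j] == c and j != i]
            if others:
                mean_dist[c] = sum(metric(p, q) for q in others) / len(others)
        own = labels[i]
        if own not in mean_dist:  # singleton cluster: score taken as 0
            scores.append(0.0)
            continue
        a = mean_dist[own]                                     # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def best_k(points, candidates=(2, 3, 4)):
    """Pick the number of clusters with the highest validity score."""
    results = {}
    for k in candidates:
        labels, _ = kmeans(points, k)
        results[k] = silhouette(points, labels)
    return max(results, key=results.get), results


# Hypothetical toy data: two well-separated groups of objects.
data = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
        (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
k, scores = best_k(data)
```

On this toy data the validity criterion favors two clusters. The resulting labels, the chosen metric, and the validity score are the kind of numerical characteristics that an ontology could then describe as concepts and relations.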
MENDEL open access articles are normally published under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/. Under the CC BY-NC-SA 4.0 license, third-party reuse is permitted only for non-commercial purposes. Articles published under this license may be shared, copied, and redistributed in any medium or format, and adapted, remixed, transformed, and built upon for non-commercial purposes. Reuse requires appropriate attribution to the source of the material, a link to the license, and an indication of any changes made to the original material.