Staff View: Schema Matching for Large-Scale Data Based on Ontology Clustering Method

Schema Matching for Large-Scale Data Based on Ontology Clustering Method

Main Authors:	Alani, Harith Oraibi; Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia 43600 Bangi, Selangor Darul Ehsan, Malaysia, Saad, Saidah; Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia 43600 Bangi, Selangor Darul Ehsan, Malaysia
Other Authors:	Universiti Kebangsaan Malaysia
Format:	Article info application/pdf eJournal
Language:	eng
Published:	International Journal on Advanced Science, Engineering and Information Technology, 2017
Subjects:	automatic schema matching; large-scale data; ontology; clustering; web interfaces
Online Access:	http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/2133 http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/2133/pdf_546

ctrlnum	article-2133
fullrecord	<?xml version="1.0"?> <dc schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd"><title lang="en-US">Schema Matching for Large-Scale Data Based on Ontology Clustering Method</title><creator>Alani, Harith Oraibi; Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia 43600 Bangi, Selangor Darul Ehsan, Malaysia</creator><creator>Saad, Saidah; Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia 43600 Bangi, Selangor Darul Ehsan, Malaysia</creator><subject lang="en-US">automatic schema matching; large-scale data; ontology; clustering; web interfaces</subject><description lang="en-US">Holistic schema matching is the process of identifying semantic correspondences among multiple schemas at once. The key challenge behind holistic schema matching lies in selecting an appropriate method that has the ability to maintain effectiveness and efficiency. Effectiveness refers to the quality of matching while efficiency refers to the time and memory consumed within the matching process. Several approaches have been proposed for holistic schema matching. These approaches were mainly dependent on clustering techniques. In fact, clustering aims to group the similar fields within the schemas in multiple groups or clusters. However, fields on schemas contain much complicated semantic relations due to schema level. Ontology which is a hierarchy of taxonomies, has the ability to identify semantic correspondences with various levels. Hence, this study aims to propose an ontology-based clustering approach for holistic schema matching. Two datasets have been used from ICQ query interfaces consisting of 40 interfaces, which refer to Airfare and Job. 
The ontology used in this study has been built using the XBenchMatch which is a benchmark lexicon that contains rich semantic correspondences for the field of schema matching. In order to accommodate the schema matching using the ontology, a rule-based clustering approach is used with multiple distance measures including Dice, Cosine and Jaccard. The evaluation has been conducted using the common information retrieval metrics; precision, recall and f-measure. In order to assess the performance of the proposed ontology-based clustering, a comparison among two experiments has been performed. The first experiment aims to conduct the ontology-based clustering approach (i.e. using ontology and rule-based clustering), while the second experiment aims to conduct the traditional clustering approaches without the use of ontology. Results show that the proposed ontology-based clustering approach has outperformed the traditional clustering approaches without ontology by achieving an f-measure of 94% for Airfare and 92% for Job datasets. 
This emphasizes the strength of ontology in terms of identifying correspondences with semantic level variation.</description><publisher lang="en-US">International Journal on Advanced Science, Engineering and Information Technology</publisher><contributor lang="en-US">Universiti Kebangsaan Malaysia</contributor><date>2017-10-30</date><type>Journal:Article</type><type>Other:info:eu-repo/semantics/publishedVersion</type><type>Other:</type><type>File:application/pdf</type><identifier>http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/2133</identifier><identifier>10.18517/ijaseit.7.5.2133</identifier><source lang="en-US">International Journal on Advanced Science, Engineering and Information Technology; Vol 7, No 5 (2017); 1790-1797</source><source>2460-6952</source><source>2088-5334</source><language>eng</language><relation>http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/2133/pdf_546</relation><rights lang="en-US">Authors who publish with this journal agree to the following terms:Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).</rights><recordID>article-2133</recordID></dc>
language	eng
format	Journal:Article Journal Other:info:eu-repo/semantics/publishedVersion Other Other: File:application/pdf File Journal:eJournal
author	Alani, Harith Oraibi; Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia 43600 Bangi, Selangor Darul Ehsan, Malaysia Saad, Saidah; Center for Artificial Intelligence Technology (CAIT), Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia 43600 Bangi, Selangor Darul Ehsan, Malaysia
author2	Universiti Kebangsaan Malaysia
title	Schema Matching for Large-Scale Data Based on Ontology Clustering Method
publisher	International Journal on Advanced Science, Engineering and Information Technology
publishDate	2017
topic	automatic schema matching; large-scale data; ontology; clustering; web interfaces
url	http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/2133 http://insightsociety.org/ojaseit/index.php/ijaseit/article/view/2133/pdf_546
contents	Holistic schema matching is the process of identifying semantic correspondences among multiple schemas at once. The key challenge lies in selecting a method that maintains both effectiveness and efficiency, where effectiveness refers to the quality of the matching and efficiency refers to the time and memory consumed by the matching process. Several approaches have been proposed for holistic schema matching, and most of them depend on clustering techniques, which group similar fields from the schemas into clusters. However, fields in schemas contain highly complicated semantic relations due to schema level. An ontology, which is a hierarchy of taxonomies, can identify semantic correspondences at various levels. Hence, this study proposes an ontology-based clustering approach for holistic schema matching. Two datasets from the ICQ query interfaces, consisting of 40 interfaces, have been used: Airfare and Job. The ontology used in this study has been built using XBenchMatch, a benchmark lexicon that contains rich semantic correspondences for the field of schema matching. To perform schema matching with the ontology, a rule-based clustering approach is used with multiple distance measures, including Dice, Cosine and Jaccard. The evaluation has been conducted using the common information retrieval metrics: precision, recall and f-measure. To assess the performance of the proposed ontology-based clustering, two experiments have been compared. The first experiment applies the ontology-based clustering approach (i.e., ontology with rule-based clustering), while the second applies the traditional clustering approaches without an ontology. Results show that the proposed ontology-based clustering approach outperforms the traditional clustering approaches without ontology, achieving an f-measure of 94% for the Airfare dataset and 92% for the Job dataset. This emphasizes the strength of ontology in identifying correspondences with semantic level variation.
id	IOS1116.article-2133
institution	Indonesian Society for Knowledge and Human Development
institution_id	204
institution_type	library:special library
library	Indonesian Society for Knowledge and Human Development
library_id	78
collection	International Journal on Advanced Science, Engineering and Information Technology
repository_id	1116
subject_area	Computer Programs and Information Technology
city	-
province	DKI JAKARTA
repoId	IOS1116
first_indexed	2017-10-19T19:39:21Z
last_indexed	2017-11-09T19:33:21Z
recordtype	dc
_version_	1722528162884091904
score	17.6175
