Deep sequence to sequence semantic embedding with attention for entity linking in the context of incomplete linked data
Abstract
Linked Data has become a prominent approach for publishing data on the Web. This data is typically represented as RDF (Resource Description Framework) triples, whose interconnections enhance the relevance of search results for users. Despite these advantages, Linked Data suffers from several limitations, including erroneous data, imprecise information, and missing links between resources. Existing solutions to the missing-link problem in RDF triples often overlook the semantic relationship between the subject and the object. To address this gap, we present LinkED-S2S (Linking Entities deeply with Sequence To Sequence model), a novel approach that employs an encoder-decoder model with an attention mechanism. The model incorporates an embedding layer to enrich the data representation and uses GRU (Gated Recurrent Unit) cells to mitigate the vanishing gradient problem. We evaluated the model with a comprehensive set of metrics on the widely used DBpedia dataset and on standard benchmark datasets, where it achieved strong results and a significant improvement in predicting missing links over baseline models, highlighting its effectiveness.
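To make the described architecture concrete, the following is a minimal PyTorch sketch of an encoder-decoder model with an embedding layer, GRU cells, and additive (Bahdanau-style) attention, as outlined in the abstract. All class names, dimensions, and hyperparameters (vocab_size, embed_dim, hidden_dim) are illustrative assumptions; this is not the authors' reference implementation of LinkED-S2S.

```python
# Illustrative sketch of an embedding + GRU encoder-decoder with additive
# attention, the general architecture named in the abstract. Hyperparameters
# and wiring are assumptions, not the paper's reference code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids for the input sequence
        embedded = self.embedding(src)        # (batch, src_len, embed_dim)
        outputs, hidden = self.gru(embedded)  # outputs: (batch, src_len, hidden_dim)
        return outputs, hidden


class Attention(nn.Module):
    """Additive (Bahdanau) attention over the encoder outputs."""
    def __init__(self, hidden_dim):
        super().__init__()
        self.W = nn.Linear(hidden_dim * 2, hidden_dim)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (batch, hidden_dim); encoder_outputs: (batch, src_len, hidden_dim)
        src_len = encoder_outputs.size(1)
        query = decoder_hidden.unsqueeze(1).expand(-1, src_len, -1)
        energy = self.v(torch.tanh(self.W(torch.cat([query, encoder_outputs], dim=2))))
        weights = F.softmax(energy.squeeze(2), dim=1)                 # (batch, src_len)
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs)   # (batch, 1, hidden_dim)
        return context, weights


class Decoder(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.attention = Attention(hidden_dim)
        self.gru = nn.GRU(embed_dim + hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, token, hidden, encoder_outputs):
        # token: (batch, 1) previous target token; hidden: (1, batch, hidden_dim)
        embedded = self.embedding(token)  # (batch, 1, embed_dim)
        context, weights = self.attention(hidden[-1], encoder_outputs)
        rnn_input = torch.cat([embedded, context], dim=2)
        output, hidden = self.gru(rnn_input, hidden)
        logits = self.out(output.squeeze(1))  # (batch, vocab_size)
        return logits, hidden, weights


# Toy usage: encode a 5-token source batch and predict one target token.
encoder = Encoder(vocab_size=1000, embed_dim=64, hidden_dim=128)
decoder = Decoder(vocab_size=1000, embed_dim=64, hidden_dim=128)
src = torch.randint(0, 1000, (2, 5))
enc_outputs, hidden = encoder(src)
start = torch.zeros(2, 1, dtype=torch.long)
logits, hidden, attn = decoder(start, hidden, enc_outputs)
```

In such a design, the attention weights let each decoding step focus on the most relevant encoder positions, while the GRU gating helps mitigate the vanishing gradient problem noted in the abstract.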