Research & Innovation

Mikhail Galkin, Kemele M. Endris, Maribel Acosta, Diego Collarana, Maria Esther Vidal, Sören Auer

Join operators are particularly important in SPARQL query engines that collect RDF data using Web access interfaces. State-of-the-art SPARQL query engines rely on binary join operators tailored for merging results from SPARQL queries over Web access interfaces.
However, in queries with a large number of triple patterns, binary joins constitute a significant burden on the query performance. 
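
To illustrate why binary joins add up in queries with many triple patterns, the following is a minimal sketch, not the operator used by any particular engine. It assumes each triple pattern has already been evaluated into a list of solution mappings (plain dictionaries from variable names to values); the names `compatible`, `binary_join`, and the sample data are hypothetical.

```python
from functools import reduce

def compatible(m1, m2):
    """Two solution mappings are compatible if they agree on all shared variables."""
    return all(m2[var] == val for var, val in m1.items() if var in m2)

def binary_join(left, right):
    """Nested-loop binary join: merge every compatible pair of mappings."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]

# Hypothetical results of three triple patterns (illustrative data only).
p1 = [{"?p": "alice", "?c": "berlin"}, {"?p": "bob", "?c": "paris"}]
p2 = [{"?c": "berlin", "?k": "germany"}, {"?c": "paris", "?k": "france"}]
p3 = [{"?p": "alice", "?j": "engineer"}]

# n triple patterns need n - 1 binary joins, each materializing an
# intermediate result that feeds the next join.
result = reduce(binary_join, [p1, p2, p3])
```

In this sketch the first join produces two intermediate mappings, of which only one survives the second join; with many triple patterns, such intermediate results are what makes chains of binary joins costly.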

Ali Khalili, Klaas Andries de Graaf

Most existing Web user interfaces (UIs) are hard-coded by their developers to address certain predefined types of data, and hence are blind to the semantics of the data they are dealing with. When talking about unstructured data or data without an explicit semantic representation, our expectations of data-awareness are lower. However, when we consider Linked Data UIs, where we have both structured data and semantics, we indeed expect more awareness from the UI that renders the data. In this paper we present an architecture for data-aware UIs, called Linked Data Reactor, implemented based on Web components and Semantic Web technologies. The proposed UIs can understand users' data and are capable of interacting with users accordingly.

Felix Leif Keppmann, Andreas Harth

Currently, we are witnessing the rise of new technology-driven trends such as the Internet of Things, Web of Things, and Factories of the Future that are accompanied by an increasingly heterogeneous landscape of small, embedded, and highly modularized devices and applications, multitudes of manufacturers and developers, and pervasion of things within all areas of life. At the same time, we can observe increasing complexity in the task of integrating subsets of heterogeneous components into applications that fulfil certain needs by providing value-added functionality beyond the pure sum of their components. Enabling integration in these multi-stakeholder scenarios requires new architectural approaches for adapting components, while building on existing technologies and thus ensuring broader acceptance. To this end, we present our approach to adaptation, which introduces adaptable interfaces, interactions, and processing for Linked Data Platform components. In addition, we provide an implementation of our approach that enables the adaptation of components via a thin meta-layer defined on top of the components' domain data and functionality. Finally, we evaluate our implementation by using a benchmark environment and adapting interfaces, interactions, and processing of the involved components at runtime.

Abderrahmane Khiat, Maximilian Mackeprang, Claudia Müller-Birn

Enhancing creativity has received much attention recently, especially with the emergence of online collaborative ideation. Prior work has shown that, in addition to exposure to diverse and creative examples, visualizing the solution space enables ideators to be inspired and thus come up with more creative ideas. However, existing automated approaches that assess the diversity of a set of examples fail on unstructured short text due to their reliance on similarity computation. Furthermore, conceptual divergence cannot be easily captured for such a representation. To overcome these issues, in this paper we introduce an ontology-based approach. The proposed solution formalizes user ideas into ontology-based concepts, and an ontology matching system is then used to compute the similarity between users' ideas. Based on this approach, we also aim to create a visualization of the solution space based on the similarity matrix obtained by the matching process between all ideas.

Auriol Degbelo

Ontologies are key to information retrieval, semantic integration of datasets, and semantic similarity analyses. Evaluating ontologies (especially defining what constitutes a "good" or "better" ontology) is therefore of central importance for the Semantic Web community. Various criteria have been introduced in the literature to evaluate ontologies, and this article classifies them according to their relevance to the design or the implementation phase of ontology development. In addition, the article compiles strategies for ontology evaluation based on ontologies published until 2017 in two outlets: the Semantic Web Journal and the Journal of Web Semantics. Gaps and opportunities for future research on ontology evaluation are exposed towards the end of the paper.

Kolawole John Adebayo, Luigi Di Caro, Guido Boella

We propose a task-independent neural network model based on a Siamese twin architecture. Our model specifically benefits from two forms of attention scheme, which we use to extract high-level feature representations of the underlying texts, both at the word level (intra-attention) and at the sentence level (inter-attention). The inter-attention scheme uses one of the texts to create a contextual interlock with the other text, thus paying attention to mutually important parts. We evaluate our system on three tasks, i.e., Textual Entailment, Paraphrase Detection, and Answer-Sentence Selection. We set a near state-of-the-art result on the textual entailment task with the SNLI corpus while obtaining strong performance across the other tasks that we evaluate our model on.

Despina-Athanasia Pantazi, George Papadakis, Konstantina Bereta, Themis Palpanas, Manolis Koubarakis

Discovering matching entities in different Knowledge Bases constitutes a core task in the Linked Data paradigm. Due to its quadratic time complexity, Entity Resolution (ER) typically scales to large datasets through blocking, which restricts comparisons to similar entities. For Big Linked Data, Meta-blocking is also needed to restructure the blocks in a way that boosts precision while maintaining high recall. Based on blocking and Meta-blocking, the JedAI Toolkit implements an end-to-end ER workflow for both relational and RDF data. However, its bottleneck is the time-consuming procedure of Meta-blocking, which iterates over all comparisons in each block. To accelerate it, we present a suite of parallelization techniques that are suitable for multi-core processors. We present 2 categories of parallelization strategies, each comprising 4 different approaches that are orthogonal to Meta-blocking algorithms. We perform extensive experiments over a real dataset with 3.4 million entities and 13 billion comparisons, demonstrating that our methods can process it within a few minutes, while achieving high speedup.
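
As a rough illustration of how blocking restricts comparisons, and of the kind of per-block iteration that Meta-blocking has to perform, here is a minimal sketch. It is not the JedAI Toolkit's API: the function names and sample entities are hypothetical, token blocking is assumed as the blocking method, and counting shared blocks is assumed as one simple way to weight a comparison.

```python
from collections import Counter, defaultdict
from itertools import combinations

def token_blocking(entities):
    """Place each entity id into one block per token of its textual description."""
    blocks = defaultdict(set)
    for entity_id, text in entities.items():
        for token in text.lower().split():
            blocks[token].add(entity_id)
    # A block with a single entity yields no comparisons, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}

def weighted_comparisons(blocks):
    """Iterate over all comparisons in each block, counting shared blocks per pair."""
    weights = Counter()
    for ids in blocks.values():
        for pair in combinations(sorted(ids), 2):
            weights[pair] += 1
    return weights

# Hypothetical entity descriptions (illustrative data only).
entities = {
    "a1": "Alan Turing London",
    "b1": "A. Turing",
    "a2": "Grace Hopper",
    "b2": "Hopper Grace New York",
}

blocks = token_blocking(entities)
weights = weighted_comparisons(blocks)
```

Here four entities would need six comparisons without blocking, while the blocks leave only two distinct candidate pairs. The loop in `weighted_comparisons` visits every comparison of every block, which is the step the abstract identifies as the bottleneck.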

Roman Prokofyev, Djellel Difallah, Michael Luggen, Philippe Cudre-Mauroux

Webpages are an abundant source of textual information with manually annotated entity links, and are often used as a source of training data for a wide variety of machine learning NLP tasks. However, manual annotations such as those found on Wikipedia are sparse, noisy, and biased towards popular entities. Existing entity linking systems deal with those issues by relying on simple statistics extracted from the data. While such statistics can effectively deal with noisy annotations, they introduce bias towards head entities and are ineffective for long-tail (i.e., unpopular) entities. In this work, we first analyze statistical properties linked to manual annotations by studying a large annotated corpus composed of all English Wikipedia webpages, in addition to all pages from the CommonCrawl containing English Wikipedia annotations. We then propose and evaluate a series of entity linking approaches, with the explicit goal of creating highly accurate (precision > 95%) and broad annotated corpora for machine learning tasks. Our results show that our best approach achieves maximal precision at usable recall levels, and outperforms both state-of-the-art entity linking systems and human annotators.

Kris McGlinn, Christophe Debruyne, Lorraine McNerney, Declan O’Sullivan

Isaiah Onando Mulang', Kuldeep Singh, Fabrizio Orlandi

Research has seen considerable achievements concerning the translation of natural language patterns into formal queries for Question Answering (QA) based on Knowledge Graphs (KG). The main challenge lies in identifying which property within a Knowledge Graph matches the predicate found in a Natural Language (NL) relation. Current approaches for formal query generation attempt to resolve this problem mainly by first retrieving the named entity from the KG together with a list of its predicates, then filtering out one from all the predicates of the entity. We attempt an approach that directly matches an NL predicate to KG properties and that can be employed within QA pipelines. In this paper, we specify a systematic approach and provide a tool that can be employed to solve this task. Our approach models KB relations with their underlying parts of speech; we then enhance this with extra attributes obtained from WordNet and dependency parsing characteristics. From a question, we model a similar representation of query relations. We then define distance measurements between the query relation and the property representations from the KG to identify which property is referred to by the relation within the query. We report substantive recall values and considerable accuracy from our evaluation.
