Information Extraction with Intelligence Augmentation

March 21, 2019 by Stefan Summesberger

SEMANTiCS 2019 Workshops & Tutorials Chair Anna Lisa Gentile is a researcher in the Intelligence Augmentation group at IBM Research Almaden, USA. The main areas of her research are Information Extraction (IE), Natural Language Processing (NLP) and the Semantic Web, where she focuses principally on methods and techniques for semantic annotation of unstructured and semi-structured content. In this interview, Anna Lisa shares some insights into these areas and a preview of what to expect from this year’s workshops, and offers those who are still hesitating a promising incentive to submit a proposal for a workshop or tutorial.

You are a researcher in the Intelligence Augmentation group at IBM Research Almaden, USA. This sounds pretty exciting! What is going on there?

The Intelligence Augmentation group at IBM Research-Almaden helps organizations put their data to work. We research and develop methods to derive accurate insights from textual datasets, designing systems that work within the realities and constraints of proprietary, enterprise data.

Our main strategy is to develop human-in-the-loop techniques to quickly extract structured information from text, where the user, i.e. the subject matter expert, can easily define the conceptual model of the data to extract.

Tell me more about the concept of Augmented Intelligence. Why is it important, and what are the most promising applications you see coming up in the near future?

By “Intelligence Augmentation” we mean the ability to expedite the understanding of a new domain. As we primarily work with unstructured text, we use the expression “new domain” to identify a previously unseen coherent text corpus. Our goal is to enable users to gain insights into a domain quickly and efficiently, with an on-demand approach: the more the user is willing to explore, the more insights they get. Nonetheless, useful results are returned immediately and can be used for further understanding. The pillar of all our proposed techniques is the partnership between human and machine, where the machine can “augment” intelligence. The machine is a brainstorming peer: its suggestions might not be great all the time, but they always trigger “good discussion,” exposing peculiarities of the underlying domain that might not be noticed by the human eye at first.

Where do you see Augmented Intelligence in the business context?

In a business context, our experience is that the best way to achieve results is to use lightweight semantics: we let the users define their goals and needs by example, i.e. by providing examples of the type of information they want to extract.

Our pathway to success is training the extraction models under time and budget constraints: the user can interact with the system until she is satisfied with the results, while in the background we construct a representation of the domain.
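To make the "by example" loop concrete, here is a minimal sketch of human-in-the-loop dictionary expansion. It is not the group's actual system, which the interview does not describe. It rests on several assumptions: candidate terms are ranked by cosine similarity of simple co-occurrence counts, the corpus is a small hypothetical list of tokenized sentences, and the `review` callback stands in for the subject matter expert. All function names are illustrative.

```python
from collections import Counter

# Hypothetical toy corpus, already tokenized (an assumption for illustration).
SENTENCES = [
    ["patient", "took", "aspirin", "daily"],
    ["patient", "took", "ibuprofen", "daily"],
    ["patient", "took", "paracetamol", "daily"],
    ["dog", "chased", "ball", "fast"],
]

def context_profile(term, sentences, window=2):
    """Count the words that occur within `window` tokens of `term`."""
    profile = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), i + window + 1
                profile.update(
                    t for j, t in enumerate(tokens[lo:hi], lo) if j != i
                )
    return profile

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def suggest(accepted, rejected, sentences, top_k=2):
    """Rank unseen terms by how closely their contexts match the accepted terms."""
    seed_profile = Counter()
    for term in accepted:
        seed_profile.update(context_profile(term, sentences))
    vocabulary = {tok for tokens in sentences for tok in tokens}
    scored = []
    for cand in vocabulary - accepted - rejected:
        score = cosine(seed_profile, context_profile(cand, sentences))
        if score > 0:
            scored.append((score, cand))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cand for _, cand in scored[:top_k]]

def expand_dictionary(seeds, sentences, review, rounds=3, top_k=2):
    """Alternate machine suggestions and human decisions for a few rounds."""
    accepted, rejected = set(seeds), set()
    for _ in range(rounds):
        candidates = suggest(accepted, rejected, sentences, top_k)
        if not candidates:
            break
        for cand in candidates:
            # `review` is the human in the loop: True keeps the term.
            (accepted if review(cand) else rejected).add(cand)
    return accepted

def scripted_review(term):
    """Stand-in for the expert; a real system would ask a person."""
    return term in {"ibuprofen", "paracetamol"}

print(sorted(expand_dictionary({"aspirin"}, SENTENCES, scripted_review)))
# prints ['aspirin', 'ibuprofen', 'paracetamol']
```

Starting from the single example "aspirin", the first round proposes the two other drug names because they share its context. Later rounds propose weaker matches such as "took", which the reviewer rejects. The `rounds` parameter plays the role of the time and budget limit: the loop can stop at any point and still return a usable dictionary.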

What are the most exciting research projects that are currently on your radar? Which research results do you see being implemented in future applications?

Currently we are applying our human-in-the-loop methodology to various Information Extraction (IE) tasks, including building dictionaries from text corpora in multiple languages and performing entity recognition on medical text, e.g. extracting mentions of adverse drug reactions from user-generated content and identifying from the text which drug is causing which adverse drug reaction.

At SEMANTiCS 2019 you will be chairing the Workshops and Tutorials Track. What are your expectations in this respect? Any occurrences/experiences worth sharing yet? What do you especially like about this year’s submissions?

From the current submissions the outlook is already extremely promising, as they address interesting applications and use cases of knowledge engineering, including transportation and patent search, as well as classical methodological topics that semantic technologies are suited to address, such as data integration and Natural Language Processing (NLP). Moreover, participants will have the opportunity to become a certified “Semantic Web & Knowledge Engineering Specialist” with one of the offered tutorials (which will be announced soon).

Any last words to those who haven’t submitted yet, researchers facing constraints during the work on their projects, or those who still have a lot on their to-do lists before the calls close on April 8?

I would encourage researchers and practitioners to bring their own challenges to SEMANTiCS. Organizing a workshop at SEMANTiCS will give them the opportunity to create a forum with researchers and practitioners in semantic technologies and AI.


The annual SEMANTiCS conference is the meeting place for professionals who make semantic computing work, understand its benefits, and know its limitations. Every year, SEMANTiCS attracts information managers, IT architects, software engineers, and researchers from organisations ranging from NPOs, universities, and public administrations to the largest companies in the world.