In engineering design projects, engineers have to enter component data into their design tools. This process is usually manual: although data sheets are mostly provided in digital form (e.g., PDF files), their content is not machine-readable. Furthermore, data sheets follow no standard in terms of design and terminology, which leads to misunderstandings and a lack of data provenance information. Ideally, all component suppliers would offer their data in a machine-readable, standardized format. However, this scenario is unrealistic, at least for the time being. Using computer-aided methods to process data sheets is therefore the next best approach.
Domain ontologies can be used to automate information extraction from data sheets. However, an ontology reflects only the experience and perspective of its creators at the time of its creation. To automate the evolution of ontologies, we developed ConTrOn (Continuously Trained Ontology), a system that automatically enriches them.
Our results show that iterative ontology enrichment can improve information extraction compared with a text-based manual search, especially when domain experts are involved.