The paper “Frame-based Ontology Population with PIKES” by Francesco Corcoglioniti, Marco Rospocher, and Alessio Palmero Aprosio has been published in the IEEE Transactions on Knowledge and Data Engineering journal.
We present an approach for ontology population from English natural language texts that extracts RDF triples according to FrameBase, a Semantic Web ontology derived from FrameNet. Processing is decoupled into two independently tunable phases. First, the text is processed by several NLP tasks, including Semantic Role Labeling (SRL), whose results are integrated in an RDF graph of mentions, i.e., snippets of text denoting some entity or fact. Then, the mention graph is processed with SPARQL-like rules using a specifically created mapping resource from NomBank/PropBank/FrameNet annotations to FrameBase concepts, producing a knowledge graph whose content is linked to DBpedia and organized around semantic frames, i.e., prototypical descriptions of events and situations. A single RDF/OWL representation is used, in which each triple is related to the mentions and tools it comes from. We implemented the approach in PIKES, an open source tool that combines two complementary SRL systems and provides a working online demo. We evaluated PIKES on a manually annotated gold standard, assessing precision and recall in (i) populating the FrameBase ontology and (ii) extracting semantic frames modeled after standard predicate models, for comparison with state-of-the-art tools for the Semantic Web. We also evaluated (iii) sampled precision and execution times on a large corpus of 110K Wikipedia-like pages.
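To make the two-phase idea concrete, here is a minimal Python sketch of the second phase only. It is not PIKES code and does not use its rules or mapping resource: the mention records, the `ex:` identifiers, the `dbpedia:Alice` link, and the `FRAME_MAP` / `ROLE_MAP` tables are all hypothetical stand-ins, assumed here purely for illustration. The sketch takes a tiny "mention graph" (as might result from SRL on the sentence "Alice bought a car"), maps a PropBank-style roleset and its roles to frame-style concepts, and keeps with each triple the mention it was derived from.

```python
# Phase 1 output (assumed): mentions produced by NLP tools, as plain dicts.
mentions = [
    {"id": "m1", "text": "bought", "type": "predicate", "roleset": "buy.01"},
    {"id": "m2", "text": "Alice", "type": "entity", "link": "dbpedia:Alice"},
    {"id": "m3", "text": "a car", "type": "entity", "link": None},
]

# SRL argument edges: (predicate mention, PropBank-style role, argument mention).
arguments = [("m1", "A0", "m2"), ("m1", "A1", "m3")]

# Hypothetical mapping tables (placeholders, not the actual mapping resource).
FRAME_MAP = {"buy.01": "ex:Commerce_buy"}
ROLE_MAP = {("buy.01", "A0"): "ex:Buyer", ("buy.01", "A1"): "ex:Goods"}


def populate(mentions, arguments):
    """Phase 2 sketch: turn mapped predicate mentions into frame instances.

    Returns 4-tuples (subject, predicate, object, source_mention) so that
    every triple stays traceable to the mention it comes from.
    """
    by_id = {m["id"]: m for m in mentions}
    triples = []
    for m in mentions:
        if m["type"] != "predicate":
            continue
        frame = FRAME_MAP.get(m["roleset"])
        if frame is None:
            continue  # no mapping known for this roleset: skip it
        event = f"ex:event_{m['id']}"
        triples.append((event, "rdf:type", frame, m["id"]))
        for pred_id, role, arg_id in arguments:
            if pred_id != m["id"]:
                continue
            prop = ROLE_MAP.get((m["roleset"], role))
            if prop is None:
                continue  # unmapped role: skip it
            # Prefer the linked entity; otherwise mint a local identifier.
            target = by_id[arg_id].get("link") or f"ex:entity_{arg_id}"
            triples.append((event, prop, target, arg_id))
    return triples


if __name__ == "__main__":
    for triple in populate(mentions, arguments):
        print(triple)
```

Because the mapping tables are separate from the mention data, either side can be changed without touching the other, which loosely mirrors the decoupling of the two phases described above.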