KD: Keyphrase Digger
Keyphrase Digger (KD) is a rule-based system for keyphrase extraction. It is a Java re-implementation of the KX tool (Pianta and Tonelli, 2010) with a new architecture and new features. KD combines statistical measures with linguistic information, in the form of PoS patterns, to identify and extract weighted keyphrases from texts.
• Extraction of multi-words
• Multilinguality (EN, IT, and DE)
• Easily extensible to other languages
• Higher customizability than KX
• High processing speed
• Clustering of keyphrases under the same lemma
• Various accepted formats and PoS tagsets: Stanford PoS Tagger (EN), TreeTagger (IT and EN), TextPro (IT and EN)
• Boost of specific PoS patterns
• Integration of Apache Lucene Library
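The general idea described above (PoS patterns select candidate phrases, a statistical measure weights them, specific patterns can be boosted, and variants sharing a lemma are grouped) can be illustrated with a minimal sketch. This is not KD's actual code or API: the class `KeyphraseSketch`, the `Token` record, the `extract` method, the coarse tags (`NOUN`, `ADJ`, `VERB`) and the boost values are all hypothetical, and raw frequency stands in for whatever statistical measures KD really uses. It assumes Java 16 or later for records.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these names come from the KD code base.
final class KeyphraseSketch {

    /** A token reduced to its lemma and a coarse PoS tag (hypothetical tagset). */
    record Token(String lemma, String pos) {}

    /**
     * Scores every n-gram (up to maxLen tokens) whose PoS sequence matches a
     * pattern in patternBoost. Phrases are keyed by their lemmas, so inflected
     * variants fall under the same entry. Score = occurrences * pattern boost.
     */
    static Map<String, Double> extract(List<Token> tokens,
                                       Map<String, Double> patternBoost,
                                       int maxLen) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int start = 0; start < tokens.size(); start++) {
            StringBuilder pattern = new StringBuilder();
            StringBuilder phrase = new StringBuilder();
            for (int len = 1; len <= maxLen && start + len <= tokens.size(); len++) {
                Token t = tokens.get(start + len - 1);
                if (len > 1) {
                    pattern.append(' ');
                    phrase.append(' ');
                }
                pattern.append(t.pos());
                phrase.append(t.lemma());
                Double boost = patternBoost.get(pattern.toString());
                if (boost != null) {
                    // Each matching occurrence adds the pattern's boost.
                    scores.merge(phrase.toString(), boost, Double::sum);
                }
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Lemmatised, PoS-tagged input for "Keyphrase extraction is useful keyphrase extraction".
        List<Token> tokens = List.of(
            new Token("keyphrase", "NOUN"), new Token("extraction", "NOUN"),
            new Token("be", "VERB"), new Token("useful", "ADJ"),
            new Token("keyphrase", "NOUN"), new Token("extraction", "NOUN"));
        // Hypothetical pattern weights: multi-word noun phrases are boosted.
        Map<String, Double> boosts =
            Map.of("NOUN", 1.0, "NOUN NOUN", 2.0, "ADJ NOUN", 1.5);
        extract(tokens, boosts, 3)
            .forEach((phrase, score) -> System.out.println(phrase + "\t" + score));
    }
}
```

In this sketch, boosting the `NOUN NOUN` pattern makes the multi-word phrase "keyphrase extraction" outrank its single-word parts even though each occurs equally often, which is the kind of effect a pattern boost is meant to produce.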
Moretti, G., Sprugnoli, R., Tonelli, S. "Digging in the Dirt: Extracting Keyphrases from Texts with KD". In Proceedings of the Second Italian Conference on Computational Linguistics (CLiC-it 2015), Trento, Italy.
DOWNLOAD KD SOFTWARE PACKAGE.
[Current release v1.2: adds German support, a new function for adding a new language, and bug fixes.]
TRY THE ONLINE DEMO.