An Extensible Toolkit for Computational Semantics (FP)

Dan Garrette and Ewan Klein

Eighth International Conference on Computational Semantics (IWCS-8 2009)
Tilburg University, Netherlands, January 7-9, 2009


Summary

The Natural Language Toolkit (NLTK) is an open-source collection of code and corpora that can be used to perform a wide range of natural language processing tasks. NLTK is developing rapidly, and the last two years have seen a large increase in the amount of semantics-related functionality. NLTK, which is written entirely in the Python programming language, now contains modules for first-order logic, Discourse Representation Theory, typed logic, Hole Semantics, Glue Semantics, theorem proving, model building, model checking, discourse processing, and recognizing textual entailment. Providing all of these tools, along with lexical and syntactic capabilities, within the unified framework of NLTK makes natural language processing easier to teach and makes applications and prototypes easier to build.

This paper will explore the semantics offerings of NLTK, giving concrete examples from each of the various modules. It will also explain the differences between NLTK's offerings and those of other frameworks, such as the one provided by Blackburn and Bos. In addition, the paper will describe how the NLTK framework has been designed with extensibility in mind so as to encourage students and researchers alike to expand the code to suit their needs, fostering learning and experimentation.
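
As a rough illustration of the first-order logic and model-checking functionality mentioned above, the following sketch parses a formula and evaluates a few sentences in a toy model. It assumes the present-day `nltk.sem` interface (`Expression.fromstring`, `Valuation`, `Model`, `Assignment`), which may differ from the release described in the paper. The domain, the constant `fido`, and the predicates `dog`, `cat`, and `animal` are invented for the example.

```python
# Illustrative sketch only: uses the current nltk.sem API, which may differ
# from the 2009 release; the toy domain and predicate names are assumptions.
from nltk.sem import Assignment, Expression, Model, Valuation

# A small valuation: one individual constant and three unary predicates.
val = Valuation([
    ("fido", "d1"),
    ("dog", {"d1"}),
    ("cat", {"c1"}),
    ("animal", {"d1", "c1"}),
])
dom = val.domain          # set of all entities mentioned in the valuation
model = Model(dom, val)   # first-order model over that domain
g = Assignment(dom)       # empty variable assignment

# Parse a formula into an expression object and check that it is closed.
formula = Expression.fromstring("all x.(dog(x) -> animal(x))")
is_closed = formula.free() == set()

# Model checking: evaluate sentences against the toy model.
sentences = [
    "dog(fido)",
    "all x.(dog(x) -> animal(x))",
    "all x.(animal(x) -> dog(x))",
    "exists x.(cat(x) & dog(x))",
]
results = {s: model.evaluate(s, g) for s in sentences}

for s, truth in results.items():
    print(f"{s}: {truth}")
```

Because the model is an ordinary Python object built from a list of pairs, extending the example with further individuals or relations only requires adding entries to the valuation.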
