This tutorial gives an overview of WordNet senses and the NL-Soar WordNet interface.

Previous versions of NL-Soar had a very limited vocabulary: every word the system handled had to be hand-coded as productions. The hand-entered data varied considerably in quality and detail, and coverage of standard English vocabulary was minimal.

This release includes WordNet, a large-scale lexical database developed by the Cognitive Science Laboratory at Princeton University. More information on WordNet can be found at this site.

Using WordNet gives NL-Soar substantial advantages (broad coverage, systematic data) but also introduces significant problems (scaling up of processing, indeterminacy, complexity). This tutorial surveys some of these advantages and disadvantages and describes how NL-Soar uses WordNet.

Each time a lexical access operator fires, the system consults WordNet to retrieve the information associated with the word being accessed.


WordNet must be installed on the user's machine. It is freely available from this website, which also provides installation instructions.

Generally, the WordNet distribution files are used during processing. However, for efficiency reasons, NL-Soar looks into a custom-made index called wnet.offsets1 that contains positions in the WordNet files for data associated with each word's part of speech and word sense. During lexical access this information is accessed, and then NL-Soar

All of the information related to a word's WordNet data is stored in the top-state ^sentence attribute for that word, under an attribute called ^wnetdata.
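The offset-index idea described above can be illustrated with a minimal Python sketch. It is not the NL-Soar implementation: the actual layout of the custom index is not documented here, so the index keys, the record contents, and the function names (build_store, fetch, lexical_access) are all hypothetical. The sketch only shows why a table of byte positions lets a lookup seek directly to a record instead of scanning the data file, and how the results might be gathered under a wnetdata entry for the word.

```python
import io

# Placeholder records: (lemma, part of speech, sense number, record text).
# The record text is invented for illustration; it is not real WordNet data.
RECORDS = [
    ("run", "n", 1, b"run n 1 | placeholder gloss A\n"),
    ("run", "v", 1, b"run v 1 | placeholder gloss B\n"),
    ("axe", "n", 1, b"axe n 1 | placeholder gloss C\n"),
]

def build_store(records):
    """Write records to an in-memory 'data file' and remember where each starts."""
    data_file = io.BytesIO()
    index = {}
    for lemma, pos, sense, payload in records:
        index[(lemma, pos, sense)] = data_file.tell()  # byte offset of this record
        data_file.write(payload)
    data_file.seek(0)
    return data_file, index

def fetch(data_file, index, lemma, pos, sense):
    """Jump straight to the stored offset and read exactly one record."""
    data_file.seek(index[(lemma, pos, sense)])
    return data_file.readline().decode("ascii").rstrip("\n")

def lexical_access(data_file, index, word):
    """Collect every record for a word under a hypothetical 'wnetdata' entry."""
    wnetdata = {}
    for (lemma, pos, sense) in sorted(index):
        if lemma == word:
            wnetdata.setdefault(pos, {})[sense] = fetch(data_file, index, lemma, pos, sense)
    return {"word": word, "wnetdata": wnetdata}

DATA, INDEX = build_store(RECORDS)
```

Calling lexical_access(DATA, INDEX, "run") returns a nested dictionary keyed by part of speech and sense number, which loosely mirrors how per-word data hangs off a single attribute.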

Most of the WordNet interfacing is performed via the TSI using several commands written in C, particularly where access time is crucial. Other commands are written in Tcl. The C and Tcl commands are generally called as right-hand-side functions in NL-Soar productions. A considerable number of productions also ensure that the WordNet-derived information is integrated with more traditional NL-Soar processing. Note that, in the WordNet-based versions of NL-Soar, most of the words from the previous version's lexicon have been removed, since WordNet supplies that information.

First, the word is morphologically reduced to its base form(s), or lemma(s). For example, the word "runs" reduces to the noun "run" and the verb "run"; the word "axes" reduces to the nouns "ax", "axe", and "axis" and to the verbs "ax" and "axe". WordNet's Morphy component supplies this information to NL-Soar.
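The reduction step can be sketched in Python as suffix detachment plus an exception list, with each candidate checked against a lexicon. This is a simplified illustration rather than Morphy itself: the suffix rules follow the general pattern documented for Morphy, while the tiny lexicon, the exception table, and the function name base_forms are assumptions made so the example is self-contained.

```python
# Suffix detachment rules: (ending to strip, replacement).
RULES = {
    "noun": [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
             ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")],
    "verb": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""),
             ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
}

# Toy stand-ins for the WordNet lexicon and exception lists (assumed contents).
TOY_LEXICON = {
    "noun": {"run", "ax", "axe", "axis"},
    "verb": {"run", "ax", "axe"},
}
TOY_EXCEPTIONS = {
    "noun": {"axes": ["axis"]},  # irregular plural not reachable by suffix rules
    "verb": {},
}

def base_forms(word, pos):
    """Return the sorted base forms of `word` for one part of speech."""
    candidates = [word]
    candidates.extend(TOY_EXCEPTIONS[pos].get(word, []))
    for suffix, replacement in RULES[pos]:
        if word.endswith(suffix) and len(word) > len(suffix):
            candidates.append(word[: -len(suffix)] + replacement)
    # Keep only candidates that are actual entries in the lexicon.
    return sorted({c for c in candidates if c in TOY_LEXICON[pos]})

if __name__ == "__main__":
    for pos in ("noun", "verb"):
        print("runs", pos, base_forms("runs", pos))
        print("axes", pos, base_forms("axes", pos))
```

With this toy data, "axes" yields three noun base forms and two verb base forms, matching the example in the text; a real lookup would consult the full WordNet index and exception files instead.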

Next, word senses are accessed for each base form. These senses are derived from the lexicographer files, which are described in the WordNet documentation. They are displayed in the semantics grapher window.
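As a rough illustration of how a sense can be tied to a lexicographer file, the Python sketch below reads the second field of a data-file record, which in the WordNet database format is a two-digit lexicographer file number, and maps it to a file name. Only a handful of file numbers are listed, the sample records are synthetic (their offsets and glosses are invented), and the function name sense_class is hypothetical.

```python
# A small subset of the lexicographer file numbers and names.
LEXNAMES = {
    4: "noun.act",
    11: "noun.event",
    38: "verb.motion",
    41: "verb.social",
}

def sense_class(data_line):
    """Map one data-file record to the name of its lexicographer file."""
    fields = data_line.split()
    lex_filenum = int(fields[1])  # second field: lexicographer file number
    return LEXNAMES.get(lex_filenum, "unknown")

# Synthetic records in the general data-file layout (not real WordNet entries).
SAMPLE_SENSES = [
    "00000001 38 v 01 run 0 000 | placeholder gloss",
    "00000002 04 n 01 run 0 000 | placeholder gloss",
    "00000003 99 n 01 run 0 000 | placeholder gloss",
]

if __name__ == "__main__":
    for line in SAMPLE_SENSES:
        print(sense_class(line))
```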

Next, verb frames are collected for each verb base form. This information is parsed from the WordNet data files. Frame information helps decide which constituents should be joined to one another.
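The frame-collection step can be sketched in Python as follows, assuming the record layout described in the WordNet database documentation: after the word list and the pointer list, a verb record carries a frame count followed by "+ f_num w_num" triples. This is a minimal sketch, not the NL-Soar parser (which the text says is implemented as C and Tcl commands); the function name parse_verb_frames is hypothetical and the sample record is synthetic, with invented offsets.

```python
def parse_verb_frames(data_line):
    """Return (words, frames) for one data-file record.

    frames is a list of (frame_number, word_number) pairs; a word number
    of 0 means the frame applies to every word in the synset.
    """
    data, _, _gloss = data_line.partition("|")
    fields = data.split()

    w_cnt = int(fields[3], 16)                      # word count is hexadecimal
    words = [fields[4 + 2 * i] for i in range(w_cnt)]

    i = 4 + 2 * w_cnt                               # skip word / lex_id pairs
    p_cnt = int(fields[i])                          # pointer count is decimal
    i += 1 + 4 * p_cnt                              # each pointer has 4 fields

    frames = []
    if fields[2] == "v" and i < len(fields):
        f_cnt = int(fields[i])
        i += 1
        for _ in range(f_cnt):
            # fields[i] is the literal "+" separator
            f_num = int(fields[i + 1])
            w_num = int(fields[i + 2], 16)
            frames.append((f_num, w_num))
            i += 3
    return words, frames

# Synthetic verb record: one word, two pointers, two frames.
SAMPLE_VERB = ("01926311 38 v 01 run 7 002 @ 01835496 v 0000 "
               "~ 01927447 v 0000 02 + 02 00 + 22 00 | placeholder gloss")

if __name__ == "__main__":
    print(parse_verb_frames(SAMPLE_VERB))
```

The frame numbers returned here index into the list of generic sentence frames in the WordNet documentation; turning them into attachment decisions is left to the surrounding system.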

Finally, WordNet-derived semantic classes are used for some kinds of semantic attachment decisions.