University of Zurich | Institute of Computational Linguistics | Contact for questions and resources: Simon Clematide

Many lexical entries are taken from Pledari Grond (available under the CC 4.0 license from Lia Rumantscha).

Morphological Analysis of Individual Words

Words that are unknown to the morphological analyzer are passed to an unknown word guesser, and the resulting analyses are marked with the tag +UNKNOWN. Words that the guesser cannot handle either get the string "+?" as their analysis.
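The two markers described above can be told apart mechanically when post-processing results. The following is a minimal sketch in Python, assuming each analysis is available as a plain string; the function name classify_analysis and the sample strings in the comments are hypothetical and do not reflect the analyzer's actual tag set or output layout.

```python
def classify_analysis(analysis: str) -> str:
    """Classify one analysis string as 'failed', 'guessed', or 'known'.

    Assumption: the caller passes only the analysis string itself,
    not the whole output line of the analyzer.
    """
    text = analysis.strip()
    if text == "+?":
        # Neither the analyzer nor the guesser produced an analysis.
        return "failed"
    if "+UNKNOWN" in text:
        # The analysis comes from the unknown word guesser.
        return "guessed"
    # Otherwise the word was known to the morphological analyzer.
    return "known"


# Hypothetical sample strings, for illustration only:
#   classify_analysis("+?")              -> "failed"
#   classify_analysis("xyz+N+UNKNOWN")   -> "guessed"
#   classify_analysis("xyz+N")           -> "known"
```

Checking for "+?" first keeps a failed lookup from being counted as a guessed one.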

Input Format: one word per line
Output

Morphological Analysis of Sentences

Expects tokenized input: one sentence per line, with all tokens separated by a single space.
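The input format above can be produced from already tokenized sentences with a few lines of Python. This is only a sketch: the helper names to_analyzer_line and to_analyzer_input are hypothetical, and the tokenization itself is assumed to have happened beforehand.

```python
from typing import Iterable, List


def to_analyzer_line(tokens: Iterable[str]) -> str:
    """Join the tokens of one sentence into a single space-separated line."""
    cleaned: List[str] = [t.strip() for t in tokens if t.strip()]
    for token in cleaned:
        # A token containing whitespace would be read as several tokens.
        if any(ch.isspace() for ch in token):
            raise ValueError(f"token contains whitespace: {token!r}")
    return " ".join(cleaned)


def to_analyzer_input(sentences: Iterable[Iterable[str]]) -> str:
    """Format several tokenized sentences, one sentence per line."""
    return "\n".join(to_analyzer_line(sentence) for sentence in sentences)


if __name__ == "__main__":
    # Punctuation is a token of its own, so it is separated by a space too.
    print(to_analyzer_input([["Il", "chaun", "dorma", "."]]))
```

Rejecting tokens with internal whitespace is a design choice of this sketch; it avoids silently changing the token count of a sentence.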

Input
Output

Generator

Input
Output

Tokenization

Input
Output

References

Note: We use a conditional random field sequence classifier (Wapiti) to select the most probable analysis of each word in the context of its sentence. Unknown words receive the analysis that is most probable in that context. Because our model is derived from only 4,500 words, the tag it considers most probable can be wrong.

A detailed description of the morphological tags and the implementation of the morphological analysis can be found in the documentation.