University of Zurich | Institute of Computational Linguistics | Contact for questions and resources: Simon Clematide
Many lexical entries are taken from Pledari Grond (available under a CC 4.0 license from Lia Rumantscha).
Input must be tokenized: one sentence per line, with tokens separated by spaces.
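As a rough illustration of this input format, the following Python sketch turns already-tokenized sentences into the expected text. The function name `to_input_lines` and the sample sentences are made up for this example and are not part of the tool.

```python
def to_input_lines(sentences):
    """Join each tokenized sentence into one space-separated line.

    Hypothetical helper for illustration only; not part of the analyzer.
    """
    lines = []
    for tokens in sentences:
        # Drop empty tokens so no double spaces appear in the output.
        cleaned = [t for t in tokens if t]
        for t in cleaned:
            # Whitespace is the token separator, so a token must not contain any.
            if any(ch.isspace() for ch in t):
                raise ValueError(f"token contains whitespace: {t!r}")
        lines.append(" ".join(cleaned))
    # One sentence per line.
    return "\n".join(lines)


# Illustrative sample sentences (assumed, not taken from the tool's data).
sentences = [["Jau", "sun", "qua", "."], ["Bun", "di", "!"]]
text = to_input_lines(sentences)
print(text)
```

Note that punctuation is a separate token here, so it is set off by a space like any other token.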
Note: We use a conditional random field (CRF) sequence classifier (Wapiti) to select the most probable analysis of each word in the context of its sentence. Unknown words receive the analysis that seems most probable in context. Because our model was derived from only 4,500 words, the tag it considers most probable can be wrong.
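To give a feeling for what "most probable analysis in context" means, here is a toy Python sketch of Viterbi decoding restricted to each word's candidate tags. It is not Wapiti's API and not our actual model: the tag names, the scores, and the function `best_sequence` are all invented for illustration, whereas a real CRF learns its weights from annotated data.

```python
def best_sequence(candidates, emission, transition):
    """Viterbi search restricted to each token's candidate tags.

    candidates: list of tag lists, one per token
    emission:   list of {tag: score} dicts, one per token (toy values)
    transition: {(previous_tag, tag): score} (toy values)
    """
    # best[tag] = (score, path) of the best path ending in `tag`.
    best = {t: (emission[0].get(t, 0.0), [t]) for t in candidates[0]}
    for i in range(1, len(candidates)):
        new_best = {}
        for tag in candidates[i]:
            # Pick the best predecessor for this tag.
            score, path = max(
                (prev_score + transition.get((prev_tag, tag), 0.0), prev_path)
                for prev_tag, (prev_score, prev_path) in best.items()
            )
            new_best[tag] = (score + emission[i].get(tag, 0.0), path + [tag])
        best = new_best
    # Return the tag path with the highest total score.
    return max(best.values())[1]


# Hypothetical example: "la" is ambiguous, "chasa" is not.
candidates = [["DET", "PRON"], ["NOUN"]]
emission = [{"DET": 0.5, "PRON": 0.6}, {"NOUN": 1.0}]
transition = {("DET", "NOUN"): 2.0, ("PRON", "NOUN"): -1.0}
tags = best_sequence(candidates, emission, transition)
print(tags)
```

In this toy setup the word-level scores alone slightly favor `PRON` for the first token, but the transition score to the following `NOUN` makes `DET` the better choice for the sentence as a whole.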
A detailed description of the morphological tags and the implementation of the morphological analysis can be found in the documentation.