Our work relies on a multilingual corpus that we automatically annotate on several levels (tagging, lemmatization, dependency parsing) and align at the sentence and word level (see Figure sample_sentence). We calculate intralingual and interlingual association measures as described here, which helps us identify phrasemes. Phrasemes are characterized by being more than the sum of their lexical units. Exploiting this property, we can derive lists of support verb constructions, such as the ones shown in Table svc_leko, by combining intralingual and interlingual association measures.
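To give an idea of what an intralingual association measure computes, the following minimal sketch scores verb-noun pairs with pointwise mutual information (PMI). PMI is used here only as one common example of an association measure; it is an assumption for illustration and not necessarily one of the measures used in our work. The toy pairs and the names `pmi` and `score` are likewise hypothetical.

```python
import math
from collections import Counter


def pmi(pair_count, left_count, right_count, total):
    """Pointwise mutual information (in bits) of a co-occurring pair."""
    if pair_count == 0:
        return float("-inf")  # the pair was never observed together
    p_pair = pair_count / total
    p_left = left_count / total
    p_right = right_count / total
    return math.log2(p_pair / (p_left * p_right))


# Toy verb-object pairs, standing in for pairs extracted from dependency parses.
pairs = [
    ("take", "decision"), ("take", "decision"), ("take", "decision"),
    ("take", "book"),
    ("read", "book"), ("read", "book"), ("read", "paper"),
    ("make", "decision"),
]

pair_counts = Counter(pairs)
verb_counts = Counter(verb for verb, _ in pairs)
noun_counts = Counter(noun for _, noun in pairs)
total = len(pairs)


def score(verb, noun):
    """PMI of a verb-noun pair, estimated from the toy counts above."""
    return pmi(pair_counts[(verb, noun)], verb_counts[verb],
               noun_counts[noun], total)


# Rank the observed pairs from most to least strongly associated.
for verb, noun in sorted(pair_counts, key=lambda p: score(*p), reverse=True):
    print(f"{verb} {noun}: {score(verb, noun):.3f}")
```

In this toy data, "take decision" receives a positive score and "take book" a negative one, which is the kind of contrast an association measure is meant to expose. An interlingual measure would apply a comparable statistic to word-aligned token pairs across languages.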
Here, we provide a web interface to explore the properties of different association measures: https://pub.cl.uzh.ch/projects/sparcling/visual_association_measures/
Furthermore, we publish the visualisation library we implemented for illustrating annotated and aligned sentences such as the one shown in Figure sample_sentence: https://gitlab.cl.uzh.ch/sparcling/SentStructure.js
A feature demonstration of SentStructure.js is available at https://pub.cl.uzh.ch/projects/sparcling/SentStructureDemo/.
<figure sample_sentence> <caption>Sample sentence with tagging, dependency parsing and word alignment. The highlighted tokens belong to a support verb construction candidate in English and the respective aligned tokens in French.</caption> </figure>
<table svc_leko> <caption>Support verb constructions identified by employing intralingual and interlingual association measures (source).</caption> </table>