QueryVis - Workshop on Innovative Corpus Query and Visualization Tools

at Nodalida 2015, Vilnius (Lithuania), May 11, 2015

Recent years have seen an increased interest in and availability of many different kinds of corpora. These range from small but carefully annotated treebanks to large parallel corpora and very large monolingual corpora for big data research. It remains a challenge to query the multilayer annotations of small corpora, to access large corpora efficiently, and to visualize the query results.

When dealing with large corpora, query tools need to scale both in processing speed and in how they report results, through statistical summaries and visualization options. This becomes evident, for example, with very large corpora (such as complete Wikipedia corpora) or multi-parallel corpora (such as Europarl or JRC-Acquis). The goal of the workshop is to gather researchers who develop or evaluate new corpus query and visualization tools for linguistics, language technology or related disciplines.

QueryVis Workshop Program

The proceedings have been published online at Linköping University Electronic Press.


13:30h to 13:45h	Martin Volk	Introduction to the Workshop
	Session 1 (Chair: Andrius Utka)
13:45h to 14:15h	Lucia Kocincová, Vít Baisa, Miloš Jakubíček and Vojtěch Kovář	Interactive Visualizations of Corpus Data in Sketch Engine
14:15h to 14:45h	Michał Kosek, Anders Nøklestad, Joel Priestley, Kristin Hagen and Janne Bondi Johannessen	Visualisation in speech corpora: maps and waves in the Glossa system
30 min	Coffee break
	Session 2 (Chair: Simon Clematide)
15:15h to 16:00h	Marc Kupietz (Institut für Deutsche Sprache, Mannheim)	Invited Talk: Scaling out corpus technology: the open source query and analysis engine KorAP
16:00h to 16:30h	Joachim Bingel and Nils Diewald	KoralQuery - A General Corpus Query Protocol
15 min	Break
	Session 3 (Chair: Johannes Graën)
16:45h to 17:15h	Ruprecht von Waldenfels	ParaViz: A vizualization tool for crosslinguistic functional comparisons based on a parallel corpus
17:15h to 17:45h	Simon Clematide	Reflections and Proposals for a Query and Reporting Language for Richly Annotated Multiparallel Corpora
17:45h to 18:00h	Gintarė Grigonytė	Closing Session

Invited Talk

Scaling out corpus technology: the open source query and analysis engine KorAP

Marc Kupietz, Institut für Deutsche Sprache, Mannheim

Abstract: With the growing importance of empiricism and a rapidly growing amount of research data, progress in linguistic research nowadays requires increasingly sophisticated and methodologically sound technical infrastructure, far beyond what typical university computing centres or typical research projects can deliver. Unfortunately, the funding conditions in linguistics are still not as well adapted to this circumstance as in more established data-intensive research fields, and even large-scale e-infrastructure initiatives like CLARIN have provided a solid basis of standards and best practices, but nothing coming close to a sufficiently general tool for corpus-based research. The talk will introduce KorAP, an open-source corpus analysis platform, mainly developed at the Institut für Deutsche Sprache. It will sketch KorAP's background, how it deals with current and upcoming scientific and technological challenges, how it tries to achieve long-term sustainability despite the aforementioned constraints, and how it tries to contribute to progress in linguistic research.

Workshop Topics

Querying corpora with multiple levels of annotation
Querying parallel and multi-parallel corpora
Visualization of annotation and alignment
Visualization of query results over very large corpora
Querying by example
Querying multimodal corpora

Workshop organizers and Program Committee

Janne Bondi Johannessen (Oslo University)
Noah Bubenhofer (University of Zurich)
Simon Clematide (University of Zurich)
Johannes Graën (University of Zurich)
Gintarė Grigonytė (Stockholm University)
Miloš Jakubíček (Sketch Engine)
Andrius Utka (Vytautas Magnus University, Kaunas)
Martin Volk (University of Zurich)
Robert Östling (Stockholm University)

Contact

Gintarė Grigonytė (Stockholm University)

queryvis_nodalida@ifi.uzh.ch