Computational Linguistics for COVID-19


The Biomedical Text Mining group at the Institute of Computational Linguistics of the University of Zurich has been active for many years in the area of automatic analysis of biomedical text, including scientific literature, clinical reports, and social media.

In response to the COVID-19 pandemic, we are carrying out text mining activities that may support research on this deadly disease, which is having a major impact on our society.

In particular we focus on the following areas of research:

  • Literature Based Discovery

    Our goal is to automatically process COVID-19-related scientific publications in order to detect mentions of domain-specific entities of particular relevance (such as genes, symptoms, drugs, and organs). The primary purpose of this work is to make the literature more accessible, for example by simplifying the search for papers dealing with a particular gene, or by identifying unexpected connections between different entities.

    We process and make available two datasets:

    We also provide access via API to the pipeline that we use to automatically annotate the articles.

  • Social Media Mining

    A second line of research involves the analysis of social media conversations (Twitter in particular) related to the COVID-19 pandemic. Different types of visualization and analysis make it possible to investigate changing trends in the public perception of the disease and of the measures taken to deal with it.

Annotation API

We provide a fast, efficient, accurate document annotation service. It will find mentions of biomedically relevant entities in any document provided as input. Please find information here.
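To make the idea of entity mention detection concrete, the following is a minimal sketch of dictionary-based annotation. It is an illustration only and does not reproduce the group's pipeline or its API: the `LEXICON` contents, the `Mention` type, and the `annotate` function are all hypothetical names chosen for this example, and a real system would likely rely on large terminological resources and more robust matching.

```python
import re
from typing import Dict, List, NamedTuple


class Mention(NamedTuple):
    """A detected entity mention with character offsets into the document."""
    start: int
    end: int
    text: str
    entity_type: str


# Hypothetical toy lexicon mapping surface terms to entity types.
LEXICON: Dict[str, str] = {
    "ACE2": "gene",
    "remdesivir": "drug",
    "fever": "symptom",
    "lung": "organ",
}


def annotate(document: str, lexicon: Dict[str, str] = LEXICON) -> List[Mention]:
    """Return all whole-word, case-insensitive lexicon matches, sorted by offset."""
    mentions: List[Mention] = []
    for term, entity_type in lexicon.items():
        # \b restricts matches to whole words, so "lungs" does not match "lung".
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(document):
            mentions.append(
                Mention(match.start(), match.end(), match.group(0), entity_type)
            )
    return sorted(mentions)


if __name__ == "__main__":
    for mention in annotate("Fever and lung damage; ACE2 binds."):
        print(mention)
```

Returning character offsets together with the entity type keeps the annotations separate from the text itself, so the same document can be re-annotated with a different lexicon without modifying it.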


We collaborate with other research groups on COVID-19-related tasks:

  • with the NLP group at FBK, Italy, in order to extract from the literature relationships between COVID-19 and other relevant domain entities.
  • with the Hunter Group at the University of Colorado, Denver. The goal is to provide rich annotations on a large literature dataset, such as CORD-19.
  • with the RegulonDB group (Julio Collado Vides) of UNAM, Mexico. The purpose is to create a "linked dataset" of COVID-19 literature, which will show interrelationships among different publications.
  • with a multi-institutional group based in Barcelona, in an activity aimed at creating a manually classified collection of COVID-19 literature relevant in clinical contexts, with Spanish translations (see Other useful resources).

Other NLP+COVID19 initiatives

Other useful resources

General information about COVID-19

Who are we?

This page is maintained by the Biomedical Text Mining group at the Institute of Computational Linguistics, University of Zurich.

For additional information about the tools and research activities described on this page, please contact Fabio Rinaldi.

Author: Fabio Rinaldi

Created: 2020-06-28 Sun 17:49