Differences

This shows you the differences between two versions of the page.

public:pacoco:horizonte [2019-07-17 23:13] – [Horizonte] tkew
public:pacoco:horizonte [2023-09-15 20:33] (current) – external edit 127.0.0.1
Line 1: Line 1:
+~~NOTOC~~
 ====== Horizonte ======
 
-{{  :public:pacoco:snf_horizonte_119.jpeg?300|Horizonte nr 119}}
+{{  snf_horizonte_119.jpeg?200|Horizonte nr 119}}
 
 The Horizonte corpus is built upon the magazine of the same name, published by the Swiss National Science Foundation ([[http://www.snf.ch/en/Pages/default.aspx|SNSF]]).
Line 7: Line 8:
 This corpus consists of magazine articles in German, French and English related to popular science and research projects in and around Switzerland.
  
-==== Horizonte Online ====
+
+===== Horizonte Online =====
  
 The Horizonte Online corpus consists of articles available on the [[https://www.horizonte-magazin.ch/|Horizons magazine website]], collected in 2018. These articles span 4 years, from 2014 until 2018.
  
-^lang ^ tokens ^ types ^ lemmas ^ sents ^ texts ^
-|de | 114084 | 19318 | 10609 | 8584 | 158 |
-|en | 131146 | 13324 | 8404 | 8035 | 157 |
-|fr | 126333 | 15010 | 7315 | 7583 | 158 |
-^Total ^ 371563 ^ 47652 ^ 26328 ^ 24202 ^ 473 ^
+^ lang   ^ tokens  ^ types  ^ lemmas  ^ sents  ^ texts  ^
+^ de     |  114084 |  19318 |   10609 |   8584 |    158 |
+^ en     |  131146 |  13324 |    8404 |   8035 |    157 |
+^ fr     |  126333 |  15010 |    7315 |   7583 |    158 |
+^ Total  ^  371563 ^  47652 ^   26328 ^  24202 ^    473 ^
  
-==== Horizonte PDF ====
+==== Alignment ====
+The corpus has been aligned on the document level.
  
-The Horizonte PDF corpus consists of articles taken from electronic PDFs of the Horizonte magazine from their online archive. The articles span 12 years, from 2005 until 2017.
 
-^lang ^ tokens ^ types ^ lemmas ^ sents ^ texts ^
-|de | 1025245 | 85221 | 35577 | 75014 | 1237 |
-|en | 392975 | 24793 | 14209 | 23865 | 395 |
-|fr | 1193874 | 51562 | 17557 | 71995 | 1237 |
-^Total ^ 2612094 ^ 161576 ^ 67343 ^ 170874 ^ 2869 ^
+===== Horizonte PDF =====
 
+The Horizonte PDF corpus consists of articles taken from electronic PDFs of the Horizonte magazine from their online archive. The articles span 12 years, from 2005 until 2017.
 
+^ lang   ^ tokens   ^ types   ^ lemmas  ^ sents   ^ texts  ^
+^ de     |  1025245 |   85221 |   35577 |   75014 |   1237 |
+^ en     |   392975 |   24793 |   14209 |   23865 |    395 |
+^ fr     |  1193874 |   51562 |   17557 |   71995 |   1237 |
+^ Total  ^  2612094 ^  161576 ^   67343 ^  170874 ^   2869 ^
  
----------
+==== Alignment ====
+The corpus has been aligned on the document level.
 
-=== Relevant links ===
 
+===== Relevant links =====
 
-=== Publications ===
+  * SNSF Horizonte [[http://www.snf.ch/de/fokusForschung/forschungsmagazin-horizonte/Seiten/default.aspx]]
  

CL Wiki

Institute of Computational Linguistics – University of Zurich