Differences

This shows you the differences between two versions of the page.

public:pacoco:text_berg [2019-07-18 22:24] – [Table] Johannes Graën
public:pacoco:text_berg [2019-07-18 22:46] Johannes Graën
Line 17: Line 17:
 The corpus has been divided into its language specific subsections. The table below provides an overview of corpus statistics for each subsection.
  
-=== SAC ===
 +
 +===== SAC =====
 ^ lang   ^ tokens  ^ types  ^ lemmas  ^ sents  ^ texts  ^
 ^ de     |   23.4m |   769k |    325k |   1.3m |    12k |
Line 27: Line 28:
 ^ Total  ^   38.6m ^   1.1m ^    429k ^   2.1m ^    21k ^
  
-===EdA===
 +
 +===== EdA =====
 ^ lang  ^ tokens  ^ types  ^ lemmas  ^ sents  ^ texts  ^
 ^ fr    |    7.4m |   185k |     40k |   376k |   4.5k |
  
-===BAC===
 +
 +===== BAC =====
 ^ lang  ^ tokens  ^ types  ^ lemmas  ^ sents  ^ texts  ^
 ^ en    |    6.5m |   181k |     60k |   289k |   1.5k |
  
------------------------------- 
  
-=== Relevant links ===
-
-
-  * [[http://textberg.ch/site/en/corpora/|Text+Berg Project Website]]
-  * [[https://www.sac-cas.ch/|Swiss Alpine Club]]
-
-=== Publications ===
 +===== Publications =====
   * Detection and annotation of code-switching [[https://www.zora.uzh.ch/id/eprint/100577/|Clematide and Volk 2014]]
   * Crowdsourced correction of OCR errors [[https://www.zora.uzh.ch/id/eprint/124786/|Clematide et al. 2016]], [[https://www.zora.uzh.ch/id/eprint/162395/|Clematide et al. 2018]]
Line 51: Line 47:
   * special handling of elliptical compound nouns and separable prefix verbs in German [[https://www.zora.uzh.ch/id/eprint/126372/|Volk et al. 2016]], [[https://www.zora.uzh.ch/id/eprint/85249/|Aepli and Volk 2013]]
   * See here for more [[http://textberg.ch/site/de/publi/|publications from the Text+Berg project]]
 +
 +
 +===== Relevant links =====
 +
 +  * [[http://textberg.ch/site/en/corpora/|Text+Berg Project Website]]
 +  * [[https://www.sac-cas.ch/|Swiss Alpine Club]]
 +

CL Wiki

Institute of Computational Linguistics – University of Zurich