This shows you the differences between two versions of the page.
public:pacoco:text_berg [2019-07-18 22:24] – [Table] Johannes Graën | public:pacoco:text_berg [2023-09-15 20:33] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ~~NOTOC~~ | ||
====== The Text+Berg Corpus ====== | ====== The Text+Berg Corpus ====== | ||
Line 17: | Line 18: | ||
The corpus has been divided into its language-specific subsections. The table below provides an overview of corpus statistics for each subsection. | The corpus has been divided into its language-specific subsections. The table below provides an overview of corpus statistics for each subsection. | ||
- | === SAC === | + | |
+ | ===== SAC ===== | ||
^ lang ^ tokens | ^ lang ^ tokens | ||
^ de | ^ de | ||
Line 27: | Line 29: | ||
^ Total ^ 38.6m ^ 1.1m ^ 429k ^ 2.1m ^ 21k ^ | ^ Total ^ 38.6m ^ 1.1m ^ 429k ^ 2.1m ^ 21k ^ | ||
- | ===EdA=== | + | ==== Alignment ==== |
+ | The corpus has been aligned on the sentence level. | ||
+ | |||
+ | |||
+ | ===== EdA ===== | ||
^ lang ^ tokens | ^ lang ^ tokens | ||
^ fr | 7.4m | 185k | 40k | 376k | 4.5k | | ^ fr | 7.4m | 185k | 40k | 376k | 4.5k | | ||
- | ===BAC=== | + | |
+ | ===== BAC ===== | ||
^ lang ^ tokens | ^ lang ^ tokens | ||
^ en | 6.5m | 181k | 60k | 289k | 1.5k | | ^ en | 6.5m | 181k | 60k | 289k | 1.5k | | ||
- | ------------------------------ | ||
- | === Relevant links === | + | ===== Publications |
- | + | ||
- | + | ||
- | * [[http:// | + | |
- | * [[https:// | + | |
- | + | ||
- | === Publications | + | |
* Detection and annotation of code-switching [[https:// | * Detection and annotation of code-switching [[https:// | ||
* Crowdsourced correction of OCR errors [[https:// | * Crowdsourced correction of OCR errors [[https:// | ||
Line 51: | Line 51: | ||
* special handling of elliptical compound nouns and separable prefix verbs in German [[https:// | * special handling of elliptical compound nouns and separable prefix verbs in German [[https:// | ||
* See here for more [[http:// | * See here for more [[http:// | ||
+ | |||
+ | |||
+ | ===== Relevant links ===== | ||
+ | |||
+ | * [[http:// | ||
+ | * [[https:// | ||
+ |