Credit Suisse
The Credit Suisse corpus is built on the world's oldest banking magazine, the Credit Suisse Bulletin, which has been in print since 1895.
The corpus consists of three main subcorpora: the Credit Suisse News corpus, the Credit Suisse PDF Bulletin corpus, and the Credit Suisse Bulletin In Print corpus.
Credit Suisse News corpus
The Credit Suisse News corpus is a collection of news articles from the Credit Suisse website in four languages (English, French, German, Italian). The articles date from 2001 to 2017.
lang | tokens | types | lemmas | sents | texts |
---|---|---|---|---|---|
de | 1908735 | 105560 | 58166 | 115196 | 1797 |
en | 2078198 | 53534 | 26839 | 110483 | 1821 |
fr | 2027287 | 56333 | 20009 | 99444 | 1596 |
it | 1869238 | 64029 | 21447 | 94439 | 1542 |
Total | 7883458 | 279456 | 126461 | 419562 | 6756 |
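The Total row is the plain column sum of the per-language rows, so the total for types and lemmas adds up per-language counts and is not a count of distinct forms across languages. As a minimal sketch (the variable and function names below are illustrative and not part of any corpus tooling), the figures from the table can be checked and a simple type/token ratio derived per language:

```python
# Per-language figures copied from the Credit Suisse News corpus table above.
# All names in this sketch are hypothetical helpers, not corpus tooling.
news = {
    "de": {"tokens": 1908735, "types": 105560, "lemmas": 58166, "sents": 115196, "texts": 1797},
    "en": {"tokens": 2078198, "types": 53534, "lemmas": 26839, "sents": 110483, "texts": 1821},
    "fr": {"tokens": 2027287, "types": 56333, "lemmas": 20009, "sents": 99444, "texts": 1596},
    "it": {"tokens": 1869238, "types": 64029, "lemmas": 21447, "sents": 94439, "texts": 1542},
}

# The Total row as printed in the table.
reported_total = {
    "tokens": 7883458, "types": 279456, "lemmas": 126461, "sents": 419562, "texts": 6756,
}


def column_sums(rows):
    """Sum every column over all language rows."""
    totals = {}
    for counts in rows.values():
        for column, value in counts.items():
            totals[column] = totals.get(column, 0) + value
    return totals


def type_token_ratio(counts):
    """Distinct word forms divided by running words for one language row."""
    return counts["types"] / counts["tokens"]


computed_total = column_sums(news)
totals_match = computed_total == reported_total

if __name__ == "__main__":
    print("Total row matches column sums:", totals_match)
    for lang, counts in news.items():
        print(lang, round(type_token_ratio(counts), 4))
```

The same check applies unchanged to the other two subcorpora if their rows are substituted for the dictionary above.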
Alignment
The corpus has been aligned at the document and sentence levels.
Credit Suisse PDF Bulletin corpus
The Credit Suisse PDF Bulletin corpus is a collection of magazine articles from the Credit Suisse Bulletin in four languages (English, French, German, Italian). The articles date from 1998 to 2017.
lang | tokens | types | lemmas | sents | texts |
---|---|---|---|---|---|
de | 3610493 | 204526 | 111217 | 269416 | 2713 |
en | 2225123 | 78753 | 35245 | 137688 | 1405 |
fr | 4012143 | 114011 | 30538 | 255677 | 2613 |
it | 3393228 | 117638 | 32723 | 215317 | 2319 |
Total | 13240987 | 514928 | 209723 | 878098 | 9050 |
Alignment
The corpus has been aligned at the document and sentence levels.
Credit Suisse Bulletin In Print corpus
The Credit Suisse Bulletin In Print corpus is a collection of magazine articles from the Credit Suisse Bulletin in five languages (English, French, German, Italian, Spanish). The articles date from 1895 to 1997.
lang | tokens | types | lemmas | sents | texts |
---|---|---|---|---|---|
de | 14239553 | 467354 | 172506 | 1204989 | 787 |
en | 4632880 | 112646 | 35349 | 423903 | 101 |
es | 909556 | 46421 | 11662 | 87721 | 19 |
fr | 16106141 | 262097 | 35479 | 1179453 | 627 |
it | 4644146 | 129632 | 27993 | 389859 | 99 |
Total | 40532276 | 1018150 | 282989 | 3285925 | 1633 |
Alignment
The corpus has not been aligned yet.
Publications
- Volk et al. (2016): Building a Parallel Corpus on the World's Oldest Banking Magazine
Relevant links
- Multilingwis example ‹rentrer chez soi›: [rentrer chez soi] /corpus=cs