This shows you the differences between two versions of the page.
public:pacoco:credit_suisse [2019-07-18 22:24] – [Table] Johannes Graën | public:pacoco:credit_suisse [2023-09-15 20:33] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ~~NOTOC~~ | ||
====== Credit Suisse ====== | ====== Credit Suisse ====== | ||
Line 6: | Line 7: | ||
The corpus consists of three main subcorpora: Credit Suisse News corpus, Credit Suisse PDF Bulletin corpus and Credit Suisse Bulletin In Print corpus. | The corpus consists of three main subcorpora: Credit Suisse News corpus, Credit Suisse PDF Bulletin corpus and Credit Suisse Bulletin In Print corpus. | ||
- | ==== Credit Suisse News corpus ==== | + | |
+ | ===== Credit Suisse News corpus ===== | ||
The Credit Suisse News Corpus is a collection of news articles from the Credit Suisse web page in four languages (English, French, German, Italian). They range from 2001 to 2017. | The Credit Suisse News Corpus is a collection of news articles from the Credit Suisse web page in four languages (English, French, German, Italian). They range from 2001 to 2017. | ||
Line 17: | Line 19: | ||
^ Total ^ 7883458 ^ 279456 ^ 126461 ^ 419562 ^ 6756 ^ | ^ Total ^ 7883458 ^ 279456 ^ 126461 ^ 419562 ^ 6756 ^ | ||
+ | ==== Alignment ==== | ||
+ | The corpus has been aligned at the document and sentence level. | ||
- | ==== Credit Suisse PDF Bulletin corpus ==== | + | |
+ | ===== Credit Suisse PDF Bulletin corpus ===== | ||
The Credit Suisse PDF Bulletin Corpus is a collection of magazine articles from the Credit Suisse Bulletin in four languages (English, French, German, Italian). They range from 1998 to 2017. | The Credit Suisse PDF Bulletin Corpus is a collection of magazine articles from the Credit Suisse Bulletin in four languages (English, French, German, Italian). They range from 1998 to 2017. | ||
^ lang ^ tokens | ^ lang ^ tokens | ||
- | | de | + | ^ de |
- | | en | + | ^ en |
- | | fr | + | ^ fr |
- | | it | + | ^ it |
^ Total ^ 13240987 ^ 514928 ^ 209723 ^ 878098 ^ 9050 ^ | ^ Total ^ 13240987 ^ 514928 ^ 209723 ^ 878098 ^ 9050 ^ | ||
- | ==== Credit Suisse Bulletin In Print corpus ==== | + | ==== Alignment ==== |
+ | The corpus has been aligned at the document and sentence level. | ||
+ | |||
+ | |||
+ | ===== Credit Suisse Bulletin In Print corpus ===== | ||
The Credit Suisse Bulletin In Print Corpus is a collection of magazine articles from the Credit Suisse Bulletin in five languages (English, French, German, Italian, Spanish). They range from 1895 to 1997. | The Credit Suisse Bulletin In Print Corpus is a collection of magazine articles from the Credit Suisse Bulletin in five languages (English, French, German, Italian, Spanish). They range from 1895 to 1997. | ||
^ lang ^ tokens | ^ lang ^ tokens | ||
- | | de | + | ^ de |
- | | en | + | ^ en |
- | | es | + | ^ es |
- | | fr | + | ^ fr |
- | | it | + | ^ it |
^ Total ^ 40532276 ^ 1018150 ^ 282989 ^ 3285925 ^ 1633 ^ | ^ Total ^ 40532276 ^ 1018150 ^ 282989 ^ 3285925 ^ 1633 ^ | ||
- | --------- | + | ==== Alignment ==== |
+ | The corpus has not been aligned yet. | ||
- | === Relevant links === | ||
+ | ===== Publications ===== | ||
+ | |||
+ | * Building a Parallel Corpus on the World' | ||
+ | |||
+ | |||
+ | ===== Relevant links ===== | ||
+ | * Multilingwis example ‹rentrer chez soi›: [[mlw> | ||
*[[https:// | *[[https:// | ||
*[[https:// | *[[https:// | ||
+ | *[[https:// | ||
+ | |||
- | === Publications === | ||
- | * Building a Parallel Corpus on the World' |