public:paste:spanish.sh [2014-07-10 20:15] – created Johannes Graën | public:paste:spanish.sh [2023-09-15 20:33] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | <file bash> | ||
+ | #!/bin/sh | ||
+ | pcorpus=" | ||
+ | |||
+ | zcat $pcorpus \ | ||
+ | | awk ' | ||
+ | | sed -r -e " | ||
+ | -e " | ||
+ | -e " | ||
+ | | tree-tagger-spanish-utf8 \ | ||
+ | | sed -r -e " | ||
+ | | tr -d " | ||
+ | | sed -r -e " | ||
+ | -e " | ||
+ | -e " | ||
+ | -e " | ||
+ | -e " | ||
+ | -e " | ||
+ | | gzip > sentences/ | ||
+ | </file> | ||