Schweizer Textkorpus : AGATE

The Swiss Text Corpus is part of an international research project that aims to build a balanced collection of 20th-century Standard German and to make it accessible online by digitising German-language texts of all kinds (newspaper articles, advertising, forms, instructions, guides, popular science writing, young-adult and light fiction, literary fiction, etc.). The Swiss sub-project brings together German-language texts by Swiss authors from the 20th century. Like the other subcorpora from Germany, Austria and Italy, the digital collection is structured according to formal, content-related and chronological criteria. It offers a balanced representation of Swiss German vocabulary and can serve as a basis for specifically Swiss lexicographic needs.
The joint digital text corpus, which is being developed with the partner projects from Germany, Austria and Italy, is called Corpus C4 and is expected to contain around 80 million text words once complete. A first version of Corpus C4 has been available online since April 2009 (http://www.korpus-c4.org), although it does not yet contain the full stock of text words. This is the first balanced text corpus of 20th-century German that takes regional variation into account and can be used for a wide range of linguistic questions.
Since the beginning of 2017, the corpus has been expanded to include Swiss texts from the 21st century. Since the beginning of 2019, it has included an additional 3.8 million text words from that period.
The Swiss Text Corpus was developed by a research group at the German Department of the University of Basel and was financed mainly by the Swiss National Science Foundation during that initial phase. Since 2014, it has been maintained at the Schweizerisches Idiotikon and financially supported by the Swiss Academy of Humanities and Social Sciences.

Partners

Permanent cooperation partners: Austrian Academy Corpus (Wien); Korpus Südtirol (Bozen); Institut für Computerlinguistik, Universität Zürich; KorpusLab, Universität Zürich
Cooperating projects: Digitales Wörterbuch der deutschen Sprache

People

Prof. Dr. Hans Bickel (Project Leader)
MSc / MA Lorenz Küchler (Research Assistant)
MA Muriel Peter (Research Assistant)
Dr. Tobias Roth (Research Assistant)
MA Manuela Weibel (Research Assistant)
Manuela Lörtscher (Student Assistant)
Selina Sprecher (Student Assistant)

Contact

Schweizer Textkorpus
c/o Schweizerisches Idiotikon
Auf der Mauer 5
CH-8001 Zürich

📞 +41 43 251 36 76

📧 info[at]schweizer-textkorpus.ch