Research Project, since 2004
A project of the Swiss Academy of Humanities and Social Sciences
Academy, since 1946
The Swiss Text Corpus is part of an international research project that aims to build a balanced collection of 20th-century Standard German and to make it accessible online by digitising German-language texts of all kinds (newspaper articles, advertising, forms, instructions, guides, popular science writing, young-adult and light fiction, literary fiction, etc.). The Swiss sub-project brings together 20th-century German-language texts by Swiss authors. Like the other subcorpora in Germany, Austria and Italy, the digital collection is structured according to formal, content-related and temporal criteria. It offers a balanced representation of German-Swiss vocabulary and can serve as a basis for specifically Swiss lexicographic needs.
The joint digital text corpus, which is being developed together with the partner projects from Germany, Austria and Italy, is called Corpus C4 and is expected to contain around 80 million text words when complete. A first version of Corpus C4 has been available since April 2009 (http://www.korpus-c4.org), although it does not yet contain the full stock of text words. It is the first balanced text corpus of 20th-century German that takes regional variation into account and can be used to address a wide range of linguistic questions.
Since the beginning of 2017, the corpus has been extended to cover Swiss texts from the 21st century; since the beginning of 2019, it has contained an additional 3.8 million text words from that period.
The Swiss Text Corpus was developed by a research group at the German Department of the University of Basel and was financed mainly by the Swiss National Science Foundation during this phase. Since 2014, it has been maintained at the Schweizerisches Idiotikon and financially supported by the Swiss Academy of Humanities and Social Sciences.
Classification
Research projects in AGATE are classified according to various aspects. The content-related aspects include the assignment of the time period that the project deals with and a geographical categorisation. By selecting one of the following tags, you can search the project database for projects with the selected focus.
Temporal Classification
Spatial Classification
Languages
Research Objects
Research Methods
The research methods employed are an essential component of research projects. In the field of research methods, projects can be filtered by their specific research activities and techniques, making it easy to find projects with similar approaches.
Research Activities
Partners
- Permanent Cooperation Partners
  - Austrian Academy Corpus (Wien)
  - Korpus Südtirol (Bozen)
  - Institut für Computerlinguistik, Universität Zürich
  - KorpusLab, Universität Zürich
- Cooperating Projects
  - Digitales Wörterbuch der deutschen Sprache
Persons
- Prof. Dr. Hans Bickel (Project Leader)
- MSc / MA Lorenz Küchler (Research Assistant)
- MA Muriel Peter (Research Assistant)
- Dr. Tobias Roth (Research Assistant)
- MA Manuela Weibel (Research Assistant)
- Manuela Lörtscher (Student Assistant)
- Selina Sprecher (Student Assistant)
Contact
Schweizer Textkorpus
c/o Schweizerisches Idiotikon
Auf der Mauer 5
CH-8001 Zürich