GOKO Dialect Corpus Transcription and Annotation Standards: Klara Šumenjak
Synopsis
GOKO Dialect Corpus Transcription and Annotation Standards. The article presents some of the principles behind the construction of GOKO (Govorni korpus Koprive na Krasu), the first Slovene dialect corpus, accessible at http://jt.upr.si/GOKO/. It briefly describes the scope of the corpus, the demographic sampling, the recordings and their units, and the corpus annotation. The central part is devoted to the transcription standards and the challenges that had to be overcome. All three levels at which the corpus is transcribed are described: a) the phonetic transcription, which captures all the phonological features of the Kopriva na Krasu dialect; b) the simplified dialect transcription, which represents only the basic phonological features of the regional speech; and c) the literary version, in which each word is replaced by its literary-language counterpart while the features of the spoken language at the phrase and sentence level are retained.