Jezikovni modeli za pripravo govornega korpusa: programi za prepoznavanje govora: Teodor Petrič
Synopsis
Language Models for Spoken Corpus Preparation: Speech Recognition Software. Over the last decade, and particularly in the last five years, following the emergence of large language models based on transformer architectures, a number of software tools have been developed that accelerate the creation of multi-layered corpora. We tested tools that recognize speech and convert it to written form (Razpoznavalnik, Microsoft Word Dictate, Vosk/Kaldi and OpenAI Whisper), since such tools are crucial for accelerating the creation of spoken corpora. We evaluated them against several criteria: ease of use, time savings, potential costs, protection of speaker anonymity, and conversion quality (e.g. word error rate and the numbers of substitutions, insertions and deletions). Although speech-to-text tools have made considerable progress, it would be desirable to be able to customize their output formats to meet individual research needs, e.g. by including discourse markers (such as so-called ‘fillers’) or the contracted word forms actually spoken in the transcription.
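The conversion-quality measures named above can be illustrated with a minimal sketch. It assumes the conventional definition of word error rate, WER = (S + I + D) / N, where S, I and D are the substitutions, insertions and deletions in a minimum-edit word alignment and N is the number of words in the reference transcription. The names `ErrorCounts` and `count_errors` and the sample sentences are hypothetical and serve only as illustration; this is not the evaluation procedure used in the study.

```python
from typing import NamedTuple, Sequence


class ErrorCounts(NamedTuple):
    substitutions: int
    insertions: int
    deletions: int
    reference_length: int

    @property
    def wer(self) -> float:
        # WER = (S + I + D) / N, with N the number of reference words.
        errors = self.substitutions + self.insertions + self.deletions
        return errors / self.reference_length


def count_errors(reference: Sequence[str], hypothesis: Sequence[str]) -> ErrorCounts:
    """Align two word sequences by edit distance and count the error types."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    n, m = len(reference), len(hypothesis)
    # cell[i][j] = (total, subs, ins, dels) for reference[:i] vs hypothesis[:j]
    cell = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, 0, i)   # only deletions
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, j, 0)   # only insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, a, d = cell[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, a, d)            # correct word
            else:
                diagonal = (t + 1, s + 1, a, d)    # substitution
            t, s, a, d = cell[i][j - 1]
            insertion = (t + 1, s, a + 1, d)
            t, s, a, d = cell[i - 1][j]
            deletion = (t + 1, s, a, d + 1)
            # Keep the cheapest path; ties favour the diagonal step.
            cell[i][j] = min(diagonal, insertion, deletion, key=lambda c: c[0])
    _, subs, ins, dels = cell[n][m]
    return ErrorCounts(subs, ins, dels, n)


# Hypothetical example: manual reference vs. recognizer output.
counts = count_errors("we tested four tools".split(), "we tested for".split())
print(counts, counts.wer)
```

Whether a filler or a contracted form counts as an error in such a comparison depends on the conventions of the reference transcription, which is one reason customizable output formats matter.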