Govorjeni jezik med raziskovanjem in tehnologijo: Zbornik povzetkov

Authors

Darinka Verdonik (ed.)
University of Maribor, Faculty of Electrical Engineering and Computer Science
https://orcid.org/0000-0003-3972-739X
Nikola Ljubešić (ed.)
Jožef Stefan Institute
https://orcid.org/0000-0001-7169-9152

Keywords:

spoken language resource, speech technology, corpus linguistics, language corpus, speech research

Synopsis

Spoken Language between Research and Technology: Book of Abstracts. This book of abstracts from the conference Spoken Language between Research and Technology gathers contributions at the intersection of spoken language resources, linguistics, and speech technologies. It presents publicly available Croatian child-language corpora in CHILDES/TalkBank and the ParlaSpeech V3 collection. Several contributions address the creation and processing of Slovenian speech resources, ranging from citizen-science strategies and open-source tools (alignment, anonymization, validation, normalization) to phonetic transcription in the Digital Dictionary Database of Slovene and the expansion of lexical resources with typically spoken vocabulary. Other contributions cover (dis)fluency and filled-pause detection, the relationship between prosodic and syntactic units, and the challenges of dialect transcription; the new EPIC-SI early communication corpus is also announced. The volume is open access under the CC BY-SA license and is intended for researchers in linguistics, corpus studies, and speech technologies, as well as the broader professional community.

Author Biographies

Darinka Verdonik (ed.), University of Maribor, Faculty of Electrical Engineering and Computer Science

Darinka Verdonik is an Associate Professor at the University of Maribor, Slovenia. Her research focuses on spoken language, discourse markers, disfluencies, and the development of spoken language corpora. She has led and collaborated on several projects in corpus linguistics and language technologies, with particular attention to the compilation and annotation of spoken Slovene corpora. She has published extensively on discourse and interactional phenomena, as well as on methodological aspects of corpus building. Her work connects theoretical linguistics with practical applications in speech and language technologies.

Maribor, Slovenia. E-mail: darinka.verdonik@um.si

Nikola Ljubešić (ed.), Jožef Stefan Institute

Nikola Ljubešić is a Senior Research Associate at the Jožef Stefan Institute in Ljubljana, Slovenia. His work focuses on natural language processing for South Slavic languages, with expertise in corpus creation, linguistic annotation, and language technologies for under-resourced languages. He has coordinated and contributed to the development of large-scale resources such as ParlaMint and ParlaSpeech and has played a key role in building datasets for Slovene, Croatian, and Serbian. His research combines methodological innovation in NLP with practical applications, aiming to improve language resources and tools for both academic research and digital humanities.

Ljubljana, Slovenia. E-mail: nljubesi@gmail.com

Published

September 11, 2025

Details about this monograph

THEMA Subject Codes (93)

C, U

ISBN-13 (15)

978-961-299-050-3

COBISS.SI ID (00)

Date of first publication (11)

2025-09-11

How to Cite

Verdonik, D., & Ljubešić, N. (Eds.). (2025). Govorjeni jezik med raziskovanjem in tehnologijo: Zbornik povzetkov. University of Maribor Press. https://doi.org/10.18690/um.feri.9.2025