When Text Is Not Enough: Structural Limits of Text-Only Transformer-Based Emotion Classification

Szymon Chirowski; Maciej Czerniak; Ondrej Mitas; Maks Burchard

doi:10.18690/um.fov.4.2026

Authors

Szymon Chirowski

Breda University of Applied Sciences

Maciej Czerniak

Breda University of Applied Sciences

Ondrej Mitas

Breda University of Applied Sciences, Academy for Tourism

Maks Burchard

Breda University of Applied Sciences

DOI: https://doi.org/10.18690/um.fov.4.2026.44

Synopsis

This study investigates whether the limitations observed in text-only, transformer-based emotion classification pipelines reflect implementation shortcomings or structural constraints inherent to unimodal modeling. A pipeline was built on unscripted dialogue from MasterChef Polska, combining automated speech-to-text transcription, neural machine translation, and benchmarking across SVM, Bi-LSTM, and RoBERTa architectures. Although the fine-tuned RoBERTa model achieved substantially higher accuracy (0.755) than the other architectures, confusion matrix analysis and explainable AI techniques revealed persistent structural asymmetries, including uneven performance across emotion categories, confusion between the high-arousal emotions anger and joy, and translation-induced distortions. Evaluation against automated labels further exposed a “Ground Truth Paradox,” in which models validate one another rather than being checked against a human-verified reference. Increased architectural capacity improves performance but does not resolve the structural limitations of text-only emotion classification.
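The gap between a single accuracy figure and per-category behaviour can be illustrated with a minimal sketch. It is not the authors' code: the label set, the toy predictions, and the helper names `confusion_matrix`, `per_class_recall`, and `accuracy` are assumptions made up for illustration, and the numbers are unrelated to the reported 0.755.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Return nested counts: counts[true_label][predicted_label]."""
    counts = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts


def per_class_recall(counts):
    """Fraction of each true class that was predicted correctly."""
    recall = {}
    for true_label, row in counts.items():
        total = sum(row.values())
        recall[true_label] = row[true_label] / total if total else 0.0
    return recall


def accuracy(y_true, y_pred):
    """Share of all items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels, invented for this sketch (NOT data from the study).
LABELS = ["anger", "joy", "neutral"]
y_true = ["neutral"] * 6 + ["joy"] * 2 + ["anger"] * 2
y_pred = ["neutral"] * 6 + ["joy"] * 2 + ["joy", "anger"]

cm = confusion_matrix(y_true, y_pred, LABELS)
acc = accuracy(y_true, y_pred)
recall = per_class_recall(cm)

print(f"accuracy: {acc:.2f}")
for label in LABELS:
    print(f"recall[{label}]: {recall[label]:.2f}")
print("anger predicted as joy:", cm["anger"]["joy"])
```

In this toy case the overall accuracy looks strong while half of the anger items are labelled joy, which is the kind of asymmetry a confusion matrix exposes and a headline accuracy hides. If `y_true` itself came from another automated model rather than human annotators, the same numbers would only measure agreement between two models.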

Author Biographies

Szymon Chirowski, Breda University of Applied Sciences

Breda, the Netherlands. E-mail: 242621@buas.nl

Maciej Czerniak, Breda University of Applied Sciences

Breda, the Netherlands. E-mail: 243552@buas.nl

Ondrej Mitas, Breda University of Applied Sciences, Academy for Tourism

Breda, the Netherlands. E-mail: mitas.o@buas.nl

Maks Burchard, Breda University of Applied Sciences

Breda, the Netherlands. E-mail: 240894@buas.nl

Volume

39th Bled eConference: Co-Creating Human-Centred and Responsible Digital Futures; Conference Proceedings

Pages

705-720

Published

June 5, 2026

Series

Bled eConference

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Chirowski, S., Czerniak, M., Mitas, O., & Burchard, M. (2026). When Text Is Not Enough: Structural Limits of Text-Only Transformer-Based Emotion Classification. In D. Vidmar, A. Pucihar, M. Kljajić Borštnar, R. W. H. Bons, M. Glowatz, & H.-D. Zimmermann (Eds.), 39th Bled eConference: Co-Creating Human-Centred and Responsible Digital Futures; Conference Proceedings (Vol. 39, pp. 705-720). University of Maribor Press. https://doi.org/10.18690/um.fov.4.2026.44
