Evaluation of the Progress of Generative Artificial Intelligence in Creating Higher Mathematics Tasks: A Comparative Study

Authors

Uroš Sterle
Kranj School Centre, Secondary Technical School

Synopsis

Generative artificial intelligence (GenAI) is significantly changing the educational process, including by automating the preparation of knowledge assessments. In this article, we discuss the use of GenAI tools in designing tasks for higher mathematics, where accuracy is of key importance. Although GenAI can save educators time in compiling exams, its output requires critical judgment because of its tendency to "hallucinate", that is, to generate convincing but incorrect information. In our study, we compared five leading artificial intelligence tools (ChatGPT, DeepSeek, Gemini, Copilot, Grok) on five complex mathematics problems. The longitudinal comparison adds particular value: we analyzed the answers obtained in the first phase of the study in April 2025 and compared them with the results of the same tools in January 2026. The aim of the paper is to show how far the tools have progressed and to justify the need for a "human-in-the-loop" approach in pedagogical practice.

Author Biography

Uroš Sterle, Kranj School Centre, Secondary Technical School

Uroš Sterle is a mathematics professor and a graduate of the Faculty of Mathematics and Physics in Ljubljana with extensive experience in computer science. He began his career in the business world as a programmer, where he applied his mathematical knowledge to develop solutions and optimize processes. Since 2010, he has been teaching at the Kranj School Centre, where he combines theoretical concepts with practice in databases, web applications, and network services. He is the author of educational materials for developing algorithmic thinking and actively participates in projects to strengthen digital competencies and in higher education. Through continuous professional development and over 20 years of experience in individual teaching, he combines in-depth mathematical knowledge with its practical application in modern technology.

Kranj, Slovenia. E-mail: uros.sterle@sckr.si

Published

March 17, 2026

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Sterle, U. (2026). Evaluation of the Progress of Generative Artificial Intelligence in Creating Higher Mathematics Tasks: A Comparative Study. In P. Šprajc, D. Maletič, N. Petrović, I. Iztok, A. Škraba, D. Tomić, & A. Žnidaršič Mohorič (Eds.), 45th International Conference on Organizational Science Development: Organization and the Longevity Society, Conference Proceedings (Vol. 45, pp. 799-806). University of Maribor Press. https://doi.org/10.18690/um.fov.3.2026.58