Tree-based Machine Learning Methods for Wind Farm Data

Authors

Manohar Gowdru Shridhara
Pavol Jozef Šafárik University in Košice, Faculty of Science
Ľubomír Antoni
Pavol Jozef Šafárik University in Košice, Faculty of Science
Gabriel Semanišin
Pavol Jozef Šafárik University in Košice, Faculty of Science

Synopsis

Environmental and energy datasets are typically characterized by nonlinear dependencies and a mix of numerical and categorical variables, which call for adaptable computational approaches. In this context, we explore tree-based machine learning methods, since they combine high predictive performance with a high level of interpretability. In this chapter, we present a comparative study of selected tree-based regression models applied to real-world environmental data from the United States Wind Turbine Database. The evaluated methods include a single regression decision tree, a bagging-based Random Forest ensemble, and modern gradient boosting implementations represented by CatBoost and LightGBM. All models are trained within a unified framework and evaluated using standard regression performance metrics. In our experiments, ensemble-based approaches substantially outperform a single decision tree. In particular, boosting-based models achieve higher predictive accuracy, with LightGBM providing the best overall performance in terms of squared-error metrics and the coefficient of determination. Feature importance analysis further highlights the role of technical turbine characteristics and categorical descriptors. The findings confirm that modern gradient boosting frameworks are an effective solution for regression tasks involving large-scale environmental and energy-related datasets.

Author Biographies

Manohar Gowdru Shridhara, Pavol Jozef Šafárik University in Košice, Faculty of Science

Manohar Gowdru Shridhara is a PhD student at the Faculty of Science, Pavol Jozef Šafárik University in Košice. His research interests include machine learning and optimization techniques, mainly in the fields of energetics and wind farms.

Košice, Slovakia. E-mail: manohar.gowdru.shridhara@student.upjs.sk

Ľubomír Antoni, Pavol Jozef Šafárik University in Košice, Faculty of Science

Ľubomír Antoni is an associate professor at the Institute of Computer Science, Faculty of Science, Pavol Jozef Šafárik University in Košice. His research interests include artificial intelligence, fuzzy systems, data mining, and applied machine learning.

Košice, Slovakia. E-mail: lubomir.antoni@upjs.sk

Gabriel Semanišin, Pavol Jozef Šafárik University in Košice, Faculty of Science

Gabriel Semanišin is a professor of Computer Science at the Faculty of Science, Pavol Jozef Šafárik University in Košice. His research focuses mainly on algorithmic graph theory and its applications in various areas of theoretical and applied informatics. He is a co-guarantor of the study programs Applied Informatics, Data Analysis and Artificial Intelligence, and Computer Science. He has supervised six PhD students in the study programs Computer Science, Discrete Mathematics, and Theory of Teaching Informatics.

Košice, Slovakia. E-mail: gabriel.semanisin@upjs.sk

Published

June 18, 2026

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Gowdru Shridhara, M., Antoni, Ľ., & Semanišin, G. (2026). Tree-based Machine Learning Methods for Wind Farm Data. In R. Leskovar (Ed.), Artificial Intelligence and Environmental Challenges: Research Insights and Emerging Solutions (pp. 1-20). University of Maribor Press. https://doi.org/10.18690/um.fov.5.2026.1