Hit Song Prediction Through Machine Learning and Spotify Data

Authors

Abstract

This study predicts hit songs using metadata from the Spotify API[8]. The dataset covers over 20 genres, each with 40 songs split equally between hits and flops, gathered using spotipy[7]. Prediction is based on the popularity feature, rated from 0 to 100. Models were trained on features such as danceability, energy, loudness, speechiness, valence, and tempo. The dataset was split using train_test_split (test sizes of 10%, 20%, and 33%) and k-fold cross-validation with k values of 2, 5, and 10. Models were trained, evaluated, and tested, with k-fold cross-validation showing the best accuracy and the least overfitting. Scikit-learn's classifiers, ensemble models, and MLPClassifier were used, with PassiveAggressiveClassifier and AdaBoost showing 60% accuracy. Ensemble methods such as extra trees and random forest, along with neural networks, performed well. Gaussian Process, Naive Bayes, and ridge classifiers stood out among the standard models. These results suggest that enhanced models, especially neural networks and decision tree ensembles, could improve hit prediction. Future work may explore frequency and lyric analysis.
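The evaluation protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the feature values and popularity scores are synthetic, the dataset size of 800 songs, the hit threshold of 50, and the choice of RandomForestClassifier with 50 trees are all assumptions made for the example. Only the feature names, the test sizes, and the k values come from the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# Feature names come from the abstract; all values below are synthetic placeholders.
FEATURES = ["danceability", "energy", "loudness", "speechiness", "valence", "tempo"]

rng = np.random.default_rng(0)
n_songs = 800  # assumed size for illustration (20 genres x 40 songs)
X = rng.random((n_songs, len(FEATURES)))
popularity = rng.integers(0, 101, size=n_songs)  # synthetic scores in 0..100
y = (popularity >= 50).astype(int)  # assumed hit/flop threshold, not from the paper

# Hold-out evaluation with the three test sizes named in the abstract.
holdout_scores = {}
for test_size in (0.10, 0.20, 0.33):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=0, stratify=y
    )
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_train, y_train)
    holdout_scores[test_size] = model.score(X_test, y_test)

# k-fold cross-validation with the three k values named in the abstract.
cv_scores = {}
for k in (2, 5, 10):
    folds = KFold(n_splits=k, shuffle=True, random_state=0)
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    cv_scores[k] = cross_val_score(model, X, y, cv=folds).mean()

print("hold-out accuracy:", holdout_scores)
print("mean k-fold accuracy:", cv_scores)
```

Because the labels here are random, the accuracies should hover around chance level; with real Spotify features the same loop would allow a comparison between hold-out and k-fold results for any scikit-learn classifier.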

Published

30.10.2024

How to cite

Hit Song Prediction Through Machine Learning and Spotify Data. (2024). In Proceedings of the 10th Student Computing Research Symposium (SCORES'24) (pp. 57-60). Univerzitetna založba Univerze v Mariboru. https://press.um.si/index.php/ump/catalog/book/886/chapter/153