Predicting Lead Changes in Tennis Matches based on Machine Learning Models

Authors

  • Binrong Yang
  • Xiaohan Wang
  • Luanzhen Duan

DOI:

https://doi.org/10.56028/aetr.11.1.625.2024

Keywords:

Lead Change Prediction, Dimensionality Reduction, Extra-Trees, Stacking Model.

Abstract

Using men's singles data from the 2023 Wimbledon Championships, this study applies statistical and machine learning techniques to explore the dynamics of lead changes during matches. The techniques include Principal Component Analysis (PCA), Uniform Manifold Approximation and Projection (UMAP), Random Forests, and Extremely Randomized Trees (Extra-Trees). By constructing mathematical models, the study predicts lead changes at key moments of a match, which have a decisive impact on match strategy and outcome. The experimental results show that the Extra-Trees model performs well on all evaluation metrics, especially accuracy, recall, and F1 score, demonstrating its ability to handle complex interactions and variations in the data. The CatBoost model performs well in identifying actual lead changes but has lower accuracy, possibly because it overfits certain data patterns. The hybrid (stacking) model shows balanced performance but does not outperform the Extra-Trees model in overall effectiveness.
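The evaluation metrics named in the abstract can diverge on the same set of predictions, which is why a model can identify most actual lead changes and still score lower on other measures. The following is a minimal sketch of how such metrics are commonly computed for a binary lead-change label. The function name `binary_metrics` and the ten example labels are hypothetical illustrations, not the authors' code or data.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels.

    Assumption for this sketch: label 1 means "a lead change occurs"
    and label 0 means "no lead change".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # lead changes caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # lead changes missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for ten points (illustration only, not match data).
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 1, 1, 0, 1, 0, 1]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example every actual lead change is caught, so recall is 1.0, while the three false alarms pull accuracy down to 0.7 and precision to 0.5.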

Published

2024-07-18