Model
VANGA Sports is a hybrid model that combines several types of predictive models:
Nearest Neighbors
This model finds the most similar matches from the past based on player characteristics, previous match results, and other factors.
It is a machine learning algorithm that can be used for classification or regression. It works by comparing a new object with a set of already known objects.
The principle is to find a predefined number of training samples that are closest in distance to the new point and to predict its label from them.
The number of samples can be a user-specified constant (k-nearest neighbor learning) or vary depending on the local density of points (radius-based neighbor learning).
In general, the distance can be any metric; the standard Euclidean distance is the most common choice.
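The two neighbor-selection rules described above can be sketched in plain Python. This is a minimal illustration, not the VANGA Sports implementation: the feature pairs (a scaled ranking advantage and a recent win rate), the sample data, and the function names `knn_predict` and `radius_predict` are all assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Standard Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3):
    """k-nearest neighbors: majority label among the k closest samples."""
    nearest = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def radius_predict(train, query, radius=0.2):
    """Radius-based neighbors: majority label among all samples within
    `radius` of the query; returns None when no sample is close enough."""
    inside = [label for features, label in train
              if euclidean(features, query) <= radius]
    if not inside:
        return None
    return Counter(inside).most_common(1)[0][0]


# Hypothetical past matches: (ranking advantage, recent win rate) -> 1 if
# player A won, 0 otherwise. Both features are assumed scaled to [0, 1].
past_matches = [
    ((0.9, 0.8), 1),
    ((0.8, 0.7), 1),
    ((0.7, 0.9), 1),
    ((0.2, 0.3), 0),
    ((0.1, 0.4), 0),
    ((0.3, 0.2), 0),
]

print(knn_predict(past_matches, (0.75, 0.75), k=3))
print(radius_predict(past_matches, (0.5, 0.5), radius=0.1))
```

With a fixed k the number of voters is constant, whereas with a fixed radius it depends on how densely the past matches cluster around the query, so a query in a sparse region may get no prediction at all.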
Decision Trees
This model builds a tree structure that determines which outcome is most likely for a given match.
The goal is to create a model that predicts the value of a target variable by learning simple decision rules inferred from the features of the data. A decision tree can be viewed as a piecewise constant approximation of the target.
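To make the "piecewise constant" view concrete, here is a small sketch of a regression tree over a single feature, written in plain Python. The data (a hypothetical form score mapped to an observed win share), the squared-error split criterion, and the names `build_tree` and `predict` are assumptions chosen for illustration only.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared errors around the mean of `values`."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(xs, ys, depth=0, max_depth=2):
    """Recursively split one feature at the threshold that minimises
    the squared error; leaves store a single constant prediction."""
    if depth == max_depth or len(set(xs)) == 1:
        return mean(ys)
    best_cost, best_threshold = None, None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        cost = sse(left) + sse(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_threshold = cost, threshold
    left = [(x, y) for x, y in zip(xs, ys) if x <= best_threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > best_threshold]
    return {
        "threshold": best_threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left],
                           depth + 1, max_depth),
        "right": build_tree([x for x, _ in right], [y for _, y in right],
                            depth + 1, max_depth),
    }


def predict(tree, x):
    """Follow the decision rules down to a leaf and return its constant."""
    while isinstance(tree, dict):
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree


# Hypothetical data: form score -> observed win share.
form = [1, 2, 3, 4, 5, 6, 7, 8]
win_share = [0.1, 0.1, 0.2, 0.2, 0.8, 0.8, 0.9, 0.9]

tree = build_tree(form, win_share, max_depth=2)
print([predict(tree, x / 2) for x in range(2, 17)])
```

Every input that falls between the same pair of thresholds receives the same prediction, so the printed values form a step function with one step per leaf.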
Neural Network
This model is trained on a dataset containing information about past sporting events.
It implements a multilayer perceptron (MLP) algorithm that is trained using backpropagation.
Multilayer Perceptron (MLP) is a supervised learning algorithm that learns a function f: R^m → R^o by training on a dataset, where m is the number of input dimensions and o is the number of output dimensions. Given a set of features and a target, it can learn a nonlinear function approximator for either classification or regression.
The difference from logistic regression is that between the input and output layers there can be one or more non-linear layers called hidden layers.
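The role of the hidden layer and of backpropagation can be sketched with a very small network in plain Python. Everything below is illustrative: the class name `TinyMLP`, the sigmoid activations, the squared-error loss, the XOR toy data, and the hyperparameters are assumptions for the example, not the configuration used by VANGA Sports.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, sigmoid activations, one output unit."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, output

    def loss(self, data):
        """Mean of 0.5 * (output - target)^2 over the dataset."""
        return sum(0.5 * (self.forward(x)[1] - y) ** 2
                   for x, y in data) / len(data)

    def train_epoch(self, data, lr=0.5):
        """One full-batch gradient-descent step using backpropagation."""
        n_hidden, n_in = len(self.w1), len(self.w1[0])
        g_w1 = [[0.0] * n_in for _ in range(n_hidden)]
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in data:
            hidden, output = self.forward(x)
            # Error signal at the output unit.
            delta_out = (output - y) * output * (1.0 - output)
            g_b2 += delta_out
            for j, h in enumerate(hidden):
                g_w2[j] += delta_out * h
                # Error signal propagated back to hidden unit j.
                delta_h = delta_out * self.w2[j] * h * (1.0 - h)
                g_b1[j] += delta_h
                for i, xi in enumerate(x):
                    g_w1[j][i] += delta_h * xi
        scale = lr / len(data)
        for j in range(n_hidden):
            self.w2[j] -= scale * g_w2[j]
            self.b1[j] -= scale * g_b1[j]
            for i in range(n_in):
                self.w1[j][i] -= scale * g_w1[j][i]
        self.b2 -= scale * g_b2


# XOR is a classic target that a model without a hidden layer cannot fit.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

net = TinyMLP(n_in=2, n_hidden=3, seed=0)
initial_loss = net.loss(xor)
for _ in range(2000):
    net.train_epoch(xor)
final_loss = net.loss(xor)
print(initial_loss, final_loss)
```

Removing the hidden layer from this sketch would leave a single sigmoid unit, which is logistic regression. How closely the network fits XOR depends on the random initialisation and the number of epochs, so the sketch only reports the loss before and after training.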
Support Vector Machines
This model finds a hyperplane that separates positive and negative examples in feature space.
SVC and NuSVC are scikit-learn classes capable of performing binary and multi-class classification on a dataset. The SVM decision function depends only on a subset of the training data, called the support vectors.
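The dependence on support vectors can be illustrated without any library. The sketch below does not use SVC or NuSVC and does not train anything: it takes a toy, linearly separable dataset and dual coefficients that were worked out by hand for it (an assumption of this example), then evaluates the linear decision function from the support vectors alone.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy training set: (features, label) with labels +1 / -1.
training = [
    ((1.0, 1.0), 1),
    ((3.0, 3.0), 1),
    ((2.0, 4.0), 1),
    ((-1.0, -1.0), -1),
    ((-3.0, -2.0), -1),
    ((-4.0, -4.0), -1),
]

# Dual coefficients derived by hand for this toy set, not produced by a
# solver. Only the two points closest to the boundary get a non-zero weight.
alphas = [0.25, 0.0, 0.0, 0.25, 0.0, 0.0]
bias = 0.0

# The support vectors are the training points with non-zero coefficients.
support_vectors = [(x, y, a) for (x, y), a in zip(training, alphas) if a > 0]


def decision_function(point):
    """Signed score: sum over support vectors of alpha * y * <x, point> + b."""
    return sum(a * y * dot(x, point) for x, y, a in support_vectors) + bias


def predict(point):
    return 1 if decision_function(point) >= 0 else -1


# Functional margin y * f(x) of every training point: it equals 1 for the
# support vectors and is larger for all other points.
margins = [y * decision_function(x) for x, y in training]
print(margins)
print(predict((2.0, 0.5)), predict((-0.5, -2.0)))
```

Because the four points with a zero coefficient never enter `decision_function`, removing them from the training set would leave every prediction unchanged, which is what it means for the decision function to depend only on the support vectors.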
The VANGA Sports model was developed to predict the results of tennis matches, but it can easily be adapted to similar sports such as table tennis, badminton, and squash.