Support Vector Regression

By Moloy De posted Tue January 24, 2023 05:40 AM

Support Vector Machines, or SVMs, are among the most popular and widely used algorithms for classification problems in machine learning. Their use in regression, however, is not as well documented. The algorithm accommodates non-linearity in the data and provides a proficient prediction model.

In machine learning, Support Vector Machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. In Support Vector Regression, the line (or, in higher dimensions, the surface) fitted to the data is referred to as the hyperplane.

The objective of a support vector machine algorithm is to find a hyperplane in an n-dimensional space that distinctly classifies the data points. The data points on either side of the hyperplane that are closest to the hyperplane are called Support Vectors. These influence the position and orientation of the hyperplane and thus help build the SVM.
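To make this concrete, here is a minimal sketch (my own illustration, assuming scikit-learn, which the post does not name) that fits a linear SVM classifier and inspects the support vectors that determine the hyperplane:

import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters
X = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 1.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# Only the points closest to the decision boundary become support vectors;
# they alone fix the position and orientation of the hyperplane w . x + b = 0.
print("Support vectors:\n", clf.support_vectors_)
print("w =", clf.coef_[0], " b =", clf.intercept_[0])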

Now that we have an intuition of what a support vector machine is, let us take a look at the key terms used in Support Vector Regression. Some of them are mentioned below:

1. Hyperplane:
The hyperplane is the decision boundary used to predict the continuous output. The data points on either side of the hyperplane that are closest to it are called support vectors; they are used to plot the line that shows the predicted output of the algorithm.

2. Kernel:
A kernel is a mathematical function that takes data as input and transforms it into the required form. Kernels are generally used for finding a hyperplane in a higher-dimensional space, as shown in the sketch after this list.

3. Boundary Lines:
These are the two lines drawn around the hyperplane at a distance of ε (epsilon). They create a margin, or tube, around the data points.
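Here is a minimal sketch (again assuming scikit-learn; the parameter values are only illustrative) showing how the kernel and the ε boundary are passed as hyperparameters to SVR:

import numpy as np
from sklearn.svm import SVR

# Noisy non-linear data: y = sin(x) plus a little noise
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)

# kernel: the transformation used to find a hyperplane in a higher-dimensional
# space; epsilon: half-width of the tube formed by the boundary lines;
# C: penalty for points falling outside that tube.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print("Support vectors used:", svr.support_vectors_.shape[0], "of", len(X))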

Support Vector Machine is a supervised learning algorithm normally used to predict discrete values, and Support Vector Regression uses the same principle. The basic idea behind SVR is to find the best-fit line; in SVR, the best-fit line is the hyperplane whose margin contains the maximum number of points.

Unlike other regression models that try to minimize the error between the real and predicted values, SVR tries to fit the best line within a threshold value, where the threshold is the distance ε between the hyperplane and the boundary lines. The fit time complexity of SVR is more than quadratic in the number of samples, which makes it hard to scale to datasets with more than a few tens of thousands of samples.
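The "fit within a threshold" idea is the ε-insensitive loss: deviations smaller than ε cost nothing. A short numeric sketch of that loss (my own illustration, not from the post):

import numpy as np

epsilon = 0.1
y_true = np.array([1.00, 2.00, 3.00, 4.00])
y_pred = np.array([1.05, 2.30, 2.95, 3.50])

# Deviations inside the epsilon tube are ignored; only the excess counts.
loss = np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)
print(loss)  # [0.  0.2 0.  0.4] -- only the two larger errors are penalized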

For large datasets, LinearSVR is used instead. LinearSVR provides a faster implementation than SVR but only considers the linear kernel. The model produced by Support Vector Regression depends only on a subset of the training data, because the cost function ignores samples whose predictions are close to their targets, as the sketch below confirms.
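A minimal sketch (assuming scikit-learn, with made-up data) contrasting the two estimators and confirming that the fitted SVR model depends on only a subset of the training samples:

import numpy as np
from sklearn.svm import SVR, LinearSVR

rng = np.random.RandomState(0)
X = rng.rand(1000, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.05 * rng.randn(1000)

# Kernelized SVR: flexible, but fit time grows faster than quadratically.
svr = SVR(kernel="rbf", epsilon=0.1).fit(X, y)
print("SVR kept", len(svr.support_), "of", len(X), "samples as support vectors")

# LinearSVR: linear kernel only, but a much faster implementation.
lin = LinearSVR(epsilon=0.1, max_iter=10000).fit(X, y)
print("LinearSVR coefficients:", lin.coef_)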

Although Support Vector Regression is rarely used, it carries certain advantages, as mentioned below:
1. It is robust to outliers.
2. The decision model can be easily updated.
3. It has excellent generalization capability, with high prediction accuracy.
4. Its implementation is easy.

Some of the drawbacks faced by Support Vector Machines while handling regression problems are as mentioned below:
1. They are not suitable for large datasets.
2. In cases where the number of features for each data point exceeds the number of training data samples, the SVM will underperform.
3. The decision model does not perform very well when the dataset has more noise, i.e., when the target classes are overlapping.

With that, we have reached the end of this article.


QUESTION I : What are the criteria to build the SVM Classification Hyperplane?
QUESTION II : What are the algorithms to build the SVM Regression Hyperplane?

REFERENCE : Unlocking the True Power of Support Vector Regression
