| Support-vector machine | |
|---|---|
| Name | Support-vector machine |
| Inventor | Vladimir Vapnik, Alexey Chervonenkis |
| Year | 1963 |
| Influenced by | Statistical learning theory |
| Influenced | Machine learning, Pattern recognition |
A support-vector machine (SVM) is a supervised learning model used for classification and regression analysis within the field of machine learning. Developed from the theoretical foundations of statistical learning theory, it constructs a hyperplane or set of hyperplanes in a high-dimensional space to perform tasks like classification. The model is renowned for its effectiveness in high-dimensional spaces and its versatility through the use of kernel functions.
The foundational concepts for the support-vector machine were developed by Vladimir Vapnik and Alexey Chervonenkis at the Institute of Control Sciences in the 1960s. The modern incarnation, including the kernel trick, was later popularized by Bernhard Boser, Isabelle Guyon, and Vapnik at AT&T Bell Laboratories. The core principle involves finding the optimal separating hyperplane that maximizes the margin between different classes in the training data, a concept rooted in Vapnik–Chervonenkis theory. This approach provides strong theoretical guarantees against overfitting, making it a robust tool in the arsenal of pattern recognition.
For linearly separable data, a linear SVM aims to find the hyperplane with the maximum margin. This margin is defined by the distance from the hyperplane to the nearest training data points of either class, known as support vectors. The optimization problem is typically formulated as a convex quadratic programming task; in its soft-margin form it is equivalent to minimizing a regularized hinge loss. The solution depends only on the support vectors, a property that contributes to the model's efficiency. The seminal work on this formulation is often associated with research published by Corinna Cortes and Vapnik.
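The margin-maximization objective above can be illustrated with a minimal sketch: subgradient descent on the regularized hinge loss for a linear classifier, on a hypothetical toy dataset. This is a didactic approximation of the quadratic-programming formulation, not a production solver; the data, learning rate, and regularization constant are all illustrative assumptions.

```python
import numpy as np

# Hypothetical linearly separable toy data; hinge loss expects labels in {+1, -1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [2.5, 3.5],
              [-2.0, -2.0], [-3.0, -1.0], [-2.5, -3.0]])
y = np.array([1, 1, 1, -1, -1, -1])

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean(max(0, 1 - y*(w.x + b)))
    by subgradient descent (a sketch of the soft-margin objective)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1  # points on or inside the margin drive the update
        grad_w = lam * w - (y[mask, None] * X[mask]).sum(axis=0) / n
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)
print(preds)
```

Note that only the points with `margins < 1` contribute to the gradient, mirroring the property that the final hyperplane depends solely on the support vectors.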
To handle data that is not linearly separable, SVMs employ the kernel trick. This method implicitly maps input vectors into a higher-dimensional feature space where a linear separation becomes possible. Common kernel functions include the polynomial kernel, the radial basis function kernel, and the sigmoid kernel. The use of kernels allows SVMs to create complex, nonlinear decision boundaries without explicitly performing the computationally expensive transformation. This innovation was crucial for applying SVMs to complex domains like bioinformatics and computer vision.
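The "implicit mapping" idea can be made concrete with a small check: for a degree-2 polynomial kernel on 2-D inputs, the kernel value computed in input space equals the inner product under an explicit 6-dimensional feature map. The vectors and constant below are arbitrary illustrative choices.

```python
import numpy as np

def poly_kernel(x, z, c=1.0):
    # Degree-2 polynomial kernel: k(x, z) = (x . z + c)^2
    return (x @ z + c) ** 2

def phi(x, c=1.0):
    # Explicit feature map whose inner product reproduces the kernel above.
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2 * c) * x1, np.sqrt(2 * c) * x2,
                     c])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
k_implicit = poly_kernel(x, z)   # computed in the 2-D input space
k_explicit = phi(x) @ phi(z)     # computed in the 6-D feature space
print(k_implicit, k_explicit)
```

For higher-degree kernels or the radial basis function kernel, the explicit feature space grows combinatorially or becomes infinite-dimensional, which is precisely why computing the kernel directly is the cheaper route.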
Several important extensions have been developed to enhance the basic SVM model. The soft-margin SVM was introduced by Cortes and Vapnik to handle noisy, non-separable data by introducing slack variables. For regression tasks, support-vector regression applies similar principles by fitting an ε-insensitive tube around the data. Other variants include the ν-SVM, which provides a different parameterization for controlling the margin and errors, and transductive SVMs for semi-supervised learning. The LIBSVM library, created by Chih-Chung Chang and Chih-Jen Lin, has been instrumental in popularizing these methods.
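The "tube" in support-vector regression corresponds to an ε-insensitive loss: prediction errors smaller than ε cost nothing, and larger errors grow linearly. A minimal sketch of that loss, with made-up target and prediction values:

```python
import numpy as np

def epsilon_insensitive(y_true, y_pred, eps=0.1):
    """SVR's epsilon-insensitive loss: errors inside the tube of
    half-width eps are ignored; outside it, the cost grows linearly."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 2.0])
losses = epsilon_insensitive(y_true, y_pred)
print(losses)
```

The first prediction sits inside the tube and incurs zero loss, while the other two are penalized only for the part of the error that exceeds ε; this is what keeps the SVR solution sparse in support vectors.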
Efficient training of SVMs, especially for large datasets, is a critical area of research. The sequential minimal optimization algorithm, developed by John Platt at Microsoft Research, breaks the large quadratic programming problem into smaller sub-problems. Other significant software implementations include LIBLINEAR for large-scale linear classification and the scikit-learn library in Python. These tools have made SVMs accessible for practical applications across various industries and research institutions like Stanford University.
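As a usage sketch of the scikit-learn implementation mentioned above, the classic XOR problem shows why kernels matter in practice: a linear SVM cannot fit it, while an RBF-kernel SVM can. This assumes scikit-learn is installed; the `C` and `gamma` values are illustrative, not recommendations.

```python
from sklearn.svm import SVC

# XOR: no single hyperplane separates these four points.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

linear = SVC(kernel="linear", C=10.0).fit(X, y)
rbf = SVC(kernel="rbf", C=10.0, gamma=2.0).fit(X, y)

linear_acc = linear.score(X, y)  # a linear boundary cannot fit XOR
rbf_acc = rbf.score(X, y)        # the RBF kernel separates it
print(linear_acc, rbf_acc)
```

Under the hood, scikit-learn's `SVC` delegates to LIBSVM, whose solver is a variant of the sequential minimal optimization decomposition described above.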
Support-vector machines have been successfully applied across a diverse range of fields. In bioinformatics, they are used for protein structure prediction and cancer classification from gene expression data. Within computer vision, SVMs are a core component for tasks such as handwriting recognition and object detection in systems developed by companies like Google. They have also seen significant use in natural language processing for text categorization and sentiment analysis, as well as in geostatistics for remote sensing data interpretation by agencies like NASA.
Category:Machine learning Category:Classification algorithms Category:Statistical classification