Vidal's library
Title: A Tutorial on Support Vector Machines for Pattern Recognition
Author: Christopher J. C. Burges
Journal: Data Mining and Knowledge Discovery
Volume: 2
Number: 2
Pages: 121--167
Year: 1998
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
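To make the concepts in the abstract concrete, here is a minimal sketch (not from the paper) of a soft-margin SVM with a Gaussian radial basis function kernel on toy two-class data, written with scikit-learn. The parameter C is the soft-margin penalty for non-separable data and the RBF kernel supplies the nonlinear mapping discussed in the tutorial; the dataset, parameter values, and variable names below are illustrative assumptions, not taken from the paper.

# Illustrative sketch only: soft-margin SVM with a Gaussian (RBF) kernel,
# echoing the abstract's themes (margin, support vectors, kernel mapping).
# Uses scikit-learn and synthetic data; nothing here comes from the paper.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy non-separable two-class data (a linear SVM would fit this poorly).
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# C is the soft-margin penalty; gamma sets the width of the Gaussian
# kernel k(x, x') = exp(-gamma * ||x - x'||^2).
clf = SVC(kernel="rbf", C=1.0, gamma=1.0)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
# The decision function is a kernel expansion over the support vectors only.
print("number of support vectors:", clf.support_vectors_.shape[0])

For the separable linear case treated first in the tutorial, the same sketch applies with kernel="linear"; only the kernel choice changes.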

Cited by 2342 (Google Scholar)

@article{burges98a,
  author =       {Christopher J. C. Burges},
  title =        {A Tutorial on Support Vector Machines for Pattern
                  Recognition},
  journal =      {Data Mining and Knowledge Discovery},
  volume =       {2},
  number =       {2},
  pages =        {121--167},
  year =         {1998},
  abstract =	 {The tutorial starts with an overview of the concepts
                  of VC dimension and structural risk minimization. We
                  then describe linear Support Vector Machines (SVMs)
                  for separable and non-separable data, working
                  through a non-trivial example in detail. We describe
                  a mechanical analogy, and discuss when SVM solutions
                  are unique and when they are global. We describe how
                  support vector training can be practically
                  implemented, and discuss in detail the kernel
                  mapping technique which is used to construct SVM
                  solutions which are nonlinear in the data. We show
                  how Support Vector machines can have very large
                  (even infinite) VC dimension by computing the VC
                  dimension for homogeneous polynomial and Gaussian
                  radial basis function kernels. While very high VC
                  dimension would normally bode ill for generalization
                  performance, and while at present there exists no
                  theory which shows that good generalization
                  performance is guaranteed for SVMs, there are
                  several arguments which support the observed high
                  accuracy of SVMs, which we review. Results of some
                  experiments which were inspired by these arguments
                  are also presented. We give numerous examples and
                  proofs of most of the key theorems. There is new
                  material, and I hope that the reader will find that
                  even old material is cast in a fresh light.},
  keywords =     {learning survey},
  url =          {http://jmvidal.cse.sc.edu/library/burges98a.pdf},
  citeseer =     {burges98tutorial.html},
  googleid =     {ib5C_Mf4RUgJ:scholar.google.com/},
  cluster =      {5207842081938259593}
}
Last modified: Wed Mar 9 10:14:31 EST 2011