# Support Vector Machine Training Procedure

This procedure trains a Support Vector Machine (SVM) model and saves the model file to disk. It is a wrapper around the popular open-source LIBSVM library. For more information about LIBSVM, see http://www.csie.ntu.edu.tw/~cjlin/libsvm/

## Configuration

A new procedure of type svm.train named <id> can be created as follows:

    mldb.put("/v1/procedures/"+<id>, {
        "type": "svm.train",
        "params": {
            "trainingData": <InputQuery>,
            "modelFileUrl": <Url>,
            "configuration": <JSON>,
            "functionName": <string>,
            "svmType": <SVMType>,
            "runOnCreation": <bool>
        }
    })

with the following key-value definitions for params:

| Field | Type | Default | Description |
|---|---|---|---|
| trainingData | InputQuery | | Specification of the data for input to the SVM procedure. This should be organized as an embedding, with each selected row containing the same set of columns with numeric values to be used as coordinates. The select statement does not support groupby and having clauses. |
| modelFileUrl | Url | | URL where the model file (with extension '.svm') should be saved. This file can be loaded by a function of type 'svm'. |
| configuration | JSON | | Configuration object to use for the SVM procedure. The available fields are listed under "Configuration Contents". If none is passed, the configuration will be loaded from the ConfigurationFile parameter. |
| functionName | string | | If specified, an SVM function of this name will be created using the trained SVM. |
| svmType | SVMType | "classification" | Type of SVM to train. The accepted values are listed under "Type of SVM". |
| runOnCreation | bool | true | If true, the procedure will be run immediately. The response will contain an extra field called firstRun pointing to the URL of the run. |
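
As an illustration, the template above could be filled in as follows. This is a minimal sketch, not MLDB code: the helper `build_svm_train_config`, the procedure id `my_svm`, the dataset name, the query string, the file URL and the function name are all hypothetical placeholders. The `mldb.put` call is left as a comment because the `mldb` object only exists inside an MLDB environment.

```python
import json


def build_svm_train_config(training_data, model_file_url,
                           svm_type="classification", configuration=None,
                           function_name=None, run_on_creation=True):
    """Assemble the JSON body for a procedure of type svm.train."""
    params = {
        "trainingData": training_data,
        "modelFileUrl": model_file_url,
        "svmType": svm_type,
        "runOnCreation": run_on_creation,
    }
    # Optional fields are only included when the caller provides them.
    if configuration is not None:
        params["configuration"] = configuration
    if function_name is not None:
        params["functionName"] = function_name
    return {"type": "svm.train", "params": params}


config = build_svm_train_config(
    training_data="SELECT * FROM my_dataset",    # hypothetical query
    model_file_url="file://models/my_svm.svm",    # hypothetical location
    configuration={"kernel": "rbf", "C": 1},
    function_name="my_svm_function",              # hypothetical name
)
print(json.dumps(config, indent=2))

# Inside MLDB, the procedure would then be created with:
# mldb.put("/v1/procedures/my_svm", config)
```

Leaving out the optional fields keeps the request close to the defaults listed in the table.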

### Type of SVM

There are 5 types of SVM that can be trained:

• classification will train a regular SVM for multi-class classification
• nu-classification will train the nu version of multi-class SVM classification
• one class will train a one-class SVM that will evaluate how similar a vector is to the training input
• regression will train an SVM for regression
• nu-regression will train the nu version of SVM for regression

In the nu version of the SVM, the nu parameter is used to control the number of support vectors.

You can choose the type of SVM with the svmType parameter of the training procedure.
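
A small lookup can catch a misspelled svmType before the procedure is submitted. The sketch below pairs the five values listed above with the LIBSVM formulation each one is assumed to correspond to. That pairing is inferred from LIBSVM's own list of SVM types and is not stated on this page, and `check_svm_type` is a hypothetical helper.

```python
# svmType value -> assumed LIBSVM formulation (an inference, see lead-in).
SVM_TYPES = {
    "classification": "C-SVC",
    "nu-classification": "nu-SVC",
    "one class": "one-class SVM",
    "regression": "epsilon-SVR",
    "nu-regression": "nu-SVR",
}


def check_svm_type(svm_type):
    """Return svm_type unchanged, or raise if it is not a listed value."""
    if svm_type not in SVM_TYPES:
        allowed = ", ".join(sorted(SVM_TYPES))
        raise ValueError(f"unknown svmType {svm_type!r}; expected one of: {allowed}")
    return svm_type


print(check_svm_type("nu-regression"), "->", SVM_TYPES["nu-regression"])
```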

### Label

You must set the label parameter of the training procedure to specify which column in the input is to be used as the label for classification, or as the target value for regression. All other columns will be used as the feature vector.
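
To make the split concrete, the sketch below mimics it on a single row held in a plain Python dict. It only illustrates the rule stated above and is not MLDB code; the column names and the helper `split_row` are hypothetical.

```python
def split_row(row, label):
    """Separate the label column from the remaining feature columns."""
    features = {name: value for name, value in row.items() if name != label}
    return row[label], features


row = {"x1": 0.5, "x2": -1.2, "y": 1}   # hypothetical training row
target, features = split_row(row, label="y")
print(target, features)
```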

### Configuration Contents

Here are the fields that you can specify in configuration:

• kernel specifies the type of SVM kernel to be used (see below). Default value is 'rbf'.
• degree specifies the degree of the polynomial for polynomial kernels. Default value is 3.
• coef0 specifies the constant term added inside polynomial and sigmoid kernels. Default value is 0.
• eps specifies the tolerance used as the stopping criterion for SVM training. Default value is 1e-3.
• C specifies the cost parameter C, which penalizes training errors. Default value is 1.
• gamma specifies the gamma parameter for polynomial, RBF and sigmoid kernels. Default value is 1 divided by the number of features.
• nu specifies the nu parameter for the nu versions of the SVM and for one class SVM. Default value is 0.5.
• p specifies the p parameter (the epsilon of the loss function) for SVM regression. Default value is 0.1.
• shrinking specifies whether to use shrinking heuristics (1 to enable, 0 to disable). Default value is 1.
• probability specifies whether to compute probability estimates (1 to enable, 0 to disable). Default value is 0.
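
The defaults above can be collected in one place to see which values a partial configuration ends up with. The following is a minimal sketch that assumes missing fields simply fall back to the listed defaults; `resolve_configuration` is a hypothetical helper, and MLDB applies its own defaults internally.

```python
# Defaults as listed above; gamma depends on the data, so it is computed later.
DEFAULTS = {
    "kernel": "rbf",
    "degree": 3,
    "coef0": 0,
    "eps": 1e-3,
    "C": 1,
    "nu": 0.5,
    "p": 0.1,
    "shrinking": 1,
    "probability": 0,
}


def resolve_configuration(user_config, num_features):
    """Merge a partial configuration with the documented defaults."""
    unknown = set(user_config) - set(DEFAULTS) - {"gamma"}
    if unknown:
        raise ValueError(f"unknown configuration fields: {sorted(unknown)}")
    resolved = dict(DEFAULTS)
    resolved["gamma"] = 1.0 / num_features   # default: 1 / number of features
    resolved.update(user_config)             # user values take precedence
    return resolved


resolved = resolve_configuration({"kernel": "poly", "degree": 2}, num_features=4)
print(resolved)
```

Rejecting unknown field names early makes a typo such as `gama` visible before training starts.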

#### Kernels

The following types of kernels are supported, where x and y are feature vectors:

• linear for a Linear kernel: x dot y
• poly for a Polynomial kernel: (gamma * (x dot y) + coef0)^degree
• rbf for a radial basis function (RBF) kernel: e^(-gamma * ((x dot x) + (y dot y) - 2 * (x dot y))), which is the same as e^(-gamma * |x - y|^2). This is the default kernel.
• sigmoid for a sigmoidal kernel: tanh(gamma * (x dot y) + coef0)
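
The four formulas can be written out directly in plain Python to check values by hand. This is an illustrative sketch of the formulas as stated above, not the LIBSVM implementation; the function names and sample vectors are chosen for the example.

```python
import math


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear(x, y):
    return dot(x, y)


def poly(x, y, gamma, coef0=0.0, degree=3):
    return (gamma * dot(x, y) + coef0) ** degree


def rbf(x, y, gamma):
    # x.x + y.y - 2 x.y is the squared Euclidean distance between x and y.
    return math.exp(-gamma * (dot(x, x) + dot(y, y) - 2 * dot(x, y)))


def sigmoid(x, y, gamma, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


x = [1.0, 2.0]
y = [3.0, 4.0]
gamma = 1.0 / len(x)   # default gamma: 1 / number of features

print(linear(x, y), poly(x, y, gamma), rbf(x, y, gamma), sigmoid(x, y, gamma))
```

The RBF kernel of a vector with itself is always 1, which is a quick sanity check for any implementation.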