Support Vector Machine Training Procedure

This procedure trains a Support Vector Machine (SVM) model and stores the model file to disk. It is a wrapper around the popular open-source LIBSVM library. For more information about LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Configuration

A new procedure of type svm.train named <id> can be created as follows:

mldb.put("/v1/procedures/"+<id>, {
    "type": "svm.train",
    "params": {
        "trainingData": <InputQuery>,
        "modelFileUrl": <Url>,
        "configuration": <JSON>,
        "functionName": <string>,
        "svmType": <SVMType>,
        "runOnCreation": <bool>
    }
})

with the following key-value definitions for params:

Field, Type, Default, Description

trainingData
InputQuery

Specification of the data for input to the SVM Procedure. This should be organized as an embedding, with each selected row containing the same set of columns with numeric values to be used as coordinates. The select statement does not support groupby and having clauses.

modelFileUrl
Url

URL where the model file (with extension '.svm') should be saved. This file can be loaded by a function of type 'svm'.

configuration
JSON

Configuration object to use for the SVM procedure. Each one has its own parameters. If none is passed, the configuration is loaded from the ConfigurationFile parameter.

functionName
string

If specified, an SVM function of this name will be created using the trained SVM.

svmType
SVMType
"classification"

The type of SVM to train; see the Type of SVM section.

runOnCreation
bool
true

If true, the procedure will be run immediately. The response will contain an extra field called firstRun pointing to the URL of the run.
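
As an illustration of the fields above, the following sketch assembles a request body for this procedure in Python. Only the keys inside params come from the table above; the dataset name, file path, function name and the helper build_svm_train_payload are hypothetical, and the sketch assumes that trainingData can be given as a SQL string. Nothing is sent to a server here: the resulting dictionary is what would be passed as the second argument to mldb.put.

```python
import json


def build_svm_train_payload(training_data, model_file_url,
                            function_name=None, svm_type="classification",
                            configuration=None, run_on_creation=True):
    """Assemble the body for a procedure of type svm.train (hypothetical helper)."""
    params = {
        "trainingData": training_data,
        "modelFileUrl": model_file_url,
        "svmType": svm_type,
        "runOnCreation": run_on_creation,
    }
    # Optional fields are only included when the caller supplies them.
    if function_name is not None:
        params["functionName"] = function_name
    if configuration is not None:
        params["configuration"] = configuration
    return {"type": "svm.train", "params": params}


# Hypothetical values, for illustration only.
payload = build_svm_train_payload(
    training_data="select * from my_dataset",
    model_file_url="file://models/my_model.svm",
    function_name="my_svm",
)
print(json.dumps(payload, indent=4))
```

Leaving functionName and configuration out of the dictionary when they are not supplied keeps the request close to the defaults listed in the table.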

Type of SVM

There are 5 types of SVM that can be trained:

In the nu version of the SVM, the nu parameter is used to control the number of support vectors.

Select the type of SVM with the svmType parameter of the training procedure.

Label

You must set the label parameter of the training procedure to specify which input column is used as the label for classification, or as the target value for regression. All other columns are used as the feature vector.
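
The split between label and features can be pictured with a small Python sketch. This is a conceptual illustration only, not the procedure's implementation: the helper split_label, the column names and the choice to order features by column name are all assumptions made for the example.

```python
def split_label(row, label_column="label"):
    """Separate one input row into (feature_vector, label).

    Hypothetical helper: every column except the label column becomes
    a coordinate of the feature vector.
    """
    if label_column not in row:
        raise KeyError("row has no column named %r" % label_column)
    # Sort feature names so that every row yields coordinates in the same order.
    feature_names = sorted(name for name in row if name != label_column)
    features = [float(row[name]) for name in feature_names]
    return features, row[label_column]


# Hypothetical row with two numeric columns and a label.
row = {"x": 1.0, "y": -2.5, "label": 1}
features, label = split_label(row)
print(features, label)
```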

Configuration Contents

Here are the fields that you can specify in configuration:

Kernels

The following types of kernels are supported, when applying feature vectors x and y:
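
For orientation, the four kernels that LIBSVM itself defines (linear, polynomial, radial basis function and sigmoid) can be sketched in plain Python as follows. This assumes the procedure exposes LIBSVM's kernels unchanged; the function names are hypothetical, and the parameter names gamma, coef0 and degree follow LIBSVM's conventions rather than confirmed field names of configuration.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=0.0, degree=3):
    # K(x, y) = (gamma * x . y + coef0) ** degree
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # K(x, y) = tanh(gamma * x . y + coef0)
    return math.tanh(gamma * dot(x, y) + coef0)


x, y = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(x, y))
print(polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=2))
print(rbf_kernel(x, x))
print(sigmoid_kernel([0.0, 0.0], y))
```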

See also