This procedure trains a Support Vector Machine (SVM) model and stores the model file to disk. It is a wrapper around the popular open-source LIBSVM library. For more information about LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
A new procedure of type svm.train named <id> can be created as follows:
    mldb.put("/v1/procedures/"+<id>, {
        "type": "svm.train",
        "params": {
            "trainingData": <InputQuery>,
            "modelFileUrl": <Url>,
            "configuration": <JSON>,
            "functionName": <string>,
            "svmType": <SVMType>,
            "runOnCreation": <bool>
        }
    })
with the following key-value definitions for params:
Field, Type, Default | Description |
---|---|
trainingData | Specification of the data for input to the SVM Procedure. This should be organized as an embedding, with each selected row containing the same set of columns with numeric values to be used as coordinates. The select statement does not support groupby and having clauses. |
modelFileUrl | URL where the model file (with extension '.svm') should be saved. This file can be loaded by a function of type 'svm'. |
configuration | Configuration object to use for the SVM Procedure. Each one has its own parameters. If none is passed, then the configuration will be loaded from the ConfigurationFile parameter. |
functionName | If specified, an SVM function of this name will be created using the trained SVM. |
svmType | Type of SVM to train. Must be one of the five SVM types listed below. |
runOnCreation | If true, the procedure will be run immediately. The response will contain an extra field called |
There are 5 types of SVM that can be trained:
- classification: will train a regular SVM for multi-class classification
- nu-classification: will train the nu version of multi-class SVM classification
- one class: will train a one-class SVM that will evaluate how alike a vector is to the training input
- regression: will train an SVM for regression
- nu-regression: will train the nu version of SVM for regression

In the nu version of the SVM, the nu parameter is used to control the number of support vectors.
You can choose the type of SVM in the svmType parameter of the training procedure.
You must set the label parameter of the training procedure to specify which column in the input is to be used as the label for classification, or as the regression value. All other columns will be used as the feature vector.
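As a rough illustration of how the parameters described here fit together, the sketch below builds the JSON body for an svm.train procedure in plain Python. The helper `svm_train_config`, the dataset name `points`, its columns `x`, `y` and `label`, the model path and the function name are all hypothetical and chosen only for the example. In particular, marking the label by naming a column `label` in the query is an assumption, not something this page confirms.

```python
import json

# SVM types named in this document.
SVM_TYPES = ("classification", "nu-classification", "one class",
             "regression", "nu-regression")


def svm_train_config(training_data, model_file_url, svm_type,
                     configuration=None, function_name=None,
                     run_on_creation=True):
    """Build the JSON body for an svm.train procedure (hypothetical helper)."""
    if svm_type not in SVM_TYPES:
        raise ValueError("unknown svmType: %r" % (svm_type,))
    params = {
        "trainingData": training_data,
        "modelFileUrl": model_file_url,
        "svmType": svm_type,
        "runOnCreation": run_on_creation,
    }
    # Optional fields are only sent when the caller provides them.
    if configuration is not None:
        params["configuration"] = configuration
    if function_name is not None:
        params["functionName"] = function_name
    return {"type": "svm.train", "params": params}


config = svm_train_config(
    # Assumed layout: feature columns x and y, plus a column named label.
    training_data="SELECT x, y, label FROM points",
    model_file_url="file://models/example.svm",  # hypothetical path
    svm_type="nu-classification",
    configuration={"kernel": "rbf", "nu": 0.5},
    function_name="example_svm",  # hypothetical function name
)
print(json.dumps(config, indent=2))

# With an MLDB connection available as `mldb`, the body would be sent with:
# mldb.put("/v1/procedures/example_svm_train", config)
```

Checking svmType on the client side is a design choice of this sketch; it only catches misspelled type names before the request is sent.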
Here are the fields that you can specify in configuration:
- kernel: specifies the type of SVM kernel to be used (see below). Default value is 'rbf'.
- degree: specifies the degree of the polynomial for polynomial kernels. Default value is 3.
- coef0: specifies the coef0 coefficient for polynomial and sigmoid kernels. Default value is 0.
- eps: specifies the stopping criterion for SVM training. Default value is 1e-3.
- C: specifies the C parameter for various kernels. Default value is 1.
- gamma: specifies the gamma parameter for various kernels. Default value is 1 divided by the number of features.
- nu: specifies the nu parameter for nu and one-class SVM. Default value is 0.5.
- p: specifies the p parameter for SVM regression. Default value is 0.1.
- shrinking: specifies whether to use shrinking heuristics. Default is 1.
- probability: specifies whether to perform probability estimates. Default is 0.

The following types of kernels are supported, when applied to feature vectors x and y:
- linear: for a linear kernel: x dot y
- poly: for a polynomial kernel: (gamma * (x dot y) + coef0)^degree
- rbf: for a radial basis function (RBF) kernel: e^(-gamma * (x^2 + y^2 - 2 * (x dot y))). This is the default kernel.
- sigmoid: for a sigmoidal kernel: tanh(gamma * (x dot y) + coef0)

- The classifier.test procedure type allows the accuracy of a predictor to be tested against held-out data.
- The svm function type applies a trained SVM to a feature vector, producing a classification score.
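To make the kernel formulas listed above concrete, here is a minimal pure-Python sketch that evaluates each of them on two small vectors. It only illustrates the arithmetic and is not LIBSVM's implementation. The function names and sample vectors are chosen for the example, and gamma is set to 1 divided by the number of features to match the documented default.

```python
import math


def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def poly_kernel(x, y, gamma, coef0=0.0, degree=3):
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma):
    # x^2 + y^2 - 2 (x dot y) is the squared Euclidean distance |x - y|^2.
    return math.exp(-gamma * (dot(x, x) + dot(y, y) - 2 * dot(x, y)))


def sigmoid_kernel(x, y, gamma, coef0=0.0):
    return math.tanh(gamma * dot(x, y) + coef0)


x = [1.0, 2.0]
y = [3.0, 4.0]
gamma = 1.0 / len(x)  # documented default: 1 / number of features

print("linear :", linear_kernel(x, y))
print("poly   :", poly_kernel(x, y, gamma))
print("rbf    :", rbf_kernel(x, y, gamma))
print("sigmoid:", sigmoid_kernel(x, y, gamma))
```

For these vectors x dot y is 11 and the squared distance is 8, so the RBF value is e^(-4).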