Functions of this type embed a row of a dataset into an SVD space, producing a singular vector, using a model previously trained by an svd.train procedure type.
A new function of type svd.embedRow named <id> can be created as follows:
mldb.put("/v1/functions/"+<id>, {
    "type": "svd.embedRow",
    "params": {
        "modelFileUrl": <Url>,
        "maxSingularValues": <int>,
        "acceptUnknownValues": <bool>
    }
})
with the following key-value definitions for params:
Field, Type, Default | Description |
---|---|
modelFileUrl | URL of the model file (with extension '.svd') to load. This file is created by the |
maxSingularValues | Maximum number of singular values to use (-1 = all) |
acceptUnknownValues | This parameter (which defaults to false) tells us whether or not unknown values should be accepted by the SVD. An unknown value occurs when a column that was always a number in training is presented with a string value, or vice versa, or when a string valued column is presented with a value unknown in training. If its value is true, an unknown value will be silently ignored. If its value is false, an unknown value will return an error when the function is applied. |
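
As a rough illustration of the parameters above, the following sketch builds the JSON body for such a function in Python. The helper name `embed_row_function_config`, the model URL and the function name `my_embedder` are hypothetical, and passing -1 for `maxSingularValues` is an explicit choice here rather than a documented default. The final `mldb.put` call is left as a comment because it only works where an `mldb` object is available.

```python
import json


def embed_row_function_config(model_file_url, max_singular_values=-1,
                              accept_unknown_values=False):
    """Build the JSON body for creating an svd.embedRow function.

    Hypothetical helper, not part of MLDB itself.
    """
    return {
        "type": "svd.embedRow",
        "params": {
            "modelFileUrl": model_file_url,
            # -1 means "use all singular values"
            "maxSingularValues": max_singular_values,
            # False means an unknown value raises an error on application
            "acceptUnknownValues": accept_unknown_values,
        },
    }


# Hypothetical model location, for illustration only.
config = embed_row_function_config("file://models/example.svd")
print(json.dumps(config, indent=2))

# Where an `mldb` object is available, the function could then be created with:
# mldb.put("/v1/functions/" + "my_embedder", config)
```

Keeping the configuration in a plain dictionary makes it easy to inspect or log before it is sent.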
Functions of this type have a single input value called row, which is a row. The columns that are expected in this row depend on the features that were trained into the SVD model. For example, if in the training the input value was "select": "x,y", then the function will expect two columns called x and y.
These functions have a single output value called embedding, which contains a row. This row will contain columns with names prefixed with the outputColumn parameter of the svd.train procedure type that trained the model, followed by a 4-digit number for each of the singular values. By default, the outputColumn parameter is svd, so the columns of the output row will be svd0001, svd0002, etc.
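
The naming scheme described above can be reproduced with a short Python sketch. The helper `embedding_column_names` is hypothetical, and starting the numbering at 1 is inferred from the svd0001 example rather than stated explicitly.

```python
def embedding_column_names(num_singular_values, output_column="svd"):
    """Return the output column names: the prefix followed by a
    zero-padded 4-digit index, starting at 1 (assumed from svd0001)."""
    return [f"{output_column}{i:04d}"
            for i in range(1, num_singular_values + 1)]


names = embedding_column_names(3)
print(names)
```

Such a list is handy for selecting or validating the embedding columns in downstream code.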
When an SVD procedure is trained, it infers the type of each input value and performs feature extraction based on the types seen in training. If the type of an input value passed into the function doesn't match the type seen in training, then the SVD may not give sensible outputs. This happens when a column that was always a number in training is presented with a string value, or vice versa, or when a string-valued column is presented with a value that was not seen in training.
The acceptUnknownValues parameter controls what happens in this situation. If the value of that parameter is true, then a column with an unknown value will be ignored completely. If the value of the parameter is false, then the application of the function will return an error.
The main use of that parameter is to catch errors in the development phase, for example accidentally encoding a parameter as a string when it should be an int, or mixing up column names.
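
The behaviour of acceptUnknownValues can be mimicked in plain Python to make the two modes concrete. This is only a sketch of the semantics described above, not MLDB's implementation: the function `filter_row`, the `trained` description format and the error message are all hypothetical, and treating a column name never seen in training as unknown is an assumption of this sketch.

```python
def filter_row(row, trained, accept_unknown_values=False):
    """Keep the columns whose values match what was seen in training.

    `trained` maps a column name to "number" or to the set of string
    values seen in training (a hypothetical description format).
    """
    kept = {}
    for name, value in row.items():
        spec = trained.get(name)
        is_number = (isinstance(value, (int, float))
                     and not isinstance(value, bool))
        if spec == "number":
            known = is_number
        elif isinstance(spec, set):
            known = isinstance(value, str) and value in spec
        else:
            known = False  # assumption: untrained column counts as unknown
        if known:
            kept[name] = value
        elif not accept_unknown_values:
            raise ValueError(
                f"unknown value {value!r} for column {name!r}")
        # otherwise the column is silently ignored
    return kept


trained = {"x": "number", "color": {"red", "blue"}}
row = {"x": 1.5, "color": "green"}  # "green" was never seen in training

# Lenient mode: the unknown column is dropped.
lenient = filter_row(row, trained, accept_unknown_values=True)
print(lenient)

# Strict mode: the same row raises an error.
try:
    filter_row(row, trained, accept_unknown_values=False)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)
```

Running in strict mode during development surfaces encoding mistakes early, while lenient mode trades that safety for robustness to unexpected input.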
The svd.train procedure type trains an SVD.