A procedure configuration object is used to create or load a procedure.
It is a JSON object that looks like this:

    {
      "id": <id>,
      "type": <type>,
      "params": {
        <params>
      }
    }

- `id` is a string that defines the URL at which the procedure will be available via the REST API
- `type` is a string that specifies the procedure's type (see below)
- `params` is an object that configures the procedure, and whose contents vary according to the type

Not all three of these fields are required in all contexts:
- At least one of `id` and `type` must be specified.
- If only `id` is specified, MLDB will assume this is a pre-existing procedure and will try to load it (an error will ensue if it doesn't already exist).
- If `type` is specified, MLDB will assume that the procedure doesn't exist yet and will try to create it (an error will ensue if it already exists):
    - if `type` is specified without `id`, an id will be auto-generated
    - if `type` is specified with `id`, the procedure will be created with the specified id unless a procedure already exists with that id
- If `type` is specified, then a corresponding `params` object must be specified if the type requires it.

The following types of procedures are available:

Type | Description | Doc |
---|---|---|
classifier.experiment | Train and test a classifier | [doc] |
classifier.test | Calculate the accuracy of a classifier on held-out data | [doc] |
classifier.train | Train a supervised classifier | [doc] |
export.csv | Exports a dataset to a target location as a CSV | [doc] |
import.git | Import a Git repository's metadata into MLDB | [doc] |
import.json | Import a text file with one JSON per line into MLDB | [doc] |
import.sentiwordnet | Import a SentiWordNet file into MLDB | [doc] |
import.text | Import from a text file, line by line | [doc] |
import.word2vec | Import a word2vec file into MLDB | [doc] |
kmeans.train | Simple clustering algorithm based on cluster centroids in embedding space | [doc] |
melt | Performs a melt operation on a dataset | [doc] |
mongodb.import | Import a dataset from MongoDB | [doc] |
permuter.run | Run a child procedure with permutations of its configuration | [doc] |
probabilizer.train | Trains a model to calibrate a score into a probability | [doc] |
randomforest.binary.train | Train a supervised binary random forest | [doc] |
statsTable.bagOfWords.train | Create statistical tables of trials against outcomes for bag of words | [doc] |
statsTable.train | Create statistical tables of trials against outcomes | [doc] |
summary.statistics | Creates a dataset with summary statistics for each column of an input dataset | [doc] |
svd.train | Train an SVD to convert rows or columns to embedding coordinates | [doc] |
tfidf.train | Prepare data for a TF-IDF function | [doc] |
transform | Apply an SQL expression over a dataset to transform into another dataset | [doc] |
tsne.train | Project a high dimensional space into a low-dimensional space suitable for visualization | [doc] |
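
Returning to the rules for `id`, `type`, and `params` given before the table, the following Python sketch shows one way to express them in code. It is not MLDB's implementation. The helper names `build_config` and `resolve_action`, the action labels, and the `existing_ids` set are hypothetical and exist only for illustration. The sketch also assumes that the first rule means at least one of `id` and `type` must be present. It does not check `params`, since whether `params` is required depends on the procedure type.

```python
import json
from typing import Any, Dict, Optional, Set


def build_config(proc_id: Optional[str] = None,
                 proc_type: Optional[str] = None,
                 params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a procedure configuration, leaving out unspecified fields.

    Hypothetical helper: it only assembles the JSON object described above.
    """
    if proc_id is None and proc_type is None:
        # Assumed reading of the first rule: at least one of id/type is needed.
        raise ValueError("at least one of 'id' and 'type' must be specified")
    config: Dict[str, Any] = {}
    if proc_id is not None:
        config["id"] = proc_id
    if proc_type is not None:
        config["type"] = proc_type
    if params is not None:
        config["params"] = params
    return config


def resolve_action(config: Dict[str, Any], existing_ids: Set[str]) -> str:
    """Return the action the rules imply for this configuration.

    existing_ids stands in for the ids of procedures that already exist.
    """
    proc_id = config.get("id")
    proc_type = config.get("type")
    if proc_id is None and proc_type is None:
        raise ValueError("at least one of 'id' and 'type' must be specified")
    if proc_type is None:
        # Only id given: the procedure is expected to exist and is loaded.
        if proc_id not in existing_ids:
            raise KeyError(f"procedure {proc_id!r} does not exist")
        return "load"
    if proc_id is None:
        # type without id: create, and an id is auto-generated.
        return "create-with-generated-id"
    if proc_id in existing_ids:
        # type with an id that is already taken: creation fails.
        raise ValueError(f"procedure {proc_id!r} already exists")
    return "create"


if __name__ == "__main__":
    existing = {"my_kmeans"}
    cfg = build_config(proc_id="my_transform", proc_type="transform", params={})
    print(json.dumps(cfg, indent=2))
    print(resolve_action(cfg, existing))
```

Keeping the decision logic in a pure function, separate from any HTTP call, makes the three cases (load, create with a given id, create with a generated id) easy to check before a request is sent.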