A procedure configuration object is used to create or load a procedure.
It is a JSON object that looks like this:

    {
        "id": <id>,
        "type": <type>,
        "params": {
            <params>
        }
    }

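
As a concrete illustration of the template above, the following minimal sketch builds a configuration in Python and serializes it to JSON. The helper name `make_procedure_config`, the id `my_transform`, and the key inside `params` are hypothetical placeholders and are not taken from MLDB's documentation. Only the three top-level field names and the `transform` type name come from this page.

```python
import json


def make_procedure_config(proc_id=None, proc_type=None, params=None):
    """Build a procedure configuration dict, leaving out unset fields.

    Hypothetical helper for illustration; it is not part of MLDB.
    """
    config = {}
    if proc_id is not None:
        config["id"] = proc_id
    if proc_type is not None:
        config["type"] = proc_type
    if params is not None:
        config["params"] = params
    return config


# "transform" is one of the procedure types listed on this page; the id and
# the contents of params are made-up placeholders.
config = make_procedure_config(
    proc_id="my_transform",
    proc_type="transform",
    params={"placeholderOption": "placeholder value"},
)

# Serialize to the JSON form that the template describes.
print(json.dumps(config, indent=4))
```

Leaving unset fields out of the dict, rather than sending them as `null`, keeps the object in line with the statement that not every field is required in every context.
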
- `id` is a string that defines the URL at which the procedure will be available via the REST API
- `type` is a string that specifies the procedure's type (see below)
- `params` is an object that configures the procedure, and whose contents vary according to the type

Not all three of these fields are required in all contexts:

- At least one of `id` and `type` must be specified
    - If only `id` is specified, MLDB will assume this is a pre-existing procedure and will try to load it (an error will ensue if it doesn't already exist)
    - If `type` is specified, MLDB will assume that the procedure doesn't exist yet and will try to create it (an error will ensue if it already exists)
        - If `type` is specified without `id`, an `id` will be auto-generated
        - If `type` is specified with `id`, the procedure will be created with the specified `id` unless a procedure already exists with that `id`
        - If `type` is specified, then a corresponding `params` object must be specified if the type requires it

The following types of procedures are available:

| Type | Description | Doc |
|---|---|---|
| classifier.experiment | Train and test a classifier | [doc] |
| classifier.test | Calculate the accuracy of a classifier on held-out data | [doc] |
| classifier.train | Train a supervised classifier | [doc] |
| export.csv | Export a dataset to a target location as a CSV | [doc] |
| import.git | Import a Git repository's metadata into MLDB | [doc] |
| import.json | Import a text file with one JSON object per line into MLDB | [doc] |
| import.sentiwordnet | Import a SentiWordNet file into MLDB | [doc] |
| import.text | Import from a text file, line by line | [doc] |
| import.word2vec | Import a word2vec file into MLDB | [doc] |
| kmeans.train | Simple clustering algorithm based on cluster centroids in embedding space | [doc] |
| melt | Perform a melt operation on a dataset | [doc] |
| mongodb.import | Import a dataset from MongoDB | [doc] |
| permuter.run | Run a child procedure with permutations of its configuration | [doc] |
| probabilizer.train | Train a model to calibrate a score into a probability | [doc] |
| randomforest.binary.train | Train a supervised binary random forest | [doc] |
| statsTable.bagOfWords.train | Create statistical tables of trials against outcomes for bag of words | [doc] |
| statsTable.train | Create statistical tables of trials against outcomes | [doc] |
| summary.statistics | Create a dataset with summary statistics for each column of an input dataset | [doc] |
| svd.train | Train an SVD to convert rows or columns to embedding coordinates | [doc] |
| tfidf.train | Prepare data for a TF-IDF function | [doc] |
| transform | Apply an SQL expression over a dataset to transform it into another dataset | [doc] |
| tsne.train | Project a high-dimensional space into a low-dimensional space suitable for visualization | [doc] |
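
The rules about which of `id` and `type` must be present can be summarized as a small decision function. This is a client-side sketch of the behaviour as this page appears to describe it, not MLDB's own implementation, and it never contacts a server. The function name `describe_request`, the returned labels, and the id `my_procedure` are assumptions made for illustration.

```python
def describe_request(config):
    """Return what MLDB is described as doing for a given configuration.

    Hypothetical helper: it mirrors the documented rules for `id` and
    `type` and does not check whether a procedure actually exists.
    """
    has_id = "id" in config
    has_type = "type" in config

    if not has_id and not has_type:
        raise ValueError("at least one of 'id' and 'type' must be specified")
    if not has_type:
        # Only id: the procedure is assumed to exist already and is loaded.
        return "load existing procedure"
    if not has_id:
        # type without id: the procedure is created with a generated id.
        return "create procedure with auto-generated id"
    # Both given: created under that id, unless the id is already taken.
    return "create procedure with given id"


print(describe_request({"id": "my_procedure"}))
print(describe_request({"type": "kmeans.train"}))
print(describe_request({"id": "my_procedure", "type": "kmeans.train"}))
```

Whether `params` is also required depends on the procedure type, so this sketch deliberately does not check it.
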