Behaviour Dataset

This feature is part of the MLDB Pro Plugin and so can only be used in compliance with the trial license unless a commercial license has been purchased.

The Behaviour Dataset is used to store behavioural data. It is designed for the following situations:

The reason for these restrictions is that the underlying data structure stores (userid,feature,timestamp) tuples, rather than the (userid,key,value,timestamp) format used by MLDB. A new "feature" is created for every distinct (key,value) combination, so a key with many distinct values can take up a lot of storage.
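To make the storage cost concrete, here is a minimal sketch in plain Python of how (userid,key,value,timestamp) rows collapse into (userid,feature,timestamp) tuples. It is not MLDB's implementation: the `key=value` feature naming, the helper functions, and the sample rows are all assumptions made for illustration.

```python
# Illustrative sketch only: how (key, value) pairs become distinct features.
# The "key=value" naming is an assumption, not the .beh on-disk encoding.

def to_behaviour_tuples(rows):
    """Map (userid, key, value, timestamp) rows to (userid, feature, timestamp)."""
    return [(userid, f"{key}={value}", ts) for userid, key, value, ts in rows]


def count_features(rows):
    """Count the distinct features, one per (key, value) combination."""
    return len({feature for _, feature, _ in to_behaviour_tuples(rows)})


# Hypothetical sample data: one low-cardinality key, one high-cardinality key.
rows = [
    ("u1", "visited", "home", 1),
    ("u2", "visited", "home", 2),
    ("u1", "price", 9.99, 3),
    ("u2", "price", 10.49, 4),
    ("u3", "price", 11.25, 5),
]

tuples = to_behaviour_tuples(rows)
n_features = count_features(rows)
```

In this sample the key `visited` contributes a single feature, while `price` contributes one feature per distinct value, which is the growth pattern the paragraph above warns about.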

It stores its data in a binary file format, normally with an extension of .beh, which is specified by the dataFileUrl parameter. This file format allows full random access to both the matrix and its inverse and is very efficient in memory usage.

This dataset type is read-only: it can only load datasets that were previously written to an artifact. See the beh.mutable dataset type for how to create these files.

Configuration

A new dataset of type beh named <id> can be created as follows:

mldb.put("/v1/datasets/"+<id>, {
    "type": "beh",
    "params": {
        "dataFileUrl": <Url>
    }
})

with the following key-value definitions for params:

Field | Type | Default | Description
dataFileUrl | Url | | URL of the data file (with extension '.beh') from which to load the dataset.
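As a concrete illustration of the configuration above, the sketch below fills in the `<id>` and `<Url>` placeholders and builds the two arguments for `mldb.put`. The helper `beh_dataset_request`, the dataset name `my_beh`, and the file location are hypothetical; only the `type` and `dataFileUrl` fields come from this page.

```python
# Sketch: build the arguments for mldb.put() for a "beh" dataset.
# beh_dataset_request is a hypothetical helper, not part of the MLDB API.

def beh_dataset_request(dataset_id, data_file_url):
    """Return the (path, config) pair to pass to mldb.put()."""
    path = "/v1/datasets/" + dataset_id
    config = {
        "type": "beh",
        "params": {"dataFileUrl": data_file_url},
    }
    return path, config


# Hypothetical dataset name and file location.
path, config = beh_dataset_request("my_beh", "file://data/example.beh")

# With an MLDB connection object `mldb` available, the call would be:
#     mldb.put(path, config)
```

Keeping the payload construction separate from the call makes it easy to inspect or log the configuration before sending it to the server.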

See also