JSON Import Procedure

The JSON Import Procedure type imports a text file containing one JSON object per line into a dataset.

This procedure will process lines using the parse_json builtin function.

Configuration

A new procedure of type import.json named <id> can be created as follows:

mldb.put("/v1/procedures/"+<id>, {
    "type": "import.json",
    "params": {
        "dataFileUrl": <Url>,
        "outputDataset": <OutputDatasetSpec>,
        "limit": <int>,
        "offset": <int>,
        "ignoreBadLines": <bool>,
        "select": <SqlSelectExpression>,
        "where": <string>,
        "named": <string>,
        "arrays": <JsonArrayHandling>,
        "runOnCreation": <bool>
    }
})

with the following key-value definitions for params:

Field, Type, Default	Description
dataFileUrl Url	URL to load text file from
outputDataset OutputDatasetSpec `{"type":"tabular"}`	Configuration for output dataset
limit int `0`	Maximum number of lines to process
offset int `0`	Skip the first n lines.
ignoreBadLines bool `false`	If true, any line causing an error will be skipped. Any line with an invalid JSON object will cause an error.
select SqlSelectExpression `"*"`	Which columns to use.
where string `"true"`	Which lines to use to create rows.
named string `"lineNumber()"`	Row name expression for output dataset. Note that each row must have a unique name and that names cannot be objects.
arrays JsonArrayHandling `"parse"`	Describes how arrays are encoded in the JSON output. For 'parse' (default), arrays become structured values. For 'encode', arrays containing atoms are sparsified, with the values representing one-hot keys and boolean true values.
runOnCreation bool `true`	If true, the procedure will be run immediately. The response will contain an extra field called `firstRun` pointing to the URL of the run.
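
As an illustration of how the parameters above fit together, the following sketch builds a complete configuration as a plain Python dictionary. The helper name `build_import_json_config`, the file URL, the dataset id `events`, and the procedure id `import_events` are hypothetical, and the sketch assumes that `outputDataset` accepts an `id` field alongside `type`. The `mldb.put` call is left as a comment because the `mldb` object only exists inside an MLDB Python environment.

```python
import json


def build_import_json_config(data_file_url, dataset_id, *, limit=0, offset=0,
                             ignore_bad_lines=False, arrays="parse"):
    """Build an import.json configuration (hypothetical helper, not part of MLDB)."""
    if arrays not in ("parse", "encode"):
        raise ValueError("arrays must be 'parse' or 'encode'")
    return {
        "type": "import.json",
        "params": {
            "dataFileUrl": data_file_url,
            # Assumption: the output dataset spec takes an "id" next to "type".
            "outputDataset": {"id": dataset_id, "type": "tabular"},
            "limit": limit,
            "offset": offset,
            "ignoreBadLines": ignore_bad_lines,
            "select": "*",
            "where": "true",
            "named": "lineNumber()",
            "arrays": arrays,
            "runOnCreation": True,
        },
    }


# Hypothetical file URL and dataset id, for illustration only.
config = build_import_json_config(
    "file:///tmp/events.json", "events", ignore_bad_lines=True
)
print(json.dumps(config, indent=4))

# Inside MLDB, the configuration would then be submitted with:
# mldb.put("/v1/procedures/import_events", config)
```

Every value other than `dataFileUrl`, `outputDataset`, and `ignoreBadLines` is set to the default listed in the table, so those keys could equally be omitted.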

Enumeration `JsonArrayHandling`

Value	Description
`parse`	Arrays will be parsed into nested values
`encode`	Arrays will be encoded as one-hot values
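
The difference between the two values can be sketched in plain Python. This is not MLDB's implementation: the `flatten` function is hypothetical, and the dotted column names (`tags.0`, `tags.red`) are an assumption made for illustration. It only mirrors the descriptions above, where `parse` keeps array positions as nested values and `encode` turns arrays of atoms into one-hot keys with boolean true values.

```python
import json


def _join(prefix, key):
    """Join a parent column name and a child key with a dot (assumed naming)."""
    return f"{prefix}.{key}" if prefix else str(key)


def flatten(value, prefix="", arrays="parse"):
    """Flatten a parsed JSON value into {column_name: atom} (illustrative only)."""
    columns = {}
    if isinstance(value, dict):
        for key, child in value.items():
            columns.update(flatten(child, _join(prefix, key), arrays))
    elif isinstance(value, list):
        all_atoms = all(not isinstance(v, (dict, list)) for v in value)
        if arrays == "encode" and all_atoms:
            # 'encode': each atom becomes a one-hot key with value True.
            for atom in value:
                columns[_join(prefix, atom)] = True
        else:
            # 'parse': the array stays a nested value, indexed by position.
            for index, child in enumerate(value):
                columns.update(flatten(child, _join(prefix, index), arrays))
    else:
        columns[prefix] = value
    return columns


line = '{"id": 7, "tags": ["red", "blue"], "pos": {"x": 1}}'
row = json.loads(line)

parsed = flatten(row, arrays="parse")
encoded = flatten(row, arrays="encode")
print(parsed)
print(encoded)
```

In this sketch, arrays that contain objects or nested arrays fall back to positional handling even under `encode`, since the description above only covers arrays containing atoms.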
