The JSON Import Procedure type is used to import a text file containing one JSON object per line into a dataset.
This procedure will process lines using the parse_json builtin function.
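As a rough illustration of the expected input, the sketch below parses a small in-memory sample holding one JSON object per line, using Python's standard json module. It only mimics the input format; it is not MLDB's parse_json implementation, and the sample records and the parse_lines helper are made up for this example.

```python
import json

# Hypothetical sample input: one JSON object per line (not taken from MLDB).
SAMPLE = '{"a": 1, "b": "x"}\n{"a": 2, "c": [1, 2]}\n'


def parse_lines(text):
    """Parse each non-empty line as a standalone JSON document."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


rows = parse_lines(SAMPLE)
print(rows)
```

Each parsed object corresponds to what would become one row of the output dataset.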
A new procedure of type import.json named <id> can be created as follows:
mldb.put("/v1/procedures/"+<id>, {
    "type": "import.json",
    "params": {
        "dataFileUrl": <Url>,
        "outputDataset": <OutputDatasetSpec>,
        "limit": <int>,
        "offset": <int>,
        "ignoreBadLines": <bool>,
        "select": <SqlSelectExpression>,
        "where": <string>,
        "named": <string>,
        "arrays": <JsonArrayHandling>,
        "runOnCreation": <bool>
    }
})
with the following key-value definitions for params:
Field | Description |
---|---|
dataFileUrl | URL to load text file from |
outputDataset | Configuration for output dataset |
limit | Maximum number of lines to process |
offset | Skip the first n lines. |
ignoreBadLines | If true, any line causing an error will be skipped. Any line with an invalid JSON object will cause an error. |
select | Which columns to use. |
where | Which lines to use to create rows. |
named | Row name expression for output dataset. Note that each row must have a unique name and that names cannot be objects. |
arrays | Describes how arrays are encoded in the JSON output. For 'parse' (default), the arrays become structured values. For 'encode', arrays containing atoms are sparsified, with the values representing one-hot keys and boolean true values. |
runOnCreation | If true, the procedure will be run immediately. The response will contain an extra field called |
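To make the interaction of limit, offset and ignoreBadLines more concrete, here is a minimal pure-Python sketch of the line selection they describe. It is not MLDB code. The import_lines helper is hypothetical, and two behaviours are assumptions rather than facts from this page: offset is applied before limit, and a skipped bad line still counts toward limit.

```python
import json
from itertools import islice


def import_lines(lines, limit=None, offset=0, ignore_bad_lines=False):
    """Return the parsed objects for the selected lines.

    Assumptions (not confirmed by the documentation): the first `offset`
    lines are skipped, then at most `limit` lines are processed, and a
    bad line that is skipped still counts toward `limit`.
    """
    stop = None if limit is None else offset + limit
    rows = []
    for line in islice(lines, offset, stop):
        try:
            rows.append(json.loads(line))
        except json.JSONDecodeError:
            # With ignore_bad_lines the line is dropped; otherwise the
            # error is propagated to the caller.
            if not ignore_bad_lines:
                raise
    return rows


lines = ['{"a": 1}', 'not json', '{"a": 3}', '{"a": 4}']
rows = import_lines(lines, limit=2, offset=1, ignore_bad_lines=True)
print(rows)
```

With ignore_bad_lines left at False, the same call raises on the malformed second line instead of skipping it.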
JsonArrayHandling
Value | Description |
---|---|
parse | Arrays will be parsed into nested values |
encode | Arrays will be encoded as one-hot values |
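The difference between the two modes can be sketched in plain Python for a single field holding an array of atoms. This is an illustration only: the handle_array helper is hypothetical, and the dotted column names it produces are an assumption made for readability, not a statement of how MLDB names the resulting columns.

```python
def handle_array(name, values, mode="parse"):
    """Flatten one array-valued field into column/value pairs.

    'parse'  keeps each element under its position in the array.
    'encode' turns each atom into a one-hot key with the value True.
    Column naming (name.index, name.value) is an assumption of this sketch.
    """
    if mode == "parse":
        return {f"{name}.{i}": v for i, v in enumerate(values)}
    if mode == "encode":
        return {f"{name}.{v}": True for v in values}
    raise ValueError(f"unknown array handling mode: {mode!r}")


parsed = handle_array("tags", ["red", "blue"], mode="parse")
encoded = handle_array("tags", ["red", "blue"], mode="encode")
print(parsed)
print(encoded)
```

The sketch covers only arrays of atoms (strings, numbers, booleans); how arrays of nested objects are handled in either mode is not described here.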
The import.text procedure type is used to import text files.
The melt procedure type is used to melt columns into many rows. This is useful when dealing with a JSON array of objects.