When a procedure accepts data as input, the data query can be specified in one of two ways: as a string or as a JSON object.
Input queries can be specified as a string containing a full SQL query. For example, the string:
SELECT column1, column2 FROM dataset1 WHERE column1 > column2 ORDER BY column1
where column1 and column2 refer to columns in a pre-existing dataset with id dataset1.
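As a rough illustration of how a string query might travel inside a procedure configuration, the Python sketch below embeds it in a JSON document. The procedure type `example.procedure` and the parameter name `inputData` are placeholders assumed for this example; the actual parameter that carries the query depends on the procedure being configured.

```python
import json

# Hypothetical procedure configuration. Both the "type" value and the
# "inputData" parameter name are assumptions made for illustration only.
config = {
    "type": "example.procedure",
    "params": {
        # The data query given as a plain string holding a full SQL query.
        "inputData": (
            "SELECT column1, column2 FROM dataset1 "
            "WHERE column1 > column2 ORDER BY column1"
        )
    },
}

# Serialize the configuration to the JSON text that would be submitted.
body = json.dumps(config, indent=4)
print(body)
```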
Input queries can also be specified as a JSON object representing a Dataset Configuration and a decomposed SQL query.
The fields in the JSON structure are as follows:
Field | Description |
---|---|
select | SELECT clause |
named | NAMED clause |
from | FROM clause |
when | WHEN clause |
where | WHERE clause |
orderBy | ORDER BY clause |
groupBy | GROUP BY clause |
having | HAVING clause |
offset | OFFSET clause |
limit | LIMIT clause |
For example, the object:
{
"select" : "column1, column2",
"from" : {
"id" : "dataset1"
},
"where" : "column1 > column2",
"orderBy" : "column1"
}
is equivalent to the query above on the pre-existing dataset with id dataset1. In addition, this representation offers the ability to first create a dataset. In this example,
{
"select" : "column1, column2",
"from" : {
"id" : "dataset1",
"type" : "beh",
"params" : {
"dataFileUrl" : "file:///mldb_data/file.beh"
}
},
"where" : "column1 > column2",
"orderBy" : "column1"
}
the dataset dataset1 is first created by loading a beh file.
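The correspondence between the two forms can be sketched in Python. The helper `to_sql` below is hypothetical and not part of any API: it joins the fields of a decomposed query object into a single SQL string. The clause order it uses is an assumption for this sketch, and only the dataset id is carried over from the from field, since the type and params entries have no counterpart in the string form shown above.

```python
# (field name, SQL keyword) pairs, in the clause order assumed for this sketch.
CLAUSES = [
    ("select", "SELECT"),
    ("named", "NAMED"),
    ("from", "FROM"),
    ("when", "WHEN"),
    ("where", "WHERE"),
    ("groupBy", "GROUP BY"),
    ("having", "HAVING"),
    ("orderBy", "ORDER BY"),
    ("limit", "LIMIT"),
    ("offset", "OFFSET"),
]


def to_sql(query):
    """Render a decomposed query object as one SQL string (hypothetical helper)."""
    parts = []
    for field, keyword in CLAUSES:
        if field not in query:
            continue  # Clauses that were not specified are simply left out.
        value = query[field]
        if field == "from":
            # Keep only the dataset id; "type" and "params" are dropped here.
            value = value["id"]
        parts.append(f"{keyword} {value}")
    return " ".join(parts)


query = {
    "select": "column1, column2",
    "from": {"id": "dataset1"},
    "where": "column1 > column2",
    "orderBy": "column1",
}

print(to_sql(query))
```

Run on the first example object, this reproduces the string query given at the top of this section.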
For performance reasons, some procedures do not accept a full SQL query as their data input. For details on what a given procedure supports, read its documentation.