Nearest Neighbors Function

The embedding.neighbors function type returns information about the rows in an existing dataset of type embedding that are nearest to an arbitrary point.

Configuration

A new function of type embedding.neighbors named <id> can be created as follows:

mldb.put("/v1/functions/"+<id>, {
    "type": "embedding.neighbors",
    "params": {
        "defaultNumNeighbors": <int>,
        "defaultMaxDistance": <float>,
        "dataset": <SqlFromExpression>,
        "columnName": <Path>
    }
})

with the following key-value definitions for params:

Field, Type, Default, Description

defaultNumNeighbors
int
10

Default number of neighbors to return. This can be overridden when calling the function.

defaultMaxDistance
float
"inf"

Default value for the function's maxDistance parameter, used when the call does not specify one. This can be overridden on a call-by-call basis.

dataset
SqlFromExpression

Embedding dataset in which to find neighbors. This must be a dataset of type embedding.

columnName
Path

The name of the column within the embedding dataset to match values against. This must match the columns of the dataset referred to in the dataset parameter. If the embedding's values are spread across several independently named columns (name1, name2, ...) rather than stored under a single column prefix (columnName.0, columnName.1, ...), pass [], which means use all columns and is the default.
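As a rough illustration of how the fields above fit together, the following sketch builds a configuration payload in plain Python. The helper name neighbors_function_config, the dataset name my_embedding and the function name my_neighbors are hypothetical, and sending an unbounded distance as the string "inf" is an assumption based on the default shown in the table.

```python
import json
import math


def neighbors_function_config(dataset, column_name=None,
                              num_neighbors=10, max_distance=math.inf):
    """Build a payload for an embedding.neighbors function (hypothetical helper)."""
    params = {
        "defaultNumNeighbors": int(num_neighbors),
        # JSON has no literal for infinity; the string "inf" mirrors the
        # default listed in the table above (assumption).
        "defaultMaxDistance": "inf" if math.isinf(max_distance) else float(max_distance),
        "dataset": dataset,
    }
    # Leave columnName out to fall back on its default (all columns).
    if column_name is not None:
        params["columnName"] = column_name
    return {"type": "embedding.neighbors", "params": params}


config = neighbors_function_config("my_embedding", num_neighbors=5, max_distance=2.5)
print(json.dumps(config, indent=4))

# Where the mldb object is available, the payload would be sent as:
# mldb.put("/v1/functions/my_neighbors", config)
```

Building the payload separately from the PUT call makes it easy to inspect or log the configuration before the function is created.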

Input and Output Values

Functions of this type have the following input values:

Functions of this type have the following output values:

* neighbors: an embedding of the rowPaths of the nearest neighbors, in order of proximity
* distances: a row of rowName to distance for the nearest neighbors
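To make the shape of the two outputs concrete, here is a small brute-force sketch in plain Python. It is not how MLDB computes neighbors; it only mimics the output structure, and it assumes Euclidean distance and an inclusive maxDistance cutoff. The function nearest_neighbors and the sample rows are hypothetical.

```python
import math


def nearest_neighbors(rows, point, num_neighbors=10, max_distance=math.inf):
    """Brute-force illustration of the neighbors/distances outputs.

    rows maps each row name to its coordinates (a list of floats).
    """
    # Score every row by Euclidean distance to the query point (assumption),
    # closest first.
    scored = sorted((math.dist(coords, point), name) for name, coords in rows.items())
    # Drop rows beyond max_distance, then keep at most num_neighbors of the rest.
    kept = [(d, name) for d, name in scored if d <= max_distance][:num_neighbors]
    return {
        "neighbors": [name for _, name in kept],       # ordered by proximity
        "distances": {name: d for d, name in kept},    # row name -> distance
    }


rows = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 0.0]}
result = nearest_neighbors(rows, [0.0, 0.0], num_neighbors=2)
print(result["neighbors"])   # ['a', 'c']
print(result["distances"])   # {'a': 0.0, 'c': 1.0}
```

Here neighbors carries the ordering, while distances lets a caller look up the distance for any returned row by name.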

See also