SentiWordNet Importer Procedure

This procedure allows word and phrase embeddings from the SentiWordNet lexical resource to be loaded into MLDB.

Using these embeddings, each English word or phrase can be mapped to a 3-dimensional vector of sentiment scores: positivity, negativity, and objectivity.

This is a simple implementation that does not perform word sense disambiguation. SentiWordNet provides sentiment scores for each of WordNet's synsets. For a given word, this implementation computes a weighted average of the sentiment scores of each of the word's synsets. As a result, more weight goes to the scores of the word senses that are more likely in general, rather than to the sense used in the current context.
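The averaging described above can be sketched as follows. This is only an illustration, not the importer's actual code: the text does not state which weights are used, so the sketch assumes that the sense at rank r (1 being the most common) gets weight 1/r. The function name weighted_sentiment is hypothetical. Objectivity is derived as 1 minus positivity minus negativity, which is how SentiWordNet defines it.

```python
from typing import List, Tuple


def weighted_sentiment(
    synset_scores: List[Tuple[float, float]],
) -> Tuple[float, float, float]:
    """Combine per-synset (positivity, negativity) pairs into one triple.

    synset_scores must be ordered from the most to the least common sense.
    ASSUMPTION: the sense at rank r is weighted by 1/r. The real importer
    may use a different weighting scheme.
    """
    if not synset_scores:
        raise ValueError("at least one synset is required")

    # Harmonic weights: 1, 1/2, 1/3, ... (assumed, see docstring).
    weights = [1.0 / rank for rank in range(1, len(synset_scores) + 1)]
    total = sum(weights)

    pos = sum(w * p for w, (p, _) in zip(weights, synset_scores)) / total
    neg = sum(w * n for w, (_, n) in zip(weights, synset_scores)) / total

    # SentiWordNet defines objectivity as the remainder of the unit mass.
    obj = 1.0 - pos - neg
    return pos, neg, obj
```

With a single synset the result is simply that synset's scores; with several, the most common sense dominates regardless of the context in which the word appears.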

Configuration

A new procedure of type import.sentiwordnet named <id> can be created as follows:

mldb.put("/v1/procedures/"+<id>, {
    "type": "import.sentiwordnet",
    "params": {
        "dataFileUrl": <Url>,
        "outputDataset": <OutputDatasetSpec>,
        "runOnCreation": <bool>
    }
})

with the following key-value definitions for params:

Field, Type, Default, Description

dataFileUrl
Url

Path to SentiWordNet 3.0 data file

outputDataset
OutputDatasetSpec
{"type":"sparse.mutable"}

Output dataset for result

runOnCreation
bool
true

If true, the procedure will be run immediately. The response will contain an extra field called firstRun pointing to the URL of the run.

The dataFileUrl parameter should point to a SentiWordNet 3.0 data file, which can be obtained from the SentiWordNet website.
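As a rough illustration, a filled-in configuration might be built like this. The helper name, the procedure id import_sentiwordnet, and the file path are hypothetical placeholders, and the "id" key inside outputDataset is an assumption about how the output dataset is named, since it is not listed in the parameters above.

```python
import json


def sentiwordnet_procedure_config(data_file_url, output_dataset_id,
                                  run_on_creation=True):
    """Build the JSON body for an import.sentiwordnet procedure."""
    return {
        "type": "import.sentiwordnet",
        "params": {
            "dataFileUrl": data_file_url,
            # ASSUMPTION: the dataset name is passed through an "id" key.
            "outputDataset": {
                "id": output_dataset_id,
                "type": "sparse.mutable",
            },
            "runOnCreation": run_on_creation,
        },
    }


# Hypothetical local path; replace with the location of the downloaded file.
config = sentiwordnet_procedure_config(
    "file:///path/to/SentiWordNet_3.0.0.txt", "sentiWordNet")
print(json.dumps(config, indent=2))

# With an MLDB connection object named mldb, the procedure could then be
# created with:
#   mldb.put("/v1/procedures/import_sentiwordnet", config)
```

Keeping the configuration in a plain dictionary makes it easy to inspect or log before it is sent to the server.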

Data format

Each row name is a word followed by a # and a one-character code indicating the synset type.

The following table shows the synset codes (source):

Code Name
n NOUN
v VERB
a ADJECTIVE
s ADJECTIVE SATELLITE
r ADVERB
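A small client-side helper for splitting such row names could look like the sketch below. The names SYNSET_TYPES and parse_row_name are hypothetical and not part of MLDB; the mapping simply mirrors the table above.

```python
# Synset type codes, copied from the table above.
SYNSET_TYPES = {
    "n": "NOUN",
    "v": "VERB",
    "a": "ADJECTIVE",
    "s": "ADJECTIVE SATELLITE",
    "r": "ADVERB",
}


def parse_row_name(row_name):
    """Split a row name such as 'love#v' into ('love', 'VERB')."""
    # Split on the last '#' so the code is always the trailing part.
    word, sep, code = row_name.rpartition("#")
    if not sep or not word or code not in SYNSET_TYPES:
        raise ValueError("not a valid row name: %r" % (row_name,))
    return word, SYNSET_TYPES[code]


print(parse_row_name("love#v"))
print(parse_row_name("dog#n"))
```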

Assuming the SentiWordNet data has been imported into the sentiWordNet dataset, the following query returns the embeddings for the word love as a verb and the word dog as a noun.

SELECT * FROM sentiWordNet WHERE rowName() IN ('love#v', 'dog#n')

SentiPos SentiNeg SentiObj POS baseWord
0 0.1928374618291855 0.8071626424789429 "n" "dog"
0.6249999403953552 0.01499999966472387 0.3600000143051147 "v" "love"
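When looking up many words, the same kind of query can be assembled programmatically. The helper below is a hypothetical sketch, not an MLDB function. It assumes that single quotes inside a word are escaped by doubling them, as in standard SQL string literals, and that the dataset name is a plain identifier.

```python
def build_lookup_query(dataset, keys):
    """Build a SELECT for the given (word, synset_code) pairs."""

    def quote(row_name):
        # ASSUMPTION: standard SQL escaping, a single quote is doubled.
        return "'" + row_name.replace("'", "''") + "'"

    names = ", ".join(quote(word + "#" + code) for word, code in keys)
    return "SELECT * FROM {} WHERE rowName() IN ({})".format(dataset, names)


query = build_lookup_query("sentiWordNet", [("love", "v"), ("dog", "n")])
print(query)
```

For the two pairs shown, the helper reproduces the query used in the example above.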

See also