This page is part of the documentation for the Machine Learning Database.

It is a static snapshot of a Notebook which you can play with interactively by trying MLDB online now.
It's free and takes 30 seconds to get going.

# Using pymldb's Progress Bar and Cancel Button Tutorial

This tutorial showcases the progress bars and cancel buttons that pymldb provides for long-running procedures in a Jupyter notebook. They let a user watch the progress of a procedure and cancel it.

If you have not done so already, we encourage you to go through the Using pymldb Tutorial.

## How does it work?

To use this feature, you only need to slightly modify the way you execute procedures. For example, when doing an HTTP PUT, you would go from using mldb.put() to mldb.put_and_track().

The cancel button is displayed as soon as the procedure run id is found. The button is removed as soon as the procedure finishes either normally or with an error.

Progress bars are rendered with the tqdm library (tqdm/tqdm). They are displayed as soon as a procedure enters the "executing" state and are refreshed at every refresh interval for as long as the procedure stays in that state. When a step or procedure finishes normally, its bar moves to a valid state (it turns green); when it finishes with an error, the bar moves to a danger state (it turns red).

If a procedure runs too quickly, the progress bars will not be displayed because the application logic will not have time to catch the "executing" phase. If a procedure stays in the "initializing" phase for some time, the "Cancel" button will be visible with no progress bars as long as the "executing" phase is not reached.
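The display rules above can be sketched as a small polling loop. This is only an illustration, not pymldb's actual implementation: the function name `simulate_tracking`, the event strings, and the idea of replaying a pre-recorded list of polled states are all assumptions made for the example. The state names "initializing" and "executing" come from the text; "finished" and "error" are placeholder names for the two ways a run can end.

```python
def simulate_tracking(polled_states):
    """Replay a list of polled procedure states and return the UI events.

    Hypothetical sketch: each entry in `polled_states` stands for what one
    poll of the procedure run would have returned.
    """
    events = []
    cancel_shown = False
    bar_shown = False
    for state in polled_states:
        # The cancel button appears as soon as a run is known to exist.
        if not cancel_shown:
            events.append("show cancel button")
            cancel_shown = True
        if state == "executing":
            # Bars are created the first time "executing" is observed and
            # refreshed on every later poll that still sees that state.
            events.append("refresh progress bar" if bar_shown else "show progress bar")
            bar_shown = True
        elif state in ("finished", "error"):
            if bar_shown:
                colour = "green" if state == "finished" else "red"
                events.append("progress bar turns " + colour)
            events.append("remove cancel button")
            break
    return events


# A slow run: the cancel button is alone during "initializing".
slow = simulate_tracking(
    ["initializing", "initializing", "executing", "executing", "finished"])
# A run that is already over at the first poll: no bar is ever shown.
fast = simulate_tracking(["finished"])
print(slow)
print(fast)
```

In the second call, the loop never observes "executing", which is the situation described above for procedures that run too quickly.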

## ⚠ Disclaimers

1. There is a known issue where the final value of the last progress bar may not reflect what was actually done in MLDB. This is because, once a procedure has finished running, it no longer reports how many items it processed for each step.
2. Due to browser cross-origin restrictions (the same-origin policy), the cancel button provided with the progress bars will not work if the notebook is running on a different host than MLDB itself.
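To check in advance whether the second disclaimer applies to a given setup, one option is to compare the address of the notebook with the address of MLDB. The sketch below is an assumption-laden illustration: the text only speaks of "a different host", whereas browsers generally compare scheme, host and port, which is what the helper does. The helper names and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Ports implied by the scheme when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple used for the comparison."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(notebook_url, mldb_url):
    """True when both URLs share scheme, host and port."""
    return origin(notebook_url) == origin(mldb_url)


# Hypothetical addresses, for illustration only.
print(same_origin("http://localhost:8080/ipy/notebooks/demo.ipynb",
                  "http://localhost:8080/v1/procedures"))
print(same_origin("http://localhost:8888/notebooks/demo.ipynb",
                  "http://mldb.example.com/v1/procedures"))
```

When the helper returns False, the cancel button is likely to be affected by the restriction; the progress bars themselves are not mentioned in the disclaimer.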

Here we start with the obligatory lines to import pymldb and initialize the connection to MLDB.

In [13]:
import pymldb
mldb = pymldb.Connection()


## Procedure with steps

Here we POST a procedure with multiple steps. The steps are displayed as soon as the procedure starts running and are updated as it progresses.

In [8]:
# POST a mock procedure that runs for 8 s; the last argument sets how often the bars refresh.
print(mldb.post_and_track('/v1/procedures', {
    'type': 'mock',
    'params': {'durationMs': 8000, 'refreshRateMs': 500}
}, 0.5))


<Response [201]>
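As a rough sanity check on the numbers used above, the following sketch estimates how many times the display would be refreshed during the run. It assumes that the last argument to `post_and_track` (0.5 here) is a refresh interval in seconds, which the source does not state explicitly, and it ignores network latency and the time spent in the "initializing" state. The function name is hypothetical.

```python
import math


def expected_refreshes(duration_ms, refresh_interval_sec):
    """Rough upper bound on how many times the bars would be redrawn."""
    return int(math.ceil(duration_ms / (refresh_interval_sec * 1000.0)))


# Values taken from the cell above: an 8000 ms mock run, refreshed every 0.5 s.
print(expected_refreshes(8000, 0.5))
```

A shorter interval gives smoother bars at the cost of more requests to MLDB.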


## Procedure with no steps

A procedure with no inner steps will simply display its progress.

In this example the "initializing" phase lasts for some time, so the "Cancel" button is first shown alone; the progress bar appears once the "executing" phase is reached.

In [9]:
# PUT an import.text procedure and track it, refreshing the display at a 0.1 interval.
print(mldb.put_and_track('/v1/procedures/embedded_imagess', {
    'type': 'import.text',
    'params': {
        'dataFileUrl': 'https://s3.amazonaws.com/benchm-ml--main/train-1m.csv',
        'outputDataset': {
            'id': 'embedded_images_realestate',
            'type': 'sparse.mutable'
        }
    }
}, 0.1))

<Response [201]>


## Serial procedure

When post_and_track is used with a serial procedure, one progress bar is displayed for each step. Each of these bars only ever shows 0/1 (step not finished) or 1/1 (step finished).

In [11]:
prefix = 'http://public.mldb.ai/datasets/dataset-builder'
# Three steps run one after another; each step gets its own progress bar.
print(mldb.post_and_track('/v1/procedures', {
    'type': 'serial',
    'params': {
        'steps': [
            {
                'type': 'mock',
                'params': {'durationMs': 2000, 'refreshRateMs': 500}
            }, {
                'type': 'import.text',
                'params': {
                    'dataFileUrl': prefix + '/cache/dataset_creator_embedding_realestate.csv.gz',
                    'outputDataset': {
                        'id': 'embedded_images_realestate',
                        'type': 'embedding'
                    },
                    'select': '* EXCLUDING(rowName)',
                    'named': 'rowName',
                }
            }, {
                'type': 'mock',
                'params': {'durationMs': 2000, 'refreshRateMs': 500}
            }
        ]
    }
}))



<Response [201]>
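The 0/1 and 1/1 behaviour of the per-step bars in a serial procedure can be pictured with a few lines of plain Python. This is a text-only sketch with a hypothetical helper name; it does not use tqdm or pymldb and only mirrors the counting described in this section, using the step types from the cell above as labels.

```python
def serial_step_bars(step_names, completed):
    """Return one text bar per step: 1/1 once finished, 0/1 otherwise."""
    return ["{}: {}/1".format(name, 1 if index < completed else 0)
            for index, name in enumerate(step_names)]


steps = ["mock", "import.text", "mock"]
# Show the bars before any step has finished, then after each step completes.
for done in range(len(steps) + 1):
    print(serial_step_bars(steps, done))
```

Because each bar has a total of one, it gives no indication of progress inside a step, only whether the step is done.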


## Where to next?

Check out the other Tutorials and Demos.