In this tutorial, learn how to extend ClearML's automagical capturing of inputs and outputs with explicit reporting.
In this example, we will add the following to the pytorch_mnist.py example script from ClearML's GitHub repo:
- Setting an output destination for model checkpoints (snapshots).
- Explicitly logging a scalar, other (non-scalar) data, and logging text.
- Registering an artifact, which is uploaded to the ClearML Server, and whose changes are logged by ClearML.
- Uploading an artifact, which is stored on the ClearML Server, but whose changes are not logged.
- The clearml repository is cloned.
- The clearml package is installed.
- In the local ClearML repository, make a copy of pytorch_mnist.py in order to add explicit reporting to it.
Specify a default output location, which is where model checkpoints (snapshots) and artifacts will be stored when the experiment runs. Some possible destinations include:
- Local destination
- Shared folder
- Cloud storage:
- Amazon S3
- Google Cloud Storage
- Azure Storage.
Specify the output location in the output_uri parameter of the Task.init method. In this tutorial, we specify a local folder destination. In pytorch_mnist_tutorial.py, change the Task.init call to set an output destination.
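The updated initialization can be sketched as follows; the project and task names match the ClearML examples, while the folder path is a placeholder for your own destination:

```python
from clearml import Task

# Initialize the Task with an output destination. When the experiment
# runs, model checkpoints (snapshots) and artifacts are stored under
# this location. The local folder below is a placeholder; an S3,
# Google Cloud Storage, or Azure Storage URI also works here.
task = Task.init(
    project_name="examples",
    task_name="Extending automagical ClearML example",
    output_uri="/home/user/clearml_output",  # placeholder local folder
)
```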
When the script runs, ClearML creates the following directory structure:
and puts the model checkpoints (snapshots) and artifacts in that folder.
For example, if the Task ID is 9ed78536b91a44fbb3cc7a006128c1b0, then the directory structure will be:
In addition to ClearML automagical logging, the ClearML Python package contains methods for explicit reporting of plots, log text, media, and tables. These methods include:
- Logger.report_surface - Report a surface diagram (3D plot).
- Logger.report_image - Report an image and upload its contents.
- Logger.report_table - Report a table as a Pandas DataFrame, CSV file, or URL for a CSV file.
- Logger.report_media - Report media including images, audio, and video.
- Logger.get_default_upload_destination - Retrieve the destination that is set for uploaded media.
First, create a logger for the Task using the Task.get_logger method.
Then add scalar metrics, using the Logger.report_scalar method to report the loss.
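A minimal sketch of the scalar reporting, assuming the Task was already created by Task.init; the loss values below are placeholders for the loss computed in the training loop:

```python
from clearml import Task

# Get the logger for the current Task.
logger = Task.current_task().get_logger()

# Inside the training loop, report the loss for each iteration.
# "train" is the plot title and "loss" the series name; the values
# below are placeholders for the loss computed by the script.
for iteration, loss in enumerate([0.92, 0.57, 0.31]):
    logger.report_scalar(title="train", series="loss",
                         value=loss, iteration=iteration)
```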
The script contains a function named test, which determines the loss and correct values for the trained model. We add a histogram and a confusion matrix to log them.
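The two plots can be logged with the Logger.report_histogram and Logger.report_confusion_matrix methods; the arrays below are placeholders for the values that test computes:

```python
import numpy as np
from clearml import Task

logger = Task.current_task().get_logger()

# Placeholder per-class correct counts; in the script these come from
# the predictions that test() computes for the trained model.
correct_per_class = np.array([958, 1032, 990, 981, 944,
                              870, 940, 1001, 949, 955])
logger.report_histogram(title="correct", series="test",
                        values=correct_per_class, iteration=1)

# Placeholder 10x10 matrix of predicted vs. true digit classes.
confusion = np.diag(correct_per_class)
logger.report_confusion_matrix(title="confusion", series="test",
                               matrix=confusion, iteration=1)
```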
Extend ClearML by explicitly logging text, including errors, warnings, and debugging statements. We use the Logger.report_text method and its level argument to report a debugging message.
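For example, a debugging message can be reported as follows; the message text here is illustrative:

```python
import logging

from clearml import Task

logger = Task.current_task().get_logger()

# Report a debugging message; level accepts the standard Python
# logging levels (logging.DEBUG, logging.WARNING, and so on).
logger.report_text("DataFrame sample for debugging", level=logging.DEBUG)
```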
Registering an artifact uploads it to ClearML Server, and if it changes, the change is logged in ClearML Server. Currently, ClearML supports Pandas DataFrames as registered artifacts.
In the tutorial script's test function, we assign the test loss and correct data to a Pandas DataFrame object and register that DataFrame using the Task.register_artifact method.
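A sketch of the registration, using placeholder loss and correct values in place of the results that test computes:

```python
import pandas as pd
from clearml import Task

# Placeholder loss and correct values standing in for the results
# that test() computes.
df = pd.DataFrame({"test_loss": [0.0521], "correct": [9842]})

# Register the DataFrame with the current Task; ClearML uploads it
# and logs any subsequent changes to it.
Task.current_task().register_artifact(name="Test_Loss_Correct",
                                      artifact=df)
```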
Once an artifact is registered, it can be referenced and utilized in the Python experiment script.
An artifact can also be uploaded to the ClearML Server without its changes being logged.
Supported artifacts include:
- Pandas DataFrames
- Files of any type, including image files
- Folders - stored as ZIP files
- Images - stored as PNG files
- Dictionaries - stored as JSON files
- Numpy arrays - stored as NPZ files
In the tutorial script, we upload the loss data as an artifact using the Task.upload_artifact method, with metadata specified in its metadata parameter.
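A sketch of the upload, with a placeholder loss array and illustrative metadata:

```python
import numpy as np
from clearml import Task

# Placeholder loss values standing in for the script's loss data.
loss_history = np.array([0.91, 0.55, 0.34, 0.21])

# Upload the array once; unlike a registered artifact, later changes
# to it are not logged. The metadata dictionary appears alongside the
# artifact in the web UI.
Task.current_task().upload_artifact(
    name="Loss",
    artifact_object=loss_history,
    metadata={"epochs": 4},  # illustrative metadata
)
```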
After extending the Python experiment script, run it and view the results in the ClearML Web UI.
To view the experiment results, do the following:
- In the ClearML Web UI, on the Projects page, click the examples project.
- In the experiments table, click the Extending automagical ClearML example experiment.
- In the ARTIFACTS tab, DATA AUDIT section, click Test_Loss_Correct. The registered Pandas DataFrame appears, including the file path, size, hash, metadata, and a preview.
- In the OTHER section, click Loss. The uploaded numpy array appears, including its related information.
- Click the RESULTS tab.
- Click the CONSOLE sub-tab, and see the debugging message showing the Pandas DataFrame sample.
- Click the SCALARS sub-tab, and see the scalar plots for epoch logging loss.
- Click the PLOTS sub-tab, and see the confusion matrix and histogram.