Make sure to have at least one ClearML Agent running and assigned to listen to the
The script trains a simple deep neural network on the PyTorch built-in MNIST dataset. The following describes the code's execution flow:
- The training runs for one epoch.
- The code reaches the
execute_remotely method, which terminates the local execution of the code and enqueues the task to the
default queue, as specified in the
- An agent listening to the queue fetches the task and restarts task execution remotely. When the agent executes the task,
execute_remotely is considered a no-op.
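
The flow in the list above can be sketched roughly as follows. This is a minimal sketch, not the example script itself: `run_epochs` and `train_fn` are hypothetical names, and `task` is assumed to be the object returned by `Task.init`. Any object exposing `execute_remotely(queue_name=...)` works, which keeps the sketch runnable without a ClearML server.

```python
def run_epochs(task, train_fn, epochs, queue_name="default"):
    """Train the first epoch locally, then hand the task over to a queue.

    `task` is assumed to be a ClearML Task (e.g. the result of Task.init).
    `train_fn` is a hypothetical callable that trains a single epoch.
    """
    for epoch in range(1, epochs + 1):
        if epoch > 1:
            # Run locally: execution stops here and the task is enqueued.
            # Run by an agent: the call is a no-op and training continues.
            task.execute_remotely(queue_name=queue_name)
        train_fn(epoch)


# Assumed real-world wiring (requires the clearml package and a configured server):
#   from clearml import Task
#   task = Task.init(project_name="my_project",  # hypothetical project name
#                    task_name="remote_execution pytorch mnist train")
#   run_epochs(task, train_fn=my_train_one_epoch, epochs=10)
```

Placing the call after the first epoch means that import errors, data-loading problems, and similar crashes surface on the development machine before the task reaches the queue.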
An execution flow that uses the
execute_remotely method is especially helpful when running code on a development machine for a few iterations
to debug and to make sure the code doesn't crash, or to set up an environment. After that, the training can be
moved to a more powerful machine.
During the execution of the example script, the code does the following:
- Uses ClearML's automatic and explicit logging.
- Creates an experiment named
remote_execution pytorch mnist train, which is associated with the
In the example script's
train function and test method, the code explicitly reports scalars to ClearML.
These scalars can be visualized in plots, which appear in the ClearML web UI, in the experiment's page > RESULTS > SCALARS.
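
Explicit scalar reporting of this kind can be sketched as below. This is an illustration under stated assumptions rather than the script's own code: `report_epoch_metrics` is a hypothetical helper, the title and series names are made up for the example, and `logger` is assumed to be a ClearML logger such as the one returned by `Logger.current_logger()`.

```python
def report_epoch_metrics(logger, epoch, train_loss, test_loss, test_accuracy):
    """Report one data point per scalar series for the given epoch.

    `logger` is assumed to be a ClearML Logger, e.g. Logger.current_logger().
    The titles and series names below are illustrative, not the script's own.
    """
    # Series that share a title are drawn together on one plot.
    logger.report_scalar(title="loss", series="train", value=train_loss, iteration=epoch)
    logger.report_scalar(title="loss", series="test", value=test_loss, iteration=epoch)
    logger.report_scalar(title="accuracy", series="test", value=test_accuracy, iteration=epoch)


# Assumed real-world wiring (requires the clearml package and an initialized task):
#   from clearml import Logger
#   report_epoch_metrics(Logger.current_logger(), epoch=1,
#                        train_loss=0.31, test_loss=0.12, test_accuracy=96.4)
```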
ClearML automatically logs command line options defined with
argparse. They appear in CONFIGURATIONS > HYPER PARAMETERS > Args.
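
As a rough illustration of what gets picked up, the sketch below defines a few options with argparse. The option names and defaults are assumptions in the style of a typical PyTorch MNIST trainer, not a copy of the example script, and `build_parser` is a hypothetical helper. No ClearML-specific code is needed on the parser itself; the automatic logging is assumed to be enabled by the `Task.init` call made elsewhere in the same script.

```python
import argparse


def build_parser():
    """Command-line options in the style of a PyTorch MNIST trainer (illustrative)."""
    parser = argparse.ArgumentParser(description="PyTorch MNIST example")
    parser.add_argument("--batch-size", type=int, default=64, help="training batch size")
    parser.add_argument("--epochs", type=int, default=10, help="number of epochs to train")
    parser.add_argument("--lr", type=float, default=0.01, help="learning rate")
    return parser


# An empty argument list makes the parser fall back to the defaults.
args = build_parser().parse_args([])
```

Under these assumptions, each option would be listed in the Args section by its destination name (for example `batch_size`) together with its parsed value.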
Text printed to the console for training progress, as well as all other console output, appears in RESULTS > CONSOLE.
Model artifacts associated with the experiment appear in the info panel of the EXPERIMENTS tab and in the info panel of the MODELS tab.