ClearML Task lies at the heart of ClearML's experiment manager.
A Task is a single code execution session, which can represent an experiment, a step in a workflow, a workflow controller, or any custom implementation you choose.
To transform an existing script into a ClearML Task, one must call the Task.init() method and specify a task name and its project. This creates a Task object that automatically captures code execution information as well as execution outputs.
By default, all the information captured by a task is uploaded to the ClearML Server, where it can be visualized in the ClearML WebApp (UI). ClearML can also be configured to upload model checkpoints, artifacts, and charts to cloud storage (see Storage).
In the UI and code, tasks are grouped into projects, which are logical entities similar to folders. Users can decide how to group tasks, though different models or objectives are usually grouped into different projects. Projects can be divided into sub-projects (and sub-sub-projects, etc.) just like files and subdirectories on a computer, making experiment organization easier. In the WebApp, every project has an Overview tab, where a project description can be written and shared.
Tasks in the system can be accessed and used programmatically. A task is identified either by a combination of project name and task name or by its unique ID.
It's possible to copy (clone) a task multiple times and to modify it for re-execution.
The sections of ClearML Task are made up of the information that a task captures and stores, which consists of code execution details and execution outputs. This information is used for tracking and visualizing results, reproducing, tuning, and comparing experiments, and executing workflows.
The captured code execution information includes:
- Git information
- Uncommitted code modifications
- Python environment
- Execution configuration
The captured execution output includes console logs, metrics and graphs, and models and other artifacts.
For a more in-depth description of each task section, see Tracking Experiments and Visualizing Results.
Tasks have a type attribute, which denotes their purpose (Training / Testing / Data processing). This helps to further organize projects and ensure tasks are easy to search and find. The default task type is training. Available task types are:
- training, testing, inference
- controller, optimizer
- monitor, service, application
- data_processing, qc
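As a rough illustration, the type can be chosen when a task is initialized. The sketch below assumes the `task_type` argument of `Task.init` and the `Task.TaskTypes` enumeration from the ClearML SDK; the helper name `init_typed_task` is hypothetical.

```python
# Task type names as listed above.
TASK_TYPES = (
    "training", "testing", "inference",
    "controller", "optimizer",
    "monitor", "service", "application",
    "data_processing", "qc",
)


def init_typed_task(project_name: str, task_name: str, type_name: str = "training"):
    """Create a task of the given type (hypothetical helper).

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    if type_name not in TASK_TYPES:
        raise ValueError(f"unknown task type: {type_name!r}")
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        task_type=Task.TaskTypes[type_name],  # look up the enum member by name
    )
```

Validating the name before importing the SDK keeps typos from creating a task with an unintended type.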
ClearML Tasks are created in one of the following ways:
- Manually running code that is instrumented with the ClearML SDK and invokes Task.init().
- Cloning an existing task.
- Creating a task via CLI using clearml-task.
The above diagram describes how execution information is recorded when running code instrumented with ClearML:
- Once a ClearML Task is initialized, ClearML automatically logs the complete environment information, including:
  - Source code
  - Python environment
  - Configuration parameters
- As the execution progresses, any outputs produced are recorded, including:
  - Console logs
  - Metrics and graphs
  - Models and other artifacts
- Once the script terminates, the task changes its status to Completed, Failed, or Aborted (see Task states below).
All information logged can be viewed in the task details UI.
The above diagram demonstrates how a previously run task can be used as a baseline for experimentation:
- A previously run task is cloned, creating a new task in Draft mode (see Task states below). The new task retains all of the source task's configuration. The original task's outputs are not carried over.
- The new task's configuration is modified to reflect the desired parameters for the new execution.
- The new task is enqueued for execution.
- A clearml-agent servicing the queue pulls the new task and executes it (where ClearML again logs all the execution outputs).
The state of a Task represents its stage in the Task lifecycle. It indicates whether the Task is read-write (editable) or read-only. For each state, a state transition indicates which actions can be performed on an experiment, and the new state after performing an action.
The following table describes the Task states and state transitions.
|State|Description / Usage|State Transition|
|---|---|---|
|Draft|The experiment is editable. Only experiments in Draft mode are editable. The experiment is not running locally or remotely.|If the experiment is enqueued for a worker to fetch and execute, the state becomes Pending.|
|Pending|The experiment was enqueued and is waiting in a queue for a worker to fetch and execute it.|If the experiment is dequeued, the state becomes Draft.|
|Running|The experiment is running locally or remotely.|If the experiment is manually or programmatically terminated, the state becomes Aborted.|
|Completed|The experiment ran and terminated successfully.|If the experiment is reset or cloned, the state of the reset experiment or of the newly cloned experiment becomes Draft. Resetting deletes the logs and output of a previous run. Cloning creates an exact, editable copy.|
|Failed|The experiment ran and terminated with an error.|The same as Completed.|
|Aborted|The experiment ran, and was manually or programmatically terminated.|The same as Completed.|
|Published|The experiment is read-only. Publish an experiment to prevent changes to its inputs and outputs.|A Published experiment cannot be reset. If it is cloned, the state of the newly cloned experiment becomes Draft.|
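The transitions in the table can be sketched as a small lookup. This is an illustrative model only, not ClearML code. The action labels (`enqueue`, `dequeue`, `abort`, `reset`) are hypothetical names chosen for this example, and only the transitions stated in the table are included.

```python
# (state, action) -> new state, taken from the table above.
TRANSITIONS = {
    ("Draft", "enqueue"): "Pending",
    ("Pending", "dequeue"): "Draft",
    ("Running", "abort"): "Aborted",
    ("Completed", "reset"): "Draft",
    ("Failed", "reset"): "Draft",
    ("Aborted", "reset"): "Draft",
}


def next_state(state: str, action: str) -> str:
    """Return the state that follows `action`, or raise if the table does not allow it."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"action {action!r} is not allowed in state {state!r}") from None


def is_editable(state: str) -> bool:
    """Only Draft tasks are editable; every other state is read-only."""
    return state == "Draft"
```

Cloning is left out of the lookup because it does not change the state of the source task. It creates a new task that starts in Draft.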
Task.init() is the main method used to create Tasks in ClearML. It will create a Task, and populate it with:
- A link to the running git repository (including commit ID and local uncommitted changes)
- Python packages used (i.e. directly imported Python packages, and the versions available on the machine)
- Argparse arguments (default and specific to the current execution)
- Reports to Tensorboard & Matplotlib and model checkpoints.
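A minimal sketch of instrumenting a script follows, assuming the `project_name` and `task_name` arguments of `Task.init`. The default names and the wrapper function `start_task` are hypothetical.

```python
def start_task(project_name: str = "examples", task_name: str = "my first task"):
    """Initialize a ClearML Task for the current execution.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    # From this point on, ClearML captures the items listed above automatically.
    return Task.init(project_name=project_name, task_name=task_name)
```

In a real script, the call usually sits near the top of the entry point, before argument parsing and training code, so that the automatic capture covers them.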
Once a Task is created, the Task object can be accessed from anywhere in the code.
If multiple Tasks need to be created in the same process (for example, for logging multiple manual runs), make sure to close the current Task before initializing a new one (see example here).
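The sketch below shows both ideas, assuming the SDK's `Task.current_task()` class method and the `close()` method on a Task object. The helper `log_manual_runs` and its arguments are hypothetical.

```python
def log_manual_runs(run_names, project_name="examples"):
    """Create one task per run name, closing each before starting the next.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    task_ids = []
    for run_name in run_names:
        Task.init(project_name=project_name, task_name=run_name)
        # Elsewhere in the same process, the active task can be looked up
        # without passing the object around.
        task = Task.current_task()
        task_ids.append(task.id)
        # Close the task so the next Task.init() call starts a new one.
        task.close()
    return task_ids
```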
When initializing a Task, its project needs to be specified. If the project entered does not exist, it will be created. Projects can be divided into sub-projects, just like folders are broken into sub-folders, and nesting works on multiple levels.
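A sketch of how nested projects might be addressed, assuming that sub-projects are written as a `/`-separated path in the `project_name` argument of `Task.init`. The project names and both helpers are hypothetical.

```python
def project_path(*parts: str) -> str:
    """Join project names into a nested project path, e.g. "main_project/sub_project"."""
    return "/".join(parts)


def init_in_subproject(task_name: str, *project_parts: str):
    """Initialize a task inside a (possibly deeply) nested project.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    return Task.init(project_name=project_path(*project_parts), task_name=task_name)


# One level of nesting, then several levels.
ONE_LEVEL = project_path("main_project", "sub_project")
MULTI_LEVEL = project_path("main_project", "sub_project", "sub_sub_project")
```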
Each Task.init call creates a new Task for the current execution.
In order to mitigate the clutter that a multitude of debugging Tasks might create, a Task will be reused if:
- The last time it was executed (on this machine) was under 72 hours ago (configurable, see the sdk.development section of the ClearML configuration reference)
- The previous Task execution did not have any artifacts / models
It's possible to force the creation of a new Task on every run instead of reusing a previous one. See the Task.init documentation.
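One way to opt out of task reuse is sketched below, assuming the `reuse_last_task_id` argument of `Task.init`. The helper name `init_fresh_task` is hypothetical.

```python
def init_fresh_task(project_name: str, task_name: str):
    """Always start a brand-new task, even if a recent debugging task could be reused.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=task_name,
        reuse_last_task_id=False,  # do not overwrite a previous task
    )
```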
A Task can also be created without executing the code itself. Unlike with runtime detection, all the environment and configuration details need to be provided explicitly.
See Task.create in the Python SDK reference.
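A sketch of creating a task without running its code, assuming the `repo`, `branch`, `script`, and `packages` arguments of `Task.create`. The helper name and the default branch are hypothetical, and the caller supplies the repository URL and script path.

```python
def create_draft_task(project_name, task_name, repo, script, branch="main", packages=None):
    """Register a task from explicit details instead of runtime detection.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    return Task.create(
        project_name=project_name,
        task_name=task_name,
        repo=repo,          # git repository URL, provided explicitly
        branch=branch,      # branch to check out
        script=script,      # entry point, relative to the repository root
        packages=packages,  # optional list of requirement strings
    )
```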
A Task can be identified by its project and name, and by a unique identifier (UUID string). The name and project of a Task can be changed after an experiment has been executed, but its ID can't be changed.
Programmatically, Task objects can be retrieved by querying the system based on either the Task ID or a project and name combination. If a project / name combination is used, and multiple Tasks have the exact same name, the function will return the last modified Task.
- Accessing a Task object with a Task ID:
- Accessing a Task with a project / name:
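Both lookups can be sketched with `Task.get_task`, assuming its `task_id`, `project_name`, and `task_name` arguments. The wrapper `fetch_task` is hypothetical.

```python
def fetch_task(task_id=None, project_name=None, task_name=None):
    """Retrieve a Task object by ID, or by a project / name combination.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    if task_id is None and not (project_name and task_name):
        raise ValueError("provide task_id, or both project_name and task_name")
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    if task_id is not None:
        # The ID is unique, so this lookup is unambiguous.
        return Task.get_task(task_id=task_id)
    # Several tasks may share a name within a project.
    return Task.get_task(project_name=project_name, task_name=task_name)
```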
Once a Task object is obtained, it's possible to query the state of the Task, reported scalars, etc. The Task's outputs, such as artifacts and models, can also be retrieved.
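As a sketch, a retrieved Task object might be summarized as follows, assuming the SDK's `get_status()` and `get_last_scalar_metrics()` methods and the dictionary-like `artifacts` property. The helper `summarize_task` is hypothetical.

```python
def summarize_task(task):
    """Collect a few commonly inspected fields from a Task object."""
    return {
        "status": task.get_status(),
        # Last reported scalar values, grouped by graph title and series.
        "last_scalars": task.get_last_scalar_metrics(),
        # Names of the artifacts registered on the task.
        "artifact_names": sorted(task.artifacts.keys()),
    }
```

An individual artifact's content can then typically be loaded with `task.artifacts["name"].get()`, where `"name"` stands for one of the returned artifact names.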
Searching and filtering Tasks can be done via the web UI, but also programmatically.
Pass search parameters to the Task.get_tasks method, which returns a list of Task objects that match the search.
Tasks can also be filtered by passing filtering rules to the same method.
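A sketch of a programmatic search follows, assuming the `project_name`, `task_name`, `tags`, and `task_filter` arguments of `Task.get_tasks`. It also assumes that the filter dictionary accepts `status` and `order_by` keys. Both helper functions are hypothetical.

```python
def build_task_filter(statuses=None, newest_first=True):
    """Build a filtering-rules dictionary for Task.get_tasks (assumed keys)."""
    task_filter = {}
    if statuses:
        task_filter["status"] = list(statuses)
    if newest_first:
        # A leading "-" requests descending order.
        task_filter["order_by"] = ["-last_update"]
    return task_filter


def find_tasks(project_name, name_pattern=None, tags=None, statuses=None):
    """Return the Task objects that match the given search parameters.

    Calling this requires the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    return Task.get_tasks(
        project_name=project_name,
        task_name=name_pattern,  # full or partial task name
        tags=tags,
        task_filter=build_task_filter(statuses),
    )
```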
Once a Task object is created, it can be copied (cloned). Task.clone returns a copy of the original Task. By default, the cloned Task is added to the same project as the original and is named "Clone Of ORIGINAL_NAME", but the name / project / comment of the cloned Task can be directly overridden.
A cloned Task starts in draft mode, so its Task configurations can be edited (see Task.set_parameters). Once a Task is modified, launch it by pushing it into an execution queue, then a ClearML Agent will pull it from the queue and execute the Task.
See enqueue example.
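The clone, modify, and enqueue steps can be sketched as below, assuming `Task.get_task`, `Task.clone`, and `Task.enqueue` from the SDK. This sketch uses the single-parameter `set_parameter` method rather than `set_parameters`, setting one value at a time. The helper name, queue name, and parameter names are hypothetical.

```python
def clone_and_enqueue(source_task_id, overrides, queue_name="default", name=None):
    """Clone a task, override some of its parameters, and queue the clone.

    Calling this requires the clearml package, a reachable ClearML Server,
    and an agent servicing `queue_name` for the clone to actually run.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    source = Task.get_task(task_id=source_task_id)
    # The clone starts in Draft mode, so it is editable.
    cloned = Task.clone(source_task=source, name=name)
    for param_name, value in overrides.items():
        cloned.set_parameter(param_name, value)
    Task.enqueue(task=cloned, queue_name=queue_name)
    return cloned
```

A call such as `clone_and_enqueue(task_id, {"General/epochs": 20})` would use a hypothetical parameter name; the real names depend on what the source task logged.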
A compelling workflow is:
- Running code on the development machine for a few iterations, or just setting up the environment.
- Moving the execution to a beefier remote machine for the actual training.
For example, to stop the current manual execution, and then re-run it on a remote machine, simply add the following function call to the code:
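A sketch of such a call, assuming the `execute_remotely` method of a Task object with its `queue_name`, `clone`, and `exit_process` arguments. The wrapper `hand_off_to_remote` is hypothetical.

```python
def hand_off_to_remote(task, queue_name="default"):
    """Stop the local run of `task` and enqueue it for an agent to execute.

    `task` is the object returned by Task.init() earlier in the script.
    """
    task.execute_remotely(
        queue_name=queue_name,  # queue that an agent is servicing
        clone=False,            # enqueue this task itself rather than a copy
        exit_process=True,      # terminate the local process after enqueuing
    )
```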
Once the function is called on the machine, it will stop the local process and enqueue the current Task into the default queue. From there, an agent will be able to pick it up and launch it.
See the Remote Execution example.
A specific function can also be launched as a Task on a remote machine. Arguments passed to the function are automatically logged under the Function section in the Hyperparameters tab. Like any other arguments, they can be changed from the UI or programmatically.
Function Tasks must be created from within a regular Task, created by calling Task.init().
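A sketch of launching a function as its own Task, assuming the `create_function_task` method of a Task object (`func`, `func_name`, `task_name`, plus keyword arguments forwarded to the function). It also assumes that the returned Task still has to be enqueued with `Task.enqueue`. The helper name and the task naming scheme are hypothetical.

```python
def launch_function_remotely(task, func, queue_name="default", **func_kwargs):
    """Create a function Task from `task` and enqueue it for remote execution.

    `task` is the regular Task returned by Task.init(). Calling this requires
    the clearml package and a reachable ClearML Server.
    """
    # Lazy import so this module can be loaded without clearml installed.
    from clearml import Task

    func_task = task.create_function_task(
        func=func,
        func_name=func.__name__,
        task_name=f"{func.__name__} (remote)",
        **func_kwargs,  # forwarded to the function as its arguments
    )
    # No new task may be returned (for example, when already running remotely).
    if func_task is not None:
        Task.enqueue(task=func_task, queue_name=queue_name)
    return func_task
```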