Task
class Task()
The Task class is a code template for a Task object which, together with its connected experiment components, represents the current running experiment. These connected components include hyperparameters, loggers, configuration, label enumeration, models, and other artifacts.
The term "main execution Task" refers to the Task context of the currently running experiment. Python experiment scripts can create one, and only one, main execution Task. It is traceable, and after a script runs and ClearML stores the Task in the ClearML Server (backend), it is modifiable, reproducible, executable by a worker, and you can duplicate it for further experimentation.
The Task class and its methods allow you to create and manage experiments, as well as perform advanced experimentation functions, such as AutoML.
Do not construct Task objects directly. Use one of the methods listed below to create experiments or reference existing experiments. Do not define the CLEARML_TASK and CLEARML_PROC OS environment variables; they are used internally for bookkeeping between processes and agents.
For detailed information about creating Task objects, see the following methods:
- Create a new reproducible Task - Task.init
  In some cases, Task.init may return a Task object which is already stored in ClearML Server (already initialized), instead of creating a new Task. For a detailed explanation of those cases, see the Task.init method.
- Manually create a new Task (no auto-logging will apply) - Task.create
- Get the current running Task - Task.current_task
- Get another (different) Task - Task.get_task
The ClearML documentation often refers to a Task as, “Task (experiment)”.
“Task” refers to the class in the ClearML Python Client Package, the object in your Python experiment script, and the entity with which ClearML Server and ClearML Agent work.
“Experiment” refers to your deep learning solution, including its connected components, inputs, and outputs, and is the experiment you can view, analyze, compare, modify, duplicate, and manage using the ClearML Web-App (UI).
Therefore, a “Task” is effectively an “experiment”, and “Task (experiment)” encompasses its usage throughout the ClearML documentation.
The exception to this Task behavior is sub-tasks (non-reproducible Tasks), which do not use the main execution Task. Creating a sub-task always creates a new Task with a new Task ID.
Do not construct Task objects manually! Please use Task.init or Task.get_task.
Task.add_requirements
classmethod add_requirements(package_name, package_version=None)
Force the adding of a package to the requirements list. If package_version is None, use the installed package version, if found.
Example: Task.add_requirements('tensorflow', '2.4.0')
Example: Task.add_requirements('tensorflow', '>=2.4')
Example: Task.add_requirements('tensorflow') -> use the installed tensorflow version
Example: Task.add_requirements('tensorflow', '') -> no version limit
Alternatively, you can add all requirements from a file.
Example: Task.add_requirements('/path/to/your/project/requirements.txt')
Parameters
package_name (str) – The package name or path to a requirements file to add to the “Installed Packages” section of the task.
package_version (Optional[str]) – The package version requirements. If None, then use the installed version.
Return type
None
add_tags
add_tags(tags)
Add Tags to this task. Old tags are not deleted. When executing a Task (experiment) remotely, this method has no effect.
Parameters
tags (Union[Sequence[str], str]) – A list of tags which describe the Task to add.
Return type
None
artifacts
property artifacts
A read-only dictionary of Task artifacts (name, artifact).
Return type
Dict[str, Artifact]
Returns
The artifacts.
cache_dir
property cache_dir
The cache directory used to store the Task-related files.
Return type
Path
Task.clone
classmethod clone(source_task=None, name=None, comment=None, parent=None, project=None)
Create a duplicate (a clone) of a Task (experiment). The status of the cloned Task is Draft and modifiable.
Use this method to manage experiments and for AutoML.
Parameters
source_task (str ) – The Task to clone. Specify a Task object or a Task ID. (Optional)
name (str ) – The name of the new cloned Task. (Optional)
comment (str ) – A comment / description for the new cloned Task. (Optional)
parent (str) – The ID of the parent Task of the new Task.
- If parent is not specified, then parent is set to source_task.parent.
- If parent is not specified and source_task.parent is not available, then parent is set to source_task.
project (str) – The ID of the project in which to create the new Task. If None, the new task inherits the original Task’s project. (Optional)
Returns
The new cloned Task (experiment).
Return type
Task
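For example, the following is a minimal sketch of cloning an existing experiment and enqueueing the clone (the project/task names and the 'default' queue are illustrative placeholders, not values from this reference):
from clearml import Task

source = Task.get_task(project_name='myProject', task_name='training')
cloned = Task.clone(source_task=source, name='training clone')
# the clone is a Draft, so it can still be modified before execution
Task.enqueue(cloned, queue_name='default')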
close
close()
Closes the current Task and changes its status to "Completed". Enables you to manually shut down the task from the process which opened the task.
This method does not terminate the (current) Python process, in contrast to Task.mark_completed.
After a Task has been closed, the respective object can no longer be used, and methods like Task.connect or Task.connect_configuration will raise a ValueError. To obtain an object representing the task again, use methods like Task.get_task.
Only call Task.close if you are certain the Task is not needed.
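A short sketch of manually closing the main Task and later re-obtaining a handle to it (the project/task names are illustrative):
from clearml import Task

task = Task.init('myProject', 'myTask')
task_id = task.id
task.close()  # status becomes "Completed"; this object can no longer be used

# later, obtain a fresh object representing the same Task
task = Task.get_task(task_id=task_id)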
comment
property comment
Returns the current Task’s (user defined) comments.
Return type
str
completed
completed(ignore_errors=True)
Deprecated, use mark_completed(…) instead
Return type
()
Parameters
ignore_errors (bool ) –
connect
connect(mutable, name=None)
Connect an object to a Task object. This connects an experiment component (part of an experiment) to the experiment. For example, an experiment component can be a valid object containing some hyperparameters, or a Model.
Parameters
mutable (object) – The experiment component to connect. The object must be one of the following types:
- argparse - An argparse object for parameters.
- dict - A dictionary for parameters. Note: only keys of type str are supported.
- TaskParameters - A TaskParameters object.
- Model - A model object for initial model warmup, or for model update/snapshot uploading. In practice, the model should be either InputModel or OutputModel.
- type - A class type, storing all class properties (excluding '_'-prefixed properties).
- object - A class instance, storing all instance properties (excluding '_'-prefixed properties).
name (str) – A section name associated with the connected object. If name is None, it defaults to 'General'. Currently, name is only supported for dict and TaskParameters objects, and should be omitted for the other supported types. (Optional)
For example, by setting name='General', the connected dictionary will be under the General section in the hyperparameters section, while by setting name='Train', the connected dictionary will be under the Train section in the hyperparameters section.
Return type
Any
Returns
It will return the same object that was passed as the mutable argument to the method, except if the type of the object is dict. For dicts, Task.connect will return the dict decorated as a ProxyDictPostWrite. This is done to allow propagating the updates from the connected object.
Raise
Raises an exception if passed an unsupported object.
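A minimal sketch of connecting a hyperparameter dictionary (the parameter names and values are illustrative):
from clearml import Task

task = Task.init('myProject', 'myTask')
params = {'batch_size': 32, 'learning_rate': 0.001}
# keep using the returned object: for dicts it is a ProxyDictPostWrite,
# so values overridden in a remote execution propagate back
params = task.connect(params, name='General')
print(params['batch_size'])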
connect_configuration
connect_configuration(configuration, name=None, description=None)
Connect a configuration dictionary or configuration file (pathlib.Path / str) to a Task object. This method should be called before reading the configuration file.
For example, a local file:
config_file = task.connect_configuration(config_file)
my_params = json.load(open(config_file,'rt'))
A parameter dictionary/list:
my_params = task.connect_configuration(my_params)
Parameters
configuration (Union[Mapping, list, Path, str]) – The configuration. This is usually the configuration used in the model training process. Specify one of the following:
- A dictionary/list - A dictionary containing the configuration. ClearML stores the configuration in the ClearML Server (backend), in a HOCON format (JSON-like format) which is editable.
- A pathlib2.Path string - A path to the configuration file. ClearML stores the content of the file. A local path must be a relative path. When executing a Task remotely in a worker, the contents brought from the ClearML Server (backend) overwrite the contents of the file.
name (str) – Configuration section name. Default: 'General'. Allows users to store multiple configuration dicts/files.
description (str) – Configuration section description (text). Default: None
Return type
Union[dict, Path, str]
Returns
If a dictionary is specified, then a dictionary is returned. If pathlib2.Path / string is specified, then a path to a local configuration file is returned. Configuration object.
connect_label_enumeration
connect_label_enumeration(enumeration)
Connect a label enumeration dictionary to a Task (experiment) object.
Later, when creating an output model, the model will include the label enumeration dictionary.
Parameters
enumeration (dict ) – A label enumeration dictionary of string (label) to integer (value) pairs.
For example:
{
"background": 0,
"person": 1
}
Return type
Dict[str, int]
Returns
The label enumeration dictionary (JSON).
Task.create
classmethod create(project_name=None, task_name=None, task_type=None, repo=None, branch=None, commit=None, script=None, working_directory=None, packages=None, requirements_file=None, docker=None, docker_args=None, docker_bash_setup_script=None, argparse_args=None, base_task_id=None, add_task_init_call=True)
Manually create and populate a new Task (experiment) in the system.
If the code does not already contain a call to Task.init, pass add_task_init_call=True, and the code will be patched in remote execution (i.e. when executed by clearml-agent).
This method always creates a new Task. Use the Task.init method to automatically create and populate the task for the running process. To reference an existing Task, call the Task.get_task method.
Parameters
project_name (Optional[str]) – Set the project name for the task. Required if base_task_id is None.
task_name (Optional[str]) – Set the name of the remote task. Required if base_task_id is None.
task_type (Optional[str]) – The task type to be created. Supported values: 'training', 'testing', 'inference', 'data_processing', 'application', 'monitor', 'controller', 'optimizer', 'service', 'qc', 'custom'
repo (Optional[str]) – Remote URL for the repository to use, or path to a local copy of the git repository. Example: 'https://github.com/allegroai/clearml.git' or '~/project/repo'
branch (Optional[str]) – Select a specific repository branch/tag (implies the latest commit from the branch)
commit (Optional[str]) – Select a specific commit ID to use (default: latest commit, or, when used with a local repository, matching the local commit ID)
script (Optional[str]) – Specify the entry point script for the remote execution. When used in tandem with a remote git repository, the script should be a relative path inside the repository, for example: './source/train.py'. When used with a local repository path, it supports a direct path to a file inside the local repository itself, for example: '~/project/source/train.py'
working_directory (Optional[str]) – Working directory to launch the script from. Default: repository root folder. Relative to repo root or local folder.
packages (Union[bool, Sequence[str], None]) – Manually specify a list of required packages. Example: ["tqdm>=2.1", "scikit-learn"], or True to automatically create requirements based on locally installed packages (repository must be local).
requirements_file (Union[str, Path, None]) – Specify a requirements.txt file to install when setting up the session. If not provided, the requirements.txt from the repository will be used.
docker (Optional[str]) – Select the docker image to be executed in by the remote session
docker_args (Optional[str]) – Add docker arguments; pass a single string
docker_bash_setup_script (Optional[str]) – Add a bash script to be executed inside the docker before setting up the Task's environment
argparse_args (Optional[Sequence[Tuple[str, str]]]) – Arguments to pass to the remote execution, as a list of string pairs (argument, value). Notice: only supported if the codebase itself uses argparse.ArgumentParser
base_task_id (Optional[str]) – Use a pre-existing task in the system, instead of a local repo/script. Essentially clones an existing task and overrides arguments/requirements.
add_task_init_call (bool) – If True, a 'Task.init()' call is added to the script entry point in remote execution.
Returns
The newly created Task (experiment)
Return type
Task
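A hedged sketch of populating a Task from a git repository and enqueueing it without running the code locally (the script path and the 'default' queue are illustrative placeholders):
from clearml import Task

task = Task.create(
    project_name='examples',
    task_name='remote training',
    repo='https://github.com/allegroai/clearml.git',
    branch='master',
    script='./examples/reporting/scalar_reporting.py',
    packages=['clearml'],
)
Task.enqueue(task, queue_name='default')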
create_function_task
create_function_task(func, func_name=None, task_name=None, **kwargs)
Create a new task, and call func with the specified kwargs. One can think of this call as remote forking, where the newly created instance is the new Task calling the specified func with the appropriate kwargs, and leaving once the func terminates.
Notice that a remotely executed function cannot create another child remotely executed function.
- Must be called from the main Task, i.e. the one created by Task.init(...)
- The remote Task inherits the environment from the creating Task
- In the remote Task, the entry point is the same as in the creating Task
- In the remote Task, the execution is the same until reaching this function call
Parameters
func (Callable) – A function to execute remotely as a single Task. On the remotely executed Task, the entry point and the environment are copied from this calling process; only this function call redirects the execution flow to the called func, alongside the passed arguments
func_name (Optional[str]) – A unique identifier of the function. Default: the function name without the namespace. For example, Class.foo() becomes 'foo'
task_name (Optional[str]) – The newly created Task name. Default: the calling Task name + function name
kwargs (Optional[Any]) – name-specific arguments for the target function. These arguments will appear under the configuration, "Function" section
Return Task
Return the newly created Task or None if running remotely and execution is skipped
Return type
Optional[Task]
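A small sketch of forking a function into its own Task (the function, its arguments, and the names are illustrative):
from clearml import Task

def process_data(dataset_url, epochs):
    # executed as a separate Task when picked up by an agent
    print(dataset_url, epochs)

task = Task.init('myProject', 'main')
fn_task = task.create_function_task(
    process_data, func_name='process_data', task_name='data step',
    dataset_url='s3://bucket/data', epochs=10,
)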
Task.current_task
classmethod current_task()
Get the current running Task (experiment). This is the main execution Task (task context) returned as a Task object.
Returns
The current running Task (experiment).
Return type
Task
Task.debug_simulate_remote_task
classmethod debug_simulate_remote_task(task_id, reset_task=False)
Simulate remote execution of a specified Task. This call will simulate the behaviour of your Task as if executed by the ClearML Agent. This means configurations will be coming from the backend server into the code (the opposite of manual execution, where the backend logs the code arguments). Use with care.
Parameters
task_id (str) – Task ID to simulate. Notice that all configuration will be taken from the specified Task, regardless of the code's initial values, just as if executed by a ClearML agent
reset_task (bool) – If True, the target Task is automatically cleared / reset.
Return type
()
delete
delete(delete_artifacts_and_models=True, skip_models_used_by_other_tasks=True, raise_on_error=False, callback=None)
Delete the task as well as its output models and artifacts. Models and artifacts are deleted from their storage locations, each using its URI.
Note: in order to delete models and artifacts using their URI, make sure the proper storage credentials are configured in your configuration file (e.g. if an artifact is stored in S3, make sure sdk.aws.s3.credentials are properly configured and that you have delete permission in the related buckets).
Parameters
delete_artifacts_and_models (bool) – If True, artifacts and models will also be deleted (default True). If callback is provided, this argument is ignored.
skip_models_used_by_other_tasks (bool) – If True, models used by other tasks will not be deleted (default True)
raise_on_error (bool) – If True, an exception will be raised when encountering an error. If False, an error will be printed and no exception will be raised.
callback (Optional[Callable[[str, str], bool]]) – An optional callback accepting a uri type (string) and a uri (string) that will be called for each artifact and model. If provided, delete_artifacts_and_models is ignored. Return True to indicate the artifact/model should be deleted, or False otherwise.
Return type
bool
Returns
True if the task was deleted successfully.
delete_artifacts
delete_artifacts(artifact_names, raise_on_errors=True, delete_from_storage=True)
Delete a list of artifacts, by artifact name, from the Task.
Parameters
artifact_names (list ) – list of artifact names
raise_on_errors (bool ) – if True, do not suppress connectivity related exceptions
delete_from_storage (bool ) – If True, try to delete the actual file from the external storage (e.g. S3, GS, Azure, File Server etc.)
Return type
bool
Returns
True if successful
delete_parameter
delete_parameter(name)
Delete a parameter by its full name Section/name.
Parameters
name (str) – Parameter name in full, i.e. Section/name. For example, 'Args/batch_size'
Return type
bool
Returns
True if the parameter was deleted successfully
delete_user_properties
delete_user_properties(*iterables)
Delete hyperparameters for this task.
Parameters
iterables (Iterable[Union[dict, Iterable[str, str]]]) – Hyperparameter key iterables. Each is an iterable whose items each represent a hyperparameter entry to delete. Value formats are:
- A dictionary containing 'section' and 'name' fields
- An iterable (e.g. tuple, list, etc.) whose first two items denote 'section' and 'name'
Return type
bool
Task.dequeue
classmethod dequeue(task)
Dequeue (remove) a Task from an execution queue.
Parameters
task (Task/str ) – The Task to dequeue. Specify a Task object or Task ID.
Return type
Any
Returns
A dequeue JSON response.
{
"dequeued": 1,
"updated": 1,
"fields": {
"status": "created",
"status_reason": "",
"status_message": "",
"status_changed": "2020-02-24T16:43:43.057320+00:00",
"last_update": "2020-02-24T16:43:43.057320+00:00",
"execution.queue": null
}
}
- dequeued - The number of Tasks dequeued (an integer or null).
- fields
  - status - The status of the experiment.
  - status_reason - The reason for the last status change.
  - status_message - Information about the status.
  - status_changed - The last status change date and time in ISO 8601 format.
  - last_update - The last time the Task was created, updated, changed, or events for this task were reported.
  - execution.queue - The ID of the queue where the Task is enqueued. null indicates not enqueued.
- updated - The number of Tasks updated (an integer or null).
Task.enqueue
classmethod enqueue(task, queue_name=None, queue_id=None)
Enqueue a Task for execution, by adding it to an execution queue.
A worker daemon must be listening at the queue for the worker to fetch the Task and execute it, see ClearML Agent in the ClearML Documentation.
Parameters
task (Task/str ) – The Task to enqueue. Specify a Task object or Task ID.
queue_name (str) – The name of the queue. If not specified, then queue_id must be specified.
queue_id (str) – The ID of the queue. If not specified, then queue_name must be specified.
Return type
Any
Returns
An enqueue JSON response.
{
"queued": 1,
"updated": 1,
"fields": {
"status": "queued",
"status_reason": "",
"status_message": "",
"status_changed": "2020-02-24T15:05:35.426770+00:00",
"last_update": "2020-02-24T15:05:35.426770+00:00",
"execution.queue": "2bd96ab2d9e54b578cc2fb195e52c7cf"
}
}
- queued - The number of Tasks enqueued (an integer or null).
- updated - The number of Tasks updated (an integer or null).
- fields
  - status - The status of the experiment.
  - status_reason - The reason for the last status change.
  - status_message - Information about the status.
  - status_changed - The last status change date and time (ISO 8601 format).
  - last_update - The last Task update time, including Task creation, update, change, or events for this task (ISO 8601 format).
  - execution.queue - The ID of the queue where the Task is enqueued. null indicates not enqueued.
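For example, a brief sketch of enqueueing a Task and removing it again before a worker picks it up (the 'default' queue is a placeholder):
from clearml import Task

task = Task.get_task(project_name='myProject', task_name='training')
response = Task.enqueue(task, queue_name='default')
print(response)  # JSON response, as shown above

# remove it from the queue if it has not started executing yet
Task.dequeue(task)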
execute_remotely
execute_remotely(queue_name=None, clone=False, exit_process=True)
If the task is running locally (i.e., not by clearml-agent), then clone the Task and enqueue it for remote execution, or stop the execution of the current Task, reset its state, and enqueue it. If exit_process==True, exit this process.
If the task is running remotely (i.e., clearml-agent is executing it), this call is a no-op (i.e., does nothing).
Parameters
queue_name (Optional[str]) – The queue name used for enqueueing the task. If None, this call exits the process without enqueuing the task.
clone (bool) – Clone the Task and execute the newly cloned Task. The values are:
- True - A cloned copy of the Task will be created, and enqueued, instead of this Task.
- False - The Task will be enqueued.
exit_process (bool) – The function call will leave the calling process at the end. The values are:
- True - Exit the process (exit(0)). Note: if clone==False, then exit_process must be True.
- False - Do not exit the process.
Return Task
Return the Task object of the newly generated remotely executing Task
Return type
Optional[Task]
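A common usage pattern, sketched with a placeholder queue name: run the script locally just long enough to register the Task, then hand execution off to an agent:
from clearml import Task

task = Task.init('myProject', 'myTask')
# everything above this line runs locally; the code below runs on the worker
task.execute_remotely(queue_name='default', clone=False, exit_process=True)
# ... training code here executes only on the worker ...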
export_task
export_task()
Export the Task's configuration into a dictionary (for serialization purposes). A Task can be copied/modified by calling Task.import_task(). Notice: the exported Task does not include the Task's outputs, such as results (scalars/plots etc.) or Task artifacts/models.
Return type
dict
Returns
dictionary of the Task’s configuration.
flush
flush(wait_for_uploads=False)
Flush any outstanding reports or console logs.
Parameters
wait_for_uploads (bool) – Wait for all outstanding uploads to complete
- True - Wait
- False - Do not wait (default)
Return type
bool
Task.force_requirements_env_freeze
classmethod force_requirements_env_freeze(force=True, requirements_file=None)
Force using pip freeze / conda list to store the full requirements of the active environment (instead of statically analyzing the running code and listing directly imported packages). Notice: must be called before Task.init!
Parameters
force (bool) – Set the force using pip freeze flag on/off
requirements_file (Union[str, Path, None]) – Optionally pass a requirements.txt file to use (instead of pip freeze or automatic analysis)
Return type
None
Task.force_store_standalone_script
classmethod force_store_standalone_script(force=True)
Force storing the main python file as a single standalone script, instead of linking with the local git repository/commit ID.
Notice: Must be called before Task.init !
Parameters
force (bool) – Set force storing the main python file as a single standalone script
Return type
None
Task.get_all
classmethod get_all(session=None, log=None, **kwargs)
List all the Tasks based on specific projection.
Parameters
session (Session) – The session object used for sending requests to the API.
log (logging.Logger) – The Log object.
kwargs (dict) – Keyword args passed to the GetAllRequest (see backend_api.service.v?.tasks.GetAllRequest for details; the ? needs to be replaced by the appropriate version.) For example:
status='completed', 'search_text'='specific_word', 'user'='user_id', 'project'='project_id'
Return type
Any
Returns
The API response.
get_all_reported_scalars
get_all_reported_scalars(x_axis='iter')
Return a nested dictionary for all the scalar graphs, containing all the registered samples, where the first key is the graph title and the second is the series name. Value is a dict with 'x': values and 'y': values.
To fetch downsampled scalar values, please see Task.get_reported_scalars.
This call is not cached, any call will retrieve all the scalar reports from the back-end. If the Task has many scalars reported, it might take long for the call to return.
Parameters
x_axis (str ) – scalar x_axis, possible values: ‘iter’: iteration (default), ‘timestamp’: timestamp as milliseconds since epoch, ‘iso_time’: absolute time
Return type
Mapping[str, Mapping[str, Mapping[str, Sequence[float]]]]
Returns
dict: Nested scalar graphs: dict[title(str), dict[series(str), dict[axis(str), list(float)]]]
get_archived
get_archived()
Return the Archive state of the Task
Return type
bool
Returns
If True, the Task is archived, otherwise it is not.
get_base_docker
get_base_docker()
Get the base Docker command (image) that is set for this experiment.
Return type
str
Task.get_by_name
classmethod get_by_name(task_name)
Returns the most recent task with the given name from anywhere in the system as a Task object.
Parameters
task_name (str ) – The name of the task to search for.
Return type
Task
Returns
Task object of the most recent task with that name.
get_configuration_object
get_configuration_object(name)
Get the Task's configuration object section as a blob of text. Use only for automation (externally); otherwise use Task.connect_configuration.
Parameters
name (str ) – Configuration section name
Return type
Optional[str]
Returns
The Task's configuration as a text blob (unconstrained text string). Returns None if the configuration name is not valid
get_configuration_object_as_dict
get_configuration_object_as_dict(name)
Get the Task's configuration object section as a parsed dictionary. Parsing supports JSON and HOCON; otherwise, parse manually with get_configuration_object(). Use only for automation (externally); otherwise use Task.connect_configuration.
Parameters
name (str ) – Configuration section name
Return type
Union[dict, list, None]
Returns
The Task's configuration as a parsed dict. Returns None if the configuration name is not valid
get_configuration_objects
get_configuration_objects()
Get all the Task's configuration objects, as a mapping of configuration name to a blob of text. Use only for automation (externally); otherwise use Task.connect_configuration.
Return type
Optional[Mapping[str, str]]
Returns
The Task’s configurations as a dict (config name as key) and text blob as value (unconstrained text string)
get_debug_samples
get_debug_samples(title, series, n_last_iterations=None)
Parameters
title (str ) – Debug sample’s title, also called metric in the UI
series (str ) – Debug sample’s series, corresponding to debug sample’s file name in the UI, also known as variant
n_last_iterations (int ) – How many debug sample iterations to fetch in reverse chronological order. Leave empty to get all debug samples.
Raise
TypeError if n_last_iterations is explicitly set to anything other than a positive integer value
Return type
List[dict]
Returns
A list of `dict`s, each dictionary containing the debug sample’s URL and other metadata. The URLs can be passed to StorageManager.get_local_copy to fetch local copies of debug samples.
get_initial_iteration
get_initial_iteration()
Return the initial iteration offset; the default is 0. Useful when continuing training from previous checkpoints.
Return type
int
Returns
Initial iteration offset.
get_label_num_description
get_label_num_description()
Get a dictionary of label number to string pairs, representing all labels associated with these numbers in the model labels.
get_labels_enumeration
get_labels_enumeration()
Get the label enumeration dictionary of string (label) to integer (value) pairs.
Return type
Mapping[str, int]
Returns
A dictionary containing the label enumeration.
get_last_iteration
get_last_iteration()
Get the last reported iteration, which is the last iteration for which the Task reported a metric.
The maximum reported iteration is not in the local cache. This method sends a request to the ClearML Server (backend).
Return type
int
Returns
The last reported iteration number.
get_last_scalar_metrics
get_last_scalar_metrics()
Get the last scalar metrics which the Task reported. This is a nested dictionary, ordered by title and series.
For example:
{
"title": {
"series": {
"last": 0.5,
"min": 0.1,
"max": 0.9
}
}
}
Return type
Dict[str, Dict[str, Dict[str, float]]]
Returns
The last scalar metrics.
get_logger
get_logger()
Get a Logger object for reporting, for this task context. You can view all Logger report output associated with the Task for which this method is called, including metrics, plots, text, tables, and images, in the ClearML Web-App (UI).
Return type
Logger
Returns
The Logger for the Task (experiment).
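A minimal sketch of reporting metrics through the Task's logger (the title/series names are placeholders):
from clearml import Task

task = Task.init('myProject', 'myTask')
logger = task.get_logger()
for i in range(10):
    logger.report_scalar(title='loss', series='train', value=1.0 / (i + 1), iteration=i)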
get_model_config_dict
get_model_config_dict()
Deprecated: Deprecated since version 0.14.1: Use Task.connect_configuration
instead.
Return type
Dict
get_model_config_text
get_model_config_text()
Deprecated: Deprecated since version 0.14.1: Use Task.connect_configuration
instead.
Return type
str
get_model_design
get_model_design()
Get the model configuration as a blob of text.
Return type
str
Returns
The model configuration as a blob of text.
get_models
get_models()
Return a dictionary with {'input': [], 'output': []} holding the loaded/stored models of the current Task. Input models are files loaded in the task, either manually or automatically logged. Output models are files stored in the task, either manually or automatically logged. Automatically logged frameworks include, for example: TensorFlow, Keras, PyTorch, scikit-learn (joblib), etc.
Return type
Mapping[str, Sequence[Model]]
Returns
A dictionary-like object with “input”/”output” keys and input/output properties, pointing to a list-like object containing Model objects. Each list-like object also acts as a dictionary, mapping model name to an appropriate model instance.
Example:
{'input': [clearml.Model()], 'output': [clearml.Model()]}
Task.get_num_enqueued_tasks
classmethod get_num_enqueued_tasks(queue_name=None, queue_id=None)
Get the number of tasks enqueued in a given queue.
Parameters
queue_name (Optional[str]) – The name of the queue. If not specified, then queue_id must be specified
queue_id (Optional[str]) – The ID of the queue. If not specified, then queue_name must be specified
Return type
int
Returns
The number of tasks enqueued in the given queue
get_num_of_classes
get_num_of_classes()
Return the number of classes based on the task's labels.
get_offline_mode_folder
get_offline_mode_folder()
Return the folder where all the task outputs and logs are stored in the offline session.
Return type
Optional[pathlib2.Path]
Returns
Path object, local folder, later to be used with report_offline_session()
get_output_destination
get_output_destination(extra_path=None, **kwargs)
Get the task’s output destination, with an optional suffix
get_output_log_web_page
get_output_log_web_page()
Return the Task results & outputs web page address. For example: https://demoapp.demo.clear.ml/projects/216431/experiments/60763e04/output/log
Return type
str
Returns
http/s URL link.
get_parameter
get_parameter(name, default=None, cast=False)
Get a value for a parameter.
Parameters
name (str) – Parameter name
default (Optional[Any]) – Default value
cast (bool) – If value is found, cast to original type. If False, return a string.
Return type
Any
Returns
The Parameter value (or default value if parameter is not defined).
get_parameters
get_parameters(backwards_compatibility=True, cast=False)
Get the parameters for a Task. This method returns a complete group of key-value parameter pairs, but does not support parameter descriptions (the result is a dictionary of key-value pairs). Notice the returned parameter dict is flat: i.e. {'Args/param': 'value'} is the argument 'param' from the section 'Args'.
Parameters
backwards_compatibility (bool) – If True (default), parameters without a section name (API version < 2.9, clearml-server < 0.16) will be at the dict root level. If False, parameters without a section name will be nested under the "Args/" key.
cast (bool) – If True, cast the parameter to the original type. Default False; values are returned in their string representation
Return type
Optional[dict]
Returns
dict of the task parameters, all flattened to key/value. Different sections with key prefix "section/"
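A quick sketch of reading back the flattened parameters (the parameter key is illustrative):
from clearml import Task

task = Task.get_task(project_name='myProject', task_name='training')
params = task.get_parameters(cast=True)
# keys are flat, e.g. 'Args/batch_size' for argument 'batch_size' in section 'Args'
print(params.get('Args/batch_size'))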
get_parameters_as_dict
get_parameters_as_dict(cast=False)
Get the Task parameters as a raw nested dictionary.
If cast is False (default), the values are not parsed. They are returned as is.
Parameters
cast (bool) – If True, cast the parameter to the original type. Default False; values are returned in their string representation
Return type
Dict
get_progress
get_progress()
Gets the Task's progress (0 - 100).
Return type
Optional[int]
Returns
Task's progress as an int. In case the progress doesn't exist, None will be returned
Task.get_project_id
classmethod get_project_id(project_name, search_hidden=True)
Return a project's unique ID (str). If more than one project matches the project_name, return the last updated project. If no project matches the requested name, return None.
Return type
Optional[str]
Returns
Project unique ID (str), or None if no project was found.
Parameters
project_name (str ) –
search_hidden (bool ) –
get_project_name
get_project_name()
Get the current Task’s project name.
Return type
Optional[str]
get_project_object
get_project_object()
Get the current Task’s project as a python object.
Return type
dict
Task.get_projects
classmethod get_projects(**kwargs)
Return a list of projects in the system, sorted by last updated time
Return type
List[ForwardRef]
Returns
A list of all the projects in the system. Each entry is a services.projects.Project object.
Parameters
kwargs (Any ) –
get_registered_artifacts
get_registered_artifacts()
Get a dictionary containing the Task’s registered (dynamically synchronized) artifacts (name, artifact object).
After calling get_registered_artifacts, you can still modify the registered artifacts.
Return type
Dict[str, Artifact]
Returns
The registered (dynamically synchronized) artifacts.
get_reported_console_output
get_reported_console_output(number_of_reports=1)
Return a list of console outputs reported by the Task. Retrieved outputs are the most updated console outputs.
Parameters
number_of_reports (int) – The number of reports to return. The default value is 1, indicating the last (most updated) console output
Return type
Sequence[str]
Returns
A list of strings, each entry corresponds to one report.
get_reported_plots
get_reported_plots(max_iterations=None)
Return a list of all the plots reported for this Task. Notice the plot data is Plotly-compatible.
This call is not cached, any call will retrieve all the plot reports from the back-end. If the Task has many plots reported, it might take long for the call to return.
Example:
[{
"timestamp": 1636921296370,
"type": "plot",
"task": "0ce5e89bbe484f428e43e767f1e2bb11",
"iter": 0,
"metric": "Manual Reporting",
"variant": "Just a plot",
"plot_str": "{'data': [{'type': 'scatter', 'mode': 'markers', 'name': null,
'x': [0.2620246750155817], 'y': [0.2620246750155817]}]}",
"@timestamp": "2021-11-14T20:21:42.387Z",
"worker": "machine-ml",
"plot_len": 6135,
},]
Parameters
max_iterations (int ) – Maximum number of historic plots (iterations from end) to return.
Return type
List[dict]
Returns
list: List of dicts, each one represents a single plot
get_reported_scalars
get_reported_scalars(max_samples=0, x_axis='iter')
Return a nested dictionary for the scalar graphs, where the first key is the graph title and the second is the series name. Value is a dict with ‘x’: values and ‘y’: values
This call is not cached, any call will retrieve all the scalar reports from the back-end. If the Task has many scalars reported, it might take long for the call to return.
Calling this method will return potentially downsampled scalars. The maximum number of returned samples is 5000. Even when setting max_samples to a value larger than 5000, it will be limited to at most 5000 samples. To fetch all scalar values, please see the Task.get_all_reported_scalars.
Example:
{"title": {"series": {
"x": [0, 1 ,2],
"y": [10, 11 ,12]
}}}
Parameters
max_samples (int ) – Maximum samples per series to return. Default is 0 returning up to 5000 samples. With sample limit, average scalar values inside sampling window.
x_axis (str ) – scalar x_axis, possible values: ‘iter’: iteration (default), ‘timestamp’: timestamp as milliseconds since epoch, ‘iso_time’: absolute time
Return type
Mapping[str, Mapping[str, Mapping[str, Sequence[float]]]]
Returns
dict: Nested scalar graphs: dict[title(str), dict[series(str), dict[axis(str), list(float)]]]
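A short sketch of fetching the (possibly downsampled) scalars and reading one series out of the nested structure (the task ID and the title/series names are placeholders):
from clearml import Task

task = Task.get_task(task_id='your_task_id_here')
scalars = task.get_reported_scalars(max_samples=0, x_axis='iter')
# nested: title -> series -> {'x': [...], 'y': [...]}
series = scalars['loss']['train']
print(series['x'][:5], series['y'][:5])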
get_reported_single_value
get_reported_single_value(name)
Get a single reported value, identified by its name. Note that this function calls Task.get_reported_single_values.
Parameters
name (str) – The name of the reported value
Return type
Optional[float]
Returns
The actual value of the reported value, if found. Otherwise, returns None
get_reported_single_values
get_reported_single_values()
Get all reported single values as a dictionary, where the keys are the names of the values and the values of the dictionary are the actual reported values.
Return type
Dict[str, float]
Returns
A dict containing the reported values
get_script
get_script()
Get task’s script details.
Returns a dictionary containing the script details.
Return type
Mapping[str, Optional[str]]
Returns
Dictionary with script properties e.g.
{
'working_dir': 'examples/reporting',
'entry_point': 'artifacts.py',
'branch': 'master',
'repository': 'https://github.com/allegroai/clearml.git'
}
get_status
get_status()
Return the Task status without refreshing the entire Task object (only the status property).
TaskStatusEnum: [“created”, “in_progress”, “stopped”, “closed”, “failed”, “completed”, “queued”, “published”, “publishing”, “unknown”]
Return type
str
Returns
str: Task status as string (TaskStatusEnum)
get_status_message
get_status_message()
Return the Task status without refreshing the entire Task object (only the status property). Also return the last message coupled with the status change.
Task status options: ["created", "in_progress", "stopped", "closed", "failed", "completed", "queued", "published", "publishing", "unknown"]. The message is a string.
Return type
(Optional[str], Optional[str])
Returns
(Task status as string, last message)
get_tags
get_tags()
Get all current Task’s tags.
Return type
Sequence[str]
Task.get_task
classmethod get_task(task_id=None, project_name=None, task_name=None, tags=None, allow_archived=True, task_filter=None)
Get a Task by ID, or project name / task name combination.
For example:
The following code demonstrates calling Task.get_task to report a scalar to another Task. The output of Logger.report_scalar from testing is associated with the Task named training. It allows training and testing to run concurrently, because they initialized different Tasks (see Task.init for information about initializing Tasks).
The training script:
# initialize the training Task
task = Task.init('myProject', 'training')
# do some training
The testing script:
# initialize the testing Task
task = Task.init('myProject', 'testing')
# get the training Task
train_task = Task.get_task(project_name='myProject', task_name='training')
# report metrics in the training Task
for x in range(10):
train_task.get_logger().report_scalar('title', 'series', value=x * 2, iteration=x)
Parameters
task_id (str) – The ID (system UUID) of the experiment to get. If specified, project_name and task_name are ignored.
project_name (str) – The project name of the Task to get.
task_name (str) – The name of the Task within project_name to get.
tags (list) – Filter based on the requested list of tags (strings) (the Task must have at least one of the listed tags). To exclude a tag, add a "-" prefix to the tag. Example: ["best", "-debug"]
allow_archived (bool) – Only applicable if not using a specific task_id. If True (default), allow returning archived Tasks; if False, filter out archived Tasks
task_filter (bool) – Only applicable if not using a specific task_id. Pass additional query filters, on top of project/name. See details in Task.get_tasks.
Returns
The Task specified by ID, or project name / experiment name combination.
Return type
Task
Task.get_task_output_log_web_page
classmethod get_task_output_log_web_page(task_id, project_id=None, app_server_host=None)
Return the Task results & outputs web page address. For example: https://demoapp.demo.clear.ml/projects/216431/experiments/60763e04/output/log
Parameters
task_id (str ) – Task ID.
project_id (str ) – Project ID for this task.
app_server_host (str ) – ClearML Application server host name. If not provided, the current session will be used to resolve the host name.
Return type
str
Returns
http/s URL link.
Task.get_tasks
classmethod get_tasks(task_ids=None, project_name=None, task_name=None, tags=None, allow_archived=True, task_filter=None)
Get a list of Task objects matching the queries/filters:
- A list of specific Task IDs.
- Filter Tasks based on specific fields: project name (including partial match), task name (including partial match), tags. Apply additional advanced filtering with task_filter.
This function returns the most recent 500 tasks. If you wish to retrieve older tasks, use Task.query_tasks().
Parameters
task_ids (list(str)) – The IDs (system UUID) of the experiments to get. If task_ids is specified, then project_name and task_name are ignored.
project_name (str) – The project name of the Tasks to get. To get the experiments in all projects, use the default value of None. (Optional) Use a list of strings for multiple optional project names.
task_name (str) – The full name or partial name of the Tasks to match within the specified project_name (or all projects if project_name is None). This method supports regular expressions for name matching. (Optional) To match an exact task name (i.e. not partial matching), add ^/$ at the beginning/end of the string, for example: "^exact_task_name_here$"
tags (list) – Filter based on the requested list of tags (strings) (the Task must have all the listed tags). To exclude a tag, add a "-" prefix to the tag. Example: ["best", "-debug"]
allow_archived (bool) – If True (default), allow returning archived Tasks; if False, filter out archived Tasks
task_filter (dict) – Filter and order Tasks. See backend_api.service.v?.tasks.GetAllRequest for details; the ? needs to be replaced by the appropriate version. Supported keys:
- parent - (str) filter by parent task ID matching
- search_text - (str) free text search (in the task fields comment/name/id)
- status - List[str] List of valid statuses. Options are: "created", "queued", "in_progress", "stopped", "published", "publishing", "closed", "failed", "completed", "unknown"
- type - List[str] List of valid task types. Options are: 'training', 'testing', 'inference', 'data_processing', 'application', 'monitor', 'controller', 'optimizer', 'service', 'qc', 'custom'
- user - List[str] Filter based on the Task's user owner; provide a list of valid user IDs.
- order_by - List[str] List of field names to order by. Use a '-' prefix to specify descending order. Optional, recommended when using page. Example: order_by=['-last_update']
- _all_ - dict(fields=[], pattern='') Match string pattern (regular expression) appearing in all fields. Example: dict(fields=['script.repository'], pattern='github.com/user')
- _any_ - dict(fields=[], pattern='') Match string pattern (regular expression) appearing in any of the fields. Example: dict(fields=['comment', 'name'], pattern='my comment')
Examples: {'status': ['stopped'], 'order_by': ['-last_update']}, {'order_by': ['-last_update'], '_all_': dict(fields=['script.repository'], pattern='github.com/user')}
Returns
The Tasks specified by the parameter combinations (see the parameters).
Return type
List[Task]
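A sketch of a filtered query (the project name, tags, and filter values are placeholders):
from clearml import Task

tasks = Task.get_tasks(
    project_name='myProject',
    task_name='^exact_task_name_here$',  # regex; ^/$ force an exact match
    tags=['best', '-debug'],  # has 'best' and does not have 'debug'
    task_filter={'status': ['completed'], 'order_by': ['-last_update']},
)
for t in tasks:
    print(t.id, t.name)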
get_user_properties
get_user_properties(value_only=False)
Get user properties for this task. Returns a dictionary mapping user property name to user property details dict.
Parameters
value_only (bool) – If True, the returned user property details will be a string representing the property value.
Return type
Dict[str, Union[str, dict]]
Task.ignore_requirements
classmethod ignore_requirements(package_name)
Ignore a specific package when auto-generating the requirements list. Example: Task.ignore_requirements('pywin32')
Parameters
package_name (str ) – The package name to remove/ignore from the “Installed Packages” section of the task.
Return type
None
Task.import_offline_session
classmethod import_offline_session(session_folder_zip, previous_task_id=None, iteration_offset=0)
Upload an offline session (execution) of a Task. Full Task execution includes repository details, installed packages, artifacts, logs, metric and debug samples. This function may also be used to continue a previously executed task with a task executed offline.
Parameters
session_folder_zip (str) – Path to a folder containing the session, or a zip file of the session folder.
previous_task_id (Optional[str]) – Task ID of the task you wish to continue with this offline session.
iteration_offset (Optional[int]) – Reporting of the offline session will be offset by the number specified by this parameter. Useful for avoiding overwriting metrics.
Return type
Optional[str]
Returns
Newly created task ID or the ID of the continued task (previous_task_id)
Task.import_task
classmethod import_task(task_data, target_task=None, update=False)
Import (create) a Task from a previously exported Task configuration (see Task.export_task). Can also be used to edit/update an existing Task (by passing target_task and update=True).
Parameters
task_data (dict) – dictionary of a Task's configuration
target_task (Union[str, Task, None]) – Import task_data into an existing Task. Can be either a task_id (str) or a Task object.
update (bool) – If True, merge task_data with the current Task configuration.
Return type
Optional[Task]
Returns
Return True if the Task was imported/updated
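A hedged sketch of round-tripping a Task configuration through export/import (the source task ID and the renamed copy are illustrative; field names inside the exported dict may vary by server version):
from clearml import Task

source = Task.get_task(task_id='source_task_id_here')
task_data = source.export_task()  # plain dict, safe to serialize
task_data['name'] = 'imported copy'  # assumes a 'name' field in the export
new_task = Task.import_task(task_data)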
Task.init
classmethod init(project_name=None, task_name=None, task_type=<TaskTypes.training: 'training'>, tags=None, reuse_last_task_id=True, continue_last_task=False, output_uri=None, auto_connect_arg_parser=True, auto_connect_frameworks=True, auto_resource_monitoring=True, auto_connect_streams=True, deferred_init=False)
Creates a new Task (experiment) if:
- The Task never ran before. No Task with the same task_name and project_name is stored in ClearML Server.
- The Task has run before (the same task_name and project_name), and (a) it stored models and / or artifacts, or (b) its status is Published, or (c) it is Archived.
- A new Task is forced by calling Task.init with reuse_last_task_id=False.
Otherwise, the already initialized Task object for the same task_name and project_name is returned, or, when being executed remotely on a clearml-agent, the task returned is the existing task from the backend.
To reference another Task, instead of initializing the same Task more than once, call Task.get_task. For example, to “share” the same experiment in more than one script, call Task.get_task. See the Task.get_task method for an example.
For example: The first time the following code runs, it will create a new Task. The status will be Completed.
from clearml import Task
task = Task.init('myProject', 'myTask')
If this code runs again, it will not create a new Task: it does not store a model or artifact, it is not Published (its status is Completed), it was not Archived, and a new Task is not forced.
If the Task is Published or Archived, and run again, it will create a new Task with a new Task ID.
The following code will create a new Task every time it runs, because it stores an artifact.
task = Task.init('myProject', 'myOtherTask')
d = {'a': '1'}
task.upload_artifact('myArtifact', d)
Parameters
project_name (str) – The name of the project in which the experiment will be created. If the project does not exist, it is created. If project_name is None, the repository name is used. (Optional)
task_name (str) – The name of the Task (experiment). If task_name is None, the Python experiment script's file name is used. (Optional)
task_type (TaskTypes) – The task type. Valid task types:
- TaskTypes.training (default)
- TaskTypes.testing
- TaskTypes.inference
- TaskTypes.data_processing
- TaskTypes.application
- TaskTypes.monitor
- TaskTypes.controller
- TaskTypes.optimizer
- TaskTypes.service
- TaskTypes.qc
- TaskTypes.custom
tags (Optional[Sequence[str]]) – Add a list of tags (str) to the created Task. For example: tags=['512x512', 'yolov3']
reuse_last_task_id (bool) – Force a new Task (experiment) with a previously used Task ID, and the same project and Task name. If the previously executed Task has artifacts or models, it will not be reused (overwritten), and a new Task will be created. When a Task is reused, the previous execution outputs are deleted, including console outputs and logs. The values are:
- True - Reuse the last Task ID. (default)
- False - Force a new Task (experiment).
- A string - You can also specify a Task ID (string) to be reused, instead of the cached ID based on the project/name combination.
continue_last_task (bool) – Continue the execution of a previously executed Task (experiment). When continuing the execution of a previously executed Task, all previous artifacts / models / logs remain intact. New logs will continue iteration/step based on the previous execution's maximum iteration value. For example, if the last train/loss scalar reported was iteration 100, the next report will be iteration 101. The values are:
- True - Continue the last Task ID. Specified explicitly by reuse_last_task_id, or implicitly with the same logic as reuse_last_task_id
- False - Overwrite the execution of the previous Task (default).
- A string - You can also specify a Task ID (string) to be continued. This is equivalent to continue_last_task=True and reuse_last_task_id=a_task_id_string.
- An integer - Specify an initial iteration offset (overrides the automatic last_iteration_offset). Pass 0 to disable the automatic last_iteration_offset, or specify a different initial offset. You can specify a Task ID to be used with reuse_last_task_id='task_id_here'
output_uri (str) – The default location for output models and other artifacts. If True, the default files_server will be used for model storage. In the default location, ClearML creates a subfolder for the output. The subfolder structure is the following: <output destination name> / <project name> / <task name>.<Task ID>. Note that for cloud storage, you must install the ClearML package for your cloud storage type, and then configure your storage credentials. For detailed information, see "Storage" in the ClearML Documentation. The following are examples of output_uri values for the supported locations:
- A shared folder: /mnt/share/folder
- S3: s3://bucket/folder
- Google Cloud Storage: gs://bucket-name/folder
- Azure Storage: azure://company.blob.core.windows.net/folder/
- Default file server: True
auto_connect_arg_parser (Union[bool, Mapping[str, bool]]) – Automatically connect an argparse object to the Task. Supported argument-parser packages are: argparse, click, python-fire, jsonargparse. The values are:
- True - Automatically connect. (default)
- False - Do not automatically connect.
- A dictionary - In addition to a boolean, you can use a dictionary for fine-grained control of the connected arguments. The dictionary keys are argparse variable names and the values are booleans. The False value excludes the specified argument from the Task's parameter section. Keys missing from the dictionary default to True; you can change this to False by adding a "*" key with the value False to the dictionary. An empty dictionary defaults to False.
For example:
auto_connect_arg_parser={"do_not_include_me": False, }
auto_connect_arg_parser={"only_include_me": True, "*": False}
Info: To manually connect an argparse object, use Task.connect.
auto_connect_frameworks (Union[bool, Mapping[str, Union[bool, str, list]]]) – Automatically connect frameworks. This includes patching Matplotlib, XGBoost, scikit-learn, Keras callbacks, and TensorBoard/X to serialize plots, graphs, and the model location to the ClearML Server (backend), in addition to the original output destination. The values are:
- True - Automatically connect (default)
- False - Do not automatically connect
- A dictionary - In addition to a boolean, you can use a dictionary for fine-grained control of the connected frameworks. The dictionary keys are frameworks and the values are booleans, other dictionaries used for finer control, or wildcard strings. In the case of wildcard strings, the local path of a model file has to match at least one wildcard to be saved/loaded by ClearML. Example: {'pytorch': '*.pt', 'tensorflow': ['*.h5', '*']}. Keys missing from the dictionary default to True, and an empty dictionary defaults to False. Supported keys for finer control: {'tensorboard': {'report_hparams': bool}} # whether to report TensorBoard hyperparameters
For example:
auto_connect_frameworks={
'matplotlib': True, 'tensorflow': ['*.hdf5', 'something_else*'], 'tensorboard': True,
'pytorch': ['*.pt'], 'xgboost': True, 'scikit': True, 'fastai': True,
'lightgbm': True, 'hydra': True, 'detect_repository': True, 'tfdefines': True,
'joblib': True, 'megengine': True, 'catboost': True, 'gradio': True
}
auto_connect_frameworks={'tensorboard': {'report_hparams': False}}
auto_resource_monitoring (bool) – Automatically create machine resource monitoring plots. These plots appear in the ClearML Web-App (UI), RESULTS tab, SCALARS sub-tab, with a title of :resource monitor:. The values are:
- True - Automatically create resource monitoring plots. (default)
- False - Do not automatically create.
- Class Type - Create a ResourceMonitor object of the specified class type.
auto_connect_streams (Union[bool, Mapping[str, bool]]) – Control the automatic logging of stdout and stderr. The values are:
- True - Automatically connect (default)
- False - Do not automatically connect
- A dictionary - In addition to a boolean, you can use a dictionary for fine-grained control of stdout and stderr. The dictionary keys are 'stdout', 'stderr' and 'logging'; the values are booleans. Keys missing from the dictionary default to False, and an empty dictionary defaults to False. Notice: the default behaviour is logging stdout/stderr; the logging module is logged as a by-product of the stderr logging.
For example:
auto_connect_streams={'stdout': True, 'stderr': True, 'logging': False}
deferred_init (bool) – (default: False) Wait for the Task to be fully initialized (regular behaviour). ** BETA feature! Use with care. **
If set to True, the Task.init function returns immediately, and all initialization / communication to the clearml-server runs in a background thread. The returned object is a full proxy to the regular Task object; hence everything will work as expected. The default behaviour can be controlled with: CLEARML_DEFERRED_TASK_INIT=1. Notes:
- Any access to the returned proxy Task object will essentially wait for Task.init to complete. For example: print(task.name) will wait for Task.init to complete in the background and then return the name property of the original Task object.
- Before Task.init completes in the background, auto-magic logging (console/metric) might be missed.
- If running via an agent, this argument is ignored, and Task.init is called synchronously (default).
Returns
The main execution Task (Task context)
Return type
Task
input_models_id
property input_models_id
Returns the current Task’s input model IDs as a dictionary.
Return type
Mapping[str, str]
is_current_task
is_current_task()
Deprecated: Deprecated since version 0.13.0: This method is deprecated. Use Task.is_main_task instead.
Is this Task object the main execution Task (initially returned by Task.init)?
Return type
bool
Returns
Is this Task object the main execution Task
- True - Is the main execution Task.
- False - Is not the main execution Task.
is_main_task
is_main_task()
Is this Task object the main execution Task (initially returned by Task.init)?
If Task.init was never called, this method will not create it, making this test more efficient than:
Task.init() == task
Return type
bool
Returns
Is this Task object the main execution Task
- True - Is the main execution Task.
- False - Is not the main execution Task.
Task.is_offline
classmethod is_offline()
Return the offline-mode state. If in offline-mode, no communication to the backend is enabled.
Return type
bool
Returns
boolean offline-mode state
labels_stats
property labels_stats
Get accumulated label stats for the current/last frames iteration
Return type
dict
last_worker
property last_worker
The ID of the last worker that handled the task.
Return type
str
Returns
The worker ID.
launch_multi_node
launch_multi_node(total_num_nodes, port=29500, queue=None, wait=False, addr=None)
Enqueue multiple clones of the current task to a queue, allowing the task to be run by multiple workers in parallel. Each task running this way is called a node. Each node has a rank. The node that initialized the execution of the other nodes is called the master node, and it has a rank equal to 0.
A dictionary named multi_node_instance will be connected to the tasks. One can use this dictionary to modify the behaviour of this function when running remotely. The contents of this dictionary correspond to the parameters of this function, and they are:
- total_num_nodes - the total number of nodes, including the master node
- queue - the queue to enqueue the nodes to
The following environment variables will be set:
- MASTER_ADDR - the address of the machine that the master node is running on
- MASTER_PORT - the open port of the machine that the master node is running on
- WORLD_SIZE - the total number of nodes, including the master
- RANK - the rank of the current node (master has rank 0)
One may use this function in conjunction with PyTorch’s distributed communication package. Note that Task.launch_multi_node should be called before torch.distributed.init_process_group. For example:
from clearml import Task
import torch
import torch.distributed as dist
def run(rank, size):
    print('World size is ', size)
    tensor = torch.zeros(1)
    if rank == 0:
        for i in range(1, size):
            tensor += 1
            dist.send(tensor=tensor, dst=i)
            print('Sending from rank ', rank, ' to rank ', i, ' data: ', tensor[0])
    else:
        dist.recv(tensor=tensor, src=0)
        print('Rank ', rank, ' received data: ', tensor[0])

if __name__ == '__main__':
    task = Task.init('some_name', 'some_name')
    task.execute_remotely(queue_name='queue')
    config = task.launch_multi_node(4)
    dist.init_process_group('gloo')
    run(config.get('node_rank'), config.get('total_num_nodes'))
When using the ClearML cloud autoscaler apps, one needs to make sure the nodes can reach each other. The machines need to be in the same security group, the MASTER_PORT needs to be exposed, and the MASTER_ADDR needs to be the correct private IP of the instance the master is running on. For example, to achieve this, one can set the following Docker arguments in the Additional ClearML Configuration section:
agent.extra_docker_arguments=["--ipc=host", "--network=host", "-p", "29500:29500", "--env", "CLEARML_MULTI_NODE_MASTER_DEF_ADDR=`hostname -I | awk '{print $1}'`"]
Parameters
total_num_nodes (int) – The total number of nodes to be enqueued, including the master node, which should already be enqueued when running remotely
port (Optional[int]) – Port opened by the master node. If the environment variable CLEARML_MULTI_NODE_MASTER_DEF_PORT is set, the value of this parameter will be set to the one defined in CLEARML_MULTI_NODE_MASTER_DEF_PORT. If CLEARML_MULTI_NODE_MASTER_DEF_PORT doesn’t exist, but MASTER_PORT does, then the value of this parameter will be set to the one defined in MASTER_PORT. If neither environment variable exists, the value passed to the parameter will be used
queue (Optional[str]) – The queue to enqueue the nodes to. Can be different from the queue the master node is enqueued to. If None, the nodes will be enqueued to the same queue as the master node
wait (bool) – If True, the master node will wait for the other nodes to start
addr (Optional[str]) – The address of the master node’s worker. If the environment variable CLEARML_MULTI_NODE_MASTER_DEF_ADDR is set, the value of this parameter will be set to the one defined in CLEARML_MULTI_NODE_MASTER_DEF_ADDR. If CLEARML_MULTI_NODE_MASTER_DEF_ADDR doesn’t exist, but MASTER_ADDR does, then the value of this parameter will be set to the one defined in MASTER_ADDR. If neither environment variable exists, the value passed to the parameter will be used. If this value is None (default), the private IP of the machine the master node is running on will be used.
Return type
dict
Returns
A dictionary containing relevant information regarding the multi node run. This dictionary has the following entries:
master_addr - the address of the machine that the master node is running on
master_port - the open port of the machine that the master node is running on
total_num_nodes - the total number of nodes, including the master
queue - the queue the nodes are enqueued to, excluding the master
node_rank - the rank of the current node (master has rank 0)
wait - if True, the master node will wait for the other nodes to start
logger
property logger
Get a Logger object for reporting, for this task context. You can view all Logger report output associated with the Task for which this method is called, including metrics, plots, text, tables, and images, in the ClearML Web-App (UI).
Return type
Logger
Returns
The Logger object for the current Task (experiment).
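For example, a minimal reporting sketch using this property (titles and values are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='logger demo')
# task.logger returns the same Logger instance as task.get_logger()
task.logger.report_scalar(title='loss', series='train', value=0.42, iteration=1)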
mark_completed
mark_completed(ignore_errors=True, status_message=None, force=False)
Use this method to close and change status of (remotely!) executed tasks.
This method closes the task it is a member of,
changes its status to “Completed”, and
terminates the Python process that created the task.
This is in contrast to Task.close
, which does the first two steps, but does not terminate any Python process.
Let’s say that process A created the task and process B has a handle on the task, e.g., with Task.get_task
.
Then, if we call Task.mark_completed
, process A is terminated, but process B is not.
However, if Task.mark_completed
was called from the same process in which the task was created,
then - effectively - the process terminates itself.
For example, in
task = Task.init(...)
task.mark_completed()
from time import sleep
sleep(30)
print('This text will not be printed!')
the text will not be printed, because the Python process is immediately terminated.
Parameters
ignore_errors (bool ) – If True (default), ignore any errors raised
force (bool ) – If True, the task status will be changed to stopped regardless of the current Task state.
status_message (str ) – Optional, add status change message to the stop request. This message will be stored as status_message on the Task’s info panel
Return type
()
mark_failed
mark_failed(ignore_errors=True, status_reason=None, status_message=None, force=False)
The signal that this Task stopped.
Return type
()
Parameters
ignore_errors (bool ) –
status_reason (Optional [ str ] ) –
status_message (Optional [ str ] ) –
force (bool ) –
mark_started
mark_started(force=False)
Manually mark a Task as started (happens automatically)
Parameters
force (bool ) – If True, the task status will be changed to started regardless of the current Task state.
Return type
()
mark_stopped
mark_stopped(force=False, status_message=None)
Manually mark a Task as stopped (also used in _at_exit
)
Parameters
force (bool ) – If True, the task status will be changed to stopped regardless of the current Task state.
status_message (str ) – Optional, add status change message to the stop request. This message will be stored as status_message on the Task’s info panel
Return type
()
metrics_manager
property metrics_manager
A metrics manager used to manage the metrics related to this task
Return type
Metrics
models
property models
Read-only dictionary of the Task’s loaded/stored models.
Return type
Mapping[str, Sequence[Model]]
Returns
A dictionary-like object with “input”/”output” keys and input/output properties, pointing to a list-like object containing Model objects. Each list-like object also acts as a dictionary, mapping model name to an appropriate model instance.
Get input/output models:
task.models.input
task.models["input"]
task.models.output
task.models["output"]Get the last output model:
task.models.output[-1]
Get a model by name:
task.models.output["model name"]
move_to_project
move_to_project(new_project_id=None, new_project_name=None, system_tags=None)
Move this task to another project
Parameters
new_project_id (Optional[str]) – The ID of the project the task should be moved to. Not required if new_project_name is passed.
new_project_name (Optional[str]) – Name of the new project the task should be moved to. Not required if new_project_id is passed.
system_tags (Optional[Sequence[str]]) – System tags for the project the task should be moved to.
Return type
bool
Returns
True if the move was successful and False otherwise
name
property name
Returns the current Task’s name.
Return type
str
output_models_id
property output_models_id
Returns the current Task’s output model IDs as a dictionary.
Return type
Mapping[str, str]
output_uri
property output_uri
The storage / output url for this task. This is the default location for output models and other artifacts.
Return type
str
Returns
The url string.
parent
property parent
Returns the current Task’s parent task ID (str).
Return type
str
project
property project
Returns the current Task’s project ID.
Return type
str
publish
publish(ignore_errors=True)
The signal that this task will be published
Return type
()
Parameters
ignore_errors (bool ) –
publish_on_completion
publish_on_completion(enable=True)
The signal that this task will be published automatically on task completion
Return type
()
Parameters
enable (bool ) –
Task.query_tasks
classmethod query_tasks(project_name=None, task_name=None, tags=None, additional_return_fields=None, task_filter=None)
Get a list of Task IDs matching the specified query/filter. Notice, if additional_return_fields is specified, this returns a list of dictionaries with the requested fields (one dict per Task).
Parameters
project_name (str) – The project name of the Tasks to get. To get experiments in all projects, use the default value of None. (Optional) Use a list of strings for multiple optional project names.
task_name (str) – The full name or partial name of the Tasks to match within the specified project_name (or all projects if project_name is None). This method supports regular expressions for name matching. If None is passed, returns all tasks within the project. (Optional)
tags (list) – Filter based on the requested list of tags (strings). To exclude a tag, add a “-” prefix to the tag. Example: ["best", "-debug"]. To include All tags (instead of the default Any behaviour), use "__$all" as the first string, for example: ["__$all", "best", "experiment", "ever"]. To combine All tags and exclude a list of tags, use "__$not" before the excluded tags, for example: ["__$all", "best", "experiment", "ever", "__$not", "internal", "test"]
additional_return_fields (list) – Optional, if not provided, return a list of Task IDs. If provided, return a dict per Task with the additional requested fields. Example: returned_fields=['last_updated', 'user', 'script.repository'] will return a list of dicts: [{'id': 'task_id', 'last_update': datetime.datetime(), 'user': 'user_id', 'script.repository': 'https://github.com/user/'}, ]
task_filter (dict) – filter and order Tasks. See backend_api.service.v?.tasks.GetAllRequest for details; the ? needs to be replaced by the appropriate version.
parent - (str) filter by parent task-id matching
search_text - (str) free text search (in task fields comment/name/id)
status - List[str] List of valid statuses. Options are: “created”, “queued”, “in_progress”, “stopped”, “published”, “publishing”, “closed”, “failed”, “completed”, “unknown”
type - List[Union[str, TaskTypes]] List of valid task types. Options are: ‘training’, ‘testing’, ‘inference’, ‘data_processing’, ‘application’, ‘monitor’, ‘controller’, ‘optimizer’, ‘service’, ‘qc’, ‘custom’
user - List[str] Filter based on the Task’s user owner; provide a list of valid user IDs.
order_by - List[str] List of field names to order by. When search_text is used, use the ‘-’ prefix to specify descending order. Optional, recommended when using page. Example: order_by=['-last_update']
_all_ - dict(fields=[], pattern='') Match string pattern (regular expression) appearing in All fields. Example: dict(fields=['script.repository'], pattern='github.com/user')
_any_ - dict(fields=[], pattern='') Match string pattern (regular expression) appearing in Any of the fields. Example: dict(fields=['comment', 'name'], pattern='my comment')
Examples:
{'status': ['stopped'], 'order_by': ["-last_update"]}
{'order_by': ['-last_update'], '_all_': dict(fields=['script.repository'], pattern='github.com/user')}
Return type
Union[List[str], List[Dict[str, str]]]
Returns
The Tasks specified by the parameter combinations (see the parameters).
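A minimal usage sketch (project, tag, and field names are illustrative):
from clearml import Task

# With no additional_return_fields, a list of Task IDs is returned
task_ids = Task.query_tasks(project_name='examples', task_name='train', tags=['best'])

# With additional_return_fields, a list of dicts (one per Task) is returned
task_info = Task.query_tasks(
    project_name='examples',
    additional_return_fields=['last_update', 'user'],
    task_filter={'status': ['completed'], 'order_by': ['-last_update']},
)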
register_abort_callback
register_abort_callback(callback_function, callback_execution_timeout=30.0)
Register a Task abort callback (single callback function support only). Pass a function to be called from a background thread when the Task is externally aborted. Users must specify a timeout for the callback function execution (default 30 seconds); if the callback function execution exceeds the timeout, the Task’s process will be terminated.
Call this register function from the main process only.
Note: Ctrl-C is not considered external; only a backend-induced abort is covered here.
Parameters
callback_function – Callback function to be called via external thread (from the main process). pass None to remove existing callback
callback_execution_timeout – Maximum callback execution time in seconds, after which the process will be terminated even if the callback did not return
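A minimal sketch of registering an abort callback (the cleanup logic is illustrative):
from clearml import Task

def on_abort():
    # Illustrative cleanup: flush buffers, close files, release resources
    print('Task aborted from the backend, cleaning up...')

task = Task.init(project_name='examples', task_name='abort demo')
task.register_abort_callback(on_abort, callback_execution_timeout=60.0)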
register_artifact
register_artifact(name, artifact, metadata=None, uniqueness_columns=True)
Register (add) an artifact for the current Task. Registered artifacts are dynamically synchronized with the ClearML Server (backend). If a registered artifact is updated, the update is stored in the ClearML Server (backend). Registered artifacts are primarily used for Data Auditing.
The currently supported registered artifact object type is a pandas.DataFrame.
See also Task.unregister_artifact
and Task.get_registered_artifacts
.
ClearML also supports uploaded artifacts which are one-time uploads of static artifacts that are not dynamically synchronized with the ClearML Server (backend). These static artifacts include additional object types. For more information, see Task.upload_artifact.
Parameters
name (str ) – The name of the artifact.
Danger: If an artifact with the same name was previously registered, it is overwritten.
artifact (object ) – The artifact object.
metadata (dict ) – A dictionary of key-value pairs for any metadata. This dictionary appears with the experiment in the ClearML Web-App (UI), ARTIFACTS tab.
uniqueness_columns (Union[bool, Sequence[str]]) – A Sequence of columns for artifact uniqueness comparison criteria, or the default value of True. If True, the artifact uniqueness comparison criteria is all the columns, which is the same as artifact.columns.
Return type
None
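A minimal sketch of registering a dynamically synchronized pandas.DataFrame (column names are illustrative):
import pandas as pd
from clearml import Task

task = Task.init(project_name='examples', task_name='register demo')
df = pd.DataFrame({'epoch': [0], 'accuracy': [0.1]})
task.register_artifact('train_stats', df, metadata={'stage': 'train'}, uniqueness_columns=['epoch'])
# Later updates to df are synchronized with the ClearML Server
df.loc[len(df)] = [1, 0.5]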
reload
reload()
Reload current Task’s state from clearml-server. Refresh all task’s fields, including artifacts / models / parameters etc.
Return type
()
remove_input_models
remove_input_models(models_to_remove)
Remove input models from the current task. Note that the models themselves are not deleted, but the task’s reference to the models is removed. To delete the models themselves, see Models.remove
Parameters
models_to_remove (Sequence [ Union [ str , BaseModel ] ] ) – The models to remove from the task. Can be a list of ids, or of BaseModel (including its subclasses: Model and InputModel)
Return type
()
rename
rename(new_name)
Rename this task
Parameters
new_name (str) – The new name of this task
Return type
bool
Returns
True if the rename was successful and False otherwise
reset
reset(set_started_on_success=False, force=False)
Reset a Task. ClearML reloads a Task after a successful reset.
When a worker executes a Task remotely, the Task does not reset unless
the force
parameter is set to True
(this avoids accidentally clearing logs and metrics).
Parameters
set_started_on_success (bool) – If successful, automatically set the Task to started
True - If successful, set to started.
False - If successful, do not set to started. (default)
force (bool) – Force a Task reset, even when executing the Task (experiment) remotely in a worker
True - Force
False - Do not force (default)
Return type
None
running_locally
static running_locally()
Is the task running locally (i.e., clearml-agent
is not executing it)
Return type
bool
Returns
True, if the task is running locally. False, if the task is not running locally.
save_exec_model_design_file
save_exec_model_design_file(filename='model_design.txt', use_cache=False)
Save execution model design to file
set_archived
set_archived(archive)
Archive the Task or remove it from the archived folder.
Parameters
archive (bool ) – If True, archive the Task. If False, make sure it is removed from the archived folder
Return type
()
set_artifacts
set_artifacts(artifacts_list=None)
Update the task with the provided list of artifacts (tasks.Artifact).
Parameters
artifacts_list (list ) – list of artifacts (type tasks.Artifact)
Return type
Optional[List[Artifact]]
Returns
List of current Task’s Artifacts or None if error.
set_base_docker
set_base_docker(docker_cmd=None, docker_image=None, docker_arguments=None, docker_setup_bash_script=None)
Set the base docker image for this experiment. If provided, this value will be used by clearml-agent to execute this experiment inside the provided docker image. When running remotely, the call is ignored.
Parameters
docker_cmd (Optional [ str ] ) – Deprecated! compound docker container image + arguments (example: ‘nvidia/cuda:11.1 -e test=1’) Deprecated, use specific arguments.
docker_image (Optional [ str ] ) – docker container image (example: ‘nvidia/cuda:11.1’)
docker_arguments (Optional [ Union [ str , Sequence [ str ] ] ] ) – docker execution parameters (example: ‘-e ENV=1’)
docker_setup_bash_script (Optional [ Union [ str , Sequence [ str ] ] ] ) – bash script to run at the beginning of the docker before launching the Task itself. example: [‘apt update’, ‘apt-get install -y gcc’]
Return type
()
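A minimal sketch (image, arguments, and setup script are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='docker demo')
task.set_base_docker(
    docker_image='nvidia/cuda:11.1',
    docker_arguments='-e ENV=1',
    docker_setup_bash_script=['apt update', 'apt-get install -y gcc'],
)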
set_comment
set_comment(comment)
Set a comment / description for the Task.
Parameters
comment (str ) – The comment / description for the Task.
Return type
()
set_configuration_object
set_configuration_object(name, config_text=None, description=None, config_type=None, config_dict=None)
Set the Task’s configuration object as a blob of text or automatically encoded dictionary/list. Use only for automation (externally), otherwise use Task.connect_configuration.
Parameters
name (str ) – Configuration section name
config_text (Optional[str]) – configuration as a blob of text (unconstrained text string), usually the content of a configuration file of a sort
description (str) – Configuration section description
config_type (str ) – Optional configuration format type
config_dict (dict ) – configuration dictionary/list to be encoded using HOCON (json alike) into stored text Notice you can either pass config_text or config_dict, not both
Return type
None
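A minimal sketch (section names and contents are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='config demo')
# Store a dictionary; it is encoded as HOCON text on the Task
task.set_configuration_object(name='filter', config_dict={'classes': ['person', 'car'], 'min_size': 32})
# Or store a raw text blob instead (pass config_text OR config_dict, not both)
task.set_configuration_object(name='raw', config_text='key: value', config_type='yaml')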
Task.set_credentials
classmethod set_credentials(api_host=None, web_host=None, files_host=None, key=None, secret=None, store_conf_file=False)
Set new default ClearML Server (backend) host and credentials.
These credentials will be overridden by either OS environment variables, or the ClearML configuration
file, clearml.conf
.
Credentials must be set before initializing a Task object.
For example, to set credentials for a remote computer:
Task.set_credentials(
    api_host='http://localhost:8008', web_host='http://localhost:8080', files_host='http://localhost:8081',
    key='optional_credentials', secret='optional_credentials'
)
task = Task.init('project name', 'experiment name')
Parameters
api_host (str ) – The API server url. For example,
host='http://localhost:8008'
web_host (str ) – The Web server url. For example,
host='http://localhost:8080'
files_host (str ) – The file server url. For example,
host='http://localhost:8081'
key (str ) – The user key (in the key/secret pair). For example,
key='thisisakey123'
secret (str ) – The user secret (in the key/secret pair). For example,
secret='thisisseceret123'
store_conf_file (bool ) – If True, store the current configuration into the ~/clearml.conf file. If the configuration file exists, no change will be made (outputs a warning). Not applicable when running remotely (i.e. clearml-agent).
Return type
None
set_initial_iteration
set_initial_iteration(offset=0)
Set initial iteration, instead of zero. Useful when continuing training from previous checkpoints
Parameters
offset (int ) – Initial iteration (at starting point)
Return type
int
Returns
Newly set initial offset.
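For example, a minimal sketch of resuming iteration reporting after loading a checkpoint (the offset value is illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='resume demo')
# Continue counting iterations from 1000 instead of zero,
# e.g. after restoring a checkpoint saved at iteration 1000
task.set_initial_iteration(1000)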
set_input_model
set_input_model(model_id=None, model_name=None, update_task_design=True, update_task_labels=True, name=None)
Set a new input model for the Task. The model must be “ready” (status is Published
) to be used as the
Task’s input model.
Parameters
model_id (str) – The ID of the model on the ClearML Server (backend). If model_name is not specified, then model_id must be specified.
model_name (Optional[str]) – The model name in the artifactory. The model_name is used to locate an existing model in the ClearML Server (backend). If model_id is not specified, then model_name must be specified.
update_task_design (bool) – Update the Task’s design
True - ClearML copies the Task’s model design from the input model.
False - ClearML does not copy the Task’s model design from the input model.
update_task_labels (bool) – Update the Task’s label enumeration
True - ClearML copies the Task’s label enumeration from the input model.
False - ClearML does not copy the Task’s label enumeration from the input model.
name (Optional[str]) – Model section name to be stored on the Task (unrelated to the model object name itself). Default: the model weight filename is used (excluding file extension)
Return type
()
set_model_config
set_model_config(config_text=None, config_dict=None)
Deprecated: Deprecated since version 0.14.1: Use Task.connect_configuration
instead.
Return type
None
Parameters
config_text (Optional [ str ] ) –
config_dict (Optional [ Mapping ] ) –
set_model_label_enumeration
set_model_label_enumeration(enumeration=None)
Set the label enumeration for the Task object before creating an output model. Later, when creating an output model, the model will inherit these properties.
Parameters
enumeration (dict ) – A label enumeration dictionary of string (label) to integer (value) pairs.
For example:
{
    "background": 0,
    "person": 1
}
Return type
()
set_name
set_name(name)
Set the Task name.
Parameters
name (str ) – The name of the Task.
Return type
()
Task.set_offline
classmethod set_offline(offline_mode=False)
Set offline mode, where all data and logs are stored in a local folder, for later transmission.
Task.set_offline can’t move the same task from offline to online, nor can it be applied before Task.create. See below an example of incorrect usage of Task.set_offline:
from clearml import Task
Task.set_offline(True)
task = Task.create(project_name='DEBUG', task_name='offline')
# ^^^ an error or warning is raised, saying that Task.set_offline(True)
# is supported only for Task.init
Task.set_offline(False)
# ^^^ an error or warning is raised, saying that running Task.set_offline(False)
# while the current task is not closed is not supported
data = task.export_task()
imported_task = Task.import_task(task_data=data)
The correct way to use Task.set_offline can be seen in the following example:
from clearml import Task
Task.set_offline(True)
task = Task.init(project_name='DEBUG', task_name='offline')
task.upload_artifact('large_artifact', 'test_string')
task.close()
Task.set_offline(False)
imported_task = Task.import_offline_session(task.get_offline_mode_folder())
Parameters
offline_mode (bool) – If True, offline-mode is turned on, and no communication to the backend is enabled.
Return type
None
set_packages
set_packages(packages)
Manually specify a list of required packages or a local requirements.txt file. When running remotely, the call is ignored.
Parameters
packages (Union [ str , Sequence [ str ] ] ) – The list of packages or the path to the requirements.txt file. Example: [“tqdm>=2.1”, “scikit-learn”] or “./requirements.txt”
Return type
()
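A minimal sketch (package names are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='packages demo')
# Either an explicit list of packages...
task.set_packages(['tqdm>=2.1', 'scikit-learn'])
# ...or a path to a local requirements file
task.set_packages('./requirements.txt')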
set_parameter
set_parameter(name, value, description=None, value_type=None)
Set a single Task parameter. This overrides any previous value for this parameter.
Parameters
name (str ) – The parameter name.
value (str ) – The parameter value.
description (Optional [ str ] ) – The parameter description.
value_type (Optional [ Any ] ) – The type of the parameters (cast to string and store)
Return type
()
set_parameters
set_parameters(*args, **kwargs)
Set the parameters for a Task. This method sets a complete group of key-value parameter pairs, but does not support parameter descriptions (the input is a dictionary of key-value pairs). Notice the parameter dict is flat: i.e. {‘Args/param’: ‘value’} will set the argument “param” in section “Args” to “value”
Parameters
args (dict ) – Positional arguments, which are one or more dictionaries or (key, value) iterable. They are merged into a single key-value pair dictionary.
kwargs (Any ) – Key-value pairs, merged into the parameters dictionary created from
args
.
Return type
()
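A minimal sketch showing the flat ‘Section/param’ naming (section and values are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='params demo')
# 'Args/batch_size' sets the argument 'batch_size' in section 'Args'
task.set_parameters({'Args/batch_size': 32, 'Args/lr': 0.001})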
set_parameters_as_dict
set_parameters_as_dict(dictionary)
Set the parameters for the Task object from a dictionary. The dictionary can be nested.
This does not link the dictionary to the Task object. It does a one-time update. This
is the same behavior as the Task.connect
method.
Return type
None
Parameters
dictionary (Dict ) –
set_parent
set_parent(parent)
Set the parent task for the Task.
Parameters
parent (str or Task ) – The parent task ID (or parent Task object) for the Task. Set None for no parent.
Return type
()
set_progress
set_progress(progress)
Sets the Task’s progress (0 - 100). Progress is a field computed and reported by the user.
Parameters
progress (int ) – numeric value (0 - 100)
Return type
()
set_project
set_project(project_id=None, project_name=None)
Set the project of the current task by either specifying a project name or ID
Return type
()
Parameters
project_id (Optional [ str ] ) –
project_name (Optional [ str ] ) –
Task.set_random_seed
classmethod set_random_seed(random_seed)
Set the default random seed for any new initialized tasks
Parameters
random_seed (Optional [ int ] ) – If None or False, disable random seed initialization. If True, use the default random seed, otherwise use the provided int value for random seed initialization when initializing a new task.
Return type
()
set_repo
set_repo(repo, branch=None, commit=None)
Specify a repository to attach to the function. Allow users to execute the task inside the specified repository, enabling them to load modules/scripts from the repository. Notice the execution work directory will be the repository root folder. Supports both a git repo url link, and a local repository path (automatically converted into the remote git/commit as currently checked out). Example remote url: ‘https://github.com/user/repo.git’. Example local repo copy: ‘./repo’ -> will automatically store the remote repo url and commit ID based on the locally cloned copy. When executing remotely, this call will not override the repository data (it is ignored)
Parameters
repo (str ) – Remote URL for the repository to use, OR path to local copy of the git repository Example: ‘https://github.com/allegroai/clearml.git’ or ‘~/project/repo’
branch (Optional [ str ] ) – Optional, specify the remote repository branch (Ignored, if local repo path is used)
commit (Optional [ str ] ) – Optional, specify the repository commit ID (Ignored, if local repo path is used)
Return type
()
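A minimal sketch (repository URL and branch are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='repo demo')
# branch/commit are ignored when a local repository path is passed
task.set_repo('https://github.com/allegroai/clearml.git', branch='master')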
set_resource_monitor_iteration_timeout
set_resource_monitor_iteration_timeout(seconds_from_start=1800)
Set the ResourceMonitor maximum duration (in seconds) to wait until first scalar/plot is reported. If timeout is reached without any reporting, the ResourceMonitor will start reporting machine statistics based on seconds from Task start time (instead of based on iteration)
Parameters
seconds_from_start (float) – Maximum number of seconds to wait for scalar/plot reporting before defaulting to machine statistics reporting based on seconds from experiment start time
Return type
bool
Returns
True if success
set_script
set_script(repository=None, branch=None, commit=None, diff=None, working_dir=None, entry_point=None)
Set task’s script.
Examples:
task.set_script(
    repository='https://github.com/allegroai/clearml.git',
    branch='main',
    working_dir='examples/reporting',
    entry_point='artifacts.py'
)
Parameters
repository (Optional[str]) – Optional, URL of remote repository. Use an empty string (“”) to clear the repository entry.
branch (Optional[str]) – Optional, select a specific repository branch / tag. Use an empty string (“”) to clear the branch entry.
commit (Optional[str]) – Optional, set a specific git commit ID. Use an empty string (“”) to clear the commit ID entry.
diff (Optional[str]) – Optional, set the “git diff” section. Use an empty string (“”) to clear the git-diff entry.
working_dir (Optional[str]) – Optional, working directory to launch the script from.
entry_point (Optional[str]) – Optional, path to execute within the repository.
Return type
None
set_tags
set_tags(tags)
Set the current Task’s tags. Please note this will overwrite anything that is there already.
Parameters
tags (Sequence ( str ) ) – Any sequence of tags to set.
Return type
()
set_task_type
set_task_type(task_type)
Set the task_type for the Task.
Parameters
task_type (str or TaskTypes ) – The task_type of the Task.
Valid task types:
TaskTypes.training
TaskTypes.testing
TaskTypes.inference
TaskTypes.data_processing
TaskTypes.application
TaskTypes.monitor
TaskTypes.controller
TaskTypes.optimizer
TaskTypes.service
TaskTypes.qc
TaskTypes.custom
Return type
()
set_user_properties
set_user_properties(*iterables, **properties)
Set user properties for this task. A user property can contain the following fields (all of type string): name / value / description / type
Examples:
task.set_user_properties(backbone='great', stable=True)
task.set_user_properties(backbone={"type": int, "description": "network type", "value": "great"}, )
task.set_user_properties(
    {"name": "backbone", "description": "network type", "value": "great"},
    {"name": "stable", "description": "is stable", "value": True},
)
Parameters
iterables (Union[Mapping[str, Union[str, dict, None]], Iterable[dict]]) – Properties iterables, each can be:
A dictionary of string key (name) to either a string value (value) or a dict (property details). If the value is a dict, it must contain a “value” field. For example:
{
    "property_name": {"description": "This is a user property", "value": "property value"},
    "another_property_name": {"description": "This is user property", "value": "another value"},
    "yet_another_property_name": "some value"
}
An iterable of dicts (each representing property details). Each dict must contain a “name” field and a “value” field. For example:
[
    {
        "name": "property_name",
        "description": "This is a user property",
        "value": "property value"
    },
    {
        "name": "another_property_name",
        "description": "This is another user property",
        "value": "another value"
    }
]
properties (Union[str, dict, int, float, None]) – Additional properties keyword arguments. Key is the property name, and value can be a string (property value) or a dict (property details). If the value is a dict, it must contain a “value” field. For example:
{
    "property_name": "string as property value",
    "another_property_name": {
        "type": "string",
        "description": "This is user property",
        "value": "another value"
    }
}
Return type
bool
setup_aws_upload
setup_aws_upload(bucket, subdir=None, host=None, key=None, secret=None, token=None, region=None, multipart=True, secure=True, verify=True)
Setup S3 upload options.
Parameters
bucket – AWS bucket name
subdir – Subdirectory in the AWS bucket
host – Hostname. Only required in case a Non-AWS S3 solution such as a local Minio server is used
key – AWS access key. If not provided, we’ll attempt to obtain the key from the configuration file (bucket-specific, then global)
secret – AWS secret key. If not provided, we’ll attempt to obtain the secret from the configuration file (bucket-specific, then global)
token – AWS 2FA token
region – Bucket region. Required if the bucket doesn’t reside in the default region (us-east-1)
multipart – Server supports multipart. Only required when using a Non-AWS S3 solution that doesn’t support multipart.
secure – Server supports HTTPS. Only required when using a Non-AWS S3 solution that only supports HTTPS.
verify – Whether or not to verify SSL certificates. Only required when using a Non-AWS S3 solution that only supports HTTPS with self-signed certificate.
Return type
None
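A minimal sketch of uploading to a local Minio server rather than AWS (endpoint and credentials are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='s3 demo')
task.setup_aws_upload(
    bucket='my-bucket',
    host='localhost:9000',  # Non-AWS S3 endpoint, e.g. a local Minio server
    key='minio_key', secret='minio_secret',
    multipart=False, secure=False,
)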
setup_azure_upload
setup_azure_upload(account_name, account_key, container_name=None)
Setup Azure upload options.
Parameters
account_name (
str
) – Name of the accountaccount_key (
str
) – Secret key used to authenticate the accountcontainer_name (
Optional
[str
]) – The name of the blob container to upload to
Return type
None
setup_gcp_upload
setup_gcp_upload(bucket, subdir='', project=None, credentials_json=None, pool_connections=None, pool_maxsize=None)
Setup GCP upload options.
Parameters
bucket (
str
) – Bucket to upload tosubdir (
str
) – Subdir in bucket to upload toproject (
Optional
[str
]) – Project the bucket belongs tocredentials_json (
Optional
[str
]) – Path to the JSON file that contains the credentialspool_connections (
Optional
[int
]) – The number of urllib3 connection pools to cachepool_maxsize (
Optional
[int
]) – The maximum number of connections to save in the pool
Return type
None
started
started(ignore_errors=True, force=False)
The signal that this Task started.
Return type
()
Parameters
ignore_errors (bool ) –
force (bool ) –
status
property status
The Task’s status. To keep the Task updated, ClearML reloads the Task status information only when this value is accessed.
return str: TaskStatusEnum status
Return type
str
stopped
stopped(ignore_errors=True, force=False, status_reason=None, status_message=None)
The signal that this Task stopped.
Return type
()
Parameters
ignore_errors (bool ) –
force (bool ) –
status_reason (Optional [ str ] ) –
status_message (Optional [ str ] ) –
task_id
property task_id
Returns the current Task’s ID.
Return type
str
task_type
property task_type
Returns the current Task’s type.
Valid task types:
TaskTypes.training
(default)
TaskTypes.testing
TaskTypes.inference
TaskTypes.data_processing
TaskTypes.application
TaskTypes.monitor
TaskTypes.controller
TaskTypes.optimizer
TaskTypes.service
TaskTypes.qc
TaskTypes.custom
Return type
str
unregister_artifact
unregister_artifact(name)
Unregister (remove) a registered artifact. This removes the artifact from the watch list that ClearML uses to synchronize artifacts with the ClearML Server (backend).
Calling this method does not remove the artifact from a Task. It only stops ClearML from monitoring the artifact.
When this method is called, ClearML immediately takes the last snapshot of the artifact.
Return type
None
Parameters
name (str ) –
update_model_desc
update_model_desc(new_model_desc_file=None)
Change the Task’s model description.
Return type
()
Parameters
new_model_desc_file (Optional [ str ] ) –
update_output_model
update_output_model(model_path, name=None, comment=None, tags=None, model_name=None, iteration=None, auto_delete_file=True)
Update the Task’s output model weights file. First, ClearML uploads the file to the preconfigured output
destination (see the Task’s output.destination
property or call the setup_upload
method),
then ClearML updates the model object associated with the Task. The API call uses the URI
of the uploaded file, and other values provided by additional arguments.
Notice: A local model file will be uploaded to the task’s output_uri destination. If no output_uri was specified, the default files-server will be used to store the model file/s.
Parameters
model_path (str) – A local weights file or folder to be uploaded. If a remote URI is provided (e.g. http:// or s3:// etc.), then the URI is stored as is, without any upload
name (Optional[str]) – The updated model name. If not provided, the name is the model weights file filename without the extension.
comment (Optional[str]) – The updated model description. (Optional)
tags (Optional[Sequence[str]]) – The updated model tags. (Optional)
model_name (Optional[str]) – If provided, the model name as it will appear in the model artifactory. (Optional) Default: Task.name - name
iteration (Optional[int]) – iteration number for the current stored model (Optional)
auto_delete_file (bool) – Delete the temporary file after uploading (Optional)
True - Delete (Default)
False - Do not delete
Return type
str
Returns
The URI of the uploaded weights file. Notice: the upload is done in a background thread, while the function call returns immediately
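A minimal sketch (the weights path and model name are illustrative):
from clearml import Task

task = Task.init(project_name='examples', task_name='output model demo')
# ... training code saves weights to 'model.pt' ...
uri = task.update_output_model(model_path='model.pt', name='resnet50', iteration=100)
print('weights uploaded to:', uri)  # upload itself continues in the background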
update_parameters
update_parameters(*args, **kwargs)
Update the parameters for a Task. This method updates a complete group of key-value parameter pairs, but does not support parameter descriptions (the input is a dictionary of key-value pairs). Notice the parameter dict is flat: i.e. {‘Args/param’: ‘value’} will set the argument “param” in section “Args” to “value”
Parameters
args (dict ) – Positional arguments, which are one or more dictionaries or (key, value) iterable. They are merged into a single key-value pair dictionary.
kwargs (Any ) – Key-value pairs, merged into the parameters dictionary created from
args
.
Return type
()
update_task
update_task(task_data)
Update current task with configuration found on the task_data dictionary. See also export_task() for retrieving Task configuration.
Parameters
task_data (
dict
) – dictionary with full Task configurationReturn type
bool
Returns
return True if Task update was successful
upload_artifact
upload_artifact(name, artifact_object, metadata=None, delete_after_upload=False, auto_pickle=True, preview=None, wait_on_upload=False, extension_name=None, serialization_function=None, retries=0)
Upload (add) a static artifact to a Task object. The artifact is uploaded in the background.
The currently supported upload (static) artifact types include:
string / pathlib2.Path - A path to an artifact file. If a wildcard or a folder is specified, then ClearML creates and uploads a ZIP file.
dict - ClearML stores a dictionary as a .json (or see extension_name) file and uploads it.
pandas.DataFrame - ClearML stores a pandas.DataFrame as a .csv.gz (compressed CSV) (or see extension_name) file and uploads it.
numpy.ndarray - ClearML stores a numpy.ndarray as a .npz (or see extension_name) file and uploads it.
PIL.Image - ClearML stores a PIL.Image as a .png (or see extension_name) file and uploads it.
Any - If called with auto_pickle=True, the object will be pickled and uploaded.
Parameters
name (str ) – The artifact name.
Danger: If an artifact with the same name was previously uploaded, then it is overwritten.
artifact_object (object ) – The artifact object.
metadata (dict ) – A dictionary of key-value pairs for any metadata. This dictionary appears with the experiment in the ClearML Web-App (UI), ARTIFACTS tab.
delete_after_upload (bool) – After the upload, delete the local copy of the artifact
True - Delete the local copy of the artifact.
False - Do not delete. (default)
auto_pickle (bool ) – If True (default) and the artifact_object is not one of the following types: pathlib2.Path, dict, pandas.DataFrame, numpy.ndarray, PIL.Image, url (string), local_file (string), the artifact_object will be pickled and uploaded as pickle file artifact (with file extension .pkl)
preview (Any ) – The artifact preview
wait_on_upload (bool ) – Whether the upload should be synchronous, forcing the upload to complete before continuing.
extension_name (str ) – File extension which indicates the format the artifact should be stored as.
The following are supported, depending on the artifact type (default value applies when extension_name is None):
Any - .pkl if passed supersedes any other serialization type, and always pickles the object
dict - .json, .yaml (default .json)
pandas.DataFrame - .csv.gz, .parquet, .feather, .pickle (default .csv.gz)
numpy.ndarray - .npz, .csv.gz (default .npz)
PIL.Image - whatever extensions PIL supports (default .png)
In case the serialization_function argument is set - any extension is supported
serialization_function (Optional[Callable[[Any], Union[bytes, bytearray]]]) – A serialization function that takes one parameter of any type which is the object to be serialized. The function should return a bytes or bytearray object, which represents the serialized object. Note that the object will be immediately serialized using this function, thus other serialization methods will not be used (e.g. pandas.DataFrame.to_csv), even if possible. To deserialize this artifact when getting it using the Artifact.get method, use its deserialization_function argument.
retries (int) – Number of retries before failing to upload artifact. If 0, the upload is not retried
Return type
bool
Returns
The status of the upload.
True - Upload succeeded.
False - Upload failed.
Raise
If the artifact object type is not supported, raise a ValueError.
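A minimal sketch covering a few artifact types (names and objects are illustrative):
import numpy as np
from clearml import Task

task = Task.init(project_name='examples', task_name='upload demo')
# dict -> stored as .json by default
task.upload_artifact('config', {'lr': 0.001, 'batch': 32})
# numpy.ndarray -> stored as .npz by default; block until the upload completes
task.upload_artifact('embeddings', np.zeros((10, 8)), wait_on_upload=True)
# custom serialization; any extension is allowed when serialization_function is set
task.upload_artifact('raw', {'a': 1}, serialization_function=lambda obj: str(obj).encode('utf-8'), extension_name='.txt')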
wait_for_status
wait_for_status(status=(<TaskStatusEnum.completed: 'completed'>, <TaskStatusEnum.stopped: 'stopped'>, <TaskStatusEnum.closed: 'closed'>), raise_on_status=(<TaskStatusEnum.failed: 'failed'>, ), check_interval_sec=60.0)
Wait for a task to reach a defined status.
Parameters
status (Iterable [ Task.TaskStatusEnum ] ) – Status to wait for. Defaults to (‘completed’, ‘stopped’, ‘closed’, )
raise_on_status (Optional [ Iterable [ Task.TaskStatusEnum ] ] ) – Raise RuntimeError if the status of the tasks matches one of these values. Defaults to (‘failed’).
check_interval_sec (float ) – Interval in seconds between two checks. Defaults to 60 seconds.
Raise
RuntimeError if the status is one of the raise_on_status values.
Return type
()
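A minimal sketch of blocking until another task finishes (the task ID is illustrative):
from clearml import Task

other = Task.get_task(task_id='aabbccdd11223344')
# Blocks until the task completes/stops/closes; raises RuntimeError if it fails
other.wait_for_status(check_interval_sec=10.0)
other.reload()  # refresh the local copy of the task fields
print(other.status)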