ClearmlJob

class automation.ClearmlJob()

Create a new Task based on a base_task_id with a different set of parameters

  • Parameters

    • base_task_id (str ) – base task ID to clone from

    • parameter_override (dict ) – dictionary of parameters and values to set for the cloned task

    • task_overrides (dict ) – Task object specific overrides, for example {'script.version_num': None, 'script.branch': 'main'}

    • configuration_overrides (Optional [ Mapping [ str , Union [ str , Mapping ] ] ] ) – Optional, override Task configuration objects. Expected dictionary of configuration object name and configuration object content. Examples:

      {'config_section': dict(key='value')}
      {'config_file': 'configuration file content'}
      {'OmegaConf': YAML.dumps(full_hydra_dict)}

    • tags (list ) – additional tags to add to the newly cloned task

    • parent (str ) – Set the parent task field of the newly created Task, default: base_task_id.

    • kwargs (dict ) – additional Task creation parameters

    • disable_clone_task (bool ) – If False (default), clone the base task. If True, use the base_task_id directly (the base task must be in draft mode / created).

    • allow_caching (bool ) – If True, check if we have a previously executed Task with the same specification. If we do, use it and set internal is_cached flag. Default False (always create new Task).

    • output_uri (Union [ str , bool ] ) – The storage / output URL for this job. This is the default location for output models and other artifacts. Check the Task.init reference docs for more info (output_uri is a parameter).

    • target_project (str ) – Optional, Set the target project name to create the cloned Task in.
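As a rough sketch of how the constructor arguments above fit together, the helper below assembles a few of them and passes them to ClearmlJob. The helper names, the task ID passed in, the `General/` section prefix on the parameter name, and the `sweep` tag are all illustrative assumptions. The import is deferred into the function because it needs the clearml package and a configured server.

```python
from typing import Any, Dict


def build_job_kwargs(base_task_id: str, learning_rate: float) -> Dict[str, Any]:
    """Collect ClearmlJob constructor arguments (argument names as documented above)."""
    return {
        "base_task_id": base_task_id,
        # Assumption: hyperparameters are addressed as "<section>/<name>".
        "parameter_override": {"General/learning_rate": learning_rate},
        # Example values taken from the task_overrides description above.
        "task_overrides": {"script.version_num": None, "script.branch": "main"},
        "tags": ["sweep"],  # hypothetical tag
        "allow_caching": True,  # reuse a previously executed identical Task if one exists
    }


def create_job(base_task_id: str, learning_rate: float):
    """Create the job; requires clearml to be installed and configured."""
    from clearml.automation import ClearmlJob  # deferred import

    return ClearmlJob(**build_job_kwargs(base_task_id, learning_rate))
```

Keeping the argument dictionary separate from the constructor call makes it easy to inspect or log what will be overridden before any Task is cloned.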


abort

abort()

Abort currently running job (can be called multiple times)

  • Return type

    ()


delete

delete()

Delete the current temporary job (before launching). Return False if the Job/Task could not be deleted.

  • Return type

    bool


elapsed

elapsed()

Return the time in seconds since job started. Return -1 if job is still pending

  • Return type

    float

  • Returns

    Seconds from start.
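Because elapsed() uses -1 as a sentinel for a pending job, callers usually need to check for it before formatting. A minimal sketch, assuming `job` is any object exposing the elapsed() method described above (the helper name is hypothetical):

```python
def format_elapsed(job) -> str:
    """Render job.elapsed() as 'XmYYs', or 'pending' for the -1 sentinel."""
    seconds = job.elapsed()
    if seconds < 0:
        # Per the description above, -1 means the job is still pending.
        return "pending"
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m{secs:02d}s"
```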


get_console_output

get_console_output(number_of_reports=1)

Return a list of console outputs reported by the Task. The returned outputs are the most recent ones.

  • Parameters

    number_of_reports (int ) – number of reports to return. The default, 1, returns only the last (most recent) console output

  • Return type

    Sequence[str]

  • Returns

    List of strings; each entry corresponds to one report.
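One possible way to use the returned sequence is to join the last few reports into a single block of text for logging. This sketch assumes `job` exposes get_console_output() as documented above; the helper name and the default of three reports are arbitrary choices.

```python
def tail_console(job, reports: int = 3) -> str:
    """Join the most recent console reports into one string."""
    chunks = job.get_console_output(number_of_reports=reports)
    # Each entry is one report; strip trailing newlines so joins stay tidy.
    return "\n".join(chunk.rstrip("\n") for chunk in chunks)
```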


get_metric

get_metric(title, series)

Retrieve a specific scalar metric from the running Task.

  • Parameters

    • title (str ) – Graph title (metric)

    • series (str ) – Series on the specific graph (variant)

  • Return type

    (float, float, float)

  • Returns

    A tuple of min value, max value, last value
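The returned tuple can be unpacked directly into its three values. A minimal sketch, assuming `job` exposes get_metric() as documented above; the helper name and the output format are illustrative only.

```python
def summarize_metric(job, title: str, series: str) -> str:
    """Format the (min, max, last) tuple returned by get_metric()."""
    min_value, max_value, last_value = job.get_metric(title, series)
    return (
        f"{title}/{series}: "
        f"min={min_value:.4f} max={max_value:.4f} last={last_value:.4f}"
    )
```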


is_aborted

is_aborted()

Return True if the job was executed and aborted

  • Return type

    bool

  • Returns

    True if the task is currently in the aborted state


is_cached_task

is_cached_task()

  • Return type

    bool

  • Returns

    True if the internal Task is a cached one, False otherwise.


is_completed

is_completed()

Return True, if job was executed and completed successfully

  • Return type

    bool

  • Returns

    True if the task is currently in the completed or published state


is_failed

is_failed()

Return True if the job has executed and failed

  • Return type

    bool

  • Returns

    True if the task is currently in the failed state


is_pending

is_pending()

Return True, if job is waiting for execution

  • Return type

    bool

  • Returns

    True if the task is currently queued.


is_running

is_running()

Return True, if job is currently running (pending is considered False)

  • Return type

    bool

  • Returns

    True, if the task is currently in progress.


is_stopped

is_stopped(aborted_nonresponsive_as_running=False)

Return True, if job finished executing (for any reason)

  • Parameters

    aborted_nonresponsive_as_running (bool) – (default: False) If True, ignore the stopped state if the backend non-responsive watchdog sets this Task to stopped. This scenario could happen if an instance running the job is killed without warning (e.g. spot instances)

  • Return type

    bool

  • Returns

    True if the task is currently in one of these states: stopped / completed / failed / published.
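The boolean predicates above (is_pending, is_running, is_failed, is_aborted, is_completed, is_stopped) can be folded into a single label for reporting. The sketch below assumes `job` exposes those methods as documented. The helper name, the labels, and the order of the checks are illustrative choices, not part of the library.

```python
def describe_state(job) -> str:
    """Map the job's boolean state predicates to one human-readable label."""
    if job.is_pending():
        return "pending"
    if job.is_running():
        return "running"
    # Check the specific end states before the catch-all is_stopped().
    if job.is_failed():
        return "failed"
    if job.is_aborted():
        return "aborted"
    if job.is_completed():
        return "completed"
    if job.is_stopped():
        return "stopped"
    return "not started"
```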


iterations

iterations()

Return the last iteration value of the current job. -1 if job has not started yet

  • Return type

    int

  • Returns

    Task last iteration.


launch

launch(queue_name=None)

Send Job for execution on the requested execution queue

  • Parameters

    queue_name (Optional [ str ] ) – name of the requested execution queue

  • Return type

    bool

  • Returns

    False if the Task is not in "created" status (i.e. cannot be enqueued) or could not be enqueued.
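Since launch() signals failure through its boolean return value, it is worth checking it explicitly. A minimal sketch, assuming `job` exposes launch() and task_id() as documented on this page; the helper name and the queue name `default` are hypothetical.

```python
def launch_or_report(job, queue_name: str = "default") -> bool:
    """Enqueue the job and print a message if it could not be enqueued."""
    launched = job.launch(queue_name=queue_name)
    if not launched:
        # launch() returns False when the Task is not in "created" status
        # or the enqueue request itself failed.
        print(f"Job {job.task_id()} could not be enqueued on '{queue_name}'")
    return launched
```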


ClearmlJob.register_hashing_callback

classmethod register_hashing_callback(a_function)

Customize the dict used for hashing the Task. The provided function is called with a dict representing a Task and can return a modified version of that representation dict.

  • Parameters

    a_function (Callable[[dict], dict]) – Function manipulating the representation dict of a Task

  • Return type

    None
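A hashing callback is just a function from dict to dict. The sketch below removes one field so that it no longer influences the hash. The key name `comment` is purely hypothetical, since the layout of the representation dict is not described on this page, and the helper names are illustrative. Registration is wrapped in a function with a deferred import because it requires the clearml package.

```python
import copy


def drop_volatile_fields(task_repr: dict) -> dict:
    """Return a copy of the representation dict without fields to ignore when hashing."""
    cleaned = copy.deepcopy(task_repr)  # leave the caller's dict untouched
    # Hypothetical key: inspect the real dict before deciding what to drop.
    cleaned.pop("comment", None)
    return cleaned


def register_callback() -> None:
    """Register the callback; requires clearml to be installed."""
    from clearml.automation import ClearmlJob  # deferred import

    ClearmlJob.register_hashing_callback(drop_volatile_fields)
```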


started

started()

Return True, if job already started, or ended. False, if created/pending.

  • Return type

    bool

  • Returns

    False, if the task is currently in draft mode or pending.


status

status(force=False)

Return the Job Task current status. Options are: “created”, “queued”, “in_progress”, “stopped”, “published”, “publishing”, “closed”, “failed”, “completed”, “unknown”.

  • Parameters

    force (bool) – Force status update, otherwise, only refresh state every 1 sec

  • Return type

    str

  • Returns

    Task status (a Task.TaskStatusEnum value) as a string.


status_message

status_message()

Gets the status message of the task. Note that the message is updated only after BaseJob.status() is called

  • Return type

    str

  • Returns

    The status message of the corresponding task as a string


task_id

task_id()

Return the Task id.

  • Return type

    str

  • Returns

    The Task ID.


ClearmlJob.update_status_batch

classmethod update_status_batch(jobs)

Update the status of the given jobs, in batches

  • Parameters

    jobs (Sequence [ BaseJob ] ) – The jobs to update the status of

  • Return type

    ()
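When tracking many jobs, one plausible pattern is to refresh all statuses in one call and then tally them. In this sketch the `refresh` argument is intended to receive `ClearmlJob.update_status_batch`. It is passed in as a parameter so the helper itself does not import clearml. The helper name is hypothetical.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


def count_by_status(jobs: Sequence, refresh: Optional[Callable] = None) -> Counter:
    """Optionally refresh all jobs in one call, then count jobs per status string."""
    if refresh is not None:
        refresh(jobs)  # e.g. ClearmlJob.update_status_batch
    return Counter(job.status() for job in jobs)
```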


wait

wait(timeout=None, pool_period=30.0, aborted_nonresponsive_as_running=False)

Wait until the task is fully executed (i.e., aborted/completed/failed)

  • Parameters

    • timeout (Optional[float]) – maximum time (minutes) to wait for Task to finish

    • pool_period (float) – check task status every pool_period seconds

    • aborted_nonresponsive_as_running (bool) – (default: False) If True, ignore the stopped state if the backend non-responsive watchdog sets this Task to stopped. This scenario could happen if an instance running the job is killed without warning (e.g. spot instances)

  • Return type

    bool

  • Returns

    True, if Task finished.
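Note that timeout is given in minutes while pool_period is given in seconds. The sketch below waits with a timeout and aborts the job if it has not finished by then. Aborting on timeout is a design choice made for this example, not something wait() does itself, and the helper name is hypothetical. It assumes `job` exposes wait() and abort() as documented on this page.

```python
def wait_or_abort(job, timeout_minutes: float, poll_seconds: float = 30.0) -> bool:
    """Wait up to timeout_minutes; abort the job if it has not finished by then."""
    finished = job.wait(timeout=timeout_minutes, pool_period=poll_seconds)
    if not finished:
        # wait() returned without the Task finishing, so stop it explicitly.
        job.abort()
    return finished
```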


worker

worker()

Return the current worker ID executing this Job. If job is pending, returns None

  • Return type

    Optional[str]

  • Returns

    ID of the worker executing / executed the job, or None if job is still pending.