OptimizerOptuna

class automation.optuna.OptimizerOptuna()#

Initialize an Optuna search strategy optimizer. Optuna performs robust and efficient hyperparameter optimization at scale by combining efficient searching of the parameter space with pruning of unpromising trials. A specific hyperparameter sampling and pruning strategy can be selected via the sampler and pruner arguments.

  • Parameters

    • base_task_id (str) – The base Task ID (str)

    • hyper_parameters (list) – List of Parameter objects to optimize over

    • objective_metric (Objective) – Objective metric to maximize / minimize

    • execution_queue (str) – Execution queue to use for launching Tasks (experiments)

    • num_concurrent_workers (int) – Limit on the number of concurrently running Tasks (machines)

    • max_iteration_per_job (int) – Maximum number of iterations per job. 'Iterations' are the reported iterations for the specified objective metric, not the maximum reported iteration of the Task.

    • total_max_jobs (int) – Total maximum number of jobs for the optimization process. Must be provided in order to calculate the total budget for the optimization process. The total budget is measured in "iterations" (see above) and is set to max_iteration_per_job * total_max_jobs. This means more than total_max_jobs jobs could be created, as long as the cumulative iterations (summed over all created jobs) do not exceed max_iteration_per_job * total_max_jobs.

    • pool_period_min (float) – Time in minutes between two consecutive polls

    • min_iteration_per_job (int) – Optional, the minimum number of iterations (of the objective metric) per single job, before the job may be early-stopped

    • time_limit_per_job (float) – Optional, maximum execution time per single job in minutes. When the time limit is exceeded, the job is aborted.

    • compute_time_limit (float) – Optional, the maximum compute time in minutes. When the time limit is exceeded, all jobs are aborted.

    • optuna_kwargs (Any) – Arguments passed directly to the Optuna object

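The iteration budget described for total_max_jobs can be sketched in plain arithmetic. This is an illustration of the documented rule only, not ClearML code; the function name budget_exhausted is a placeholder.

```python
# Sketch of the documented iteration-budget rule: the total budget is
# max_iteration_per_job * total_max_jobs, and jobs keep being created while
# the cumulative reported iterations stay within it.
def budget_exhausted(completed_job_iterations, max_iteration_per_job, total_max_jobs):
    """Return True once cumulative reported iterations reach the budget."""
    total_budget = max_iteration_per_job * total_max_jobs
    return sum(completed_job_iterations) >= total_budget

# Budget: 100 * 10 = 1000 iterations. Twelve short jobs (80 iterations each)
# still fit inside the budget even though 12 > total_max_jobs.
print(budget_exhausted([80] * 12, max_iteration_per_job=100, total_max_jobs=10))  # False (960 < 1000)
```

This is why the docstring notes that more than total_max_jobs jobs can be created: the cap is on cumulative iterations, not on the job count itself.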

create_job#

create_job()

Abstract helper function. Implementation is not required; the default is used in the process_step default implementation. Create a new job if needed, and return the newly created job. If no job needs to be created, return None.

  • Return type

    Optional[ClearmlJob]

  • Returns

A newly created ClearmlJob object, or None if no ClearmlJob was created.


get_created_jobs_ids#

get_created_jobs_ids()

Return a dict of the Task IDs created by this optimizer until now, including completed and running jobs. The values of the returned dict are the parameters used in each specific job.

  • Return type

    Mapping[str, dict]

  • Returns

    dict of task IDs (str) as keys, and their parameters dict as values.
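Consuming the documented return structure looks like the sketch below. The created_jobs dict is sample data in the documented shape (Task ID keys, parameter dict values), not an actual call into a running optimizer; the parameter names are illustrative.

```python
# Sample data in the shape documented for get_created_jobs_ids():
# Task IDs (str) as keys, the job's parameter dict as values.
created_jobs = {
    "0593b76dc7234c65a13a301f731958fa": {"General/lr": "0.03", "General/batch_size": "32"},
    "1a2b3c4d5e6f708192a3b4c5d6e7f809": {"General/lr": "0.001", "General/batch_size": "64"},
}

# Iterate over every job this optimizer created and inspect its parameters.
for task_id, params in created_jobs.items():
    print(task_id, "->", params.get("General/lr"))
```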


get_created_jobs_tasks#

get_created_jobs_tasks()

Return a dict of the Task IDs created by this optimizer until now. The values of the returned dict are the ClearmlJob objects.

  • Return type

    Mapping[str, dict]

  • Returns

    dict of task IDs (str) as keys, and their ClearmlJob as values.


get_objective_metric#

get_objective_metric()

Return the metric title, series pair of the objective.

  • Return type

    (str, str)

  • Returns

    (title, series)


get_running_jobs#

get_running_jobs()

Return the currently running ClearmlJob objects.

  • Return type

    Sequence[ClearmlJob]

  • Returns

    List of ClearmlJob objects.


get_top_experiments#

get_top_experiments(top_k)

Return a list of Tasks of the top performing experiments, based on the controller Objective object.

  • Parameters

top_k (int) – The number of Tasks (experiments) to return.

  • Return type

    Sequence[Task]

  • Returns

    A list of Task objects, ordered by performance, where index 0 is the best performing Task.


get_top_experiments_details#

get_top_experiments_details(top_k, all_metrics=False, all_hyper_parameters=False, only_completed=False)

Return a list of dictionaries of the top performing experiments. Order is based on the controller Objective object. Example:

[
    {'task_id': Task-ID, 'metrics': scalar-metric-dict, 'hyper_parameters': Hyper-Parameters},
]

  • Parameters

    • top_k (int) – The number of Tasks (experiments) to return.

    • all_metrics (bool) – Default False, return only the objective metric in the metrics dictionary. If True, return all scalar metrics of the experiment.

    • all_hyper_parameters (bool) – Default False. If True, return all the hyperparameters from all the sections.

    • only_completed (bool) – Return only completed Tasks. Default False.

  • Return type

    Sequence[dict]

  • Returns

    A list of dictionaries ({task_id: '', hyper_parameters: {}, metrics: {}}), ordered by performance, where index 0 is the best performing Task. Example with all_metrics=False:

    [
        {
            'task_id': '0593b76dc7234c65a13a301f731958fa',
            'hyper_parameters': {'General/lr': '0.03', 'General/batch_size': '32'},
            'metrics': {
                'accuracy per class/cat': {
                    'metric': 'accuracy per class',
                    'variant': 'cat',
                    'value': 0.119,
                    'min_value': 0.119,
                    'max_value': 0.782
                },
            }
        },
    ]

Example with all_metrics=True:

[
    {
        'task_id': '0593b76dc7234c65a13a301f731958fa',
        'hyper_parameters': {'General/lr': '0.03', 'General/batch_size': '32'},
        'metrics': {
            'accuracy per class/cat': {
                'metric': 'accuracy per class',
                'variant': 'cat',
                'value': 0.119,
                'min_value': 0.119,
                'max_value': 0.782
            },
            'accuracy per class/deer': {
                'metric': 'accuracy per class',
                'variant': 'deer',
                'value': 0.219,
                'min_value': 0.219,
                'max_value': 0.282
            },
        }
    },
]
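A common follow-up is picking the best run's hyperparameters out of this structure. The sketch below uses sample data copied from the documented shape (index 0 is the best performing Task), not a live call to get_top_experiments_details().

```python
# 'top' mirrors the documented return structure of
# get_top_experiments_details() with all_metrics=False.
top = [
    {
        "task_id": "0593b76dc7234c65a13a301f731958fa",
        "hyper_parameters": {"General/lr": "0.03", "General/batch_size": "32"},
        "metrics": {
            "accuracy per class/cat": {
                "metric": "accuracy per class",
                "variant": "cat",
                "value": 0.119,
                "min_value": 0.119,
                "max_value": 0.782,
            },
        },
    },
]

# The list is ordered by performance, so the best Task is simply index 0.
best = top[0]
print(best["task_id"], best["hyper_parameters"]["General/lr"])
```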


get_top_experiments_id_metrics_pair#

get_top_experiments_id_metrics_pair(top_k, all_metrics=False, only_completed=False)

Return a list of pairs (Task ID, scalar metric dict) of the top performing experiments. Order is based on the controller Objective object.

  • Parameters

    • top_k (int) – The number of Tasks (experiments) to return.

    • all_metrics (bool) – Default False, return only the objective metric in the metrics dictionary. If True, return all scalar metrics of the experiment.

    • only_completed (bool) – Return only completed Tasks. Default False.

  • Return type

    Sequence[(str, dict)]

  • Returns

    A list of pairs (Task ID, metric values dict), ordered by performance, where index 0 is the best performing Task. Example with all_metrics=False:

[
    ('0593b76dc7234c65a13a301f731958fa',
     {
         'accuracy per class/cat': {
             'metric': 'accuracy per class',
             'variant': 'cat',
             'value': 0.119,
             'min_value': 0.119,
             'max_value': 0.782
         },
     }
    ),
]

Example with all_metrics=True:

[
    ('0593b76dc7234c65a13a301f731958fa',
     {
         'accuracy per class/cat': {
             'metric': 'accuracy per class',
             'variant': 'cat',
             'value': 0.119,
             'min_value': 0.119,
             'max_value': 0.782
         },
         'accuracy per class/deer': {
             'metric': 'accuracy per class',
             'variant': 'deer',
             'value': 0.219,
             'min_value': 0.219,
             'max_value': 0.282
         },
     }
    ),
]
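Because the pairs are (Task ID, metrics dict), they convert directly into a lookup table keyed by Task ID. The pairs variable below is sample data in the documented shape, not a live call to get_top_experiments_id_metrics_pair().

```python
# Sample data mirroring the documented (Task ID, metric values dict) pairs.
pairs = [
    ("0593b76dc7234c65a13a301f731958fa",
     {"accuracy per class/cat": {"metric": "accuracy per class",
                                 "variant": "cat",
                                 "value": 0.119,
                                 "min_value": 0.119,
                                 "max_value": 0.782}}),
]

# A list of 2-tuples converts directly into a dict keyed by Task ID.
metrics_by_task = dict(pairs)
best_task_id = pairs[0][0]  # index 0 is the best performing Task
print(metrics_by_task[best_task_id]["accuracy per class/cat"]["value"])
```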


helper_create_job#

helper_create_job(base_task_id, parameter_override=None, task_overrides=None, tags=None, parent=None, **kwargs)

Create a Job using the specified arguments; see ClearmlJob for details.

  • Return type

    ClearmlJob

  • Returns

    A newly created Job instance.

  • Parameters

    • base_task_id (str) –

    • parameter_override (Optional[Mapping[str, str]]) –

    • task_overrides (Optional[Mapping[str, str]]) –

    • tags (Optional[Sequence[str]]) –

    • parent (Optional[str]) –

    • kwargs (Any) –


monitor_job#

monitor_job(job)

Helper function; implementation is not required. The default is used in the process_step default implementation. Check whether the job needs to be aborted or has already completed.

If this returns False, the job was aborted / completed and should be taken off the current job list.

If there is a budget limitation, this call should update self.budget.compute_time.update / self.budget.iterations.update.

  • Parameters

job (ClearmlJob) – A ClearmlJob object to monitor.

  • Return type

    bool

  • Returns

    False, if the job is no longer relevant.


process_step#

process_step()

Abstract helper function. Implementation is not required; the default is used in the start default implementation. This is the main optimization loop, called from the daemon thread created by start.

  • Call monitor_job on every ClearmlJob in jobs:

    • Check the performance or elapsed time, and then decide whether to kill the job.
  • Call create_job:

    • Check if spare job slots exist, and if they do, create a new job based on previously tested experiments.
  • Return type

    bool

  • Returns

True to continue the optimization; False to stop immediately.
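The two-phase loop described by the bullets above can be sketched with stub objects. StubJob and the monitor_job / create_job functions below are illustrative placeholders standing in for ClearmlJob and the optimizer's own methods, not the ClearML implementations.

```python
# Minimal sketch of the process_step loop: drop finished jobs, then fill a
# spare slot with a new job. All names here are illustrative stand-ins.
class StubJob:
    def __init__(self, done=False):
        self.done = done

def monitor_job(job):
    # Mirrors the documented contract: False means the job was aborted or
    # completed and should be taken off the current job list.
    return not job.done

def create_job(max_slots, running):
    # Only create a new job if a spare job slot exists.
    return StubJob() if len(running) < max_slots else None

def process_step(jobs, max_slots=2):
    jobs = [j for j in jobs if monitor_job(j)]   # phase 1: monitor every job
    new_job = create_job(max_slots, jobs)        # phase 2: fill a spare slot
    if new_job is not None:
        jobs.append(new_job)
    return jobs

# One finished job is dropped, one new job is created in its place.
remaining = process_step([StubJob(done=True), StubJob()], max_slots=2)
print(len(remaining))  # 2
```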


set_job_class#

set_job_class(job_class)

Set the class to use for the helper_create_job function.

  • Parameters

job_class (ClearmlJob) – The Job class type.

  • Return type

    ()


set_job_default_parent#

set_job_default_parent(job_parent_task_id, project_name=None)

Set the default parent for all Jobs created by the helper_create_job method.

  • Parameters

    • job_parent_task_id (str) – The parent Task ID.

    • project_name (str) – If specified, create the jobs in the specified project.

  • Return type

    ()


set_job_naming_scheme#

set_job_naming_scheme(naming_function)

Set the function used to name a newly created job.

  • Parameters

naming_function (callable) – A function of the form naming_functor(base_task_name, argument_dict) -> str

  • Return type

    ()


set_optimizer_task#

set_optimizer_task(task)

Set the optimizer task object to be used to store/generate reports on the optimization process. Usually this is the current task of this process.

  • Parameters

task (Task) – The optimizer's current Task.

  • Return type

    ()


start#

start()

Start the Optimizer controller loop. If the calling process is stopped, the controller will stop as well.

important

This function returns only after optimization is completed or stop was called.

  • Return type

    ()


stop#

stop()

Stop the currently running optimization loop. Call this from a different thread than the one that called start.

  • Return type

    ()
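The start/stop contract above (start blocks until stop is called from another thread) can be demonstrated with a stand-in object. DummyOptimizer below only illustrates the threading pattern; it is not the ClearML class, which additionally runs the optimization loop while blocked.

```python
import threading
import time

# Stand-in that reproduces only the documented blocking behavior:
# start() returns when stop() is called from another thread.
class DummyOptimizer:
    def __init__(self):
        self._stopped = threading.Event()

    def start(self):
        # Blocks, like the real start(), until a stop is requested.
        self._stopped.wait()

    def stop(self):
        self._stopped.set()

optimizer = DummyOptimizer()

# Schedule stop() from another thread after 0.1 seconds, as the docs require.
threading.Timer(0.1, optimizer.stop).start()

t0 = time.monotonic()
optimizer.start()   # returns only after stop() was called
elapsed = time.monotonic() - t0
print(elapsed >= 0.05)  # True: start() blocked until the timer fired
```

With the real optimizer, the same pattern applies: either call stop() from a watchdog thread, or rely on the compute_time_limit / time_limit_per_job arguments to bound the run.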