Skip to main content

Setup

The following page goes over how to set up and upgrade clearml-serving.

Prerequisites#

  • ClearML-Server : Model repository, Service Health, Control plane
  • Kubernetes / Single-instance Machine : Deploying containers
  • CLI : Configuration & model deployment interface

Initial Setup#

  1. Set up your ClearML Server or use the free hosted service

  2. Connect clearml SDK to the server, see instructions here

  3. Install clearml-serving CLI:

    pip3 install clearml-serving
  4. Create the Serving Service Controller:

    clearml-serving create --name "serving example"

    The new serving service UID should be printed

    New Serving Service created: id=aa11bb22aa11bb22

    Write down the Serving Service UID

  5. Clone the clearml-serving repository:

    git clone https://github.com/allegroai/clearml-serving.git
  6. Edit the environment variables file (docker/example.env) with your clearml-server credentials and Serving Service UID. For example, you should have something like

    cat docker/example.env
    CLEARML_WEB_HOST="https://app.clear.ml"
    CLEARML_API_HOST="https://api.clear.ml"
    CLEARML_FILES_HOST="https://files.clear.ml"
    CLEARML_API_ACCESS_KEY="<access_key_here>"
    CLEARML_API_SECRET_KEY="<secret_key_here>"
    CLEARML_SERVING_TASK_ID="<serving_service_id_here>"
  7. Spin up the clearml-serving containers with docker-compose (or if running on Kubernetes, use the helm chart)

    cd docker && docker-compose --env-file example.env -f docker-compose.yml up

    If you need Triton support (keras/pytorch/onnx etc.), use the triton docker-compose file

    cd docker && docker-compose --env-file example.env -f docker-compose-triton.yml up

    If running on a GPU instance w/ Triton support (keras/pytorch/onnx etc.), use the triton gpu docker-compose file:

    cd docker && docker-compose --env-file example.env -f docker-compose-triton-gpu.yml up
note

Any model that registers with Triton engine will run the pre/post processing code on the Inference service container, and the model inference itself will be executed on the Triton Engine container.

Advanced Setup - S3/GS/Azure Access (Optional)#

To add access credentials and allow the inference containers to download models from your S3/GS/Azure object-storage, add the respective environment variables to your env files (example.env). See further details on configuring the storage access here.

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
GOOGLE_APPLICATION_CREDENTIALS
AZURE_STORAGE_ACCOUNT
AZURE_STORAGE_KEY

Upgrading ClearML Serving#

Upgrading to v1.1

  1. Take down the serving containers (docker-compose or k8s)
  2. Update the clearml-serving CLI pip3 install -U clearml-serving
  3. Re-add a single existing endpoint with clearml-serving model add ... (press yes when asked). It will upgrade the clearml-serving session definitions
  4. Pull the latest serving containers (docker-compose pull ... or k8s)
  5. Re-spin serving containers (docker-compose or k8s)

Tutorial#

For further details, see the ClearML Serving Tutorial.