By Adam Wolf
This blog post covers ClearML Vaults and how they enforce AI infrastructure policies within an organization. It accompanies our Enterprise AI Infrastructure Security YouTube series; watch the corresponding video below.
When Policy Lives in Documentation, It Gets Ignored
Consider three incidents that happen on real AI teams every week. An engineer saves model artifacts to their personal Azure account because it was faster to set up than requesting access to the approved bucket. A data scientist pulls a foundation model directly from Hugging Face, bypassing the internal model registry. Someone pins a fine-tuning job to pytorch:latest, an image with an unpatched CVE, and shares the config with the GenAI team. Now the entire LLM development pipeline is running on a vulnerable container. None of them did this maliciously. They were just trying to get their work done. The policy existed – it just wasn’t enforced.
ClearML Vaults solve this by moving enforcement from documentation to the platform itself. A vault is a configuration block that the ClearML server applies to every task at runtime before a single line of user code executes. Whether a task runs locally via the SDK or remotely via the ClearML Agent, the vault config is pulled from the server and merged on top of the local clearml.conf file. Users don’t trigger this explicitly. It just happens.
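To make this concrete, a vault holds configuration in the same HOCON syntax as clearml.conf. The fragment below is an illustrative sketch of what an admin might put in one; the bucket, registry, and credential values are placeholders, not real defaults:

```
# Illustrative vault contents, written in clearml.conf (HOCON) syntax.
# All values are placeholders.
sdk {
  development {
    # Send all artifacts and models to the approved bucket
    default_output_uri: "s3://approved-ml-artifacts/"
  }
  aws {
    s3 {
      # Storage credentials managed centrally, not on user laptops
      key: "AKIA...EXAMPLE"
      secret: "****"
    }
  }
}
agent {
  # Pinned, vetted container image for every task in the group
  default_docker {
    image: "registry.internal/ml-base:3.9-cu121"
  }
}
```

Because this block is merged on top of each user's local clearml.conf at runtime, the values here win without anyone editing their local file.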

What Goes in a Vault
A vault can hold anything that belongs in a clearml.conf file, organized across four main areas:
- Storage: the default output URI where artifacts and models land (on-prem, S3, Azure Blob, GCS), plus the storage credentials themselves. The vault controls both where data goes and how the platform authenticates to get there.
- Compute: the Docker image and version your tasks run in. Admins can set a default image, pin specific versions, and configure Docker arguments. ClearML Enterprise adds match rules, a lookup table that selects the right container based on task requirements: if a task declares Python 3.9, use this image; if it needs TensorFlow 2.6, use that one. There’s also an override flag that prevents users from substituting their own image entirely.
- Credentials: Git tokens for private repos, API tokens for external services. Credentials are layered: an admin sets platform-wide defaults, group vaults add team-specific config on top, and individual users can extend further. Rotating a credential happens once in the vault: every subsequent task across every user in the group picks it up automatically.
- SDK and environment settings: environment variables (HuggingFace tokens, proxy settings, database credentials), auto-generated config files injected at specific paths at runtime, and package manager config including internal PyPI mirrors for air-gapped environments.
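The match-rule behavior described above boils down to a requirements-to-image lookup. Here is a minimal sketch of that idea; the rule table, image names, and function are invented for illustration and are not ClearML's actual rule format:

```python
# Illustrative sketch of container match rules: choose an image based on
# declared task requirements. Rule format and image names are invented.
RULES = [
    # (requirement key, required value, image to use)
    ("python", "3.9", "registry.internal/py39-base:latest"),
    ("tensorflow", "2.6", "registry.internal/tf26-gpu:latest"),
]
DEFAULT_IMAGE = "registry.internal/ml-base:latest"

def select_image(requirements: dict) -> str:
    """Return the first image whose rule matches the task's requirements."""
    for key, value, image in RULES:
        if requirements.get(key) == value:
            return image
    # No rule matched: fall back to the admin's default image
    return DEFAULT_IMAGE

# A task declaring Python 3.9 gets the pinned py39 image
print(select_image({"python": "3.9"}))
```

The point is that the user never picks the container; the platform resolves it from what the task declares.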
Vaults aren’t a subset of the runtime environment; they are the runtime environment. Storage, compute, credentials, and environment config are all enforced centrally.

Personal Vaults vs. Administrator Vaults
ClearML has two vault types. A personal vault belongs to an individual user (their credentials, their storage paths, their environment variables). It’s useful for personal workflow preferences, and it applies only to that user’s own tasks.
An administrator vault is created by a platform admin and assigned to a user group. Every task run by any member of that group gets the admin vault applied. The critical rule: the admin vault always wins. Where the admin vault and a personal vault conflict, the admin vault overrides, and the user’s personal vault UI will show a notification listing exactly which fields have been overridden.
This applies to service accounts too. Any admin vault scoped to a group governs service accounts in that group, so automated pipelines and agents are held to the same policies as human users.
The layering goes three levels deep: platform admin → group → user. Each layer can add configuration but cannot override the layer above it.
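That precedence order can be sketched as a merge in which higher layers win. This is an illustration of the observable behavior only, not ClearML's internal implementation; the keys and values are made up:

```python
# Illustration of vault layering: platform admin > group > user.
# Lower layers can add keys, but cannot override a higher layer's values.
def apply_layers(admin: dict, group: dict, user: dict) -> dict:
    effective = {}
    # Apply the lowest-precedence layer first, then let each higher
    # layer overwrite any conflicting keys.
    for layer in (user, group, admin):
        effective.update(layer)
    return effective

admin = {"docker_image": "registry.internal/ml-base:3.9"}
group = {"git_token": "group-token", "docker_image": "team-image"}
user = {"docker_image": "python:latest", "output_uri": "azure://bobs-blob/"}

print(apply_layers(admin, group, user))
```

Here docker_image ends up as the admin's pinned image regardless of what the group or user set, while keys only the user defined (output_uri) survive untouched.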

What This Looks Like in Practice
Take a user; call him Bob. Bob has a personal vault with his own Azure Blob credentials, a Git token for his private repo, and a base Docker image set to python:latest. His code calls set_base_docker("python:latest") and submits to the llm_serving queue. Without an admin vault, the task runs exactly as configured: Bob’s storage, Bob’s container, Bob’s credentials.
An admin creates an administrator vault assigned to Bob’s group with three things: the approved encrypted Azure storage location, a vetted and pinned container image, and centrally managed Git credentials. Bob’s personal vault can be left in place or disabled – either way, the admin vault config takes precedence.
Bob resubmits the same code, unchanged. The task now writes to the approved storage bucket. It runs in the admin’s pinned image, not python:latest. The Git credentials injected by the platform are used to clone the repo. Bob’s personal settings were overridden silently, and the task ran compliantly without Bob changing anything.
set_base_docker("python:latest") was still in Bob’s code. The task ran on the admin’s approved image. That’s vault enforcement.

Vaults in the ClearML Security Model
Vaults are the third layer in ClearML’s enterprise security architecture. The first layer covers authentication and identity (SSO, SAML, LDAP, and identity provider integration). The second handles access control: which groups can see which projects, queues, and resources. Vaults are the third: once a user is authenticated and authorized, vaults govern the environment their tasks actually run in.
Each layer addresses a different attack surface. Identity controls who gets in. Access rules control what they can reach. Vaults control what happens when they execute. Together, they close the loop: a user can only access what they’re permitted to see, and when they run something, it runs in the environment the platform defines, not whatever configuration they happened to assemble on their laptop.
Closing
ClearML Vaults turn the compliance model on its head. Instead of asking teams to remember and follow policy, the platform simply applies it, on every task, for every user, without any changes to how people write or submit their code. The result is a security posture that scales with your team rather than degrading as it grows.
If your organization is running AI workloads across multiple teams and environments, vaults are the mechanism that keeps storage, compute, and credentials consistent and auditable, regardless of what any individual user happens to have configured locally. Set it once, enforce it everywhere. Previous videos in our Enterprise AI Security series cover IdP and RBAC. Watch them on YouTube here. Get in touch if you would like to discuss how ClearML can support your organization’s AI infrastructure policy requirements.