As organizations expand their AI initiatives, they increasingly need to provide users, be they data scientists, AI/ML engineers, researchers, or application developers, with secure access to interactive development environments such as JupyterLab, VS Code, or other internal tools. These environments must be delivered reliably across a wide range of deployment scenarios: single-host Docker Compose setups, hybrid on-prem/cloud installations, private bare-metal clusters, and fully container-orchestrated Kubernetes environments.
Across all of these contexts, teams face the same foundational challenge: how to expose dynamic, per-user development sessions safely and consistently, without exposing infrastructure internals, compromising multi-tenant isolation, or creating operational overhead. The ClearML AI Application Gateway was created to solve this problem. It acts as the secure, identity-aware front door to ClearML Sessions regardless of where the underlying infrastructure resides, and it secures whatever is exposed through it, whether that is an HTTP endpoint, a TCP port, or an LLM service.
The Challenge of Exposing Dynamic AI Development Environments
Interactive AI development tools differ fundamentally from static web services. They are ephemeral, user-specific, and often executed on demand. Whether running on Docker Compose or a distributed Kubernetes cluster, each new IDE session creates a container or pod with its own network address, lifecycle, and resource considerations. These sessions:
- Come and go frequently
- Change ports or IPs depending on the environment
- Must be accessible only to their rightful user
- Cannot expose internal node addresses or private ports
- Require support for HTTP, HTTPS, WebSockets, and sometimes TCP
Without a purpose-built routing layer, platform teams must rely on brittle, manual configurations: updating reverse proxies, managing firewall entries, exposing high-risk SSH access, or juggling multiple ingress rules. These patterns do not scale, and they introduce avoidable security risks.
What the ClearML Application Gateway Enables
The ClearML App Gateway provides a consistent, secure mechanism for exposing development environments to authenticated users across all deployment models. It does this by serving as a session-aware ingress layer that integrates directly with ClearML Server and dynamically configures itself based on real-time session information.
Furthermore, through the ClearML UI, admins can issue, rotate, revoke, and configure expiry periods for tokens used by automated systems, agents, or external components interacting with the gateway. This provides clear visibility into active tokens, reduces the risk of long-lived or unmanaged credentials, and ensures that all gateway access is governed by explicitly controlled, expiring authentication artifacts.
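To make the token lifecycle concrete, here is a minimal sketch of issuing, rotating, revoking, and expiring tokens. It is an illustration only and not ClearML's implementation: `GatewayToken`, `TokenRegistry`, and the in-memory storage are hypothetical names and choices made for this example.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class GatewayToken:
    """Hypothetical record for one issued token (not a ClearML type)."""
    value: str
    expires_at: datetime
    revoked: bool = False


class TokenRegistry:
    """Illustrative in-memory registry; a real system would persist tokens."""

    def __init__(self) -> None:
        self._tokens: dict[str, GatewayToken] = {}

    def issue(self, name: str, ttl: timedelta, now: datetime) -> str:
        # Every token gets an explicit expiry, so none is long-lived by default.
        token = GatewayToken(secrets.token_urlsafe(32), now + ttl)
        self._tokens[name] = token
        return token.value

    def rotate(self, name: str, ttl: timedelta, now: datetime) -> str:
        # Rotation replaces the stored secret, so the old value stops validating.
        return self.issue(name, ttl, now)

    def revoke(self, name: str) -> None:
        self._tokens[name].revoked = True

    def is_valid(self, name: str, value: str, now: datetime) -> bool:
        token = self._tokens.get(name)
        if token is None or token.revoked or now >= token.expires_at:
            return False
        # Constant-time comparison avoids leaking the secret through timing.
        return secrets.compare_digest(token.value, value)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
registry = TokenRegistry()
first = registry.issue("ci-agent", timedelta(hours=1), now)
second = registry.rotate("ci-agent", timedelta(hours=1), now)
print(registry.is_valid("ci-agent", first, now))   # old secret after rotation
print(registry.is_valid("ci-agent", second, now))  # current secret
```

Passing `now` explicitly keeps the expiry logic easy to test; a production service would read the clock itself.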
When a user launches a session through ClearML Sessions, the App Gateway:
- Discovers the session and its location
- Validates the user’s identity and checks whether they are permitted to access the session
- Creates a stable, user-specific endpoint that remains constant throughout the session lifecycle
- Ensures authorization is performed when the user interacts with the endpoint: the gateway inspects the ClearML authentication token on each request and proxies the connection only if the user is the session’s owner or has explicit access rights
- Routes traffic to the correct backend instance no matter where or how it was created
This architecture ensures reliability whether the environment is simple (a single machine running Docker Compose) or complex (a multi-tenant, autoscaling Kubernetes cluster).
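The per-request flow described above can be sketched roughly as follows. This is a simplified model, not the gateway's actual code: the `Session` type, the `route_request` function, the dictionaries standing in for ClearML Server lookups, and the HTTP-style status codes are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical view of a session as the gateway might see it."""
    owner: str
    backend: str  # internal address, never shown to the user
    shared_with: set[str] = field(default_factory=set)


def route_request(
    session_id: str,
    token: str,
    sessions: dict[str, Session],
    users_by_token: dict[str, str],
) -> tuple[int, str | None]:
    """Return (status, backend); backend is set only when access is allowed."""
    user = users_by_token.get(token)
    if user is None:
        return 401, None  # token missing or not recognized
    session = sessions.get(session_id)
    if session is None:
        return 404, None  # no such session
    if user != session.owner and user not in session.shared_with:
        return 403, None  # authenticated, but not permitted
    return 200, session.backend  # proxy to the internal target


sessions = {"s1": Session("alice", "10.0.0.5:8888", {"bob"})}
users = {"tok-a": "alice", "tok-b": "bob", "tok-c": "carol"}
print(route_request("s1", "tok-a", sessions, users))
print(route_request("s1", "tok-c", sessions, users))
```

The check runs on every request rather than once at session creation, which mirrors the point that authorization happens whenever the user interacts with the endpoint.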
Supporting Multiple Deployment Modes
The ClearML App Gateway is architected to operate reliably across the full range of environments in which organizations deploy ClearML, from single-host installations to large-scale, distributed clusters. Although each deployment model presents its own networking patterns and operational characteristics, the App Gateway provides a consistent advantage everywhere: it offers a secure, centralized mechanism for exposing both interactive development sessions and deployed endpoints, such as LLM services and other HTTP-based or TCP-based applications, while abstracting away the underlying infrastructure’s complexity. The differences between environments lie in the specific infrastructure behaviors and traffic patterns that the App Gateway helps simplify, rather than in the types of workloads it is able to support.
The key principle is that the App Gateway is architected to be environment-agnostic while solving the same universal challenge: secure, session-aware access to dynamic AI development environments.
Docker Compose and Single-Host Deployments
In Docker Compose environments, workloads typically run on a single machine or a small collection of machines with relatively stable networking. Despite this simplicity, exposing multiple interactive development environments can still create fragmentation, requiring manually assigned ports, ad hoc reverse proxies, or inconsistent access patterns for different users. The App Gateway streamlines this by providing stable URLs, unified authentication, and standardized session access. Even in compact installations, it enhances clarity, security, and maintainability by consolidating access through a single, coherent entry point.
Hosted Compose (Hybrid) Installations
Hybrid deployments (Docker Compose with a hosted server) are designed for environments where the ClearML Server is hosted separately, such as a managed ClearML deployment, while the App Gateway and session workloads run within the customer’s own infrastructure. In this model, the App Gateway provides a single, externally reachable entry point for interactive sessions and application endpoints, while routing traffic internally to the appropriate containers based on session metadata from the ClearML Server.
This approach simplifies application-level access by avoiding direct exposure of individual session containers or dynamically assigned ports. Connectivity between the App Gateway, ClearML Server, and session workloads must still be explicitly configured through standard networking mechanisms such as routing, VPNs, or firewall rules. Once in place, the App Gateway ensures that users interact with a consistent endpoint, even though the underlying components may reside in different networks or administrative domains.
Hosted Docker Compose deployments are particularly useful in environments with segmented networks or strict security controls, where operators want to centralize ingress for AI development environments and LLM endpoints without directly exposing internal container topology. The App Gateway acts as a controlled access layer at the application level, while leaving infrastructure-level network design and security boundaries firmly under the organization’s control.
Kubernetes Deployments
Kubernetes introduces a level of dynamism that makes exposing AI development environments and LLM endpoints significantly more complex than in single-host or static deployments. Pods are constantly moving between nodes, their IPs and ports shifting as workloads are rescheduled, and autoscaling events create or remove replicas based on demand. Beyond this, Kubernetes typically operates across several internal networks (pod networks, service networks, node networks, and sometimes overlay networks), none of which are meant to be directly reachable from the outside world. The boundary between internal and external traffic is enforced at the load balancer and ingress layers, and anything exposed externally must traverse these components correctly.
In this environment, simply providing a reachable endpoint for a session or an LLM backend is non-trivial. Without additional orchestration, every change in pod placement would require rewriting ingress rules, updating load balancer targets, or exposing internal networks, none of which is maintainable at scale. While it is possible to build custom automations to track pod lifecycles, apply routing rules, enforce identity, and maintain isolation, doing so requires deep expertise in Kubernetes networking and results in a brittle solution that must be continuously maintained.
The ClearML App Gateway addresses this gap by acting as a stable, authoritative routing layer that understands both who is making a request and where the corresponding workload is currently running. Rather than exposing pods directly or updating ingress rules for each workload, organizations expose only a single, consistent external endpoint. Requests terminate at the load balancer, flow through the ingress layer, and arrive at the App Gateway. From there, the gateway performs the final routing decision by consulting ClearML Server’s session registry: it knows which sessions or endpoints exist, which user owns them, and which pod or container currently backs them. The gateway then proxies the connection internally to the correct target, insulating users and operators from all of Kubernetes’ internal network churn.
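One way to picture this is a lookup table that is updated as pods move, while the public path a user sees never changes. The sketch below assumes a hypothetical `SessionRegistry` class and a `/sessions/<id>/...` path layout; neither is taken from ClearML, and the real gateway obtains this information from ClearML Server.

```python
from __future__ import annotations


class SessionRegistry:
    """Illustrative map from session ID to the pod currently backing it."""

    def __init__(self) -> None:
        self._backends: dict[str, str] = {}

    def on_pod_event(self, session_id: str, pod_address: str | None) -> None:
        # A None address models the pod being deleted; anything else models
        # the session starting or being rescheduled onto a new pod.
        if pod_address is None:
            self._backends.pop(session_id, None)
        else:
            self._backends[session_id] = pod_address

    def resolve(self, public_path: str) -> str | None:
        # Assumed path layout: /sessions/<session_id>/<anything>
        parts = public_path.strip("/").split("/")
        if len(parts) < 2 or parts[0] != "sessions":
            return None
        return self._backends.get(parts[1])


registry = SessionRegistry()
registry.on_pod_event("s1", "10.42.1.7:8888")
print(registry.resolve("/sessions/s1/lab"))   # first pod
registry.on_pod_event("s1", "10.42.3.2:8888")  # pod rescheduled
print(registry.resolve("/sessions/s1/lab"))   # same URL, new pod
```

Because only the table changes, no ingress rule or load balancer target has to be rewritten when a pod is rescheduled.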
This layering preserves the security boundaries of the cluster: internal pod networks remain isolated, no node or pod IPs need to be exposed, and external traffic is funneled through a well-defined control point. It also ensures deterministic backend selection, critical for performance-sensitive workloads such as LLM inference where KV-cache locality or replica consistency must be maintained. Even when autoscaling events introduce new replicas or retire existing ones, the App Gateway transparently updates its routing decisions without requiring ingress rewrites or configuration changes.
In practical terms, the App Gateway assumes responsibility for the parts Kubernetes does not manage: identity-aware routing, stable external addressing, tenant-level isolation, and session or endpoint awareness. Kubernetes continues to provide service discovery, scheduling, autoscaling, and internal networking, while the App Gateway provides the logic required to expose ephemeral, user-specific, AI-focused workloads safely and predictably. This makes it a maintainable and operationally sound approach for clusters where both interactivity and multi-tenancy are required.
Advanced Capabilities – Static Routes and Intelligent Routing
ClearML’s App Gateway includes several routing capabilities that support modern AI platforms regardless of infrastructure size or complexity.
Static routes make it possible to publish multiple LLM backends through a unified external endpoint, allowing applications to connect to a single, stable URL while the platform manages load distribution, model upgrades, and future capabilities such as canary deployments, rate limiting, and adaptive scaling.
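As a rough illustration of the idea, the sketch below puts several backends behind one public path and hands them out in turn. The `StaticRoute` class, the example path, and the backend addresses are hypothetical, and round-robin is simply an assumed distribution policy; the source does not say which policy the App Gateway uses.

```python
from __future__ import annotations

import itertools


class StaticRoute:
    """Illustrative static route: one public path, several backends."""

    def __init__(self, public_path: str, backends: list[str]) -> None:
        if not backends:
            raise ValueError("a static route needs at least one backend")
        self.public_path = public_path
        # Round-robin is an assumption made for this example only.
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self) -> str:
        return next(self._cycle)


route = StaticRoute("/llm/chat", ["llm-a:8000", "llm-b:8000"])
for _ in range(3):
    print(route.public_path, "->", route.next_backend())
```

Clients only ever see `/llm/chat` in this example, so backends can be swapped during a model upgrade without any client-side change.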
Intelligent routing ensures that traffic intended for a particular user’s session, a specific LLM worker, or an agentic AI workflow is consistently routed to the correct backend instance. This is critical for stateful and performance-sensitive AI applications, including agent-based systems that maintain execution context across multiple steps, as well as LLM services that build and reuse runtime caches such as KV cache during inference. By preserving backend affinity, the App Gateway avoids unnecessary cache invalidation, reduces latency, and ensures predictable behavior across multi-step interactions.
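Backend affinity can be achieved in several ways. The sketch below uses rendezvous (highest-random-weight) hashing, which is a generic technique chosen here for illustration and not a description of how the App Gateway is implemented; the `pick_backend` function, the affinity key format, and the worker names are hypothetical.

```python
from __future__ import annotations

import hashlib


def pick_backend(affinity_key: str, backends: list[str]) -> str:
    """Choose a backend deterministically for a given affinity key.

    Each backend is scored by hashing it together with the key, and the
    highest score wins. The same key therefore always maps to the same
    backend, and removing a backend only remaps the keys it was serving.
    """
    if not backends:
        raise ValueError("no backends available")

    def score(backend: str) -> int:
        digest = hashlib.sha256(f"{backend}|{affinity_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)


workers = ["llm-0:8000", "llm-1:8000", "llm-2:8000"]
chosen = pick_backend("user-42/conversation-7", workers)
print(chosen == pick_backend("user-42/conversation-7", workers))  # stable choice
```

Keeping a conversation on one worker is what allows a KV cache built on that worker to be reused across requests.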
Together, these capabilities allow the App Gateway to replace ad-hoc, application-specific routing logic with a unified, stable, and secure access layer that supports interactive sessions, LLM inference, and agentic AI systems at scale.
Enabling Secure, Multi-tenant AI Platforms
Across all deployment environments, organizations are increasingly building multi-tenant AI platforms. Whether the tenants are departments within an enterprise, different customer accounts, or separate research groups, the App Gateway enforces strict boundaries between them. By binding session access directly to ClearML’s user and organization-level permissions, the gateway ensures that each user can access only the sessions they initiated or those explicitly shared with them. This is vital for cloud service providers offering GPUaaS or AIaaS, where user isolation is both a business requirement and a security imperative.
Conclusion
The ClearML App Gateway is a foundational component for organizations delivering secure, scalable, user-friendly AI development environments and deployed LLMs, regardless of whether they run on Docker Compose, hybrid infrastructure, bare-metal clusters, or Kubernetes. By unifying authentication, session discovery, routing intelligence, and multi-tenant enforcement under a single interface, the App Gateway provides the missing control layer that traditional infrastructure tools cannot supply.
As AI development continues to accelerate and teams demand increasingly seamless access to compute resources and development environments, ClearML offers the stability, security, and operational discipline required for both small setups and large, multi-tenant AI platforms. Through the ClearML AI Application Gateway, every user, in every environment, gets consistent, secure, and reliable access to the tools they need, without ever being exposed to the underlying complexity of the infrastructure that powers them.
To learn more about how ClearML could help your organization, contact our sales team by requesting a demo.