ClearML Hyper-Datasets

Build Data-Centric AI workflows

Make the most of your unstructured data using queryable datasets

Hyper-Dataset workflow

Leverage unstructured data


Organize, explore, and analyze a catalog of your datasets with fully searchable version trees.


Slice, split, and sort by meta-data matching, so you can oversample and debias datasets on the fly.

Store your queries in a fully reproducible entity, the DataView, alongside your code for full provenance and model reproducibility.


Gain insights into your data using the meta-data search capabilities. Visualize any data and edit meta-data directly from the UI or programmatically. Instantly view object distribution statistics for any dataset version.


Decouple training and testing code. Test and validate models directly from the web UI, or programmatically through a flexible query mechanism that supports any data distribution and multiple weighted queries.
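
To make the idea of weighted meta-data queries concrete, here is a minimal sketch in plain Python. It is not the ClearML Hyper-Datasets API: `WeightedQuery`, `sample_frames`, `save_queries`, and `load_queries` are hypothetical names invented for illustration, and the frames are toy dictionaries. It assumes a query is a set of meta-data values to match plus a sampling weight, and shows how two equally weighted queries can rebalance a dataset in which one class is rare, and how the queries themselves can be stored as text next to the code.

```python
import json
import random
from collections import Counter
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass
class WeightedQuery:
    """Hypothetical stand-in for one query: meta-data to match plus a sampling weight."""
    match: Dict[str, Any]
    weight: float

    def accepts(self, frame: Dict[str, Any]) -> bool:
        # A frame matches when every requested meta-data key has the requested value.
        return all(frame.get(key) == value for key, value in self.match.items())


def sample_frames(frames: List[Dict[str, Any]], queries: List[WeightedQuery],
                  n: int, seed: int = 0) -> List[Dict[str, Any]]:
    """Draw n frames: pick a query by weight, then pick a matching frame uniformly."""
    rng = random.Random(seed)
    pools = [[f for f in frames if q.accepts(f)] for q in queries]
    # Drop queries without matches so we never sample from an empty pool.
    active = [(pool, q.weight) for pool, q in zip(pools, queries) if pool]
    if not active:
        return []
    active_pools, weights = zip(*active)
    chosen_pools = rng.choices(active_pools, weights=weights, k=n)
    return [rng.choice(pool) for pool in chosen_pools]


def save_queries(queries: List[WeightedQuery]) -> str:
    """Serialize the queries so they can be versioned together with the code."""
    return json.dumps([asdict(q) for q in queries], sort_keys=True)


def load_queries(text: str) -> List[WeightedQuery]:
    return [WeightedQuery(**item) for item in json.loads(text)]


# Toy dataset (an assumption for this sketch): 10 "night" frames, 90 "day" frames.
frames = [{"id": i, "time_of_day": "night" if i % 10 == 0 else "day"} for i in range(100)]

# Equal weights oversample the rare "night" frames relative to their share of the data.
queries = [
    WeightedQuery(match={"time_of_day": "day"}, weight=1.0),
    WeightedQuery(match={"time_of_day": "night"}, weight=1.0),
]

batch = sample_frames(frames, queries, n=1000, seed=42)
counts = Counter(frame["time_of_day"] for frame in batch)
print(counts)
```

Sampling per query rather than per frame is what makes the rebalancing happen at read time: the stored data is untouched, and changing a weight changes the distribution the model sees. Serializing the queries with a fixed seed is one simple way to make a given sample repeatable.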

Data-Centric model development

Learn how Neural Guard developed a methodology for building datasets for DL models that optimizes ROI (return on investment) and TTM (time to market) to increase competitiveness in their market.

Review Neural Guard’s process and how ClearML Hyper-Datasets provided all the functionality they needed in a single, highly integrated MLOps toolchain to build specialized pipelines and connect to both research and production environments.
Case study

Hyper-Datasets and beyond

ClearML Hyper-Datasets integrates seamlessly with ClearML Experiment and ClearML Orchestrate, giving you end-to-end, cross-department visibility across research, development, and production.

