ClearML empowered Trigo’s AI developers to manage workflows just as they had in traditional software development. And that was just the start...
Trigo is a computer vision startup reshaping the retail experience. Leveraging world-class AI and algorithmics experts, the company’s advanced retail automation platform identifies customers’ shopping items with exceptional levels of accuracy, creating an entirely seamless checkout process.
Trigo’s technology streamlines retail operations, prevents shoplifting, provides invaluable retail insights, and presents opportunities for new levels of customer engagement within retail environments.
Managing the AI / DL lifecycle just like in software engineering
With such a multi-tiered system to build, the AI experimentation required to develop and perfect these overlapping algorithms is formidable; a faulty “micro-decision” by even a single module could yield an incorrect analysis and, in turn, a wrong decision. Trigo deploys a number of teams that create the modules to tackle each of these tasks. Team leaders recognized from the outset how difficult it would be to overcome two well-known challenges:
First, this level of active and broad experimentation required a fluid, streamlined flow of research to production ‒ the modeling work of the data scientists being passed on for application implementation by the engineering team.
Second, their teams were well-versed in the most important dynamics of traditional software development workflows, many of which apply to AI and Deep Learning as well: versioning, collaboration, DevOps, and CI/CD. They were determined to implement these processes at the level of efficiency they were accustomed to (at a minimum), without compromise.
The team recognized that while it probably wasn’t possible to drive every aspect of AI development in a single platform, they wanted to select and use as few best-of-breed solutions as possible. They were looking to avoid painful integration costs (both initial and ongoing), as well as the management overhead to juggle a collection of complex systems.
After researching and testing a long list of platforms and applications, they quickly settled on the popular open-source PyTorch as their core machine learning framework.
Then they began exploring their options for experiment management and for building and automating ML pipelines: model cloning and migration, collaboration, documentation, and resource monitoring. While evaluating multiple tools and solutions at this stage, they came across ClearML and were intrigued. ClearML is an open-source platform that automates and simplifies developing and managing machine learning solutions for thousands of data science teams all over the world.
They were pleased to discover that ClearML integrated into their code instantly, with just a small added snippet. Using ClearML’s automation and orchestration functionality, they no longer needed to create new containers manually to run their code; ClearML did it for them. This was a relief for the research team, who no longer had to send requests to DevOps multiple times a day and wait for a container to be updated with a new package.
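To illustrate how small such a snippet can be, here is a minimal sketch using ClearML's Python SDK. `Task.init` and `Task.connect` are existing ClearML calls; the `start_tracked_run` helper, the project and task names, and the hyperparameter values are hypothetical and are not taken from Trigo's codebase. Running it for real assumes the `clearml` package is installed and configured against a ClearML server.

```python
from typing import Any, Dict


def start_tracked_run(project: str, name: str, params: Dict[str, Any]):
    """Register the current script as a ClearML experiment (hypothetical helper)."""
    # Imported lazily so this module still loads on machines without clearml.
    from clearml import Task

    # Task.init registers the run with the ClearML server and starts automatic logging.
    task = Task.init(project_name=project, task_name=name)

    # connect() logs the hyperparameters so they can be edited on a cloned copy
    # of the experiment before it is re-run.
    task.connect(params)
    return task


def main() -> None:
    # Hypothetical names and values, for illustration only.
    params = {"learning_rate": 1e-3, "batch_size": 32, "epochs": 10}
    start_tracked_run("retail-vision", "item-detector-baseline", params)
    # ... the existing PyTorch training loop would follow here, unchanged ...
```

The rest of the training script stays as it was; the only addition is the call that creates the task.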
Trigo had succeeded in accomplishing what they set out to do. First, they empowered their data scientists to manage their entire ML workflow themselves, without a complex set of tools to maintain. Working on their own machines with a combination of PyTorch and TensorBoard, they could now confirm locally that an experiment was running smoothly and then, in the ClearML Web UI, clone the experiment in a couple of clicks, without repackaging the codebase. All that was left was to schedule the execution on one of their many on-prem GPUs through ClearML’s simple web (or API) orchestration interface. This streamlined workflow saved each research scientist substantial time throughout the workday.
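The clone-and-schedule step described above can also be done through the API instead of the Web UI. The following is a rough sketch only: `Task.get_task`, `Task.clone`, and `Task.enqueue` are existing ClearML SDK calls, while the `clone_and_enqueue` helper and any task ID or queue name passed to it are hypothetical. It assumes a ClearML Agent is listening on the target queue.

```python
def clone_and_enqueue(source_task_id: str, new_name: str, queue_name: str) -> str:
    """Clone an existing experiment and schedule the copy on a queue.

    Hypothetical helper; returns the ID of the newly created task.
    """
    # Imported lazily so this module still loads on machines without clearml.
    from clearml import Task

    # Look up the experiment that was already verified locally.
    source = Task.get_task(task_id=source_task_id)

    # Create an editable copy of it on the ClearML server.
    cloned = Task.clone(source_task=source, name=new_name)

    # Put the copy on an execution queue; an agent serving that queue runs it.
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned.id
```

A call might look like `clone_and_enqueue("<task-id>", "item-detector-rerun", "onprem-gpu")`, where the queue name is whatever the team has assigned to its GPU machines.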
Second, they now had, in ClearML, a comprehensive collection of features that made their ML processes as efficient as those they were familiar with from their experience in traditional software development: documentation and versioning for review, provenance and root cause analysis; collaboration to share and build on one another’s successful models; and self-managed DevOps and CI/CD/CT for rapid no-delay iterations throughout the process.
Now armed with the right tools for development, the Trigo teams have leveraged ClearML’s features to further streamline their experiment process, including the challenge of merging a single model (used in one module of the complex system) into the master tree without knowing if or how it may degrade performance. In that particular case, their CI launches a test experiment on a blind dataset for every pull request; a hit on performance means the model is rejected. Using ClearML, the code never makes its way into the Git repo – it remains only as a link to the model file, and ClearML can upload the model as needed, once it is proven to improve overall performance.
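The accept-or-reject decision in such a CI check can be pictured as a simple comparison against the current baseline. The sketch below is an assumption about how a gate of this kind might look, not Trigo's actual implementation: the `gate_candidate` function, the `GateResult` type, the single-score metric, and the `tolerance` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    accepted: bool
    reason: str


def gate_candidate(baseline_score: float, candidate_score: float,
                   tolerance: float = 0.0) -> GateResult:
    """Reject the candidate if it scores worse than the baseline by more than `tolerance`.

    Both scores are assumed to come from the same blind dataset, higher being better.
    """
    drop = baseline_score - candidate_score
    if drop > tolerance:
        return GateResult(False, f"score dropped by {drop:.4f} (allowed: {tolerance:.4f})")
    return GateResult(True, "no regression on the blind dataset")
```

In a CI job, the pull request would be blocked whenever `accepted` is false, for example by exiting with a non-zero status.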
These are the challenges that Trigo has solved using PyTorch, TensorBoard, ClearML, and their own ingenuity. As they move forward transforming and elevating our shopping productivity, their teams have done the same with their AI tools – bringing their own efficiency and output to new levels.