by Victor Sonck, Developer Advocate, ClearML
Sarcasm can be difficult to detect in text, especially for machines. However, with the power of large language models, it’s possible to create a tool that can identify sarcastic comments with high accuracy. That’s exactly what the ClearML team did with their latest project: a sarcasm detector that combines various ClearML tools to showcase the capabilities of MLOps.
In the age of ChatGPT and proprietary APIs, this project is meant as an example of how to create tools based on large language models that run on your own machine, so you have full control over them. And thanks to ClearML being open source, even the whole MLOps stack can run locally.
The Sarcasm Detector was trained on a dataset of 1.3 million sarcastic comments from Reddit, providing a robust foundation for accurate analysis. We used open source technology and data from Kaggle, a community of data scientists and machine learning practitioners, to build the detector. Specifically, we trained the model on Reddit comments carrying the /s (sarcasm) tag, using Transformers for modeling and Gradio for the interface.
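To make the labeling idea concrete, here is a minimal sketch of how a trailing sarcasm tag could be turned into a training label. It is not taken from the project repository: the function name `label_comment` and the regular expression are assumptions for illustration, and the Kaggle dataset may already ship with labels, in which case no such step is needed.

```python
import re

# Assumption: the sarcasm marker is a standalone "/s" at the very end of a comment.
# The leading (?:^|\s) keeps URLs such as "example.com/s" from counting as a tag.
_SARCASM_TAG = re.compile(r"(?:^|\s)/s\s*$", re.IGNORECASE)


def label_comment(comment: str) -> tuple[str, int]:
    """Return (text_without_tag, label); label is 1 if the comment ended with /s."""
    match = _SARCASM_TAG.search(comment)
    if match is None:
        return comment.strip(), 0
    # Strip the tag so a model cannot simply learn to look for it.
    return comment[: match.start()].strip(), 1
```

Removing the tag from the text matters: if it stayed in, a classifier could score well just by spotting the marker rather than learning anything about tone.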
Watch the video on how the Sarcasm Detector works:
By combining a large language model with an easy interface for labeling and correcting mistakes, you can build a powerful closed-loop training system. ClearML, as the MLOps platform, ties everything together: keeping track of incoming data, automating training runs, and even hosting the Gradio instance. In this way, the sarcasm detector, while a lot of fun to play with, serves as an example for more practical, business-oriented use cases.
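The feedback loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's code: the class `FeedbackBuffer`, its fields, and the threshold value are hypothetical, and in the real setup the interface and the training trigger would be handled by Gradio and ClearML rather than by this toy object.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackBuffer:
    """Collects user-reviewed examples until there are enough to retrain."""

    retrain_threshold: int = 3  # illustrative value, not from the project
    examples: list = field(default_factory=list)

    def add(self, text: str, predicted: int, corrected: int) -> bool:
        """Store one reviewed example; return True when a retrain is due."""
        self.examples.append(
            {"text": text, "predicted": predicted, "label": corrected}
        )
        return len(self.examples) >= self.retrain_threshold

    def drain(self) -> list:
        """Hand the collected batch to a training job and start a fresh buffer."""
        batch, self.examples = self.examples, []
        return batch
```

Each time a user confirms or corrects a prediction, the example goes into the buffer; once `add` returns True, the drained batch could be registered as a new dataset version and used to kick off an automated training run.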
In the spirit of collaboration and open source development, we’re making the code for the Sarcasm Detector available on GitHub. Go take a look and take inspiration from it to build your own MLOps systems, or just play with it and have a good laugh – that works too! 😀
To download the code for ClearML’s Sarcasm Detector, visit: https://github.com/thepycoder/sarcasm_detector.
Get started with ClearML by using our free tier servers or by hosting your own. Read our documentation here. You’ll find more in-depth tutorials about ClearML on our YouTube channel, and we also have a very active Slack channel for anyone who needs help. If you need to scale your ML pipelines and data abstraction or need unmatched performance and control, please request a demo.