UnionML 0.2.0 Integrates with BentoML
Ship cloud-agnostic model prediction services
One of the most challenging aspects of building machine learning-driven applications is what I like to call “the deployment chasm.”
I remember one of the first major ML models I deployed: it was a fairly complex beast, requiring substantial data- and feature-engineering work, not to mention the model-training process itself. When it came time to deploy, my team and I needed an approach that we could maintain and that gave us visibility into model and endpoint health metrics. We decided to use SageMaker inference endpoints since the rest of the company used AWS and our product required on-demand, low-latency predictions.
We had to rewrite many parts of our research code to conform to SageMaker’s API (par for the course in many ML projects). That wasn’t ideal because it created two separate implementations of the code: the research version and the production version. If we weren’t careful about how we organized our codebase, we’d end up with code skew, where we’d have to remember to update both implementations whenever we revised the inference logic.
But what if I told you that you can write that code once and deploy it to a wide variety of cloud platforms?
I’m excited to announce that the ✨ 0.2.0 Harmony release of UnionML ✨ is out, and that we’ve integrated with BentoML to give you a seamless deployment experience. UnionML reduces the boilerplate code needed to build models and mitigates the risk of code skew when transitioning from development to production.
How Does it Work?
UnionML organizes machine learning systems as apps. When you define a UnionML app, you need to implement a few core components. As you can see in the diagram below, UnionML then bundles these components together into meaningful services that you can deploy to some target infrastructure.
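To make the "define components once, bundle them into services" idea concrete, here is a minimal plain-Python sketch. <span class="code-inline">MiniApp</span>, its <span class="code-inline">component</span> decorator and the toy least-squares model are hypothetical names I made up for illustration; this is not UnionML's actual API, only the general pattern.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, float]


class MiniApp:
    """Hypothetical stand-in for an ML app; NOT UnionML's actual API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._components: Dict[str, Callable] = {}
        self.model_object: Optional[float] = None

    def component(self, kind: str) -> Callable[[Callable], Callable]:
        """Return a decorator that registers a function under `kind`."""
        def register(fn: Callable) -> Callable:
            self._components[kind] = fn
            return fn
        return register

    def train(self, rows: List[Row]) -> float:
        """Training service: reader, then trainer."""
        xs, ys = self._components["reader"](rows)
        self.model_object = self._components["trainer"](xs, ys)
        return self.model_object

    def predict(self, rows: List[Row]) -> List[float]:
        """Prediction service: the same reader, then predictor."""
        if self.model_object is None:
            raise RuntimeError("train the app before predicting")
        xs, _ = self._components["reader"](rows)
        return self._components["predictor"](self.model_object, xs)


app = MiniApp("toy_regressor")


@app.component("reader")
def reader(rows: List[Row]) -> Tuple[List[float], List[float]]:
    # Feature extraction lives in one place and is shared by both services.
    xs = [row["x"] for row in rows]
    ys = [row["y"] for row in rows if "y" in row]
    return xs, ys


@app.component("trainer")
def trainer(xs: List[float], ys: List[float]) -> float:
    # Least-squares slope for a line through the origin.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


@app.component("predictor")
def predictor(slope: float, xs: List[float]) -> List[float]:
    return [slope * x for x in xs]


slope = app.train([{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}])
predictions = app.predict([{"x": 3.0}])
```

Because both <span class="code-inline">train</span> and <span class="code-inline">predict</span> go through the same registered reader, there is only one copy of the feature logic to keep up to date, which is the property that guards against code skew.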
The core abstraction of BentoML is the Bento, which is a file archive containing all the source code, models, data and configuration needed to run a model prediction service. Using the BentoML integration is simple. First, bind a <span class="code-inline">BentoMLService</span> to a <span class="code-inline">unionml.Model</span>.
Then, use the <span class="code-inline">model</span> object to train and save a model locally.
Finally, create a <span class="code-inline">service.py</span> file that defines the underlying <span class="code-inline">bentoml.Service</span> object you’ll ultimately use to deploy the prediction service. The neat thing is that since the UnionML app already defines the feature processing and prediction logic, creating the service only takes a few lines of code ✨
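The shape of that three-step flow (train, save locally, then load the saved model in a thin service module) can be sketched with nothing but the standard library. Everything named here, including <span class="code-inline">LocalModelStore</span>, <span class="code-inline">build_service</span> and the toy model, is a hypothetical stand-in and not the UnionML or BentoML API; see the BentoML guide linked below for the real calls.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class LocalModelStore:
    """Hypothetical local model store; NOT BentoML's actual API."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _versions(self, name: str) -> List[Path]:
        paths = self.root.glob(f"{name}-v*.json")
        return sorted(paths, key=lambda p: int(p.stem.rsplit("-v", 1)[1]))

    def save(self, name: str, params: Dict[str, float]) -> str:
        """Write params under the next version tag and return the tag."""
        tag = f"{name}-v{len(self._versions(name)) + 1}"
        (self.root / f"{tag}.json").write_text(json.dumps(params))
        return tag

    def load_latest(self, name: str) -> Dict[str, float]:
        versions = self._versions(name)
        if not versions:
            raise FileNotFoundError(f"no saved model named {name!r}")
        return json.loads(versions[-1].read_text())


def featurize(raw: List[float]) -> List[float]:
    # Feature processing, written once and used by training and serving.
    return [x / 10.0 for x in raw]


def train(raw: List[float], targets: List[float]) -> Dict[str, float]:
    feats = featurize(raw)
    slope = sum(f * t for f, t in zip(feats, targets)) / sum(f * f for f in feats)
    return {"slope": slope}


def predict(params: Dict[str, float], raw: List[float]) -> List[float]:
    return [params["slope"] * f for f in featurize(raw)]


def build_service(store: LocalModelStore, name: str) -> Callable[[List[float]], List[float]]:
    """What a thin service module would do: load the latest model, reuse predict."""
    params = store.load_latest(name)
    return lambda raw: predict(params, raw)


# Steps 1 and 2: train and save locally (a temp directory stands in for the store).
store = LocalModelStore(Path(tempfile.mkdtemp()))
tag = store.save("toy_model", train([10.0, 20.0], [3.0, 6.0]))

# Step 3: the service only loads the saved model; no logic is rewritten.
service = build_service(store, "toy_model")
served = service([30.0])
```

The point of the sketch is that the service layer stays tiny: it loads a saved artifact and delegates to the <span class="code-inline">predict</span> and <span class="code-inline">featurize</span> functions that training already used.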
Get Started with UnionML
With the 0.2.0 release of UnionML, you can now train a model with UnionML locally or at scale on a Flyte cluster, then deploy it to any of the supported cloud targets, such as AWS Lambda, SageMaker, Google Cloud Run and Azure Functions. To learn more, check out the following resources:
- BentoML guide: https://unionml.readthedocs.io/en/stable/serving_bentoml.html
- GitHub repo: https://github.com/unionai-oss/unionml
- UnionML: https://www.union.ai/unionml
Join the Slack community if you have any questions!