Industry: Financial Technology
Use Case: ML

How Stash Cut Pipeline Compute Costs by 67% with Flyte™

Summary

Frustrated with its “homegrown” process for pipeline management, Stash evaluated several popular workflow orchestration platforms before choosing Flyte, the engine that powers Union Cloud. The switch cut provisioning time, model execution time and compute costs.

The company

Stash is a New York fintech company that develops a subscription-based personal finance app aimed at customers who may not have invested money before. Stash’s website and mobile platform help people invest small and grow wealth, bank, budget and save, and get financial advice.

The challenge

Stash’s machine learning team provides critical tools and resources for modelers and researchers to manage the app’s activity, including protecting the system from fraud, optimizing model execution time, and lowering compute costs and time to market.

Katrina Palen, an ML platform engineer at Stash, said the company had been using “a homegrown process” to manage its pipeline and realized it needed a more efficient, less costly solution.

The team performed a comprehensive evaluation of popular orchestrators on the market, seeking a platform that was easily configurable with code and easily debugged, observable and simple to use. The team also wanted transparency and traceability for all models, standardized processes across researchers and machine learning algorithm developers, and flexibility for scientists and researchers while freeing them from DevOps tasks.

The solution

Palen said the ML team at Stash chose Flyte after reviewing Kubeflow, Prefect and Airflow, the last of which it was using at the time for workflow orchestration.

Explaining why the Stash team chose Flyte, Palen noted that Stash had many researchers with different coding styles and levels of familiarity with structuring. It was critical that they all have a good experience, and Flyte delivered that.

“We want to make writing a workflow for people's jobs pleasant and straightforward, so I think this is a big deal because if it’s not simple and easy, people aren’t going to adopt the technology, so this one was a really big one for us,” she said.

Flyte also accommodated Stash’s unique setup: a self-managed Kubernetes cluster on AWS with a Spark operator, plus Amazon EMR clusters, running big data pipelines on Apache Spark, an open-source, multi-language distributed processing engine.

The Stash ML team implemented Flyte in three stages: First, it deployed Flyte and configured Spark on Kubernetes. Then, it wrote Spark tasks to abstract specific configurations away from modelers, rewriting existing Airflow workflows into Flyte pipelines. Finally, the team transferred all legacy models into Flyte workflows and created a demo workflow for modelers to use.

“What’s great now is that every single new model that comes out is being done on Flyte, and it's really exciting to be working with the modelers and having them write their own workflows and their own launch plans,” Palen said. “They’re becoming more and more self-sufficient with every single release.”

Before Flyte, Stash didn’t have the ability to allocate resources by task, so it was forced to provision huge clusters that required vast, and costly, amounts of memory. “We’re running those types of workloads very efficiently now,” Palen said.
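
To see why allocating resources by task can lower memory costs, here is a minimal Python sketch. It is not Flyte code and does not use Stash's figures: the task names, memory sizes and durations are hypothetical, and it assumes the tasks run one after another. It compares holding one cluster sized for the hungriest task for the whole run against giving each task only what it needs while it runs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskSpec:
    name: str
    memory_gib: int  # peak memory the task needs
    hours: float     # how long the task runs


# Hypothetical pipeline; the numbers are illustrative, not Stash's.
PIPELINE = [
    TaskSpec("load_features", memory_gib=8, hours=0.5),
    TaskSpec("train_model", memory_gib=64, hours=1.0),
    TaskSpec("batch_inference", memory_gib=16, hours=0.5),
]


def cluster_wide_gib_hours(tasks: List[TaskSpec]) -> float:
    """One cluster sized for the largest task, held for the entire run."""
    peak = max(t.memory_gib for t in tasks)
    total_hours = sum(t.hours for t in tasks)
    return peak * total_hours


def per_task_gib_hours(tasks: List[TaskSpec]) -> float:
    """Each task reserves only its own memory, only while it runs."""
    return sum(t.memory_gib * t.hours for t in tasks)


if __name__ == "__main__":
    whole = cluster_wide_gib_hours(PIPELINE)
    split = per_task_gib_hours(PIPELINE)
    print(f"cluster-wide: {whole} GiB-hours, per-task: {split} GiB-hours")
    print(f"memory-hours saved: {1 - split / whole:.0%}")
```

In this toy pipeline the gap comes entirely from the two small tasks no longer paying for the training task's memory. The more uneven the tasks, the larger the gap.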

Stash also benefits from Flyte’s domain separation, which allows “environment-based resource allocation and configuration isolation,” Palen said. “Just having that in our new environment allows us a lot of flexibility in terms of testing new workflows as well as being really controlled over the types of resources that we’re using.”

Flyte offers strong typing and a more robust set of tools for building and deploying machine learning models, which has improved quality control, Palen added.
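
The general idea behind strongly typed pipeline steps can be sketched in plain Python. This is a simplified illustration, not Flyte's implementation or API: the helper `check_chain` and the three step functions are hypothetical. It reads each step's type annotations and rejects a pipeline in which one step's output type does not match the next step's input type, before any step runs.

```python
from typing import Callable, List, Sequence, get_type_hints


def check_chain(steps: Sequence[Callable]) -> None:
    """Raise TypeError if a step's return type differs from the next step's input type."""
    for upstream, downstream in zip(steps, steps[1:]):
        produced = get_type_hints(upstream).get("return")
        hints = get_type_hints(downstream)
        expected = [t for name, t in hints.items() if name != "return"]
        if expected != [produced]:
            raise TypeError(
                f"{upstream.__name__} produces {produced}, "
                f"but {downstream.__name__} expects {expected}"
            )


# Hypothetical steps, for illustration only.
def load_scores() -> List[float]:
    return [0.2, 0.9]


def flag_fraud(scores: List[float]) -> List[bool]:
    return [s > 0.5 for s in scores]


def count_flags(flags: List[int]) -> int:  # annotation does not match flag_fraud's output
    return sum(flags)


if __name__ == "__main__":
    check_chain([load_scores, flag_fraud])  # types line up, no error
    try:
        check_chain([load_scores, flag_fraud, count_flags])
    except TypeError as err:
        print(f"rejected before running: {err}")
```

Catching the mismatch when the pipeline is assembled, rather than partway through a long run, is the kind of quality control that typed interfaces make possible.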

The results

In Flyte, the team found a solution that provided advanced features, superb flexibility and unprecedented average cost savings on each batch inference run.

By using Flyte, Stash cut its provisioning time by 50%, reduced its model execution time by 65% and lowered its compute costs by 67%.
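
Percentage reductions can be easier to compare when restated as ratios. The short Python sketch below converts the three reported reductions into the share that remains and the equivalent improvement factor. The function name `improvement_factor` is made up for this example, and only the three percentages come from the results above.

```python
def improvement_factor(reduction_pct: float) -> float:
    """Convert a percentage reduction into an 'N times lower' factor."""
    remaining = 1.0 - reduction_pct / 100.0
    if not 0.0 < remaining <= 1.0:
        raise ValueError("reduction must be at least 0% and below 100%")
    return 1.0 / remaining


# Reductions reported in the case study.
REPORTED = {
    "provisioning time": 50,
    "model execution time": 65,
    "compute costs": 67,
}

if __name__ == "__main__":
    for metric, pct in REPORTED.items():
        factor = improvement_factor(pct)
        print(f"{metric}: {pct}% lower, {100 - pct}% remains, about {factor:.2f}x")
```

Read this way, a 67% cut means each batch inference run costs roughly a third of what it did before.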

Those results don’t yet include any savings from the caching or data set management capabilities that allow resources to be provisioned by task. “So, over time we’re expecting to save even more,” Palen said.

While the fintech company is still using Spark and EMR, it has been able to reduce its dependency on the latter, capturing even more savings. Stash is continuing to integrate Flyte into its processes and is eager for the community’s input to help optimize its pipelines as its business develops.