Stash cuts pipeline compute costs by 67% with Flyte

Industry

Financial Services & Fintech

Use Cases

Data Processing
Model Training

Challenge

Stash needed a scalable orchestration platform to optimize ML pipelines.

Stash is a New York–based fintech company whose subscription platform helps millions save, invest, budget, and manage their financial lives. The machine learning team supports critical capabilities across fraud detection, model training, and performance optimization.

Before Flyte, Stash relied on a homegrown pipeline management process that introduced inefficiencies, added operational overhead, and drove up compute costs. ML Platform Engineer Katrina Palen noted the need for a platform that was:

  • Easily configurable with code
  • Simple to debug and observe
  • Transparent and traceable
  • Standardized across teams
  • Flexible for researchers
  • Able to free modelers and scientists from DevOps burdens

The team evaluated modern orchestrators—including Kubeflow, Prefect, and their existing Airflow deployment—before selecting Flyte.

“We want to make writing a workflow for people’s jobs pleasant and straightforward… if it’s not simple and easy, people aren’t going to adopt the technology.”

Katrina Palen

ML Platform Engineer at Stash

Solution

Flyte provided a unified, Pythonic, and cost-efficient orchestration platform.

Stash’s ML team adopted Flyte for its strong developer experience and smooth onboarding across teams with different coding styles.
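
As a minimal sketch of that developer experience (the task name and logic are illustrative, not Stash’s production code), a Flyte workflow is ordinary Python with type hints:

```python
# Minimal sketch of authoring a Flyte workflow: tasks and workflows
# are plain Python functions with type hints. Names are illustrative.
from flytekit import task, workflow


@task
def normalize(raw: str) -> str:
    # Hypothetical preprocessing step.
    return raw.strip().lower()


@workflow
def text_pipeline(raw: str = "  Hello Flyte  ") -> str:
    # Flyte builds the execution graph from this plain function call.
    return normalize(raw=raw)
```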

Stash operated a self-managed Kubernetes cluster on AWS using an Apache Spark operator and Amazon EMR. Flyte fit seamlessly into this environment. The team implemented Flyte in three stages:

  1. Deploy Flyte and configure Spark on Kubernetes
  2. Write Spark tasks to abstract configuration complexity (see the sketch after this list)
  3. Migrate legacy Airflow workflows into Flyte pipelines and onboard modelers
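
As a sketch of what stage 2 might look like (assuming Flyte’s Spark plugin, flytekitplugins-spark; the configuration values and job are illustrative), Spark settings are declared next to the task so modelers never touch cluster configuration:

```python
# Sketch of a Flyte Spark task: requires flytekitplugins-spark and a
# Spark-enabled Flyte backend. Values and job body are illustrative.
import random

import flytekit
from flytekit import task
from flytekitplugins.spark import Spark


@task(
    task_config=Spark(
        # Spark settings live beside the task, hiding cluster
        # configuration from the modeler who calls it.
        spark_conf={
            "spark.driver.memory": "2g",
            "spark.executor.memory": "2g",
            "spark.executor.instances": "2",
        }
    ),
)
def estimate_pi(partitions: int = 4) -> float:
    # Flyte spins up an ephemeral Spark cluster for this task and
    # hands the session to the function at runtime.
    sess = flytekit.current_context().spark_session
    n = 100_000 * partitions

    def inside(_: int) -> int:
        x, y = random.random(), random.random()
        return 1 if x * x + y * y <= 1 else 0

    hits = sess.sparkContext.parallelize(range(n), partitions).map(inside).sum()
    return 4.0 * hits / n
```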

This migration enabled modelers to write and launch workflows independently.

“Every single new model that comes out is being done on Flyte… They’re becoming more and more self-sufficient with every single release.” —Katrina Palen

Flyte also allowed Stash to allocate compute per task, replacing the previous need to provision large, memory-heavy clusters for entire pipelines. Flyte’s domain separation gave the team the flexibility to test safely and to control resource configuration per environment.
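
As a hedged sketch of per-task allocation (names and sizes are illustrative), each step declares only the resources it needs:

```python
# Sketch of per-task compute allocation in Flyte: only the heavy
# step requests a large footprint. Names and sizes are illustrative.
from flytekit import Resources, task


@task(requests=Resources(cpu="500m", mem="1Gi"))
def fetch_features(run_date: str) -> str:
    # Lightweight step: small CPU and memory request.
    return f"s3://example-bucket/features/{run_date}"


@task(
    requests=Resources(cpu="4", mem="16Gi"),
    limits=Resources(cpu="8", mem="32Gi"),
)
def batch_inference(features_uri: str) -> str:
    # Memory-heavy step, sized independently instead of forcing the
    # whole pipeline onto one large cluster.
    return f"s3://example-bucket/predictions/{features_uri.rsplit('/', 1)[-1]}"
```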

Strong typing, better tooling, and structured pipeline development improved quality control and streamlined ML experimentation.
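
To illustrate the typing (a sketch with invented names, not Stash’s code): task interfaces are declared with Python type hints, and a mismatched connection fails when the workflow is compiled, before any compute is spent:

```python
# Sketch of Flyte's strong typing: interfaces come from type hints,
# and wiring errors surface at compile/registration time, not mid-run.
from flytekit import task, workflow


@task
def train_model(learning_rate: float, epochs: int) -> float:
    # Illustrative training step returning a validation score.
    return 0.9


@task
def publish(score: float) -> str:
    return f"published model with score {score:.3f}"


@workflow
def training_run(learning_rate: float = 0.01, epochs: int = 5) -> str:
    score = train_model(learning_rate=learning_rate, epochs=epochs)
    # Wiring `score` into a differently typed input would be rejected
    # here, when the workflow is compiled, rather than hours into a run.
    return publish(score=score)
```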

50% faster provisioning time

65% reduction in model execution time

67% lower compute costs per batch inference run

Results

Flyte reduced compute costs and accelerated ML pipeline performance.

By adopting Flyte, Stash unlocked major operational and financial gains. Provisioning time dropped by half. Model execution time dropped by nearly two-thirds. And compute costs for batch inference runs fell by 67%.

These savings do not yet include potential gains from Flyte’s caching or dataset management capabilities, meaning further reductions are expected as pipelines evolve.
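
For reference, enabling that caching is a small change; a hedged sketch (illustrative task, not Stash’s code):

```python
# Sketch of Flyte task caching: identical inputs reuse stored
# outputs instead of recomputing. The task body is illustrative.
from flytekit import task


@task(cache=True, cache_version="1.0")
def featurize(run_date: str) -> str:
    # Re-running with the same run_date returns the cached output;
    # bump cache_version to invalidate after logic changes.
    return f"s3://example-bucket/features/{run_date}"
```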

Stash also reduced its dependence on Amazon EMR while continuing to use Spark efficiently, capturing further savings. The ML team continues to integrate Flyte more deeply into its processes and to collaborate with the Flyte community to optimize pipelines as the business scales.