Industry: Vertical Software Platform
Use Case: ML

How Porch used Union to migrate off Airflow & consolidate its data & ML operations

Porch is a new kind of insurance company on a mission to partner with home service companies to delight homeowners from moving to improving and everything in between. Union’s data and ML orchestrator enables the company to deliver insights and predictions for its home and services solutions that impact approximately 2 out of every 3 U.S. homebuyers each month.  

About Porch

Porch is a vertical software company focused on making the home simple by providing software and services to companies. More than 11,000 small and large businesses, including moving companies, home inspectors, large utility companies, and real estate professionals, use Porch to improve their operations, grow their business, and improve their customer experiences. Through these companies, Porch is introduced to their customers and helps make moving and home maintenance simpler.

Spending time and effort ‘keeping the lights on’

Data is critical to Porch’s business, which relies on machine learning and analytics to deliver home market and customer insights. These insights and predictions power personalized recommendations and services for homeowners, surface market trends, match service professionals to appropriate jobs, and track the performance of Porch’s services in real time.

The Porch team started by using Airflow to manage its data pipelines. They found the learning curve steep, especially when orchestrating Spark jobs on Google Dataproc with Airflow. This integration complicated both setup and day-to-day use because it required additional cluster management, job configuration, and optimization, and it left data scientists dependent on the Porch ML team. The team wanted to abstract that complexity away so that data scientists could focus on developing models, workflows, and applications rather than on ‘keeping the lights on.’

Additionally, Porch operated an Airflow cluster but found it challenging to maintain, especially without a dedicated DevOps team. The challenges became particularly daunting when the team had to upgrade their Airflow cluster after the version they were running was no longer supported. The upgrade would require a major overhaul, which the team wanted to avoid taking on.

This was the catalyst for the team to look for managed solutions such as Google Cloud Composer, managed offerings of Kubeflow, and Union, a fully managed solution built on Flyte. With native Kubernetes support as an essential criterion, the team narrowed their options to managed Kubeflow offerings and Union. They then compared the two and chose Union’s orchestration engine for its best-in-class capabilities, fantastic user experience, and ease of troubleshooting, all powered by open-source Flyte and one of the strongest and most active communities in AI orchestration.

Shifting time and effort to building and training ML models

Today, Porch runs several critical workloads for data processing, feature ingestion, and model (re)training on Union. Recently, Porch used Flyte Agents to solve a particularly challenging problem. The team needed to move off its self-hosted Airflow cluster rather than carry the burden of upgrading and managing it, but its legacy Airflow code would have taken significant time and effort to refactor. Instead, the team used the Airflow agent to run Airflow tasks as Flyte tasks, which let them ‘lift and shift’ without any code changes. This provided two major benefits: it removed the operational cost of maintaining the Airflow cluster and the time required to upgrade it. The team decommissioned the self-hosted cluster and, in the process, bought time to migrate to Union-native Spark code.

“Without Flyte Agents, we would need to refactor our legacy models by recreating them, evaluating performance, productionizing, etc. This is not trivial, and no business wants to take on this type of effort, especially if the model is not broken. We were able to save nine months of engineering time by avoiding any code changes, and simply lifting and shifting our Airflow code and running it with Union.” — Shih-Gian Lee, Senior Machine Learning Engineer.

The team now plans to prioritize refactoring to native Spark and adopting Union’s Kubernetes-native Spark offering. This would greatly reduce integration complexity and operational overhead while taking advantage of Kubernetes’ strengths.

“We want to simplify and not have to think about and manage different technology stacks. We want to write everything in a Union workflow and have one platform for orchestrating these jobs; that’s awesome and less stuff for us to worry about.” — Thomas Busath, Machine Learning Engineer.

As a long-time Union customer, the team has a unique perspective on the platform’s capabilities. Lee finds Union’s ability to propagate errors to the UI very helpful for troubleshooting: his team can identify issues on their own, which saves them a lot of time.

Union’s versioning capabilities allow the Porch team to speed up experimentation and development efforts by 200% by collaborating in rapid iterations while ensuring reproducibility. Different team members can run parallel experiments on different versions of the same workflow. Lee commented, “The biggest selling point for Union is how well and easily it integrates into our infrastructure and enables best practices.” Union integrates into Porch’s GitOps process, making code changes traceable down to the specific workflow versions executed in Union. This end-to-end versioning ensures that every change is reviewed and versioned and can be deployed or rolled back confidently.

The platform’s containerized approach also enables the team to isolate and manage the different images and dependencies of their Python and Spark tasks. This simplifies troubleshooting and supports application development best practices.

The Porch team found Union’s documentation comprehensive and found answers to most of their questions there. When they needed help, Union’s support was responsive and addressed their questions. The Flyte open-source community was also an excellent resource and informed many of Porch’s design decisions.

“We’re amazed at the level of customer support we get from engineers at Union. It seems like within minutes we get a response from an engineer who directly goes and looks into our issue. This alone saves an incredible amount of time. I find myself comparing other vendors’ support to Union. We’re always disappointed in them because of the bar that Union has set” — Thomas Busath, Machine Learning Engineer

Porch also uses the GCP Marketplace for Union to centralize billing and apply its Union subscription toward its GCP spend commitments.

Lee believes that Union is the ideal orchestration solution because of its integrations with existing infrastructure tooling, its containerized approach, its customizability, and its versioning capabilities, among other strengths. Busath commented on how “the product continuously improves” and is looking forward to upcoming features such as Artifacts and Triggers. Other Porch teams expect to adopt Union for their own use cases and needs.