How Warner Bros. Discovery Keeps Its Media Streams Flowing
Created by a merger of WarnerMedia and Discovery Inc. in 2022, Warner Bros. Discovery is growing ahead of schedule. The streaming company added 1.6 million subscribers and announced that it expects to turn a profit in 2023, two years ahead of its guidance to investors.
Getting there means putting WBD’s data about audience behavior to work engaging and retaining each of its 95.8 million subscribers with personalized offers and viewing suggestions. And productizing all that data requires lots of machine learning.
Keeping WBD’s streams on course is Machine Learning Platform Engineer Frank Shen, who works with several teams of data scientists to productize their models.
“We have different models for different purposes,” Shen said. “For customer lifecycles, we can predict the churn of active customers before their next subscription starts. We can put personalized messaging on their screens: ‘Hey, do you want to take this offer? We’re gonna give you this incentive.’ Besides those products, the company uses our scientists’ models to forecast our revenues, our subscription numbers — all kinds of things.”
Another major product area is WBD’s recommendation systems, Shen said, which perform tasks like personalizing what each viewer sees on the “rail” of suggested titles.
Each workflow entails up to 500 features, including viewership, subscriptions and metadata. Tracking those dynamically for almost 100 million active users runs into terabytes of data.
In addition to customer products and recommendations, different WBD data science teams work on areas that include personalization, marketing, reporting and growth, Shen said. And each team works with its own tool sets. “Data scientists tend to do their work in notebooks, whether it’s Databricks or Jupyter or SageMaker notebooks,” he said, “and each has a development environment, integration testing, and product and production environments. We have to help them build a platform to automate deployment of their code using our CI/CD process so their products can be used in the production environment.”
WBD’s ML engineers also help reduce duplication of effort by different developers: “We’ll help use shared libraries and modules so they don’t have to duplicate their efforts. And for feature engineering, they can use shared features.”
The engineering team was using Airflow orchestration to run its ML workflows, but the platform presented a number of challenges. “Airflow is a good tool for data engineering, but it’s not perfect for machine learning workflows,” Shen said. “First of all, data scientists’ jobs use a lot of Python modules, and integrating their pure Python into Airflow is a challenge.
“Developing locally was another challenge. Data scientists would develop using notebooks, and notebooks are not compatible with Airflow. So they have to copy and paste their code into the Airflow system, which is a huge pain in the ass. They can’t debug locally, so they have to deploy it to Airflow and see if it works. If not, then they have to do it over again.”
To eliminate duplication and streamline compatibility, WBD turned to Flyte for orchestration. Flyte’s Python support immediately closed gaps between local development and deployment, Shen said. “Now data scientists start developing locally on their machines not using notebook, but using pure Python and just adding the annotation provided by Flyte. It works everywhere; even if you develop in the notebook, you can just port that function over to Flyte.
“That’s the number one benefit,” he continued. “The second is you can compose workflows; one workflow can call other workflows, and you can chain them together and reuse them. Workflow compositions aren’t possible in Airflow.”
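To make the two benefits Shen describes concrete, here is a minimal sketch of what "pure Python plus an annotation" and workflow composition can look like with Flyte's Python SDK, flytekit. The `task` and `workflow` decorators are flytekit's; everything else is an assumption made for illustration. The function names (`churn_score`, `pick_offer`, `score_customer`, `retention_pipeline`) and the toy scoring rule are hypothetical and are not WBD's models or code. The `try`/`except` fallback is also an addition, included only so the sketch runs as ordinary Python on a machine without flytekit installed.

```python
try:
    # Real Flyte decorators, used when flytekit is installed.
    from flytekit import task, workflow
except ImportError:
    # Illustrative fallback (an assumption of this sketch): no-op decorators
    # so the same functions still run as plain Python without flytekit.
    def task(fn):
        return fn

    def workflow(fn):
        return fn


@task
def churn_score(days_inactive: int) -> float:
    """Hypothetical toy score: more inactive days means higher churn risk."""
    return min(1.0, days_inactive / 30.0)


@task
def pick_offer(score: float) -> str:
    """Hypothetical rule mapping a churn score to a retention offer."""
    return "discount" if score >= 0.5 else "none"


@workflow
def score_customer(days_inactive: int) -> float:
    # A small workflow that wraps a single task.
    return churn_score(days_inactive=days_inactive)


@workflow
def retention_pipeline(days_inactive: int) -> str:
    # Composition: this workflow calls another workflow, then a task.
    score = score_customer(days_inactive=days_inactive)
    return pick_offer(score=score)


if __name__ == "__main__":
    # Runs locally like a normal function call, with no deployment step.
    print(retention_pipeline(days_inactive=21))
```

The task bodies are ordinary typed Python functions, which is why a function prototyped in a notebook can be ported over largely unchanged. Inside a workflow, tasks and sub-workflows are called with keyword arguments, and `retention_pipeline` reuses `score_customer` rather than duplicating its logic.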
Another benefit comes from Flyte’s foundation in Kubernetes. “There’s also the benefit of using the standard cloud computing resources versus maintaining the Airflow resources ourselves. You get scalability managed for you as well — with Airflow, you have to have a dedicated DevOps team to achieve the same benefits.”