Blackshark.ai scales Earth’s Digital Twin with Flyte

Challenge
Blackshark needed scalable orchestration to power global geospatial AI.
Blackshark.ai builds real-time, photorealistic 3D maps of the entire planet: an AI-generated “Digital Twin” constructed from massive volumes of satellite and aerial imagery. Their platform extracts global infrastructure, fills in missing attributes, and generates geo-typical or asset-specific 3D content. The technology has powered applications such as Microsoft Flight Simulator, which features more than 1.5 billion photorealistic buildings.
The data volume is staggering. As Blackshark.ai Python developer Maarten de Jong explained:
- Global satellite imagery: ~2.5 PB
- Processing: footprint creation for 1.5B buildings
- Computation: hundreds of machines in parallel
Initially, Blackshark’s internal team handled all infrastructure, code, and compute. But as the company scaled from 30 to 100 employees, parallel experimentation and new model development strained their homegrown platform.
“At a certain point… there are few people who know how to work with the platform. It’s just not scalable anymore.” —Maarten de Jong
Blackshark needed an orchestration system that could keep pace with massive data growth, increasing workflow complexity, and multi-cloud requirements.
“The reason we’re mainly using Flyte is that it has cloud-native capabilities… we don’t want to be limited to a single cloud provider, so running everything through Kubernetes is amazing for us.”

Maarten de Jong
Python Developer at Blackshark.ai
Solution
Flyte provided cloud-native orchestration for massive geospatial pipelines.
Blackshark adopted Flyte—via Union Cloud—to orchestrate its large-scale AI detection and content generation pipelines. Flyte now handles:
- Distributed processing of petabytes of imagery
- Content generation workflows across hundreds of machines
- Continuous model training and experimentation
- Multi-cloud deployments via Kubernetes
Flyte’s ability to scale horizontally, support dynamic tasks, and manage complex interdependent workflows made it the backbone of Blackshark’s Digital Twin production infrastructure.
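To make the fan-out idea concrete, here is a minimal sketch of a pipeline whose number of parallel tasks is only known at run time, which is the kind of work Flyte expresses with dynamic workflows or map tasks. It uses only the Python standard library, not Flyte's API, and the names `split_into_tiles`, `detect_buildings`, and `run_pipeline` are hypothetical stand-ins rather than Blackshark's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical tile identifier: (row, col) index into an imagery grid.
Tile = Tuple[int, int]


def split_into_tiles(rows: int, cols: int) -> List[Tile]:
    """Enumerate tile indices; the tile count is decided at run time."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def detect_buildings(tile: Tile) -> int:
    """Stand-in for a per-tile detection task; returns a fake footprint count."""
    row, col = tile
    return (row + col) % 3


def run_pipeline(rows: int, cols: int, max_workers: int = 4) -> int:
    """Fan out one task per tile, then merge the per-tile counts."""
    tiles = split_into_tiles(rows, cols)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        counts = list(pool.map(detect_buildings, tiles))
    return sum(counts)


if __name__ == "__main__":
    print(run_pipeline(rows=2, cols=3))
```

In an orchestrator, each per-tile call would be scheduled as its own task on a cluster rather than on a local thread pool, but the split, map, and merge shape stays the same.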
- ~2.5 PB of imagery orchestrated through Flyte-based pipelines
- 1.5B building footprints generated with Flyte-managed tasks
- Hundreds of machines orchestrated in parallel without over-provisioning
Results
Flyte enabled linear growth, reduced compute waste, and improved flexibility.
As Blackshark’s data volume and computation grow linearly, Flyte’s caching and task merging let the team reuse existing results and avoid unnecessary storage expansion.
Flyte also makes it easy to serve new clients. Blackshark can register workflow versions with client-specific preprocessing or postprocessing steps—without disrupting existing pipelines.
“Having Flyte makes this really easy for us… since we can easily register different versions of workflows that include client-specific steps.” —Maarten de Jong
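One way to picture versioned workflows with client-specific steps is a registry keyed by workflow name and version, where a client variant wraps the shared core step with its own preprocessing and postprocessing. The sketch below is plain Python under that assumption, not Flyte's registration API, and every name in it (`register`, `run`, `detect`, `client_crop`, `client_export`, the version strings) is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A step takes a list of items and returns a new list.
Step = Callable[[List[str]], List[str]]

# Registry keyed by (workflow name, version); versions are immutable once set.
_REGISTRY: Dict[Tuple[str, str], List[Step]] = {}


def register(name: str, version: str, steps: List[Step]) -> None:
    """Register a workflow version; refuse to overwrite an existing one."""
    key = (name, version)
    if key in _REGISTRY:
        raise ValueError(f"{name}@{version} is already registered")
    _REGISTRY[key] = list(steps)


def run(name: str, version: str, data: List[str]) -> List[str]:
    """Run the steps of one specific workflow version in order."""
    for step in _REGISTRY[(name, version)]:
        data = step(data)
    return data


def detect(tiles: List[str]) -> List[str]:
    """Shared core step used by every version."""
    return [f"{tile}:footprints" for tile in tiles]


def client_crop(tiles: List[str]) -> List[str]:
    """Client-specific preprocessing: keep only the client's region."""
    return [tile for tile in tiles if tile.startswith("eu-")]


def client_export(results: List[str]) -> List[str]:
    """Client-specific postprocessing: convert to the client's format."""
    return [result.upper() for result in results]


# The base version and the client variant coexist in the registry.
register("footprints", "v1", [detect])
register("footprints", "v1-client-a", [client_crop, detect, client_export])
```

Because each version is registered under its own key, adding a client variant leaves the existing `v1` pipeline untouched.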

At the infrastructure level, Flyte helps Blackshark avoid overspending on compute by enabling cluster-level flexibility:
“Flyte has made it much easier to avoid over-provisioning hardware… We use additional clusters with worker nodes to integrate Flyte with a dedicated compute cluster for heavier ML workflows.” —Maarten de Jong
Flyte now serves as the orchestration engine powering Blackshark’s mission to build an accurate, scalable, and ever-expanding Digital Twin of Earth.


