Blackshark.ai scales Earth’s Digital Twin with Flyte

Industry

Geospatial

Use Cases

Data Processing
Model Training
Inference

Challenge

Blackshark needed scalable orchestration to power global geospatial AI.

Blackshark.ai builds real-time, photorealistic 3D maps of the entire planet: an AI-generated “Digital Twin” constructed from massive volumes of satellite and aerial imagery. Their platform extracts global infrastructure, fills in missing attributes, and generates geo-typical or asset-specific 3D content. The technology has powered applications such as Microsoft Flight Simulator, which features more than 1.5 billion photorealistic buildings.

The data volume is staggering. As Blackshark.ai Python developer Maarten de Jong explained:

  • Global satellite imagery: ~2.5 PB
  • Processing: footprint creation for 1.5B buildings
  • Computation: hundreds of machines in parallel

Initially, Blackshark’s internal team handled all infrastructure, code, and compute. But as the company scaled from 30 to 100 employees, parallel experimentation and new model development strained their homegrown platform.

“At a certain point… there are few people who know how to work with the platform. It’s just not scalable anymore.” —Maarten de Jong

Blackshark needed an orchestration system that could keep pace with massive data growth, increasing workflow complexity, and multi-cloud requirements.

“The reason we’re mainly using Flyte is that it has cloud-native capabilities… we don’t want to be limited to a single cloud provider, so running everything through Kubernetes is amazing for us.”

Maarten de Jong, Python Developer at Blackshark.ai

Solution

Flyte provided cloud-native orchestration for massive geospatial pipelines.

Blackshark adopted Flyte—via Union Cloud—to orchestrate its large-scale AI detection and content generation pipelines. Flyte now handles:

  • Distributed processing of petabytes of imagery
  • Content generation workflows across hundreds of machines
  • Continuous model training and experimentation
  • Multi-cloud deployments via Kubernetes

Flyte’s ability to scale horizontally, support dynamic tasks, and manage complex interdependent workflows made it the backbone of Blackshark’s Digital Twin production infrastructure.
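To make the idea of dynamic, horizontally scaled tasks concrete, here is a minimal sketch in plain Python. It is not Flyte's API and not Blackshark's code: the function names, the tile representation, and the stand-in "detection" step are all hypothetical. It only shows the pattern, in which the number of parallel tasks is decided at run time from the size of the input, and the per-tile results are then reduced into one value.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical tile representation: (x_min, y_min, x_max, y_max) in pixels.
Tile = Tuple[int, int, int, int]


def split_into_tiles(width: int, height: int, tile_size: int) -> List[Tile]:
    """Cut a raster extent into tiles; tiles on the edges may be smaller."""
    return [
        (x, y, min(x + tile_size, width), min(y + tile_size, height))
        for y in range(0, height, tile_size)
        for x in range(0, width, tile_size)
    ]


def detect_buildings(tile: Tile) -> int:
    """Hypothetical stand-in for a detection model: returns the tile area."""
    x_min, y_min, x_max, y_max = tile
    return (x_max - x_min) * (y_max - y_min)


def process_scene(width: int, height: int, tile_size: int, workers: int = 4) -> int:
    """Fan out one task per tile, then reduce the per-tile results."""
    # The number of tasks depends on the input and is only known at run time.
    tiles = split_into_tiles(width, height, tile_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(detect_buildings, tiles))
```

In an orchestrator, each tile would typically run as its own containerized task on a separate machine rather than in a local thread pool; the thread pool here only keeps the sketch self-contained.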

2.5 PB of imagery orchestrated through Flyte-based pipelines

1.5B+ building footprints generated with Flyte-managed tasks

100s of machines orchestrated in parallel without over-provisioning

Results

Flyte enabled linear growth, reduced compute waste, and improved flexibility.

Flyte supports Blackshark’s linear growth in data volume and computation, and its smart caching and task merging prevent unnecessary storage expansion.
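The caching mentioned above can be illustrated with a small, self-contained sketch. This is not Flyte's implementation: the `TaskCache` class, the `extract_footprints` function, and the in-memory store are hypothetical. The assumption is only that a cached result is looked up by the task's name, a cache version, and its inputs, so an identical call is served from the cache instead of being recomputed and stored again.

```python
import hashlib
import json
from typing import Any, Callable, Dict


class TaskCache:
    """Hypothetical in-memory cache keyed by task name, version, and inputs."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(task_name: str, cache_version: str, inputs: Dict[str, Any]) -> str:
        # sort_keys makes the key independent of keyword-argument order.
        payload = json.dumps([task_name, cache_version, inputs], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def run(self, fn: Callable[..., Any], cache_version: str, **inputs: Any) -> Any:
        """Return the stored result for identical calls; compute otherwise."""
        k = self.key(fn.__name__, cache_version, inputs)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        self.misses += 1
        result = fn(**inputs)
        self._store[k] = result
        return result


def extract_footprints(tile_id: str, threshold: float) -> Dict[str, Any]:
    """Hypothetical expensive step; here it simply echoes its inputs."""
    return {"tile_id": tile_id, "threshold": threshold}
```

Bumping the cache version (for example after changing the model behind a task) produces a new key, so stale results are not reused even when the inputs are unchanged.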

Flyte also makes it easy to serve new clients. Blackshark can register workflow versions with client-specific preprocessing or postprocessing steps—without disrupting existing pipelines.

“Having Flyte makes this really easy for us… since we can easily register different versions of workflows that include client-specific steps.” —Maarten de Jong
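A minimal sketch of the versioning pattern described above, in plain Python rather than Flyte's own registration API. The `WorkflowRegistry` class, the step functions, and the version labels are hypothetical, and treating a registered version as immutable is an assumption of this sketch. The point is that a client-specific variant is registered alongside the existing version instead of replacing it.

```python
from typing import Callable, Dict, List, Tuple

Step = Callable[[dict], dict]


def detect(data: dict) -> dict:
    """Hypothetical core step shared by every client."""
    return {**data, "steps": data.get("steps", []) + ["detect"]}


def reproject(data: dict) -> dict:
    """Hypothetical client-specific postprocessing step."""
    return {**data, "steps": data.get("steps", []) + ["reproject"]}


class WorkflowRegistry:
    """Hypothetical registry mapping (name, version) to an ordered step list."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, str], List[Step]] = {}

    def register(self, name: str, version: str, steps: List[Step]) -> None:
        key = (name, version)
        # Assumption: a registered version is never overwritten.
        if key in self._versions:
            raise ValueError(f"{name}:{version} is already registered")
        self._versions[key] = list(steps)

    def run(self, name: str, version: str, data: dict) -> dict:
        """Run every step of the selected version in order."""
        for step in self._versions[(name, version)]:
            data = step(data)
        return data


registry = WorkflowRegistry()
registry.register("footprints", "v1", [detect])
registry.register("footprints", "v1-client-a", [detect, reproject])
```

Existing callers keep running `v1` unchanged, while the new client is pointed at `v1-client-a`.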

At the infrastructure level, Flyte helps Blackshark avoid overspending on compute by enabling cluster-level flexibility:

“Flyte has made it much easier to avoid over-provisioning hardware… We use additional clusters with worker nodes to integrate Flyte with a dedicated compute cluster for heavier ML workflows.” —Maarten de Jong

Flyte now serves as the orchestration engine powering Blackshark’s mission to build an accurate, scalable, and ever-expanding Digital Twin of Earth.