Do you offer discounts for startups?

Talk to our Startup Team for information about startup discounts.

Do you offer discounts for NGOs, universities, or non-profits?

Talk to our Public Sector Team for information about discounts for NGOs, universities, or non-profits.

How do monthly credits work for the Team plan?

Your monthly plan fee is issued back to you as usage credits. In practice, this means your monthly plan cost becomes your minimum monthly spend, and you can use that same amount in usage at no additional charge. Any usage beyond that amount is billed separately.
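
The credit arithmetic can be sketched as below. This is a minimal illustration, not Union.ai billing code: the function name `monthly_invoice` and the dollar amounts are hypothetical.

```python
def monthly_invoice(plan_fee: float, usage: float) -> dict:
    """Split a month's bill into the plan fee and any usage overage.

    The plan fee comes back as usage credits, so usage up to the fee costs
    nothing extra; only usage beyond the fee is billed separately.
    """
    credits_used = min(usage, plan_fee)
    overage = max(0.0, usage - plan_fee)
    return {
        "plan_fee": plan_fee,
        "credits_used": credits_used,
        "overage": overage,
        "total": plan_fee + overage,
    }


# Hypothetical numbers, for illustration only.
light_month = monthly_invoice(plan_fee=1000.0, usage=400.0)
heavy_month = monthly_invoice(plan_fee=1000.0, usage=1300.0)
print(light_month["total"])  # 1000.0
print(heavy_month["total"])  # 1300.0
```

In the light month the plan fee is the whole bill even though only part of the credit was used. In the heavy month the extra 300 of usage is added on top.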

What is an action?

An action is an individual execution of a task. It represents a specific invocation of a task with particular inputs. If a task runs multiple times (such as inside a loop), you'll see multiple actions, one for each invocation.

Can my team have a forward-deployed engineer (FDE) from Union.ai to help us build?

Book a consultation to discuss having a forward-deployed engineer from Union.ai help your team build.

Do you offer a self-hosted control plane as a deployment option?

Book a consultation to discuss self-hosted control plane deployment options.

How do you calculate GPU, CPU, and memory hours of usage?

We meter the resources (CPU, memory, and GPU accelerator) allocated to each container that runs the actions within your workflows, and we apply usage-based pricing down to the second. Resources consumed by any other services are not included. So if you run Union on a shared K8s cluster, you pay only for the resources used by your Union tasks and workflows.
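
As a rough sketch of per-second metering on allocated resources, the example below converts one container's allocation into CPU, memory, and GPU hours and prices them. The rates, the `ContainerAllocation` type, and the function names are assumptions made for illustration; they are not Union.ai pricing or API.

```python
from dataclasses import dataclass

# Hypothetical hourly rates for illustration only; not Union.ai pricing.
RATE_PER_HOUR = {"cpu_core": 0.04, "memory_gib": 0.005, "gpu": 2.00}


@dataclass
class ContainerAllocation:
    cpu_cores: float   # CPU cores allocated to the container
    memory_gib: float  # memory allocated, in GiB
    gpus: int          # GPU accelerators allocated
    seconds: int       # how long the container ran, in whole seconds


def resource_hours(c: ContainerAllocation) -> dict:
    """Convert one container's allocation into CPU, memory, and GPU hours."""
    hours = c.seconds / 3600
    return {
        "cpu_core": c.cpu_cores * hours,
        "memory_gib": c.memory_gib * hours,
        "gpu": c.gpus * hours,
    }


def usage_cost(containers: list) -> float:
    """Sum the metered cost across every container that ran an action."""
    return sum(
        hrs * RATE_PER_HOUR[resource]
        for c in containers
        for resource, hrs in resource_hours(c).items()
    )


# One action that held 4 cores, 16 GiB, and 1 GPU for 90 seconds.
action = ContainerAllocation(cpu_cores=4, memory_gib=16, gpus=1, seconds=90)
hours = resource_hours(action)
cost = usage_cost([action])
print(hours, cost)
```

Only containers that ran actions appear in the list, which mirrors the point above: other services on a shared cluster never enter the calculation.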

Is Union.ai a SaaS service?

No. You deploy the Union operator into a Kubernetes cluster you manage, which securely communicates with the Union control plane to poll for work. Your workflow executions, code, images, data, logs, and secrets all remain in your VPC/cloud, and are inaccessible to Union.

What's the difference between action concurrency and actions/run (i.e., task fanout)?

Fanout is the total number of actions a run creates, while concurrency is how many of those actions are running at the same time. For example, a run might fan out to 50,000 actions but only execute around 100 of them concurrently.
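
The distinction can be illustrated with plain `asyncio`, assuming a semaphore stands in for the concurrency limit. This is a generic sketch, not Union.ai or Flyte code, and the numbers are scaled down from the example above.

```python
import asyncio


async def run(fanout: int, concurrency: int) -> tuple[int, int]:
    """Create `fanout` actions but let at most `concurrency` run at once.

    Returns (actions_completed, peak_concurrent_actions).
    """
    limiter = asyncio.Semaphore(concurrency)
    running = 0
    peak = 0
    completed = 0

    async def action(index: int) -> None:
        nonlocal running, peak, completed
        async with limiter:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0)  # stand-in for real work; yields to other actions
            running -= 1
            completed += 1

    # Fanout: every action is created up front.
    await asyncio.gather(*(action(i) for i in range(fanout)))
    return completed, peak


completed, peak = asyncio.run(run(fanout=500, concurrency=10))
print(f"fanout={completed}, peak concurrency={peak}")
```

All 500 actions complete, but the peak number running at the same moment never exceeds 10.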

Can I run Union.ai in my own cloud environment?

Yes, Union.ai supports bring-your-own-cloud (BYOC) deployments. You can run it in your own AWS, GCP, Azure, or neo-cloud environment while maintaining full control over your data, security, and infrastructure.

Union.ai vs. Airflow

Still orchestrating AI/ML workloads with tools built for ETL?

Union.ai, the managed Flyte platform, is the production runtime for the AI era. Orchestrate compute, models, and data on your own secure cloud. No DAGs needed.

Try the devbox

A free, local sandbox to explore the Union.ai platform.

Old eras of orchestration don’t work for AI

Using a data orchestrator instead of an AI runtime is like using a paper map instead of GPS.

Orchestration has evolved through three distinct eras:

Data Era: move data from A to B
ML Era: run workloads on different compute resources
AI & Agentic Era: dynamically determine workflow paths at runtime

AI is non-deterministic, so AI workflows need to branch, handle errors, and provision resources dynamically at runtime.

Static DAGs aren’t built for this reality.

Airflow was built for data, not AI

Airflow was designed to move structured data between systems on a schedule. That worked when every task shared one environment and one compute profile. But AI workloads broke that assumption:

One-size-fits-all compute. Airflow runs tasks in a monolithic environment. When your preprocessing needs CPUs and your training needs GPUs, you're either overprovisioning everything or bolting on workarounds.
No native data handoff. Passing data between tasks means writing custom serialization, storage logic, and glue code — a primary source of bugs and version drift.
Static DAGs, brittle pipelines. The execution graph is defined before runtime. When an LLM call needs to branch, retry with different parameters, or spin up new tasks based on intermediate results, Airflow has no answer.
Lineage is an afterthought. Tracking which data produced which model requires external tooling. At scale, this becomes a compliance and debugging nightmare.

If you’re running simple data workloads, these compromises can be totally fine. But if you’re orchestrating AI or agentic projects, you’ll need an AI runtime.
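
To make the static-DAG limitation concrete, the sketch below shows control flow that only exists at runtime: both the branch taken and the number of follow-up tasks depend on an intermediate result. It is plain Python with hypothetical stand-in functions (`classify`, `split_into_chunks`, `summarize`), not Airflow or Flyte code.

```python
def classify(document: str) -> str:
    """Stand-in for a model call whose result is unknown until runtime."""
    return "long" if len(document.split()) > 5 else "short"


def split_into_chunks(document: str, size: int) -> list[str]:
    """Break a document into chunks of at most `size` words."""
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def summarize(chunk: str) -> str:
    """Stand-in for a per-chunk task: returns the chunk's first word."""
    words = chunk.split()
    return words[0] if words else ""


def process(document: str) -> list[str]:
    # The branch taken, and how many tasks are spawned, are decided here,
    # after classify() has returned, not when the pipeline is defined.
    if classify(document) == "short":
        return [summarize(document)]
    return [summarize(chunk) for chunk in split_into_chunks(document, size=3)]


print(process("hello world"))                        # ['hello']
print(process("one two three four five six seven"))  # ['one', 'four', 'seven']
```

The short input runs a single task, while the long input fans out into three. A graph fixed before execution cannot know which shape it will need.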

Per-task compute profiles

Airflow: Limited. Tasks run in a shared worker environment; separating CPU preprocessing from GPU training requires bolted-on workarounds like KubernetesPodOperator, adding config overhead and complexity.
Flyte 2: Each task declares its own compute requirements; GPU type, memory, and CPU are set per task.
Union.ai: Same per-task compute plus task-level routing across clusters; one workflow can preprocess on CPU spot, train on H100s, and validate on cheaper GPUs without manual coordination.

Native typed data passing

Airflow: XCom for small values; large data requires custom serialization, S3 paths, and glue code between tasks, a primary source of bugs and version drift.
Flyte 2: Typed inputs and outputs passed natively between tasks; no manual serialization or path management.
Union.ai: Same typed data model plus artifact registry; outputs are versioned, browsable, and lineage-linked across runs.

Dynamic workflows at runtime

Airflow: DAG structure is fixed at parse time; branching is limited to pre-defined paths; workflows cannot adapt based on what an LLM call or intermediate result returns.
Flyte 2: Pure Python control flow; tasks can spawn new tasks, branch, and adapt based on intermediate results at runtime.
Union.ai: Same dynamic model plus runtime resource overrides; tasks can request different hardware mid-workflow based on what intermediate results require.

Data lineage

Airflow: External only. Requires OpenLineage or custom tooling; not built into the platform; at scale this becomes a compliance and debugging liability.
Flyte 2: Partial. Typed inputs and outputs are tracked per task, but lineage is limited to execution metadata and user conventions.
Union.ai: Full cross-run provenance graph queryable through UI and SDK; when a dataset is bad, immediately identify every model and artifact downstream.

Python-native authoring

Airflow: Partial. DAGs are written in Python but as static declarative definitions; the operator pattern feels like config, not code; loops and conditionals define structure, not execution.
Flyte 2: Pure Python with decorators; workflows are real Python functions with loops, conditionals, and normal async control flow that execute at runtime.
Union.ai: Same Python model; no rewriting needed to move from Flyte OSS to Union.ai.

Self-healing and retries

Airflow: Partial. Task-level retries with configurable backoff, but no adaptive failure recovery or typed exception handling.
Flyte 2: Configurable retry policies with backoff; failed tasks restart automatically.
Union.ai: Same retry model plus typed exception handling; catch specific failure modes and adapt rather than retrying blindly.

Local development

Airflow: Partial. DAGs can be tested locally, but environment parity with production requires manual setup; Kubernetes executor behavior does not replicate locally.
Flyte 2: Run any workflow locally with pyflyte run; full environment parity before pushing to production.
Union.ai: Same local execution plus Union devbox for iteration against production-identical infrastructure.

Cold start latency

Airflow: 30s+. Celery and Kubernetes executor cold starts are typically 30s or more; the local executor is faster but not production-grade.
Flyte 2: ~30s. Standard Kubernetes pod scheduling and container startup on every task.
Union.ai: <1s. Reusable containers keep the process warm; sub-100ms for repeated invocations, the difference between a batch job and an interactive loop.

Task fanout

Airflow: Limited. Dynamic task mapping helps, but large fan-out hits scheduler bottlenecks at high cardinality.
Flyte 2: ~10K tasks. Bounded by Kubernetes control-plane scheduling throughput.
Union.ai: 250K+ tasks. Purpose-built execution substrate bypasses the K8s pod-scheduler bottleneck that caps Flyte OSS at high cardinality.

Deploys in your cloud

Airflow: DIY ops. Self-managed or cloud-specific (MWAA on AWS, Cloud Composer on GCP); managed options exist but are locked to one cloud, or your team owns all ops.
Flyte 2: DIY ops. Self-managed on any cloud, but your team owns all installation, ops, upgrades, and infrastructure.
Union.ai: Union.ai manages the platform so your team focuses on workflows, not infrastructure maintenance.
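
The "Self-healing and retries" comparison contrasts blind retries with catching a specific failure mode and adapting. A minimal sketch of that idea in plain Python follows. The exception classes, the `train` stand-in, and the doubling policy are all hypothetical, and this is not the Flyte or Union.ai API.

```python
class OutOfMemory(Exception):
    """Hypothetical typed failure: the task needed more memory."""


class TransientError(Exception):
    """Hypothetical typed failure: worth retrying with the same settings."""


def train(memory_gib: int) -> str:
    """Stand-in task that only succeeds with at least 16 GiB."""
    if memory_gib < 16:
        raise OutOfMemory(f"{memory_gib} GiB was not enough")
    return f"trained with {memory_gib} GiB"


def run_with_recovery(memory_gib: int = 4, max_attempts: int = 5):
    """Retry `train`, adapting the request to the kind of failure seen."""
    attempts = []
    for _ in range(max_attempts):
        attempts.append(memory_gib)
        try:
            return train(memory_gib), attempts
        except OutOfMemory:
            memory_gib *= 2  # adapt: ask for more memory on the next attempt
        except TransientError:
            pass  # retry unchanged
    raise RuntimeError(f"gave up after {max_attempts} attempts: {attempts}")


result, attempts = run_with_recovery()
print(result, attempts)  # trained with 16 GiB [4, 8, 16]
```

A blind retry would have failed five times at 4 GiB. Handling the typed error lets the third attempt succeed.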

Union.ai is AI-native orchestration

Union.ai, the enterprise Flyte platform, is expressly designed for AI engineers. Teams can build workflows that are:

Self-healing, so failed pipelines recover autonomously and continue
Dynamic, so your AI systems and agents can make decisions on the fly at runtime
Authored in pure Python, so you can easily go from local dev to production in your cloud
Compute-aware, operating in your cloud and auto-scaling to optimize usage
Scalable and efficient, handling large task fanout and parallelism with ease

Union.ai is built for production

The platform deploys to your secure cloud

Enhanced scale and performance, with significantly improved actions/run, concurrency, and task startup time
End-to-end AI lifecycle support, including orchestration, training and fine-tuning, and inference
Developer-loved UI, for faster, easier development cycles
Observability, including data lineage, resource usage, and failure logs
Portability to open-source, for teams looking to avoid lock-in

Teams report that Union.ai accelerates them from prototype to production, cutting iteration cycle time in half.

The Union.ai team offers high-touch support to ensure users are successful.

Flyte 2 OSS: Open-source AI runtime

Flyte 2 OSS is an open-source AI runtime that brings Flyte’s core data model, scalability, and reliability to DIY teams. While it lacks some enterprise capabilities of Union.ai, it remains the most capable open-source AI runtime available. It’s trusted by teams worldwide, with 80M+ downloads and growing.