Do you offer discounts for startups?

Talk to our Startup Team for information about startup discounts.

Do you offer discounts for NGOs, universities, or non-profits?

Talk to our Public Sector Team for information about discounts for NGOs, universities, or non-profits.

How do monthly credits work for the Team plan?

Your monthly plan fee is issued back to you as usage credits. In practice, this means your monthly plan cost becomes your minimum monthly spend, and you can use that same amount in usage at no additional charge. Any usage beyond that amount is billed separately.
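As a rough illustration of the credit arithmetic described above, the sketch below computes a month's total charge. The function name and the dollar amounts are hypothetical and chosen only for the example; they are not Union.ai prices.

```python
def monthly_bill(plan_fee: float, usage: float) -> float:
    """Total charge for one month.

    The plan fee is always charged and is returned as usage credits,
    so only usage beyond the fee is billed on top of it.
    """
    overage = max(0.0, usage - plan_fee)
    return plan_fee + overage


# Hypothetical amounts: usage below the plan fee costs just the fee.
print(monthly_bill(plan_fee=950.0, usage=400.0))   # 950.0
# Usage above the plan fee adds only the difference.
print(monthly_bill(plan_fee=950.0, usage=1200.0))  # 1200.0
```

In other words, the total is the larger of the plan fee and the month's usage.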

What is an action?

An action is an individual execution of a task. It represents a specific invocation of a task with particular inputs. If a task runs multiple times (such as inside a loop), you'll see multiple actions, one for each invocation.

Can my team have a forward-deployed engineer (FDE) from Union.ai to help us build?

Book a consultation to discuss having a forward-deployed engineer from Union.ai help your team build.

Do you offer a self-hosted control plane as a deployment option?

Book a consultation to discuss self-hosted control plane deployment options.

How do you calculate GPU, CPU, and memory hours of usage?

We report the allocated resources (CPU, Memory, and GPU accelerator) from each container running the actions within your workflows and apply usage-based pricing down to the second. We do not include the resources consumed by any other services. Therefore, if you run Union on a shared K8s cluster, you are only paying for usage on the resources consumed by your Union tasks and workflows.
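To make the per-second accounting concrete, here is a minimal sketch of how allocated resources could be turned into resource-hours and a cost. The hourly rates, the `ContainerRun` type, and the function names are all assumptions made for illustration; they do not reflect Union.ai's actual pricing or billing code.

```python
from dataclasses import dataclass

# Hypothetical hourly rates, for illustration only (not Union.ai's actual prices).
RATES_PER_HOUR = {"cpu": 0.04, "memory_gib": 0.005, "gpu": 2.00}


@dataclass
class ContainerRun:
    cpu: float         # allocated vCPUs
    memory_gib: float  # allocated memory in GiB
    gpu: int           # allocated GPU accelerators
    seconds: int       # how long the container ran, to the second


def resource_hours(run: ContainerRun) -> dict[str, float]:
    """Convert one container's allocation and runtime into resource-hours."""
    hours = run.seconds / 3600
    return {
        "cpu": run.cpu * hours,
        "memory_gib": run.memory_gib * hours,
        "gpu": run.gpu * hours,
    }


def cost(runs: list[ContainerRun]) -> float:
    """Sum the cost of every container run at the hypothetical rates above."""
    total = 0.0
    for run in runs:
        for resource, hours in resource_hours(run).items():
            total += hours * RATES_PER_HOUR[resource]
    return total


# A 90-second action on 4 vCPUs, 16 GiB of memory, and 1 GPU.
run = ContainerRun(cpu=4, memory_gib=16, gpu=1, seconds=90)
print(resource_hours(run))
print(round(cost([run]), 4))
```

Only containers that run your actions would appear in `runs`; other services on a shared cluster are left out of the sum.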

Is Union.ai a SaaS service?

No. You deploy the Union operator into a Kubernetes cluster you manage, which securely communicates with the Union control plane to poll for work. Your workflow executions, code, images, data, logs, and secrets all remain in your VPC/cloud, and are inaccessible to Union.

What's the difference between action concurrency and actions/run (i.e., task fanout)?

Fanout is the total number of actions a run creates, while concurrency is how many of those actions are running at the same time. For example, a run might fan out to 50,000 actions but only execute around 100 of them concurrently.
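The distinction can be sketched with plain `asyncio`, which is used here only as an analogy and is not how Union.ai schedules actions. The function name and the numbers are hypothetical: the run creates `fanout` actions in total, while a semaphore caps how many are in flight at once.

```python
import asyncio


async def run_with_limit(fanout: int, concurrency: int) -> tuple[int, int]:
    """Run `fanout` dummy actions with at most `concurrency` in flight.

    Returns (number of completed actions, peak number running at once).
    """
    semaphore = asyncio.Semaphore(concurrency)
    running = 0
    peak = 0
    completed = 0

    async def action() -> None:
        nonlocal running, peak, completed
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0)  # stand-in for real work
            running -= 1
            completed += 1

    # Fanout: every action is created up front.
    await asyncio.gather(*(action() for _ in range(fanout)))
    return completed, peak


completed, peak = asyncio.run(run_with_limit(fanout=5_000, concurrency=100))
print(f"fanout={completed}, peak concurrency={peak}")
```

All 5,000 actions complete, but the peak never exceeds the limit of 100.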

Can I run Union.ai in my own cloud environment?

Yes, Union.ai supports bring-your-own-cloud (BYOC) deployments. You can run it in your own AWS, GCP, Azure, or neo-cloud environment while maintaining full control over your data, security, and infrastructure.

Flyte 1 vs. Flyte 2 vs. Union.ai

All three share the same Python-native authoring model. Your workflows don’t change. What changes is the runtime: how far it scales, how fast it executes, and how much of the infrastructure your team has to own.

Try the devbox

A free, local sandbox to explore the Union.ai platform.

Chat with an engineer

Compare Features

Flyte workflows run on Union.ai without rewriting. The comparison below covers execution architecture, runtime capabilities, and operational model: the parts that determine whether your platform can keep up with your team.

Flyte 2 and Union.ai share the same Python-native authoring, dynamic workflows, and typed exception handling.

Flyte 1

Flyte 2

Union.ai

Scale & Throughput

Flyte's execution model submits one Kubernetes pod per task. At low volumes this is fine. At high cardinality it creates cascading pressure on the pod scheduler, the K8s API server, and etcd. The limits below are symptoms of that architecture, not configuration choices.

Workflow fanout

~5K tasks

bounded by map-task mechanics and K8s control-plane throughput

~10K tasks

improved orchestration, same underlying K8s scheduling path

250K+ tasks

high-cardinality scheduling runs through a purpose-built execution substrate, bypassing pod-scheduler and etcd write pressure

Workflow executions / hour

~10K/hr

~1M/hr

designed for continuous evaluation services, experiment CI, and high-frequency model testing

Task executions / minute

~150/min

each task pays pod scheduling, image pull, container start, and Python import overhead

~250/min

Flyte 2 improves control flow, not the per-task startup path

~5,000/min

warm execution eliminates repeated Kubernetes lifecycle overhead for short-running tasks

Cold start latency

~30s

<1s

sub-100ms for tasks running on warm reusable containers

Concurrent actions

~500

bounded by controller and global cluster limits

~500

~30K

concurrency is tunable at the project and domain level

GPU reuse across invocations

every task requests a GPU, starts a container, runs, and releases; accelerator sits idle during setup/teardown

same pod-per-task model

reusable containers keep the process and GPU context warm across repeated calls; tool calls, rerankers, embedding steps, and chained inference stop wasting accelerator time on lifecycle overhead. Smart batching techniques enabled by GPU reuse can push GPU utilization toward 100%.

Authoring & Workflow Semantics

Flyte 2 and Union.ai share the Python-native authoring model introduced in Flyte 2, while Flyte 1 uses a static DAG DSL. The differences below are about what the runtime can safely execute and how dynamically workflows can adapt at runtime.

Python-native workflow authoring

Static DAG DSL

workflows must be fully defined at compile time; loops, conditionals, and branching require Flyte-specific constructs

Pure Python

tasks can call tasks directly with loops, conditionals, and normal async control flow

Pure Python

same authoring model; Union.ai adds production runtime capabilities around identical code, plus priority control and throttles via Queues

Workflow sandboxing for generated code

Monty sandbox starts in microseconds and structurally blocks filesystem access, network I/O, OS calls, and arbitrary imports; heavy computation runs in isolated containers

same sandbox foundation with Union.ai runtime, serving, and ops layers around it

Typed exception handling across task boundaries

failures surface through task state and static retry policies only

OOM, spot preemption, and custom errors can be caught and handled in Python control flow

catch OOM and retry with more memory; catch spot interruption and resume from checkpoint; branch recovery logic without rebuilding the workflow

Runtime resource overrides

resource shape is set at compile time; adapting to data size or failure mode requires redesign

override GPU, memory, image, retry policy, and env at execution time

same override model with Union.ai's multi-cluster routing available as an additional dispatch layer
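The "catch OOM and retry with more memory" pattern from the typed exception handling row can be sketched in ordinary Python. This is a simulation under stated assumptions, not the Flyte SDK: `TaskOOMError`, `train`, and `train_with_escalation` are hypothetical names, and the simulated task simply raises when its memory budget is too small.

```python
class TaskOOMError(Exception):
    """Hypothetical stand-in for an out-of-memory failure surfaced by the runtime."""


def train(rows: int, memory_gib: int) -> str:
    """Simulated task that needs 1 GiB per million rows and fails otherwise."""
    needed_gib = rows // 1_000_000
    if needed_gib > memory_gib:
        raise TaskOOMError(f"needs {needed_gib} GiB, got {memory_gib} GiB")
    return f"trained on {rows} rows with {memory_gib} GiB"


def train_with_escalation(rows: int, start_gib: int = 4, max_gib: int = 64) -> str:
    """Catch the OOM in normal control flow and retry with double the memory."""
    memory_gib = start_gib
    while True:
        try:
            return train(rows, memory_gib)
        except TaskOOMError:
            if memory_gib * 2 > max_gib:
                raise  # give up once the next step would exceed the cap
            memory_gib *= 2


print(train_with_escalation(rows=20_000_000))
# trained on 20000000 rows with 32 GiB
```

The recovery logic lives in a plain `try`/`except` loop, so the workflow does not need to be rebuilt to change the resource shape after a failure.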

Realtime AI & Agents

Flyte was designed as a batch orchestration system. The rows below reflect what it takes to move from running experiments to powering production applications.

Live model / agent / API serving

batch system; a separate serving stack is required

Partial

batch workflows and optional serving via KServe

serve models, agents, APIs, and applications directly from Union's custom engine, with a customized identity-aware networking layer

Realtime inference

GPU-backed inference endpoints with autoscaling, reading artifacts produced by Union.ai workflows

Agent execution runtime

every tool call or short unit of work pays full batch-system overhead; impractical for chained steps

Partial

Flyte 2 improves control-flow expressiveness significantly, but short tasks still pay pod-per-task startup cost

No new DSL, just Python. Everything runs in the user's cloud, fully secured and zero-trust by design, and SLMs can be hosted locally. No fragmentation: train, process data, run agents, and serve on one platform. No vendor lock-in. MCP servers can be deployed privately within your cloud.

Unified batch and realtime lifecycle

training/eval pipelines and serving are separate systems

the same data plane runs training workflows and serves live endpoints: preprocess, train, evaluate, register, serve, and observe without stitching together two platforms

Infrastructure & Scheduling

The capabilities below are either entirely absent in Flyte OSS or left to the platform team to build and operate independently.

Task-level cluster routing

cross-cluster execution requires external orchestration or separate Flyte deployments

placement is workflow-level; individual tasks cannot be dynamically routed to different clusters

Union.ai's global scheduler routes individual tasks dynamically across registered clusters by resource type, cost, availability, region, or policy, without splitting the workflow. Queues add priority and concurrency control per project or domain. One workflow can preprocess on CPU spot, fine-tune on H100s, and validate on cheaper GPUs.

Fail-fast resource validation

unschedulable pods sit Pending until someone inspects K8s events

if requested hardware doesn't exist, pods remain Pending with no workflow-level explanation

validates resource requests against actual schedulable inventory before pod submission; no more workflows stuck for hours because someone requested A100s in a T4 cluster

Accelerated datasets

each pod downloads from object storage at startup unless the customer builds shared cache infrastructure

same per-pod download behavior from S3/GCS

large read-only datasets defined once and pre-mounted as shared volumes (EFS/FSx for Lustre); eliminates 8 to 12 minutes of per-pod download time before useful work starts

Remote image building

local Docker build, push to registry, reference in code

same local Docker/registry flow

ImageSpec builds images remotely inside your data plane via Kaniko; works from notebooks, CI, and Apple Silicon with no local Docker or registry credential management

Deployment path

Manual

Helm, Postgres, ingress, object store, IAM, auth, logging, and upgrades assembled and operated by your team

Manual

same Helm-driven deployment with the same operational burden

Self-service Terraform (BYOC)

provisions data plane, agent, IAM, buckets, and connectivity; platform teams validate workloads in the first hour instead of spending the first sprint wiring infrastructure

Time to first workflow

Days

for an experienced K8s/IAM/Flyte team

~2 days

with K8s experience, following docs, without blockers

<1 hour

via self-service Terraform, plus optional Union.ai onboarding support

Distributed training

Kubeflow operator only

Clustered tasks via Kubernetes jobsets

Clustered tasks with native metrics and observability, gang scheduling on dedicated clusters, ability to run on the primary training node for fast iteration
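The fail-fast resource validation row in this section can be pictured with a small check that runs before anything is submitted. This is a sketch under stated assumptions rather than Union.ai's implementation: `ResourceRequest`, `validate`, and the inventory dictionary are hypothetical, and a real inventory would come from the cluster itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    gpu_type: str
    gpu_count: int


# Hypothetical inventory: schedulable GPUs per node, keyed by GPU type.
T4_CLUSTER = {"T4": 4}


def validate(request: ResourceRequest, inventory: dict[str, int]) -> None:
    """Raise immediately if the request can never be scheduled on this inventory."""
    schedulable = inventory.get(request.gpu_type, 0)
    if request.gpu_count > schedulable:
        raise ValueError(
            f"requested {request.gpu_count}x {request.gpu_type}, "
            f"but at most {schedulable} are schedulable"
        )


validate(ResourceRequest("T4", 2), T4_CLUSTER)  # passes silently

try:
    validate(ResourceRequest("A100", 1), T4_CLUSTER)
except ValueError as err:
    print(f"rejected before submission: {err}")
```

Rejecting the A100 request up front replaces a pod that would otherwise sit Pending with an error the workflow author sees right away.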

Observability, Data & Cost

Flyte OSS surfaces execution state. The rows below cover what it takes to understand what your workflows are doing, where your spend is going, and where your data came from.

Realtime and persisted logs

External only

historical search requires your own CloudWatch, Loki, Elasticsearch, or equivalent

External only

logs are not retained after pod termination without a separately managed logging backend

Union.ai collects and indexes task logs in your data plane; query logs from last Tuesday without asking platform engineering to re-run the job or dig through cloud log groups

Per-task CPU/GPU/memory profiling

External only

requires Prometheus/Grafana, Datadog, or custom sidecars

External only

same external monitoring dependency

CPU, GPU, and memory time-series graphs scoped to each task execution in the UI; see that a fine-tuning job ran at 3% GPU utilization, or that a task OOMs at 14.8GB every time, without correlating workflow IDs to a separate dashboard

Cost attribution

no native path from cloud bill to workflow, team, project, or execution

cost broken down by project, domain, workflow, and execution with configurable resource pricing; answer which team or model burned the GPU budget without building a separate cost pipeline on top of Kubernetes billing exports

Artifact registry

typed blobs and execution metadata exist, but no browsable registry with tags, versions, or lineage

raw blobs and metadata are insufficient for model governance or discoverability

artifacts are typed, tagged, versioned, and lineage-linked; find which dataset trained a model, which workflow produced it, and trigger downstream evaluations automatically when a new artifact appears

Cross-run lineage and provenance

limited to execution metadata and user conventions

full provenance graph queryable through the UI and SDK; when a dataset is bad, immediately identify which models, evaluations, and downstream artifacts are affected

Ray and Spark dashboards

Self-managed

teams run Ray Dashboard and Spark History Server themselves and solve ingress, auth, and VPN access independently

Self-managed

linked from the task detail view and proxied securely through Union.ai auth and RBAC; no custom ingress, no port hunting

Security & Governance

Secrets management

Manual

K8s Secrets with manual Vault integration; secrets may live unencrypted in etcd depending on deployment configuration

Manual

same K8s Secrets/Vault integration with no managed lifecycle

‘flyte create secret’ writes to AWS Secrets Manager, GCP Secret Manager, Azure secret manager, Vault, or a custom backend; credentials are injected at runtime and never baked into images or K8s YAML

Fine-grained RBAC

project- and domain-scoped viewer/developer/admin roles mapped from SSO groups; separate dev from prod and team from team without creating a new cluster for every access boundary

SSO and IdP group sync

Basic OIDC

Basic OIDC

managed SAML/OIDC with IdP group sync into Union.ai roles; onboard teams to a shared AI platform without building a custom permissions admin system

Zero-trust data path

Self-managed

data stays in customer infrastructure, but the customer owns full control plane operations and cost

Self-managed

all data, logs, I/O, and auxiliary UIs stay within your VPC via Direct-to-DataPlane routing. Union hosts the control plane, eliminating the ~20% infrastructure cost of running it yourself, without ever touching customer data

Platform Operations

Production support

Community only

GitHub issues and Slack; no SLA, no escalation path

Community only

GitHub issues and Slack; no SLA

enterprise support channel with direct escalation to Union.ai engineering; when production AI infrastructure breaks before a model release, your platform team has an accountable vendor to call, not just a GitHub issue to file

Results, proven in production.

Hopper visualizes 4.4 billion flight prices with pure Python orchestration

Woven by Toyota saves millions and scales autonomous driving with Union.ai

Rezo accelerates drug discovery while saving >90% on compute costs with Union.ai

Frequently asked questions

What makes Union.ai better for production AI?

Union.ai outperforms any OSS alternative on scale and performance in production. It supports 50K+ actions per workflow, 10,000+ concurrent actions per run, and cold start under 5 seconds. Reusable warm-start containers, per-action GPU and CPU profiling, cost attribution per team and workflow, and fail-fast resource validation at launch are the capabilities that separate a platform you can run experiments on from one you can run a business on.

What are reusable containers and when do they matter?

Most orchestrators launch a new Kubernetes pod per action, which adds ~10 seconds of overhead before your code runs. Union.ai supports reusable containers: warm containers that are shared across similar tasks. Cold start drops to under 100ms and the GPU stays allocated across invocations. For teams building agentic AI, RAG pipelines, or multi-step inference workflows, this removes lifecycle overhead from every short call.
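A back-of-the-envelope calculation shows why this matters for short tasks. The startup figures come from the answer above (~10 seconds per new pod, 100ms taken as the upper bound for a warm container); the number of calls, the 500ms of work per call, and the assumption that calls run one after another are hypothetical.

```python
def total_ms(calls: int, work_ms: int, startup_ms: int) -> int:
    """Wall-clock time for sequential calls that each pay the startup cost."""
    return calls * (work_ms + startup_ms)


CALLS = 1_000   # hypothetical number of short tool calls
WORK_MS = 500   # hypothetical useful work per call

pod_per_action = total_ms(CALLS, WORK_MS, startup_ms=10_000)
warm_container = total_ms(CALLS, WORK_MS, startup_ms=100)

print(pod_per_action / 1000)  # 10500.0 seconds
print(warm_container / 1000)  # 600.0 seconds
```

Under these assumptions, startup overhead accounts for about 95% of the pod-per-action total and about 17% of the warm-container total.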

How hard is it to migrate from Flyte to Union.ai?

Flyte workflows run on Union.ai without rewriting. The SDK is compatible and the authoring model is identical. The migration is mostly operational and straightforward. Most teams run their first workflow on Union.ai within an hour of starting setup.

We’re running open-source Flyte. What’s the real cost?

Flyte OSS is free to license. Operating it (or any open-source orchestrator) is not free. A stable production deployment requires a significant amount of manual maintenance that gets more costly as you scale. Engineers must manage Helm values, Postgres, ingress config, a separate secrets solution, an external log aggregation stack, and ongoing K8s maintenance. Union.ai offloads this maintenance so your team focuses on workflows, not infrastructure. The break-even on engineer time tends to come faster than most teams expect.

Does Union.ai make sense for smaller teams?

Scale is one part of the value. The features that tend to matter first for smaller teams are data lineage, persistent logs and built-in observability, and managed secrets that pass a security review without custom engineering. RBAC and cost attribution matter as soon as a second team starts touching the same platform. The operational overhead of self-managed Flyte tends to grow faster than the team itself does.

Why is Union’s zero trust architecture trusted by extremely security-sensitive industries?

Union’s zero trust security architecture means data NEVER transits outside your secure cloud. No model weights, pipeline outputs, or execution logs leave your environment. This is more secure than the industry status quo, where you’re required to trust a vendor to handle your data safely.

Start today and scale with confidence.
