Announcing Zero Trust Security Architecture

Haytham Abuelfutuh

Your data never transits Union.ai's infrastructure

Every enterprise security review eventually narrows to one question: where, exactly, does our data go?

The industry-standard answer is: through the vendor. Workflow inputs and outputs are stored in the vendor's cloud. Logs stream through the vendor's services. The UI renders data that was decrypted on the vendor's machines. "Encryption in transit" gets offered as reassurance, but the data is still passing through infrastructure the vendor operates and controls.

The strongest security posture is not strong encryption. It is data that was never there in the first place.

Think of it this way: most vendors ask you to store your valuables in their safe, not yours. They assure you the safe is locked, you can stop by anytime you want, and only you have the combination. But the safe lives in their building, on their infrastructure, accessible to their employees. For security-conscious teams, such as those in defense, finance, and healthcare, that arrangement fails to deliver the security they need.

Union.ai is built differently. Your safe stays in your building.

How other platforms handle this

On other platforms, data transits vendor infrastructure

A common pattern for software infrastructure is to route data through a vendor-operated control plane. Signed URL generation, log fetching, input and output retrieval, and auxiliary UIs (Ray dashboards, Spark history, in-task debuggers) all transit vendor infrastructure before reaching your screen. The data is encrypted in flight, but it passes through machines the vendor runs, and a vendor employee with the right access can see it.

On other platforms, data is stored in vendor infrastructure

Some platforms compound this with a structural problem: their control plane stores workflow payloads in a vendor-operated database. This means payload sizes get capped in the low single-digit megabytes, the vendor charges for the storage, and the workaround is a "payload codec" that customers implement, deploy, and key-manage themselves. The goal of the codec is to encrypt payloads before they reach the vendor's service, so that data stored on vendor infrastructure remains opaque even if that service is compromised. But it is a fix to a problem that should not exist in the first place.

How Union.ai Zero Trust is architected

Union.ai runs on a split-plane model. The control plane handles orchestration, scheduling, identity, and the API surface. The data plane runs inside the customer's own secure cloud, against the customer's object store, and is where every input, output, log, and report lives.

This follows from how Flyte 2 is designed for Union.ai: the control plane only ever holds references to data, never the data itself. Customers routinely pass terabytes between tasks at low latency and no extra cost, because the bytes never leave their cloud. There are no payload caps, no storage charges, and no encryption workaround to maintain.

The dataproxy service, responsible for signed URLs, log fetching, I/O retrieval, and auxiliary UIs, lives inside the customer's data plane within their security perimeter. Every read of a workflow input, output, log, or code bundle is served directly from inside your cloud. Writes go the same way: the client uploads inputs straight to your object store, then tells Union.ai "this run uses that data" by reference. While the control plane holds orchestration-related metadata (such as run IDs and scheduling information), sensitive customer data never flows through Union.ai infrastructure.
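The write path described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `ObjectStore`, `ControlPlane`, `submit_run`, and the `s3://customer-bucket/...` URI are stand-ins invented for illustration, not Union.ai or Flyte APIs. It only shows the shape of the exchange: the payload lands in customer-owned storage, and the control plane record carries a URI and metadata.

```python
import hashlib
import uuid
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical in-memory stand-in for a customer-owned bucket."""
    bucket: str
    blobs: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> str:
        self.blobs[key] = data
        return f"s3://{self.bucket}/{key}"

    def get(self, uri: str) -> bytes:
        prefix = f"s3://{self.bucket}/"
        return self.blobs[uri[len(prefix):]]


@dataclass
class ControlPlane:
    """Hypothetical vendor side: keeps run metadata and references only."""
    runs: dict = field(default_factory=dict)

    def register_run(self, input_uri: str, size_bytes: int) -> str:
        run_id = uuid.uuid4().hex
        self.runs[run_id] = {
            "input_uri": input_uri,
            "size_bytes": size_bytes,
            "status": "QUEUED",
        }
        return run_id


def submit_run(store: ObjectStore, control_plane: ControlPlane, payload: bytes) -> str:
    # Step 1: the client writes the bytes straight to customer storage.
    key = f"inputs/{hashlib.sha256(payload).hexdigest()}"
    uri = store.put(key, payload)
    # Step 2: the control plane is told "this run uses that data" by reference.
    return control_plane.register_run(uri, len(payload))


store = ObjectStore(bucket="customer-bucket")
control_plane = ControlPlane()
payload = b"patient_id,diagnosis\n1001,example\n"

run_id = submit_run(store, control_plane, payload)
record = control_plane.runs[run_id]
print(record["input_uri"])
```

In this sketch the control plane record contains a URI, a size, and a status. Only code that can reach the object store can turn the URI back into bytes.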

Union.ai provisions a secure Cloudflare tunnel for the data connection, initiated outbound from inside your cluster. No inbound ports are opened on the data plane, no firewall rules need to change, and no public load balancer is provisioned. Your cluster dials out; traffic comes back along the same authenticated channel. Every request is authorized against your Union.ai identity and RBAC policy, enforced by an Envoy router inside the data plane.
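The "dial out, answer on the same channel" pattern can be sketched with plain TCP sockets on localhost. This is an illustration of the general reverse-connection idea only, not the Cloudflare tunnel protocol: there is no TLS, authentication, or RBAC here, and the `edge` and `agent` names and the request format are assumptions made for the example.

```python
import socket

# "Edge": hypothetical stand-in for the tunnel endpoint outside the cluster.
# It is the only party that listens for connections.
edge = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
edge.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
edge.listen(1)

# "Agent": stand-in for the process inside the cluster.
# It never binds or listens; it only dials out.
agent = socket.create_connection(edge.getsockname())
tunnel, _ = edge.accept()

# A request travels back down the connection the agent opened.
tunnel.sendall(b"GET logs/run-123\n")
with agent.makefile("rb") as reader:
    request = reader.readline()

# The agent answers on the same connection.
agent.sendall(b"OK " + request)
with tunnel.makefile("rb") as reader:
    response = reader.readline()

for sock in (agent, tunnel, edge):
    sock.close()

print(response)
```

The agent side has no listening socket at any point, which is the property that lets the data plane run without an open inbound port.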

The control plane coordinates the run. It does not touch the payload.

Examples of what Zero Trust protects against

A compromised Union.ai control plane yields workflow metadata: run IDs, status, scheduling references, and identity records. Workflow data, logs, code bundles, and auxiliary UIs are unreachable from it. The control plane does not hold or proxy any of that. Moving laterally to your actual data requires also compromising your IAM and KMS, which live in your cloud account.
A malicious Union.ai employee has the same blast radius as a compromised control plane. Operator access to Union.ai infrastructure does not extend to your cloud account, which is where the data lives.
An external attacker finds no inbound port open on your data plane. The tunnel is initiated outbound from inside your cluster. The only Union.ai surface reachable from the public internet is the control plane API, which holds only metadata and identity. Your workflow data sits behind your own cloud perimeter.
Future cryptanalysis is where the architectural choice matters most. Recorded ciphertext from data that transited a control plane could eventually be decrypted. Data that never transited the control plane cannot be, because those bytes were never on a third-party wire to record. The strongest security posture is not strong encryption. It is data that was never there in the first place.

The guarantee is verifiable by inspection: read the Helm chart, the tunnel config, the dataproxy deployment manifest, and the Envoy filter chain. The property falls out of the topology. It does not depend on Union.ai employees behaving correctly, on incident response procedures, on a future audit catching a regression, or on cryptographic primitives remaining unbroken.

What this means for you

Our Zero Trust security architecture is designed to give both platform engineers and AI engineers what they need.

Your security review gets shorter.

Your security team does not need to take our word for it. Cloud audit logs confirm the tunnel pod's outbound-only connection pattern. Control plane logs record every API call; by inspection, none carry data payloads. Cloud-native audit trails (CloudTrail, GCP Cloud Audit, Storage logs) record every read and write to your object store, with a customer-controlled identity as the principal on every payload-bearing access, never Union.ai. All of this is ingestible into your own SIEM without our involvement.
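A check of this kind can be sketched in Python. The record shape, the principal names, and the action names below are assumptions made for illustration; real audit records use provider-specific schemas and would need to be mapped into something like `AccessRecord` first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRecord:
    """Hypothetical, simplified view of one object-store audit event."""
    principal: str  # identity that made the call
    action: str     # operation name, e.g. "GetObject"
    resource: str   # object that was touched


# Assumption: identities that live in the customer's own cloud account.
CUSTOMER_PRINCIPALS = {"role/dataplane-worker", "role/dataproxy"}
# Assumption: operations that read or write payload bytes.
PAYLOAD_ACTIONS = {"GetObject", "PutObject"}


def foreign_payload_accesses(records):
    """Return payload-bearing accesses made by a non-customer principal."""
    return [
        r for r in records
        if r.action in PAYLOAD_ACTIONS and r.principal not in CUSTOMER_PRINCIPALS
    ]


records = [
    AccessRecord("role/dataplane-worker", "PutObject", "s3://customer-bucket/outputs/a"),
    AccessRecord("role/dataproxy", "GetObject", "s3://customer-bucket/outputs/a"),
    # Deliberately planted example of what the check should flag.
    AccessRecord("vendor/support-engineer", "GetObject", "s3://customer-bucket/outputs/a"),
]

violations = foreign_payload_accesses(records)
for v in violations:
    print(f"unexpected principal {v.principal} performed {v.action} on {v.resource}")
```

Under the architecture described here, the expectation is that a scan like this over real object-store audit logs returns an empty list. The third record exists only to show what a violation would look like.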

Union.ai holds SOC 2 Type II across Security, Availability, and Processing Integrity. The upgraded MSA codifies the data isolation guarantee contractually. The architecture makes it true; the contract makes it binding.

If you have been carrying a hand-rolled encryption layer, a custom proxy, or an internal debate about whether the orchestration platform your team wants can clear your security review, Zero Trust resolves that directly.

Your engineering team gets the full product.

Most platforms that offer a stricter data path do it by disabling the parts of the product that make it useful: no log streaming, no input/output inspection, no built-in dashboards. Zero Trust keeps every visualization feature intact, including Ray dashboards, Spark history server, and in-task debuggers. Tightening the security posture does not cost you the product.

Direct-to-DataPlane also has lower latency than the standard architecture, not higher. One fewer hop. Log streaming is faster. Large input uploads are faster. Egress costs fall because data no longer leaves your cloud and returns.

Full security architecture documentation: union.ai/docs/v2/union/security
