Inference

Ultra-low latency with <100ms task startup and dynamic scaling, for realtime or batch workloads.

- 96% iteration time
- 50k+ actions/run
- <100ms latency

“We get significant cost efficiency from running [...] AI inference on TPUs. Having the ability to scale dynamically—to go from zero to 500 TPUs across four regions—is unique and highly valuable. We get that from Union.ai, and I don’t know who else could give us that.”


Greg Friedland

Principal ML Engineer, Rezo

Start today and scale with confidence.