Security model
An on-device sandbox shares a container with the code that launched it. That’s fine for development on a laptop, CI or sanity-checking your install, but it doesn’t isolate the sandboxed process from your task’s own code, secrets and cloud credentials. This page is about production posture, which means a remote sandbox. The “blast radius” section below justifies why; the rest of the page assumes you’re picking knobs on a remote SandboxEnvironment.
A production sandbox is built from two independent layers:
- The isolation backend running inside the sandbox pod, which constrains the sandboxed process (filesystem, syscalls, capabilities, network namespace).
- The pod runtime, which determines whether the pod’s syscalls hit the host kernel directly or go through a user-space kernel such as gVisor.
These are independent. A sandbox pod can run userns inside a gVisor pod, or bubblewrap inside a vanilla container pod. Pick each layer for what it actually defends against.
Isolation backends
The library reports the backend on each process as proc.backend. On a remote sandbox, set it with SandboxEnvironment(sandbox_mode=...).
| Backend | How it works | Default? |
|---|---|---|
| userns | unshare(2) + prctl(NO_NEW_PRIVS) + capset + setrlimit, plus Landlock and a seccomp BPF deny-list | Remote default |
| bubblewrap | bwrap(1) with --unshare-all --die-with-parent --cap-drop ALL, plus a Landlock ruleset as a kernel-side backstop | On-device default (needs CAP_SYS_ADMIN + unconfined AppArmor); opt-in for remote (DEFAULT_SANDBOX_ENV_BWRAP or sandbox_mode="bwrap") |
| sandbox-exec | macOS wrapper around Apple’s sandbox-exec; restricts writes to the work dir and can deny outbound sockets | macOS on-device only |
| none | setpgid + best-effort setrlimit; logs a warning | Dev only (no isolation) |
Both userns and bubblewrap layer namespaces, dropped capabilities, a Landlock filesystem ruleset, and a seccomp BPF deny-list. They are not equally strong, though: bubblewrap is the stronger backend. With CAP_SYS_ADMIN it pivots into a fresh mount root, which closes the gap where the sandbox shares the pod’s root filesystem. userns is the lite variant: it runs in a vanilla pod with no extra capabilities, but it leaves that shared-rootfs gap open, so its mount isolation is weaker.
The remote default is userns because it runs anywhere (bwrap needs CAP_SYS_ADMIN + unconfined AppArmor and isn’t always present in minimal images). When you can grant the pod those capabilities and want the strongest in-pod isolation, choose bubblewrap.
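For contrast with the isolating backends, it helps to see how little the none row in the table amounts to. The sketch below is a standard-library illustration of "setpgid + best-effort setrlimit + a warning", not the library's implementation; the function names, default limits, and logger name are hypothetical.

```python
import logging
import os
import resource
import subprocess
import sys

log = logging.getLogger("sandbox-sketch")  # hypothetical logger name


def _limit_child(cpu_seconds: int, max_bytes: int) -> None:
    """Runs in the child between fork and exec."""
    os.setpgid(0, 0)  # own process group, so the whole group can be signalled
    for which, value in (
        (resource.RLIMIT_CPU, cpu_seconds),
        (resource.RLIMIT_AS, max_bytes),
    ):
        try:
            resource.setrlimit(which, (value, value))
        except (ValueError, OSError):
            pass  # best effort: carry on if this limit cannot be applied


def run_unisolated(argv, cpu_seconds=5, max_bytes=1 << 30, timeout=10.0):
    """Run argv with rlimits only: no namespaces, no Landlock, no seccomp."""
    log.warning("running %s with no isolation (process group + rlimits only)", argv[0])
    return subprocess.run(
        argv,
        # preexec_fn is not thread-safe; acceptable for a single-threaded sketch.
        preexec_fn=lambda: _limit_child(cpu_seconds, max_bytes),
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    result = run_unisolated([sys.executable, "-c", "print('hello')"])
    print(result.stdout, end="")
```

The child can still read every file and environment variable the parent can, which is why that row is marked dev only.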
Pod security for the bubblewrap backend
bubblewrap runs as a non-root user via unprivileged user namespaces. Even so, the containerd default seccomp profile permits the mount, pivot_root, setns, and unshare syscalls that bwrap needs only when the container’s capability set includes CAP_SYS_ADMIN, and the container’s AppArmor profile must be set to unconfined (rather than the runtime default) so those calls aren’t blocked.
flyte.PodTemplate().allow_nested_sandboxing() grants exactly that: CAP_SYS_ADMIN plus unconfined AppArmor, with allowPrivilegeEscalation: false. How you apply it depends on the transport:
- On-device: put it on the task that opens the session, since the sandbox child runs in that pod.

  ```python
  bwrap_env = flyte.TaskEnvironment(
      name="sandboxed-task",
      image=sb.base_sandbox_image,
      pod_template=flyte.PodTemplate().allow_nested_sandboxing(),
  )


  @bwrap_env.task
  async def main() -> str:
      async with sb.on_device.session(backend="bubblewrap") as sbx:
          ...
  ```

- Remote: the SandboxEnvironment derives the pod template from sandbox_mode / sys_cap_admin for you, and sandbox_mode="bwrap" carries the grant automatically. See Deployment.
The userns backend needs none of this because it runs in a vanilla pod. Choose userns when you can’t (or don’t want to) grant the pod extra capabilities; choose bubblewrap when you can, for its stronger isolation — at the cost of the CAP_SYS_ADMIN + AppArmor grant above.
Blast radius: why remote
The backend constrains the sandboxed process. What an escaping process can reach is determined by where the sandbox runs.
An on-device sandbox runs inside the same container as the code that launched it. If the backend is breached, the escaping process can reach your task’s own code, mounted secrets, and service-account or cloud credentials. The pod boundary is the only thing still containing it; unless the task pod itself runs under gVisor, that boundary is the host kernel.
This matters only when the sandboxed code is untrusted: for trusted code (your own prompts and tools, not exposed to end users) on-device is a perfectly good production choice. When the code is untrusted, prefer a remote sandbox so an escape lands in a throwaway pod, not your workload.
A remote sandbox runs in its own pod with:
- a typically minimal image (no caller code, no toolchain)
- its own service account (no task secrets, no cloud credentials)
- no access to whatever the caller mounted
The escape blast radius is the sandbox pod, not your workload. Hardening that pod with gVisor (below) further reduces what an escape can do to the host kernel.
Pod runtime: gVisor
Independent from the in-pod backend:
- sandbox_mode (userns or bwrap) selects the in-pod backend constraining the sandboxed process.
- runtime (container or gvisor) selects how the pod itself is run.

Setting runtime="gvisor" puts runtimeClassName: gvisor on the sandbox pod, so its syscalls go through the gVisor application kernel rather than hitting the host kernel directly. Recommended whenever the sandboxed code is untrusted or the workload is multi-tenant.
```python
hardened = sb.SandboxEnvironment(
    name="hardened-sandbox",
    sandbox_mode="userns",
    runtime="gvisor",
)
```

runtime="gvisor" requires the gvisor RuntimeClass to be installed and enabled in your cluster. In most cases, talk to your Union solutions engineer to enable it.
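For orientation, the sketch below writes out the two Kubernetes objects involved as plain Python dicts: a RuntimeClass named gvisor and a pod spec that references it. The handler name runsc is gVisor's usual containerd handler but is an assumption about your cluster, the pod name and image are hypothetical placeholders, and the exact manifest SandboxEnvironment generates is not shown on this page.

```python
# Cluster-scoped object that must exist before a pod can ask for the runtime.
runtime_class = {
    "apiVersion": "node.k8s.io/v1",
    "kind": "RuntimeClass",
    "metadata": {"name": "gvisor"},
    "handler": "runsc",  # assumption: nodes have containerd's runsc handler configured
}

# Roughly what "runtimeClassName: gvisor on the sandbox pod" means.
sandbox_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "hardened-sandbox"},  # hypothetical name
    "spec": {
        "runtimeClassName": "gvisor",
        "containers": [
            # hypothetical placeholder image
            {"name": "sandbox", "image": "example.invalid/sandbox:latest"},
        ],
    },
}


def uses_runtime_class(pod: dict, rc: dict) -> bool:
    """True if the pod spec references the given RuntimeClass by name."""
    return pod.get("spec", {}).get("runtimeClassName") == rc["metadata"]["name"]
```

A check like uses_runtime_class(sandbox_pod, runtime_class) could run in a deployment test to catch a pod that silently fell back to the default runtime.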
Choosing a posture
Backend choice does affect isolation strength: bubblewrap is stronger than userns-lite (it closes the shared-rootfs gap, as above), so prefer bubblewrap when the pod can carry CAP_SYS_ADMIN + AppArmor and userns when it can’t. But both are solid process-level isolation that’s fine for production in a normal container pod. You don’t need gVisor to run a sandbox responsibly. The bigger lever as trust drops is the pod runtime (container vs gvisor) and tenant isolation (shared SandboxEnvironment vs one per tenant).
| Trust level | Pod runtime | Tenant isolation | Per-call notes |
|---|---|---|---|
| Trusted (your own code/prompts, not exposed to end users) | container | Shared env is fine | Default network_mode="blocked"; allow-list when you need it. Process isolation is sufficient. |
| Semi-trusted (vetted third-party libraries, your own ML code) | container | Shared env is fine | Default network_mode="blocked"; allow-list when you need it. |
| Untrusted (LLM-generated from end-user input, user-submitted) | gvisor (recommended) | Shared env is fine | Stage inputs via put_bytes; keep network_mode="blocked" unless a step needs egress. |
| Multi-tenant, hostile inputs assumed | gvisor | One SandboxEnvironment per tenant; no cross-tenant pod reuse | network_mode="blocked" on every run(); the proxy allow-list is not adversarial-safe (see Networking). |
The principle: let the workload pick the floor, let the threat model pick the ceiling. The default backend is the floor and is appropriate for the common case. Reach for gVisor when you’re actually running hostile code or sharing the system across tenants — not as a blanket requirement.
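The table can also be encoded as a small lookup, which is handy if the posture is chosen in code rather than by hand. This is a minimal sketch under stated assumptions: Trust, Posture, choose_posture, and env_name are hypothetical names, and the per-tenant naming scheme is an illustration, not something the library prescribes. The returned values are meant to be passed on to SandboxEnvironment(runtime=...) and the per-call network_mode.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Trust(Enum):
    TRUSTED = "trusted"
    SEMI_TRUSTED = "semi-trusted"
    UNTRUSTED = "untrusted"
    MULTI_TENANT = "multi-tenant"


@dataclass(frozen=True)
class Posture:
    runtime: str          # "container" or "gvisor"
    env_per_tenant: bool  # one SandboxEnvironment per tenant?
    network_mode: str     # per-call default from the table


def choose_posture(trust: Trust) -> Posture:
    """Map a trust level to the pod runtime and tenant isolation in the table."""
    hostile = trust in (Trust.UNTRUSTED, Trust.MULTI_TENANT)
    return Posture(
        runtime="gvisor" if hostile else "container",
        env_per_tenant=trust is Trust.MULTI_TENANT,
        network_mode="blocked",
    )


def env_name(base: str, trust: Trust, tenant_id: Optional[str] = None) -> str:
    """Pick an environment name; the per-tenant suffix is a hypothetical scheme."""
    if choose_posture(trust).env_per_tenant:
        if not tenant_id:
            raise ValueError("multi-tenant posture needs a tenant_id")
        return f"{base}-{tenant_id}"
    return base
```

Keeping the mapping in one function means a change in threat model touches one place rather than every call site.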
What’s not in scope
- Side-channel attacks (timing, cache, Spectre-class) are not addressed by any backend here. If you need defense against them, you need hardware partitioning, not a sandbox.
- Resource exhaustion is bounded by Resources(...) on the sandbox pod and per-call timeout_s on run(). The backends do not prevent a sandboxed process from using all the CPU and memory the pod gives it.
- The proxy-based network allow-list is not a kernel-level firewall. See Networking for what it does and does not protect against.
Related
- Networking. Per-call network_mode and what the allow-list actually constrains.
- Filesystem. Default filesystem allow-list and how to extend it.
- Deployment. SandboxEnvironment, custom images, and per-launch overrides.