Volumes
A Volume is a persistent file system that your task mounts and uses like an
ordinary local directory. Unlike flyte.io.File and flyte.io.Dir,
which pass a snapshot of data between tasks, a Volume is long-lived and
versioned: you write to it during one run, seal it into an immutable version,
and any later task or run can mount that version again — picking up exactly where
you left off.
Every time you seal a Volume you get a new immutable version, and versions share unchanged data, so keeping history is cheap. Forking a Volume is a copy-on-write operation: you get an independent, writable branch without copying the underlying data.
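One way to picture why sealed versions and forks stay cheap is a content-addressed store, where each version is only a small table of paths pointing at shared blobs. The sketch below is a toy model for illustration only: ToyVolume and its methods are hypothetical names, and nothing here describes how Volumes actually store data.

```python
import hashlib


class ToyVolume:
    """Illustrative stand-in for a versioned volume; not the real implementation."""

    def __init__(self, blobs=None, files=None):
        # blobs: content hash -> bytes, shared by every version and fork
        self.blobs = {} if blobs is None else blobs
        # files: path -> content hash, private to this writable branch
        self.files = {} if files is None else dict(files)

    def write(self, path, data: bytes):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # identical content is stored once
        self.files[path] = digest

    def seal(self):
        # A sealed version is a frozen copy of the path table
        return dict(self.files)

    @classmethod
    def fork(cls, sealed, blobs):
        # Forking copies the small path table, never the blobs themselves
        return cls(blobs=blobs, files=sealed)


base = ToyVolume()
base.write("model.bin", b"x" * 1_000_000)
base.write("notes.txt", b"first draft")
v1 = base.seal()

branch = ToyVolume.fork(v1, base.blobs)
branch.write("notes.txt", b"second draft")
v2 = branch.seal()

shared = v1["model.bin"] == v2["model.bin"]  # large file stored once
stored_blobs = len(base.blobs)               # one model blob, two note blobs
```

In this model the fork rewrites only notes.txt, so the two versions differ in one table entry while the large model.bin blob is stored a single time.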
When to use a Volume
Volumes are built for AI and agentic workloads, where work is long-running, stateful, and file-heavy:
- Agent memory and state. Give an agent a durable workspace it builds up across turns, tasks, and sessions — notes, intermediate artifacts, a growing working set of files — and resume exactly where it left off on the next run, instead of starting cold each time.
- Sandboxes and code execution. Back a sandbox or code-execution environment with a Volume so agent- or model-generated code has a real, writable file system to read and write many files in. Fork a clean base per session so concurrent runs stay isolated from each other.
- Model and dataset caching. Pull a model, dataset, or package cache into a Volume once and mount it across runs instead of re-downloading gigabytes each time. This pays off most with reusable containers, where the container — and the cache it has already loaded — stays warm across runs.
- Branching experiments and parallel runs. Fork a base Volume per experiment or per agent run; copy-on-write makes each branch independent and cheap, with full version history to compare or roll back.
More broadly, reach for a Volume whenever you need long-lived, versioned state that carries forward across tasks or runs — anything you’d otherwise rebuild from scratch every time.
If you only need to hand a finished file or folder from one task to the next,
use flyte.io.File or flyte.io.Dir instead; they’re simpler and need no setup.
Setup
Volumes are mounted inside the task pod, so the task environment must do two
things: grant the pod permission to mount a file system, and install the
flyteplugins-union package in the image. The mount permission comes from a pod
template with allow_fuse():

    import flyte
    from flyteplugins.union.io import Volume, ROVolume

    image = (
        flyte.Image.from_debian_base()
        .with_pip_packages("flyteplugins-union")
    )

    env = flyte.TaskEnvironment(
        name="volumes-demo",
        image=image,
        # grant the pod what it needs to mount a Volume
        pod_template=flyte.PodTemplate().allow_fuse(),
        resources=flyte.Resources(cpu="1", memory="2Gi"),
    )

flyte.PodTemplate.allow_fuse() is what makes a Volume mountable: it requests
the FUSE device resource and grants the capability the mount needs, without
running the container as privileged. Your cluster must run a FUSE device
plugin for this — the Union data plane ships an opt-in one. (For clusters
without it, allow_fuse(privileged=True) is a fallback that runs the
container privileged instead.) A task whose environment doesn’t apply
allow_fuse() cannot mount a Volume.
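When a mount fails, a quick preflight check inside the task can help tell a missing device apart from other problems. The sketch below assumes the FUSE device is exposed to the container at /dev/fuse, the conventional Linux path; the text above does not say where it appears, and fuse_device_present is a hypothetical helper, not part of flyteplugins-union.

```python
import os
import stat


def fuse_device_present(dev_path: str = "/dev/fuse") -> bool:
    """Return True if dev_path exists and is a character device."""
    try:
        mode = os.stat(dev_path).st_mode
    except OSError:
        # Path is missing or not accessible from this container
        return False
    return stat.S_ISCHR(mode)


# A path that does not exist is reported as absent
missing = fuse_device_present("/nonexistent/fuse")
```

A True result only shows that the device node is visible; it does not prove that the container also holds the capability the mount needs.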
Get started
The lifecycle is: create → mount → write → return. Returning a writable
volume from a task automatically seals it into an immutable ROVolume;
downstream tasks receive that and mount it read-only.

    from pathlib import Path

    import flyte
    from flyteplugins.union.io import Volume, RWVolume, ROVolume


    @env.task
    async def create_dataset() -> RWVolume:
        vol = Volume.new(name="my-dataset")  # a fresh writable RWVolume
        await vol.mount()  # mounts at /workspace by default
        Path("/workspace/greeting.txt").write_text("hello from a volume\n")
        return vol  # returning it seals the volume automatically


    @env.task
    async def read_dataset(vol: ROVolume) -> str:
        await vol.mount()  # ROVolume always mounts read-only
        return Path("/workspace/greeting.txt").read_text()


    @env.task
    async def main() -> str:
        dataset = await create_dataset()
        return await read_dataset(dataset)

Volume.new() hands you a writable RWVolume. When you return it from a task,
your writes are flushed and the volume is sealed into an immutable ROVolume —
a durable version safe to pass between tasks. The next task receives that
ROVolume and mounts the exact same data.
To attach a description to the sealed version, or to seal partway through a
task instead of at return, call finalize() explicitly — for example
await vol.finalize(message="initial dataset"). It returns the immutable
ROVolume.
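Because a mounted Volume behaves like an ordinary local directory, the file-handling logic can be factored into plain functions that take the mount path as an argument and can be exercised locally against a temporary directory. The sketch below shows that pattern; write_greeting and read_greeting are hypothetical helper names introduced for illustration.

```python
import tempfile
from pathlib import Path


def write_greeting(root: Path) -> Path:
    """Write the example file under root, which may be a mount or a temp dir."""
    target = root / "greeting.txt"
    target.write_text("hello from a volume\n")
    return target


def read_greeting(root: Path) -> str:
    """Read the example file back from the same root."""
    return (root / "greeting.txt").read_text()


# Exercise the logic locally, with a temporary directory standing in for /workspace
with tempfile.TemporaryDirectory() as tmp:
    write_greeting(Path(tmp))
    content = read_greeting(Path(tmp))
```

Inside a task, the same helpers would be called with Path("/workspace") after await vol.mount(), so the mount-specific code stays small and the file logic stays testable.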
Updating an existing Volume
An ROVolume is immutable, so to change one you fork it into a new writable
branch, edit, and seal again:

    @env.task
    async def add_file(vol: ROVolume) -> ROVolume:
        rw = await vol.fork(name="my-dataset-v2")  # writable copy-on-write branch
        await rw.mount()
        Path("/workspace/extra.txt").write_text("added in a later run\n")
        return await rw.finalize(message="add extra.txt")

The fork shares all unchanged data with its parent, so this is fast and cheap
even for very large Volumes. The original ROVolume is untouched, giving you a
clean version history.
Going further
Checkpoint while you work
Use commit() to snapshot the current state without unmounting — ideal for
long-running loops like training, where you want a durable checkpoint every few
steps but intend to keep writing:

    @env.task
    async def train(base: ROVolume) -> ROVolume:
        rw = await base.fork(name="training-run")
        await rw.mount()
        checkpoints = []
        for epoch in range(100):
            train_one_epoch()  # placeholder for your training step; writes to /workspace
            if epoch % 10 == 0:
                snap = await rw.commit(message=f"epoch {epoch}")
                checkpoints.append(snap)  # each is an immutable ROVolume
        return await rw.finalize(message="training complete")

Each commit() returns an immutable ROVolume you can keep, branch from, or
hand to another task, while the mount stays live and writable.
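It can help to check which epochs a given commit condition actually selects before a long run. The sketch below lists them for the condition used in the example (epoch % 10 == 0) and for a common alternative that commits after each full block of epochs; commit_epochs is a hypothetical helper for illustration and calls nothing in the Volume API.

```python
def commit_epochs(total: int, every: int, after_block: bool = False) -> list[int]:
    """Epoch indices at which the training loop would call commit()."""
    if after_block:
        # Commit once `every` epochs have completed: every-1, 2*every-1, ...
        return [e for e in range(total) if (e + 1) % every == 0]
    # The condition used in the example: epoch % every == 0
    return [e for e in range(total) if e % every == 0]


as_written = commit_epochs(100, 10)                          # 0, 10, ..., 90
after_each_block = commit_epochs(100, 10, after_block=True)  # 9, 19, ..., 99
```

With the condition as written, the first commit happens after a single epoch, and the work from epochs 91 through 99 is captured only by the final finalize() call.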
High-throughput mode
The default configuration suits most workloads. For workloads that create or
update very large numbers of files — package installs, build trees, code
generation — switch on high-throughput mode by preparing the image with
flyteplugins.union.io.with_high_throughput_volume_deps:

    import flyte
    from flyteplugins.union.io import with_high_throughput_volume_deps

    image = with_high_throughput_volume_deps(
        flyte.Image.from_debian_base().with_pip_packages("flyteplugins-union")
    )

    env = flyte.TaskEnvironment(
        name="high-throughput-volumes",
        image=image,
        pod_template=flyte.PodTemplate().allow_fuse(),
    )

Volumes created in this environment automatically use the faster metadata path; no change to your task code is required.
Tuning the mount
mount() accepts options to match the I/O profile of your workload:

    await vol.mount(
        mount_path="/data",     # where to mount (default: /workspace)
        max_uploads=100,        # more concurrent uploads for write-heavy bursts
        upload_delay="30m",     # defer uploads; skip files deleted before then (scratch space)
        attr_cache=120.0,       # cache file metadata longer (default 60s)
        entry_cache=120.0,      # cache name lookups longer
        dir_entry_cache=120.0,  # cache directory listings longer
    )

| Option | Default | Use it to… |
|---|---|---|
| mount_path | /workspace | Mount somewhere other than the default. |
| max_uploads | 50 | Raise upload concurrency for write-heavy bursts. |
| upload_delay | None | Defer uploads (e.g. "1h"); useful for scratch files that are deleted before the delay elapses, so they’re never uploaded. |
| attr_cache / entry_cache / dir_entry_cache | 60.0 | Cache metadata, name lookups, and directory listings longer to collapse repeated stat/listing calls. |

The larger caching values above are safe because a Volume has a single writer while it is mounted. Raise them when a tool repeatedly stats or lists the same paths (common with package managers and build systems).
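To see why a longer cache lifetime collapses repeated stat calls, consider a generic time-to-live cache in front of os.stat. This is a toy model of the general idea only: TTLStatCache is a hypothetical class, and it is not how the mount implements attr_cache, entry_cache, or dir_entry_cache.

```python
import os
import time


class TTLStatCache:
    """Toy time-to-live cache around os.stat; illustrative only."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.entries = {}  # path -> (expiry time, stat result)
        self.misses = 0    # number of real os.stat calls made

    def stat(self, path):
        now = self.clock()
        hit = self.entries.get(path)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no real lookup
        self.misses += 1
        result = os.stat(path)
        self.entries[path] = (now + self.ttl, result)
        return result


# A fake clock makes the expiry behaviour deterministic
fake_now = [0.0]
cache = TTLStatCache(ttl=120.0, clock=lambda: fake_now[0])
for _ in range(1000):
    cache.stat(".")
lookups_within_ttl = cache.misses    # only the first call reached os.stat

fake_now[0] = 121.0                  # move past the 120-second lifetime
cache.stat(".")
lookups_after_expiry = cache.misses  # the entry expired, so one more real call
```

A thousand stat calls inside the lifetime cost one real lookup. With a single writer there is no other party changing the files, so serving the cached answer does not risk returning stale metadata.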
Reference
- API: Volume, RWVolume, ROVolume, and flyteplugins.union.io.with_high_throughput_volume_deps.
- Related: Files and directories for passing snapshot data between tasks.