0.4.0

ROVolume

Package: flyteplugins.union.io

Immutable, versioned volume — PRD §Core Concepts.

Always mounts read-only. Obtain one by:

  • calling RWVolume.commit() on a writable working copy;
  • reading run.outputs.<name> after a prior task returned an RWVolume (auto-committed at task return);
  • passing an already-resolved Volume value through one of the Volume.empty / Volume.new flows once it’s been committed.

ROVolume defines no commit method of its own, and that is intentional; the commit() listed under Methods is the deprecated base-class wrapper. The only path from RO back to a writable working copy is fork().
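The RO/RW split described above can be pictured with a toy model. This is a minimal sketch only: ROSketch, RWSketch, and the integer version field are hypothetical stand-ins invented for illustration, not the real ROVolume / RWVolume classes or their fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ROSketch:
    """Toy stand-in for an immutable, committed volume (hypothetical)."""

    name: str
    version: int

    def fork(self, name: str) -> "RWSketch":
        # The only way back to a writable copy: branch under a new name.
        return RWSketch(name=name, parent=self)


@dataclass
class RWSketch:
    """Toy stand-in for a writable working copy (hypothetical)."""

    name: str
    parent: Optional[ROSketch] = None

    def commit(self) -> ROSketch:
        # Publishing yields an immutable value one version past the parent.
        next_version = 1 if self.parent is None else self.parent.version + 1
        return ROSketch(name=self.name, version=next_version)


v1 = RWSketch(name="dataset").commit()   # writable -> immutable
working = v1.fork(name="dataset-edit")   # immutable -> writable
v2 = working.commit()                    # writable -> immutable again
```

Because the immutable type exposes fork() but no commit(), a type checker can reject code that tries to publish from a read-only value.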

Parameters

class ROVolume(
    kind: typing.Literal['flyte.volume/v1'],
    name: str,
    bucket: str,
    storage: typing.Literal['s3', 'gs', 'wasb'],
    region: typing.Optional[str],
    index: typing.Optional[flyte.io._file.File],
    metadata_store_type: typing.Optional[str],
    used_bytes: typing.Optional[int],
    inode_count: typing.Optional[int],
    index_bytes: typing.Optional[int],
    message: typing.Optional[str],
    produced_by: typing.Optional[flyteplugins.union.io._base_volume.ActionRef],
    parent_produced_by: typing.Optional[flyteplugins.union.io._base_volume.ActionRef],
)

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameter Type Description
kind typing.Literal['flyte.volume/v1']
name str
bucket str
storage typing.Literal['s3', 'gs', 'wasb']
region typing.Optional[str]
index typing.Optional[flyte.io._file.File]
metadata_store_type typing.Optional[str]
used_bytes typing.Optional[int]
inode_count typing.Optional[int]
index_bytes typing.Optional[int]
message typing.Optional[str]
produced_by typing.Optional[flyteplugins.union.io._base_volume.ActionRef]
parent_produced_by typing.Optional[flyteplugins.union.io._base_volume.ActionRef]

Methods

Method Description
commit() Deprecated. Drain + unmount + publish, returning a new Volume.
empty() Declare a brand-new volume.
fork() Branch this immutable into a new writable working copy.
migrate_metadata_store_type() Re-host this Volume’s metadata on new_metadata_store_type.
model_post_init() This function is meant to behave like a BaseModel method to initialize private attributes.
mount() Mount this volume read-only at mount_path.
new() PRD §Lifecycle: create a fresh empty RWVolume.

commit()

def commit(
    mount_path: str,
    meta_dir: str,
    timeout: float,
    message: Optional[str],
) -> 'Volume'

Deprecated. Drain + unmount + publish, returning a new Volume.

Prefer the typed lifecycle: create an RWVolume (Volume.new() / ROVolume.fork()), use RWVolume.commit() for a keep-alive snapshot, and let the type transformer call RWVolume.finalize() automatically when you return an RWVolume from a task. This base-class method is retained as a thin wrapper so existing Volume callers keep working; it emits a DeprecationWarning.
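The "thin wrapper that emits DeprecationWarning" pattern can be sketched with the standard warnings module. The class names and the dictionary return value below are hypothetical; only warnings.warn and warnings.catch_warnings are real standard-library APIs, and the actual wrapper in the package may differ.

```python
import warnings


class WorkingCopySketch:
    """Hypothetical stand-in for the typed, preferred code path."""

    def commit(self, message=None):
        return {"committed": True, "message": message}


class LegacyVolumeSketch:
    """Hypothetical stand-in for the base class that keeps old callers working."""

    def commit(self, message=None):
        # Warn, then delegate unchanged to the preferred implementation.
        warnings.warn(
            "commit() on the base class is deprecated; use the typed lifecycle",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return WorkingCopySketch().commit(message=message)


# Record warnings so the deprecation is visible even under default filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = LegacyVolumeSketch().commit(message="snapshot")
```

Running a test suite with `python -W error::DeprecationWarning` turns such warnings into exceptions, which is one way to find remaining callers of a deprecated method.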

Parameter Type Description
mount_path str
meta_dir str
timeout float
message Optional[str]

empty()

def empty(
    name: str,
    bucket: Optional[str],
    storage: Optional[StorageBackend],
    region: Optional[str],
    metadata_store_type: Optional[str],
) -> 'Volume'

Declare a brand-new volume. The first mount() call will bootstrap the namespace (the underlying client refuses to format over a non-empty bucket prefix).

If bucket is omitted, it is derived from the currently active task context as {raw_data_root}/{project}/{domain}/volumes — following Flyte’s own layout for offloaded data. Must be called from inside a task in that case.
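As an illustration of the default layout above, the derived location can be written as a plain string template. The helper name and the example values are hypothetical, and stripping a trailing slash from the root is an assumption rather than documented behavior.

```python
def default_volume_bucket(raw_data_root: str, project: str, domain: str) -> str:
    """Build {raw_data_root}/{project}/{domain}/volumes (hypothetical helper)."""
    # Assumption: a trailing slash on the root should not produce "//".
    root = raw_data_root.rstrip("/")
    return f"{root}/{project}/{domain}/volumes"


# Example values are made up for illustration.
example = default_volume_bucket("s3://my-raw-data/", "flytesnacks", "development")
```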

metadata_store_type controls the in-pod metadata backend. When omitted it resolves from $UNION_VOLUME_METADATA_STORE and otherwise defaults to "sqlite".

  • "sqlite" (default) keeps the namespace in a local SQLite file. It runs in-process with no extra package in the task image, and supports fork().
  • "redis" runs an in-process redis-server and persists the namespace as an RDB snapshot. It is faster than the embedded stores for metadata-heavy workloads, but requires redis-server in the image (see with_high_throughput_volume_deps(), which also sets $UNION_VOLUME_METADATA_STORE=redis so it is the default there).
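The resolution order for metadata_store_type (explicit argument, then $UNION_VOLUME_METADATA_STORE, then "sqlite") can be sketched as follows. The function name is hypothetical, and treating an empty environment variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_metadata_store_type(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper mirroring the documented fallback chain."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit  # 1. the caller's choice wins
    # 2. environment variable, 3. documented default.
    # Assumption: an empty variable counts as "not set".
    return env.get("UNION_VOLUME_METADATA_STORE") or "sqlite"
```

Passing env explicitly keeps the helper easy to test without mutating the process environment.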

The choice is baked into the Volume and travels with it through lineage; subsequent mounts of the same Volume must use the same store type (use migrate_metadata_store_type() to change it).

storage is the JuiceFS object-store backend. When omitted it is inferred from the bucket URI scheme (s3:// → s3, gs:// → gs, abfs(s):// → wasb), so a GCS/Azure bucket, including the default one derived from raw_data_path, gets the right backend without the caller spelling it out.
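The scheme-to-backend inference can be sketched with urllib.parse. The function name and the choice to raise ValueError for an unrecognized scheme are assumptions; the mapping itself is the one listed above.

```python
from urllib.parse import urlparse

# Mapping taken from the description: s3 -> s3, gs -> gs, abfs/abfss -> wasb.
_SCHEME_TO_STORAGE = {"s3": "s3", "gs": "gs", "abfs": "wasb", "abfss": "wasb"}


def infer_storage(bucket: str) -> str:
    """Hypothetical helper: pick the storage backend from the URI scheme."""
    scheme = urlparse(bucket).scheme.lower()
    try:
        return _SCHEME_TO_STORAGE[scheme]
    except KeyError:
        # Assumption: unknown schemes are an error rather than a silent default.
        raise ValueError(
            f"cannot infer storage backend from bucket {bucket!r}"
        ) from None
```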

region pins the object-store region onto the Volume (S3 only — it forms the endpoint host). When omitted it’s derived from the ambient AWS_REGION / AWS_DEFAULT_REGION at mount time; pass it to make the Volume self-describing so a cross-region remount doesn’t depend on the consumer’s env.
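A minimal sketch of the region fallback, assuming the explicit argument wins and that AWS_REGION is consulted before AWS_DEFAULT_REGION (the source lists both variables but does not state their precedence). The function name is hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_region(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: pinned region first, then the ambient AWS variables."""
    env = os.environ if env is None else env
    # Assumption: AWS_REGION takes precedence over AWS_DEFAULT_REGION.
    return explicit or env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION") or None
```

Pinning the region at creation time means the first branch always applies, so a consumer in a different region does not fall through to its own environment.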

Parameter Type Description
name str
bucket Optional[str]
storage Optional[StorageBackend]
region Optional[str]
metadata_store_type Optional[str]

fork()

def fork(
    name: str,
    meta_dir: str,
    timeout: float,
) -> 'RWVolume'

Branch this immutable into a new writable working copy.

PRD §Lifecycle: ROVolume.fork() → RWVolume. The new RWVolume shares chunk objects with self (copy-on-write) but is allocator-disjoint so both can write without colliding on shared keys.

Parameter Type Description
name str
meta_dir str
timeout float

migrate_metadata_store_type()

def migrate_metadata_store_type(
    new_metadata_store_type: str,
    meta_dir: str,
    new_meta_dir: Optional[str],
) -> 'Volume'

Re-host this Volume’s metadata on new_metadata_store_type without copying data chunks.

new_metadata_store_type must differ from the current store type. The full namespace is exported and re-imported into a fresh meta store pointing at the same bucket. Chunks are addressed by stable IDs that are preserved across the migration, so no chunk traffic is required.

The returned Volume has a fresh index (snapshot of the loaded meta store), parent_produced_by linking to the pre-migration version, the same bucket / storage / name, and the new metadata_store_type.

Intent is migration, not fork: the old store is meant to be retired. As a defense against accidental concurrent use, the loaded store’s chunk-slice / inode / session counters are advanced by a random offset (same mechanism as fork()) so that even if the old store is still mounted somewhere, its writes can’t collide with the migrated store’s writes in shared object-store keys.
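The counter-offset defense can be illustrated with a toy allocator state. Everything here is an assumption made for the sketch: the counter names, the offset bounds, and the use of an independent offset per counter are not taken from the implementation.

```python
import random
from typing import Dict, Optional

# Hypothetical bounds for the random jump; the real values are not documented.
MIN_OFFSET = 1 << 20
MAX_OFFSET = 1 << 30


def advance_counters(
    counters: Dict[str, int],
    rng: Optional[random.Random] = None,
) -> Dict[str, int]:
    """Return a copy of the counters, each moved forward by a random offset."""
    if rng is None:
        rng = random.Random()
    return {
        name: value + rng.randint(MIN_OFFSET, MAX_OFFSET)
        for name, value in counters.items()
    }


old_store = {"chunk_slice": 1200, "inode": 85, "session": 3}
new_store = advance_counters(old_store, rng=random.Random(0))
```

In this sketch, with counters that only grow, the old store would have to allocate at least MIN_OFFSET further IDs before it reached the first ID handed out by the new store.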

Does not require a FUSE mount on either side. Safe to call from any task pod running the Volume runtime (Redis tooling is only needed if one of the stores is "redis").

Parameter Type Description
new_metadata_store_type str
meta_dir str
new_meta_dir Optional[str]

model_post_init()

def model_post_init(
    context: Any,
)

This function is meant to behave like a BaseModel method to initialize private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameter Type Description
context Any The context.

mount()

def mount(
    mount_path: str,
    meta_dir: str,
    cache_dir: str,
    timeout: float,
    attr_cache: float,
    entry_cache: float,
    dir_entry_cache: float,
)

Mount this volume read-only at mount_path.

read_only is intentionally absent from the signature: an ROVolume is statically un-writable, so the writeback / upload-delay knobs that only make sense for write paths are omitted too.

Parameter Type Description
mount_path str
meta_dir str
cache_dir str
timeout float
attr_cache float
entry_cache float
dir_entry_cache float

new()

def new(
    name: Optional[str],
    bucket: Optional[str],
    storage: Optional[StorageBackend],
    region: Optional[str],
    metadata_store_type: Optional[str],
) -> 'RWVolume'

PRD §Lifecycle: create a fresh empty RWVolume.

Equivalent in mechanics to empty(), but:

  • name is optional (auto-generated when omitted, matching the PRD’s flyte.Volume.new(name=None) signature).
  • Returns the strictly-typed RWVolume rather than the generic Volume, so mypy / pyright can enforce a task’s RO/RW contract at the signature boundary.

Prefer new() in new code; empty() is retained for existing callers that already declare -> Volume.

Parameter Type Description
name Optional[str]
bucket Optional[str]
storage Optional[StorageBackend]
region Optional[str]
metadata_store_type Optional[str]