Structuring Flyte projects with uv

The two layers

Every Flyte + uv project involves two distinct layers. Understanding this distinction is the foundation for every decision that follows.

The image (slow-changing): the Python environment, installed packages, system dependencies, the interpreter. The SDK computes an MD5 hash of the image’s layer stack and only rebuilds when a layer actually changes.

The code bundle (fast-changing): your task source code, packaged as a tarball and uploaded on every run. The container downloads and unpacks it at startup.

+-----------------------------------+
|       Docker Image (slow path)    |
|  Python interpreter               |
|  Installed packages (uv sync)     |  <- rebuilt only when deps change
|  System packages (apt)            |
|  Content-hashed, registry-cached  |
+-----------------------------------+
|       Code Bundle (fast path)     |
|  Your task source files           |  <- uploaded on every run
|  Local library code               |
|  Tarball extracted at startup     |
+-----------------------------------+

Keep these two layers separate. Your image definition should describe only the environment. Source code travels in the code bundle (in fast-deploy mode) or gets baked in at deploy time (in full-build mode).

Violating this principle, for example by copying task source into the image with with_source_folder() or by using install_project mode for local code, makes the image hash change on every code edit. Each iteration then pays for a full Docker build and push.

How the image gets built

flyte.Image is a frozen, content-addressed layer stack. Each .with_*() call appends an immutable layer. The final image tag is an MD5 hash of all layers.
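
To make the content-addressing idea concrete, here is a minimal toy model, not Flyte's implementation. The ToyImage and Layer classes and their string-based layer specs are hypothetical stand-ins; the only point is that identical layer stacks hash to the same tag and any differing layer produces a different one.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical stand-in for one image layer (not a Flyte class)."""
    kind: str
    spec: str


@dataclass(frozen=True)
class ToyImage:
    """Immutable layer stack: with_layer() returns a new object each time."""
    layers: tuple = ()

    def with_layer(self, kind: str, spec: str) -> "ToyImage":
        return ToyImage(self.layers + (Layer(kind, spec),))

    @property
    def tag(self) -> str:
        # Hash every layer in order, so the tag depends on the whole stack.
        digest = hashlib.md5()
        for layer in self.layers:
            digest.update(f"{layer.kind}={layer.spec}\n".encode())
        return digest.hexdigest()


base = ToyImage().with_layer("base", "debian")
etl = base.with_layer("uv_project", "pyproject.toml --only-group etl")
ml = base.with_layer("uv_project", "pyproject.toml --only-group ml")
etl_again = base.with_layer("uv_project", "pyproject.toml --only-group etl")

print(etl.tag == etl_again.tag)  # True: same layers, same tag
print(etl.tag == ml.tag)         # False: one layer differs
print(len(base.layers))          # 1: appending never mutated the base
```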

The primary method for uv projects is .with_uv_project():

import flyte
from pathlib import Path

image = flyte.Image.from_debian_base().with_uv_project(
    pyproject_file=Path("my_app/pyproject.toml"),
)

Two installation modes:

  • dependencies_only (default): Only pyproject.toml and uv.lock are included in the build context. The image hash covers only these two files. Your code does not affect the image hash.

  • install_project: The entire project directory is copied into the build context. Any code change triggers a full image rebuild. Use this only when you need the project installed as a proper package (e.g., when you need package entry points or compiled extension modules).
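
The difference between the two modes can be sketched with the standard library alone. This is an illustration under assumptions: context_digest is a hypothetical helper that hashes whichever files a mode would place in the build context, and the real SDK may hash its inputs differently. It shows why a source edit leaves the digest unchanged in dependencies_only mode but changes it in install_project mode.

```python
import hashlib
import tempfile
from pathlib import Path


def context_digest(project_dir: Path, mode: str) -> str:
    """Hash the files that the given mode would place in the build context."""
    if mode == "dependencies_only":
        files = [project_dir / "pyproject.toml", project_dir / "uv.lock"]
    elif mode == "install_project":
        files = sorted(p for p in project_dir.rglob("*") if p.is_file())
    else:
        raise ValueError(f"unknown mode: {mode}")
    digest = hashlib.md5()
    for path in files:
        digest.update(path.relative_to(project_dir).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def digests_around_code_edit(mode: str) -> tuple:
    """Return the digest before and after editing a source file only."""
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp)
        (project / "pyproject.toml").write_text('[project]\nname = "demo"\n')
        (project / "uv.lock").write_text("version = 1\n")
        task_file = project / "tasks.py"
        task_file.write_text("x = 1\n")
        before = context_digest(project, mode)
        task_file.write_text("x = 2\n")  # edit source, leave deps alone
        after = context_digest(project, mode)
    return before, after


deps_before, deps_after = digests_around_code_edit("dependencies_only")
full_before, full_after = digests_around_code_edit("install_project")
print(deps_before == deps_after)  # True: code edit does not affect the digest
print(full_before == full_after)  # False: code edit changes the digest
```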

with_code_bundle() — one image for dev and prod

with_code_bundle() is how you write an image definition that works for both development and production without changing any code.

from pathlib import Path

import flyte

image = (
    flyte.Image.from_debian_base()
    .with_uv_project(pyproject_file=Path("pyproject.toml"))
    .with_code_bundle()
)

Its behavior depends on copy_style at run time:

  • Fast deploy (default): with_code_bundle() is a no-op. Source travels as a tarball. The image only rebuilds when pyproject.toml or uv.lock changes.
  • Full build (copy_style="none"): with_code_bundle() resolves to a COPY instruction. Source is baked into the image. This is your production path.

# Development
flyte.run(my_task)

# Production
flyte.deploy(my_env, copy_style="none", version="1.2.3")
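
The two behaviors can be summarized as a small decision function. This is a toy model of the rule described above, not SDK code: plan_source_delivery and its return strings are hypothetical, and treating every value other than "none" as fast deploy is an assumption made for the sketch.

```python
from typing import Optional


def plan_source_delivery(copy_style: Optional[str] = None) -> dict:
    """Toy model: how does task source reach the container?

    None stands for "copy_style not passed", i.e. the default fast-deploy path.
    """
    full_build = copy_style == "none"
    return {
        "with_code_bundle": "COPY source into image" if full_build else "no-op",
        "source_delivery": "baked into image" if full_build else "tarball at startup",
    }


dev = plan_source_delivery()         # mirrors flyte.run(my_task)
prod = plan_source_delivery("none")  # mirrors flyte.deploy(..., copy_style="none")

print(dev["with_code_bundle"])   # no-op
print(prod["with_code_bundle"])  # COPY source into image
```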

root_dir

root_dir tells Flyte where to look when building the code bundle and what path prefix to strip when packaging.

The rule: set root_dir to the directory you would cd into before running python -c "import my_module".

For src-layout projects, set root_dir to src/:

flyte.init_from_config(root_dir=Path(__file__).parent.parent)  # -> src/

For flat layout projects, set root_dir to the project root:

flyte.init_from_config(root_dir=Path(__file__).parent)  # -> my_project/
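
A rough way to see why root_dir matters is to build a tarball by hand and look at the entry names. The sketch below uses only the standard library; bundle_names is a hypothetical helper, and the real bundler's file selection and naming may differ. The point is that names relative to src/ start at the package, while names relative to the project root carry a src/ prefix that breaks import my_module after unpacking.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def bundle_names(root_dir: Path) -> list:
    """Tar every .py file under root_dir, naming entries relative to root_dir."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for path in sorted(root_dir.rglob("*.py")):
            archive.add(path, arcname=path.relative_to(root_dir).as_posix())
    buffer.seek(0)
    with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
        return archive.getnames()


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my_project"
    package = project / "src" / "my_module"
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")
    (package / "tasks.py").write_text("VALUE = 1\n")

    from_src = bundle_names(project / "src")  # root_dir = src/
    from_project = bundle_names(project)      # root_dir = project root

print(from_src)      # ['my_module/__init__.py', 'my_module/tasks.py']
print(from_project)  # ['src/my_module/__init__.py', 'src/my_module/tasks.py']
```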

Monorepo patterns

Two patterns cover most cases:

                 Pattern A: Shared Lockfile                     Pattern B: Independent Packages
Lockfile         One uv.lock for everything                     Each package has its own
Image isolation  Dependency groups (--only-group etl)           Separate pyproject.toml per package
Use when         Packages developed together, shared dep graph  Different release cadences, fully independent

Pattern A: Shared lockfile

All packages live under one src/ directory with a single pyproject.toml and uv.lock. Tasks install different subsets via dependency groups.

workspace_root/
├── pyproject.toml         <- defines dependency groups
├── uv.lock                <- one lockfile for everything
└── src/
    ├── workspace_app/
    │   ├── main.py
    │   └── tasks/
    │       ├── envs.py
    │       ├── etl_tasks.py
    │       └── ml_tasks.py
    ├── lib_transforms/
    │   └── ops.py
    └── lib_models/
        └── baseline.py

pyproject.toml — only external PyPI deps in dependency groups. Local libraries travel via the code bundle:

pyproject.toml
[project]
name = "workspace-app"
version = "0.1.0"
description = "uv workspace monorepo example for Flyte"
requires-python = ">=3.11"
dependencies = ["flyte>=2.0"]

[build-system]
requires = ["uv_build>=0.9,<0.10"]
build-backend = "uv_build"

[tool.uv]
package = true

[dependency-groups]
# Only external PyPI deps. lib_transforms and lib_models live under src/ alongside
# workspace_app and are included in the code bundle automatically.
etl = ["pandas"]
ml  = ["scikit-learn"]
dev = ["pytest", "ruff"]

Per-task images using dependency groups:

envs.py
import pathlib

import flyte

WORKSPACE_ROOT = pathlib.Path(__file__).parent.parent.parent.parent  # -> workspace_root/

etl_env = flyte.TaskEnvironment(
    name="etl",
    resources=flyte.Resources(memory="512Mi", cpu="1"),
    image=flyte.Image.from_debian_base()
    .with_uv_project(
        pyproject_file=WORKSPACE_ROOT / "pyproject.toml",
        extra_args="--only-group etl",
    )
    .with_code_bundle(),
)

ml_env = flyte.TaskEnvironment(
    name="ml",
    resources=flyte.Resources(memory="1Gi", cpu="1"),
    image=flyte.Image.from_debian_base()
    .with_uv_project(
        pyproject_file=WORKSPACE_ROOT / "pyproject.toml",
        extra_args="--only-group ml",
    )
    .with_code_bundle(),
)

Both etl_env and ml_env point to the same pyproject.toml but install different dependency groups. The extra_args string is included in the image hash, so they produce separate images.

ETL tasks (use the shared lib_transforms library):

etl_tasks.py
from lib_transforms.ops import normalize

from workspace_app.tasks.envs import etl_env


@etl_env.task
async def load_data(n: int) -> list[float]:
    """Simulate loading raw data."""
    return [float(i * 1.5) for i in range(n)]


@etl_env.task
async def transform_data(raw: list[float]) -> list[float]:
    """Normalize raw data."""
    return normalize(raw)


@etl_env.task
async def etl_pipeline(n: int) -> list[float]:
    """Load and normalize data end-to-end."""
    raw = await load_data(n=n)
    return await transform_data(raw=raw)

ML tasks (use the shared lib_models library):

ml_tasks.py
from lib_models.baseline import predict, train_mean_predictor

from workspace_app.tasks.envs import ml_env


@ml_env.task
async def train(features: list[float], labels: list[float]) -> dict:
    """Train a simple model."""
    return train_mean_predictor(features, labels)


@ml_env.task
async def evaluate(model: dict, features: list[float]) -> float:
    """Evaluate the model on a set of features."""
    return predict(model, features)


@ml_env.task
async def ml_pipeline(features: list[float], labels: list[float]) -> float:
    """Train and evaluate a model end-to-end."""
    model = await train(features=features, labels=labels)
    return await evaluate(model=model, features=features)

Entry point: root_dir is set to src/ so the code bundle covers all packages:

main.py
import pathlib

import flyte
from workspace_app.tasks.etl_tasks import etl_pipeline
from workspace_app.tasks.ml_tasks import ml_pipeline

SRC_DIR = pathlib.Path(__file__).parent.parent  # -> workspace_root/src/


if __name__ == "__main__":
    flyte.init_from_config(root_dir=SRC_DIR)

    features = [1.5, 3.0, 4.5, 6.0, 7.5]
    labels = [0.0, 1.0, 2.0, 3.0, 4.0]

    # Development: fast deploy (code bundle delivers source at runtime)
    etl_run = flyte.run(etl_pipeline, n=10)
    print(f"ETL run: {etl_run.url}")

    ml_run = flyte.run(ml_pipeline, features=features, labels=labels)
    print(f"ML run: {ml_run.url}")

    # Production: bake source into the image. Uncomment, set a version, and import
    # etl_env and ml_env from workspace_app.tasks.envs first.
    # flyte.deploy(etl_env, copy_style="none", version="1.0.0")
    # flyte.deploy(ml_env, copy_style="none", version="1.0.0")

Pattern B: Independent packages

Each package has its own pyproject.toml and uv.lock. Fully independent image builds.

repo_root/
├── pyproject.toml         <- dev-only convenience (optional)
├── my_app/
│   ├── pyproject.toml     <- lists external deps + my-lib as editable path dep
│   ├── uv.lock            <- deployment lockfile
│   └── src/my_app/
│       ├── env.py
│       ├── main.py
│       └── tasks.py
└── my_lib/
    ├── pyproject.toml
    └── src/my_lib/
        └── stats.py

Root pyproject.toml — dev-only, installs both packages as editable for local development:

pyproject.toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "sibling-packages-dev"
version = "0.1.0"
description = "Dev-only root: installs both packages as editable for local development"
requires-python = ">=3.11"
dependencies = ["my-app", "my-lib"]

[tool.uv.sources]
my-app = { path = "my_app", editable = true }
my-lib = { path = "my_lib", editable = true }

[tool.hatch.build.targets.wheel]
packages = [
    "my_app/src/my_app",
    "my_lib/src/my_lib",
]

my_app/pyproject.toml — declares my-lib as an editable path dep:

pyproject.toml
[project]
name = "my-app"
version = "0.1.0"
description = "Flyte app"
requires-python = ">=3.11"
dependencies = ["flyte>=2.0", "my-lib"]

[tool.uv]
package = true

[tool.uv.sources]
my-lib = { path = "../my_lib", editable = true }

[build-system]
requires = ["uv_build>=0.9,<0.10"]
build-backend = "uv_build"

Image definition — sibling library baked into the image via with_source_folder():

env.py
import pathlib

import flyte

MY_APP_ROOT = pathlib.Path(__file__).parent.parent.parent  # -> my_app/
MY_LIB_PKG = MY_APP_ROOT.parent / "my_lib" / "src" / "my_lib"  # -> my_lib/src/my_lib/

env = flyte.TaskEnvironment(
    name="my_app",
    resources=flyte.Resources(memory="256Mi", cpu="1"),
    # my_lib is an editable path dep in pyproject.toml (so uv_build can find its source
    # during image build). Its package files are also baked into the image at /root/my_lib/
    # via with_source_folder, so they're importable at runtime without relying on the
    # editable install's .pth file (which points to a build-stage-only path).
    image=flyte.Image.from_debian_base()
    .with_uv_project(pyproject_file=MY_APP_ROOT / "pyproject.toml")
    .with_source_folder(MY_LIB_PKG)
    .with_code_bundle(),
)

with_source_folder(MY_LIB_PKG) copies the my_lib package directory into the image at /root/my_lib/. This is necessary because the editable install’s .pth file points to a path that only exists during the image build stage. The my_lib layer is part of the image hash, so the image rebuilds when my_lib changes — correct behavior for a dependency.
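
The .pth problem can be reproduced locally with the standard site module. This sketch assumes the editable install is recorded as a plain directory line in a .pth file; the directory names and the paths_added_by_pth helper are hypothetical. Python only adds a .pth entry to sys.path if the directory exists, so once the build-stage directory is gone the package silently stops being importable.

```python
import shutil
import site
import sys
import tempfile
from pathlib import Path


def paths_added_by_pth(site_dir: Path) -> list:
    """Process the .pth files in site_dir and report the sys.path entries they add."""
    original = list(sys.path)
    try:
        site.addsitedir(str(site_dir))
        return [p for p in sys.path if p not in original and Path(p) != site_dir]
    finally:
        sys.path[:] = original  # leave the interpreter's path untouched


with tempfile.TemporaryDirectory() as tmp:
    build_src = Path(tmp) / "build" / "my_lib" / "src"  # exists only during the build
    site_dir = Path(tmp) / "site-packages"
    build_src.mkdir(parents=True)
    site_dir.mkdir()
    (site_dir / "my_lib_editable.pth").write_text(f"{build_src}\n")

    during_build = paths_added_by_pth(site_dir)
    shutil.rmtree(Path(tmp) / "build")  # the build stage is discarded
    at_runtime = paths_added_by_pth(site_dir)

print(len(during_build))  # 1: the source directory is on sys.path
print(at_runtime)         # []: the .pth entry points nowhere and is skipped
```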

Task definitions:

tasks.py
from my_app.env import env


@env.task
async def compute_stats(values: list[float]) -> dict:
    """Compute basic statistics using the my_lib utility library."""
    from my_lib.stats import mean, std

    return {
        "mean": mean(values),
        "std": std(values),
        "count": len(values),
    }


@env.task
async def summarize(stats: dict) -> str:
    return f"n={stats['count']}, mean={stats['mean']:.2f}, std={stats['std']:.2f}"

Entry point: root_dir covers only my_app source; my_lib is baked into the image:

main.py
import pathlib

import flyte
from my_app.env import env
from my_app.tasks import compute_stats, summarize

MY_APP_ROOT = pathlib.Path(__file__).parent.parent.parent  # -> my_app/
SRC_DIR = MY_APP_ROOT / "src"  # -> my_app/src/


@env.task
async def stats_pipeline(values: list[float]) -> str:
    stats = await compute_stats(values=values)
    return await summarize(stats=stats)


if __name__ == "__main__":
    # my_lib is installed in the image; root_dir only needs to cover my_app source
    flyte.init_from_config(root_dir=SRC_DIR)

    # Development -- run a task directly, code bundle handles source delivery
    run = flyte.run(stats_pipeline, values=[1.0, 2.0, 3.0, 4.0, 5.0])
    print(f"Run URL: {run.url}")

    # Production -- deploy an environment with source baked into the image
    # flyte.deploy(env, copy_style="none", version="1.0.0")

The full build path (production)

For production deployments where you need immutable, self-contained images:

flyte.deploy(my_env, copy_style="none", version="1.2.3")

with_code_bundle() on the image resolves to a COPY instruction. The image is fully self-contained.

Use a deterministic version string — a git commit SHA, a git tag, a CI build number. Avoid auto-generated strings so you can trace which code is in which image.
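
One possible way to derive such a version string is sketched below. The resolve_version helper is hypothetical. It prefers GITHUB_SHA, which GitHub Actions sets (other CI systems use different variable names), and otherwise falls back to asking the local git checkout; the 12-character truncation is an arbitrary choice for the example.

```python
import os
import subprocess
from collections.abc import Mapping


def resolve_version(env: Mapping = os.environ) -> str:
    """Return a traceable version: CI-provided commit SHA first, then local git."""
    ci_sha = env.get("GITHUB_SHA")  # set by GitHub Actions; adjust for other CI
    if ci_sha:
        return ci_sha[:12]
    # Fallback: requires git on PATH and a checkout with at least one commit.
    result = subprocess.run(
        ["git", "rev-parse", "--short=12", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Passing an explicit mapping keeps the example independent of the real environment.
example = resolve_version({"GITHUB_SHA": "0123456789abcdef0123456789abcdef01234567"})
print(example)  # 0123456789ab

# In a deploy script (assumes my_env is defined elsewhere):
# flyte.deploy(my_env, copy_style="none", version=resolve_version())
```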

Do not use install_project mode for production builds. install_project copies the entire project directory into the build context and hashes all of it. Every code change triggers a full image rebuild. with_code_bundle() + copy_style="none" is more surgical: only the files you select are in the image.