Structuring Flyte projects with uv
The two layers
Every Flyte + uv project involves two distinct layers. Understanding this distinction is the foundation for every decision that follows.
The image (slow-changing): the Python environment, installed packages, system dependencies, the interpreter. The SDK computes an MD5 hash of the image’s layer stack and only rebuilds when a layer actually changes.
The code bundle (fast-changing): your task source code, packaged as a tarball and uploaded on every run. The container downloads and unpacks it at startup.
+-----------------------------------+
| Docker Image (slow path)          |
|   Python interpreter              |
|   Installed packages (uv sync)    |  <- rebuilt only when deps change
|   System packages (apt)           |
|   Content-hashed, registry-cached |
+-----------------------------------+
| Code Bundle (fast path)           |
|   Your task source files          |  <- uploaded on every run
|   Local library code              |
|   Tarball extracted at startup    |
+-----------------------------------+
Keep these two layers separate. Your image definition should describe only the environment. Source code travels in the code bundle (in fast-deploy mode) or gets baked in at deploy time (in full-build mode).
Violating this principle (copying source into the image with with_source_folder() or using install_project mode for local code) means your image hash changes on every code edit, causing a full Docker build and push on every iteration.
How the image gets built
flyte.Image is a frozen, content-addressed layer stack. Each .with_*() call appends an immutable layer. The final image tag is an MD5 hash of all layers.
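The idea of a frozen, content-addressed layer stack can be illustrated with a toy model. This is a minimal sketch, not the SDK's implementation: `ToyImage`, `with_layer`, and the layer description strings are hypothetical names invented for illustration, and the real hash inputs may differ.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyImage:
    """Hypothetical stand-in for an image: an immutable tuple of layer descriptions."""

    layers: tuple[str, ...] = ()

    def with_layer(self, description: str) -> "ToyImage":
        # Return a new object; the existing one is never mutated.
        return ToyImage(self.layers + (description,))

    @property
    def tag(self) -> str:
        # Hash every layer in order, so a change to any layer changes the tag.
        digest = hashlib.md5()
        for layer in self.layers:
            digest.update(layer.encode("utf-8"))
            digest.update(b"\0")  # separator so ("ab", "c") differs from ("a", "bc")
        return digest.hexdigest()


base = ToyImage().with_layer("debian-base")
etl = base.with_layer("uv sync --only-group etl")
ml = base.with_layer("uv sync --only-group ml")
etl_again = base.with_layer("uv sync --only-group etl")

print(base.tag != etl.tag)       # True: appending a layer changes the tag
print(etl.tag != ml.tag)         # True: different layer contents, different tags
print(etl.tag == etl_again.tag)  # True: identical layers always give the same tag
```

Because the tag depends only on layer contents, an unchanged stack maps to an image that already exists in the registry, which is what makes skipping the rebuild safe.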
The primary method for uv projects is .with_uv_project():
from pathlib import Path

import flyte

image = flyte.Image.from_debian_base().with_uv_project(
    pyproject_file=Path("my_app/pyproject.toml"),
)
Two installation modes:
- dependencies_only (default): Only pyproject.toml and uv.lock are included in the build context. The image hash covers only these two files, so your code does not affect the image hash.
- install_project: The entire project directory is copied into the build context. Any code change triggers a full image rebuild. Use this only when you need the project installed as a proper package (e.g., when you need package entry points or compiled extension modules).
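The difference between the two modes comes down to which files feed the hash. The sketch below is a simplified illustration under stated assumptions: `context_hash` is a hypothetical helper, and the SDK's actual hashing scheme is not reproduced here. It only shows why a source edit leaves one hash untouched and changes the other.

```python
import hashlib
import tempfile
from pathlib import Path


def context_hash(project: Path, mode: str) -> str:
    """MD5 over the relative paths and contents of the files a mode would include."""
    if mode == "dependencies_only":
        files = [project / "pyproject.toml", project / "uv.lock"]
    else:  # "install_project": every file in the project directory
        files = sorted(p for p in project.rglob("*") if p.is_file())
    digest = hashlib.md5()
    for path in files:
        digest.update(path.relative_to(project).as_posix().encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "pyproject.toml").write_text('[project]\nname = "demo"\n')
    (project / "uv.lock").write_text("version = 1\n")
    (project / "tasks.py").write_text("x = 1\n")

    deps_before = context_hash(project, "dependencies_only")
    full_before = context_hash(project, "install_project")

    (project / "tasks.py").write_text("x = 2\n")  # edit source code only

    deps_unchanged = context_hash(project, "dependencies_only") == deps_before
    full_unchanged = context_hash(project, "install_project") == full_before

print(deps_unchanged)  # True: the dependency files did not change
print(full_unchanged)  # False: the source edit changed the full-project hash
```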
with_code_bundle() — one image for dev and prod
with_code_bundle() is how you write an image definition that works for both development and production without changing any code.
image = (
    flyte.Image.from_debian_base()
    .with_uv_project(pyproject_file=Path("pyproject.toml"))
    .with_code_bundle()
)
Its behavior depends on copy_style at run time:
- Fast deploy (default): with_code_bundle() is a no-op. Source travels as a tarball. The image only rebuilds when pyproject.toml or uv.lock changes.
- Full build (copy_style="none"): with_code_bundle() resolves to a COPY instruction. Source is baked into the image. This is your production path.
# Development
flyte.run(my_task)
# Production
flyte.deploy(my_env, copy_style="none", version="1.2.3")
root_dir
root_dir tells Flyte where to look when building the code bundle and what path prefix to strip when packaging.
The rule: set root_dir to the directory you would cd into before running python -c "import my_module".
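The effect of the stripped prefix can be seen by packing the same files into a tarball relative to two different roots. This is a rough sketch using only the standard library: `bundle_names` is a hypothetical helper and the layout `my_project/src/my_module/` is an assumed example, not how Flyte builds its bundle internally.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def bundle_names(root_dir: Path, files: list[Path]) -> list[str]:
    """Pack files into an in-memory tarball, naming each entry relative to root_dir."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for file in files:
            tar.add(file, arcname=file.relative_to(root_dir).as_posix())
    buffer.seek(0)
    with tarfile.open(fileobj=buffer, mode="r:gz") as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my_project"
    package = project / "src" / "my_module"
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")
    (package / "tasks.py").write_text("x = 1\n")
    files = sorted(package.glob("*.py"))

    from_src = bundle_names(project / "src", files)
    from_project = bundle_names(project, files)

print(from_src)      # ['my_module/__init__.py', 'my_module/tasks.py']
print(from_project)  # ['src/my_module/__init__.py', 'src/my_module/tasks.py']
```

With the root at src/, the extracted tree starts at my_module/, so `import my_module` resolves. With the root one level up, the package sits under src/ and the same import would not find it.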
For src-layout projects, set root_dir to src/:
flyte.init_from_config(root_dir=Path(__file__).parent.parent)  # -> src/
For flat layout projects, set root_dir to the project root:
flyte.init_from_config(root_dir=Path(__file__).parent)  # -> my_project/
Monorepo patterns
Two patterns cover most cases:
| | Pattern A: Shared Lockfile | Pattern B: Independent Packages |
|---|---|---|
| Lockfile | One uv.lock for everything | Each package has its own |
| Image isolation | Dependency groups (--only-group etl) | Separate pyproject.toml per package |
| Use when | Packages developed together, shared dep graph | Different release cadences, fully independent |
Pattern A: Shared lockfile (recommended)
All packages live under one src/ directory with a single pyproject.toml and uv.lock. Tasks install different subsets via dependency groups.
workspace_root/
├── pyproject.toml        <- defines dependency groups
├── uv.lock               <- one lockfile for everything
└── src/
    ├── workspace_app/
    │   ├── main.py
    │   └── tasks/
    │       ├── envs.py
    │       ├── etl_tasks.py
    │       └── ml_tasks.py
    ├── lib_transforms/
    │   └── ops.py
    └── lib_models/
        └── baseline.py
pyproject.toml: only external PyPI deps go in dependency groups. Local libraries travel via the code bundle:
[project]
name = "workspace-app"
version = "0.1.0"
description = "uv workspace monorepo example for Flyte"
requires-python = ">=3.11"
dependencies = ["flyte>=2.0"]
[build-system]
requires = ["uv_build>=0.9,<0.10"]
build-backend = "uv_build"
[tool.uv]
package = true
[dependency-groups]
# Only external PyPI deps. lib_transforms and lib_models live under src/ alongside
# workspace_app and are included in the code bundle automatically.
etl = ["pandas"]
ml = ["scikit-learn"]
dev = ["pytest", "ruff"]
Per-task images using dependency groups:
import pathlib

import flyte

WORKSPACE_ROOT = pathlib.Path(__file__).parent.parent.parent.parent  # -> 01_workspace_monorepo/

etl_env = flyte.TaskEnvironment(
    name="etl",
    resources=flyte.Resources(memory="512Mi", cpu="1"),
    image=flyte.Image.from_debian_base()
    .with_uv_project(
        pyproject_file=WORKSPACE_ROOT / "pyproject.toml",
        extra_args="--only-group etl",
    )
    .with_code_bundle(),
)

ml_env = flyte.TaskEnvironment(
    name="ml",
    resources=flyte.Resources(memory="1Gi", cpu="1"),
    image=flyte.Image.from_debian_base()
    .with_uv_project(
        pyproject_file=WORKSPACE_ROOT / "pyproject.toml",
        extra_args="--only-group ml",
    )
    .with_code_bundle(),
)
Both etl_env and ml_env point to the same pyproject.toml but install different dependency groups. The extra_args string is included in the image hash, so they produce separate images.
ETL tasks (use the shared lib_transforms library):
from lib_transforms.ops import normalize
from workspace_app.tasks.envs import etl_env


@etl_env.task
async def load_data(n: int) -> list[float]:
    """Simulate loading raw data."""
    return [float(i * 1.5) for i in range(n)]


@etl_env.task
async def transform_data(raw: list[float]) -> list[float]:
    """Normalize raw data."""
    return normalize(raw)


@etl_env.task
async def etl_pipeline(n: int) -> list[float]:
    """Load and normalize data end-to-end."""
    raw = await load_data(n=n)
    return await transform_data(raw=raw)
ML tasks (use the shared lib_models library):
from lib_models.baseline import predict, train_mean_predictor
from workspace_app.tasks.envs import ml_env


@ml_env.task
async def train(features: list[float], labels: list[float]) -> dict:
    """Train a simple model."""
    return train_mean_predictor(features, labels)


@ml_env.task
async def evaluate(model: dict, features: list[float]) -> float:
    """Evaluate the model on a set of features."""
    return predict(model, features)


@ml_env.task
async def ml_pipeline(features: list[float], labels: list[float]) -> float:
    """Train and evaluate a model end-to-end."""
    model = await train(features=features, labels=labels)
    return await evaluate(model=model, features=features)
Entry point — root_dir is set to src/ so the code bundle covers all packages:
import pathlib

import flyte

from workspace_app.tasks.etl_tasks import etl_pipeline
from workspace_app.tasks.ml_tasks import ml_pipeline

SRC_DIR = pathlib.Path(__file__).parent.parent  # -> 01_workspace_monorepo/src/

if __name__ == "__main__":
    flyte.init_from_config(root_dir=SRC_DIR)

    features = [1.5, 3.0, 4.5, 6.0, 7.5]
    labels = [0.0, 1.0, 2.0, 3.0, 4.0]

    # Development: fast deploy (code bundle delivers source at runtime)
    etl_run = flyte.run(etl_pipeline, n=10)
    print(f"ETL run: {etl_run.url}")
    ml_run = flyte.run(ml_pipeline, features=features, labels=labels)
    print(f"ML run: {ml_run.url}")

    # Production: bake source into the image (uncomment and set a version)
    # flyte.deploy(etl_env, copy_style="none", version="1.0.0")
    # flyte.deploy(ml_env, copy_style="none", version="1.0.0")
Pattern B: Independent packages
Each package has its own pyproject.toml and uv.lock. Fully independent image builds.
repo_root/
├── pyproject.toml            <- dev-only convenience (optional)
├── my_app/
│   ├── pyproject.toml        <- lists external deps + my-lib as editable path dep
│   ├── uv.lock               <- deployment lockfile
│   └── src/my_app/
│       ├── env.py
│       ├── main.py
│       └── tasks.py
└── my_lib/
    ├── pyproject.toml
    └── src/my_lib/
        └── stats.py
Root pyproject.toml: dev-only, installs both packages as editable for local development:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "sibling-packages-dev"
version = "0.1.0"
description = "Dev-only root: installs both packages as editable for local development"
requires-python = ">=3.11"
dependencies = ["my-app", "my-lib"]
[tool.uv.sources]
my-app = { path = "my_app", editable = true }
my-lib = { path = "my_lib", editable = true }
[tool.hatch.build.targets.wheel]
packages = [
"my_app/src/my_app",
"my_lib/src/my_lib",
]
my_app/pyproject.toml — declares my-lib as an editable path dep:
[project]
name = "my-app"
version = "0.1.0"
description = "Flyte app"
requires-python = ">=3.11"
dependencies = ["flyte>=2.0", "my-lib"]
[tool.uv]
package = true
[tool.uv.sources]
my-lib = { path = "../my_lib", editable = true }
[build-system]
requires = ["uv_build>=0.9,<0.10"]
build-backend = "uv_build"
Image definition — sibling library baked into the image via with_source_folder():
import pathlib

import flyte

MY_APP_ROOT = pathlib.Path(__file__).parent.parent.parent  # -> my_app/
MY_LIB_PKG = MY_APP_ROOT.parent / "my_lib" / "src" / "my_lib"  # -> my_lib/src/my_lib/

env = flyte.TaskEnvironment(
    name="my_app",
    resources=flyte.Resources(memory="256Mi", cpu="1"),
    # my_lib is an editable path dep in pyproject.toml (so uv_build can find its source
    # during image build). Its package files are also baked into the image at /root/my_lib/
    # via with_source_folder, so they're importable at runtime without relying on the
    # editable install's .pth file (which points to a build-stage-only path).
    image=flyte.Image.from_debian_base()
    .with_uv_project(pyproject_file=MY_APP_ROOT / "pyproject.toml")
    .with_source_folder(MY_LIB_PKG)
    .with_code_bundle(),
)
with_source_folder(MY_LIB_PKG) copies the my_lib package directory into the image at /root/my_lib/. This is necessary because the editable install's .pth file points to a path that only exists during the image build stage. The my_lib layer is part of the image hash, so the image rebuilds whenever my_lib changes, which is the correct behavior for a dependency.
Task definitions:
from my_app.env import env


@env.task
async def compute_stats(values: list[float]) -> dict:
    """Compute basic statistics using the my_lib utility library."""
    from my_lib.stats import mean, std

    return {
        "mean": mean(values),
        "std": std(values),
        "count": len(values),
    }


@env.task
async def summarize(stats: dict) -> str:
    return f"n={stats['count']}, mean={stats['mean']:.2f}, std={stats['std']:.2f}"
Entry point — root_dir covers only my_app source; my_lib is baked into the image:
import pathlib

import flyte

from my_app.env import env
from my_app.tasks import compute_stats, summarize

MY_APP_ROOT = pathlib.Path(__file__).parent.parent.parent  # -> my_app/
SRC_DIR = MY_APP_ROOT / "src"  # -> my_app/src/


@env.task
async def stats_pipeline(values: list[float]) -> str:
    stats = await compute_stats(values=values)
    return await summarize(stats=stats)


if __name__ == "__main__":
    # my_lib is installed in the image; root_dir only needs to cover my_app source
    flyte.init_from_config(root_dir=SRC_DIR)

    # Development -- run a task directly, code bundle handles source delivery
    run = flyte.run(stats_pipeline, values=[1.0, 2.0, 3.0, 4.0, 5.0])
    print(f"Run URL: {run.url}")

    # Production -- deploy an environment with source baked into the image
    # flyte.deploy(env, copy_style="none", version="1.0.0")
The full build path (production)
For production deployments where you need immutable, self-contained images:
flyte.deploy(my_env, copy_style="none", version="1.2.3")
with_code_bundle() on the image resolves to a COPY instruction. The image is fully self-contained.
Use a deterministic version string — a git commit SHA, a git tag, a CI build number. Avoid auto-generated strings so you can trace which code is in which image.
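One way to obtain such a string is to derive it from the commit being deployed. The helper below is a hedged sketch: `resolve_version` is a hypothetical function, `GITHUB_SHA` is the variable GitHub Actions sets (other CI systems use other names), and the 12-character truncation is an arbitrary choice.

```python
import subprocess
from collections.abc import Mapping


def resolve_version(env: Mapping[str, str]) -> str:
    """Prefer a CI-provided commit SHA; otherwise ask git for the current commit."""
    sha = env.get("GITHUB_SHA")  # assumption: running under GitHub Actions
    if sha:
        return sha[:12]
    # Fallback for local runs; raises CalledProcessError outside a git repository.
    result = subprocess.run(
        ["git", "rev-parse", "--short=12", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


version = resolve_version({"GITHUB_SHA": "0123456789abcdef0123456789abcdef01234567"})
print(version)  # 0123456789ab
```

In a deploy script this could be called as `resolve_version(os.environ)` and the result passed as the `version` argument of `flyte.deploy`.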
Do not use install_project mode for production builds. install_project copies the entire project directory into the build context and hashes all of it. Every code change triggers a full image rebuild. with_code_bundle() + copy_style="none" is more surgical: only the files you select are in the image.