Bring your own image
This guide is for teams who own their Docker images and want Flyte for orchestration without handing over their build pipeline.
This guide does not cover flyte.Image.from_debian_base(), the Flyte-managed image builder. It assumes you already have images.
The multi-team problem
Two teams. Two images. One workflow.
| | Team A (data-prep) | Team B (training) |
|---|---|---|
| Base | `python:3.11-slim` | `python:3.10-slim` (prod: CUDA) |
| Python | 3.11 | 3.10 |
| WORKDIR | `/app` | `/workspace` |
| Packages | pandas, pyarrow | torch, numpy |
The prepare task runs in Team A’s container. It processes the input and calls train, which runs in Team B’s container. One workflow, two images, different filesystem layouts.
Two patterns solve this. Pick based on who controls what:
| | Pattern 1: Pure BYOI | Pattern 2: Remote Builder |
|---|---|---|
| Who owns the image? | Each team owns everything | Each team owns the base |
| Flyte-aware? | Yes — code is baked in | No — Flyte adapts on top |
| Code change = image rebuild? | Yes | No |
| Use when | Teams can’t let Flyte touch images | Teams can hand off a base |
Pattern 1: Pure BYOI
Teams build complete, Flyte-aware images. Workflow code is COPYed into the image by the Dockerfile. Flyte runs the container as a black box — it sends no code and modifies nothing.
Dockerfiles
Both teams install flyte and COPY the shared workflow_code/ into their image. The only difference is their base, Python version, and WORKDIR.
Team A (data prep):
```dockerfile
# Team A's image: data preparation
#
# This team owns this entire Dockerfile. They control Python version, WORKDIR,
# and PYTHONPATH. Flyte has no say here.
#
# Pure BYOI constraint: workflow code must be baked in because there is no
# code bundle. Every code change requires rebuilding and pushing this image.
FROM python:3.11-slim

# System deps this team needs
RUN apt-get update && apt-get install -y --no-install-recommends \
        libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Team A's WORKDIR. Python will find modules here because PYTHONPATH includes it.
WORKDIR /app

# Team A's Python packages. These are their own dependencies — Flyte doesn't
# install anything on top in pure BYOI mode.
RUN pip install --no-cache-dir \
        flyte \
        pandas==2.1.4 \
        pyarrow==14.0.1

# Bake the workflow code into the image.
# In pure BYOI, this is the ONLY way Flyte can find your task functions.
# Downside: every edit to tasks.py requires a new image tag + CI build.
COPY workflow_code/ /app/workflow_code/

# /app is on PYTHONPATH so `import workflow_code.tasks` resolves at runtime.
ENV PYTHONPATH=/app
```
Team B (training):
```dockerfile
# Team B's image: model training (GPU workload)
#
# Intentionally different from Team A:
#   - Different base: CUDA runtime instead of debian slim
#   - Different Python version: 3.10 (team B's standard)
#   - Different WORKDIR: /workspace (not /app)
#   - Different PYTHONPATH: /workspace
#   - Different system packages: CUDA tools
#
# Pure BYOI: Flyte injects nothing at runtime. Everything must be baked in.
# Each team owns their filesystem layout entirely.
#
# In practice this would be:
#   FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
# Using python:3.10-slim here so you can build/test without a GPU machine.
FROM python:3.10-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

# Team B uses /workspace, not /app. Each team controls their own layout.
WORKDIR /workspace

RUN pip install --no-cache-dir \
        flyte \
        torch==2.1.2 \
        numpy==1.26.4

# Bake in workflow code at /workspace/workflow_code/.
# Python finds it because PYTHONPATH includes /workspace.
COPY workflow_code/ /workspace/workflow_code/

# /workspace is on PYTHONPATH — same import path as Team A's image despite
# the different WORKDIR. Both images expose `import workflow_code.tasks`.
ENV PYTHONPATH=/workspace
```
Both expose import workflow_code.tasks at runtime because each image’s PYTHONPATH points to its own WORKDIR where the code was COPYed.
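This invariant is easy to check outside Docker. PYTHONPATH entries become `sys.path` entries at interpreter startup, so `workflow_code` resolves relative to whichever directory each image names. A minimal sketch, with temp directories standing in for `/app` and `/workspace`:

```python
# Sketch: why both images resolve `import workflow_code.tasks` despite
# different WORKDIRs. Adding a directory to sys.path (what PYTHONPATH does
# at startup) is all it takes; the directory's location is irrelevant.
import importlib
import sys
import tempfile
from pathlib import Path

for workdir_name in ("app", "workspace"):  # stand-ins for /app and /workspace
    root = Path(tempfile.mkdtemp(prefix=workdir_name))
    pkg = root / "workflow_code"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "tasks.py").write_text("GREETING = 'hello from tasks'\n")

    sys.path.insert(0, str(root))  # the role PYTHONPATH plays in each image
    mod = importlib.import_module("workflow_code.tasks")
    print(workdir_name, "->", mod.GREETING)

    # Reset so the second iteration re-resolves from its own root.
    sys.path.remove(str(root))
    for name in ("workflow_code.tasks", "workflow_code"):
        sys.modules.pop(name, None)
```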
Build and push
The build context is the pure_byoi/ directory so that workflow_code/ is available to both Dockerfiles:
```shell
docker build -f data_prep/Dockerfile -t <your-registry>/data-prep:latest .
docker build -f training/Dockerfile -t <your-registry>/training:latest .
docker push <your-registry>/data-prep:latest
docker push <your-registry>/training:latest
```
Python code
Environment definitions — image names are specified via from_ref_name():
```python
import flyte

# Image refs are resolved at runtime via init_from_config(images=...) in main.py.
# from_ref_name() is used instead of hardcoding the URI because this file is
# COPYed into both images — hardcoding would create a circular reference (the
# image baked into itself would reference its own tag).
env_train = flyte.TaskEnvironment(
    name="training",
    image=flyte.Image.from_ref_name("training"),
)

env_data = flyte.TaskEnvironment(
    name="data-prep",
    image=flyte.Image.from_ref_name("data-prep"),
    depends_on=[env_train],
)
```
from_ref_name() is a placeholder resolved at runtime. The actual URIs are passed in the entry point via init_from_config(images=...). This is necessary because the envs file is COPYed into both images — hardcoding a URI would create a circular reference.
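Each `images=` entry is a plain `name=URI` string, so conceptually the wiring is a dictionary lookup keyed by ref name. The sketch below mimics that idea only — it is not Flyte's actual parser, and the registry host is made up:

```python
# Hypothetical sketch of the ref-name -> URI wiring behind
# init_from_config(images=...). Not Flyte's real implementation.
def parse_image_refs(entries: tuple) -> dict:
    refs = {}
    for entry in entries:
        name, sep, uri = entry.partition("=")
        if not sep:
            raise ValueError(f"expected 'name=uri', got {entry!r}")
        refs[name] = uri
    return refs

refs = parse_image_refs((
    "data-prep=registry.example.com/data-prep:latest",
    "training=registry.example.com/training:latest",
))

# A TaskEnvironment built with from_ref_name("training") would resolve to:
print(refs["training"])  # registry.example.com/training:latest
```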
Task definitions:
```python
from workflow_code.envs import env_data, env_train


@env_train.task
async def train(processed: str) -> float:
    # Runs in Team B's container: python 3.10, WORKDIR /workspace
    # torch is available (installed in training/Dockerfile)
    return float(len(processed))


@env_data.task
async def prepare(raw: str = "Hello World") -> float:
    # Runs in Team A's container: python 3.11, WORKDIR /app
    # pandas is available (installed in data_prep/Dockerfile)
    processed = raw.strip().lower()
    return await train(processed)
```
Entry point — this is where image URIs are wired in:
```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["flyte"]
# ///
"""Pure BYOI entry point.

workflow_code/ is baked into both images via their Dockerfiles.
Flyte runs each container as a black box — no code bundle is sent.

Build and push both images first (from v2_guide/pure_byoi/):

    docker build -f data_prep/Dockerfile -t <your-registry>/data-prep:latest .
    docker build -f training/Dockerfile -t <your-registry>/training:latest .
    docker push <your-registry>/data-prep:latest
    docker push <your-registry>/training:latest

Run (from v2_guide/pure_byoi/):

    uv run main.py
"""

import flyte

from workflow_code.tasks import prepare

DATA_PREP_IMAGE = "<your-registry>/data-prep:latest"
TRAINING_IMAGE = "<your-registry>/training:latest"

if __name__ == "__main__":
    # No root_dir — Flyte does not inject code. The images contain everything.
    flyte.init_from_config(
        project="flytesnacks",
        domain="development",
        images=(
            f"data-prep={DATA_PREP_IMAGE}",
            f"training={TRAINING_IMAGE}",
        ),
    )

    # Development: run the pipeline.
    # copy_style="none": no code bundle, the image IS the deployment.
    run = flyte.with_runcontext(copy_style="none", version="dev").run(prepare, raw="Hello World")
    print(run.url)
    run.wait()

    # Production: register task environments against the cluster.
    # from workflow_code.envs import env_data, env_train
    # flyte.deploy(env_data, copy_style="none", version="1.0.0")
    # flyte.deploy(env_train, copy_style="none", version="1.0.0")
```
Run and deploy
```shell
uv run main.py
```
There is no separate deploy step. The image tag is the version. To ship a code change: edit tasks, rebuild both images, push new tags, update the tag constants in main.py, run again.
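Because the tag is the version, the failure mode to watch for is a stale tag: rebuild the images but forget to bump the constants in main.py, and Flyte quietly runs old code. One way to automate the bookkeeping, shown here as a convention you would adopt yourself rather than anything Flyte provides, is deriving the tag from a hash of the workflow code:

```python
# Sketch: content-addressed image tags. A code change produces a new digest,
# so the tag constants can never silently point at stale code. This is a
# local convention, not a Flyte feature.
import hashlib
from pathlib import Path


def code_tag(code_dir: Path) -> str:
    digest = hashlib.sha256()
    for path in sorted(code_dir.rglob("*.py")):
        digest.update(path.relative_to(code_dir).as_posix().encode())
        digest.update(path.read_bytes())
    return "code-" + digest.hexdigest()[:12]


# Usage (hypothetical): tag the docker builds and main.py from the same digest.
# TAG = code_tag(Path("workflow_code"))
# DATA_PREP_IMAGE = f"<your-registry>/data-prep:{TAG}"
```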
Pattern 2: Remote Builder
Teams hand you their base images. They built these images for their own purposes — Flyte was never a consideration. Your job is to adapt them.
The base images
Team A uses continuumio/miniconda3 as their base:
```dockerfile
# Team A's image: data preparation
# Uses conda. Team A has no knowledge of Flyte.
# Python is at /opt/conda/bin/python.
# No PYTHONPATH set — teams don't know Flyte needs it.
FROM continuumio/miniconda3:latest

RUN conda install -y -c conda-forge pandas==2.2.3 pyarrow==19.0.1 && \
    conda clean -afy

WORKDIR /app
```
- Python at `/opt/conda/bin/python` (conda manages this)
- conda’s Dockerfile already adds `/opt/conda/bin` to `PATH`
- No `PYTHONPATH` set
- WORKDIR `/app`
Team B uses python:3.10-slim with a pip venv at /opt/venv:
```dockerfile
# Team B's image: model training
# Uses a pip venv at /opt/venv. Team B has no knowledge of Flyte.
# Python is at /opt/venv/bin/python.
# No PYTHONPATH set, PATH doesn't include /opt/venv/bin.
FROM python:3.10-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN python -m venv /opt/venv && \
    /opt/venv/bin/pip install --no-cache-dir \
        torch==2.1.2+cpu \
        numpy==1.26.4 \
        --extra-index-url https://download.pytorch.org/whl/cpu

WORKDIR /workspace
```
- Python at `/opt/venv/bin/python`
- `PATH` does not include `/opt/venv/bin` — the venv was created but never activated
- No `PYTHONPATH` set
- WORKDIR `/workspace`
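The "never activated" point is worth pinning down: activation does nothing magical, it just prepends the venv's bin directory to PATH so a bare `python` resolves there. A sketch of that lookup, with a temp directory standing in for `/opt/venv/bin` and a made-up executable name:

```python
# Sketch: PATH lookup is all that "activating" a venv changes. Until the
# venv's bin dir is prepended to PATH, executables there are invisible.
# A temp dir stands in for /opt/venv/bin; `venv-marker` is a made-up name.
import shutil
import tempfile
from pathlib import Path

venv_bin = Path(tempfile.mkdtemp()) / "bin"
venv_bin.mkdir()
marker = venv_bin / "venv-marker"
marker.write_text("#!/bin/sh\n")
marker.chmod(0o755)

base_path = "/usr/local/bin:/usr/bin:/bin"  # a PATH without the venv

print(shutil.which("venv-marker", path=base_path))                  # None
print(shutil.which("venv-marker", path=f"{venv_bin}:{base_path}"))  # found
```

This is exactly what the `"PATH": "/opt/venv/bin:$PATH"` entry in the adaptation code fixes.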
Adapting with flyte.Image
flyte.Image.from_base() takes the base URI and lets you layer on top. This is where the adaptation happens:
```python
import pathlib

import flyte

# Root of the remote_builder example (one level up from this file)
HERE = pathlib.Path(__file__).parent.parent

REGISTRY = "<your-registry>"  # e.g. "ghcr.io/your-org" or "123456789.dkr.ecr.us-east-1.amazonaws.com"

# ── Team B: pip venv ────────────────────────────────────────────────────────────
# Base image uses a venv at /opt/venv — python at /opt/venv/bin/python.
# The venv is not activated: PATH doesn't include /opt/venv/bin.
# Flyte adapts: installs flyte via venv's pip, adds venv to PATH, sets PYTHONPATH.
# $PATH expands at Docker build time to the base image's PATH value.
env_train_image = (
    flyte.Image.from_base("<your-registry>/training-base:latest")
    .clone(name="<your-org>/<your-image>", registry=REGISTRY, extendable=True)
    .with_commands(["/opt/venv/bin/pip install flyte"])
    .with_env_vars(
        {
            "PATH": "/opt/venv/bin:$PATH",
            "PYTHONPATH": "/workspace",  # /workspace is WORKDIR
        }
    )
    .with_code_bundle()
)

# ── Team A: conda ───────────────────────────────────────────────────────────────
# Base image uses conda — python at /opt/conda/bin/python.
# conda's own Dockerfile already adds /opt/conda/bin to PATH, so python is findable.
# Flyte adapts: installs flyte via conda's pip, sets PYTHONPATH.
env_data_image = (
    flyte.Image.from_base("<your-registry>/data-prep-base:latest")
    .clone(name="<your-org>/<your-image>", registry=REGISTRY, extendable=True)
    .with_commands(["/opt/conda/bin/pip install flyte"])
    .with_env_vars({"PYTHONPATH": "/app"})  # /app is WORKDIR; code bundle extracts here
    .with_code_bundle()
)

env_train = flyte.TaskEnvironment(name="training", image=env_train_image)
env_data = flyte.TaskEnvironment(name="data-prep", image=env_data_image, depends_on=[env_train])
```
Team A only needs flyte installed and PYTHONPATH set. conda’s PATH is already correct.
Team B needs three things: flyte installed in the venv, PATH updated so the venv’s python is the default, and PYTHONPATH set. $PATH in an ENV instruction expands at Docker build time.
.with_code_bundle() tells Flyte to inject task source at runtime (dev) or bake it into the image at deploy time (prod).
Task definitions
```python
from tasks.envs import env_data, env_train


@env_train.task
async def train(processed: str) -> float:
    # Runs in /opt/venv — torch and numpy available from Team B's base image
    return float(len(processed))


@env_data.task
async def prepare(raw: str = "Hello World") -> float:
    # Runs in conda env — pandas and pyarrow available from Team A's base image
    processed = raw.strip().lower()
    return await train(processed)
```
Entry point
```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["flyte"]
# ///
"""Remote builder BYOI: two teams, two Flyte-unaware images, Flyte fills the gaps.

Each team owns their base image — built with their preferred package manager,
no knowledge of Flyte. The Flyte engineer adapts by installing flyte via the
right pip, fixing PATH, and setting PYTHONPATH.

    data_prep: continuumio/miniconda3 — conda env, python at /opt/conda/bin/python
    training:  python:3.10-slim — venv at /opt/venv, python at /opt/venv/bin/python

Build and push base images first (from v2_guide/remote_builder/):

    docker build -f data_prep/Dockerfile -t <your-registry>/data-prep-base:latest data_prep/
    docker build -f training/Dockerfile -t <your-registry>/training-base:latest training/
    docker push <your-registry>/data-prep-base:latest
    docker push <your-registry>/training-base:latest

Run (from v2_guide/remote_builder/):

    uv run main.py
"""

import pathlib

import flyte

HERE = pathlib.Path(__file__).parent

if __name__ == "__main__":
    flyte.init_from_config(root_dir=HERE)

    # Development: fast code iteration — image only rebuilds when base image changes.
    # run = flyte.with_runcontext(version="dev").run(prepare, raw="Hello World")
    # print(run.url)
    # run.wait()

    # Production: bake code into both images, pin to a version.
    from tasks.envs import env_data, env_train

    flyte.deploy(env_data, copy_style="none", version="3.0.0")
    flyte.deploy(env_train, copy_style="none", version="3.0.0")
```
Run and deploy
```shell
uv run main.py
```
During development you only rebuild the base image when the Dockerfile changes. Code changes are free — they travel as a tarball at runtime.
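The tarball mechanism is easy to picture. This sketch mimics the idea only — pack the task sources on the client, extract them at the container's PYTHONPATH root, import — and is not Flyte's actual bundle format or transport; the package name is made up:

```python
# Sketch of the code-bundle idea: tar the sources, extract at the PYTHONPATH
# root, import. Flyte's real bundle format and transport are internal;
# `demo_bundle` is a hypothetical package name.
import importlib
import sys
import tarfile
import tempfile
from pathlib import Path

src = Path(tempfile.mkdtemp())    # stands in for your laptop
pkg = src / "demo_bundle"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "tasks.py").write_text("MESSAGE = 'bundled code'\n")

bundle = src / "bundle.tar.gz"
with tarfile.open(bundle, "w:gz") as tar:
    tar.add(pkg, arcname="demo_bundle")

dest = Path(tempfile.mkdtemp())   # stands in for /app or /workspace in the container
with tarfile.open(bundle) as tar:
    tar.extractall(dest)

sys.path.insert(0, str(dest))     # the role PYTHONPATH plays in the adapted image
print(importlib.import_module("demo_bundle.tasks").MESSAGE)
```

Because only the tarball changes between runs, the adapted image itself stays untouched until a base image or dependency changes.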
Decision matrix
| Scenario | Pattern |
|---|---|
| Teams own full images, can’t let Flyte touch them | Pure BYOI |
| Teams hand off a base image (no Flyte knowledge required) | Remote Builder |
| Code change should not require image rebuild | Remote Builder + with_code_bundle() |
| Base has non-standard Python location | .with_commands() to fix PATH before Flyte uses it |
| Production deploy, self-contained containers | copy_style="none" in flyte.deploy() |