Bring your own image

This guide is for teams who own their Docker images and want Flyte for orchestration without handing over their build pipeline.

This guide does not cover flyte.Image.from_debian_base(), the Flyte-managed image builder. It assumes you already have images.

The multi-team problem

Two teams. Two images. One workflow.

          Team A (data-prep)    Team B (training)
Base      python:3.11-slim      python:3.10-slim (prod: CUDA)
Python    3.11                  3.10
WORKDIR   /app                  /workspace
Packages  pandas, pyarrow       torch, numpy

The prepare task runs in Team A’s container. It processes the input and calls train, which runs in Team B’s container. One workflow, two images, different filesystem layouts.

Two patterns solve this. Pick based on who controls what:

                              Pattern 1: Pure BYOI                 Pattern 2: Remote Builder
Who owns the image?           Each team owns everything            Each team owns the base
Flyte-aware?                  Yes — code is baked in               No — Flyte adapts on top
Code change = image rebuild?  Yes                                  No
Use when                      Teams can’t let Flyte touch images   Teams can hand off a base

Pattern 1: Pure BYOI

Teams build complete, Flyte-aware images. Workflow code is COPYed into the image at build time. Flyte runs the container as a black box — it sends no code and modifies nothing.

Dockerfiles

Both teams install flyte and COPY the shared workflow_code/ into their image. They differ only in base image, Python version, WORKDIR, and team-specific packages.

Team A (data prep):

Dockerfile
# Team A's image: data preparation
#
# This team owns this entire Dockerfile. They control Python version, WORKDIR,
# and PYTHONPATH. Flyte has no say here.
#
# Pure BYOI constraint: workflow code must be baked in because there is no
# code bundle. Every code change requires rebuilding and pushing this image.

FROM python:3.11-slim

# System deps this team needs
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Team A's WORKDIR. Python will find modules here because PYTHONPATH includes it.
WORKDIR /app

# Team A's Python packages. These are their own dependencies — Flyte doesn't
# install anything on top in pure BYOI mode.
RUN pip install --no-cache-dir \
    flyte \
    pandas==2.1.4 \
    pyarrow==14.0.1

# Bake the workflow code into the image.
# In pure BYOI, this is the ONLY way Flyte can find your task functions.
# Downside: every edit to tasks.py requires a new image tag + CI build.
COPY workflow_code/ /app/workflow_code/

# /app is on PYTHONPATH so `import workflow_code.tasks` resolves at runtime.
ENV PYTHONPATH=/app

Team B (training):

Dockerfile
# Team B's image: model training (GPU workload)
#
# Intentionally different from Team A:
#   - Different base: CUDA runtime instead of debian slim
#   - Different Python version: 3.10 (team B's standard)
#   - Different WORKDIR: /workspace (not /app)
#   - Different PYTHONPATH: /workspace
#   - Different system packages: CUDA tools
#
# Pure BYOI: Flyte injects nothing at runtime. Everything must be baked in.
# Each team owns their filesystem layout entirely.
#
# In practice this would be:
#   FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
# Using python:3.10-slim here so you can build/test without a GPU machine.

FROM python:3.10-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Team B uses /workspace, not /app. Each team controls their own layout.
WORKDIR /workspace

RUN pip install --no-cache-dir \
    flyte \
    torch==2.1.2 \
    numpy==1.26.4

# Bake in workflow code at /workspace/workflow_code/.
# Python finds it because PYTHONPATH includes /workspace.
COPY workflow_code/ /workspace/workflow_code/

# /workspace is on PYTHONPATH — same import path as Team A's image despite
# the different WORKDIR. Both images expose `import workflow_code.tasks`.
ENV PYTHONPATH=/workspace

Both expose import workflow_code.tasks at runtime because each image’s PYTHONPATH points to its own WORKDIR where the code was COPYed.
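That contract can be sketched outside Docker. Any directory on sys.path (which PYTHONPATH feeds) acts as a package root, so the same import resolves from /app and /workspace alike. A minimal, self-contained sketch, with temp paths standing in for the real WORKDIRs:

```python
# Why both images resolve `import workflow_code.tasks` despite different
# WORKDIRs: any directory on sys.path works as a package root.
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())          # stand-in for /app or /workspace
pkg = root / "workflow_code"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "tasks.py").write_text("GREETING = 'hello'\n")

sys.path.insert(0, str(root))            # what ENV PYTHONPATH=/app achieves
import workflow_code.tasks               # resolves against the temp root

print(workflow_code.tasks.GREETING)      # hello
```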

Build and push

The build context is the pure_byoi/ directory so that workflow_code/ is available to both Dockerfiles:

docker build -f data_prep/Dockerfile -t <your-registry>/data-prep:latest .
docker build -f training/Dockerfile  -t <your-registry>/training:latest  .
docker push <your-registry>/data-prep:latest
docker push <your-registry>/training:latest
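These commands assume a layout along these lines, inferred from the -f flags and the COPY paths; your tree may differ:

```
pure_byoi/
├── data_prep/
│   └── Dockerfile
├── training/
│   └── Dockerfile
├── workflow_code/
│   ├── __init__.py
│   ├── envs.py
│   └── tasks.py
└── main.py
```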

Python code

Environment definitions — image names are specified via from_ref_name():

envs.py
import flyte

# Image refs are resolved at runtime via init_from_config(images=...) in main.py.
# from_ref_name() is used instead of hardcoding the URI because this file is
# COPYed into both images — hardcoding would create a circular reference (the
# image baked into itself would reference its own tag).
env_train = flyte.TaskEnvironment(
    name="training",
    image=flyte.Image.from_ref_name("training"),
)

env_data = flyte.TaskEnvironment(
    name="data-prep",
    image=flyte.Image.from_ref_name("data-prep"),
    depends_on=[env_train],
)

from_ref_name() is a placeholder resolved at runtime. The actual URIs are passed in the entry point via init_from_config(images=...). This is necessary because the envs file is COPYed into both images — hardcoding a URI would create a circular reference.
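To make the shape of that mapping concrete, here is a toy parser for the name=URI strings. This is not Flyte's implementation, just an illustration of the data the images tuple carries:

```python
def parse_image_refs(images: tuple[str, ...]) -> dict[str, str]:
    """Toy version of the name=URI pairs given to init_from_config(images=...)."""
    refs = {}
    for spec in images:
        name, _, uri = spec.partition("=")
        refs[name] = uri
    return refs

refs = parse_image_refs((
    "data-prep=<your-registry>/data-prep:latest",
    "training=<your-registry>/training:latest",
))
# from_ref_name("training") can then resolve to refs["training"] at run time
```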

Task definitions:

tasks.py
from workflow_code.envs import env_data, env_train


@env_train.task
async def train(processed: str) -> float:
    # Runs in Team B's container: python 3.10, WORKDIR /workspace
    # torch is available (installed in training/Dockerfile)
    return float(len(processed))


@env_data.task
async def prepare(raw: str = "Hello World") -> float:
    # Runs in Team A's container: python 3.11, WORKDIR /app
    # pandas is available (installed in data_prep/Dockerfile)
    processed = raw.strip().lower()
    return await train(processed)

Entry point — this is where image URIs are wired in:

main.py
# /// script
# requires-python = ">=3.10"
# dependencies = ["flyte"]
# ///
"""Pure BYOI entry point.

workflow_code/ is baked into both images via their Dockerfiles.
Flyte runs each container as a black box — no code bundle is sent.

Build and push both images first (from v2_guide/pure_byoi/):
  docker build -f data_prep/Dockerfile -t <your-registry>/data-prep:latest .
  docker build -f training/Dockerfile  -t <your-registry>/training:latest  .
  docker push <your-registry>/data-prep:latest
  docker push <your-registry>/training:latest

Run (from v2_guide/pure_byoi/):
  uv run main.py
"""

from workflow_code.tasks import prepare

import flyte

DATA_PREP_IMAGE = "<your-registry>/data-prep:latest"
TRAINING_IMAGE = "<your-registry>/training:latest"

if __name__ == "__main__":
    # No root_dir — Flyte does not inject code. The images contain everything.
    flyte.init_from_config(
        project="flytesnacks",
        domain="development",
        images=(
            f"data-prep={DATA_PREP_IMAGE}",
            f"training={TRAINING_IMAGE}",
        ),
    )

    # Development: run the pipeline.
    # copy_style="none": no code bundle, the image IS the deployment.
    run = flyte.with_runcontext(copy_style="none", version="dev").run(prepare, raw="Hello World")
    print(run.url)
    run.wait()

    # Production: register task environments against the cluster.
    # from workflow_code.envs import env_data, env_train
    # flyte.deploy(env_data, copy_style="none", version="1.0.0")
    # flyte.deploy(env_train, copy_style="none", version="1.0.0")

Run and deploy

uv run main.py

There is no separate deploy step in this flow: the image tag is the version. To ship a code change, edit the tasks, rebuild both images, push new tags, update the tag constants in main.py, and run again.

Pattern 2: Remote Builder

Teams hand you their base images. They built these images for their own purposes — Flyte was never a consideration. Your job is to adapt them.

The base images

Team A uses continuumio/miniconda3 as their base:

Dockerfile
# Team A's image: data preparation
# Uses conda. Team A has no knowledge of Flyte.
# Python is at /opt/conda/bin/python.
# No PYTHONPATH set — teams don't know Flyte needs it.

FROM continuumio/miniconda3:latest

RUN conda install -y -c conda-forge pandas==2.2.3 pyarrow==19.0.1 && \
    conda clean -afy

WORKDIR /app

  • Python at /opt/conda/bin/python (conda manages this)
  • conda’s Dockerfile already adds /opt/conda/bin to PATH
  • No PYTHONPATH set
  • WORKDIR /app

Team B uses python:3.10-slim with a pip venv at /opt/venv:

Dockerfile
# Team B's image: model training
# Uses a pip venv at /opt/venv. Team B has no knowledge of Flyte.
# Python is at /opt/venv/bin/python.
# No PYTHONPATH set, PATH doesn't include /opt/venv/bin.

FROM python:3.10-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN python -m venv /opt/venv && \
    /opt/venv/bin/pip install --no-cache-dir \
        torch==2.1.2+cpu \
        numpy==1.26.4 \
    --extra-index-url https://download.pytorch.org/whl/cpu

WORKDIR /workspace

  • Python at /opt/venv/bin/python
  • PATH does not include /opt/venv/bin — the venv was created but never activated
  • No PYTHONPATH set
  • WORKDIR /workspace

Adapting with flyte.Image

flyte.Image.from_base() takes the base URI and lets you layer on top. This is where the adaptation happens:

envs.py
import pathlib

import flyte

# Root of the remote_builder example (one level up from this file)
HERE = pathlib.Path(__file__).parent.parent

REGISTRY = "<your-registry>"  # e.g. "ghcr.io/your-org" or "123456789.dkr.ecr.us-east-1.amazonaws.com"

# ── Team B: pip venv ────────────────────────────────────────────────────────────
# Base image uses a venv at /opt/venv — python at /opt/venv/bin/python.
# The venv is not activated: PATH doesn't include /opt/venv/bin.
# Flyte adapts: installs flyte via venv's pip, adds venv to PATH, sets PYTHONPATH.
# $PATH expands at Docker build time to the base image's PATH value.
env_train_image = (
    flyte.Image.from_base(f"{REGISTRY}/training-base:latest")
    .clone(name="<your-org>/<your-training-image>", registry=REGISTRY, extendable=True)
    .with_commands(["/opt/venv/bin/pip install flyte"])
    .with_env_vars(
        {
            "PATH": "/opt/venv/bin:$PATH",
            "PYTHONPATH": "/workspace",  # /workspace is WORKDIR
        }
    )
    .with_code_bundle()
)

# ── Team A: conda ───────────────────────────────────────────────────────────────
# Base image uses conda — python at /opt/conda/bin/python.
# conda's own Dockerfile already adds /opt/conda/bin to PATH, so python is findable.
# Flyte adapts: installs flyte via conda's pip, sets PYTHONPATH.
env_data_image = (
    flyte.Image.from_base(f"{REGISTRY}/data-prep-base:latest")
    .clone(name="<your-org>/<your-data-prep-image>", registry=REGISTRY, extendable=True)
    .with_commands(["/opt/conda/bin/pip install flyte"])
    .with_env_vars({"PYTHONPATH": "/app"})  # /app is WORKDIR; code bundle extracts here
    .with_code_bundle()
)

env_train = flyte.TaskEnvironment(name="training", image=env_train_image)
env_data = flyte.TaskEnvironment(name="data-prep", image=env_data_image, depends_on=[env_train])

Team A only needs flyte installed and PYTHONPATH set. conda’s PATH is already correct.

Team B needs three things: flyte installed in the venv, PATH updated so the venv’s python is the default, and PYTHONPATH set. $PATH in an ENV instruction expands at Docker build time.
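The prepend works because executable lookup scans PATH left to right and the first match wins. A local sketch of that resolution, with temp directories standing in for /opt/venv/bin and the system bin:

```python
# Left-to-right PATH resolution: the first entry with a matching executable
# wins, which is why prepending /opt/venv/bin makes the venv python default.
import os
import shutil
import tempfile
from pathlib import Path

venv_bin = Path(tempfile.mkdtemp())   # stand-in for /opt/venv/bin
sys_bin = Path(tempfile.mkdtemp())    # stand-in for /usr/bin
for d in (venv_bin, sys_bin):
    exe = d / "python"
    exe.write_text("#!/bin/sh\n")
    exe.chmod(0o755)

# PATH=/opt/venv/bin:$PATH — the venv entry comes first
path = os.pathsep.join([str(venv_bin), str(sys_bin)])
found = shutil.which("python", path=path)
print(found)  # the venv_bin copy, because it appears first on PATH
```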

.with_code_bundle() tells Flyte to inject task source at runtime (dev) or bake it into the image at deploy time (prod).

Task definitions

tasks.py
from tasks.envs import env_data, env_train


@env_train.task
async def train(processed: str) -> float:
    # Runs in /opt/venv — torch and numpy available from Team B's base image
    return float(len(processed))


@env_data.task
async def prepare(raw: str = "Hello World") -> float:
    # Runs in conda env — pandas and pyarrow available from Team A's base image
    processed = raw.strip().lower()
    return await train(processed)

Entry point

main.py
# /// script
# requires-python = ">=3.10"
# dependencies = ["flyte"]
# ///
"""Remote builder BYOI: two teams, two Flyte-unaware images, Flyte fills the gaps.

Each team owns their base image — built with their preferred package manager,
no knowledge of Flyte. The Flyte engineer adapts by installing flyte via the
right pip, fixing PATH, and setting PYTHONPATH.

  data_prep:  continuumio/miniconda3  — conda env, python at /opt/conda/bin/python
  training:   python:3.10-slim        — venv at /opt/venv, python at /opt/venv/bin/python

Build and push base images first (from v2_guide/remote_builder/):
  docker build -f data_prep/Dockerfile -t <your-registry>/data-prep-base:latest data_prep/
  docker build -f training/Dockerfile  -t <your-registry>/training-base:latest  training/
  docker push <your-registry>/data-prep-base:latest
  docker push <your-registry>/training-base:latest

Run (from v2_guide/remote_builder/):
  uv run main.py
"""

import pathlib

import flyte

HERE = pathlib.Path(__file__).parent

if __name__ == "__main__":
    flyte.init_from_config(root_dir=HERE)

    # Development: fast code iteration — image only rebuilds when the base image changes.
    # from tasks.tasks import prepare
    # run = flyte.with_runcontext(version="dev").run(prepare, raw="Hello World")
    # print(run.url)
    # run.wait()

    # Production: bake code into both images, pin to a version.
    from tasks.envs import env_data, env_train

    flyte.deploy(env_data, copy_style="none", version="3.0.0")
    flyte.deploy(env_train, copy_style="none", version="3.0.0")

Run and deploy

uv run main.py

During development you only rebuild the base image when the Dockerfile changes. Code changes are free — they travel as a tarball at runtime.
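Conceptually, a code bundle is just an archive of the source tree, extracted into the container's PYTHONPATH directory at runtime. A rough sketch of the idea (Flyte's actual bundle format and transport are internal):

```python
# Illustrative only: the idea behind a code bundle is an archive of the
# source tree, shipped at runtime and extracted into the container's
# PYTHONPATH directory. No image rebuild is involved.
import tarfile
import tempfile
from pathlib import Path

src = Path(tempfile.mkdtemp())
(src / "tasks.py").write_text("def prepare(raw): return raw.lower()\n")

bundle = Path(tempfile.mkdtemp()) / "code_bundle.tgz"
with tarfile.open(bundle, "w:gz") as tar:
    tar.add(src, arcname=".")           # archive the whole source tree

dest = Path(tempfile.mkdtemp())         # stand-in for /app or /workspace
with tarfile.open(bundle) as tar:
    tar.extractall(dest)

print((dest / "tasks.py").exists())     # the code landed without a rebuild
```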

Decision matrix

Scenario                                                    Pattern
Teams own full images, can’t let Flyte touch them           Pure BYOI
Teams hand off a base image (no Flyte knowledge required)   Remote Builder
Code change should not require image rebuild                Remote Builder + with_code_bundle()
Base has non-standard Python location                       .with_env_vars() to fix PATH before Flyte uses it
Production deploy, self-contained containers                copy_style="none" in flyte.deploy()