# Bring your own image

This guide is for teams who own their Docker images and want Flyte for orchestration without handing over their build pipeline.

> [!NOTE]
> This guide does **not** cover `flyte.Image.from_debian_base()`, the Flyte-managed image builder. It assumes you already have images.

## The multi-team problem

Two teams. Two images. One workflow.

| | Team A (data-prep) | Team B (training) |
|---|---|---|
| Base | `python:3.11-slim` | `python:3.10-slim` (prod: CUDA) |
| Python | 3.11 | 3.10 |
| WORKDIR | `/app` | `/workspace` |
| Packages | pandas, pyarrow | torch, numpy |

The `prepare` task runs in Team A's container. It processes the input and calls `train`, which runs in Team B's container. One workflow, two images, different filesystem layouts.

Two patterns solve this. Pick based on who controls what:

| | Pattern 1: Pure BYOI | Pattern 2: Remote Builder |
|---|---|---|
| Who owns the image? | Each team owns everything | Each team owns the base |
| Flyte-aware? | Yes — code is baked in | No — Flyte adapts on top |
| Code change = image rebuild? | Yes | No |
| Use when | Teams can't let Flyte touch images | Teams can hand off a base |

## Pattern 1: Pure BYOI

Teams build complete, Flyte-aware images. Workflow code is copied into each image by a `COPY` instruction in its Dockerfile. Flyte runs the container as a black box: it sends no code and modifies nothing.

### Dockerfiles

Both teams install `flyte` and COPY the shared `workflow_code/` into their image. The only differences are the base image, the Python version, and the WORKDIR.

**Team A (data prep):**

```dockerfile
# Team A's image: data preparation
#
# This team owns this entire Dockerfile. They control Python version, WORKDIR,
# and PYTHONPATH. Flyte has no say here.
#
# Pure BYOI constraint: workflow code must be baked in because there is no
# code bundle. Every code change requires rebuilding and pushing this image.

FROM python:3.11-slim

# System deps this team needs
RUN apt-get update && apt-get install -y --no-install-recommends \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Team A's WORKDIR. Python will find modules here because PYTHONPATH includes it.
WORKDIR /app

# Team A's Python packages. These are their own dependencies — Flyte doesn't
# install anything on top in pure BYOI mode.
RUN pip install --no-cache-dir \
    flyte \
    pandas==2.1.4 \
    pyarrow==14.0.1

# Bake the workflow code into the image.
# In pure BYOI, this is the ONLY way Flyte can find your task functions.
# Downside: every edit to tasks.py requires a new image tag + CI build.
COPY workflow_code/ /app/workflow_code/

# /app is on PYTHONPATH so `import workflow_code.tasks` resolves at runtime.
ENV PYTHONPATH=/app
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/pure_byoi/data_prep/Dockerfile*

**Team B (training):**

```dockerfile
# Team B's image: model training (GPU workload)
#
# Intentionally different from Team A:
#   - Different base: CUDA runtime instead of debian slim
#   - Different Python version: 3.10 (team B's standard)
#   - Different WORKDIR: /workspace (not /app)
#   - Different PYTHONPATH: /workspace
#   - Different system packages: CUDA tools
#
# Pure BYOI: Flyte injects nothing at runtime. Everything must be baked in.
# Each team owns their filesystem layout entirely.
#
# In practice this would be:
#   FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
# Using python:3.10-slim here so you can build/test without a GPU machine.

FROM python:3.10-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Team B uses /workspace, not /app. Each team controls their own layout.
WORKDIR /workspace

RUN pip install --no-cache-dir \
    flyte \
    torch==2.1.2 \
    numpy==1.26.4

# Bake in workflow code at /workspace/workflow_code/.
# Python finds it because PYTHONPATH includes /workspace.
COPY workflow_code/ /workspace/workflow_code/

# /workspace is on PYTHONPATH — same import path as Team A's image despite
# the different WORKDIR. Both images expose `import workflow_code.tasks`.
ENV PYTHONPATH=/workspace
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/pure_byoi/training/Dockerfile*

Both images expose `import workflow_code.tasks` at runtime because each image's `PYTHONPATH` points to its own WORKDIR, which is where the code was copied.
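The path resolution described above can be checked without Docker. The following is a minimal sketch, assuming two temporary directories stand in for `/app` and `/workspace`; `make_layout` and `locate_tasks` are hypothetical helpers, and the lookup uses Python's standard `importlib` path finder rather than anything Flyte-specific.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def make_layout(base: Path, workdir: str) -> Path:
    """Mimic `COPY workflow_code/ <workdir>/workflow_code/` under a temp root."""
    root = base / workdir
    package_dir = root / "workflow_code"
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    (package_dir / "tasks.py").write_text("WHO = 'tasks'\n")
    return root


def locate_tasks(pythonpath_entry: Path) -> str:
    """Locate workflow_code.tasks the way the import system would for one path entry."""
    package = PathFinder.find_spec("workflow_code", [str(pythonpath_entry)])
    if package is None:
        raise ModuleNotFoundError(f"workflow_code not found under {pythonpath_entry}")
    module = PathFinder.find_spec(
        "workflow_code.tasks", package.submodule_search_locations
    )
    if module is None:
        raise ModuleNotFoundError(f"workflow_code.tasks not found under {pythonpath_entry}")
    return module.origin


with tempfile.TemporaryDirectory() as tmp:
    app_root = make_layout(Path(tmp), "app")              # stands in for Team A's /app
    workspace_root = make_layout(Path(tmp), "workspace")  # stands in for Team B's /workspace
    found_in_app = locate_tasks(app_root)
    found_in_workspace = locate_tasks(workspace_root)
    print(found_in_app)
    print(found_in_workspace)
```

The module name is identical in both cases; only the directory that anchors the lookup differs, which is the role `PYTHONPATH` plays in each image.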

### Build and push

The build context is the `pure_byoi/` directory so that `workflow_code/` is available to both Dockerfiles:

```bash
docker build -f data_prep/Dockerfile -t <your-registry>/data-prep:latest .
docker build -f training/Dockerfile  -t <your-registry>/training:latest  .
docker push <your-registry>/data-prep:latest
docker push <your-registry>/training:latest
```
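If the builds are scripted, the build-context requirement can be checked before Docker is invoked. This is a sketch only: `build_commands`, the `TEAMS` mapping, and `registry.example.com` are illustrative assumptions, and the function returns argument lists rather than running them.

```python
import tempfile
from pathlib import Path

# Image name -> Dockerfile path relative to the build context (layout from this guide).
TEAMS = {"data-prep": "data_prep/Dockerfile", "training": "training/Dockerfile"}


def build_commands(context: Path, registry: str, tag: str = "latest") -> list[list[str]]:
    """Return docker build/push argv lists, refusing a context without workflow_code/."""
    if not (context / "workflow_code").is_dir():
        raise FileNotFoundError(f"{context} has no workflow_code/ for COPY to pick up")
    commands = []
    for image, dockerfile in TEAMS.items():
        ref = f"{registry}/{image}:{tag}"
        commands.append(
            ["docker", "build", "-f", str(context / dockerfile), "-t", ref, str(context)]
        )
        commands.append(["docker", "push", ref])
    return commands


with tempfile.TemporaryDirectory() as tmp:
    context = Path(tmp) / "pure_byoi"
    (context / "workflow_code").mkdir(parents=True)
    commands = build_commands(context, "registry.example.com")  # hypothetical registry
    for argv in commands:
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once the output looks right.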

### Python code

**Environment definitions** — image names are specified via `from_ref_name()`:

```python
import flyte

# Image refs are resolved at runtime via init_from_config(images=...) in main.py.
# from_ref_name() is used instead of hardcoding the URI because this file is
# COPYed into both images — hardcoding would create a circular reference (the
# image baked into itself would reference its own tag).
env_train = flyte.TaskEnvironment(
    name="training",
    image=flyte.Image.from_ref_name("training"),
)

env_data = flyte.TaskEnvironment(
    name="data-prep",
    image=flyte.Image.from_ref_name("data-prep"),
    depends_on=[env_train],
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/pure_byoi/workflow_code/envs.py*

`from_ref_name()` is a placeholder resolved at runtime. The actual URIs are passed in the entry point via `init_from_config(images=...)`. This is necessary because the envs file is COPYed into both images — hardcoding a URI would create a circular reference.
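To make the indirection concrete, here is a rough sketch of how `name=uri` strings could be turned into a lookup table and resolved by reference name. It illustrates the idea only and is not Flyte's implementation; `parse_image_refs`, `resolve_ref`, and `registry.example.com` are hypothetical.

```python
def parse_image_refs(pairs: tuple[str, ...]) -> dict[str, str]:
    """Turn ("name=uri", ...) strings into a {name: uri} mapping."""
    refs = {}
    for pair in pairs:
        name, separator, uri = pair.partition("=")
        if not separator or not name or not uri:
            raise ValueError(f"expected 'name=uri', got {pair!r}")
        refs[name] = uri
    return refs


def resolve_ref(name: str, refs: dict[str, str]) -> str:
    """Return the URI registered for a reference name, with a readable error if missing."""
    if name not in refs:
        raise KeyError(f"no image registered for ref name {name!r}; known: {sorted(refs)}")
    return refs[name]


refs = parse_image_refs(
    (
        "data-prep=registry.example.com/data-prep:latest",  # hypothetical URIs
        "training=registry.example.com/training:latest",
    )
)
print(resolve_ref("training", refs))
```

Because the code baked into the images only ever mentions the names `data-prep` and `training`, the same `envs.py` works regardless of which tags the entry point supplies.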

**Task definitions:**

```python
from workflow_code.envs import env_data, env_train

@env_train.task
async def train(processed: str) -> float:
    # Runs in Team B's container: python 3.10, WORKDIR /workspace
    # torch is available (installed in training/Dockerfile)
    return float(len(processed))

@env_data.task
async def prepare(raw: str = "Hello World") -> float:
    # Runs in Team A's container: python 3.11, WORKDIR /app
    # pandas is available (installed in data_prep/Dockerfile)
    processed = raw.strip().lower()
    return await train(processed)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/pure_byoi/workflow_code/tasks.py*

**Entry point** — this is where image URIs are wired in:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["flyte"]
# ///
"""Pure BYOI entry point.

workflow_code/ is baked into both images via their Dockerfiles.
Flyte runs each container as a black box — no code bundle is sent.

Build and push both images first (from v2_guide/pure_byoi/):
  docker build -f data_prep/Dockerfile -t <your-registry>/data-prep:latest .
  docker build -f training/Dockerfile  -t <your-registry>/training:latest  .
  docker push <your-registry>/data-prep:latest
  docker push <your-registry>/training:latest

Run (from v2_guide/pure_byoi/):
  uv run main.py
"""

from workflow_code.tasks import prepare

import flyte

DATA_PREP_IMAGE = "<your-registry>/data-prep:latest"
TRAINING_IMAGE = "<your-registry>/training:latest"

if __name__ == "__main__":
    # No root_dir — Flyte does not inject code. The images contain everything.
    flyte.init_from_config(
        project="flytesnacks",
        domain="development",
        images=(
            f"data-prep={DATA_PREP_IMAGE}",
            f"training={TRAINING_IMAGE}",
        ),
    )

    # Development: run the pipeline.
    # copy_style="none": no code bundle, the image IS the deployment.
    run = flyte.with_runcontext(copy_style="none", version="dev").run(prepare, raw="Hello World")
    print(run.url)
    run.wait()

    # Production: register task environments against the cluster.
    # from workflow_code.envs import env_data, env_train
    # flyte.deploy(env_data, copy_style="none", version="1.0.0")
    # flyte.deploy(env_train, copy_style="none", version="1.0.0")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/pure_byoi/main.py*

### Run and deploy

```bash
uv run main.py
```

There is no separate deploy step: the image tag is the version. To ship a code change, edit the tasks, rebuild both images, push new tags, update `DATA_PREP_IMAGE` and `TRAINING_IMAGE` in `main.py`, and run again.
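Since the tag doubles as the version, one option is to derive it from the contents of `workflow_code/` so that every code change yields a new tag. The sketch below is an assumption layered on top of this guide, not part of the example repository; `code_tag` is a hypothetical helper.

```python
import hashlib
import tempfile
from pathlib import Path


def code_tag(code_dir: Path, length: int = 12) -> str:
    """Hash every file under code_dir (relative path plus bytes) into a short hex tag."""
    digest = hashlib.sha256()
    for path in sorted(p for p in code_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(code_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()[:length]


with tempfile.TemporaryDirectory() as tmp:
    code_dir = Path(tmp) / "workflow_code"
    code_dir.mkdir()
    (code_dir / "tasks.py").write_text("def train(): return 1\n")

    tag_before = code_tag(code_dir)
    tag_repeat = code_tag(code_dir)  # unchanged code gives the same tag

    (code_dir / "tasks.py").write_text("def train(): return 2\n")
    tag_after = code_tag(code_dir)   # edited code gives a different tag

    print(tag_before, tag_after)
```

The resulting string could replace `latest` in both the `docker build -t` commands and the image constants in `main.py`, so a stale image is never run against new expectations.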

## Pattern 2: Remote Builder

Teams hand you their base images. They built these images for their own purposes — Flyte was never a consideration. Your job is to adapt them.

### The base images

**Team A** uses `continuumio/miniconda3` as their base:

```dockerfile
# Team A's image: data preparation
# Uses conda. Team A has no knowledge of Flyte.
# Python is at /opt/conda/bin/python.
# No PYTHONPATH set — teams don't know Flyte needs it.

FROM continuumio/miniconda3:latest

RUN conda install -y -c conda-forge pandas==2.2.3 pyarrow==19.0.1 && \
    conda clean -afy

WORKDIR /app
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/remote_builder/data_prep/Dockerfile*

- Python at `/opt/conda/bin/python` (conda manages this)
- conda's Dockerfile already adds `/opt/conda/bin` to `PATH`
- No PYTHONPATH set
- WORKDIR `/app`

**Team B** uses `python:3.10-slim` with a pip venv at `/opt/venv`:

```dockerfile
# Team B's image: model training
# Uses a pip venv at /opt/venv. Team B has no knowledge of Flyte.
# Python is at /opt/venv/bin/python.
# No PYTHONPATH set, PATH doesn't include /opt/venv/bin.

FROM python:3.10-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

RUN python -m venv /opt/venv && \
    /opt/venv/bin/pip install --no-cache-dir \
        torch==2.1.2+cpu \
        numpy==1.26.4 \
    --extra-index-url https://download.pytorch.org/whl/cpu

WORKDIR /workspace
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/remote_builder/training/Dockerfile*

- Python at `/opt/venv/bin/python`
- `PATH` does **not** include `/opt/venv/bin` — the venv was created but never activated
- No PYTHONPATH set
- WORKDIR `/workspace`

### Adapting with `flyte.Image`

`flyte.Image.from_base()` takes the base URI and lets you layer on top. This is where the adaptation happens:

```python
import pathlib

import flyte

# Root of the remote_builder example (one level up from this file)
HERE = pathlib.Path(__file__).parent.parent

REGISTRY = "<your-registry>"  # e.g. "ghcr.io/your-org" or "123456789.dkr.ecr.us-east-1.amazonaws.com"

# ── Team B: pip venv ────────────────────────────────────────────────────────────
# Base image uses a venv at /opt/venv — python at /opt/venv/bin/python.
# The venv is not activated: PATH doesn't include /opt/venv/bin.
# Flyte adapts: installs flyte via venv's pip, adds venv to PATH, sets PYTHONPATH.
# $PATH expands at Docker build time to the base image's PATH value.
env_train_image = (
    flyte.Image.from_base("<your-registry>/training-base:latest")
    .clone(name="<your-org>/<your-image>", registry=REGISTRY, extendable=True)
    .with_commands(["/opt/venv/bin/pip install flyte"])
    .with_env_vars(
        {
            "PATH": "/opt/venv/bin:$PATH",
            "PYTHONPATH": "/workspace",  # /workspace is WORKDIR
        }
    )
    .with_code_bundle()
)

# ── Team A: conda ───────────────────────────────────────────────────────────────
# Base image uses conda — python at /opt/conda/bin/python.
# conda's own Dockerfile already adds /opt/conda/bin to PATH, so python is findable.
# Flyte adapts: installs flyte via conda's pip, sets PYTHONPATH.
env_data_image = (
    flyte.Image.from_base("<your-registry>/data-prep-base:latest")
    .clone(name="<your-org>/<your-image>", registry=REGISTRY, extendable=True)
    .with_commands(["/opt/conda/bin/pip install flyte"])
    .with_env_vars({"PYTHONPATH": "/app"})  # /app is WORKDIR; code bundle extracts here
    .with_code_bundle()
)

env_train = flyte.TaskEnvironment(name="training", image=env_train_image)
env_data = flyte.TaskEnvironment(name="data-prep", image=env_data_image, depends_on=[env_train])
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/remote_builder/tasks/envs.py*

**Team A** only needs `flyte` installed and `PYTHONPATH` set. conda's PATH is already correct.

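**Team B** needs three things: `flyte` installed in the venv, `PATH` updated so the venv's `python` is the default, and `PYTHONPATH` set. `$PATH` in an `ENV` instruction expands at Docker build time to the base image's `PATH` value, so `/opt/venv/bin` is prepended to the existing entries rather than replacing them.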
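The effect of prepending the venv directory can be reproduced locally. This is a minimal sketch assuming a POSIX system: two stub executables stand in for the system and venv interpreters, `string.Template` imitates the `$PATH` expansion, and `shutil.which` shows which `python` wins. `expand_env` and `fake_python` are hypothetical helpers.

```python
import shutil
import tempfile
from pathlib import Path
from string import Template


def expand_env(value: str, base_env: dict[str, str]) -> str:
    """Expand $VAR references against the base image's environment."""
    return Template(value).safe_substitute(base_env)


def fake_python(bin_dir: Path) -> Path:
    """Create an executable stub named `python` inside bin_dir."""
    bin_dir.mkdir(parents=True, exist_ok=True)
    exe = bin_dir / "python"
    exe.write_text("#!/bin/sh\n")
    exe.chmod(0o755)
    return exe


with tempfile.TemporaryDirectory() as tmp:
    system_python = fake_python(Path(tmp) / "usr/local/bin")  # base image interpreter
    venv_python = fake_python(Path(tmp) / "opt/venv/bin")     # Team B's venv interpreter

    base_path = str(system_python.parent)  # PATH as the base image left it
    new_path = expand_env(f"{venv_python.parent}:$PATH", {"PATH": base_path})

    before = shutil.which("python", path=base_path)
    after = shutil.which("python", path=new_path)
    print("before:", before)
    print("after: ", after)
```

Before the change, `python` resolves to the system interpreter, which has neither `torch` nor `flyte`; after it, the venv interpreter comes first.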

`.with_code_bundle()` tells Flyte to inject task source at runtime (dev) or bake it into the image at deploy time (prod).

### Task definitions

```python
from tasks.envs import env_data, env_train

@env_train.task
async def train(processed: str) -> float:
    # Runs in /opt/venv — torch and numpy available from Team B's base image
    return float(len(processed))

@env_data.task
async def prepare(raw: str = "Hello World") -> float:
    # Runs in conda env — pandas and pyarrow available from Team A's base image
    processed = raw.strip().lower()
    return await train(processed)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/remote_builder/tasks/tasks.py*

### Entry point

```python
# /// script
# requires-python = ">=3.10"
# dependencies = ["flyte"]
# ///
"""Remote builder BYOI: two teams, two Flyte-unaware images, Flyte fills the gaps.

Each team owns their base image — built with their preferred package manager,
no knowledge of Flyte. The Flyte engineer adapts by installing flyte via the
right pip, fixing PATH, and setting PYTHONPATH.

  data_prep:  continuumio/miniconda3  — conda env, python at /opt/conda/bin/python
  training:   python:3.10-slim        — venv at /opt/venv, python at /opt/venv/bin/python

Build and push base images first (from v2_guide/remote_builder/):
  docker build -f data_prep/Dockerfile -t <your-registry>/data-prep-base:latest data_prep/
  docker build -f training/Dockerfile  -t <your-registry>/training-base:latest  training/
  docker push <your-registry>/data-prep-base:latest
  docker push <your-registry>/training-base:latest

Run (from v2_guide/remote_builder/):
  uv run main.py
"""

import pathlib

import flyte

HERE = pathlib.Path(__file__).parent

if __name__ == "__main__":
    flyte.init_from_config(root_dir=HERE)

    # Development: fast code iteration. The image only rebuilds when the base image changes.
    # from tasks.tasks import prepare
    # run = flyte.with_runcontext(version="dev").run(prepare, raw="Hello World")
    # print(run.url)
    # run.wait()

    # Production: bake code into both images, pin to a version.
    from tasks.envs import env_data, env_train

    flyte.deploy(env_data, copy_style="none", version="3.0.0")
    flyte.deploy(env_train, copy_style="none", version="3.0.0")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/project-patterns/bring-your-own-image/remote_builder/main.py*

### Run and deploy

```bash
uv run main.py
```

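As written, `main.py` deploys both environments at version `3.0.0`; uncomment the development block to run the pipeline instead. During development you rebuild the base image only when its Dockerfile changes. Code changes need no image rebuild: they travel as a tarball at runtime.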
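The tarball round trip can be pictured with the standard library alone. The sketch below is an illustration of the idea, not Flyte's bundling code: `bundle` and `unpack` are hypothetical helpers, and temporary directories stand in for the developer's checkout and the container's `/workspace`.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def bundle(source_dir: Path) -> bytes:
    """Pack source_dir into an in-memory gzip tarball, keeping its top-level name."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    return buffer.getvalue()


def unpack(payload: bytes, workdir: Path) -> None:
    """Extract the tarball into workdir (the directory PYTHONPATH points to)."""
    # Use the "data" extraction filter where the running Python provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(fileobj=io.BytesIO(payload), mode="r:gz") as tar:
        tar.extractall(workdir, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "laptop" / "tasks"   # developer's checkout
    source.mkdir(parents=True)
    (source / "tasks.py").write_text("VERSION = 1\n")
    workdir = Path(tmp) / "workspace"         # stands in for the container WORKDIR
    workdir.mkdir()

    unpack(bundle(source), workdir)
    first = (workdir / "tasks" / "tasks.py").read_text()

    (source / "tasks.py").write_text("VERSION = 2\n")  # edit code, no image rebuild
    unpack(bundle(source), workdir)
    second = (workdir / "tasks" / "tasks.py").read_text()
    print(first.strip(), "->", second.strip())
```

The image layers are untouched between the two runs; only the payload changes, which is why `PYTHONPATH` must point at the directory where the bundle is extracted.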

## Decision matrix

| Scenario | Pattern |
|---|---|
| Teams own full images, can't let Flyte touch them | Pure BYOI |
| Teams hand off a base image (no Flyte knowledge required) | Remote Builder |
| Code change should not require image rebuild | Remote Builder + `with_code_bundle()` |
| Base has non-standard Python location | `.with_commands()` to install `flyte` with that Python's pip, `.with_env_vars()` to put it on `PATH` |
| Production deploy, self-contained containers | `copy_style="none"` in `flyte.deploy()` |

---
**Source**: https://github.com/unionai/unionai-docs/blob/main/content/user-guide/project-patterns/bring-your-own-image.md
**HTML**: https://www.union.ai/docs/v2/union/user-guide/project-patterns/bring-your-own-image/
