# Native app integrations
> This bundle contains all pages in the Native app integrations section.
> Source: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations ===

# Native app integrations

> **📝 Note**
>
> An LLM-optimized bundle of this entire section is available at [`section.md`](section.md).
> This single file contains all pages in this section, optimized for AI coding agent context.

Flyte ships with a set of pre-built [`AppEnvironment`](https://www.union.ai/docs/v2/union/user-guide/build-apps/_index) integrations that wrap popular frameworks and serving runtimes, so you can deploy common app types without writing the integration glue yourself. Each integration provides a ready-to-use environment class — just configure your app, image, resources, and scaling, and Flyte handles the rest.

> [!TIP]
> If you're new to apps in Flyte, start with [Introducing apps](https://www.union.ai/docs/v2/union/user-guide/core-concepts/introducing-apps/page.md) for an overview, then see [Build apps](https://www.union.ai/docs/v2/union/user-guide/build-apps/_index) to learn how to build custom app environments from scratch.

## When to use a native integration

Use a native integration when your app fits one of the supported frameworks and you want:

- **A minimal, opinionated setup** — sensible defaults for the framework, no boilerplate
- **First-class support** — features like model streaming, OpenAI-compatible APIs, and passthrough auth wired in for you
- **Faster time-to-deploy** — focus on your app logic, not on packaging and serving plumbing

For app types not covered here, build a custom [`AppEnvironment`](https://www.union.ai/docs/v2/union/user-guide/build-apps/_index) using the patterns in the [Build apps](https://www.union.ai/docs/v2/union/user-guide/build-apps/_index) section.

## Available integrations

| Integration | Framework | Typical use case |
|---|---|---|
| **Streamlit app** | [Streamlit](https://streamlit.io/) | Interactive dashboards and data apps |
| **FastAPI app** | [FastAPI](https://fastapi.tiangolo.com/) | REST APIs, webhooks, and backend services |
| **vLLM app** | [vLLM](https://docs.vllm.ai/) | High-throughput LLM inference with an OpenAI-compatible API |
| **SGLang app** | [SGLang](https://docs.sglang.io/) | Structured generation and LLM serving with an OpenAI-compatible API |
| **Flyte webhook** | [FastAPI](https://fastapi.tiangolo.com/) | Pre-built HTTP endpoints for common Flyte control-plane operations |

## Next steps

- **Streamlit app**: Build interactive Streamlit dashboards
- **FastAPI app**: Create REST APIs and backend services
- **vLLM app**: Serve large language models with vLLM
- **SGLang app**: Serve LLMs with SGLang for structured generation
- **Flyte webhook**: Pre-built webhook endpoints for common Flyte operations

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/streamlit-app ===

# Streamlit app

Streamlit is a popular framework for building interactive web applications and dashboards. Flyte makes it easy to deploy Streamlit apps as long-running services.

## Basic Streamlit app

The simplest way to deploy a Streamlit app is to use the built-in Streamlit "hello" demo:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
# ]
# ///

"""A basic Streamlit app using the built-in hello demo."""

# {{docs-fragment app-definition}}
import flyte
import flyte.app

image = flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages("streamlit==1.41.1")

app_env = flyte.app.AppEnvironment(
    name="streamlit-hello",
    image=image,
    args="streamlit hello --server.port 8080",
    port=8080,
    resources=flyte.Resources(cpu="1", memory="1Gi"),
    requires_auth=False,
)

if __name__ == "__main__":
    flyte.init_from_config()
    app = flyte.deploy(app_env)
    print(f"Deployed app: {app[0].summary_repr()}")
# {{/docs-fragment app-definition}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/streamlit/basic_streamlit.py*

This deploys a service that runs Streamlit's built-in "hello" demo; no app code of your own is required.

## Single-file Streamlit app

For a single-file Streamlit app, wrap the app code in a function and use the `args` parameter to run the file itself with `streamlit run`. Everything after the `--` separator is passed to the script, so the custom `--server` flag tells the script to run the app code rather than deploy it.

This is useful for relatively small, simple apps that you want to deploy as a single file.

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "streamlit",
# ]
# ///

"""A single-script Streamlit app example."""

import sys
from pathlib import Path

import streamlit as st

import flyte
import flyte.app

# {{docs-fragment streamlit-app}}
def main():
    st.set_page_config(page_title="Simple Streamlit App", page_icon="🚀")

    st.title("Hello from Streamlit!")
    st.write("This is a simple single-script Streamlit app.")

    name = st.text_input("What's your name?", "World")
    st.write(f"Hello, {name}!")

    if st.button("Click me!"):
        st.balloons()
        st.success("Button clicked!")
# {{/docs-fragment streamlit-app}}

file_name = Path(__file__).name
# {{docs-fragment app-env}}
app_env = flyte.app.AppEnvironment(
    name="streamlit-single-script",
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages("streamlit==1.41.1"),
    args=[
        "streamlit",
        "run",
        file_name,
        "--server.port",
        "8080",
        "--",
        "--server",
    ],
    port=8080,
    resources=flyte.Resources(cpu="1", memory="1Gi"),
    requires_auth=False,
)
# {{/docs-fragment app-env}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    import logging
    import sys

    if "--server" in sys.argv:
        main()
    else:
        flyte.init_from_config(
            root_dir=Path(__file__).parent,
            log_level=logging.DEBUG,
        )
        app = flyte.serve(app_env)
        print(f"App URL: {app.url}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/streamlit/single_file_streamlit.py*

Note that the `if __name__ == "__main__"` block does double duty: invoked directly, it serves the `AppEnvironment`; invoked by the `streamlit run` command with the `--server` flag, it runs the app code.

## Multi-file Streamlit app

When your Streamlit application grows more complex, you may want to split it into multiple files.
For a multi-file Streamlit app, use the `include` parameter to bundle your app files:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
# ]
# ///

"""A custom Streamlit app with multiple files."""

import pathlib
import flyte
import flyte.app

# {{docs-fragment app-env}}
image = flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
    "streamlit==1.41.1",
    "pandas==2.2.3",
    "numpy==2.2.3",
)

app_env = flyte.app.AppEnvironment(
    name="streamlit-multi-file-app",
    image=image,
    args="streamlit run main.py --server.port 8080",
    port=8080,
    include=["main.py", "utils.py"],  # Include your app files
    resources=flyte.Resources(cpu="1", memory="1Gi"),
    requires_auth=False,
)
# {{/docs-fragment app-env}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config(root_dir=pathlib.Path(__file__).parent)
    app = flyte.deploy(app_env)
    print(f"Deployed app: {app[0].summary_repr()}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/streamlit/multi_file_streamlit.py*

Your project structure would look like this:

```
project/
├── multi_file_streamlit.py   # Deployment script (defines the AppEnvironment)
├── main.py                   # Main Streamlit app
└── utils.py                  # Utility functions
```

Your `main.py` file would contain your Streamlit app code:

```python
import os

import streamlit as st
from utils import generate_data

# {{docs-fragment streamlit-app}}
all_columns = ["Apples", "Orange", "Pineapple"]
with st.container(border=True):
    columns = st.multiselect("Columns", all_columns, default=all_columns)

all_data = st.cache_data(generate_data)(columns=all_columns, seed=101)

data = all_data[columns]

tab1, tab2 = st.tabs(["Chart", "Dataframe"])
tab1.line_chart(data, height=250)
tab2.dataframe(data, height=250, use_container_width=True)
st.write(f"Environment: {os.environ}")
# {{/docs-fragment streamlit-app}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/streamlit/main.py*

## Example: Data visualization dashboard

Here's a complete example of a Streamlit dashboard, all in a single file: the `main` function defines the dashboard UI, the `AppEnvironment` configures the image, resources, and serving command, and the `__main__` block dispatches between serving the app and running the dashboard code:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "streamlit",
#    "pandas",
#    "numpy",
# ]
# ///

"""A data visualization dashboard example using Streamlit."""

import sys
from pathlib import Path

import numpy as np
import pandas as pd
import streamlit as st

import flyte
import flyte.app

# {{docs-fragment streamlit-app}}
def main():
    st.set_page_config(page_title="Sales Dashboard", page_icon="📊")

    st.title("Sales Dashboard")

    # Load data
    @st.cache_data
    def load_data():
        return pd.DataFrame({
            "date": pd.date_range("2024-01-01", periods=100, freq="D"),
            "sales": np.random.randint(1000, 5000, 100),
        })

    data = load_data()

    # Sidebar filters
    st.sidebar.header("Filters")
    start_date = st.sidebar.date_input("Start date", value=data["date"].min())
    end_date = st.sidebar.date_input("End date", value=data["date"].max())

    # Filter data
    filtered_data = data[
        (data["date"] >= pd.Timestamp(start_date)) &
        (data["date"] <= pd.Timestamp(end_date))
    ]

    # Display metrics
    col1, col2, col3 = st.columns(3)
    with col1:
        st.metric("Total Sales", f"${filtered_data['sales'].sum():,.0f}")
    with col2:
        st.metric("Average Sales", f"${filtered_data['sales'].mean():,.0f}")
    with col3:
        st.metric("Days", len(filtered_data))

    # Chart
    st.line_chart(filtered_data.set_index("date")["sales"])

# {{/docs-fragment streamlit-app}}

# {{docs-fragment app-env}}
file_name = Path(__file__).name
app_env = flyte.app.AppEnvironment(
    name="sales-dashboard",
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "streamlit==1.41.1",
        "pandas==2.2.3",
        "numpy==2.2.3",
    ),
    args=["streamlit", "run", file_name, "--server.port", "8080", "--", "--server"],
    port=8080,
    resources=flyte.Resources(cpu="2", memory="2Gi"),
    requires_auth=False,
)
# {{/docs-fragment app-env}}

# {{docs-fragment serve}}
if __name__ == "__main__":
    import logging
    import sys

    if "--server" in sys.argv:
        main()
    else:
        flyte.init_from_config(
            root_dir=Path(__file__).parent,
            log_level=logging.DEBUG,
        )
        app = flyte.serve(app_env)
        print(f"Dashboard URL: {app.url}")
# {{/docs-fragment serve}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/streamlit/data_visualization_dashboard.py*


## Best practices

1. **Use `include` for custom apps**: Always include your app files when deploying custom Streamlit code
2. **Set the port correctly**: Ensure your Streamlit app uses `--server.port 8080` (or match your `port` setting)
3. **Cache data**: Use `@st.cache_data` for expensive computations to improve performance
4. **Resource sizing**: Adjust resources based on your app's needs (data size, computations)
5. **Public vs private**: Set `requires_auth=False` for public dashboards, `True` for internal tools

## Troubleshooting

**App not loading:**
- Verify the port matches (use `--server.port 8080`)
- Check that all required files are included
- Review container logs for errors

**Missing dependencies:**
- Ensure all required packages are in your image's pip packages
- Check that file paths in `include` are correct

**Performance issues:**
- Increase CPU/memory resources
- Use Streamlit's caching features (`@st.cache_data`, `@st.cache_resource`)
- Optimize data processing

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/fastapi-app ===

# FastAPI app

FastAPI is a modern, fast web framework for building APIs. Flyte provides `FastAPIAppEnvironment` which makes it easy to deploy FastAPI applications.

## Basic FastAPI app

Here's a simple FastAPI app:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
# ]
# ///

"""A basic FastAPI app example."""

from fastapi import FastAPI
import pathlib
import flyte
from flyte.app.extras import FastAPIAppEnvironment

# {{docs-fragment fastapi-app}}
app = FastAPI(
    title="My API",
    description="A simple FastAPI application",
    version="1.0.0",
)
# {{/docs-fragment fastapi-app}}

# {{docs-fragment fastapi-env}}
env = FastAPIAppEnvironment(
    name="my-fastapi-app",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
    ),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=False,
)
# {{/docs-fragment fastapi-env}}

# {{docs-fragment endpoints}}
@app.get("/")
async def root():
    return {"message": "Hello, World!"}

@app.get("/health")
async def health_check():
    return {"status": "healthy"}
# {{/docs-fragment endpoints}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config(root_dir=pathlib.Path(__file__).parent)
    app_deployment = flyte.deploy(env)
    print(f"Deployed: {app_deployment[0].summary_repr()}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/fastapi/basic_fastapi.py*

Once deployed, you can:
- Access the API at the generated URL
- View interactive API docs at `/docs` (Swagger UI)
- View alternative docs at `/redoc`

## Serving a machine learning model

Here's an example of serving a scikit-learn model:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
#    "scikit-learn",
#    "joblib",
# ]
# ///

"""Example of serving a machine learning model with FastAPI."""

import os
from contextlib import asynccontextmanager
from pathlib import Path

import joblib
import flyte
from fastapi import FastAPI
from flyte.app.extras import FastAPIAppEnvironment
from pydantic import BaseModel

# {{docs-fragment ml-model}}
# Define request/response models
class PredictionRequest(BaseModel):
    feature1: float
    feature2: float
    feature3: float

class PredictionResponse(BaseModel):
    prediction: float
    probability: float

# Model handle, populated at startup (you would typically load this from storage)
model = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    global model
    model_path = os.getenv("MODEL_PATH", "/app/models/model.joblib")
    # In production, load from your storage
    if os.path.exists(model_path):
        with open(model_path, "rb") as f:
            model = joblib.load(f)
    yield

# Register the lifespan handler so the model loads on startup
app = FastAPI(title="ML Model API", lifespan=lifespan)

@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    # Make prediction
    # prediction = model.predict([[request.feature1, request.feature2, request.feature3]])

    # Dummy prediction for demo
    prediction = 0.85
    probability = 0.92

    return PredictionResponse(
        prediction=prediction,
        probability=probability,
    )

env = FastAPIAppEnvironment(
    name="ml-model-api",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
        "scikit-learn",
        "pydantic",
        "joblib",
    ),
    parameters=[
        flyte.app.Parameter(
            name="model_file",
            value=flyte.io.File("s3://bucket/models/model.joblib"),
            mount="/app/models",
            env_var="MODEL_PATH",
        ),
    ],
    resources=flyte.Resources(cpu=2, memory="2Gi"),
    requires_auth=False,
)
# {{/docs-fragment ml-model}}

if __name__ == "__main__":
    flyte.init_from_config(root_dir=Path(__file__).parent)
    app_deployment = flyte.deploy(env)
    print(f"API URL: {app_deployment[0].url}")
    print(f"Swagger docs: {app_deployment[0].url}/docs")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/fastapi/ml_model_serving.py*

## Accessing Swagger documentation

FastAPI automatically generates interactive API documentation. Once deployed:

- **Swagger UI**: Access at `{app_url}/docs`
- **ReDoc**: Access at `{app_url}/redoc`
- **OpenAPI JSON**: Access at `{app_url}/openapi.json`

The Swagger UI provides an interactive interface where you can:
- See all available endpoints
- Test API calls directly from the browser
- View request/response schemas
- See example payloads

## Example: REST API with multiple endpoints

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
# ]
# ///

"""Example REST API with multiple endpoints."""

from pathlib import Path
from typing import List
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import flyte
from flyte.app.extras import FastAPIAppEnvironment

# {{docs-fragment rest-api}}
app = FastAPI(title="Product API")

# Data models
class Product(BaseModel):
    id: int
    name: str
    price: float

class ProductCreate(BaseModel):
    name: str
    price: float

# In-memory database (use real database in production)
products_db = []

@app.get("/products", response_model=List[Product])
async def get_products():
    return products_db

@app.get("/products/{product_id}", response_model=Product)
async def get_product(product_id: int):
    product = next((p for p in products_db if p["id"] == product_id), None)
    if not product:
        raise HTTPException(status_code=404, detail="Product not found")
    return product

@app.post("/products", response_model=Product)
async def create_product(product: ProductCreate):
    new_product = {
        "id": len(products_db) + 1,
        "name": product.name,
        "price": product.price,
    }
    products_db.append(new_product)
    return new_product

env = FastAPIAppEnvironment(
    name="product-api",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
    ),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=False,
)
# {{/docs-fragment rest-api}}

if __name__ == "__main__":
    flyte.init_from_config(root_dir=Path(__file__).parent)
    app_deployment = flyte.deploy(env)
    print(f"API URL: {app_deployment[0].url}")
    print(f"Swagger docs: {app_deployment[0].url}/docs")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/fastapi/rest_api.py*

## Multi-file FastAPI app

Here's an example of a multi-file FastAPI app:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
# ]
# ///

"""Multi-file FastAPI app example."""

from fastapi import FastAPI
from module import function  # Import from another file
import pathlib

import flyte
from flyte.app.extras import FastAPIAppEnvironment

# {{docs-fragment app-definition}}
app = FastAPI(title="Multi-file FastAPI Demo")

app_env = FastAPIAppEnvironment(
    name="fastapi-multi-file",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
    ),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=False,
    # FastAPIAppEnvironment automatically includes necessary files
    # But you can also specify explicitly:
    # include=["app.py", "module.py"],
)
# {{/docs-fragment app-definition}}

# {{docs-fragment endpoint}}
@app.get("/")
async def root():
    return function()  # Uses function from module.py
# {{/docs-fragment endpoint}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config(root_dir=pathlib.Path(__file__).parent)
    app_deployment = flyte.deploy(app_env)
    print(f"Deployed: {app_deployment[0].summary_repr()}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/fastapi/multi_file/app.py*

The helper module:

```python
# {{docs-fragment helper-function}}
def function():
    """Helper function used by the FastAPI app."""
    return {"message": "Hello from module.py!"}
# {{/docs-fragment helper-function}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/fastapi/multi_file/module.py*

See [Multi-script apps](https://www.union.ai/docs/v2/union/user-guide/build-apps/multi-script-apps) for more details on building FastAPI apps with multiple files.

## Local-to-remote model serving

A common ML pattern: train a model with a Flyte pipeline, then serve predictions from it. During local development, the app loads the model from a local file (e.g. `model.pt` saved by your training pipeline). When deployed remotely, Flyte's `Parameter` system automatically resolves the model from the latest training run output.

```python
from contextlib import asynccontextmanager
from pathlib import Path
import os

from fastapi import FastAPI
import flyte
from flyte.app import Parameter, RunOutput
from flyte.app.extras import FastAPIAppEnvironment

MODEL_PATH_ENV = "MODEL_PATH"

@asynccontextmanager
async def lifespan(app: FastAPI):
    """Load the model on startup, from either a local file or a remote run output."""
    model_path = Path(os.environ.get(MODEL_PATH_ENV, "model.pt"))
    model = load_model(model_path)  # load_model: your framework-specific loading function (placeholder)
    app.state.model = model
    yield

app = FastAPI(title="MNIST Predictor", lifespan=lifespan)

serving_env = FastAPIAppEnvironment(
    name="mnist-predictor",
    app=app,
    parameters=[
        # Remote: resolves model from the latest train run and sets MODEL_PATH
        Parameter(
            name="model",
            value=RunOutput(task_name="ml_pipeline.pipeline", type="file", getter=(1,)),
            download=True,
            env_var=MODEL_PATH_ENV,
        ),
    ],
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi", "uvicorn", "torch", "torchvision",
    ),
    resources=flyte.Resources(cpu=1, memory="4Gi"),
)

@app.get("/predict")
async def predict(index: int = 0) -> dict:
    return {"prediction": app.state.model(index)}

if __name__ == "__main__":
    # Local: skip RunOutput resolution, lifespan falls back to local model.pt
    serving_env.parameters = []
    local_app = flyte.with_servecontext(mode="local").serve(serving_env)
    local_app.activate(wait=True)
```

Locally, the app loads `model.pt` from disk:

```bash
python serve_model.py
```

Remotely, Flyte resolves the model from the latest training run:

```bash
flyte deploy serve_model.py serving_env
```

The key idea: `Parameter` with `RunOutput` bridges the gap between local and remote. Locally, the app falls back to a local file. Remotely, Flyte resolves the model artifact from the latest pipeline run automatically.
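Stripped of the Flyte machinery, the fallback is ordinary environment-variable logic. A stdlib-only sketch (the function name and paths are illustrative):

```python
import os
from pathlib import Path

def resolve_model_path(default: str = "model.pt") -> Path:
    # Remotely, Flyte sets MODEL_PATH to the downloaded RunOutput artifact;
    # locally the variable is unset, so we fall back to a file on disk.
    return Path(os.environ.get("MODEL_PATH", default))

# Local run: MODEL_PATH unset, so the local file is used
os.environ.pop("MODEL_PATH", None)
assert resolve_model_path() == Path("model.pt")

# Remote run: Flyte injects the resolved artifact path via env_var
os.environ["MODEL_PATH"] = "/tmp/flyte/model.pt"
assert resolve_model_path() == Path("/tmp/flyte/model.pt")
```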

## Best practices

1. **Use Pydantic models**: Define request/response models for type safety and automatic validation
2. **Handle errors**: Use HTTPException for proper error responses
3. **Async operations**: Use async/await for I/O operations
4. **Environment variables**: Use environment variables for configuration
5. **Logging**: Add proper logging for debugging and monitoring
6. **Health checks**: Always include a `/health` endpoint
7. **API documentation**: FastAPI auto-generates docs, but add descriptions to your endpoints

## Advanced features

FastAPI supports many features that work with Flyte:

- **Dependencies**: Use FastAPI's dependency injection system
- **Background tasks**: Run background tasks with BackgroundTasks
- **WebSockets**: See [WebSocket apps](https://www.union.ai/docs/v2/union/user-guide/build-apps/websocket-apps) for details
- **Authentication**: Add authentication middleware (see [secret-based authentication](https://www.union.ai/docs/v2/union/user-guide/build-apps/secret-based-authentication))
- **CORS**: Configure CORS for cross-origin requests
- **Rate limiting**: Add rate limiting middleware

## Troubleshooting

**App not starting:**
- Check that uvicorn can find your app module
- Verify all dependencies are installed in the image
- Check container logs for startup errors

**Import errors:**
- Ensure all imported modules are available
- Use `include` parameter if you have custom modules
- Check that file paths are correct

**API not accessible:**
- Verify `requires_auth` setting
- Check that the app is listening on the correct port (8080)
- Review network/firewall settings

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/vllm-app ===

# vLLM app

vLLM is a high-performance library for serving large language models (LLMs). Flyte provides `VLLMAppEnvironment` for deploying vLLM model servers.

## Installation

First, install the vLLM plugin:

```bash
pip install flyteplugins-vllm
```

## Basic vLLM app

Here's a simple example serving a HuggingFace model:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "flyteplugins-vllm>=2.0.0b45",
# ]
# ///

"""A simple vLLM app example."""

from flyteplugins.vllm import VLLMAppEnvironment
import flyte

# {{docs-fragment basic-vllm-app}}
vllm_app = VLLMAppEnvironment(
    name="my-llm-app",
    model_hf_path="Qwen/Qwen3-0.6B",  # HuggingFace model path
    model_id="qwen3-0.6b",  # Model ID exposed by vLLM
    resources=flyte.Resources(
        cpu="4",
        memory="16Gi",
        gpu="L40s:1",  # GPU required for LLM serving
        disk="10Gi",
    ),
    scaling=flyte.app.Scaling(
        replicas=(0, 1),
        scaledown_after=300,  # Scale down after 5 minutes of inactivity
    ),
    requires_auth=False,
)
# {{/docs-fragment basic-vllm-app}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config()
    app = flyte.serve(vllm_app)
    print(f"Deployed vLLM app: {app.url}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/vllm/basic_vllm.py*

## Using prefetched models

You can use models prefetched with `flyte.prefetch`:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "flyteplugins-vllm>=2.0.0b45",
# ]
# override-dependencies = [
#    "cel-python; sys_platform == 'never'",
# ]
# ///

"""vLLM app using prefetched models."""

from flyteplugins.vllm import VLLMAppEnvironment
import flyte

# {{docs-fragment prefetch}}

# Use the prefetched model
vllm_app = VLLMAppEnvironment(
    name="my-llm-app",
    model_hf_path="Qwen/Qwen3-0.6B",  # this is a placeholder
    model_id="qwen3-0.6b",
    resources=flyte.Resources(cpu="4", memory="16Gi", gpu="L40s:1", disk="10Gi"),
    stream_model=True,  # Stream model directly from blob store to GPU
    requires_auth=False,
)

if __name__ == "__main__":
    flyte.init_from_config()

    # Prefetch the model first
    run = flyte.prefetch.hf_model(repo="Qwen/Qwen3-0.6B")
    run.wait()

    # Use the prefetched model
    app = flyte.serve(
        vllm_app.clone_with(
            vllm_app.name,
            model_hf_path=None,
            model_path=flyte.app.RunOutput(type="directory", run_name=run.name),
        )
    )
    print(f"Deployed vLLM app: {app.url}")
# {{/docs-fragment prefetch}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/vllm/vllm_with_prefetch.py*

## Model streaming

`VLLMAppEnvironment` supports streaming models directly from blob storage to GPU memory, reducing startup time.
When `stream_model=True` and the `model_path` argument is set to either a `flyte.io.Dir` or a `RunOutput` pointing
to a path in the object store:

- Model weights stream directly from storage to GPU
- Faster startup time (no full download required)
- Lower disk space requirements

> [!NOTE]
> The contents of the model directory must be compatible with the vLLM-supported formats, e.g. the HuggingFace model
> serialization format.

## Custom vLLM arguments

Use `extra_args` to pass additional arguments to vLLM:

```python
vllm_app = VLLMAppEnvironment(
    name="custom-vllm-app",
    model_hf_path="Qwen/Qwen3-0.6B",
    model_id="qwen3-0.6b",
    extra_args=[
        "--max-model-len", "8192",  # Maximum context length
        "--gpu-memory-utilization", "0.8",  # GPU memory utilization
        "--trust-remote-code",  # Trust remote code in models
    ],
    resources=flyte.Resources(cpu="4", memory="16Gi", gpu="L40s:1"),
    # ...
)
```
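If you build `extra_args` programmatically, a small helper keeps flags and values paired correctly. This helper is illustrative only, not part of the plugin:

```python
def build_extra_args(options: dict[str, object]) -> list[str]:
    """Flatten an options mapping into vLLM-style CLI arguments.

    Boolean True becomes a bare flag; everything else becomes a
    "--flag value" pair.
    """
    args: list[str] = []
    for key, value in options.items():
        flag = f"--{key}"
        if value is True:
            args.append(flag)
        else:
            args.extend([flag, str(value)])
    return args


extra_args = build_extra_args({
    "max-model-len": 8192,
    "gpu-memory-utilization": 0.8,
    "trust-remote-code": True,
})
# extra_args == ["--max-model-len", "8192", "--gpu-memory-utilization", "0.8", "--trust-remote-code"]
```

The resulting list can be passed straight to the `extra_args` parameter shown above.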

See the [vLLM documentation](https://docs.vllm.ai/en/stable/configuration/engine_args.html) for all available arguments.

## Using the OpenAI-compatible API

Once deployed, your vLLM app exposes an OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-app-url/v1",  # vLLM endpoint
    api_key="your-api-key",  # If you passed an --api-key argument
)

response = client.chat.completions.create(
    model="qwen3-0.6b",  # Your model_id
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

print(response.choices[0].message.content)
```

> [!TIP]
> If you passed an `--api-key` argument, you can use the `api_key` parameter to authenticate your requests.
> See [here](https://www.union.ai/docs/v2/union/user-guide/build-apps/secret-based-authentication) for more details on how to pass auth secrets to your app.

## Multi-GPU inference (Tensor Parallelism)

For larger models, use multiple GPUs with tensor parallelism:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "flyteplugins-vllm>=2.0.0b45",
# ]
# ///

"""vLLM app with multi-GPU tensor parallelism."""

from flyteplugins.vllm import VLLMAppEnvironment
import flyte

# {{docs-fragment multi-gpu}}
vllm_app = VLLMAppEnvironment(
    name="multi-gpu-llm-app",
    model_hf_path="meta-llama/Llama-2-70b-hf",
    model_id="llama-2-70b",
    resources=flyte.Resources(
        cpu="8",
        memory="32Gi",
        gpu="L40s:4",  # 4 GPUs for tensor parallelism
        disk="100Gi",
    ),
    extra_args=[
        "--tensor-parallel-size", "4",  # Use 4 GPUs
        "--max-model-len", "4096",
        "--gpu-memory-utilization", "0.9",
    ],
    requires_auth=False,
)
# {{/docs-fragment multi-gpu}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config()
    app = flyte.serve(vllm_app)
    print(f"Deployed vLLM app: {app.url}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/vllm/vllm_multi_gpu.py*

The `--tensor-parallel-size` value should match the number of GPUs requested in `resources`.
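A quick sanity check can catch a mismatch before deployment. This sketch assumes the `"Type:count"` GPU-spec convention used in the examples above:

```python
def gpu_count(gpu_spec: str) -> int:
    """Parse the device count from a GPU spec such as "L40s:4"."""
    _, sep, count = gpu_spec.partition(":")
    return int(count) if sep else 1


def check_tensor_parallel(gpu_spec: str, tp_size: int) -> None:
    """Fail fast if --tensor-parallel-size disagrees with the GPU count."""
    n = gpu_count(gpu_spec)
    if n != tp_size:
        raise ValueError(
            f"tensor-parallel-size={tp_size}, but resources request {n} GPU(s)"
        )


check_tensor_parallel("L40s:4", 4)  # matches: no error
```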

## Model sharding with prefetch

You can prefetch and shard models for multi-GPU inference:

```python
# Prefetch with sharding configuration
run = flyte.prefetch.hf_model(
    repo="meta-llama/Llama-2-70b-hf",
    accelerator="L40s:4",
    shard_config=flyte.prefetch.ShardConfig(
        engine="vllm",
        args=flyte.prefetch.VLLMShardArgs(
            tensor_parallel_size=4,
            dtype="auto",
            trust_remote_code=True,
        ),
    ),
)
run.wait()

# Use the sharded model
vllm_app = VLLMAppEnvironment(
    name="sharded-llm-app",
    model_path=flyte.app.RunOutput(type="directory", run_name=run.name),
    model_id="llama-2-70b",
    resources=flyte.Resources(cpu="8", memory="32Gi", gpu="L40s:4", disk="100Gi"),
    extra_args=["--tensor-parallel-size", "4"],
    stream_model=True,
)
```

See [Prefetching models](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/prefetching-models) for more details on sharding.

## Autoscaling

vLLM apps work well with autoscaling:

```python
vllm_app = VLLMAppEnvironment(
    name="autoscaling-llm-app",
    model_hf_path="Qwen/Qwen3-0.6B",
    model_id="qwen3-0.6b",
    resources=flyte.Resources(cpu="4", memory="16Gi", gpu="L40s:1"),
    scaling=flyte.app.Scaling(
        replicas=(0, 1),  # Scale to zero when idle
        scaledown_after=600,  # 10 minutes idle before scaling down
    ),
    # ...
)
```

## Best practices

1. **Use prefetching**: Prefetch models for faster deployment and better reproducibility
2. **Enable streaming**: Use `stream_model=True` to reduce startup time and disk usage
3. **Right-size GPUs**: Match GPU memory to model size
4. **Configure memory utilization**: Use `--gpu-memory-utilization` to control memory usage
5. **Use tensor parallelism**: For large models, use multiple GPUs with `tensor-parallel-size`
6. **Set autoscaling**: Use appropriate idle TTL to balance cost and performance
7. **Limit context length**: Use `--max-model-len` for smaller models to reduce memory usage
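For the "right-size GPUs" point, a back-of-the-envelope estimate of weight memory is often enough. Note this covers weights only; the KV cache, activations, and CUDA graphs need additional headroom on top:

```python
# Bytes per parameter for common dtypes.
DTYPE_BYTES = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, dtype: str = "bf16") -> float:
    """Approximate GiB required to hold model weights alone."""
    return num_params * DTYPE_BYTES[dtype] / 1024**3


# A 0.6B-parameter model in bf16 needs roughly 1.1 GiB for weights;
# a 70B-parameter model in fp16 needs roughly 130 GiB (hence multiple GPUs).
print(round(weight_memory_gib(0.6e9), 1), round(weight_memory_gib(70e9, "fp16")))
```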

## Troubleshooting

**Model loading fails:**
- Verify GPU memory is sufficient for the model
- Check that the model path or HuggingFace path is correct
- Review container logs for detailed error messages

**Out of memory errors:**
- Reduce `--max-model-len`
- Lower `--gpu-memory-utilization`
- Use a smaller model or more GPUs

**Slow startup:**
- Enable `stream_model=True` for faster loading
- Prefetch models before deployment
- Use faster storage backends

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/sglang-app ===

# SGLang app

SGLang is a fast serving framework for large language models (LLMs) with first-class support for structured generation. Flyte provides `SGLangAppEnvironment` for deploying SGLang model servers.

## Installation

First, install the SGLang plugin:

```bash
pip install flyteplugins-sglang
```

## Basic SGLang app

Here's a simple example serving a HuggingFace model:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "flyteplugins-sglang>=2.0.0b45",
# ]
# ///

"""A simple SGLang app example."""

from flyteplugins.sglang import SGLangAppEnvironment
import flyte

# {{docs-fragment basic-sglang-app}}
sglang_app = SGLangAppEnvironment(
    name="my-sglang-app",
    model_hf_path="Qwen/Qwen3-0.6B",  # HuggingFace model path
    model_id="qwen3-0.6b",  # Model ID exposed by SGLang
    resources=flyte.Resources(
        cpu="4",
        memory="16Gi",
        gpu="L40s:1",  # GPU required for LLM serving
        disk="10Gi",
    ),
    scaling=flyte.app.Scaling(
        replicas=(0, 1),
        scaledown_after=300,  # Scale down after 5 minutes of inactivity
    ),
    requires_auth=False,
)
# {{/docs-fragment basic-sglang-app}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config()
    app = flyte.serve(sglang_app)
    print(f"Deployed SGLang app: {app.url}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/sglang/basic_sglang.py*

## Using prefetched models

You can use models prefetched with `flyte.prefetch`:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "flyteplugins-sglang>=2.0.0b45",
# ]
# ///

"""SGLang app using prefetched models."""

from flyteplugins.sglang import SGLangAppEnvironment
import flyte

# {{docs-fragment prefetch}}

# Use the prefetched model
sglang_app = SGLangAppEnvironment(
    name="my-sglang-app",
    model_hf_path="Qwen/Qwen3-0.6B",  # this is a placeholder
    model_id="qwen3-0.6b",
    resources=flyte.Resources(cpu="4", memory="16Gi", gpu="L40s:1", disk="10Gi"),
    stream_model=True,  # Stream model directly from blob store to GPU
    requires_auth=False,
)

if __name__ == "__main__":
    flyte.init_from_config()

    # Prefetch the model first
    run = flyte.prefetch.hf_model(repo="Qwen/Qwen3-0.6B")
    run.wait()

    app = flyte.serve(
        sglang_app.clone_with(
            sglang_app.name,
            model_hf_path=None,
            model_path=flyte.app.RunOutput(type="directory", run_name=run.name),
        )
    )
    print(f"Deployed SGLang app: {app.url}")
# {{/docs-fragment prefetch}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/sglang/sglang_with_prefetch.py*

## Model streaming

`SGLangAppEnvironment` supports streaming models directly from blob storage to GPU memory, reducing startup time.
When `stream_model=True` and the `model_path` argument is set to either a `flyte.io.Dir` or a `RunOutput` pointing
to a path in the object store:

- Model weights stream directly from storage to GPU
- Faster startup time (no full download required)
- Lower disk space requirements

> [!NOTE]
> The contents of the model directory must be compatible with the SGLang-supported formats, e.g. the HuggingFace model
> serialization format.

## Custom SGLang arguments

Use `extra_args` to pass additional arguments to SGLang:

```python
sglang_app = SGLangAppEnvironment(
    name="custom-sglang-app",
    model_hf_path="Qwen/Qwen3-0.6B",
    model_id="qwen3-0.6b",
    extra_args=[
        "--max-model-len", "8192",  # Maximum context length
        "--mem-fraction-static", "0.8",  # Memory fraction for static allocation
        "--trust-remote-code",  # Trust remote code in models
    ],
    resources=flyte.Resources(cpu="4", memory="16Gi", gpu="L40s:1"),
    # ...
)
```

See the [SGLang server arguments documentation](https://docs.sglang.io/advanced_features/server_arguments.html) for all available options.

## Using the OpenAI-compatible API

Once deployed, your SGLang app exposes an OpenAI-compatible API:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-app-url/v1",  # SGLang endpoint
    api_key="your-api-key",  # If you passed an --api-key argument
)

response = client.chat.completions.create(
    model="qwen3-0.6b",  # Your model_id
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

print(response.choices[0].message.content)
```

> [!TIP]
> If you passed an `--api-key` argument, you can use the `api_key` parameter to authenticate your requests.
> See [here](https://www.union.ai/docs/v2/union/user-guide/build-apps/secret-based-authentication) for more details on how to pass auth secrets to your app.

## Multi-GPU inference (Tensor Parallelism)

For larger models, use multiple GPUs with tensor parallelism:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "flyteplugins-sglang>=2.0.0b45",
# ]
# ///

"""SGLang app with multi-GPU tensor parallelism."""

from flyteplugins.sglang import SGLangAppEnvironment
import flyte

# {{docs-fragment multi-gpu}}
sglang_app = SGLangAppEnvironment(
    name="multi-gpu-sglang-app",
    model_hf_path="meta-llama/Llama-2-70b-hf",
    model_id="llama-2-70b",
    resources=flyte.Resources(
        cpu="8",
        memory="32Gi",
        gpu="L40s:4",  # 4 GPUs for tensor parallelism
        disk="100Gi",
    ),
    extra_args=[
        "--tp", "4",  # Tensor parallelism size (4 GPUs)
        "--max-model-len", "4096",
        "--mem-fraction-static", "0.9",
    ],
    requires_auth=False,
)
# {{/docs-fragment multi-gpu}}

# {{docs-fragment deploy}}
if __name__ == "__main__":
    flyte.init_from_config()
    app = flyte.serve(sglang_app)
    print(f"Deployed SGLang app: {app.url}")
# {{/docs-fragment deploy}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/sglang/sglang_multi_gpu.py*

The tensor parallelism size (`--tp`) should match the number of GPUs specified in resources.

## Model sharding with prefetch

You can prefetch and shard models for multi-GPU inference; note that the sharding step in `flyte.prefetch` runs through the vLLM engine, and the resulting shards are then served by SGLang:

```python
# Prefetch with sharding configuration
run = flyte.prefetch.hf_model(
    repo="meta-llama/Llama-2-70b-hf",
    accelerator="L40s:4",
    shard_config=flyte.prefetch.ShardConfig(
        engine="vllm",
        args=flyte.prefetch.VLLMShardArgs(
            tensor_parallel_size=4,
            dtype="auto",
            trust_remote_code=True,
        ),
    ),
)
run.wait()

# Use the sharded model
sglang_app = SGLangAppEnvironment(
    name="sharded-sglang-app",
    model_path=flyte.app.RunOutput(type="directory", run_name=run.name),
    model_id="llama-2-70b",
    resources=flyte.Resources(cpu="8", memory="32Gi", gpu="L40s:4", disk="100Gi"),
    extra_args=["--tp", "4"],
    stream_model=True,
)
```

See [Prefetching models](https://www.union.ai/docs/v2/union/user-guide/serve-and-deploy-apps/prefetching-models) for more details on sharding.

## Autoscaling

SGLang apps work well with autoscaling:

```python
sglang_app = SGLangAppEnvironment(
    name="autoscaling-sglang-app",
    model_hf_path="Qwen/Qwen3-0.6B",
    model_id="qwen3-0.6b",
    resources=flyte.Resources(cpu="4", memory="16Gi", gpu="L40s:1"),
    scaling=flyte.app.Scaling(
        replicas=(0, 1),  # Scale to zero when idle
        scaledown_after=600,  # 10 minutes idle before scaling down
    ),
    # ...
)
```

## Structured generation

SGLang is particularly well-suited for structured generation tasks. The deployed app supports standard OpenAI API calls, and you can use SGLang's advanced features through the API.
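As a sketch, a schema-constrained request can be expressed through the OpenAI-style payload below. The `response_format` shape follows the OpenAI JSON-schema convention; whether your SGLang version accepts it should be verified against the SGLang docs:

```python
def structured_request(model_id: str, prompt: str, schema: dict) -> dict:
    """Build an OpenAI-style chat payload constraining output to a JSON schema."""
    return {
        "model": model_id,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "answer", "schema": schema},
        },
    }


payload = structured_request(
    "qwen3-0.6b",
    "Give the capital of France as JSON.",
    {"type": "object", "properties": {"capital": {"type": "string"}}},
)
```

The payload can be sent via `client.chat.completions.create(**payload)` against the deployed endpoint.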

## Best practices

1. **Use prefetching**: Prefetch models for faster deployment and better reproducibility
2. **Enable streaming**: Use `stream_model=True` to reduce startup time and disk usage
3. **Right-size GPUs**: Match GPU memory to model size
4. **Use tensor parallelism**: For large models, use multiple GPUs with `--tp`
5. **Set autoscaling**: Use appropriate idle TTL to balance cost and performance
6. **Configure memory**: Use `--mem-fraction-static` to control memory allocation
7. **Limit context length**: Use `--max-model-len` for smaller models to reduce memory usage

## Troubleshooting

**Model loading fails:**
- Verify GPU memory is sufficient for the model
- Check that the model path or HuggingFace path is correct
- Review container logs for detailed error messages

**Out of memory errors:**
- Reduce `--max-model-len`
- Lower `--mem-fraction-static`
- Use a smaller model or more GPUs

**Slow startup:**
- Enable `stream_model=True` for faster loading
- Prefetch models before deployment
- Use faster storage backends

=== PAGE: https://www.union.ai/docs/v2/union/user-guide/native-app-integrations/flyte-webhook ===

# Flyte webhook

`FlyteWebhookAppEnvironment` is a pre-built `FastAPIAppEnvironment` that exposes
HTTP endpoints for common Union.ai operations. Instead of writing
your own FastAPI routes to interact with the control plane, you get a
ready-to-deploy webhook service with a single constructor call.

## Available endpoints

The webhook provides endpoints for the following operations:

| Group | Endpoints | Description |
|---|---|---|
| **core** | `GET /health`, `GET /me` | Health check and authenticated user info |
| **task** | `POST /run-task/{domain}/{project}/{name}`, `GET /task/{domain}/{project}/{name}` | Run tasks and retrieve task metadata |
| **run** | `GET /run/{name}`, `GET /run/{name}/io`, `POST /run/{name}/abort` | Get run status, inputs/outputs, and abort runs |
| **app** | `GET /app/{name}`, `POST /app/{name}/activate`, `POST /app/{name}/deactivate`, `POST /app/{name}/call` | Manage apps and call other app endpoints |
| **trigger** | `POST /trigger/{task_name}/{trigger_name}/activate`, `POST /trigger/{task_name}/{trigger_name}/deactivate` | Activate and deactivate triggers |
| **build** | `POST /build-image` | Build container images |
| **prefetch** | `POST /prefetch/hf-model`, `GET /prefetch/hf-model/{run_name}`, `GET /prefetch/hf-model/{run_name}/io`, `POST /prefetch/hf-model/{run_name}/abort` | Prefetch HuggingFace models |

All endpoints except `/health`, `/docs`, and `/openapi.json` use passthrough
authentication, forwarding the caller's credentials to the
Union.ai control plane.
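Concretely, each caller supplies its own credentials on every request. The helper below is illustrative only (`https://my-webhook.example.com` is a placeholder endpoint); it composes the run-task URL and passthrough-auth header:

```python
def run_task_request(
    endpoint: str, domain: str, project: str, task: str, api_key: str
) -> tuple[str, dict]:
    """Compose the run-task URL and passthrough-auth header for the webhook."""
    url = f"{endpoint}/run-task/{domain}/{project}/{task}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


url, headers = run_task_request(
    "https://my-webhook.example.com", "development", "my-project", "my-task", "<api-key>"
)
# url == "https://my-webhook.example.com/run-task/development/my-project/my-task"
```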

## Basic usage

Create a webhook with all endpoints enabled:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0",
#    "fastapi",
#    "uvicorn",
#    "httpx",
# ]
# ///

"""Examples showing how to use FlyteWebhookAppEnvironment."""

import logging

import flyte
import flyte.app
from flyte.app.extras import FlyteWebhookAppEnvironment

# {{docs-fragment basic-webhook}}
webhook_env = FlyteWebhookAppEnvironment(
    name="my-webhook",
    title="My Flyte Webhook",
    description="A pre-built webhook service for Flyte operations",
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
    scaling=flyte.app.Scaling(replicas=1),
)
# {{/docs-fragment basic-webhook}}

# {{docs-fragment endpoint-groups}}
task_runner_webhook = FlyteWebhookAppEnvironment(
    name="task-runner-webhook",
    title="Task Runner Webhook",
    endpoint_groups=["core", "task", "run"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
# {{/docs-fragment endpoint-groups}}

# {{docs-fragment individual-endpoints}}
minimal_webhook = FlyteWebhookAppEnvironment(
    name="minimal-webhook",
    title="Minimal Webhook",
    endpoints=["health", "run_task", "get_run"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
# {{/docs-fragment individual-endpoints}}

# {{docs-fragment task-allowlist}}
restricted_webhook = FlyteWebhookAppEnvironment(
    name="restricted-webhook",
    title="Restricted Webhook",
    endpoint_groups=["core", "task", "run"],
    task_allowlist=[
        "production/my-project/allowed-task",
        "my-project/another-task",
        "any-domain-task",
    ],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
# {{/docs-fragment task-allowlist}}

# {{docs-fragment app-allowlist}}
app_manager_webhook = FlyteWebhookAppEnvironment(
    name="app-manager-webhook",
    title="App Manager Webhook",
    endpoint_groups=["core", "app"],
    app_allowlist=["my-app", "another-app"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
# {{/docs-fragment app-allowlist}}

# {{docs-fragment trigger-allowlist}}
trigger_manager_webhook = FlyteWebhookAppEnvironment(
    name="trigger-manager-webhook",
    title="Trigger Manager Webhook",
    endpoint_groups=["core", "trigger"],
    trigger_allowlist=["my-task/my-trigger", "another-trigger"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
# {{/docs-fragment trigger-allowlist}}

# {{docs-fragment deploy-webhook}}
if __name__ == "__main__":
    import os

    import httpx

    flyte.init_from_config(log_level=logging.DEBUG)

    served_app = flyte.serve(webhook_env)
    url = served_app.url
    endpoint = served_app.endpoint
    print(f"Webhook is served on {url}")
    print(f"OpenAPI docs available at: {endpoint}/docs")

    served_app.activate(wait=True)
# {{/docs-fragment deploy-webhook}}

# {{docs-fragment call-webhook}}
    token = os.getenv("FLYTE_API_KEY")
    if not token:
        raise ValueError("FLYTE_API_KEY not set. Obtain with: flyte get api-key")

    headers = {
        "Authorization": f"Bearer {token}",
        "User-Agent": "flyte-webhook-client/1.0",
    }

    with httpx.Client(headers=headers) as client:
        # Health check (no auth required)
        health = client.get(f"{endpoint}/health")
        print(f"/health: {health.json()}")

        # Get current user info (requires auth)
        me = client.get(f"{endpoint}/me")
        print(f"/me: {me.json()}")

        # Run a task
        resp = client.post(
            f"{endpoint}/run-task/development/my-project/my-task",
            json={"x": 42, "y": "hello"},
        )
        result = resp.json()
        print(f"Run task: {result}")

        # Check run status
        run_name = result["name"]
        run = client.get(f"{endpoint}/run/{run_name}")
        print(f"Run status: {run.json()}")
# {{/docs-fragment call-webhook}}
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

Deploy and activate it:

```python
if __name__ == "__main__":
    flyte.init_from_config(log_level=logging.DEBUG)

    served_app = flyte.serve(webhook_env)
    url = served_app.url
    endpoint = served_app.endpoint
    print(f"Webhook is served on {url}")
    print(f"OpenAPI docs available at: {endpoint}/docs")

    served_app.activate(wait=True)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

Once running, the webhook exposes OpenAPI docs at `{endpoint}/docs` (Swagger UI)
and `{endpoint}/redoc`.

## Filtering endpoints

You can restrict which endpoints the webhook exposes using either **endpoint
groups** or **individual endpoints**.

### Endpoint groups

Enable groups of related endpoints with `endpoint_groups`:

```python
task_runner_webhook = FlyteWebhookAppEnvironment(
    name="task-runner-webhook",
    title="Task Runner Webhook",
    endpoint_groups=["core", "task", "run"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

Available groups: `all`, `core`, `task`, `run`, `app`, `trigger`, `build`, `prefetch`.

### Individual endpoints

For finer control, specify exact endpoints with `endpoints`:

```
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0",
#    "fastapi",
#    "uvicorn",
#    "httpx",
# ]
# ///

import flyte
import flyte.app
from flyte.app.extras import FlyteWebhookAppEnvironment

minimal_webhook = FlyteWebhookAppEnvironment(
    name="minimal-webhook",
    title="Minimal Webhook",
    endpoints=["health", "run_task", "get_run"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

> [!NOTE]
> You cannot specify both `endpoint_groups` and `endpoints` at the same time. Use
> one or the other.

## Allow-listing

Restrict which resources the webhook can access using allow-lists.

### Task allow-list

Limit which tasks can be run or queried through the webhook:

```
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0",
#    "fastapi",
#    "uvicorn",
#    "httpx",
# ]
# ///

import flyte
import flyte.app
from flyte.app.extras import FlyteWebhookAppEnvironment

restricted_webhook = FlyteWebhookAppEnvironment(
    name="restricted-webhook",
    title="Restricted Webhook",
    endpoint_groups=["core", "task", "run"],
    task_allowlist=[
        "production/my-project/allowed-task",
        "my-project/another-task",
        "any-domain-task",
    ],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

Task identifiers support three formats:
- `domain/project/name` — exact match
- `project/name` — matches any domain
- `name` — matches any domain and project
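
To make the three formats concrete, here is an illustrative sketch of how a fully qualified task identifier could be checked against an allow-list. This is a simplified stand-in, not the actual server-side implementation; the function name `matches_allowlist` is invented for illustration:

```
def matches_allowlist(identifier: str, allowlist: list[str]) -> bool:
    """Check a fully qualified task id (domain/project/name) against
    allow-list entries in the three supported formats."""
    domain, project, name = identifier.split("/")
    for entry in allowlist:
        parts = entry.split("/")
        # Three parts: exact domain/project/name match
        if len(parts) == 3 and parts == [domain, project, name]:
            return True
        # Two parts: project/name, any domain
        if len(parts) == 2 and parts == [project, name]:
            return True
        # One part: name only, any domain and project
        if len(parts) == 1 and parts == [name]:
            return True
    return False
```

For example, `"development/my-project/another-task"` matches the entry `"my-project/another-task"` because a two-part entry ignores the domain.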

### App allow-list

Limit which apps can be managed through the webhook:

```
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0",
#    "fastapi",
#    "uvicorn",
#    "httpx",
# ]
# ///

import flyte
import flyte.app
from flyte.app.extras import FlyteWebhookAppEnvironment

app_manager_webhook = FlyteWebhookAppEnvironment(
    name="app-manager-webhook",
    title="App Manager Webhook",
    endpoint_groups=["core", "app"],
    app_allowlist=["my-app", "another-app"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

### Trigger allow-list

Limit which triggers can be activated or deactivated:

```
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0",
#    "fastapi",
#    "uvicorn",
#    "httpx",
# ]
# ///

import flyte
import flyte.app
from flyte.app.extras import FlyteWebhookAppEnvironment

trigger_manager_webhook = FlyteWebhookAppEnvironment(
    name="trigger-manager-webhook",
    title="Trigger Manager Webhook",
    endpoint_groups=["core", "trigger"],
    trigger_allowlist=["my-task/my-trigger", "another-trigger"],
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
)
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

Trigger identifiers support two formats:
- `task_name/trigger_name` — exact match
- `trigger_name` — matches any task

## Calling the webhook

Authenticate requests with a Union.ai API key passed as a Bearer
token:

```
# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.0.0",
#    "fastapi",
#    "uvicorn",
#    "httpx",
# ]
# ///

import logging

import flyte
import flyte.app
from flyte.app.extras import FlyteWebhookAppEnvironment

webhook_env = FlyteWebhookAppEnvironment(
    name="my-webhook",
    title="My Flyte Webhook",
    description="A pre-built webhook service for Flyte operations",
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=True,
    scaling=flyte.app.Scaling(replicas=1),
)

if __name__ == "__main__":
    import os

    import httpx

    flyte.init_from_config(log_level=logging.DEBUG)

    served_app = flyte.serve(webhook_env)
    url = served_app.url
    endpoint = served_app.endpoint
    print(f"Webhook is served on {url}")
    print(f"OpenAPI docs available at: {endpoint}/docs")

    served_app.activate(wait=True)

    token = os.getenv("FLYTE_API_KEY")
    if not token:
        raise ValueError("FLYTE_API_KEY not set. Obtain with: flyte get api-key")

    headers = {
        "Authorization": f"Bearer {token}",
        "User-Agent": "flyte-webhook-client/1.0",
    }

    with httpx.Client(headers=headers) as client:
        # Health check (no auth required)
        health = client.get(f"{endpoint}/health")
        print(f"/health: {health.json()}")

        # Get current user info (requires auth)
        me = client.get(f"{endpoint}/me")
        print(f"/me: {me.json()}")

        # Run a task
        resp = client.post(
            f"{endpoint}/run-task/development/my-project/my-task",
            json={"x": 42, "y": "hello"},
        )
        result = resp.json()
        print(f"Run task: {result}")

        # Check run status
        run_name = result["name"]
        run = client.get(f"{endpoint}/run/{run_name}")
        print(f"Run status: {run.json()}")
```

*Source: https://github.com/unionai/unionai-examples/blob/main/v2/user-guide/build-apps/flyte_webhook_examples.py*

## Authentication

`FlyteWebhookAppEnvironment` uses `FastAPIPassthroughAuthMiddleware`, which
extracts the caller's auth token from the `Authorization` header and sets up
a Union.ai context so that every control-plane call (e.g.
`remote.Task.get`, `flyte.run`) runs with the caller's identity.

The `/health`, `/docs`, `/openapi.json`, and `/redoc` endpoints are excluded
from authentication.

## Self-reference protection

App endpoints (`get_app`, `activate_app`, `deactivate_app`, `call_app`) prevent
the webhook from targeting itself. Any attempt to activate, deactivate, or call
the webhook under its own name returns a `400 Bad Request` error.

