Apps

Now that you understand tasks, let’s learn about apps, Flyte’s way of running long-lived services.

Tasks vs apps

You’ve already learned about tasks: Python functions that run to completion in containers. Tasks are great for data processing, training, and batch operations.

Apps are different. An app is a long-running service that stays active and handles requests over time. Apps are ideal for:

  • REST APIs and webhooks
  • Model inference endpoints
  • Interactive dashboards
  • Real-time data services

| Aspect     | Task                                | App                         |
|------------|-------------------------------------|-----------------------------|
| Lifecycle  | Runs once, then exits               | Stays running indefinitely  |
| Invocation | Called with inputs, returns outputs | Receives HTTP requests      |
| Use case   | Batch processing, training          | APIs, inference, dashboards |
| Durability | Inputs/outputs stored, can resume   | Stateless request handling  |

AppEnvironment

Just as tasks use TaskEnvironment, apps use AppEnvironment to configure their runtime.

An AppEnvironment specifies:

  • Hardware: CPU, memory, GPU allocation
  • Software: Container image with dependencies
  • App-specific settings: Ports, scaling, authentication

Here’s a simple example:

import flyte
from flyte.app.extras import FastAPIAppEnvironment

env = FastAPIAppEnvironment(
    name="my-app",
    image=flyte.Image.from_debian_base().with_pip_packages("fastapi", "uvicorn"),
    resources=flyte.Resources(cpu="1", memory="2Gi"),
)

A hello world app

Let’s create a minimal FastAPI app to see how this works.

First, create hello_app.py:

# /// script
# requires-python = "==3.13"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
#    "uvicorn",
# ]
# ///

"""A simple "Hello World" FastAPI app example for serving."""

from fastapi import FastAPI
import pathlib
import flyte
from flyte.app.extras import FastAPIAppEnvironment

# Define a simple FastAPI application
app = FastAPI(
    title="Hello World API",
    description="A simple FastAPI application",
    version="1.0.0",
)

# Create an AppEnvironment for the FastAPI app
env = FastAPIAppEnvironment(
    name="hello-app",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
    ),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=False,
)

# Define API endpoints
@app.get("/")
async def root():
    return {"message": "Hello, World!"}

@app.get("/health")
async def health_check():
    return {"status": "healthy"}

# Serving this script will deploy and serve the app on your Union/Flyte instance.
if __name__ == "__main__":
    # Initialize Flyte from a config file.
    flyte.init_from_config(root_dir=pathlib.Path(__file__).parent)

    # Serve the app remotely.
    app_instance = flyte.serve(env)

    # Print the app URL.
    print(app_instance.url)
    print("App 'hello-app' is now serving.")


Understanding the code

  • FastAPI() creates the web application with its endpoints
  • FastAPIAppEnvironment configures the container and resources
  • @app.get("/") defines an HTTP endpoint that returns a greeting
  • flyte.serve() deploys and starts the app on your Flyte backend

Serving the app

With your config file in place, serve the app:

flyte serve hello_app.py env

Or run the Python file directly (which calls flyte.serve() in the main block):

python hello_app.py

You’ll see output like:

https://my-instance.flyte.com/v2/domain/development/project/my-project/apps/hello-app
App 'hello-app' is now serving.

Click the link to view your app in the UI, where you can find the app URL. Append /docs to the app URL to open FastAPI’s interactive API documentation.
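Once the app is serving, a quick smoke test is to call the two endpoints defined in hello_app.py. The following is a minimal sketch that uses only the Python standard library. The helper names fetch_json and check_hello_app are illustrative and not part of Flyte, and the base URL is a placeholder: copy the real app endpoint from the UI. It assumes the app was deployed with requires_auth=False, as in the example, so no credentials are sent.

```python
import json
import urllib.request
from typing import Callable


def fetch_json(url: str) -> dict:
    """GET a URL and decode its JSON body (standard library only)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def check_hello_app(base_url: str, fetch: Callable[[str], dict] = fetch_json) -> bool:
    """Return True if both endpoints answer the way hello_app.py defines them.

    `fetch` is injectable so the check can be exercised without a network.
    """
    base = base_url.rstrip("/")
    health = fetch(f"{base}/health")
    greeting = fetch(f"{base}/")
    return (
        health == {"status": "healthy"}
        and greeting == {"message": "Hello, World!"}
    )


# Usage (placeholder URL; replace with the app endpoint shown in the UI):
# check_hello_app("https://<your-app-endpoint>")
```

Passing the fetch function as a parameter keeps the comparison logic separate from the HTTP call, so the same check can run against a stub in a unit test or against the deployed app.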

When to use apps vs tasks

Use tasks when:

  • Processing takes seconds to hours
  • You need durability (inputs/outputs tracked)
  • Work is triggered by events or schedules
  • Results need to be cached or resumed

Use apps when:

  • Responses must be fast (milliseconds)
  • You’re serving an API or dashboard
  • Users interact in real-time
  • You need a persistent endpoint

Common patterns

Model serving with FastAPI: Train a model with a Flyte pipeline, then serve predictions from it. During local development, the app loads the model from a local file. When deployed remotely, Flyte’s Parameter system automatically resolves the model from the latest training run output. See FastAPI app for the full example.

Agent UI with Gradio: Build an interactive UI that kicks off agent runs using flyte.with_runcontext(). A single RUN_MODE environment variable controls the deployment progression: fully local (rapid iteration), local UI with remote task execution (cluster compute), or fully remote (production). See Build apps for details.
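To make the RUN_MODE idea concrete, here is a hedged sketch of how one environment variable might be mapped to the three stages described above. The value names ("local", "hybrid", "remote"), the RunPlan class, and resolve_run_plan are assumptions made for illustration; the actual example in Build apps may use different names and wiring.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RunPlan:
    ui_remote: bool     # True when the UI itself is deployed as an app
    tasks_remote: bool  # True when agent tasks execute on the cluster


# Hypothetical value names; the real example may spell them differently.
_PLANS = {
    "local": RunPlan(ui_remote=False, tasks_remote=False),  # rapid iteration
    "hybrid": RunPlan(ui_remote=False, tasks_remote=True),  # local UI, cluster compute
    "remote": RunPlan(ui_remote=True, tasks_remote=True),   # production
}


def resolve_run_plan(environ: Mapping[str, str] = os.environ) -> RunPlan:
    """Map RUN_MODE to a plan, defaulting to fully local when it is unset."""
    mode = environ.get("RUN_MODE", "local").strip().lower()
    if mode not in _PLANS:
        raise ValueError(
            f"Unknown RUN_MODE {mode!r}; expected one of {sorted(_PLANS)}"
        )
    return _PLANS[mode]
```

Keeping the decision in one function means the rest of the script only branches on ui_remote and tasks_remote, so moving from local iteration to production is a matter of changing the variable rather than the code.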

Next steps

You now understand the core building blocks of Flyte:

  • TaskEnvironment and AppEnvironment configure where code runs
  • Tasks are functions that execute and complete
  • Apps are long-running services
  • Runs and Actions track executions

Before diving deeper, check out Key capabilities for an overview of what Flyte can do, from parallelism and caching to LLM serving and error recovery.

Then head to Basic project to build an end-to-end ML system with training tasks and a serving app.