Apps

Now that you understand tasks, let’s learn about apps, Flyte’s way of running long-lived services.

Tasks vs apps

You’ve already learned about tasks: Python functions that run to completion in containers. Tasks are great for data processing, training, and batch operations.

Apps are different. An app is a long-running service that stays active and handles requests over time. Apps are ideal for:

REST APIs and webhooks
Model inference endpoints
Interactive dashboards
Real-time data services

Aspect	Task	App
Lifecycle	Runs once, then exits	Stays running indefinitely
Invocation	Called with inputs, returns outputs	Receives HTTP requests
Use case	Batch processing, training	APIs, inference, dashboards
Durability	Inputs/outputs stored, can resume	Stateless request handling
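
The lifecycle difference in the table can be illustrated without Flyte at all. The following is a minimal sketch using only the Python standard library; it is not how Flyte runs tasks or apps. The names word_count_task, HelloHandler, start_service, and get_json are made up for this illustration. The function returns once and is finished, while the server keeps answering requests until it is shut down.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Bypass any system proxy settings so loopback requests stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def word_count_task(text: str) -> int:
    """Task-style work: called with inputs, returns an output, then it is done."""
    return len(text.split())


class HelloHandler(BaseHTTPRequestHandler):
    """App-style work: answers each incoming HTTP request while the server is up."""

    def do_GET(self) -> None:
        body = json.dumps({"message": "Hello, World!"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Silence per-request logging to keep the demo output clean.
        pass


def start_service() -> ThreadingHTTPServer:
    """Start the service on a free local port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def get_json(url: str) -> dict:
    """Send a GET request and decode the JSON response body."""
    with _opener.open(url, timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Task: one call, one result.
    print(word_count_task("run once then exit"))

    # App: the same process serves request after request.
    server = start_service()
    port = server.server_address[1]
    for _ in range(2):
        print(get_json(f"http://127.0.0.1:{port}/"))
    server.shutdown()
    server.server_close()
```

In Flyte, the platform starts and stops the app's container for you; the sketch only shows the shape of the two workloads.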

AppEnvironment

Just as tasks use TaskEnvironment, apps use AppEnvironment to configure their runtime.

An AppEnvironment specifies:

Hardware: CPU, memory, GPU allocation
Software: Container image with dependencies
App-specific settings: Ports, scaling, authentication

Here’s a simple example:

import flyte
from flyte.app.extras import FastAPIAppEnvironment

env = FastAPIAppEnvironment(
    name="my-app",
    image=flyte.Image.from_debian_base().with_pip_packages("fastapi", "uvicorn"),
    resources=flyte.Resources(cpu="1", memory="2Gi"),
)

A hello world app

Let’s create a minimal FastAPI app to see how this works.

First, create hello_app.py:

hello_app.py
# /// script
# requires-python = "==3.13"
# dependencies = [
#    "flyte>=2.0.0b52",
#    "fastapi",
#    "uvicorn",
# ]
# ///

"""A simple "Hello World" FastAPI app example for serving."""

import pathlib

import flyte
from fastapi import FastAPI
from flyte.app.extras import FastAPIAppEnvironment

# Define a simple FastAPI application
app = FastAPI(
    title="Hello World API",
    description="A simple FastAPI application",
    version="1.0.0",
)

# Create an AppEnvironment for the FastAPI app
env = FastAPIAppEnvironment(
    name="hello-app",
    app=app,
    image=flyte.Image.from_debian_base(python_version=(3, 12)).with_pip_packages(
        "fastapi",
        "uvicorn",
    ),
    resources=flyte.Resources(cpu=1, memory="512Mi"),
    requires_auth=False,
)

# Define API endpoints
@app.get("/")
async def root():
    return {"message": "Hello, World!"}

@app.get("/health")
async def health_check():
    return {"status": "healthy"}

# Running this script deploys and serves the app on your Union/Flyte instance.
if __name__ == "__main__":
    # Initialize Flyte from a config file.
    flyte.init_from_config(root_dir=pathlib.Path(__file__).parent)

    # Serve the app remotely.
    app_instance = flyte.serve(env)

    # Print the app URL.
    print(app_instance.url)
    print("App 'hello-app' is now serving.")

Understanding the code

FastAPI() creates the web application with its endpoints
FastAPIAppEnvironment configures the container and resources
@app.get("/") defines an HTTP endpoint that returns a greeting
flyte.serve() deploys and starts the app on your Flyte backend

Serving the app

With your config file in place, serve the app:

flyte serve hello_app.py env

Or run the Python file directly (which calls flyte.serve() in the main block):

python hello_app.py

You’ll see output like:

https://my-instance.flyte.com/v2/domain/development/project/my-project/apps/hello-app
App 'hello-app' is now serving.

Click the link to view your app in the UI, where you can also find the app URL. Append /docs to the app URL to open FastAPI’s interactive API documentation.
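
Once the app is up, you can also check the / and /health endpoints from a script. This is a minimal sketch using only the Python standard library. The base URL is a placeholder, not a real address; replace it with the app URL shown in the UI. The helper names endpoint_url and get_json are made up for this illustration. It assumes the app was deployed with requires_auth=False, as in hello_app.py, so no credentials are sent.

```python
import json
import urllib.request
from urllib.parse import urljoin


def endpoint_url(app_url: str, path: str) -> str:
    """Join the app's base URL with an endpoint path such as "/health"."""
    return urljoin(app_url.rstrip("/") + "/", path.lstrip("/"))


def get_json(url: str, timeout: float = 10.0) -> dict:
    """Send a GET request and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder value: replace with the app URL shown in the UI.
    app_url = "https://hello-app.example.com"

    print(endpoint_url(app_url, "/health"))
    print(endpoint_url(app_url, "/docs"))

    # Uncomment once app_url points at your running app:
    # print(get_json(endpoint_url(app_url, "/")))
    # print(get_json(endpoint_url(app_url, "/health")))
```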

When to use apps vs tasks

Use tasks when:

Processing takes seconds to hours
You need durability (inputs/outputs tracked)
Work is triggered by events or schedules
Results need to be cached or resumed

Use apps when:

Responses must be fast (milliseconds)
You’re serving an API or dashboard
Users interact in real-time
You need a persistent endpoint

Model serving with FastAPI: Train a model with a Flyte pipeline, then serve predictions from it. During local development, the app loads the model from a local file. When deployed remotely, Flyte’s Parameter system automatically resolves the model from the latest training run output. See FastAPI app for the full example.

Agent UI with Gradio: Build an interactive UI that kicks off agent runs using flyte.with_runcontext(). A single RUN_MODE environment variable controls the deployment progression: fully local (rapid iteration), local UI with remote task execution (cluster compute), or fully remote (production). See Build apps for details.
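
The RUN_MODE switch described above could be read along these lines. This is a sketch under stated assumptions, not the code from the linked example. The source names the variable but not its values, so the three values "local", "hybrid", and "remote", the default of "local", and the names RunMode, resolve_run_mode, and placement are all hypothetical. It does not call any Flyte API; it only shows how one environment variable can select among the three stages.

```python
import os
from enum import Enum
from typing import Mapping


class RunMode(Enum):
    # Hypothetical values: the variable name comes from the text, the values do not.
    LOCAL = "local"    # UI and tasks both run on your machine
    HYBRID = "hybrid"  # UI runs locally, tasks run on the cluster
    REMOTE = "remote"  # UI and tasks both run remotely


def resolve_run_mode(environ: Mapping[str, str] = os.environ) -> RunMode:
    """Read RUN_MODE, defaulting to fully local when it is unset."""
    raw = environ.get("RUN_MODE", RunMode.LOCAL.value).strip().lower()
    try:
        return RunMode(raw)
    except ValueError:
        allowed = ", ".join(mode.value for mode in RunMode)
        raise ValueError(f"RUN_MODE must be one of: {allowed}; got {raw!r}") from None


def placement(mode: RunMode) -> dict:
    """Report where the UI and the tasks run for a given mode."""
    table = {
        RunMode.LOCAL: {"ui": "local", "tasks": "local"},
        RunMode.HYBRID: {"ui": "local", "tasks": "remote"},
        RunMode.REMOTE: {"ui": "remote", "tasks": "remote"},
    }
    return table[mode]


if __name__ == "__main__":
    mode = resolve_run_mode()
    print(mode.value, placement(mode))
```

Keeping the choice in one variable means the same script can move from rapid local iteration to production without code changes.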

Next steps

You now understand the core building blocks of Flyte:

TaskEnvironment and AppEnvironment configure where code runs
Tasks are functions that execute and complete
Apps are long-running services
Runs and Actions track executions

Before diving deeper, check out Key capabilities for an overview of what Flyte can do, from parallelism and caching to LLM serving and error recovery.

Then head to Basic project to build an end-to-end ML system with training tasks and a serving app.