Key capabilities

Now that you understand the core concepts – TaskEnvironment, tasks, runs, and apps – here’s an overview of what Flyte can do. Each capability is covered in detail later in the documentation.

Environment and resources

Configure how and where your code runs.

Multiple environments: Create separate configurations for different use cases (dev, prod, GPU vs CPU) → Multiple environments
Resource specification: Request specific CPU, memory, GPU, and storage for your tasks → Resources
Reusable containers: Eliminate container startup overhead with pooled, warm containers for millisecond-level task scheduling → Reusable containers

Deployment

Get your code running remotely.

Cloud image building: Build container images remotely without needing local Docker → Container images
Code packaging: Your local code is automatically bundled and shipped to the remote execution environment → Packaging
Local testing: Run tasks locally with flyte run --local before deploying them → How task run works

Data handling

Pass data efficiently between tasks.

Files and directories: Pass large files and directories between tasks using flyte.io.File and flyte.io.Dir → Files and directories
DataFrames: Work with pandas, Polars, and other DataFrame types natively → DataFrames

Parallelism and composition

Scale out and compose workflows.

Fanout parallelism: Process items in parallel using flyte.map or asyncio.gather → Fanout
Remote tasks: Call previously deployed tasks from within your workflows → Remote tasks
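
As a rough illustration of the asyncio.gather style of fanout, the sketch below starts one coroutine per item and collects the results in input order. It uses plain coroutines as stand-ins for tasks so that it runs without a cluster. The names `process_item` and `fan_out` are hypothetical, and in a real project the per-item function would be a task defined on your environment.

```python
import asyncio


async def process_item(item: int) -> int:
    """Stand-in for a per-item task (hypothetical); squares its input."""
    await asyncio.sleep(0)  # yield control, as a real awaitable call would
    return item * item


async def fan_out(items: list[int]) -> list[int]:
    """Start one coroutine per item and wait for all of them to finish."""
    # gather returns results in the same order as the awaitables passed in
    return list(await asyncio.gather(*(process_item(i) for i in items)))


results = asyncio.run(fan_out([1, 2, 3, 4]))
```

Because gather preserves input order, each result can be matched to its item by position even when the coroutines finish out of order.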

Security and automation

Manage credentials and automate execution.

Secrets: Inject API keys, passwords, and other credentials securely into tasks → Secrets
Triggers: Schedule tasks on a cron schedule or trigger them from external events → Triggers
Webhooks: Build APIs that trigger task execution from external systems → App usage patterns

Durability and reliability

Handle failures and avoid redundant work.

Error handling: Catch failures and retry with different resources (e.g., more memory) → Error handling
Retries and timeouts: Configure automatic retries and execution time limits → Retries and timeouts
Caching: Add cache="auto" to any task and Flyte stores its outputs keyed on task name and inputs. Calling the task again with the same inputs returns the stored outputs without recomputation. This speeds up your development loop: skip re-downloading data, avoid replaying earlier steps in agentic chains, or bypass any expensive computation while you iterate. → Caching
```
@env.task(cache="auto")
async def load_data(data_dir: str = "./data") -> str:
    """Downloads once, then returns instantly on subsequent runs."""
    # ... expensive download ...
    return data_dir
```
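
To make "keyed on task name and inputs" concrete, here is a toy in-memory cache in plain Python. It is only a sketch of the idea, not how Flyte implements caching: `cache_key`, `cached_call`, and the dictionary store are hypothetical, and a real cache key may include more than is shown here.

```python
import hashlib
import json
from typing import Any, Callable

_store: dict[str, Any] = {}  # hypothetical in-memory stand-in for a cache backend
calls = {"count": 0}         # counts how often the expensive function actually runs


def cache_key(task_name: str, inputs: dict[str, Any]) -> str:
    """Derive a deterministic key from the task name and its inputs."""
    payload = json.dumps({"task": task_name, "inputs": inputs}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(fn: Callable[..., Any], **inputs: Any) -> Any:
    """Return the stored output for (task name, inputs), computing it on a miss."""
    key = cache_key(fn.__name__, inputs)
    if key not in _store:
        _store[key] = fn(**inputs)
    return _store[key]


def load_data(data_dir: str = "./data") -> str:
    calls["count"] += 1  # pretend this is the expensive download
    return data_dir


first = cached_call(load_data, data_dir="./data")
second = cached_call(load_data, data_dir="./data")  # same inputs: cache hit
third = cached_call(load_data, data_dir="./other")  # new inputs: recomputed
```

In this sketch the second call never runs `load_data`, while the third does because its input differs.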

Traces: Use @flyte.trace to get visibility into the internal steps of a task without the overhead of making each step a separate task. Traced functions show up as child nodes under their parent task, each with its own timing, inputs, and outputs. This is particularly useful for AI agents where you want to see which tools were called. → Traces

```
@flyte.trace
async def search(query: str) -> str:
    """Shows up as a child node under the parent task."""
    return await do_search(query)

@env.task
async def agent(request: str) -> str:
    results = await search(request)    # Traced
    answer = await summarize(results)   # Also traced if decorated
    return answer
```
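
For intuition about what a traced call records, the following plain-Python decorator captures the name, inputs, output, and duration of each awaited function. It is a conceptual sketch under stated assumptions, not the implementation of @flyte.trace: the `trace` decorator and `trace_log` list here are hypothetical, and the stand-in `search` and `summarize` functions return canned strings.

```python
import asyncio
import functools
import time
from typing import Any, Awaitable, Callable

# Hypothetical in-memory record of traced calls; a real system would
# attach these records to the parent task's run instead.
trace_log: list[dict[str, Any]] = []


def trace(fn: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
    """Record the name, inputs, output, and duration of each awaited call."""

    @functools.wraps(fn)
    async def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        result = await fn(*args, **kwargs)
        trace_log.append(
            {
                "name": fn.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "seconds": time.perf_counter() - start,
            }
        )
        return result

    return wrapper


@trace
async def search(query: str) -> str:
    return f"results for {query}"  # canned stand-in for a real search


@trace
async def summarize(results: str) -> str:
    return results.upper()  # canned stand-in for a real summarizer


async def agent(request: str) -> str:
    return await summarize(await search(request))


answer = asyncio.run(agent("flyte"))
```

After the run, `trace_log` holds one entry per traced call in the order the calls completed, which is the kind of per-step record that makes an agent's tool calls inspectable.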

Reports: Add report=True to a task and it can generate an HTML report (charts, tables, images) saved alongside the task output. Combined with caching and persisted inputs/outputs, reports act as lightweight experiment tracking: each run produces a self-contained HTML file you can compare across runs and share with your team. → Reports

```
import flyte.report
from flyte.io import File

@env.task(report=True)
async def evaluate(model_file: File, test_data: str) -> str:
    # ... run evaluation ...
    # `accuracy` (a float) is assumed to come from the elided evaluation step
    await flyte.report.replace.aio(
        f"<h2>Training Report</h2>"
        f"<h3>Test Results</h3>"
        f"<p>Accuracy: {accuracy:.4f}</p>"
    )
    await flyte.report.flush.aio()
    return f"Accuracy: {accuracy:.4f}"
```
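
The error-handling item listed under Durability and reliability describes catching a failure and retrying with more resources. The sketch below shows that control flow in plain Python so it can run anywhere. Everything in it is hypothetical: the `OutOfMemory` exception, the `train` function, and its made-up rule of 1 GiB per 1000 rows merely stand in for a real task and a real out-of-memory failure.

```python
import asyncio


class OutOfMemory(Exception):
    """Hypothetical stand-in for an out-of-memory task failure."""


async def train(rows: int, memory_gib: int) -> str:
    """Pretend task that needs 1 GiB per 1000 rows (made-up rule)."""
    if rows / 1000 > memory_gib:
        raise OutOfMemory(f"{memory_gib} GiB is not enough for {rows} rows")
    return f"trained on {rows} rows with {memory_gib} GiB"


async def train_with_fallback(rows: int, max_memory_gib: int = 16) -> str:
    """Retry with double the memory after each failure, up to a ceiling."""
    memory_gib = 2
    while True:
        try:
            return await train(rows, memory_gib)
        except OutOfMemory:
            if memory_gib >= max_memory_gib:
                raise  # give up once the ceiling has been tried
            memory_gib *= 2


outcome = asyncio.run(train_with_fallback(5000))
```

Capping the escalation keeps a persistently failing task from requesting unbounded resources; once the ceiling has been tried, the original error propagates to the caller.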

Apps and serving

Deploy long-running services.

FastAPI apps: Deploy REST APIs and webhooks → FastAPI app
LLM serving: Serve large language models with vLLM or SGLang → vLLM app, SGLang app
Autoscaling: Scale apps up and down based on traffic, including scale-to-zero → Autoscaling apps
Streamlit dashboards: Deploy interactive data dashboards → Streamlit app

Notebooks

Work interactively.

Jupyter support: Author and run workflows directly from Jupyter notebooks, and fetch workflow metadata (inputs, outputs, logs) → Notebooks

Next steps

Ready to put it all together? Head to Basic project to build an end-to-end ML system with training tasks and a serving app.