# App Serving

> **📝 Note**
>
> An LLM-optimized bundle of this entire section is available at [`section.md`](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/section.md).
> This single file contains all pages in this section, optimized for AI coding agent context.

Union.ai lets you build and serve your own web apps, including:

- **Model endpoints** with generic web frameworks like FastAPI or optimized inference frameworks like vLLM and SGLang.
- **AI inference-time components** like MCP servers and ephemeral agent memory state stores.
- **Interactive dashboards** and other interfaces to interact with and visualize data and models from your workflows, using frameworks like Streamlit, Gradio, TensorBoard, FastHTML, Dash, Panel, Voila, and FiftyOne.
- **Flyte Connectors**, which are [lightweight, long-running services](https://www.union.ai/docs/v1/union/integrations/connectors/_index) that connect to external services like OpenAI, BigQuery, and Snowflake.
- **Any other web services** like [webhooks](https://www.union.ai/docs/v1/union/tutorials/serving/custom-webhooks/page.md) that can be implemented with web frameworks like FastAPI or Starlette.

## Example app

We will start with a simple Streamlit app. In this case we will use the default
Streamlit "Hello, World!" app.

In a local directory, create the following file:

```shell
└── app.py
```

## App declaration

The file `app.py` contains the app declaration:

```python
"""A simple Union.ai app using Streamlit"""

import union

# The `ImageSpec` for the container that will run the `App`.
# `union-runtime` must be declared as a dependency,
# in addition to any other dependencies needed by the app code.
# `builder="union"` builds the image with the Union.ai remote image builder.
image = union.ImageSpec(
    name="streamlit-app",
    packages=["union-runtime>=0.1.18", "streamlit==1.51.0"],
    builder="union",
)

# The `App` declaration.
# Uses the `ImageSpec` declared above.
# In this case we do not need to supply any app code
# as we are using the built-in Streamlit `hello` app.
app = union.app.App(
    name="streamlit-hello",
    container_image=image,
    args="streamlit hello --server.port 8080",
    port=8080,
    limits=union.Resources(cpu="1", mem="1Gi"),
)
```

Here the `App` constructor is initialized with the following parameters:

* `name`: The name of the app. This name will be displayed in app listings (via CLI and UI) and used to refer to the app when deploying and stopping.
* `container_image`: The container image used to run the app. Here we pass the `ImageSpec` declared above, which installs `union-runtime` and Streamlit and is built with the Union.ai remote image builder.
* `args`: The command used within the container to start the app. It can be given as a single string, as in this example, or as a list of strings, which are concatenated and then invoked as a single command.
* `port`: The port of the app container from which the app will be served.
* `limits`: A `union.Resources` object defining the resource limits for the app container.
  The same object is used for the same purpose in the `@union.task` decorator in Union.ai workflows.
  See [The requests and limits settings](https://www.union.ai/docs/v1/union/user-guide/core-concepts/tasks/task-hardware-environment/customizing-task-resources/page.md#the-requests-and-limits-settings) for details.

The parameters above are the minimum needed to initialize the app.

A few additional parameters are available that we do not use in this example (we will cover them later):

* `include`: A list of files to be added to the container at deployment time, containing the custom code that defines the specific functionality of your app.
* `inputs`: A list of `union.app.Input` objects. Used to provide default inputs to the app on startup.
* `requests`: A `union.Resources` object defining the resource requests for the app container. The same object is used for the same purpose in the `@union.task` decorator in Union.ai workflows (see [The requests and limits settings](https://www.union.ai/docs/v1/union/user-guide/core-concepts/tasks/task-hardware-environment/customizing-task-resources/page.md#the-requests-and-limits-settings) for details).
* `min_replicas`: The minimum number of replica containers permitted for this app.
  This defines the lower bound for auto-scaling the app. The default is 0.
* `max_replicas`: The maximum number of replica containers permitted for this app.
  This defines the upper bound for auto-scaling the app. The default is 1.
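
As a rough sketch of how these optional parameters fit together, the declaration below extends the example above. The parameter names are the ones listed here, but the app name `streamlit-custom`, the file `main.py`, and the specific resource and replica values are hypothetical and chosen only for illustration. `inputs` is left out because its details are covered separately.

```python
import union

image = union.ImageSpec(
    name="streamlit-app",
    packages=["union-runtime>=0.1.18", "streamlit==1.51.0"],
    builder="union",
)

app = union.app.App(
    name="streamlit-custom",  # hypothetical app name
    container_image=image,
    # Assumes a hypothetical `main.py` Streamlit script next to this file.
    args="streamlit run main.py --server.port 8080",
    port=8080,
    include=["main.py"],  # added to the container at deployment time
    requests=union.Resources(cpu="1", mem="1Gi"),  # illustrative values
    limits=union.Resources(cpu="2", mem="2Gi"),  # illustrative values
    min_replicas=0,  # same as the default lower bound
    max_replicas=2,  # illustrative; the default is 1
)
```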

## Deploy the app

Deploy the app with:

```shell
$ union deploy apps APP_FILE APP_NAME
```

* `APP_FILE` is the Python file that contains one or more app declarations.
* `APP_NAME` is the name of one of the apps declared in `APP_FILE`. The name of an app is the value of the `name` parameter passed into the `App` constructor.

If an app with the name `APP_NAME` does not yet exist on the system, this command creates the app and starts it.
If an app by that name already exists, this command stops the app, updates its code, and restarts it.
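
If you want to trigger a deployment from a script (for example in CI), one option is to wrap the CLI call in Python. The following is a minimal sketch using only the standard library; the helper names `deploy_command` and `deploy` are hypothetical, and it assumes the `union` CLI is installed and authenticated on the machine running the script.

```python
import shlex
import subprocess


def deploy_command(app_file: str, app_name: str) -> list[str]:
    """Build the argument list for `union deploy apps APP_FILE APP_NAME`."""
    return ["union", "deploy", "apps", app_file, app_name]


def deploy(app_file: str, app_name: str) -> None:
    """Run the deploy command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(deploy_command(app_file, app_name), check=True)


if __name__ == "__main__":
    # Print the command for review instead of running it.
    # Call deploy("app.py", "streamlit-hello") to actually deploy.
    print(shlex.join(deploy_command("app.py", "streamlit-hello")))
```

Passing the command as a list rather than a single string avoids shell quoting problems if a file or app name ever contains unusual characters.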

For the example app declared above, run:

```shell
$ union deploy apps app.py streamlit-hello
```

This will return output like the following:

```shell
✨ Creating Application: streamlit-hello
Created Endpoint at: https://withered--firefly--8ca31.apps.demo.hosted.unionai.cloud/
```

Click on the displayed endpoint to go to the app:

![A simple app](https://www.union.ai/docs/v1/union/_static/images/user-guide/core-concepts/serving/streamlit-hello.png)

## Viewing deployed apps

Go to **Apps** in the left sidebar in Union.ai to see a list of all your deployed apps:

![Apps list](https://www.union.ai/docs/v1/union/_static/images/user-guide/core-concepts/serving/apps-list.png)

To connect to an app, click its **Endpoint**.
To see more information about the app, click its **Name**.
This takes you to the **App view**:

![App view](https://www.union.ai/docs/v1/union/_static/images/user-guide/core-concepts/serving/app-view.png)

Buttons to **Copy Endpoint** and **Start app** are available at the top of the view.

You can also view all apps deployed in your Union.ai instance from the command-line with:

```shell
$ union get apps
```

This will display the app list:

```shell
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━┳━━━━━━━━┓
┃ Name                                    ┃ Link       ┃ Status     ┃ Desired State ┃ CPU ┃ Memory ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━╇━━━━━━━━┩
│ streamlit-query-2                       │ Click Here │ Started    │ Stopped       │ 2   │ 2Gi    │
│ streamlit-demo-1                        │ Click Here │ Started    │ Started       │ 3   │ 2Gi    │
│ streamlit-query-3                       │ Click Here │ Started    │ Started       │ 2   │ 2Gi    │
│ streamlit-demo                          │ Click Here │ Unassigned │ Started       │ 2   │ 2Gi    │
└─────────────────────────────────────────┴────────────┴────────────┴───────────────┴─────┴────────┘
```
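
If you capture this output in a script, the table can be turned into structured rows. The sketch below assumes the output has exactly the box-drawn layout shown above; that layout is meant for people rather than programs and may change or wrap with terminal width, so treat this as illustrative only. The function names `parse_apps_table` and `out_of_sync` are hypothetical, and comparing `Status` with `Desired State` is just one simple heuristic for spotting apps that have not yet reached their target state.

```python
SAMPLE = """\
┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━┳━━━━━━━━┓
┃ Name              ┃ Link       ┃ Status     ┃ Desired State ┃ CPU ┃ Memory ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━╇━━━━━━━━┩
│ streamlit-query-2 │ Click Here │ Started    │ Stopped       │ 2   │ 2Gi    │
│ streamlit-demo-1  │ Click Here │ Started    │ Started       │ 3   │ 2Gi    │
│ streamlit-demo    │ Click Here │ Unassigned │ Started       │ 2   │ 2Gi    │
└───────────────────┴────────────┴────────────┴───────────────┴─────┴────────┘
"""


def parse_apps_table(text: str) -> list[dict[str, str]]:
    """Parse the box-drawn table into one dict per app row."""
    header: list[str] = []
    rows: list[dict[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("┃"):  # the header row uses heavy vertical bars
            header = [cell.strip() for cell in line.strip("┃").split("┃")]
        elif line.startswith("│"):  # data rows use light vertical bars
            cells = [cell.strip() for cell in line.strip("│").split("│")]
            rows.append(dict(zip(header, cells)))
    return rows


def out_of_sync(rows: list[dict[str, str]]) -> list[str]:
    """Return names of apps whose Status differs from their Desired State."""
    return [row["Name"] for row in rows if row["Status"] != row["Desired State"]]


if __name__ == "__main__":
    print(out_of_sync(parse_apps_table(SAMPLE)))
```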

## Stopping apps

To stop an app from the command line, run the following command:

```shell
$ union stop apps --name APP_NAME
```

`APP_NAME` is the name of an app deployed on the Union.ai instance.
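
To stop several apps in one go, the same command can be issued once per app from a script. This is a minimal sketch under the assumption that the `union` CLI is installed and authenticated; the helper names `stop_command` and `stop_apps` are hypothetical. It defaults to a dry run that only returns the commands so they can be reviewed first.

```python
import subprocess


def stop_command(app_name: str) -> list[str]:
    """Build the argument list for `union stop apps --name APP_NAME`."""
    return ["union", "stop", "apps", "--name", app_name]


def stop_apps(app_names: list[str], dry_run: bool = True) -> list[list[str]]:
    """Build one stop command per app and, unless dry_run is set, run each."""
    commands = [stop_command(name) for name in app_names]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in stop_apps(["streamlit-hello"]):
        print(" ".join(command))
```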

## Subpages

- [Serving custom code](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/adding-your-own-code/page.md)
  - Example app
  - App declaration
  - Custom code
  - Deploy the app
  - App deployment with included files
- [Serving a Model from a Workflow With FastAPI](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/serving-a-model/page.md)
  - Example app
  - App configuration
  - Training workflow
  - Run the example
- [API Key Authentication with FastAPI](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/fast-api-auth/page.md)
  - Define the Fast API app
  - Deploy the Fast API app
- [Cache a Huggingface Model as an Artifact](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/cache-huggingface-model/page.md)
  - Why Cache Models from HuggingFace?
  - Prerequisites
  - Basic Example: Cache a Model As-Is
  - Command Breakdown
  - Output
  - Using Cached Models in Applications
  - VLLM App Example
  - SGLang App Example
  - Advanced Example: Sharding a Model with the vLLM Engine
  - Create a Shard Configuration File
  - Cache the Sharded Model
  - Best Practices
- [Deploy Optimized LLM Endpoints with vLLM and SGLang](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/deploy-optimized-llm-endpoints/page.md)
  - Overview
  - Basic Example: Deploy a Non-Sharded Model
  - Deploy with vLLM
  - Deploy with SGLang
  - Custom Image Example: Deploy with Your Own Image
  - Advanced Example: Deploy a Sharded Model
  - Cache a Sharded Model
  - Deploy with VLLMApp
  - Deploy with SGLangApp
  - Authentication via API Key
  - Performance Tuning
- [Deploy Custom Flyte Connectors](https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/deploying-your-connector/page.md)
  - Overview
  - Prerequisites
  - Connector Deployment Options
  - Module-based Deployment
  - ImageSpec-based Deployment
  - Managing Secrets
  - Example: Creating a ChatGPT Connector
  - Using the Connector in a Workflow
  - Creating Your Own Connector
  - Deployment Commands
  - Best Practices

---
**Source**: https://github.com/unionai/unionai-docs/blob/main/content/user-guide/core-concepts/serving/_index.md
**HTML**: https://www.union.ai/docs/v1/union/user-guide/core-concepts/serving/
