# Configuration
> This bundle contains all pages in the Configuration section.
> Source: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration ===

# Advanced Configurations

> **📝 Note**
>
> An LLM-optimized bundle of this entire section is available at [`section.md`](section.md).
> This single file contains all pages in this section, optimized for AI coding agent context.

This section covers the configuration of Union features on your Union.ai cluster.

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/helm ===

# Helm Values

A full list of Helm values available for configuration can be found here:

* [Data plane chart](https://github.com/unionai/helm-charts/tree/main/charts/dataplane)
* [Data plane CRD chart](https://github.com/unionai/helm-charts/tree/main/charts/dataplane-crds)
* [Knative Operator (for serving)](https://github.com/unionai/helm-charts/tree/main/charts/knative-operator)

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/node-pools ===

# Configuring Service and Worker Node Pools

As a best practice, we recommend using separate node pools for the Union services and the Union worker pods. This
guards against resource contention between Union services and other workloads running in your cluster.

Start by creating two node pools in your cluster: one for the Union services and one for the Union worker pods.
Label the services node pool with `union.ai/node-role: services` and the worker pool with
`union.ai/node-role: worker`. You will also need to taint the nodes in both pools to ensure that only the
appropriate pods are scheduled on them.

The nodes for Union services should be tainted with:

```shell
kubectl taint nodes <node-name> union.ai/node-role=services:NoSchedule
```

The nodes for execution workers should be tainted with:

```shell
kubectl taint nodes <node-name> union.ai/node-role=worker:NoSchedule
```
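The labels described above can also be applied manually with `kubectl`. A minimal sketch, assuming your provisioning tool does not label nodes for you (`<node-name>` is a placeholder):

```shell
# Label the nodes in the services pool
kubectl label nodes <node-name> union.ai/node-role=services

# Label the nodes in the worker pool
kubectl label nodes <node-name> union.ai/node-role=worker
```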

Vendor interfaces and provisioning tools may support tainting nodes automatically through configuration options.

Set the scheduling constraints for the Union services in your values file:

```yaml
scheduling:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: union.ai/node-role
            operator: In
            values:
            - services
  tolerations:
    - effect: NoSchedule
      key: union.ai/node-role
      operator: Equal
      value: services
```

To ensure that your worker pods are scheduled on the worker node pool, set the following for the Flyte Kubernetes plugin:

```yaml
config:
  k8s:
    plugins:
      k8s:
        default-affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: union.ai/node-role
                  operator: In
                  values:
                  - worker
        default-tolerations:
          - effect: NoSchedule
            key: union.ai/node-role
            operator: Equal
            value: worker
```
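Once the values are applied, a quick way to confirm that your node pools carry the expected labels is to list nodes with the label shown as a column (assuming `kubectl` access to the cluster):

```shell
# Show each node alongside its union.ai/node-role label
kubectl get nodes -L union.ai/node-role
```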

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/code-viewer ===

# Code Viewer

The Union UI allows you to view the exact code that executed a specific task. Union securely transfers the [code bundle](https://www.union.ai/docs/v1/union/user-guide/development-cycle/running-your-code) directly to your browser without routing it through the control plane.

![Code Viewer](https://www.union.ai/docs/v1/union/_static/images/deployment/configuration/code-viewer/demo.png)

## Enable CORS policy on your fast registration bucket

To support this feature securely, your bucket must allow CORS access from Union. The configuration steps vary depending on your cloud provider.

### AWS S3 Console

1. Open the AWS Console.
2. Navigate to the S3 dashboard.
3. Select your fast registration bucket. By default, this is the same as the metadata bucket configured during initial deployment.
4. Click the **Permissions** tab and scroll to **Cross-origin resource sharing (CORS)**.
5. Click **Edit** and enter the following policy:
![S3 CORS Policy](https://www.union.ai/docs/v1/union/_static/images/deployment/configuration/code-viewer/s3.png)

```json
[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET",
            "HEAD"
        ],
        "AllowedOrigins": [
            "https://*.unionai.cloud"
        ],
        "ExposeHeaders": [
            "ETag"
        ],
        "MaxAgeSeconds": 3600
    }
]
```

For more details, see the [AWS S3 CORS documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/cors.html).
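If you prefer the AWS CLI over the console, the same policy can be applied and verified with `s3api`. A sketch, assuming the policy above has been saved locally as `cors.json` and `<bucket-name>` is your fast registration bucket:

```shell
# Apply the CORS policy from a local file
aws s3api put-bucket-cors --bucket <bucket-name> --cors-configuration file://cors.json

# Verify the applied policy
aws s3api get-bucket-cors --bucket <bucket-name>
```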

### Google Cloud Storage

Google Cloud Storage requires CORS configuration via the command line.

1. Create a `cors.json` file with the following content:
    ```json
    [
        {
            "origin": ["https://*.unionai.cloud"],
            "method": ["HEAD", "GET"],
            "responseHeader": ["ETag"],
            "maxAgeSeconds": 3600
        }
    ]
    ```
2. Apply the CORS configuration to your bucket:
    ```bash
    gcloud storage buckets update gs://<fast_registration_bucket> --cors-file=cors.json
    ```
3. Verify the configuration was applied:
   ```bash
   gcloud storage buckets describe gs://<fast_registration_bucket> --format="default(cors_config)"

   cors_config:
   - maxAgeSeconds: 3600
     method:
     - GET
     - HEAD
     origin:
     - https://*.unionai.cloud
     responseHeader:
     - ETag
   ```
For more details, see the [Google Cloud Storage CORS documentation](https://docs.cloud.google.com/storage/docs/using-cors#command-line).

### Azure Storage

For Azure Storage CORS configuration, see the [Azure Storage CORS documentation](https://learn.microsoft.com/en-us/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services).
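As a sketch, a comparable rule can be added for blob storage with the Azure CLI; `<account-name>` is a placeholder, and you should verify the flags against the linked documentation before use:

```shell
# Add a CORS rule for blob storage allowing GET/HEAD from Union
az storage cors add \
  --services b \
  --methods GET HEAD \
  --origins "https://*.unionai.cloud" \
  --allowed-headers "*" \
  --exposed-headers "ETag" \
  --max-age 3600 \
  --account-name <account-name>
```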

## Troubleshooting

| Error Message | Cause | Fix |
|---------------|-------|-----|
| `Not available: No code available for this action.` | The task does not have a code bundle. This occurs when the code is baked into the Docker image or the task is not a code-based task. | This is expected behavior for tasks without code bundles. |
| `Not Found: The code bundle file could not be found. This may be due to your organization's data retention policy.` | The code bundle was deleted from the bucket, likely due to a retention policy. | Review your fast registration bucket's retention policy settings. |
| `Error: Code download is blocked by your storage bucket's configuration. Please contact your administrator to enable access.` | CORS is not configured on the bucket. | Configure CORS on your bucket using the instructions above. |

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/image-builder ===

# Image Builder

Union Image Builder can build container images within the data plane. This enables the use of the `remote` builder type for any defined [Container Image](https://www.union.ai/docs/v1/union/user-guide/core-concepts/tasks/task-software-environment/image-spec).

Configure the use of remote image builder:
```bash
flyte create config --builder=remote --endpoint...
```

Write custom [container images](https://www.union.ai/docs/v1/union/user-guide/core-concepts/tasks/task-software-environment/image-spec):
```python
env = flyte.TaskEnvironment(
    name="hello_v2",
    image=flyte.Image.from_debian_base()
        .with_pip_packages("<package 1>", "<package 2>")
)
```

> By default, Image Builder is disabled and must be enabled by setting the builder type to `remote` in your Flyte config.

## Requirements

* The image building process runs in the target run's project and domain. Any secrets needed to push images to the registry must be accessible from the project and domain where the build happens.

## Configuration

Image Builder is configured directly through Helm values.

```yaml
imageBuilder:

  # Enable Image Builder
  enabled: true

  # -- The config map build-image container task attempts to reference.
  # -- Should not change unless coordinated with Union technical support.
  targetConfigMapName: "build-image-config"

  # -- The URI of the buildkitd service. Used for externally managed buildkitd services.
  # -- Leaving empty and setting imageBuilder.buildkit.enabled to true will create a buildkitd service and configure the Uri appropriately.
  # -- E.g. "tcp://buildkitd.buildkit.svc.cluster.local:1234"
  buildkitUri: ""

  # -- The default repository to publish images to when "registry" is not specified in ImageSpec.
  # -- Note, the build-image task will fail unless "registry" is specified or a default repository is provided.
  defaultRepository: ""

  # -- How the build-image task and operator proxy will attempt to authenticate against the default repository.
  # -- Supported values are "noop", "google", "aws", "azure"
  # -- "noop" no authentication is attempted
  # -- "google" uses docker-credential-gcr to authenticate to the default registry
  # -- "aws" uses docker-credential-ecr-login to authenticate to the default registry
  # -- "azure" uses az acr login to authenticate to the default registry. Requires Azure Workload Identity to be enabled.
  authenticationType: "noop"

  buildkit:

    # -- Enable buildkit service within this release.
    enabled: true

    # Configuring Union managed buildkitd Kubernetes resources.
    ...
```

## Authentication

### AWS

By default, Union is intended to be configured to use [IAM roles for service accounts (IRSA)](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html) for authentication. Setting `authenticationType` to `aws` configures Union image builder related services to use the AWS default credential chain. Additionally, Union image builder uses [`docker-credential-ecr-login`](https://github.com/awslabs/amazon-ecr-credential-helper) to authenticate to the ECR repository configured with `defaultRepository`.

`defaultRepository` should be the fully qualified ECR repository name, e.g. `<AWS_ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<REPOSITORY_NAME>`.
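Putting the settings above together, a minimal Helm values fragment for ECR might look like the following (the account ID, region, and repository name are placeholders you supply):

```yaml
imageBuilder:
  enabled: true
  authenticationType: "aws"
  defaultRepository: "<AWS_ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<REPOSITORY_NAME>"
```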

Therefore, it is necessary to configure the user role with the following permissions.

```json
{
  "Effect": "Allow",
  "Action": [
    "ecr:GetAuthorizationToken"
  ],
  "Resource": "*"
},
{
  "Effect": "Allow",
  "Action": [
    "ecr:BatchCheckLayerAvailability",
    "ecr:BatchGetImage",
    "ecr:GetDownloadUrlForLayer"
  ],
  "Resource": "*"
  // Or
  // "Resource": "arn:aws:ecr:<AWS_REGION>:<AWS_ACCOUNT_ID>:repository/<REPOSITORY>"
}
```

Similarly, the `operator-proxy` requires the following permissions:

```json
{
  "Effect": "Allow",
  "Action": [
    "ecr:GetAuthorizationToken"
  ],
  "Resource": "*"
},
{
  "Effect": "Allow",
  "Action": [
    "ecr:DescribeImages"
  ],
  "Resource": "arn:aws:ecr:<AWS_REGION>:<AWS_ACCOUNT_ID>:repository/<REPOSITORY>"
}
```

#### AWS Cross Account access

Access to repositories that do not exist in the same AWS account as the data plane requires additional ECR resource-based permissions. An ECR policy like the following is required if the configured `defaultRepository` or `ImageSpec`'s `registry` exists in an AWS account different from the dataplane's.

```json
{
  "Statement": [
    {
      "Sid": "AllowPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<DATAPLANE_AWS_ACCOUNT>:role/<user-role>",
          "arn:aws:iam::<DATAPLANE_AWS_ACCOUNT>:role/<node-role>",
          // ... Additional roles that require image pulls
        ]
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
      ]
    },
    {
      "Sid": "AllowDescribeImages",
      "Action": [
        "ecr:DescribeImages"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::<DATAPLANE_AWS_ACCOUNT>:role/<operator-proxy-role>"
        ]
      },
      "Effect": "Allow"
    },
    {
      "Sid": "ManageRepositoryContents"
      // ...
    }
  ],
  "Version": "2012-10-17"
}
```

To support a private ImageSpec `base_image`, the following permissions are required.

```json
{
  "Statement": [
    {
      "Sid": "AllowPull",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<DATAPLANE_AWS_ACCOUNT>:role/<user-role>",
          "arn:aws:iam::<DATAPLANE_AWS_ACCOUNT>:role/<node-role>",
          // ... Additional roles that require image pulls
        ]
      },
      "Action": [
        "ecr:BatchCheckLayerAvailability",
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
      ]
    }
  ]
}
```

### Google Cloud Platform

By default, GCP uses [Kubernetes Service Accounts to GCP IAM](https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#kubernetes-sa-to-iam) for authentication. Setting `authenticationType` to `google` configures Union image builder related services to use GCP default credential chain. Additionally, Union image builder uses [`docker-credential-gcr`](https://github.com/GoogleCloudPlatform/docker-credential-gcr) to authenticate to the Google artifact registries referenced by `defaultRepository`.

`defaultRepository` should be the fully qualified repository name, optionally followed by an image name prefix: `<GCP_LOCATION>-docker.pkg.dev/<GCP_PROJECT_ID>/<REPOSITORY_NAME>/<IMAGE_PREFIX>`.

It is necessary to configure the GCP user service account with `iam.serviceAccounts.signBlob` project level permissions.

#### GCP Cross Project access

Access to registries that do not exist in the same GCP project as the data plane requires additional GCP permissions.

* Configure the user "role" service account with the `Artifact Registry Writer` role.
* Configure the GCP worker node and union-operator-proxy service accounts with the `Artifact Registry Reader` role.

### Azure

By default, Union is designed to use Azure [Workload Identity Federation](https://learn.microsoft.com/en-us/azure/aks/workload-identity-deploy-cluster) for authentication using [user-assigned managed identities](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/how-manage-user-assigned-managed-identities?pivots=identity-mi-methods-azp) in place of AWS IAM roles.

* Configure the user "role" user-assigned managed identity with the `AcrPush` role.
* Configure the Azure kubelet identity and the operator-proxy user-assigned managed identity with the `AcrPull` role.

### Private registries

Follow guidance in this section to integrate Image Builder with private registries:

#### GitHub Container Registry

1. Follow the [GitHub guide](https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry) to log in to the registry locally.
2. Create a Union secret:
```bash
flyte create secret --type image_pull --from-docker-config --registries ghcr.io SECRET_NAME
```

> This secret will be available to all projects and domains in your tenant. [Learn more about Union Secrets](./union-secrets).
> For alternative ways to create image pull secrets, see the [API reference](https://www.union.ai/docs/v1/union/api-reference/union-cli).

3. Reference this secret in the Image object:

```python
env = flyte.TaskEnvironment(
    name="hello_v2",
    # Allow Image Builder to pull from and push to the private registry. The `registry` field isn't required
    # if a default repository is configured in the imageBuilder section of the Helm chart values file.
    image=flyte.Image.from_debian_base(registry="<my registry url>", name="private", registry_secret="<YOUR_SECRET_NAME>")
        .with_pip_packages("<package 1>", "<package 2>"),
    # Mount the same secret to allow tasks to pull that image
    secrets=["<YOUR_SECRET_NAME>"]
)
```

This enables Image Builder to push images and layers to a private GHCR repository. It also allows pods for this task environment to pull the image at runtime.

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/multi-cluster ===

# Multiple Clusters

Union enables you to integrate multiple Kubernetes clusters into a single Union control plane using the `clusterPool` abstraction.

Currently, the clusterPool configuration is performed by the Union team in the control plane. You provide the mapping between clusterPool names and cluster names using the following structure:

```yaml
clusterPoolname:
  - clusterName
```

Here, `clusterName` must match the name you used when installing the Union operator Helm chart.

You can have as many cluster pools as needed:

**Example**

```yaml
default: # this is the clusterPool where executions will run, unless another mapping is specified
  - my-dev-cluster
development-cp:
  - my-dev-cluster
staging-cp:
  - my-staging-cluster
production-cp:
  - production-cluster-1
  - production-cluster-2
dr-region:
  - dr-site-cluster
```

## Using cluster pools

Once the Union team configures the clusterPools in the control plane, you can proceed to configure mappings:

### project-domain-clusterPool mapping

1. Create a YAML file that includes the project, domain, and clusterPool:

**Example: cpa-dev.yaml**

```yaml
domain: development
project: flytesnacks
clusterPoolName: development-cp
```

2. Update the control plane with this mapping:

```bash
uctl update cluster-pool-attributes --attrFile cpa-dev.yaml
```

3. New executions in `flytesnacks-development` should now run on `my-dev-cluster`.

### project-domain-workflow-clusterPool mapping

1. Create a YAML file that includes the project, domain, workflow, and clusterPool:

**Example: cpa-prod.yaml**

```yaml
domain: production
project: flytesnacks
workflow: my_critical_wf
clusterPoolName: production-cp
```

2. Update the control plane with this mapping:

```bash
uctl update cluster-pool-attributes --attrFile cpa-prod.yaml
```

3. New executions of the `my_critical_wf` workflow in `flytesnacks-production` should now run on any of the clusters under `production-cp`.

## Data sharing between cluster pools

The sharing of metadata is controlled by the cluster pool to which a cluster belongs. If two clusters are in the same cluster pool, then they must share the same metadata bucket, defined in the Helm values as `storage.bucketName`.

If they are in different cluster pools, then they **must** have different metadata buckets. You could, for example, have a single metadata bucket for all your development clusters, and a separate one for all your production clusters, by grouping the clusters into cluster pools accordingly.

Alternatively, you could have a separate metadata bucket for each cluster by putting each cluster in its own cluster pool.

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/persistent-logs ===

# Persistent logs

Persistent logging is enabled by default and uses an object store to store task logs. Persisted logs (also called task logs) are stored under the `persisted-logs/*` path on the storage endpoint configured for your data plane.

=== PAGE: https://www.union.ai/docs/v1/union/deployment/selfmanaged/configuration/union-secrets ===

# Secrets

[Union Secrets](https://www.union.ai/docs/v1/union/user-guide/development-cycle/managing-secrets) are enabled by default. Union Secrets are managed secrets created through the native Kubernetes secret manager.

The only configurable option is the namespace where the secret is stored. To override the default behavior, set `proxy.secretManager.namespace` in the values file used by the Helm chart. If this is not specified, the `union` namespace will be used by default.

Example:
```yaml
proxy:
  secretManager:
    # -- Set the namespace for union managed secrets created through the native Kubernetes secret manager. If the namespace is not set,
    # the release namespace will be used.
    namespace: "secret"
```

