Data retention policy
Data retention policies determine how long workflow data is kept in the data plane object store. After the retention period expires, the data is automatically deleted.
For the conceptual map of what’s in the data plane bucket versus the control plane database, see Where your data lives. This page describes which data is covered by the retention policy and what happens when data is deleted.
Retention periods are set by your Union.ai plan:
| Plan | Retention period |
|---|---|
| Teams | 30 days |
| Enterprise | 1 year |
Retention periods are not adjustable through the UI or CLI.
Data categories
The retention policy applies to the following data:
-
Workflow execution data: the per-run artifacts written during task execution:
- Task inputs and outputs (the
inputs.pb/outputs.pbprotobuf payloads written by the platform during run setup and task execution). - Offloaded large values:
flyte.io.File,flyte.io.Dir,flyte.io.DataFrame, and other reference-type payloads — both in their default location and in any custom location specified per run viaflyte.with_runcontext(raw_data_path=...). Deckand report artifacts.- Trace checkpoints.
- Task inputs and outputs (the
-
Workflow code: The code that is uploaded by
flyte deployorflyte run --copy-style alland that make up the logic of your tasks and apps and other Flyte entities.
When the age of the data reaches the plan’s retention period, it is deleted.
Attempting to access deleted data
If you attempt to access deleted data, you will receive an error:
- When task input/output data is deleted, the Input/Output tabs in the UI display a Not Found error.
- When
Deckdata is deleted, theDeckview in the UI displays a Not Found error. - When artifact references resolve to deleted blobs, the UI shows a URL that points to a no longer existing object.
To remedy these types of errors, you have to re-run the workflow that generated the data in question.
- When workflow code is deleted, executions that depend on it will fail.
To remedy this, you have to both re-deploy and re-run the workflow.
Data retention and task caching
Task caching is adjusted to match the retention period. To avoid attempts to retrieve cache data that has already been deleted, the cache age is configured to be less than the retention period.