ReusePolicy
Package: flyte
Configure a task environment for container reuse across multiple task invocations.
When environment creation is expensive relative to task runtime, reusable containers keep a pool of warm containers ready, avoiding cold-start overhead. The Python process may be reused by subsequent task invocations.
Total concurrent capacity is max_replicas * concurrency. For example,
ReusePolicy(replicas=(1, 3), concurrency=2) supports up to 6 concurrent tasks.
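The capacity rule can be checked with a few lines of plain Python. This is a minimal sketch that does not import flyte: max_concurrent_tasks is a hypothetical helper, not part of the package, and it assumes only what is stated above (an int is a fixed replica count, a tuple is a (min, max) range).

```python
from typing import Tuple, Union


def max_concurrent_tasks(
    replicas: Union[int, Tuple[int, int]],
    concurrency: int = 1,
) -> int:
    """Upper bound on simultaneously running tasks: max_replicas * concurrency.

    Hypothetical helper for illustration; it mirrors the documented rule only.
    """
    # A fixed count is its own maximum; for a range, take the upper bound.
    max_replicas = replicas if isinstance(replicas, int) else replicas[1]
    return max_replicas * concurrency


print(max_concurrent_tasks((1, 3), concurrency=2))  # 6
```

With a fixed replica count and the default concurrency of 1, the capacity is simply the replica count.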
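Caution: The environment is shared across invocations, and because the Python process may be reused, in-process state can carry over from one task to the next. Manage memory and resources carefully.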
import flyte

env = flyte.TaskEnvironment(
    name="fast_env",
    reusable=flyte.ReusePolicy(replicas=(1, 3), concurrency=2),
)

Parameters
class ReusePolicy(
    replicas: typing.Union[int, typing.Tuple[int, int]],
    idle_ttl: typing.Union[int, datetime.timedelta],
    concurrency: int,
    scaledown_ttl: typing.Union[int, datetime.timedelta],
)

| Parameter | Type | Description |
|---|---|---|
| replicas | typing.Union[int, typing.Tuple[int, int]] | Number of container replicas to maintain. An int sets a fixed replica count that is always running (e.g., replicas=3). A tuple (min, max) sets an auto-scaling range (e.g., replicas=(1, 5)) that scales between min and max based on demand. Default is 2. A minimum of 2 replicas is recommended to avoid starvation when the parent task occupies one replica. |
| idle_ttl | typing.Union[int, datetime.timedelta] | Environment-level idle timeout: shuts down all replicas when the entire environment has been idle for this duration. Specified as seconds (int) or timedelta. Minimum 30 seconds. Default is 30 seconds. |
| concurrency | int | Maximum concurrent tasks per replica. Values greater than 1 are only supported for async tasks. Default is 1. |
| scaledown_ttl | typing.Union[int, datetime.timedelta] | Per-replica scale-down delay: minimum time to wait before removing an individual idle replica. Prevents rapid scale-down when tasks arrive in bursts. Specified as seconds (int) or timedelta. Default is 30 seconds. Note the distinction: idle_ttl controls when the whole environment shuts down; scaledown_ttl controls when individual replicas are removed during auto-scaling. |

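Both TTL parameters accept either seconds as an int or a timedelta. The sketch below shows one way to normalize the two forms and to check the 30-second minimum documented for idle_ttl. The helper names (to_timedelta, check_idle_ttl) are hypothetical, and this is not the library's own validation code: how ReusePolicy itself reacts to a value below the minimum is not stated here.

```python
from datetime import timedelta
from typing import Union

# Minimum taken from the idle_ttl description above.
MIN_IDLE_TTL = timedelta(seconds=30)


def to_timedelta(value: Union[int, timedelta]) -> timedelta:
    """Interpret an int as seconds; pass a timedelta through unchanged."""
    if isinstance(value, timedelta):
        return value
    return timedelta(seconds=value)


def check_idle_ttl(value: Union[int, timedelta]) -> timedelta:
    """Normalize idle_ttl and reject values under the documented minimum.

    Raising ValueError is an assumption made for this sketch only.
    """
    ttl = to_timedelta(value)
    if ttl < MIN_IDLE_TTL:
        raise ValueError(f"idle_ttl must be at least {MIN_IDLE_TTL}, got {ttl}")
    return ttl


# The int and timedelta spellings of five minutes are equivalent.
assert check_idle_ttl(300) == check_idle_ttl(timedelta(minutes=5))
```

Normalizing early means the rest of a program can compare and add durations without caring which form the caller passed.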
Properties
| Property | Type | Description |
|---|---|---|
| max_replicas | None | Returns the maximum number of replicas. |
| min_replicas | None | Returns the minimum number of replicas. |

Methods
| Method | Description |
|---|---|
| get_scaledown_ttl() | Returns the scaledown TTL as a timedelta. |

get_scaledown_ttl()
def get_scaledown_ttl()

Returns the scaledown TTL as a timedelta. If scaledown_ttl is not set, returns None.