2.1.9
Scaling
Package: flyte.app
Controls replica count and autoscaling behavior for app environments.
Common scaling patterns:
- Scale-to-zero (default): `Scaling(replicas=(0, 1))`. No replicas when idle; scales to 1 on demand.
- Always-on: `Scaling(replicas=(1, 1))`. Exactly 1 replica at all times.
- Burstable: `Scaling(replicas=(1, 5))`. 1 replica minimum; scales up to 5.
- High-availability: `Scaling(replicas=(2, 10))`. At least 2 replicas always running.
- Fixed size: `Scaling(replicas=3)`. Exactly 3 replicas.
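
The patterns above all reduce to a (min, max) pair, with a bare int standing for a fixed size. A minimal sketch of that reading follows; `replica_bounds` is a hypothetical helper written for illustration and is not part of `flyte.app`, and the validation rules are assumptions rather than documented behavior.

```python
from typing import Tuple, Union


def replica_bounds(replicas: Union[int, Tuple[int, int]]) -> Tuple[int, int]:
    """Hypothetical helper: interpret a `replicas` value as (min, max)."""
    if isinstance(replicas, int):
        # A bare int means a fixed size, so min and max are equal.
        low, high = replicas, replicas
    else:
        low, high = replicas
    # Assumed sanity checks; the real class may validate differently.
    if low < 0 or high < low:
        raise ValueError(f"invalid replica range: {replicas!r}")
    return (low, high)


# The five patterns listed above, keyed by name.
patterns = {
    "scale-to-zero": (0, 1),
    "always-on": (1, 1),
    "burstable": (1, 5),
    "high-availability": (2, 10),
    "fixed size": 3,
}

for name, value in patterns.items():
    low, high = replica_bounds(value)
    print(f"{name}: min={low}, max={high}")
```

Under this reading, `Scaling(replicas=3)` and `Scaling(replicas=(3, 3))` describe the same fixed-size deployment.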
Parameters
class Scaling(
    replicas: typing.Union[int, typing.Tuple[int, int]],
    metric: typing.Union[flyte.app._types.Scaling.Concurrency, flyte.app._types.Scaling.RequestRate, NoneType],
    scaledown_after: int | datetime.timedelta | None,
)

| Parameter | Type | Description |
|---|---|---|
| `replicas` | `typing.Union[int, typing.Tuple[int, int]]` | Number of replicas. An int for a fixed count, or a (min, max) tuple for autoscaling. Default `(0, 1)`. |
| `metric` | `typing.Union[flyte.app._types.Scaling.Concurrency, flyte.app._types.Scaling.RequestRate, NoneType]` | Autoscaling metric: `Scaling.Concurrency(val)` (scale when concurrent requests per replica exceed `val`) or `Scaling.RequestRate(val)` (scale when requests per second per replica exceed `val`). Default `None`. |
| `scaledown_after` | `int \| datetime.timedelta \| None` | Time to wait after the last request before scaling down. Seconds (int) or a `timedelta`. Default `None` (platform default). |
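
The `scaledown_after` and `metric` descriptions can be illustrated with a short, library-free sketch. Both functions below are hypothetical helpers, not part of `flyte.app`: `scaledown_delay` shows how an int (seconds) and a `timedelta` can express the same delay, and `exceeds_target` restates the "per replica exceeds val" condition as arithmetic. How the platform actually evaluates the metric, and how it treats zero replicas, are assumptions here.

```python
from datetime import timedelta
from typing import Optional, Union


def scaledown_delay(scaledown_after: Union[int, timedelta, None]) -> Optional[timedelta]:
    """Hypothetical helper: normalize `scaledown_after` to a timedelta."""
    if scaledown_after is None:
        # None means "use the platform default", which is not known here.
        return None
    if isinstance(scaledown_after, timedelta):
        return scaledown_after
    # An int is interpreted as a number of seconds.
    return timedelta(seconds=scaledown_after)


def exceeds_target(observed_total: float, replicas: int, target_per_replica: float) -> bool:
    """Hypothetical check: is the per-replica load above the metric's value?"""
    if replicas == 0:
        # Assumption: with no replicas running, any traffic calls for a scale-up.
        return observed_total > 0
    return observed_total / replicas > target_per_replica


# 300 seconds and a 5-minute timedelta describe the same delay.
print(scaledown_delay(300) == scaledown_delay(timedelta(minutes=5)))

# 25 concurrent requests over 2 replicas is 12.5 per replica, above a target of 10.
print(exceeds_target(25, replicas=2, target_per_replica=10))
```

The same comparison applies to either metric; only the observed quantity changes (concurrent requests for `Scaling.Concurrency`, requests per second for `Scaling.RequestRate`).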
Methods
| Method | Description |
|---|---|
| `get_replicas()` | |
get_replicas()
def get_replicas()