Actors vs. Regular Tasks
When deciding whether to use actors or traditional tasks in your workflows, it’s important to consider the benefits and trade-offs. This page outlines key scenarios where actors shine and where they may not be the best fit.
When to Use Actors | When Not to Use Actors |
---|---|
Short Running Tasks | Long Running Tasks |
Map Tasks with Large Input Arrays | Map Tasks with Small Input Arrays |
State Management and Efficient Initialization | Strict Task Isolation Is Critical |
Shared Dependencies and Resources | |
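To make the "State Management and Efficient Initialization" row more concrete, here is a minimal local analogy in plain Python. It is not Flyte or Union code: a thread pool stands in for a set of actor replicas, and `expensive_setup`, `N_TASKS`, and `N_WORKERS` are hypothetical names and values chosen for illustration. The sketch only counts how often setup runs when each task builds its own environment versus when long-lived workers set up once and are reused.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

N_TASKS = 200    # hypothetical input array size
N_WORKERS = 4    # hypothetical concurrency / replica count

init_count = 0
_lock = threading.Lock()
_worker = threading.local()


def expensive_setup() -> dict:
    """Stand-in for container startup, imports, or model loading."""
    global init_count
    with _lock:
        init_count += 1
    return {"greeting": "Hello World"}


def run_fresh(_: int) -> str:
    # "Regular task" analogy: every execution builds its own environment.
    state = expensive_setup()
    return state["greeting"]


def worker_init() -> None:
    # "Actor" analogy: each long-lived worker sets up once and keeps its state.
    _worker.state = expensive_setup()


def run_reused(_: int) -> str:
    # Reuses the state created by worker_init on this worker thread.
    return _worker.state["greeting"]


def count_setups(task, initializer=None) -> int:
    """Run N_TASKS executions on N_WORKERS workers and return the setup count."""
    global init_count
    init_count = 0
    with ThreadPoolExecutor(max_workers=N_WORKERS, initializer=initializer) as pool:
        results = list(pool.map(task, range(N_TASKS)))
    assert all(r == "Hello World" for r in results)
    return init_count


fresh_setups = count_setups(run_fresh)
reused_setups = count_setups(run_reused, initializer=worker_init)
print(f"setups without reuse: {fresh_setups}, with reuse: {reused_setups}")
```

Without reuse, setup runs once per task. With reuse, it runs at most once per worker, which is why the gap grows with the size of the input array and shrinks as individual tasks get longer.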
Efficiency Gains from Actors with Map Tasks
Let’s see how using Actors with map tasks can cut runtime in half!
We compare three scenarios:
Regular map tasks without specifying concurrency. This is the fastest expected configuration for regular map tasks, as Flyte will spawn as many pods as there are elements in the input array, allowing Kubernetes to manage scheduling based on available resources.
Regular map tasks with fixed concurrency. This limits the number of pods that are alive at any given time.
Map tasks with Actors. Here we set the number of replicas to match the concurrency of the previous example.
Together, these scenarios let us compare actors to vanilla map tasks in two situations: when the vanilla map task is configured for maximum speed, and when the number of live pods is matched one-to-one.
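As a rough way to reason about the three scenarios, the sketch below uses a simplified "wave" model. It rests on assumptions the benchmarks on this page do not state: that every regular map task pod pays a fixed startup cost, that actor replicas pay it only once, and that work proceeds in evenly sized waves. The function names and all input values (`N_ITEMS`, `STARTUP_S`, `TASK_S`, `CONCURRENCY`) are hypothetical and are not the benchmark settings.

```python
import math


def regular_map_seconds(n_items: int, concurrency: int, task_s: float, startup_s: float) -> float:
    """Each item runs in a fresh pod, so every wave pays the startup cost again."""
    waves = math.ceil(n_items / concurrency)
    return waves * (startup_s + task_s)


def actor_map_seconds(n_items: int, replicas: int, task_s: float, startup_s: float) -> float:
    """Replicas start once and are then reused for every wave."""
    waves = math.ceil(n_items / replicas)
    return startup_s + waves * task_s


# Hypothetical inputs for illustration only.
N_ITEMS = 1000      # elements in the input array
STARTUP_S = 20.0    # assumed per-pod startup overhead
TASK_S = 5.0        # time spent in the task body
CONCURRENCY = 100   # map task concurrency and actor replica count

# Scenario 1: unbounded concurrency, modeled as one pod per element in a single wave.
unbounded = regular_map_seconds(N_ITEMS, N_ITEMS, TASK_S, STARTUP_S)
# Scenario 2: fixed concurrency, fresh pod per element.
fixed = regular_map_seconds(N_ITEMS, CONCURRENCY, TASK_S, STARTUP_S)
# Scenario 3: actors with replicas matching the concurrency above.
actors = actor_map_seconds(N_ITEMS, CONCURRENCY, TASK_S, STARTUP_S)

print(f"unbounded={unbounded:.0f}s fixed={fixed:.0f}s actors={actors:.0f}s")
```

The single-wave estimate for the unbounded scenario assumes the cluster can actually schedule every pod at once. In practice, scheduling pressure and cluster capacity add time that this model leaves out, so measured numbers will differ.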
“Hello World” Benchmark
This benchmark runs a task that returns “Hello World”, which is a near-instantaneous task.
Task Type | Concurrency/Replicas | Duration (seconds) |
---|---|---|
Without Actors | unbound | 111 |
Without Actors | 25 | 1195 |
With Actors | 25 | 42 |
Key Takeaway: For near-instantaneous tasks, using a 25-replica Actor with map tasks reduces runtime by 96% when the number of live pods is matched (1195 s down to 42 s), and by 62% compared with unbounded map task concurrency (111 s down to 42 s).
“5s Sleep” Benchmark
This benchmark simply runs a task that sleeps for five seconds.
Task Type | Concurrency/Replicas | Duration (seconds) |
---|---|---|
Without Actors | unbound | 174 |
Without Actors | 100 | 507 |
With Actors | 100 | 87 |
Key Takeaway: For five-second tasks, using a 100-replica Actor with map tasks reduces runtime by 83% when the number of live pods is matched (507 s down to 87 s), and by 50% compared with unbounded map task concurrency (174 s down to 87 s).
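The percentages in the key takeaways can be reproduced from the durations in the two benchmark tables. The short check below is only arithmetic on those published numbers. The helper name `percent_reduction` and the dictionary layout are illustrative choices, not part of any Flyte or Union API.

```python
def percent_reduction(baseline_s: float, actor_s: float) -> float:
    """Percentage of the baseline runtime saved by the actor run."""
    return (baseline_s - actor_s) / baseline_s * 100


# Durations in seconds, copied from the benchmark tables on this page.
benchmarks = {
    "Hello World": {"unbound": 111, "matched": 1195, "actors": 42},
    "5s Sleep": {"unbound": 174, "matched": 507, "actors": 87},
}

for name, durations in benchmarks.items():
    vs_matched = percent_reduction(durations["matched"], durations["actors"])
    vs_unbound = percent_reduction(durations["unbound"], durations["actors"])
    print(f"{name}: {vs_matched:.1f}% vs. matched pods, {vs_unbound:.1f}% vs. unbound")
```

Rounded to whole percentages, the results match the figures quoted above: 96% and 62% for “Hello World”, 83% and 50% for “5s Sleep”.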
In these benchmarks, short-running map tasks ran in half the time or less once they used actors. If you are already using concurrency limits on your map tasks, you can expect even larger improvements.