Remote dependencies with ImageSpec
During the development cycle you will want to be able to run your workflows both locally on your machine and remotely on Union, so you will need to ensure that the required dependencies are installed in both environments.
Here we will explain how to set up the dependencies for your workflow to run remotely on Union. For information on how to make your dependencies available locally, see Local dependencies.
ImageSpec
When a workflow is deployed to Union, each task is set up to run in its own container in the Kubernetes cluster.
You specify the dependencies as part of the definition of the container image to be used for each task using the ImageSpec class.
In the template code generated when you ran union init, you will see an ImageSpec block in the script.
The relevant part for our purposes is:
"""Basic Union workflow template."""

from flytekit import task, workflow, ImageSpec

# ImageSpec defines the container image used for the Kubernetes pods that run the tasks in Union.
image_spec = ImageSpec(
    # The name of the image
    name="basic-union-image",
    # Use the requirements.txt to define the packages to be installed in the image
    requirements="requirements.txt",
    # Container registry to which the image will be pushed.
    #
    # ON UNION BYOC UNCOMMENT THIS PARAMETER!
    #
    # Make sure that:
    #
    # * You substitute the actual name of the registry here.
    #   (for example if you are using GitHub's GHCR, you would
    #   use "ghcr.io/<my-github-org>").
    #
    # * You have Docker installed locally and are logged into the registry.
    #
    # * The image, once pushed to the registry, is accessible to Union
    #   (for example, for GHCR, make sure the image is public)
    #
    # This parameter is only needed for BYOC. On Serverless, images are stored
    # transparently in Union's own container registry.
    # registry="<my-registry>"
)
The init template includes all available ImageSpec parameters, but apart from the two parameters above (name and requirements), they are all commented out.
Since you are running on Union BYOC, you will have to uncomment and configure the registry parameter as well.
Make sure that:
* You substitute the actual name of the registry you are using for <my-registry>. (For example, if you are using GitHub's GHCR, you would use ghcr.io/<my-github-org>.)
* You have Docker installed locally and are logged into the registry.
* The image, once pushed to the registry, is accessible to Union (for example, for GHCR, make sure the image is public).
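The first requirement can be checked mechanically before registering. The following is a minimal sketch, not part of the Union SDK: normalize_registry and image_repository are hypothetical helper names, and the sketch assumes that the registry value is used as a plain prefix of the image reference (as in standard container image references), so it should not carry a URL scheme or a leftover placeholder.

```python
def normalize_registry(registry: str) -> str:
    """Return the registry without URL scheme, surrounding whitespace, or slashes.

    Hypothetical helper: raises ValueError if the value is empty or still
    contains the "<...>" placeholder from the template.
    """
    cleaned = registry.strip()
    for scheme in ("https://", "http://"):
        if cleaned.lower().startswith(scheme):
            cleaned = cleaned[len(scheme):]
            break
    cleaned = cleaned.strip("/")
    if not cleaned or "<" in cleaned or ">" in cleaned:
        raise ValueError(f"registry is not configured: {registry!r}")
    return cleaned


def image_repository(registry: str, name: str) -> str:
    """Join the registry prefix and the image name (tag not included)."""
    return f"{normalize_registry(registry)}/{name}"


if __name__ == "__main__":
    # "my-github-org" is a stand-in value for illustration only.
    print(image_repository("https://ghcr.io/my-github-org/", "basic-union-image"))
```

A check like this only catches formatting mistakes in the registry value. It does not confirm that you are logged in or that the pushed image is accessible to Union.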
This parameter is needed because with Union BYOC, when you register a workflow, the image is built on your local machine and pushed to the registry you specify. On the other hand, with Union Serverless, when you register a workflow, the image is built and stored in the cloud on Union and therefore does not need to be pushed to a registry.
For more details on setting this up, see Setting up container image handling.
Building the task container image
When you register your workflow code, your locally installed Union SDK will build the container image defined in the ImageSpec block and push it to the registry you specified.
When a task that uses that image is executed on Union, the image is pulled from the registry and used to create the container that runs the task.
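Because the build on BYOC happens on your local machine, it can be useful to confirm that Docker is usable before registering. The sketch below is an illustration under stated assumptions, not something the Union SDK provides: docker_cli_path and docker_daemon_reachable are hypothetical helper names, and the check relies on the standard docker info command exiting with a non-zero status when the daemon cannot be reached.

```python
import shutil
import subprocess
from typing import Optional


def docker_cli_path() -> Optional[str]:
    """Return the path of the docker executable, or None if it is not on PATH."""
    return shutil.which("docker")


def docker_daemon_reachable(timeout_s: float = 10.0) -> bool:
    """Return True if `docker info` exits successfully within the timeout."""
    if docker_cli_path() is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The executable could not be started or did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_daemon_reachable():
        print("Docker is installed and the daemon is reachable.")
    else:
        print("Docker is missing or the daemon is not running.")
```

This does not verify that you are logged into your registry. A failed push at registration time is still the definitive signal for that.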
Note
Local building of container images and pushing them to your configured registry is only done on Union BYOC.
With Union Serverless, images are built and stored transparently by the ImageBuilder service on Union in the cloud.
For more information, see the Serverless version of this page.