Use a custom Airflow® image
When you deploy a Managed Apache Airflow® cluster, you can use a default image provided by DoubleCloud or upload an Airflow® container image tailored for your needs.
The differences between the default and custom image are as follows:
- The default image provided by DoubleCloud contains a pre-configured Airflow® environment with commonly used packages. This is the default option when you create a Managed Apache Airflow® cluster, allowing you to get started with Airflow® right away without the need to configure and build a container image.

To view the packages included in the default image, see the requirements.txt file in the airflow-image repository.

Using the default image removes the need to maintain the image, perform updates, and apply security patches. It’s an ideal choice if you don’t need to customize the cluster or when all the packages you need are available from PyPI and can be installed using requirements.txt (see the illustrative example after the note below).
- A custom image is a container image you build for your specific workflow. It extends the base image with custom plugins, packages, or libraries tailored to your needs.

Custom container images allow you to use specific versions of Airflow® or dependencies that aren’t available in the default configuration. They give you full control over the environment, but you’re responsible for maintaining and updating the packages and dependencies you’ve added.
Note
Using a custom container image is an alternative to installing dependencies through requirements.txt in the console, and you can use only one of the options at a time. If you select a custom image after you’ve added dependencies to requirements.txt, the custom image is used and the content of requirements.txt is removed.
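For illustration, a requirements.txt with extra PyPI dependencies might look like the following. The package names and versions are placeholders, not recommendations:

```
# Illustrative pins for extra PyPI packages your DAGs need
apache-airflow-providers-amazon==8.16.0
dbt-core==1.7.10
pandas==2.1.4
```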
Before you start
- If you haven’t already, create a Managed Apache Airflow® cluster.
- Make sure you have Docker installed on your local machine.
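You can quickly confirm that Docker is available, for example:

```bash
# Verify the Docker CLI version and that the daemon is running
docker --version
docker info
```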
Step 1. Build a custom container image
- Describe your custom Airflow® image in a Dockerfile using one of the DoubleCloud Airflow® images as the base.
Warning
Make sure you use the base image with the same Airflow® version as your cluster.
An example Dockerfile extending the base image looks as follows:
```dockerfile
FROM ghcr.io/doublecloud/airflow:2.8.1

# Custom extensions to the base image
COPY requirements.txt /usr/local/

# Define Airflow version to avoid unintentional version downgrading
RUN pip install --no-cache-dir \
    "apache-airflow==${AIRFLOW_VERSION}" \
    -r /usr/local/requirements.txt

# Copy your dbt project into the image
COPY --chown=airflow:root dbt /usr/app/dbt
```
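Before pushing, you can sanity-check the built image locally. This is an optional sketch: my-airflow is a throwaway local tag, and the checks assume the airflow and pip executables are on the image’s PATH, as in the official Airflow® images. Overriding the entrypoint with --entrypoint keeps the checks independent of the image’s default entrypoint behavior:

```bash
# Build with a throwaway local tag (illustrative name)
docker build . -t my-airflow

# Confirm the Airflow version matches your cluster
docker run --rm --entrypoint airflow my-airflow version

# List the Python packages baked into the image
docker run --rm --entrypoint pip my-airflow freeze
```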
- On the Clusters page in the console, open your Airflow® cluster.
- Switch to the Container image tab.
- Take note of the credentials under Registry configuration. You need them to log in to the registry.
- On your local machine, log in to the registry using the docker login command. Use the username from the previous step:

```bash
docker login cr.airflow.double.cloud --username <username>
```

When prompted, enter the password from the previous step.
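If you script this step, for example in CI, you can use the non-interactive form instead and pipe the password via --password-stdin. The environment variable name below is a placeholder for wherever you store the secret:

```bash
# Non-interactive login; AIRFLOW_REGISTRY_PASSWORD is a hypothetical variable holding the password
echo "$AIRFLOW_REGISTRY_PASSWORD" | docker login cr.airflow.double.cloud --username <username> --password-stdin
```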
- Build the image. Tag it with the host value from the Registry configuration section, in the following format: cr.airflow.double.cloud/<cluster-id>/<cluster-version>.

```bash
docker build . -t <host>
```
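For example, assuming <cluster-version> corresponds to the cluster’s Airflow® version and your cluster runs Airflow® 2.8.1, the command could look like this; keep the exact host value from Registry configuration:

```bash
# <cluster-id> is a placeholder; the version segment matches the cluster's Airflow version
docker build . -t cr.airflow.double.cloud/<cluster-id>/2.8.1
```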
Step 2. Use a custom image in the cluster
- Push the image to the DoubleCloud registry. Use the same host value from the Registry configuration section as in the previous step:

```bash
docker push <host>
```

If the command output looks similar to the following, your image has been pushed successfully:

```
latest: digest: sha256:bca85c0*******************d566679 size: 5333
```
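If you built the image under a throwaway local tag, as in the optional check above, re-tag it with the registry host before pushing. The names here are illustrative:

```bash
# Point the registry tag at the locally built image, then push it
docker tag my-airflow cr.airflow.double.cloud/<cluster-id>/<cluster-version>
docker push cr.airflow.double.cloud/<cluster-id>/<cluster-version>
```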
- In the console, navigate to the cluster, switch to the Container image tab, and click Select image.
- Select Custom.
- Under Primary image, select the container image you want to use from the list and click Submit.
The cluster will update, and all the services will restart from the new image.
See also
- airflow-dbt-demo repository
- Manage Python dependencies