Use a custom Airflow® image

When you deploy a Managed Apache Airflow® cluster, you can use a default image provided by DoubleCloud or upload an Airflow® container image tailored for your needs.

The differences between the default and custom images are as follows:

  • The default image provided by DoubleCloud contains a pre-configured Airflow® environment with commonly used packages. This is the default option when you create a Managed Apache Airflow® cluster, allowing you to get started with Airflow® right away without the need to configure and build a container image.

    To view the packages included in the default image, go to the airflow-image repository, navigate to the directory for the version you need, and open the requirements.txt file.

    Using the default image removes the need to maintain the image, perform updates, and apply security patches. It’s an ideal choice if you don’t need to customize the cluster or when all the packages you need are available from PyPI and can be installed using requirements.txt.

  • A custom image is a container image you build for your specific workflow. It extends the base image with custom plugins, packages, or libraries tailored for your needs.

    Custom container images allow you to use specific versions of Airflow® or dependencies that aren’t available in the default configuration. They give you full control over the environment, but you’re responsible for maintaining and updating packages and dependencies that you’ve added.
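
When deciding between the two options, it can help to check whether a package you need is already listed in the default image's requirements.txt. The following is a minimal sketch, assuming you have saved that file locally. The helper name `is_listed` is hypothetical, and the match is a simple text comparison that does not normalize package names the way pip does.

```shell
# Return 0 if the package appears as a requirement in the given file.
# Matching is case-insensitive and ignores version specifiers and extras.
# Assumption: one requirement per line, as in a typical requirements.txt.
is_listed() {
    pkg=$1
    file=$2
    grep -i -q -E "^[[:space:]]*${pkg}([[:space:]]|\[|[=<>!~;]|\$)" "$file"
}

# Example usage (file path is illustrative):
#   is_listed dbt-core requirements.txt && echo "already in the default image"
```

If the package is not listed but is available from PyPI, adding it through requirements.txt in the console may still be enough, and a custom image is not required.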

Note

Using a custom container image is an alternative to installing dependencies through requirements.txt in the console, and you can use only one of the options at a time. If you select a custom image after you’ve added dependencies to requirements.txt, the custom image is used and the content of requirements.txt is removed.

Before you start

  1. If you haven’t already, create a Managed Apache Airflow® cluster.

  2. Make sure you have Docker installed on your local machine. You need it to build a container image and push it to the registry.
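
To confirm the Docker prerequisite from a terminal, a small check like the one below can be used. This is a sketch only. The function names `have_cmd` and `check_prereq` are hypothetical helpers, not part of Docker or DoubleCloud tooling.

```shell
# Return 0 if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print a one-line status for a prerequisite command.
# Returns non-zero when the command is missing.
check_prereq() {
    if have_cmd "$1"; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Example usage:
#   check_prereq docker && docker --version
```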

Step 1. Build a custom container image

  1. Describe your custom Airflow® image in a Dockerfile, using one of the DoubleCloud Airflow® images as the parent image.

    Warning

    Make sure you use the base image with the same Airflow® version as your cluster.

    An example Dockerfile extending the base image looks as follows:

    FROM ghcr.io/doublecloud/airflow:2.8.1
    
    # Custom extensions to the base image
    COPY requirements.txt /usr/local/
    
    # Pin the Airflow version so that pip doesn't upgrade or downgrade it
    # while installing the packages from requirements.txt
    RUN pip install --no-cache-dir \
        "apache-airflow==${AIRFLOW_VERSION}" \
        -r /usr/local/requirements.txt
    
    # Copy your dbt project into the image
    COPY --chown=airflow:root dbt /usr/app/dbt
    
  2. On the Clusters page in the console, select the Airflow® cluster where you want to use the custom image.

  3. Switch to the Container image tab.

  4. Take note of the credentials under Registry configuration. You need them to log in to the registry.

  5. On your local machine, log in to the registry using the docker login command. Use the username from the previous step.

    docker login cr.airflow.double.cloud --username <username>
    

    When prompted, enter the password from the previous step.

  6. Build the image and tag it with the host value using the following command. You can find the host value in the Registry configuration section. The host has the following format: cr.airflow.double.cloud/<cluster-id>/<cluster-version>.

    docker build . -t <host>
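
Because the parent image must match the cluster's Airflow® version, it may be worth checking the Dockerfile before you build. The sketch below assumes the image tag is the Airflow® version, as in the `2.8.1` example above, and that the first `FROM` line has the form `FROM <image>:<tag>` without extra flags. The function names are hypothetical.

```shell
# Print the tag of the first FROM line in a Dockerfile, for example "2.8.1".
# Prints nothing if the image reference has no tag.
base_image_tag() {
    awk 'toupper($1) == "FROM" {
             n = split($2, parts, ":")
             if (n > 1) print parts[n]
             exit
         }' "$1"
}

# Compare the parent image tag with the expected cluster version.
check_base_version() {
    dockerfile=$1
    expected=$2
    actual=$(base_image_tag "$dockerfile")
    if [ "$actual" = "$expected" ]; then
        echo "OK: base image tag $actual matches the cluster version"
    else
        echo "MISMATCH: base image tag '$actual', cluster version '$expected'" >&2
        return 1
    fi
}

# Example usage, run from the directory that contains the Dockerfile:
#   check_base_version Dockerfile 2.8.1 && docker build . -t <host>
```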
    

Step 2. Use a custom image in the cluster

  1. Push the image to the DoubleCloud registry. Use the same host value from the Registry configuration section as in the previous step.

    docker push <host>
    

    If the command output looks similar to the following, your image has been pushed successfully:

    latest: digest: sha256:bca85c0*******************d566679 size: 5333
    
  2. In the console, navigate to the cluster, select the Container images tab, and click Select image.

  3. Select Custom.

  4. Under Primary image, select the container image you want to use from the list and click Submit.

  5. Wait for the cluster to update. All its services restart from the new image.
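
If you push images from a script, the success check from step 1 can be automated by looking for the digest line in the `docker push` output. The following is a rough sketch. The function name `pushed_digest` is hypothetical, and it assumes the output contains a line of the form shown in step 1.

```shell
# Read `docker push` output on stdin and print the image digest, if present.
# Prints nothing when no digest line is found.
pushed_digest() {
    sed -n 's/^.*digest: \(sha256:[0-9a-f]*\).*$/\1/p' | tail -n 1
}

# Example usage:
#   digest=$(docker push <host> | pushed_digest)
#   [ -n "$digest" ] && echo "Pushed: $digest"
```

An empty result means the digest line did not appear, in which case the full command output is the place to look for the cause.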

See also