Use a custom Airflow® image

When you deploy a Managed Apache Airflow® cluster, you can use a default image provided by DoubleCloud or upload an Airflow® container image tailored for your needs.

The differences between the default and custom image are as follows:

  • The default image provided by DoubleCloud contains a pre-configured Airflow® environment with commonly used packages. This is the default option when you create a Managed Apache Airflow® cluster, allowing you to get started with Airflow® right away without the need to configure and build a container image.

    Using the default image is an ideal choice if you don't need to customize the cluster. It removes the need to maintain the image or apply updates and security patches yourself.

    Tip

    To view the contents of the default image, open the airflow-image repository, navigate to the directory with the version you need, and open the requirements.txt file.

  • A custom image is a container image you build for your specific workflow. It extends the base image with custom plugins, packages, or libraries tailored for your needs.

    Custom container images allow you to use specific versions of Airflow® or dependencies that aren't available in the default configuration. They give you full control over the environment, but you're responsible for maintaining and updating packages and dependencies that you've added.

Before you start

  1. If you haven't already, create a Managed Apache Airflow® cluster.
  2. Make sure you have Docker installed on your local machine. You need it to build a container image and push it to the registry.
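
The Docker prerequisite above can be checked from a terminal before you continue. The following is a minimal sketch, not part of DoubleCloud tooling: the function names have_cmd and check_docker are hypothetical, and the check only confirms that a docker executable is on your PATH.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of any DoubleCloud tooling.

# Return 0 if the named command is on PATH, 1 otherwise.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the Docker CLI is available on this machine.
check_docker() {
    if have_cmd docker; then
        echo "docker found: $(docker --version)"
    else
        echo "docker not found; install Docker before building the image" >&2
        return 1
    fi
}
```

Calling check_docker prints the installed Docker version, or returns a non-zero status if Docker still needs to be installed.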

Step 1. Build a custom container image

  1. Describe your custom Airflow® image in a Dockerfile, using one of the DoubleCloud Airflow® images as the parent image.

    Warning

    Make sure you use the base image with the same Airflow® version as your cluster.

    An example Dockerfile extending the base image looks as follows:

    FROM ghcr.io/doublecloud/airflow:2.8.1
    
    # Custom extensions to the base image
    COPY requirements.txt /usr/local/
    
    # Pin the Airflow version so that installing the requirements can't upgrade or downgrade it
    RUN pip install --no-cache-dir \
        "apache-airflow==${AIRFLOW_VERSION}" \
        -r /usr/local/requirements.txt
    
    # Copy your dbt project into the image
    COPY --chown=airflow:root dbt /usr/app/dbt
    
  2. On the Clusters page in the console, select the Airflow® cluster where you want to use the custom image.

  3. Select the Container image tab.

  4. Take note of the credentials under Registry configuration. You need them to log in to the registry.

  5. On your local machine, log in to the registry using the docker login command. Use the username from the previous step.

    docker login cr.airflow.double.cloud --username <username>
    

    When prompted, enter the password from the previous step.

  6. From the directory that contains your Dockerfile, build the image and tag it with the host value using the following command. You can find the host value in the Registry configuration section. The host has the following format: cr.airflow.double.cloud/<cluster-id>/<cluster-version>.

    docker build . -t <host>
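
The login and build steps above, together with the warning about matching Airflow® versions, can be collected into a small script. This is a sketch under stated assumptions rather than an official workflow: the function names and the DRY_RUN variable are hypothetical, and the version check assumes the base image tag is the plain Airflow® version, as in the example Dockerfile (airflow:2.8.1). It only handles simple FROM lines of the form FROM image:tag.

```shell
#!/bin/sh
# Sketch only: function names and DRY_RUN are hypothetical, not DoubleCloud tooling.

# Print the tag of the first FROM line in a Dockerfile (the text after the last colon).
base_image_tag() {
    sed -n 's/^[[:space:]]*FROM[[:space:]]\{1,\}[^[:space:]]*:\([^[:space:]]*\).*/\1/p' "$1" | head -n 1
}

# Fail unless the base image tag equals the Airflow version of the cluster.
# Assumption: the tag is the plain Airflow version, as in the example Dockerfile.
check_base_version() {
    dockerfile=$1
    expected=$2
    actual=$(base_image_tag "$dockerfile")
    if [ "$actual" != "$expected" ]; then
        echo "base image tag '$actual' does not match cluster version '$expected'" >&2
        return 1
    fi
}

# Log in to the registry and build the image in the current directory.
# If DRY_RUN is set to any non-empty value, the docker commands are printed instead of run.
build_image() {
    host=$1       # host value copied from Registry configuration
    username=$2   # username copied from Registry configuration
    run=${DRY_RUN:+echo}
    $run docker login cr.airflow.double.cloud --username "$username" &&
        $run docker build . -t "$host"
}
```

For example, running check_base_version Dockerfile 2.8.1 followed by build_image with your host and username stops before the build if the Dockerfile extends a different Airflow® version. Setting DRY_RUN=1 first lets you review the commands without contacting the registry.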
    

Step 2. Use a custom image in the cluster

  1. Push the image to the DoubleCloud registry. Use the same host value from the Registry configuration section as in the previous step.

    docker push <host>
    

    If the command output looks similar to the following, your image has been pushed successfully:

    latest: digest: sha256:bca85c0*******************d566679 size: 5333
    
  2. In the console, navigate to the cluster, select the Container images tab, and click Select image.

  3. Select Custom.

  4. Under Primary image, select the container image you want to use from the list and click Submit.

  5. Wait for the cluster to update. All its services restart from the new image.
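
If you script the push step, the success check described above can be automated by looking for the digest line in the docker push output. The following is a minimal sketch: push_digest is a hypothetical helper name, and it assumes the output ends with a line in the format shown in the example (latest: digest: sha256:... size: ...).

```shell
#!/bin/sh
# Hypothetical helper for illustration; not part of any DoubleCloud tooling.

# Read `docker push` output on stdin and print the last sha256 digest found.
# Returns a non-zero status if the output contains no digest line.
push_digest() {
    digest=$(sed -n 's/.*digest: \(sha256:[0-9a-f]*\).*/\1/p' | tail -n 1)
    [ -n "$digest" ] && printf '%s\n' "$digest"
}
```

One way to use it is to save the push output to a file, for example with tee, and then pass that file to push_digest on standard input. A non-zero status suggests that the push didn't complete and the image isn't yet available to select in the console.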

See also