This tutorial guides you through creating a Managed Apache Airflow® cluster in DoubleCloud.
Console
Terraform
API
Step 1. Configure resources
Go to the Clusters page and click Create cluster at the top right.
Select Airflow.
Choose a provider and a region.
You can create Managed Apache Airflow® clusters on AWS in any of the available regions.
By default, DoubleCloud preselects the region nearest to you.
In Environment configuration, select the configuration that best fits your needs.
It defines the ratio of webservers, schedulers, and triggerers.
Under Worker node resources,
select a preset with the amount of CPU, RAM, and SSD storage suitable for your workload.
In Min worker nodes and Max worker nodes,
specify the lower and upper limits of workers for autoscaling.
The cluster will automatically adjust the number of workers depending on the load.
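To illustrate how the two limits bound autoscaling, here is a minimal Python sketch. It only shows the clamping effect of Min worker nodes and Max worker nodes. The function name and the idea of a "desired" worker count are assumptions made for the example, not DoubleCloud's actual scaling logic, which isn't described here.

```python
def clamp_worker_count(desired: int, min_workers: int, max_workers: int) -> int:
    """Keep a desired worker count within the configured limits.

    Hypothetical helper: `desired` stands in for whatever number of
    workers the current load would call for.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return max(min_workers, min(desired, max_workers))


# With limits of 1 and 5, the worker count never leaves that range.
for desired in (0, 3, 12):
    print(desired, "->", clamp_worker_count(desired, min_workers=1, max_workers=5))
# prints:
# 0 -> 1
# 3 -> 3
# 12 -> 5
```

Setting both limits to the same value effectively pins the cluster to a fixed number of workers in this model.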
In Concurrency, specify the maximum number of task instances that a single DAG can run at the same time.
Task instances beyond this limit are queued.
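The effect of the concurrency limit can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the function and task names are hypothetical, and real Airflow® scheduling also accounts for priorities, pools, and task dependencies.

```python
def split_by_concurrency(task_instances: list[str], concurrency: int) -> tuple[list[str], list[str]]:
    """Split ready task instances of one DAG into running and queued.

    Hypothetical model: the first `concurrency` instances run,
    and every instance beyond that limit waits in the queue.
    """
    if concurrency < 0:
        raise ValueError("concurrency must not be negative")
    return task_instances[:concurrency], task_instances[concurrency:]


running, queued = split_by_concurrency(
    ["extract", "transform", "load", "report", "notify"], concurrency=3
)
print("running:", running)  # running: ['extract', 'transform', 'load']
print("queued:", queued)    # queued: ['report', 'notify']
```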
Under Basic settings, in Name, enter a cluster name, such as airflow-dev.
In Version, select the Airflow® version for the cluster.
Unless you need a specific version, select the latest one.
If your DAGs are stored in a Git repository, configure a connection under DAG Code Repository:
How to configure a connection
In Repository URL, DAG path, and Branch, enter the details of your Git repository with DAGs.
If the repository is private, enter the credentials in Username and Password/token.
For GitHub, use a personal access token.
To make sure the connection details are correct, click Check connection.
Step 2. Configure advanced settings
In Maintenance settings,
select whether you want DoubleCloud to perform maintenance at an arbitrary time or by schedule.
If you selected By schedule, select the day and time (UTC).
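Because the maintenance window is specified in UTC, it can help to convert your preferred local time first. The sketch below uses only the Python standard library. The helper name and the sample date and UTC+3 offset are illustrative assumptions, not part of the DoubleCloud console or API.

```python
from datetime import datetime, timedelta, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def to_utc_day_and_time(local_dt: datetime) -> tuple[str, str]:
    """Convert a timezone-aware local datetime to a (day, HH:MM) pair in UTC."""
    if local_dt.tzinfo is None:
        raise ValueError("local_dt must be timezone-aware")
    utc_dt = local_dt.astimezone(timezone.utc)
    return DAY_NAMES[utc_dt.weekday()], utc_dt.strftime("%H:%M")


# Example: Sunday 02:00 in a UTC+3 time zone (assumed for illustration).
local_window = datetime(2024, 9, 22, 2, 0, tzinfo=timezone(timedelta(hours=3)))
print(to_utc_day_and_time(local_window))  # ('Saturday', '23:00')
```

Note that the day can shift as well as the hour, as in this example, so check both values before entering them.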
Under Networking → VPC, select the network where you want to create the cluster.
If you don’t need to place the cluster in a specific network, leave the preselected default option.
You can create a new network on the VPC page in the console.
Airflow® clusters in BYOC networks
You can create Managed Apache Airflow® clusters only in BYOC networks created after September 19, 2024.
(Optional) In Allowlist, configure IP addresses the Airflow® cluster can be accessed from.
To do that, click Edit and add or remove IP addresses.
You can use both single addresses and CIDR blocks.
When you're done, click Save in the dialog.
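To see how a list that mixes single addresses and CIDR blocks can be evaluated, consider this minimal sketch built on Python's standard ipaddress module. The is_allowed helper and the sample addresses (taken from ranges reserved for documentation) are hypothetical. The sketch illustrates the matching concept only and doesn't reflect how DoubleCloud enforces the allowlist.

```python
import ipaddress


def is_allowed(client_ip: str, allowlist: list[str]) -> bool:
    """Return True if client_ip matches any allowlist entry.

    Entries may be single addresses ("203.0.113.7") or
    CIDR blocks ("198.51.100.0/24").
    """
    address = ipaddress.ip_address(client_ip)
    for entry in allowlist:
        # A single address is parsed as a network containing one host.
        network = ipaddress.ip_network(entry, strict=False)
        if address in network:
            return True
    return False


allowlist = ["203.0.113.7", "198.51.100.0/24"]
print(is_allowed("203.0.113.7", allowlist))    # True: exact address match
print(is_allowed("198.51.100.42", allowlist))  # True: inside the CIDR block
print(is_allowed("192.0.2.1", allowlist))      # False: not listed
```

You can use a check like this to verify that the machines that need access to the cluster fall within the ranges you plan to enter.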
In the Summary block on the right, review the resources to be created and their price.
Click Submit.
When the cluster is ready, its status changes from Creating to Alive.
This example contains the minimum set of parameters required to create a functional cluster.
When you create a production cluster, use a configuration that suits your needs.
For a full list of available parameters, refer to the DoubleCloud Airflow® cluster resource schema.
To create a Managed Apache Airflow® cluster, use the ClusterService create method.
The following parameters are required to create a functional cluster:
project_id: ID of your project.
You can get the ID on your project’s information page.
cloud_type: aws.
region_id: AWS region to create the cluster in.
name: Name of your cluster.
It must be unique within the project.