Terraform is an infrastructure-as-code tool that allows you to provision and manage cloud resources with declarative configuration files. With DoubleCloud, you can use Terraform to manage ClickHouse®, Apache Kafka®, and Apache Airflow® clusters, data transfers, endpoints, and network connections.
In this tutorial, you learn how to create resources using the DoubleCloud Terraform provider. This will help you get started with DoubleCloud services and give you an idea of how you can benefit from them.
Select a service account to use with Terraform or create a new one. Make sure this account has Editor permissions for the services you want to use.
Step 1. Configure the DoubleCloud Terraform provider
When creating resources with Terraform,
you first describe their parameters in Terraform configuration files.
After that, you run Terraform, and it provisions the resources on your behalf.
Create a new directory for your project and navigate to it:
mkdir doublecloud-terraform && cd doublecloud-terraform
Move the key file that you downloaded to this directory.
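For example, assuming you saved the key file as authorized_key.json in your Downloads folder:
mv ~/Downloads/authorized_key.json .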
Create a new Terraform configuration file named main.tf and add the following code:
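A minimal version of this file looks like the following sketch; it assumes the key file you downloaded is saved as authorized_key.json in the project directory:
# main.tf
terraform {
  required_providers {
    doublecloud = {
      source = "doublecloud/doublecloud"
    }
  }
}

provider "doublecloud" {
  # Assumes the downloaded key file is named authorized_key.json
  authorized_key = file("authorized_key.json")
}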
The terraform section instructs Terraform to use the DoubleCloud provider and
download it from the official Terraform Registry.
The provider section specifies where to look for the API keys.
Step 2. Add resource configuration
In the same main.tf file, add the configuration of the resources you want to create.
ClickHouse® cluster
Tip
When you create a Managed ClickHouse® cluster, it needs a network. In the Terraform configuration, declare the network before the cluster, as in this example.
# main.tf
...
data "doublecloud_network" "default" {
  name       = "NETWORK_NAME"           # Replace with the name of the network you want to use
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
}

resource "doublecloud_clickhouse_cluster" "example-clickhouse" {
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
  name       = "example-clickhouse"
  region_id  = "eu-central-1"
  cloud_type = "aws"
  network_id = data.doublecloud_network.default.id

  resources {
    clickhouse {
      resource_preset_id = "s2-c2-m4"
      disk_size          = 34359738368 # 32 GB in bytes
      replica_count      = 1
    }
  }

  config {
    log_level       = "LOG_LEVEL_TRACE"
    max_connections = 120
  }

  access {
    data_services = ["transfer"]
    ipv4_cidr_blocks = [
      {
        value       = "10.0.0.0/24"
        description = "Office in Berlin"
      }
    ]
  }
}
Apache Kafka® cluster
When you create a Managed Apache Kafka® cluster, it needs a network. In the Terraform configuration, declare the network before the cluster, as in this example.
# main.tf
...
data "doublecloud_network" "default" {
  name       = "NETWORK_NAME"           # Replace with the name of the network you want to use
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
}

resource "doublecloud_kafka_cluster" "example-kafka" {
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
  name       = "example-kafka"
  region_id  = "eu-central-1"
  cloud_type = "aws"
  network_id = data.doublecloud_network.default.id

  resources {
    kafka {
      resource_preset_id = "s2-c2-m4"
      disk_size          = 34359738368 # 32 GB in bytes
      broker_count       = 1
      zone_count         = 1
    }
  }

  schema_registry {
    enabled = false
  }

  access {
    data_services = ["transfer"]
    ipv4_cidr_blocks = [
      {
        value       = "10.0.0.0/24"
        description = "Office in Berlin"
      }
    ]
  }
}
Apache Airflow® cluster
When you create a Managed Apache Airflow® cluster, it needs a network. In the Terraform configuration, declare the network before the cluster, as in this example.
# main.tf
...
data "doublecloud_network" "default" {
  name       = "NETWORK_NAME"           # Replace with the name of the network you want to use
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
}

resource "doublecloud_airflow_cluster" "example-airflow" {
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
  name       = "example-airflow"
  region_id  = "eu-central-1"
  cloud_type = "aws"
  network_id = data.doublecloud_network.default.id

  resources {
    airflow {
      max_worker_count   = 1
      min_worker_count   = 1
      environment_flavor = "dev_test"
      worker_concurrency = 16
      worker_disk_size   = 10
      worker_preset      = "small"
    }
  }

  config {
    version_id = "2.10.0"
    sync_config {
      repo_url  = "https://github.com/apache/airflow"
      branch    = "main"
      dags_path = "airflow/example_dags"
    }
  }

  access {
    data_services = ["transfer"]
    ipv4_cidr_blocks = [
      {
        value       = "10.0.0.0/24"
        description = "Office in Berlin"
      }
    ]
  }
}
Endpoints
An endpoint is a connection between a database and Transfer. A source endpoint connects to a remote source and sends data to Transfer, while a target endpoint writes this data to a database.
# main.tf
...
# Source endpoint resource
resource "doublecloud_transfer_endpoint" "example-s3-source" {
  name       = "example-s3-source"
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
  settings {
    s3_source {
      dataset = "hits"
      format {
        parquet {
          buffer_size = 10000
        }
      }
      path_pattern = "data-sets/hits.parquet"
      provider {
        bucket = "doublecloud-docs"
      }
      schema {}
    }
  }
}

# Target endpoint resource
resource "doublecloud_transfer_endpoint" "example-clickhouse-target" {
  name       = "example-clickhouse-target"
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
  settings {
    clickhouse_target {
      clickhouse_cleanup_policy = "DROP"
      connection {
        address {
          cluster_id = doublecloud_clickhouse_cluster.example-clickhouse.id
        }
        database = "default"
        password = "CLICKHOUSE_PASSWORD" # Replace with the ClickHouse user password
        user     = "admin"
      }
    }
  }
}
Transfer
A transfer requires a source and a target endpoint. If you create new endpoints for your transfer resource, make sure to place their configuration before the transfer, as in the example below.
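A sketch of the transfer resource itself, referencing the endpoints defined above, might look like this; the type and activated values are assumptions for a one-time snapshot transfer that you activate later:
# main.tf
...
resource "doublecloud_transfer" "example-transfer" {
  name       = "example-transfer"
  project_id = "DOUBLECLOUD_PROJECT_ID" # Replace with your project ID
  source     = doublecloud_transfer_endpoint.example-s3-source.id
  target     = doublecloud_transfer_endpoint.example-clickhouse-target.id
  type       = "SNAPSHOT_ONLY" # Assumption: one-time snapshot transfer
  activated  = false           # Assumption: activate the transfer manually later
}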
In this tutorial, you place the Terraform and resource configuration in a single .tf file. When creating a more complex project, you may want to use variables and split the configuration into several files, such as main.tf, resource.tf, and so on. It's also good practice to move sensitive information to variables.
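For example, a minimal sketch of a sensitive variable for the ClickHouse user password (the variable name is illustrative) could look like this:
# variables.tf
variable "clickhouse_password" {
  description = "Password for the ClickHouse admin user"
  type        = string
  sensitive   = true # Hides the value in Terraform CLI output
}
You could then reference it in the target endpoint as password = var.clickhouse_password and supply the value through the TF_VAR_clickhouse_password environment variable.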
Step 3. Initialize Terraform and validate the configuration
Initialize Terraform by running the following command in the project directory. It downloads the provider and creates the .terraform directory.
terraform init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of doublecloud/doublecloud...
- Installing doublecloud/doublecloud v0.1.24...
...
Terraform has been successfully initialized!
(Optional) Validate the configuration. The following command checks the configuration for errors and outputs the resources that will be created:
terraform plan
Terraform used the selected providers to generate the following execution plan.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
...
Plan: 4 to add, 0 to change, 0 to destroy.
Step 4. Apply the configuration and create resources
Apply the configuration and create resources:
terraform apply
When prompted, type yes and press Enter.
Terraform will provision your resources,
which may take some time.
To remove all the resources described in the project's configuration, you can also use the following command:
terraform destroy
However, it's not common to use it in production environments.