Use Managed Service for ClickHouse® API

Before you start

  1. Create an API key.

  2. Install the DoubleCloud API Python SDK.

Step 1. Create your cluster

ClickHouse® clusters are one or more database hosts.

Use the ClusterService create method. To create a cluster with the minimal configuration, provide the following parameters:The required parameters to create a functional cluster:

  • project_id: ID of your project. You can get it on your project's information page.

  • cloud_type: aws.

  • region_id: eu-central-1.

  • name: Name of your cluster, such as clickhouse-dev.

  • resources: Specify the parameters from the doublecloud.clickhouse.v1.ClusterResources model:

    • resource_preset_id: s2-c2-m4.

    • disk_size: 34359738368 (32 GB).

    • replica_count: 1.

    • shard_count: 1.

import json
import logging

from google.protobuf.wrappers_pb2 import Int64Value

import doublecloud
from doublecloud.clickhouse.v1.cluster_pb2 import ClusterResources
from doublecloud.clickhouse.v1.cluster_service_pb2 import CreateClusterRequest
from doublecloud.clickhouse.v1.cluster_service_pb2_grpc import ClusterServiceStub


def create_cluster(sdk, project_id, region_id, name, network_id):
    cluster_service = sdk.client(ClusterServiceStub)
    operation = cluster_service.Create(
        CreateClusterRequest(
        project_id=project_id,
        cloud_type="aws",
        region_id="eu-central-1",
        name="tutorial-cluster",
            resources=ClusterResources(
                clickhouse=ClusterResources.Clickhouse(
                    resource_preset_id="s2-c2-m4",
                    disk_size=Int64Value(value=34359738368),
                    replica_count=Int64Value(value=1),
                    shard_count=Int64Value(value=1),
                )
            ),
            network_id=network_id,
        )
    )
    logging.info("Creating initiated")
    return operation

Step 2. Connect to the cluster and create a database and table

  1. Go to the Clusters page in the console.

  2. If the cluster status has changed to Alive, select it in the cluster list.

  3. Click WebSQL at the top right.

    Screenshot of a ClickHouse® cluster page in the DoubleCloud console showing the WebSQL button

  4. In WebSQL, click on any database in the connection manager on the left to open the query editor.

  5. Create a database with the CREATE query:

    CREATE DATABASE IF NOT EXISTS website_data ON CLUSTER default
    
  6. Make sure that the database has been created:

    SHOW DATABASES
    
    ┌─name───────────────┐
    │ INFORMATION_SCHEMA │
    │ _system            │
    │ default            │
    │ website_data       │  // your database
    │ information_schema │
    │ system             │
    └────────────────────┘
    
  7. Add a table to the database. The columns will match the data in the example dataset:

    CREATE TABLE website_data.hits ON CLUSTER default (
      Hit_ID Int32, 
      Date Date, 
      Time_Spent Float32, 
      Cookie_Enabled Int32, 
      Region_ID Int32, 
      Gender String, 
      Browser String, 
      Traffic_Source String, 
      Technology String
    )
    ENGINE = ReplicatedMergeTree()
    ORDER BY (Hit_ID, Date)
    
  8. Make sure that the table has been created:

    SHOW TABLES FROM website_data
    
    ┌─name─┐
    │ hits │
    └──────┘
    

Tip

Alternatively, you can use an IDE of your choice to connect to the cluster and create a database.

Step 3. Upload data

  1. To fetch sample data and insert it into the database, run the following command:

    INSERT INTO website_data.hits
    SELECT *
    FROM s3('https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.csv', CSVWithNames)
    SETTINGS format_csv_delimiter = ';'
    
  2. To view the uploaded data, run a SELECT query:

    SELECT * FROM website_data.hits LIMIT 5
    

    The output should look as follows:

    ┌─Hit_ID─┬───────Date─┬─Time_Spent─┬─Cookie_Enabled─┬─Region_ID─┬─Gender─┬─Browser─┬─Traffic_Source──┬─Technology───────────┐
    │  14230 │ 2017-01-30 │  265.70175 │              1 │         2 │ Female │ Firefox │ Direct          │ PC (Windows)         │
    │  14877 │ 2017-04-12 │  317.82758 │              0 │       229 │ Female │ Firefox │ Direct          │ PC (Windows)         │
    │  14892 │ 2017-07-29 │   191.0125 │              1 │        55 │ Female │ Safari  │ Recommendations │ Smartphone (Android) │
    │  15071 │ 2017-06-11 │  148.58064 │              1 │       159 │ Female │ Chrome  │ Ad traffic      │ PC (Windows)         │
    │  15110 │ 2016-09-02 │  289.48334 │              1 │       169 │ Female │ Chrome  │ Search engine   │ Smartphone (IOS)     │
    └────────┴────────────┴────────────┴────────────────┴───────────┴────────┴─────────┴─────────────────┴──────────────────────┘
    

Now you have a Managed ClickHouse® cluster with data in it. See the links below to continue exploring!

Download the complete code example

You can download the full code listing for all the steps above from our Python SDK GitHub repository .

See also