Get started with Managed Service for ClickHouse®

This tutorial explains how to create a Managed ClickHouse® cluster on DoubleCloud, connect to it, and upload sample data.

If you're already familiar with ClickHouse® and know how to configure it, refer to Create a Managed ClickHouse® cluster for more detailed instructions instead.

If you're a new DoubleCloud user, this tutorial won't cost you anything: you can use the trial period credits to test the platform, including creating fully operational clusters.

Before you begin

  1. Log in or sign up to the DoubleCloud console.

Step 1. Create a cluster

  1. In the console, go to the Clusters page and click Create cluster.

  2. Select ClickHouse.

    Tip

    The cluster creation page contains various options that allow you to configure the cluster for your needs. If you're just testing ClickHouse® and DoubleCloud, you can go with the default settings, which create a fully functional cluster with a minimal resource configuration. To do that, click Submit at the bottom of the page and skip to Step 2. Install the ClickHouse® client.

    Otherwise, if you want to learn how you can configure the cluster, continue with the following steps.

  3. Review the Provider and Region settings.

    You can create Managed ClickHouse® clusters on AWS or Google Cloud in any of the available regions. By default, DoubleCloud preselects the region nearest to you.

  4. Review Resources.

    For this getting started guide, the defaults are enough. However, when you create a production cluster, make sure to select 3 replicas to ensure high availability.

  5. Under Basic settings enter the cluster name, for example clickhouse-dev. Leave the latest LTS version that's preselected.

  6. Review the Advanced settings.

    For this getting started guide, the defaults are enough. For a production cluster, make sure to select dedicated keeper hosts, so that they don't compete for resources with ClickHouse® itself.

  7. Click Submit.

Creating a cluster usually takes five to seven minutes depending on the cloud provider and region. When the cluster is ready, its status changes from Creating to Alive.

While the cluster is being created, you can move on to the next step and install the ClickHouse® client.

Step 2. Install the ClickHouse® client

To connect to your Managed ClickHouse® cluster from your local machine, you need the ClickHouse® client. If you don't have it configured yet, the quickest way to get it is to install ClickHouse® locally.

To install ClickHouse® and the client on Linux, FreeBSD, or macOS, run the following command:

curl https://clickhouse.com/ | sh

This command downloads the official binary for your operating system.
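
Before moving on, you may want to confirm that the download worked. The sketch below assumes the install script saved a file named clickhouse in the current directory, which is what the ./clickhouse client command used later in this tutorial implies. The helper name check_clickhouse_binary is only illustrative.

```shell
# Hypothetical helper: report whether an executable exists at the given path.
check_clickhouse_binary() {
  bin_path="$1"
  if [ -x "$bin_path" ]; then
    echo "found: $bin_path"
  else
    echo "missing: $bin_path"
  fi
}

# Assumption: the installer saved the binary as ./clickhouse in the current directory.
check_clickhouse_binary ./clickhouse
```

If the helper reports that the binary was found, running ./clickhouse client --version should print the client version.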

Step 3. Connect to the cluster

  1. Open the Clusters overview page in the console.

  2. Select the cluster you created and make sure its status has changed from Creating to Alive.

  3. On the Overview tab, find the Connection strings section. Copy the Native interface connection string.

  4. In the terminal, enter the connection string and replace clickhouse-client with ./clickhouse client.

    Note

    You need to do that because you used a quicker way to install the ClickHouse® client in Step 2. When you later install the client separately, you can use the whole connection string, including the clickhouse-client command.

    ./clickhouse client --host <rest_of_the_connection_string_you_copied>
    
    Connected to ClickHouse server version 23.8.9.
    ach-euc1-az2-s1-1.<cluster_name>.at.double.cloud :)
    

    :) means that the cluster is ready to receive commands.
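
The replacement described in this step can also be done in the shell rather than by hand. This is a minimal sketch: the connection string below is a placeholder with made-up values, and the exact set of options in the string you copy from the console may differ.

```shell
# Placeholder connection string; the real one comes from the console.
# Host, user, and password values here are made up for illustration.
conn_string='clickhouse-client --host example-host.at.double.cloud --port 9440 --secure --user admin --password example-password'

# Strip the leading "clickhouse-client" and prepend "./clickhouse client".
local_command="./clickhouse client${conn_string#clickhouse-client}"

echo "$local_command"
```

The echo prints the adjusted command so that you can review it before running it in the terminal.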

Step 4. Create a database and upload data

  1. Create a database:

    CREATE DATABASE IF NOT EXISTS first_database
    
  2. Make sure that the database has been created:

    SHOW DATABASES
    
    ┌─name───────────────┐
    │ INFORMATION_SCHEMA │
    │ _system            │
    │ default            │
    │ first_database     │
    │ information_schema │
    │ system             │
    └────────────────────┘
    
  3. Add a table to the database. The columns will match the data in the example dataset:

    CREATE TABLE first_database.hits ON CLUSTER default (
      Hit_ID Int32, 
      Date Date, 
      Time_Spent Float32, 
      Cookie_Enabled Int32, 
      Region_ID Int32, 
      Gender String, 
      Browser String, 
      Traffic_Source String, 
      Technology String
    )
    ENGINE = ReplicatedMergeTree()
    ORDER BY (Hit_ID, Date)
    
  4. Make sure that the table has been created:

    SHOW TABLES FROM first_database
    
    ┌─name─┐
    │ hits │
    └──────┘
    
  5. To make it easier for you to test ClickHouse®, DoubleCloud provides sample datasets. For this guide, you can use a small sample dataset with website hits stored in an S3 bucket.

    To fetch sample data and insert it into the database, run the following command:

    INSERT INTO first_database.hits
    SELECT *
    FROM s3('https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.csv', CSVWithNames)
    SETTINGS format_csv_delimiter = ';'
    
  6. To view the uploaded data, run a SELECT query:

    SELECT * FROM first_database.hits LIMIT 5
    

    The output should look as follows:

    ┌─Hit_ID─┬───────Date─┬─Time_Spent─┬─Cookie_Enabled─┬─Region_ID─┬─Gender─┬─Browser─┬─Traffic_Source──┬─Technology───────────┐
    │  14230 │ 2017-01-30 │  265.70175 │              1 │         2 │ Female │ Firefox │ Direct          │ PC (Windows)         │
    │  14877 │ 2017-04-12 │  317.82758 │              0 │       229 │ Female │ Firefox │ Direct          │ PC (Windows)         │
    │  14892 │ 2017-07-29 │   191.0125 │              1 │        55 │ Female │ Safari  │ Recommendations │ Smartphone (Android) │
    │  15071 │ 2017-06-11 │  148.58064 │              1 │       159 │ Female │ Chrome  │ Ad traffic      │ PC (Windows)         │
    │  15110 │ 2016-09-02 │  289.48334 │              1 │       169 │ Female │ Chrome  │ Search engine   │ Smartphone (IOS)     │
    └────────┴────────────┴────────────┴────────────────┴───────────┴────────┴─────────┴─────────────────┴──────────────────────┘
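
Once the data is loaded, a natural follow-up is an aggregate query over the hits table. The sketch below saves one such query to a file so that it can be run non-interactively. The file name hits_by_browser.sql and the column aliases are illustrative choices, and the query relies only on the columns defined in the CREATE TABLE statement above.

```shell
# Write an example aggregate query to a file (the file name is hypothetical).
cat > hits_by_browser.sql <<'SQL'
SELECT Browser, count() AS hits, round(avg(Time_Spent), 2) AS avg_time_spent
FROM first_database.hits
GROUP BY Browser
ORDER BY hits DESC
SQL

# To run it, pass the file to the client together with your connection options:
#   ./clickhouse client <rest_of_the_connection_string_you_copied> --queries-file hits_by_browser.sql
echo "Saved query to hits_by_browser.sql"
```

You can also paste the same SELECT statement directly into the interactive client session from Step 3.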
    

Step 5 (optional). Clean up

When you no longer need resources, it's good practice to stop or delete them, so that you don't incur additional costs.

  • To stop a Managed ClickHouse® cluster, select it on the Clusters page in the console and click Stop.

    When your cluster is stopped, you don't pay for its CPU and RAM, but you're still billed for SSD Storage.

  • To delete a cluster, select it on the Clusters page in the console and click Delete.

What's next

Now that you have learned how to create a cluster and upload sample data to it, continue exploring the DoubleCloud platform or create a production Managed ClickHouse® cluster for your needs.