Getting started with Managed Service for ClickHouse®

To get started with the service:

Before you start

Your primary tool to interact with the DoubleCloud is the console. We need to set it up and then configure it before moving on.

  1. Go to the console.

  2. Log in to DoubleCloud if you already have an account, or create one if you are opening the console for the first time.

    Note

    This tutorial shows how to use a CLI client with Docker and with clickhouse-client on DEB-based and RPM-based Linux distributions. You can also use other tools of your choice.

    For more connection options, see Connect to a ClickHouse® database.

  3. Install the software:

    1. Open your terminal.

    2. (Optional) Start Docker if needed:

      service docker start
      
    3. Pull the clickhouse-client Docker image:

      docker pull clickhouse/clickhouse-client
      
    1. Open your terminal.

    2. Connect to the ClickHouse® official DEB repository from your Linux system:

      sudo apt update && sudo apt install -y apt-transport-https ca-certificates dirmngr && \
      sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D754 && \
      echo "deb https://packages.clickhouse.com/deb stable main" | sudo tee \
      /etc/apt/sources.list.d/clickhouse.list
      
    3. Refresh the package list and install the clickhouse-client :

      sudo apt update && sudo apt install -y clickhouse-client
      
    1. Open your terminal.

    2. Connect to a ClickHouse® official RPM repository from your Linux system:

      sudo yum install -y yum-utils
      sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo
      
    3. Install the clickhouse-client :

      sudo yum install -y clickhouse-client
      

    Warning

    If you run a RedHat 7-based Linux distribution, including Cent OS 7, Oracle Linux 7 and others, you need to download and install trusted certificates and manually add the path to them in the clickhouse-client configuration file as follows:

    1. Install the root certificate:

      curl https://letsencrypt.org/certs/isrg-root-x2-cross-signed.pem > \ 
      /etc/pki/ca-trust/source/anchors/isrg-root-x2-cross-signed.pem
      
    2. Install the intermediate certificate:

      curl https://letsencrypt.org/certs/lets-encrypt-r3-cross-signed.pem > \
      /etc/pki/ca-trust/source/anchors/lets-encrypt-r3-cross-signed.pem
      
    3. Update the list of trusted certificates:

      sudo update-ca-trust
      
    4. Locate your clickhouse-client configuration file (by default, you can find it at /etc/clickhouse-client/config.xml) and add the path to the certificates into the <openSSL> section:

      <client> <!-- Used for connection to server's secure tcp port -->
         <loadDefaultCAFile>true</loadDefaultCAFile>
         <cacheSessions>true</cacheSessions>
         <disableProtocols>sslv2,sslv3</disableProtocols>
         <preferServerCiphers>true</preferServerCiphers>
         <caConfig>/etc/ssl/certs/ca-bundle.crt</caConfig>
         <!-- Use for self-signed: <verificationMode>none</verificationMode> -->
         <invalidCertificateHandler>
         <!-- Use for self-signed: <name>AcceptCertificateHandler</name> -->
         <name>RejectCertificateHandler</name>
         </invalidCertificateHandler>
      </client>
      

Create your cluster

ClickHouse® clusters are one or more database hosts.

  1. Select Clusters from the list of services on the left.

  2. Click Create cluster in the upper-right corner of the page.

  3. Select ClickHouse®.

    1. Choose a provider and a region.

    2. Under Resources:

      • Select the s1-c2-m4 preset for CPU, RAM capacity, and storage space to create a cluster with minimal configuration.

      • Choose the number of replicas. Let's keep it as is with a single replica.

      • Select the number of shards. Keep a single shard.

    3. Under Basic settings:

      • Enter the cluster Name, in this tutorial - first-cluster.

      • From the Version drop-down list, select the ClickHouse® version the Managed ClickHouse® cluster will use. For most clusters, we recommend using the latest version.

    4. Under NetworkingVPC, specify in which DoubleCloud VPC to locate your cluster. Use the default value in the previously selected region if you don't need to create this cluster in a specific network.

    5. Click Submit.

Your cluster will appear with the Creating status on the Clusters page in the console. Setting everything up may take some time. When the cluster is ready, it changes its state to Alive.

Click on the cluster and you'll see the following page:

cluster-created

Tip

The DoubleCloud service creates the superuser admin and its password automatically. You can find both the User and the Password in the Overview tab on the cluster information page.

To create users for other roles, see Manage ClickHouse® users

Connect to the cluster and create a database

  1. Select Clusters from the list of services on the left.

  2. Select the name of your cluster to open its information page. By default, you will see the Overview tab.

  3. Under Connection strings, find the Native interface string and click Copy.

  4. Open your Linux terminal and run a command to connect to your cluster:

    docker run --network host --rm -it 
    clickhouse/<Native interface connection string>
    
    The complete Docker command structure
    docker run --network host --rm -it  \ 
                clickhouse/clickhouse-client \
                --host <FQDN of your cluster> \
                --secure \
                --user admin \
                --password <Cluster user password> \
                --port 9440 
    
    <Native interface connection string>
    
    The complete structure of the Native interface connection string
    clickhouse-client --host <FQDN of the host> \
                      --secure \
                      --user <Cluster user name> \
                      --password <Cluster user password> \
                      --port 9440 
    
  5. After you have connected to the cluster, run the CREATE query:

    CREATE DATABASE IF NOT EXISTS first_database
    

    After that, you can check the result by executing the following command:

    SHOW DATABASES
    

    The output will contain the name of your recently created database:

    ┌─name───────────────┐
    │ INFORMATION_SCHEMA │
    │ _system            │
    │ default            │
    │ first_database     │
    │ information_schema │
    │ system             │
    └────────────────────┘
    
  6. Now, it's time to add a table to the first_database database on your cluster. The table will have the following structure adjusted to the dataset that we will upload later:

    CREATE TABLE first_database.hits ON CLUSTER 'default' \
          (Hit_ID Int32, Date Date, Time_Spent Float32, Cookie_Enabled Int32, Region_ID Int32, Gender String, Browser String, Traffic_Source String, Technology String) \
          ENGINE ReplicatedMergeTree \
          ORDER BY (Hit_ID, Date)
    

    Check if your query was successful and the table exists in the database:

    SHOW TABLES FROM first_database
    

    The readout will be the following:

    ┌─name─┐
    │ hits │
    └──────┘
    

Tip

Alternatively, you can use a GUI-based IDE of your choice to connect to a cluster and create a database.

Upload data

  1. Now you can open another terminal instance or exit from your cluster by typing the exit command.

  2. Run the following query that will fetch the data from our S3 bucket and combine it with the INSERT query:

    curl https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.csv
    | docker run --network host --rm -it 
    clickhouse/<Native interface connection string>
    --query="INSERT INTO first_database.hits FORMAT CSVWithNames"
    --format_csv_delimiter=";"
    
    The complete Docker command structure
    curl https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.csv
    | docker run --network host --rm -i 
       clickhouse/clickhouse-client 
       --host <FQDN of your cluster> 
       --port 9440 --secure 
       --user admin 
       --password <your cluster password> 
       --query="INSERT INTO first_database.hits FORMAT CSVWithNames"
       --format_csv_delimiter=";"
    
  3. If the query was successful, perform the SELECT query to see the loaded data:

    docker run --network host --rm -it 
    clickhouse/<Native interface connection string>
    --query="SELECT * FROM first_database.hits LIMIT 20"
    
    The complete Docker command structure
    docker run --network host --rm -i 
          clickhouse/clickhouse-client 
          --host <FQDN of your cluster> 
          --port 9440 --secure 
          --user admin 
          --password <your cluster password> 
          --query="SELECT * FROM first_database.hits LIMIT 20"
    
  1. Run the following query that will fetch the data from our S3 bucket and combine it with the INSERT query:

    INSERT INTO first_database.hits FORMAT CSVWithNames
    
  2. If the query was successful, perform the SELECT query to see the loaded data:

    SELECT * FROM first_database.hits LIMIT 20
    

You should see your data in the table:

67112 2016-01-10 388.93976 0 184 Male   Firefox Direct          PC (Mac) 
65411 2016-01-30 267.13525 0 229 Female Firefox Search engine   PC (Windows) 
57240 2016-02-05 321.99017 0 229 Female Firefox Search engine   PC (Windows) 
34703 2016-02-11 300.81143 1 34  Male   Safari  Recommendations Tablet (Android) 
55134 2016-02-17 324.48718 0 2   Male   Firefox Direct          Other 
46247 2016-02-23 307.71362 1 2   Male   Safari  Recommendations Smartphone (Android) 
15422 2016-02-29 118.30818 1 2   Female Safari  Recommendations Smartphone (Android) 
36341 2016-03-06 330.35165 1 51  Male   Chrome  Social network  PC (Windows) 
54562 2016-03-12 325.20392 0 51  Female Chrome  Search engine   Smartphone (I0S) 
63833 2016-03-18 316.09775 1 51  Female Chrome  Search engine   Smartphone (I0S) 
43941 2016-03-24 263.73654 1 51  Male   Firefox Search engine   PC (Windows) 
38583 2016-03-30 363.8421  0 51  Female Safari  Internal        Smartphone (Android) 
65839 2016-04-05 284.4762  1 51  Male   Firefox Direct          PC (Windows) 
38024 2016-04-11 305.58286 1 51  Male   Edge    Search engine   PC (Windows) 
30573 2016-04-17 170.59955 0 52  Male   Edge    Search engine   PC (Windows) 
59083 2016-04-23 512.53    0 52  Male   Edge    Search engine   PC (Windows) 
40305 2016-04-29 324.144   0 52  Female Firefox Direct          Other 

Now you have a Managed ClickHouse® cluster with data in it. See the links below to continue exploring!

See also