Work with sharding in ClickHouse®
For these step-by-step instructions, let's assume that you've created the cluster called Test-Cluster
with 5 shards and the database called examples
.
Create a distributed table
-
Connect to your database.
-
Create a table with ReplicatedMergeTree engine
player_ratings-A
:CREATE TABLE examples.player_ratings-A ON CLUSTER default ( `<table structure>` ) ENGINE = ReplicatedMergeTree() PARTITION BY toYYYYMM(EventDate) ORDER BY (CounterID, EventDate)
Let's look through this query in detail:
default
is the cluster name.ON CLUSTER default
automatically creates tables on all hosts.<table structure>
- describe every column and its data type according to ClickHouse documentation -
Create a distributed table
player_ratings-A_distributed
with Distributed engineCREATE TABLE examples.player_ratings_distributed ON CLUSTER default AS examples.data_table1 ENGINE = Distributed(default, examples, player_ratings-A, rand())
In this case, instead of explicitly describing the table structure, you can use
AS examples.player_ratings-A
to copy the structure of the initial table you created earlier.
Distributed table health check
To check the availability of the distributed table you created, run a test query for a number of rows in the table:
SELECT count() FROM examples.player_ratings-A_distributed
You should receive a number as a response to this query.