ClickHouse® connector
You can use the ClickHouse® connector in both source and target endpoints. In source endpoints, the connector retrieves data from ClickHouse® databases. In target endpoints, it inserts data into ClickHouse® databases.
Source endpoint
To configure a ClickHouse® source endpoint, provide the following settings:
-
In Connection type, select the type of the ClickHouse® cluster you want to connect to.
-
Configure the connection:
Managed cluster
Connect to a ClickHouse® cluster deployed in DoubleCloud.
-
In Managed Cluster, select the cluster you want to connect to.
-
In Authentication, select Default if you want to connect as the admin user. If you want to connect as a different user, select Password and enter user credentials.
On-premise
Connect to an on-premise ClickHouse® installation.
-
Under Shards:
-
Click + Shard and enter the shard ID.
-
Under Hosts, click + Host and enter the domain name (FQDN) or IP address of the host.
-
-
In HTTP Port, enter the port for HTTP interface connections or leave the default value of 8443.
Tip
-
Optional fields have default values if these fields are specified.
-
Complex types recording is supported (array, tuple, etc.).
-
-
In Native port, enter the port for clickhouse-client connections (typically 9440).
-
Enable SSL if you need to secure your connection.
-
To encrypt data transmission, upload a .pem file with certificates under CA Certificate.
-
In User and Password, enter the user credentials to connect to the database.
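Before saving an on-premise endpoint, it can help to confirm that the host answers on the HTTP port. The sketch below is not part of the connector. It assumes the standard ClickHouse® HTTP interface, whose `/ping` handler replies with `Ok.`. The function names `ping_url` and `check_host` and the host `ch.example.com` are hypothetical.

```python
import ssl
import urllib.request
from typing import Optional


def ping_url(host: str, http_port: int = 8443, use_ssl: bool = True) -> str:
    """Build the URL of the ClickHouse HTTP interface health check."""
    scheme = "https" if use_ssl else "http"
    return f"{scheme}://{host}:{http_port}/ping"


def check_host(host: str, ca_file: Optional[str] = None, timeout: float = 5.0) -> bool:
    """Return True if the host answers /ping with 'Ok.' over SSL.

    ca_file is the path to the same .pem file you would upload
    under CA Certificate (an assumption of this sketch).
    """
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(ping_url(host), context=context, timeout=timeout) as resp:
        return resp.read().strip() == b"Ok."


print(ping_url("ch.example.com"))  # https://ch.example.com:8443/ping
```

Calling `check_host("ch.example.com", ca_file="ca.pem")` needs network access to the host, so it is left out of the example run.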
-
-
In Database, enter the name of the database you want to transfer data from.
-
Under Table filter, add tables you want to include or exclude:
-
Included tables: Only data from these tables is transferred.
-
Excluded tables: Data from these tables won’t be transferred.
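The effect of the two table filter lists can be sketched as a small function. This is an illustration only: the name `select_tables` is hypothetical, and two behaviors are assumptions rather than documented facts, namely that an empty include list means all tables and that exclusion wins when a table appears in both lists.

```python
from typing import Iterable, List


def select_tables(
    available: Iterable[str],
    included: Iterable[str] = (),
    excluded: Iterable[str] = (),
) -> List[str]:
    """Return the tables a transfer would read, in their original order."""
    include_set = set(included)
    exclude_set = set(excluded)
    selected = []
    for table in available:
        # Assumption: an empty include list means "all tables".
        if include_set and table not in include_set:
            continue
        # Assumption: exclusion takes precedence over inclusion.
        if table in exclude_set:
            continue
        selected.append(table)
    return selected


tables = ["events", "users", "tmp_import"]
print(select_tables(tables, excluded=["tmp_import"]))  # ['events', 'users']
print(select_tables(tables, included=["events"]))      # ['events']
```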
-
To create a ClickHouse® source endpoint with the API, use the endpoint.ClickhouseSource model.
ClickHouse® type | Transfer type |
---|---|
Int64 | int64 |
Int32 | int32 |
Int16 | int16 |
Int8 | int8 |
UInt64 | uint64 |
UInt32 | uint32 |
UInt16 | uint16 |
UInt8 | uint8 |
— | float |
Float64 | double |
FixedString, String | string |
IPv4, IPv6, Enum8, Enum16 | utf8 |
— | boolean |
Date | date |
DateTime | datetime |
DateTime64 | timestamp |
REST ... | any |
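The source mapping above can be expressed as a lookup, which is handy when you want to predict the Transfer type of a column before running a transfer. This is a minimal sketch, not the connector's code. It assumes that type parameters such as `FixedString(16)` or `DateTime64(3)` are ignored, and that the last table row means every remaining type maps to `any`. The name `transfer_type` is hypothetical.

```python
import re

# Mapping copied from the table above.
CLICKHOUSE_TO_TRANSFER = {
    "Int64": "int64", "Int32": "int32", "Int16": "int16", "Int8": "int8",
    "UInt64": "uint64", "UInt32": "uint32", "UInt16": "uint16", "UInt8": "uint8",
    "Float64": "double",
    "FixedString": "string", "String": "string",
    "IPv4": "utf8", "IPv6": "utf8", "Enum8": "utf8", "Enum16": "utf8",
    "Date": "date", "DateTime": "datetime", "DateTime64": "timestamp",
}


def transfer_type(clickhouse_type: str) -> str:
    """Return the Transfer type for a ClickHouse column type string."""
    # Assumption: parameters in parentheses do not affect the mapping,
    # so only the leading type name is looked up.
    match = re.match(r"\w+", clickhouse_type.strip())
    name = match.group(0) if match else ""
    # Assumption: anything not listed falls back to "any".
    return CLICKHOUSE_TO_TRANSFER.get(name, "any")


print(transfer_type("FixedString(16)"))  # string
print(transfer_type("DateTime64(3)"))    # timestamp
print(transfer_type("Array(String)"))    # any
```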
Target endpoint
To configure a target ClickHouse® endpoint, provide the following settings:
-
In Connection type, select the type of the ClickHouse® cluster you want to connect to.
-
Configure the connection:
Managed cluster
Connect to a ClickHouse® cluster deployed in DoubleCloud.
-
In Managed Cluster, select the cluster you want to connect to.
-
In Authentication, select Default if you want to connect as the admin user. If you want to connect as a different user, select Password and enter user credentials.
On-premise
Connect to an on-premise ClickHouse® installation.
-
Under Shards:
-
Click + Shard and enter the shard ID.
-
Under Hosts, click + Host and enter the domain name (FQDN) or IP address of the host.
-
-
In HTTP Port, enter the port for HTTP interface connections or leave the default value of 8443.
Tip
-
Optional fields have default values if these fields are specified.
-
Complex types recording is supported (array, tuple, etc.).
-
-
In Native port, enter the port for clickhouse-client connections (typically 9440).
-
Enable SSL if you need to secure your connection.
-
To encrypt data transmission, upload a .pem file with certificates under CA Certificate.
-
In User and Password, enter the user credentials to connect to the database.
-
-
In Database, enter the name of the database you want to transfer data to.
-
Select a Cleanup policy to specify how data in the target database is cleaned up when a transfer is activated, reactivated, or reloaded.
Cleanup policy reference
-
Disabled: Don’t clean. Select this option if you only perform replication without copying data.
-
Drop (default): Fully delete the tables included in the transfer. Use this option to always transfer the latest version of the table schema to the target database from the source.
-
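Truncate: Execute the TRUNCATE statement on the tables included in the transfer. This removes their data but keeps the tables themselves.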
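To make the difference between the policies concrete, the sketch below lists the kind of SQL each policy corresponds to. The exact statements the service issues are not documented here, so treat the `DROP TABLE IF EXISTS` and `TRUNCATE TABLE IF EXISTS` forms as assumptions. The function name `cleanup_statements` and the database and table names are hypothetical.

```python
from typing import List


def cleanup_statements(policy: str, database: str, tables: List[str]) -> List[str]:
    """Return illustrative ClickHouse statements for a cleanup policy."""
    if policy == "Disabled":
        # Nothing is cleaned; existing data stays in place.
        return []
    if policy == "Drop":
        # The table is removed, so it is recreated with the source schema.
        template = "DROP TABLE IF EXISTS `{db}`.`{table}`"
    elif policy == "Truncate":
        # Rows are removed, the table and its schema are kept.
        template = "TRUNCATE TABLE IF EXISTS `{db}`.`{table}`"
    else:
        raise ValueError(f"unknown cleanup policy: {policy}")
    return [template.format(db=database, table=table) for table in tables]


for statement in cleanup_statements("Truncate", "analytics", ["events", "users"]):
    print(statement)
```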
-
-
In Sharding settings, select how you want to split the data between shards.
Sharding by column value
-
Enter the name of a column whose values are used as the row sharding key.
-
(Optional) If you want to map a column value to a specific shard, click + Mapping. Enter the column value and the shard name.
By default, the target endpoint uses automatic sharding.
Sharding by transfer ID
-
(Optional) If you want to map a transfer ID to a specific shard, click + Mapping. Enter the transfer ID and the shard name.
By default, the target endpoint uses automatic sharding.
The No sharding and Uniform random sharding options have no configurable parameters.
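The interaction between explicit mappings and automatic sharding can be sketched as follows. The same logic applies whether the key is a column value or a transfer ID. How the service implements automatic sharding is not described here, so the stable hash modulo the shard count used below is an assumption. The name `pick_shard` and the shard names are hypothetical.

```python
import zlib
from typing import Dict, List, Optional


def pick_shard(key: str, shards: List[str], mapping: Optional[Dict[str, str]] = None) -> str:
    """Choose a shard for a sharding key (column value or transfer ID)."""
    # An explicit mapping entry takes priority.
    if mapping and key in mapping:
        return mapping[key]
    # Assumption: automatic sharding is modelled as a stable hash
    # of the key modulo the number of shards.
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index]


shards = ["shard1", "shard2", "shard3"]
mapping = {"eu": "shard2"}
print(pick_shard("eu", shards, mapping))  # shard2 (explicit mapping)
print(pick_shard("us", shards, mapping))  # one of the three shards, stable across runs
```

Using `zlib.crc32` rather than the built-in `hash` keeps the result the same between Python processes, which matters if rows with the same key must always land on the same shard.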
-
-
(Optional) Configure table renaming:
-
Under Advanced settings → Rename tables, click + Table.
-
Enter the source and target table names.
If the target endpoint already has a table with the same name, the data is written into the existing table.
Merge multiple tables with table renaming
You can merge data from several source tables into a single one at your transfer target. To do that, create several Rename table entries with different Source table name values and the same Target table name values.
The data schemas in the source and target tables must be the same.
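A set of rename entries behaves like a lookup from source table name to target table name. The sketch below shows how several entries that share a target name merge their source tables. It is an illustration only: the helper names and table names are hypothetical, and tables without an entry are assumed to keep their names.

```python
from collections import defaultdict
from typing import Dict, List


def target_table(source_table: str, renames: Dict[str, str]) -> str:
    """Return the target name for a source table."""
    # Assumption: tables without a rename entry keep their name.
    return renames.get(source_table, source_table)


def group_by_target(source_tables: List[str], renames: Dict[str, str]) -> Dict[str, List[str]]:
    """Group source tables by the target table they are written to."""
    merged = defaultdict(list)
    for source in source_tables:
        merged[target_table(source, renames)].append(source)
    return dict(merged)


renames = {"events_2023": "events", "events_2024": "events"}
print(group_by_target(["events_2023", "events_2024", "users"], renames))
# {'events': ['events_2023', 'events_2024'], 'users': ['users']}
```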
-
-
In Flush interval, specify the desired value or leave the default of 1 second.
To create a ClickHouse® target endpoint with the API, use the endpoint.ClickhouseTarget model.
Transfer type | ClickHouse® type |
---|---|
int64 | Int64 |
int32 | Int32 |
int16 | Int16 |
int8 | Int8 |
uint64 | UInt64 |
uint32 | UInt32 |
uint16 | UInt16 |
uint8 | UInt8 |
float | Float64 |
double | Float64 |
string | String |
utf8 | String |
boolean | UInt8 |
date | Date |
datetime | DateTime |
timestamp | DateTime64 |
any | String |
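Comparing the target mapping with the source mapping shows that some Transfer types do not survive a round trip through ClickHouse®. The sketch below encodes the target table and the relevant rows of the source table, then lists the types that come back different. The dictionary and function names are hypothetical, and the check covers only the type names shown in the two tables.

```python
# Target mapping, copied from the table above.
TRANSFER_TO_CLICKHOUSE = {
    "int64": "Int64", "int32": "Int32", "int16": "Int16", "int8": "Int8",
    "uint64": "UInt64", "uint32": "UInt32", "uint16": "UInt16", "uint8": "UInt8",
    "float": "Float64", "double": "Float64",
    "string": "String", "utf8": "String",
    "boolean": "UInt8",
    "date": "Date", "datetime": "DateTime", "timestamp": "DateTime64",
    "any": "String",
}

# Rows of the source mapping for the ClickHouse types produced above.
CLICKHOUSE_TO_TRANSFER = {
    "Int64": "int64", "Int32": "int32", "Int16": "int16", "Int8": "int8",
    "UInt64": "uint64", "UInt32": "uint32", "UInt16": "uint16", "UInt8": "uint8",
    "Float64": "double", "String": "string",
    "Date": "date", "DateTime": "datetime", "DateTime64": "timestamp",
}


def round_trip(transfer_type: str) -> str:
    """Write a Transfer type to ClickHouse, then read it back."""
    return CLICKHOUSE_TO_TRANSFER[TRANSFER_TO_CLICKHOUSE[transfer_type]]


lossy = sorted(t for t in TRANSFER_TO_CLICKHOUSE if round_trip(t) != t)
print(lossy)  # ['any', 'boolean', 'float', 'utf8']
```

For example, a `boolean` column is stored as `UInt8` and is read back as `uint8` if the same table later serves as a source.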