Hybrid storage in Managed ClickHouse® clusters
Hybrid storage provides fault tolerance for data storage. It allows you to manage data placement for MergeTree-family tables.
This feature is enabled by default. The DoubleCloud service creates an S3 bucket (if the cluster is located in AWS) or a Cloud Storage bucket (if it is located in Google Cloud) for each Managed ClickHouse® cluster as a system disk.
The size of the object storage isn't limited, but you are charged for each piece of data you store there. You are not charged for the object storage until you enable it and put some data there.
You're not required to set a TTL (Time To Live) for hybrid storage, but doing so gives you direct control over which data gets stored in object storage. Without specifying a TTL, data will be moved to object storage only when the storage on network disks runs out of space.
To learn more about storage policies and disks, refer to the following ClickHouse® documentation: Using Multiple Block Devices for Data Storage
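To see which storage policies and disks a cluster exposes, you can query the ClickHouse® system tables. The queries below are a minimal sketch; the policy and disk names returned depend on your cluster, and the names used elsewhere on this page (hybrid_storage, object_storage) are expected to appear among them.

```sql
-- List the storage policies and the disks that each volume uses.
SELECT policy_name, volume_name, disks
FROM system.storage_policies;

-- Show the type, free space, and total space of each disk.
SELECT
    name,
    type,
    formatReadableSize(free_space) AS free,
    formatReadableSize(total_space) AS total
FROM system.disks;
```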
How to use hybrid storage
The following settings are required to configure a table that will separate the data and place it in the cold and hot storage sections:
TO DISK rule. You can set TTL only for the values of the Date
hybrid_storage, rows are placed only to the object storage, and data is not transferred between storages.
A sample query to create a table with TTL looks as follows:
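CREATE TABLE table_name ON CLUSTER default (
    Date Date  -- date column used by PARTITION BY, ORDER BY, and TTL
    -- other column definitions go here
)
ENGINE = ReplicatedMergeTree
PARTITION BY Date
ORDER BY (Date)
TTL Date + INTERVAL 5 DAY TO DISK 'object_storage'
SETTINGS storage_policy = 'hybrid_storage'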
This query works as follows:
If the number of days from the current date to the Date column value is less than the TTL value (5 days in this example), this data is stored on network drives.
If the number of days from the current date to the Date column value is greater than or equal to the TTL value (that is, the lifetime has already expired), this data is placed in the object storage according to the TO DISK 'object_storage' rule.
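To check where the data of such a table actually resides, you can look at the disk of each active data part. This is a sketch that assumes the table is named table_name, as in the sample query, and lives in the current database. TTL moves run in the background, so recently expired parts may still be listed on the network disk for some time.

```sql
-- Show which disk holds each active part of the table.
SELECT
    partition,
    name,
    disk_name,
    formatReadableSize(bytes_on_disk) AS size
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'table_name'
  AND active
ORDER BY partition;
```

Parts whose disk_name is object_storage have already been moved to the object storage.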
Caching in the DoubleCloud ClickHouse® object storage
DoubleCloud automatically caches data in the object storage. It uses the LRU (least recently used) eviction policy.
When creating a ClickHouse® cluster, your allocated cache size is 50% of the SSD Storage volume you configure. For example, the minimum possible s1-c2-m4 resource preset with 32 GB of available storage will have 16 GB of available cache.
After the cluster is created, its cache size remains fixed.
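To get a rough idea of how much of the cache is in use, you can sum the sizes of the cached file segments. This sketch assumes that the cluster's ClickHouse® version provides the system.filesystem_cache table and that your user is allowed to read it. Neither is confirmed by this page.

```sql
-- Total size of the file segments currently held in the filesystem cache.
SELECT formatReadableSize(sum(size)) AS cached
FROM system.filesystem_cache;
```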