ClickHouse® high availability

High availability is essential for production ClickHouse® clusters, especially for large-scale data analytics or business-critical applications. A highly available cluster protects data integrity, minimizes downtime costs, and supports scalability and load balancing.

This article explains how you can achieve high availability in your ClickHouse® clusters.

Replicas

Replicas are ClickHouse® hosts that each hold a copy of the data in the cluster or shard. Running several replicas keeps the failure of a single host from making the whole cluster unavailable.

Having several replicas provides the following benefits:

  • Data integrity: In the unlikely event that data in one replica becomes corrupted or unavailable, it isn't lost and can be restored from the other replicas.

  • Load balancing: If you run resource-intensive queries, having several replicas allows you to distribute the load across several nodes and optimize the use of resources.

  • Minimal downtime costs: Should a ClickHouse® host go down, a highly available cluster remains operational. Unlike in a non-replicated cluster, you can still write and query data without significant impact on operations.

For high availability, use at least three replicas per shard. Two replicas are enough for load balancing, but high availability requires three.

ClickHouse Keeper

ClickHouse Keeper is a coordination system for replicating data across ClickHouse® hosts and executing distributed DDL queries.

When you create a ClickHouse® cluster in DoubleCloud, you can select between embedded and dedicated ClickHouse Keeper hosts. Embedded ClickHouse Keeper shares the CPU and RAM with ClickHouse® itself, and the two may compete for resources when the cluster load is high. In the dedicated mode, ClickHouse Keeper runs on three independent nodes.

If you're inserting a lot of data into ClickHouse®, dedicated ClickHouse Keeper is recommended. With dedicated ClickHouse Keeper hosts, two replicas are enough to ensure high availability, but three are still recommended.
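
The replica counts above can be related to majority arithmetic. ClickHouse Keeper relies on a Raft-style consensus protocol, so it keeps working only while more than half of its nodes are reachable. The following Python sketch shows that arithmetic. It is an illustration, not part of any ClickHouse® or DoubleCloud tooling: the function names are hypothetical, and the reading that embedded mode places one Keeper instance on each ClickHouse® host is an assumption rather than something this article states.

```python
def quorum_size(keeper_nodes: int) -> int:
    """Smallest majority of a Keeper ensemble (Raft-style majority rule)."""
    return keeper_nodes // 2 + 1


def tolerated_failures(keeper_nodes: int) -> int:
    """Number of Keeper nodes that can fail while a majority still remains."""
    return keeper_nodes - quorum_size(keeper_nodes)


# Assumption: in embedded mode, the number of Keeper nodes equals the
# number of ClickHouse hosts; in dedicated mode, it is always three.
for nodes in (1, 2, 3, 5):
    print(
        f"{nodes} Keeper node(s): quorum={quorum_size(nodes)}, "
        f"tolerated failures={tolerated_failures(nodes)}"
    )
```

Under these assumptions, two Keeper nodes tolerate no failures, while three tolerate one. This is consistent with the guidance above: with three dedicated Keeper nodes, coordination survives the loss of one Keeper node regardless of how many replicas hold the data, so two replicas can be sufficient.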

Learn more about ClickHouse Keeper