Monitor a Managed ClickHouse® cluster

Monitor your DoubleCloud Managed ClickHouse® cluster and its hosts with built-in cluster management tools.

These tools provide diagnostic data readouts as graphic charts and display cluster health as an indicator after the cluster name.

Monitor diagnostic data of the cluster and hosts

Go to the Clusters page in the console.
Click on the cluster you want to monitor.
Select the Monitoring tab on the cluster information page to see the charts for your cluster.

Cluster monitoring charts

These charts display the cluster statistics. They update automatically, so you don't need to refresh the page.

Note

Charts automatically apply the most appropriate measurement units (megabytes, gigabytes).

Here is the list of all the monitoring charts for your Managed ClickHouse® cluster:

Select queries: the number of select queries per second.
Insert queries: the number of insert queries per second.
Total queries: the overall number of queries per second.
Select queries per host: the number of select queries for each host, displayed by dedicated lines.
Insert queries per host: the number of insert queries for each host, displayed by dedicated lines.
Failed select queries per host: the overall number of failed select queries for each host, displayed by dedicated lines.
Average select query time per host: average select type query execution time for each host, displayed by dedicated lines.
Average query time per host: average query time for each host, displayed by dedicated lines.
Active locks per host: the number of active locks for each host, displayed by dedicated lines.
Waiting locks per host: the number of pending locks for each host, displayed by dedicated lines.
Connections per host: the number of connections for each host, displayed by dedicated lines.
Read data: data reading speed in bytes per second.
Inserted data: data insert speed in bytes per second.
Merged data: data merge speed in bytes per second.
Read data per host: data reading speed for each host in bytes per second, displayed by dedicated lines.
Inserted data per host: data insert speed for each host in bytes per second, displayed by dedicated lines.
Merged data per host: data merge speed for each host in bytes per second, displayed by dedicated lines.
Read rows per host: the number of rows read per second for each host, displayed by dedicated lines.
Inserted rows per host: the number of rows inserted per second for each host, displayed by dedicated lines.
Merged rows per host: the number of rows merged per second for each host, displayed by dedicated lines.
Max replication delay across tables: the longest replication delay among all the tables. Approaching the limit indicates excessive load or low data insert efficiency.
Max replication queue across tables: the maximum replication queue length among all the tables. Approaching the limit indicates excessive load or replication problems.
Max data parts per partition: the maximum number of data parts per partition across all partitions. Approaching the limit indicates excessive load or low data insert efficiency.
CPU cores usage: the number of CPU cores in use.
Memory usage: RAM capacity in use (in megabytes).
Disk space usage: disk space in use (in megabytes).
CPU cores usage per host: the number of CPU cores in use for each host, displayed by dedicated lines.
Memory usage per host: RAM capacity in use (in megabytes) for each host, displayed by dedicated lines.
Disk space usage per host: disk space in use (in megabytes) for each host, displayed by dedicated lines.
Network data received per host: speed of data download from the network (in bits per second) for each host, displayed by dedicated lines.
Network data sent per host: speed of data upload to the network (in bits per second) for each host, displayed by dedicated lines.
Network data usage per host: data exchange speed over the network (in bits per second) for each host, displayed by dedicated lines.
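
To cross-check the Max data parts per partition chart by hand, you can compute a similar figure from a list of active data parts, for example rows exported from the ClickHouse `system.parts` table (its `database`, `table`, and `partition_id` columns, filtered by `active`). The following is a minimal sketch under that assumption. The function name and the sample rows are hypothetical, and this page doesn't describe how the service calculates the chart, so treat the result as an approximation.

```python
from collections import Counter
from typing import Iterable, Tuple

# One tuple per active data part: (database, table, partition_id).
Part = Tuple[str, str, str]


def max_parts_per_partition(parts: Iterable[Part]) -> int:
    """Return the largest number of parts found in any single partition."""
    # Identical tuples belong to the same partition, so counting them
    # gives the number of parts per partition.
    counts = Counter(parts)
    return max(counts.values(), default=0)


# Hypothetical sample data, not taken from a real cluster.
sample_parts = [
    ("default", "events", "202401"),
    ("default", "events", "202401"),
    ("default", "events", "202402"),
    ("default", "users", "all"),
]

print(max_parts_per_partition(sample_parts))
```

A value that keeps growing between checks suggests the same thing the chart does: inserts arrive faster than merges can consolidate them.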

Cluster health indicator

You can see a health state and status indicator to the right of the cluster name.

States show the overall health of your cluster:

State	Description	Suggested action
Alive	The cluster is operating normally.	No action required.
Stopped	The cluster is intact but has been stopped by the user or by the service.	No action required. If you want to restart the cluster, refer to Start and stop the cluster.
Degraded	The cluster is working below full capacity.	Make sure the cluster isn't undergoing technical maintenance at the moment. If you can't determine the reason for this state yourself, submit a technical support request. Specify the cluster ID and list the last operations performed on the cluster.
Dead	The cluster and all its hosts aren't working.	Submit a technical support request. Specify the cluster ID and list the last operations performed on the cluster.
Unknown	The condition of the cluster is unknown.	Submit a technical support request. Specify the cluster ID and list the last operations performed on the cluster.
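
If you handle cluster states in your own tooling, the table above can be encoded as a small lookup. This is only a sketch: the function name and the returned strings are hypothetical, the state is passed in as plain text because this page doesn't cover how to read it programmatically, and sending unrecognized states to support is an assumption.

```python
def suggested_action(state: str, under_maintenance: bool = False) -> str:
    """Map a cluster health state to the action suggested in the table."""
    if state in ("Alive", "Stopped"):
        return "no action required"
    if state == "Degraded":
        # Maintenance is the first thing the table asks you to rule out.
        if under_maintenance:
            return "wait for maintenance to finish"
        return "investigate, then contact support if the cause is unclear"
    # Dead and Unknown call for a support request. Any unrecognized
    # state is treated the same way (an assumption, not from the table).
    return "contact support"


for state in ("Alive", "Stopped", "Degraded", "Dead", "Unknown"):
    print(f"{state}: {suggested_action(state)}")
```

When the result is a support request, include the cluster ID and the last operations performed on the cluster.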

Statuses show activities affecting the cluster:

Status	Description	Suggested action
Creating	The cluster is getting ready for the first start.	No further action required. After some time, the cluster will transition into the Alive state.
Updating	The cluster is undergoing an update.	No further action required. After some time, the cluster will transition into the Alive state.
Starting	The cluster is starting.	No further action required. After some time, the cluster will transition into the Alive state.
Stopping	The cluster is stopping.	No further action required. After some time, the cluster will transition into the Stopped state.
Error	An error occurred. The cluster is inoperable.	Submit a technical support request. Specify the cluster ID and list the last operations performed on the cluster.
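
The transitional statuses above each lead to a known state, so a script can wait for an operation to finish before it continues. The sketch below assumes you supply your own `get_status` callable that returns the current status text, or `None` when no activity is in progress. That callable, the polling interval, and the poll limit are all hypothetical, since this page doesn't describe a programmatic way to read the status.

```python
import time
from typing import Callable, Optional

# Transitional status -> state the cluster is expected to reach afterwards.
TRANSITIONAL = {
    "Creating": "Alive",
    "Updating": "Alive",
    "Starting": "Alive",
    "Stopping": "Stopped",
}


def expected_state(status: str) -> Optional[str]:
    """Return the state a transitional status leads to, or None otherwise."""
    return TRANSITIONAL.get(status)


def wait_until_settled(
    get_status: Callable[[], Optional[str]],
    interval_s: float = 10.0,
    max_polls: int = 60,
) -> Optional[str]:
    """Poll until the status is no longer transitional or polls run out.

    Returns the last status seen. "Error" is returned immediately so the
    caller can submit a support request.
    """
    status = get_status()
    for _ in range(max_polls - 1):
        if status not in TRANSITIONAL:
            break
        time.sleep(interval_s)
        status = get_status()
    return status


# Usage with a stand-in status source instead of a real cluster.
fake_statuses = iter(["Starting", "Starting", None])
final = wait_until_settled(lambda: next(fake_statuses), interval_s=0)
print(final)
```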
