Monitor a Managed ClickHouse® cluster
Monitor your DoubleCloud Managed ClickHouse® cluster and its hosts with built-in cluster management tools.
These tools provide diagnostic data readouts as graphic charts and display cluster health as an indicator after the cluster name.
Monitor diagnostic data of the cluster and hosts
-
Go to the Clusters page.
-
Click on the cluster you want to monitor.
-
Select the Monitoring tab on the cluster information page to see the charts for your cluster.
Cluster monitoring charts
These charts display the cluster statistics. They're updated automatically, so you don't need to refresh the page.
Note
Charts automatically apply the most appropriate measurement units (megabytes, gigabytes).
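As an illustration of the unit selection described in the note, here is a minimal sketch of how a value in bytes could be rendered in the largest fitting unit. The function name `format_size` and the 1024-based thresholds are assumptions made for this example; the console's actual scaling rules aren't documented here.

```python
def format_size(num_bytes: float) -> str:
    """Render a byte count in the largest unit that keeps the value below 1024.

    Hypothetical helper: the 1024-based steps are an assumption, not the
    documented behavior of the monitoring charts.
    """
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the last unit.
        if abs(value) < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


print(format_size(5 * 1024**2))  # 5.0 MB
print(format_size(3 * 1024**3))  # 3.0 GB
```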
Here is the list of all the monitoring charts for your Managed ClickHouse® cluster:
-
Select queries: the number of select queries per second.
-
Insert queries: the number of insert queries per second.
-
Total queries: the overall number of queries per second.
-
Select queries per host: the number of select queries for each host, displayed by dedicated lines.
-
Insert queries per host: the number of insert queries for each host, displayed by dedicated lines.
-
Failed select queries per host: the overall number of unsuccessful select queries for each host, displayed by dedicated lines.
-
Average select query time per host: the average select query execution time for each host, displayed by dedicated lines.
-
Average query time per host: average query time for each host, displayed by dedicated lines.
-
Active locks per host: the number of active locks for each host, displayed by dedicated lines.
-
Waiting locks per host: the number of pending locks for each host, displayed by dedicated lines.
-
Connections per host: the number of connections for each host, displayed by dedicated lines.
-
Read data: data reading speed in bytes per second.
-
Inserted data: data insert speed in bytes per second.
-
Merged data: data merge speed in bytes per second.
-
Read data per host: data reading speed for each host in bytes per second, displayed by dedicated lines.
-
Inserted data per host: data insert speed for each host in bytes per second, displayed by dedicated lines.
-
Merged data per host: data merge speed for each host in bytes per second, displayed by dedicated lines.
-
Read rows per host: the number of rows read per second for each host, displayed by dedicated lines.
-
Inserted rows per host: the number of rows inserted per second for each host, displayed by dedicated lines.
-
Merged rows per host: the number of rows merged per second for each host, displayed by dedicated lines.
-
Max replication delay across tables: the longest replication delay among all the tables. Approaching the limit indicates excessive load or low data insert efficiency.
-
Max replication queue across tables: the maximum replication queue length among all the tables. Approaching the limit indicates excessive load or replication problems.
-
Max data parts per partition: the maximum number of data parts per partition among all partitions. Approaching the limit indicates excessive load or low data insert efficiency.
-
CPU cores usage: the number of CPU cores in use.
-
Memory usage: RAM capacity in use (in megabytes).
-
Disk space usage: disk space in use (in megabytes).
-
CPU cores usage per host: the number of CPU cores in use for each host, displayed by dedicated lines.
-
Memory usage per host: RAM capacity in use (in megabytes) for each host, displayed by dedicated lines.
-
Disk space usage per host: disk space in use (in megabytes) for each host, displayed by dedicated lines.
-
Network data received per host: speed of data download from the network (in bits per second) for each host, displayed by dedicated lines.
-
Network data sent per host: speed of data upload to the network (in bits per second) for each host, displayed by dedicated lines.
-
Network data usage per host: data exchange speed over the network (in bits per second) for each host, displayed by dedicated lines.
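To make the "Max data parts per partition" chart more concrete, the sketch below shows one way such a value could be derived by hand. It assumes you fetch one row per active data part from ClickHouse's `system.parts` table with your own client. The query text, the constant `ACTIVE_PARTS_SQL`, and the helper `max_parts_per_partition` are illustrative assumptions, and the console may compute its chart differently.

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumed query: one row per active data part. Run it with any ClickHouse
# client; this sketch only processes the resulting rows.
ACTIVE_PARTS_SQL = """
SELECT database, table, partition
FROM system.parts
WHERE active
"""


def max_parts_per_partition(rows: Iterable[Tuple[str, str, str]]) -> int:
    """Return the largest number of active parts found in a single partition.

    Each row is (database, table, partition) and stands for one data part,
    so counting identical rows gives the part count of that partition.
    """
    counts = Counter(rows)
    return max(counts.values(), default=0)


# Usage with made-up rows: partition 202401 has three parts, 202402 has one.
sample = [("db", "events", "202401")] * 3 + [("db", "events", "202402")]
print(max_parts_per_partition(sample))  # 3
```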
Cluster health indicator
You can see a health state and status indicator to the right of the cluster name.
States show the overall health of your cluster:
| State | Description | Suggested action |
| --- | --- | --- |
| Alive | The cluster is operating normally. | No action required. |
| Stopped | The cluster is intact but was stopped by the user or by the service. | No action required. If you want to restart the cluster, refer to Start and stop the cluster. |
| Degraded | The cluster is working below full capacity. | Make sure the cluster isn't undergoing technical maintenance at the moment. If you can't determine the reason for this state yourself, write a technical support request. |
| Dead | The cluster isn't working, and none of its hosts are working either. | Write a technical support request. |
| Unknown | The state of the cluster is unknown. | Write a technical support request. |
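If you process cluster states in your own tooling, the table above can be expressed as a simple lookup. This is a minimal sketch: the names `SUGGESTED_ACTIONS`, `suggested_action`, and `needs_support` are hypothetical, the action texts are abridged from the table, and treating an unrecognized state like Unknown is an assumption made here rather than documented behavior.

```python
# State names come from the table above; action texts are abridged.
SUGGESTED_ACTIONS = {
    "Alive": "No action required.",
    "Stopped": "No action required. Start the cluster again when you need it.",
    "Degraded": "Check for ongoing maintenance, then write a support request if the cause is unclear.",
    "Dead": "Write a technical support request.",
    "Unknown": "Write a technical support request.",
}

# States for which the table suggests contacting support right away.
SUPPORT_STATES = {"Dead", "Unknown"}


def suggested_action(state: str) -> str:
    """Return the suggested action; unrecognized states fall back to Unknown (assumption)."""
    return SUGGESTED_ACTIONS.get(state, SUGGESTED_ACTIONS["Unknown"])


def needs_support(state: str) -> bool:
    """True if the state calls for a support request without further checks."""
    return state in SUPPORT_STATES or state not in SUGGESTED_ACTIONS


print(suggested_action("Alive"))  # No action required.
print(needs_support("Dead"))      # True
```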
Statuses show activities affecting the cluster:
| Status | Description | Suggested action |
| --- | --- | --- |
| Creating | The cluster is getting ready for the first start. | No further action required. After some time, the cluster will transition into the Alive state. |
| Updating | The cluster is undergoing an update. | No further action required. After some time, the cluster will transition into the Alive state. |
| Starting | The cluster is starting. | No further action required. After some time, the cluster will transition into the Alive state. |
| Stopping | The cluster is stopping. | No further action required. After some time, the cluster will transition into the Stopped state. |
| Error | An error occurred. The cluster is inoperable. | Write a technical support request. |
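The status table can be summarized as a mapping from each transitional status to the state the cluster is expected to reach. The sketch below is only an illustration: `EXPECTED_STATE_AFTER` and `expected_state` are hypothetical names, and returning `None` for Error (or any unlisted status) is a convention chosen for this example.

```python
from typing import Optional

# Transitional statuses and the state the table says the cluster reaches next.
EXPECTED_STATE_AFTER = {
    "Creating": "Alive",
    "Updating": "Alive",
    "Starting": "Alive",
    "Stopping": "Stopped",
}


def expected_state(status: str) -> Optional[str]:
    """Return the state expected after the status completes.

    None means no automatic transition is documented (for example, Error),
    so the suggested action is a technical support request.
    """
    return EXPECTED_STATE_AFTER.get(status)


print(expected_state("Updating"))  # Alive
print(expected_state("Error"))     # None
```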