Extending Grafana With DoubleCloud Makes Log Retention Significantly Cheaper

DoubleCloud is a platform that helps organizations build cost-effective sub-second analytical applications on tightly integrated and proven open-source technologies like ClickHouse®, Kafka and others.

September 5, 2022
15 mins to read

Hi, I’m Adam Jennings, one of DoubleCloud’s Solution Architects based in the US with (far too) many years’ experience in Data management, Big Data and ML.

The story of DoubleCloud started last year with the idea to bring a modern open-source managed data stack to the world. Our platform went into general availability in July of 2022.

We offer a fully managed, open-source and platform-agnostic data transfer, warehousing, storage and visualization solution, a DataStack-as-a-Service if you will, using the power of ClickHouse®, Kafka®, and our own visualization tools.

I’ll be heading to GrafanaLive soon to hopefully meet and chat with you all.

I feel Grafana’s huge impact in the observability space meshes perfectly with the open-source, sub-second analytical power of ClickHouse® on DoubleCloud’s managed platform, and I’d love to discuss the possibilities of this awesome new solution with you.

DoubleCloud Integrations With Grafana

Using Grafana in conjunction with DoubleCloud means we shoulder the full burden of managing the service, allowing you to focus on what you love… business analysis and development!

You can make better use of your own or your staff’s time hitting project goals while we take care of all the data stack maintenance.

The choice to build the platform on open source was a deliberate one: we want to be open, honest, upfront and cloud agnostic, with no hidden vendor lock-in issues.

The Grafana / ClickHouse® Plugin

February of 2022 was an exciting date for me as it saw the 1.0 release of the Grafana ClickHouse® plugin.

It was created to give developers full access to the ClickHouse® native interface by providing a fast, robust, and low overhead method of querying your ClickHouse® cluster.

It also means that when you use Grafana templates, you can quickly put the power of ClickHouse® to work.

As of this writing, there are three predefined dashboards to get you started:

  • Query Analysis Dashboard. This dashboard is great for understanding which queries are being executed and by whom, as well as query performance on your ClickHouse® cluster.
  • Data Analytics Dashboard. The second dashboard allows for a much deeper understanding of the size and shape of your tables. It also gives a great overview of the value of DoubleCloud’s managed ClickHouse® offering. At a glance you’ll be able to see the current server version, uptime, and all the disks attached to the ClickHouse® cluster: both the default EBS storage and the object storage backed by S3.
  • Cluster Analytics Dashboard. The final dashboard (for now) allows developers to understand cluster health and background replication and merging.

As you can see, these three dashboard templates will give any developer a huge boost in monitoring their data clusters with little to no additional work needed.

DoubleCloud’s ClickHouse® Hybrid Storage

It doesn’t end there though.

One of the things that sets DoubleCloud’s managed ClickHouse® clusters apart is that they let you split your data across two tiers, with frequently used (hot) data on local disk storage and the bulk of your rarely used (cold) data in object storage (S3). Store your logs and metrics for use in Grafana in hybrid storage tables. If your dashboards default to the last 24 hours or the last week, set that same window as the hot-data period for your hybrid storage tables. This ensures your default dashboards are performant and return results quickly.
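To make the idea concrete, here is a minimal sketch of how such a table could be declared. It only builds the DDL text, using ClickHouse®’s standard TTL ... TO VOLUME clause and storage_policy setting. The table name, the columns, and the 'hybrid_storage' and 'object_storage' names are assumptions for illustration, so check the policy and volume names configured on your own cluster before running the statement.

```python
def hybrid_table_ddl(table: str, hot_days: int,
                     cold_volume: str = "object_storage",      # assumed volume name
                     storage_policy: str = "hybrid_storage"    # assumed policy name
                     ) -> str:
    """Build a CREATE TABLE statement that keeps `hot_days` of rows on local disk.

    Rows older than `hot_days` are moved to `cold_volume` by the TTL clause.
    The column layout is a hypothetical log schema, not a DoubleCloud default.
    """
    if hot_days < 1:
        raise ValueError("hot_days must be at least 1")
    return (
        f"CREATE TABLE IF NOT EXISTS {table}\n"
        "(\n"
        "    ts DateTime,\n"
        "    level LowCardinality(String),\n"
        "    message String\n"
        ")\n"
        "ENGINE = MergeTree\n"
        "PARTITION BY toDate(ts)\n"
        "ORDER BY ts\n"
        f"TTL ts + INTERVAL {hot_days} DAY TO VOLUME '{cold_volume}'\n"
        f"SETTINGS storage_policy = '{storage_policy}'"
    )


# A dashboard that defaults to the last week keeps 7 days hot.
print(hybrid_table_ddl("grafana_logs", hot_days=7))
```

Matching hot_days to the dashboard’s default time range means the default view is served from local disk, while older partitions stay queryable from object storage.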

On DoubleCloud’s managed ClickHouse® offering, object storage is over 10x cheaper than SSD in most regions. As an example, we typically only charge around $0.25/GB-month on SSD, whereas we pass through object storage pricing, currently around $0.023/GB-month on S3.
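As a rough worked example of what that price gap means, the sketch below compares an all-SSD cluster with a hybrid one. The two prices are the approximate figures quoted above; the 1,000 GB total and 100 GB hot set are made-up numbers, and request or transfer charges are ignored.

```python
SSD_PRICE_PER_GB_MONTH = 0.25    # approximate SSD price quoted in this post
S3_PRICE_PER_GB_MONTH = 0.023    # approximate S3 pass-through price quoted in this post


def monthly_storage_cost(total_gb: float, hot_gb: float) -> float:
    """Monthly storage cost in USD with `hot_gb` on SSD and the rest on S3."""
    if not 0 <= hot_gb <= total_gb:
        raise ValueError("hot_gb must be between 0 and total_gb")
    cold_gb = total_gb - hot_gb
    return hot_gb * SSD_PRICE_PER_GB_MONTH + cold_gb * S3_PRICE_PER_GB_MONTH


# Hypothetical workload: 1,000 GB of logs, of which 100 GB is hot.
all_ssd = monthly_storage_cost(1000, 1000)
hybrid = monthly_storage_cost(1000, 100)
print(f"all SSD: ${all_ssd:.2f}/month")   # all SSD: $250.00/month
print(f"hybrid:  ${hybrid:.2f}/month")    # hybrid:  $45.70/month
```

In this made-up case the hybrid layout costs less than a fifth of the all-SSD one, and the gap widens as the cold share grows.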

I advise my customers to apply the Pareto principle: size the hot tier so that roughly 80% of your queries, the most frequently run ones, hit SSD storage rather than S3. That preserves the sub-second analytical queries on ClickHouse® that people have come to love while still leveraging object storage for cost efficiency.
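One way to turn that rule of thumb into a number is to look at how far back your queries actually reach and pick the smallest hot window that covers 80% of them. The sketch below assumes you have already collected a lookback (in days) for each query, for example from your dashboards’ time ranges; the function name and the sample data are hypothetical.

```python
def hot_window_days(lookback_days, coverage_pct=80):
    """Smallest window (in days) that covers `coverage_pct` percent of queries.

    `lookback_days` holds one entry per observed query: how many days back
    that query reached. Integer arithmetic avoids floating-point rounding.
    """
    if not lookback_days:
        raise ValueError("lookback_days must not be empty")
    if not 0 < coverage_pct <= 100:
        raise ValueError("coverage_pct must be in the range 1..100")
    ordered = sorted(lookback_days)
    # Ceiling division: how many queries must fall inside the hot window.
    needed = -(-coverage_pct * len(ordered) // 100)
    return ordered[needed - 1]


# Hypothetical sample: most queries look back a day or a week, a few much further.
sample = [1, 1, 1, 1, 7, 7, 7, 7, 30, 90]
print(hot_window_days(sample))  # 7
```

Here a 7-day hot window serves 8 of the 10 queries from SSD, and only the 30-day and 90-day queries reach into object storage.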

When data access is repeated, the DoubleCloud platform uses caching to speed up those queries.

The platform also allows developers to create customized rules based on the most frequently accessed data, massively reducing costs associated with maintaining longer-term storage.

It’s no different for your end-users either. Query a table and DoubleCloud’s managed ClickHouse® will instantly access the needed partitions in S3. In fact, if a cluster runs out of provisioned SSD storage, we will automatically spill onto object storage, preventing downtime and lost data.

In summary, empowering Grafana with DoubleCloud allows developers to save money with our hybrid storage solutions.

Thanks to ClickHouse® technology, retrieving data is lightning fast, and the managed service provided by DoubleCloud offers robust backup processes, monitoring, configuration sharding and free updates, meaning developers can do what they do best… develop (instead of boring, repetitive tasks related to data warehousing)!

The one thing I haven’t mentioned yet, but which you should definitely get in touch and ask me about, is the scalability of the platform.

We made sure you can easily change the number of hosts in a cluster, upgrade their size and enable sharding to improve cluster performance.

Thanks for reading. Here’s a $600 credit on us to trial the power of DoubleCloud.

* Apache® and Apache Kafka® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

** ClickHouse is a trademark of ClickHouse, Inc. https://clickhouse.com