Benefiting from a hybrid solution, with on-prem and a Managed ClickHouse

Written By: Stefan Kaeser, DoubleCloud Solution Architect

When talking to potential customers, we sometimes reach the unfortunate point where they really like our services and offerings, but still have to decline for a variety of reasons.

The data collected by said customer might be highly confidential and they’re not allowed to use any cloud providers at all for their production data.

The amount of data and the machines they’re already using might be way too expensive to handle within a cloud environment or there might be many other reasons they can’t expound on.

I used ClickHouse® for over five years in an OnPrem scenario at a previous company, and I’ve worked with a managed ClickHouse solution for many customers whilst at DoubleCloud.

If you have a good team managing your own servers (which I fortunately had, greetings to Francois and Atanas), then there’s absolutely nothing stopping you from managing your own ClickHouse cluster OnPrem!

But still, the world isn’t black and white, it’s filled with different shades of gray and a whole host of other colors to choose between as well.

Why limit yourself to the mindset of just using OnPrem or just using CloudServices?

Hybrid systems are all over the place, in storage solutions as well as in the automotive industry.

Why not choose a hybrid solution for your ClickHouse services as well?

Development and testing

Even if you have a good team managing your ClickHouse clusters, and you know your way around upgrading them and creating new ones yourself, one of the most time-consuming things in your day-to-day business will always be setting up new infrastructure.

Let’s say your developers want to test the newest version of ClickHouse. It still takes various steps to do so: you’ll have to create a ticket for your operations team, refine the ticket to specify exactly what ops need to know, download the specified version of ClickHouse into your internal repository for infrastructure, set up the cluster itself, add DNS entries so developers can reach the new cluster, copy the data from your other dev system into the new cluster…

And on and on it goes!

All those steps take up valuable time that could be better used for other things. And in the end it might not have even been worth it, as you decide the new version you’ve tested has a bug or is already too old or you want to wait for LTS or whatever.

So the solution is quite simple.

Just split your development and testing clusters from your production ones. All you need is a DoubleCloud account that your developers can access.

They can then spin up a new dev cluster in a few minutes, choose the version they want to test, play around with it, and then shut it down with another two clicks after their test is finished.

You can also create a new cluster out of an existing backup with only a few clicks, so there’s no need to copy test data to a new cluster either. Just create it from backups and upgrade the version to the one you want to test.

Your operations team will still be responsible for all production systems, but they’ll have more time and energy to do so if all the small tasks around development and test systems can be handled by the developers themselves.

Separate clusters by logic

The previous example might have been an obvious one, but it’s already shown where you can split between OnPrem and Cloud quite easily.

But there are other potential layers where a combination of OnPrem solutions and Cloud solutions can make sense.

Monitoring

Especially when using OnPrem solutions, monitoring is a must.

You have to make sure everything runs as it should and you’ll want to keep metrics and logs as long as possible to look for degradations, impacts, incidents etc.

ClickHouse is one of the best solutions out there for storing that kind of data. Whether you have no experience with ClickHouse at all or are already using it for other parts of your application, it makes sense to bring your monitoring to a managed ClickHouse in DoubleCloud.

Monitoring systems tend to grow over time, especially when you’re OnPrem and constantly adding new VMs, Containers and Systems etc.

But you’ll still want to use monitoring, without investing more time managing the monitoring itself.

Also, you’ve probably heard the question: “Who watches the watcher?”

Imagine you’re already using ClickHouse for production and you get some problems with your ClickHouse clusters.

Do you really want to have your monitoring within the same logic?

If your ClickHouse clusters break because of an incorrect configuration you’ve rolled out, your monitoring might break as well.

Using managed ClickHouse on DoubleCloud, however, avoids this problem: you’ve separated your monitoring solution from your production logic, but can still use the same technology you’re used to.

It’s also easy to use DoubleCloud’s visualization to build your monitoring, at no extra cost.

High load scenarios

OnPrem solutions can be quite cheap when you’re in need of really high computing power. Buying a few servers with a few hundred cores for compute will pay for itself quite fast when you compare it with vCPU pricing at public cloud providers.

But what if you’re not only in need of high computing power?

What if you have to batch process a few trillion rows to create some dashboards, but they’ll be looked at multiple times, by thousands of users, all at once on a Monday morning?

Then compute power is needed for your batch processing, but it won’t help you much on that morning!

What you now need is load balancing with the ability to handle that number of requests.

You’ll also need high availability and maybe even multiple regions around the world to deliver the results with as low latency as possible.

Even if you have really good operations people who all know the ins and outs of ClickHouse, it’s very unlikely you’ll have them sitting all around the world, capable of optimizing ClickHouse, load balancers, CDNs etc.

Let your people do what they do best.

Optimize the queries and the handling of the batch processing and give the boring work of setting up clusters behind load balancers in different regions of the world to a managed cloud provider like DoubleCloud.

Security requirements

Some companies handle highly confidential data that must stay within OnPrem systems.

But a lot of business data doesn’t require the same level of security. And with ClickHouse this is quite easy to handle.

Just have your confidential data OnPrem.

You can calculate aggregates and put the results, via the ClickHouse table function remoteSecure, into a managed cluster by DoubleCloud.
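As a rough illustration of that idea, here is a minimal Python sketch that only builds the SQL text for such a push: an INSERT INTO FUNCTION remoteSecure(...) statement fed by an aggregating SELECT that runs OnPrem. The host name, database, table, user and the sales.orders source table are all hypothetical placeholders, and port 9440 is assumed to be the secure native port of the managed cluster (check the connection details of your own cluster). How you run the resulting statement, for example with clickhouse-client on the OnPrem side, is up to you.

```python
def build_push_query(host, target_db, target_table, user, password,
                     select_sql, port=9440):
    """Build an INSERT ... SELECT that sends rows to a remote cluster
    through the remoteSecure table function (sketch, not executed here)."""

    def quote(value):
        # Escape backslashes and single quotes for a ClickHouse string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    address = quote(f"{host}:{port}")
    target = (
        f"remoteSecure({address}, {target_db}, {target_table}, "
        f"{quote(user)}, {quote(password)})"
    )
    return f"INSERT INTO FUNCTION {target}\n{select_sql.strip()}"


# Hypothetical example: only daily aggregates leave the OnPrem cluster,
# the raw rows in sales.orders stay where they are.
aggregate_sql = """
SELECT toDate(created_at) AS day, count() AS orders, sum(amount) AS revenue
FROM sales.orders
GROUP BY day
"""

query = build_push_query(
    host="managed.example.com",   # placeholder host of the managed cluster
    target_db="reports",          # placeholder database
    target_table="daily_sales",   # placeholder table, must already exist
    user="onprem_push",           # placeholder user
    password="change-me",         # read from a secret store in practice
    select_sql=aggregate_sql,
)
print(query)
```

Because the statement is issued from the OnPrem cluster, the connection is opened from the inside out; nothing in the managed cluster needs to know how to reach your OnPrem hosts.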

Setting up special users and roles, you can even make sure that your OnPrem cluster can directly access a managed cluster within DoubleCloud to send data to it, but the other way round is blocked, so no one can access your OnPrem cluster from the cloud.
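One way to picture such a setup is a dedicated account on the managed cluster that may only insert into a single table. The sketch below generates the corresponding ClickHouse access-control statements; every name in it (onprem_writer, onprem_push, reports.daily_sales, the 203.0.113.10 address) is a made-up placeholder, and it assumes SQL-driven user management is available on your managed cluster. Depending on the ClickHouse version, the account may need further grants (for example to read the table structure), so treat this as a starting point rather than a finished policy.

```python
def push_only_access_statements(user, role, database, table,
                                source_ip, password):
    """Return statements for an account that can insert into one table only."""

    def quote(value):
        # Escape backslashes and single quotes for a ClickHouse string literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    return [
        # The role carries the single privilege the OnPrem side needs.
        f"CREATE ROLE IF NOT EXISTS {role}",
        f"GRANT INSERT ON {database}.{table} TO {role}",
        # The user may only connect from the given address and gets the role.
        f"CREATE USER IF NOT EXISTS {user} "
        f"IDENTIFIED WITH sha256_password BY {quote(password)} "
        f"HOST IP {quote(source_ip)} "
        f"DEFAULT ROLE {role}",
    ]


statements = push_only_access_statements(
    user="onprem_push",          # placeholder user name
    role="onprem_writer",        # placeholder role name
    database="reports",          # placeholder database
    table="daily_sales",         # placeholder table
    source_ip="203.0.113.10",    # placeholder egress address of the OnPrem site
    password="change-me",        # read from a secret store in practice
)
for statement in statements:
    print(statement + ";")
```

Blocking the opposite direction is not something these statements do by themselves: it follows from never creating credentials for cloud systems on the OnPrem cluster and from keeping your firewall closed to inbound connections.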

And using the functionality directly within ClickHouse you can also make sure that no application outside of your secure environment has the credentials to access your secret data.

Final thoughts

In the end, we could come up with even more scenarios where you can use a combination of managed cloud providers like DoubleCloud and OnPrem solutions.

You can optimize your workflow by reducing cost, be it opportunity or physical.

So just think about what’s in it for you as a company.

How can you improve your workload? If we can help you by taking care of some boring tasks (or not so boring ones), reach out to us, and we’ll help you build the solution you need.

ClickHouse® is a trademark of ClickHouse, Inc. https://clickhouse.com

