Replicate and visualize data from Yandex Metrica
This tutorial shows how to replicate raw data from your Yandex Metrica account to a ClickHouse® database and then automatically build a dashboard with the data.
You can also replicate data to the same ClickHouse® cluster from other sources, such as HubSpot or Google Ads, and then run analytical queries and build custom dashboards from enriched data.
Tip
If you prefer to dive straight into the code, check out the example Terraform project.
Before you start
-
If you haven’t already, create a DoubleCloud account.
-
Make sure you have Metrica PRO or API access enabled in your Yandex Metrica account. If it isn’t enabled, or you’re unsure whether it is, contact DoubleCloud support and include your Metrica Tag ID in the message. We’ll contact the Metrica support team on your behalf and ask them to enable API access for you.
How to contact DoubleCloud support
-
Log in to the DoubleCloud console.
-
At the bottom left, click
-
Write your request and click Send.
-
Get a Yandex Metrica access token in Yandex OAuth.
-
Decide which Yandex Metrica tags you want to transfer data from and copy their IDs from Metrica’s main page.
Step 1. Create a Managed ClickHouse® cluster
-
Go to the Clusters page.
-
Click Create cluster at the top right and select ClickHouse.
Tip
For this tutorial, you don’t need to modify cluster settings on this page. If you proceed with the default settings, you get a fully functional ClickHouse® cluster that you can use for testing and development.
When creating a production cluster, refer to Create a Managed ClickHouse® cluster for a list of ways to create and configure a cluster.
-
Under Basic settings, enter a cluster name, such as clickhouse-dev.
-
Click Submit.
Creating a cluster takes around five minutes depending on the provider, region, and settings.
Step 2. Create a ClickHouse® database
-
After the cluster status changes from Creating to Alive, select it in the cluster list.
-
Click WebSQL at the top right.
-
In WebSQL, click on any database in the connection manager on the left to open the query editor.
-
Create a new database named metrica using the following command:
CREATE DATABASE IF NOT EXISTS metrica ON CLUSTER default;
-
Check that the database has been created:
SHOW DATABASES
┌─name───────────────┐
│ INFORMATION_SCHEMA │
│ _system            │
│ default            │
│ metrica            │ // your database
│ information_schema │
│ system             │
└────────────────────┘
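Because the database is created with ON CLUSTER default, it can be worth confirming that it exists on every host, not only on the one WebSQL is connected to. The query below is a minimal sketch of such a check and isn't part of the original walkthrough. It assumes the cluster is named default, as in the command above, and uses ClickHouse's clusterAllReplicas table function with the system.databases table.

```sql
-- List each host in the cluster on which the metrica database exists.
-- Assumption: the cluster name is 'default', matching ON CLUSTER default above.
SELECT
    hostName() AS host,
    name AS database_name
FROM clusterAllReplicas('default', system.databases)
WHERE name = 'metrica'
ORDER BY host;
```

If a host is missing from the result, rerunning the CREATE DATABASE command is safe because of the IF NOT EXISTS clause.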
Step 3. Create a source Yandex Metrica endpoint
To create a source Yandex Metrica endpoint:
-
Go to the Transfer page and click Create → Source endpoint.
-
In Source type, select Metrica.
-
Enter an endpoint name, such as metrica-source-dev.
-
In Tag IDs, enter one or several IDs of the Metrica tags whose data you want to transfer.
-
In Token, enter your Metrica access token.
-
In Hits and Sessions, select Enable.
-
Toggle Enable dashboard. DoubleCloud will automatically create a Visualization dashboard with your data.
-
Click Submit.
Step 4. Create a target ClickHouse® endpoint
To create a target endpoint:
-
On the Transfer page, click Create → Target endpoint.
-
In Target type, select ClickHouse.
-
Enter an endpoint name, such as clickhouse-target-dev.
-
In Connection type, select Managed cluster and select the cluster that you created from the dropdown.
-
In Database, enter metrica, the name of the database you created in the cluster. You can leave the default values in other fields.
-
Click Submit.
Step 5. Create a transfer
Now that your endpoints are ready, you can create a transfer.
-
On the Transfer page, click Create → Transfer at the top right.
-
Under Endpoints, select the endpoints you just created: metrica-source-dev as the source and clickhouse-target-dev as the target.
-
Under Basic settings, enter a transfer name, such as transfer-dev.
-
In Transfer type, select Replication.
-
Leave the default preselected options in Snapshot settings and Runtime environment.
-
Click Submit.
Step 6. Activate the transfer
Now that your transfer is fully configured, you can use it to replicate data from Yandex Metrica to ClickHouse®.
-
On the transfer page, click Activate at the top right.
After the transfer is activated, it fetches data from Metrica and uploads it to the ClickHouse® database.
The initial activation may take a few minutes.
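To check that data is arriving, you can inspect the target database from WebSQL. The queries below are a sketch rather than part of the original walkthrough. This tutorial doesn't list the names of the tables the transfer creates, so the queries discover them instead of assuming them.

```sql
-- List the tables that the transfer created in the target database.
SHOW TABLES FROM metrica;

-- Row counts and on-disk size per table, taken from ClickHouse system metadata.
-- total_rows and total_bytes can be NULL for table engines that don't report them.
SELECT
    name,
    total_rows,
    formatReadableSize(total_bytes) AS size
FROM system.tables
WHERE database = 'metrica'
ORDER BY name;
```

If the first query returns no tables yet, the initial activation may still be in progress. Rerunning the second query a few minutes apart should show total_rows growing if new data is being replicated.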
Step 7. Access the dashboard
-
Go to the Visualization page.
-
Select the collection with your visualization assets from the list. DoubleCloud created this collection and its assets automatically and named it "Metrica dashboard transfer-dev".
-
Under Dashboards, select the dashboard called "Yandex Metrica - Demo".
This preconfigured dashboard contains charts showing the main website metrics.
What’s next
Now that your raw Yandex Metrica data has been replicated to a Managed ClickHouse® cluster, you can customize the dashboard by adding more charts or configure replication of data from other sources.
- Source connectors for DoubleCloud Transfer
- Create more charts and add them to the dashboard
- Use the DoubleCloud Terraform provider