Replicate and visualize data from Yandex Metrica

Yandex Metrica is a free web analytics tool you can use for tracking user actions, analyzing traffic sources, and evaluating the effectiveness of online and offline ads.

This tutorial shows how to replicate raw data from your Yandex Metrica account to a ClickHouse® database and then automatically build a dashboard with the data.

You can also replicate data to the same ClickHouse® cluster from other sources, such as HubSpot or Google Ads, and then run analytical queries and build custom dashboards from enriched data.

Tip

If you prefer to dive straight into the code, check out the example Terraform project.

Before you start

  1. If you haven’t already, create a DoubleCloud account.

  2. Make sure you have Metrica PRO or API access enabled in your Yandex Metrica account. If it isn’t enabled or you’re unsure, contact DoubleCloud support and include your Metrica Tag ID in the message. We’ll contact the Metrica support team on your behalf and ask them to enable API access for you.

    How to contact DoubleCloud support
    1. Log in to the DoubleCloud console.

    2. At the bottom left, click Support → Contact support.

    3. Write your request and click Send.

  3. Get a Yandex Metrica access token in Yandex OAuth.

  4. Decide which Yandex Metrica tags you want to transfer data from and copy their IDs from Metrica’s main page.

Step 1. Create a Managed ClickHouse® cluster

  1. Go to the Clusters page in the console.

  2. Click Create cluster at the top right and select ClickHouse.

    Tip

    For this tutorial, you don’t need to modify cluster settings on this page. If you proceed with the default settings, you get a fully functional ClickHouse® cluster that you can use for testing and development.

    When creating a production cluster, refer to Create a Managed ClickHouse® cluster for a list of ways to create and configure a cluster.

  3. Under Basic settings, enter a cluster name, such as clickhouse-dev.

  4. Click Submit.

    Creating a cluster takes around five minutes depending on the provider, region, and settings.

Step 2. Create a ClickHouse® database

  1. After the cluster status changes from Creating to Alive, select it in the cluster list.

  2. Click WebSQL at the top right.

  3. In WebSQL, click on any database in the connection manager on the left to open the query editor.

  4. Create a new database named metrica using the following command:

    CREATE DATABASE IF NOT EXISTS metrica ON CLUSTER default;
    
  5. Check that the database has been created:

    SHOW DATABASES
    
    ┌─name───────────────┐
    │ INFORMATION_SCHEMA │
    │ _system            │
    │ default            │
    │ metrica            │  // your database
    │ information_schema │
    │ system             │
    └────────────────────┘
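
If the database list is long, a filtered query can be easier to read. The following is a minimal sketch that uses the standard ClickHouse `system.databases` table and assumes the database name `metrica` from the previous step:

```sql
-- Returns one row if the metrica database exists and no rows otherwise.
SELECT name
FROM system.databases
WHERE name = 'metrica';
```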
    

Step 3. Create a source Yandex Metrica endpoint

To create a source Yandex Metrica endpoint:

  1. Go to the Transfer page in the console and click Create → Source endpoint at the top right.

  2. In Source type, select Metrica.

  3. Enter an endpoint name, such as metrica-source-dev.

  4. In Tag IDs, enter one or several IDs of the Metrica tags whose data you want to transfer.

  5. In Token, enter your Metrica access token.

  6. In Hits and Sessions, select Enable.

  7. Toggle Enable dashboard. DoubleCloud will automatically create a Visualization dashboard with your data.

  8. Click Submit.

Step 4. Create a target ClickHouse® endpoint

To create a target endpoint:

  1. On the Transfer page, click Create → Target endpoint.

  2. In Target type, select ClickHouse.

  3. Enter an endpoint name, such as clickhouse-target-dev.

  4. In Connection type, select Managed cluster, then select the cluster you created from the drop-down list.

  5. In Database, enter metrica – the name of the database you created in the cluster. You can leave the default values in other fields.

  6. Click Submit.

Step 5. Create a transfer

Now that your endpoints are ready, you can create a transfer.

  1. On the Transfer page, click Create → Transfer at the top right.

  2. Under Endpoints, select the endpoints you just created: metrica-source-dev as the source and clickhouse-target-dev as the target.

  3. Under Basic settings, enter a transfer name, such as transfer-dev.

  4. In Transfer type, select Replication.

  5. Leave the default preselected options in Snapshot settings and Runtime environment.

  6. Click Submit.

Step 6. Activate the transfer

Now that your transfer is fully configured, you can use it to replicate data from Yandex Metrica to ClickHouse®.

  1. On the transfer page, click Activate at the top right.

    After the transfer is activated, it fetches data from Metrica and uploads it to the ClickHouse® database.

    The initial activation may take a few minutes.
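
To check from WebSQL that data has started to arrive, you can list the tables in the target database. This is a rough sketch rather than an official verification step: it relies only on the standard ClickHouse `system.tables` table and assumes the database name `metrica` used in this tutorial. The table names themselves are created by the transfer and aren't listed here.

```sql
-- List the tables in the target database with their row counts.
-- total_rows can be NULL for engines that don't report it.
SELECT name, engine, total_rows
FROM system.tables
WHERE database = 'metrica'
ORDER BY name;
```

An empty result right after activation usually just means the initial upload hasn't finished yet, so it may be worth rerunning the query after a few minutes.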

Step 7. Access the dashboard

  1. Go to the Visualization page in the console.

  2. Select the collection with your visualization assets from the list. DoubleCloud created this collection and its assets automatically. The collection is called "Metrica dashboard transfer-dev".

  3. Under Dashboards, select the dashboard called "Yandex Metrica - Demo".

    This preconfigured dashboard contains charts showing the main website metrics.

What’s next

Now that your raw Yandex Metrica data has been replicated to a Managed ClickHouse® cluster, you can customize the dashboard by adding more charts or configure replication of data from other sources.
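
Before adding charts, it can help to see which columns the replicated tables contain. The query below is a sketch that reads the standard ClickHouse `system.columns` table and assumes the `metrica` database from this tutorial. It makes no assumptions about the table or column names the transfer creates.

```sql
-- Show every column of every table in the metrica database,
-- in the order the columns are defined.
SELECT table, name, type
FROM system.columns
WHERE database = 'metrica'
ORDER BY table, position;
```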