Migrate a standalone ClickHouse® installation to DoubleCloud Managed Service for ClickHouse® using Transfer

To perform your cluster migration using Transfer, you need to complete the following:

  1. Create a source endpoint

  2. Create a target endpoint

  3. Create a transfer

  4. Perform post-migration activities

Before you start

  1. Create a Managed Service for ClickHouse® cluster.

Create a source endpoint

For each database you want to migrate from your standalone cluster, create a source endpoint as follows:

  1. Go to the Transfer service page .

  2. Select the Endpoints tab, click Create endpointSource.

  3. Under Source type, choose ClickHouse.

  4. Under Basic settings:

    • Enter the Name of the endpoint.

    • (optional) Enter the Description of the endpoint.

  5. Under Connection settingsConnection type, select Custom installation.

  6. Under Shards:

    1. Click Add Shard.

    2. Specify Shard ID identifier;

    3. Under Hosts, click Add Host and enter the domain name (FQDN) or IP-address of the host in the textbox.

  7. Specify the HTTP Port for HTTP interface connections.

    Tip

    • Optional fields have default values if these fields are specified.

    • Complex types recording is supported (array, tuple etc).

  8. Specify the Native port for clickhouse-client connections.

  9. Click SSL to secure your connection.

  10. To encrypt the data transmission, click Upload file under CA Certificate.

  11. Specify the Database name from which to migrate the data.

  12. Specify the User of the above database.

  13. Enter the Password for the database user.

  14. Under Table filterIncluded tabes, specify all the tables from the database you want to migrate.

    There are two points to keep in mind when creating a list of tables to migrate:

    • Transfer can't migrate materialized views . Migrate the source tables and manually recreate materialized views on the target.

    • If you want to migrate tables created with the CollapsingMergeTree engine, ensure that to merges happen on the table during transfer.

  • Click Submit.

Create a target endpoint

Create a single target endpoint for all the databases you want to migrate:

  1. On the Transfer page, select the Endpoints tab, click Create endpointTarget.

  2. Under Source type, choose ClickHouse.

  3. Under Basic settings:

    • Enter the Name of the endpoint.

    • (optional) Enter the Description of the endpoint.

  4. Under Connection settingsConnection type, select Managed cluster.

  5. Under Managed cluster, select the DoubleCloud cluster to which you want to migrate your data.

  6. Specify the Database name to which you want to migrate data.

  7. Specify the User of the above database.

  8. Enter the Password for the database user.

  9. Keep the rest of the settings empty or with their default values and click Submit.

Create a transfer

Create a transfer for each database you want to migrate:

  1. On the Transfer page, click Create transfer.

  2. Under Endpoints:

    1. In Source, select the endpoint to the self-managed database you want to migrate.

    2. In Target, select the endpoint to the DoubleCloud Managed Service for ClickHouse® cluster.

  3. Under Basic settings:

    1. Enter the transfer Name.

    2. (optional) Enter the transfer Description.

  4. Under Transfer parameters:

    1. If you need a periodic migration of data, under Periodic snapshot, toggle Enabled and select a period or provide a cron expression of how often to perform a snapshot.

    2. Add one or more Incremental tables. With incremental tables, you can transfer only the data that has changed. Each of these tables has the following fields:

      • Schema corresponds to the database schema (as in PostgreSQL), name, or dataset. Refer to the Airbyte - destination namespace configuration to learn more about namespaces.

      • Table in the source with which to compare your target table.

      • Key column is the name of a column that contains some value that indicates whether the table should be incremented. A common example of a column that contains a cursor is a column with timestamps. Refer to the Airbyte - Incremental Sync - Append to learn more about cursor definitions and usage examples.

      • Initial value (optional) defines a value in the Key column based on which to start tracking changes. For example, you can specify a date as the Initial value. In this case, the service will transfer the rows with a date value greater than the start value.

    3. Under Parallel snapshot settings, configure the number of workers and threads per worker.

    High-load transfer configuration

    In high-load production cases, consider adding workers if you need to transfer more than 100 GB of data to snapshot. Adding a worker doubles the transfer's performance.

    1. Select the runtime environment:

      This runtime has no configurable settings. It will define all its parameters automatically.

      1. Select the VM size. In most cases, Medium is enough for responsive snapshotting.

      2. Choose the logging level:

        • DEBUG: information that can be useful for debugging

        • INFO (default): confirmation messages, information about events that mustn't cause errors in the transfer

        • WARN: events that can cause errors in the transfer.

      3. Under Dedicated settings, select the settings type:

        • Automatic doesn't require any further settings.

        • Under manual, select your network from the dropdown list.

  5. Click Submit.

Activate the transfer

  1. Open the Transfer page in the console.

  2. To the right of the transfer name, click Activate.

Post-migration

After migrating the data, you need to recreate all materialized views manually (you need to add the ON CLUSTER default at the source):

SELECT create_table_query
FROM system.tables
WHERE (lower(database) 
NOT IN ('system', 'information_schema')) 
AND (engine IN ('MaterializedView', 'View', 'Dictionary'))

See also