Manage transfers

Create a transfer

  1. Open the Transfer page in the console.

  2. Click Create transfer.

  3. Under Endpoints:

    1. Select the source endpoint in Source or create a new one.

    2. Select the target endpoint in Target or create a new one.

  4. Under Basic settings:

    1. Enter the transfer Name.

    2. (optional) Enter the transfer Description.

  5. Under Transfer Parameters, select a Transfer type. The list of possible transfer types depends on the type of source and destination endpoints:

    Snapshot

    Makes a one-time transfer of the database snapshot.

    1. Under Snapshot settings → Parallel snapshot settings, configure the transfer performance:

      1. Processes count sets the number of parallel container instances running the transfer. Increasing this number speeds up the transfer. You can specify up to 8 instances.

      2. Threads count sets the number of threads within each instance. You can run up to 10 threads.

      The average download speed of data transfers is between 1 and 10 MB/s.

    2. Add one or more Incremental tables. With incremental tables, you can transfer only the data that has changed. Each of these tables has the following fields:

      • Schema corresponds to the database schema (as in PostgreSQL), name, or dataset.

      • Table in the source to compare with your target table.

      • Key column is the name of a column that contains a value (a cursor) indicating whether to increment the table. A common example of a cursor is a column with timestamps. Refer to the Airbyte Incremental Sync - Append documentation to learn more about cursor definitions and usage examples.

      • Start value (optional) defines a value in the Key column from which to start tracking changes. For example, you can specify a date as the Start value. In this case, the service will transfer only the rows with a date value greater than the start value.
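The cursor logic above can be sketched in a few lines of Python. The `rows` list, the `updated_at` key column, and the function name are hypothetical; the actual selection happens inside the service.

```python
from datetime import date

def incremental_rows(rows, key_column, start_value):
    """Keep only the rows whose cursor value is greater than start_value."""
    return [row for row in rows if row[key_column] > start_value]

rows = [
    {"id": 1, "updated_at": date(2024, 1, 10)},
    {"id": 2, "updated_at": date(2024, 3, 5)},
    {"id": 3, "updated_at": date(2024, 6, 1)},
]

# With a Start value of 2024-02-01, only rows 2 and 3 are transferred.
changed = incremental_rows(rows, "updated_at", date(2024, 2, 1))
```

On the next run, the largest cursor value seen so far becomes the new start value, so each run picks up only the rows changed since the previous one.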

    Periodic snapshot

    Runs snapshots at the specified interval.

    1. Set the Period between transfer runs, from 5 minutes to 24 hours.

    2. Under Snapshot settings → Parallel snapshot settings, configure the transfer performance:

      1. Processes count sets the number of parallel container instances running the transfer. Increasing this number speeds up the transfer. You can specify up to 8 instances.

      2. Threads count sets the number of threads within each instance. You can run up to 10 threads.

      The average download speed of data transfers is between 1 and 10 MB/s.

    3. Add one or more Incremental tables. With incremental tables, you can transfer only the data that has changed. Each of these tables has the following fields:

      • Schema corresponds to the database schema (as in PostgreSQL), name, or dataset. Refer to the Airbyte destination namespace configuration documentation to learn more about namespaces.

      • Table in the source with which to compare your target table.

      • Key column is the name of a column that contains a value (a cursor) indicating whether to increment the table. A common example of a cursor is a column with timestamps. Refer to the Airbyte Incremental Sync - Append documentation to learn more about cursor definitions and usage examples.

      • Initial value (optional) defines a value in the Key column from which to start tracking changes. For example, you can specify a date as the Initial value. In this case, the service will transfer only the rows with a date value greater than the initial value.

    Replication

    Continuously retrieves changes from the source database and applies them to the target database.

    Snapshot and replication

    Transfers the data and keeps the target database in sync with the source database.

    1. Under Snapshot settings → Parallel snapshot settings, configure the transfer performance:

      1. Processes count sets the number of parallel container instances running the transfer. Increasing this number speeds up the transfer. You can specify up to 8 instances.

      2. Threads count sets the number of threads within each instance. You can run up to 10 threads.

      The average download speed of data transfers is between 1 and 10 MB/s.

    2. Add one or more Incremental tables. With incremental tables, you can transfer only the data that has changed. Each of these tables has the following fields:

      • Schema corresponds to the database schema (as in PostgreSQL), name, or dataset.

      • Table in the source to compare with your target table.

      • Key column is the name of a column that contains a value (a cursor) indicating whether to increment the table. A common example of a cursor is a column with timestamps. Refer to the Airbyte Incremental Sync - Append documentation to learn more about cursor definitions and usage examples.

      • Start value (optional) defines a value in the Key column from which to start tracking changes. For example, you can specify a date as the Start value. In this case, the service will transfer only the rows with a date value greater than the start value.

  6. Specify the Runtime environment properties:

    Serverless

    The settings will be adjusted automatically.

    Dedicated
    1. Select the VM performance preset:

      • small - 2 CPUs

      • medium - 4 CPUs

      • large - 8 CPUs

    2. Select the Minimum logging level:

      Level | Description
      TRACE | Detailed diagnostic information
      DEBUG | Debugging information
      INFO | Diagnostic information for statistics
      WARNING | Warning about a non-critical malfunction that needs further investigation
      ERROR | Error report
      FATAL | Possible system failure, service shutdown
    3. Under Settings, choose how to select a network in which to run a virtual machine instance:

      • Automatic

        Allows the service to set the network automatically based on the target resource location.

      • Manual

        Transfer BYOC compatibility

        If you use the BYOC feature for your cluster, select an external network. This will enable Transfer to write data to your BYOC cluster.

        Select a VPC Network ID from the drop-down list.

    For more information about runtimes, see Runtime types in Transfer.

  7. Set up Data transformations if you need to modify a list of tables to transfer.

    Public Preview notice

    This feature is provided as Public Preview and is free of charge.

    After the end of the Public Preview period, the functionality and pricing of this feature will be subject to change.

    For instance, you may not want to transfer a column with passwords. In the Transformer list, click Add transformer to create a set of transformation rules, then add this column's name to the Exclude columns section. Under Transformer [number] → Columns filter, configure transformations for tables and columns:

    • Include columns sets the list of columns to transfer, and Exclude columns sets the list of columns that won't be transferred. Set these column names as regular expressions.

    Each transformer is a separate set of rules, and you can combine different rules within each set.

    For example, you can set a table in Include tables and a column in Exclude columns. In this case, the service will apply the specified Exclude columns only to the included table. If you combine Exclude tables and Include columns, only the specified columns will be transferred from all tables except those specified in the Exclude tables field.
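The include/exclude combination described above can be sketched as follows. The function name and the regular expressions are hypothetical, and the service's real matching logic may differ in details.

```python
import re

def filter_columns(table, columns, include_tables, exclude_columns):
    """Apply the Exclude columns rule only to tables matching Include tables."""
    if re.search(include_tables, table):
        return [c for c in columns if not re.search(exclude_columns, c)]
    return list(columns)

# The password column is dropped from the included table only:
filter_columns("users", ["id", "password"], r"^users$", r"password")   # → ['id']
# Other tables keep all of their columns:
filter_columns("orders", ["id", "password"], r"^users$", r"password")  # → ['id', 'password']
```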

    1. Click + Transformation to add a new transformation layer. You can apply multiple layers to your data.

    2. From the dropdown menu, select the appropriate transformation type:

      Mask secret fields

      This transformation allows you to apply a hash function to specified columns in tables to further protect sensitive data during transfer.

      1. Under Tables, specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows transferring all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the table names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the table names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the table names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the table names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the table names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the table names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the table names containing characters other than digits.
        $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics.
        ^ | Matches the position before the table name's first character. This is useful to match database names. | ^monthly_sales returns all the tables from the monthly_sales database.
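The patterns in the table follow standard regular-expression syntax, so you can test a candidate pattern locally before entering it in the console. The sketch below uses Python's `re` module; the engine the service uses internally may differ in edge cases.

```python
import re

# Each check pairs a pattern from the table with a name it should match.
assert re.search(r"t..t", "tart")              # character wildcard
assert re.search(r"metrics$", "site_metrics")  # anchored at the end of the name
assert not re.search(r"metrics$", "metrics_daily")
assert re.search(r"\d+", "q3_2024")            # one or more digits
assert re.search(r"-{2}", "a--b")              # exact repetition count
assert re.search(r"c?.n", "con")               # optional character
```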
      2. Under Column list, click + to add a column name. The masking will be applied to the columns listed in this section.

      3. Under Mask function → Hash → User-defined Salt, specify the salt value to apply to your data.
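As an illustration of what a salted hash mask can look like, here is a minimal Python sketch. The service's exact algorithm and output format are not specified here, so treat this only as a conceptual model.

```python
import hashlib

def mask(value: str, salt: str) -> str:
    """Return a salted SHA-256 digest of a column value (illustrative only)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

# The same value with the same salt always yields the same digest, so masked
# columns stay joinable across tables; changing the salt changes every digest.
digest = mask("4111-1111-1111-1111", salt="my-secret-salt")
```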

      Columns filter

      This transformation allows you to apply filtering to the list of columns to transfer from the data source.

      1. Under Tables, click + Tables and specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows transferring all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the table names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the table names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the table names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the table names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the table names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the table names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the table names containing characters other than digits.
        $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics.
        ^ | Matches the position before the table name's first character. This is useful to match database names. | ^monthly_sales returns all the tables from the monthly_sales database.
      2. Under Columns, specify the following:

        • Included columns restricts the set of columns to transfer from the tables specified above.

        • Excluded columns allows transferring all columns except the specified ones.

        Set these column names as regular expressions:

        Collection of regular expression patterns to parse column names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the column names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the column names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the column names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the column names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the column names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the column names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the column names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the column names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the column names containing characters other than digits.
        $ | Matches the position after the column name's last character. | metrics$ returns the column names ending with metrics.
        ^ | Matches the position before the column name's first character. | ^monthly_sales returns the column names starting with monthly_sales.
      Rename tables

      This transformation lets you map a table name on the source to a new table name on the target without changing the contents of the transferred table.

      1. Under Tables list to rename, click + Table.

      2. Under Table 1 → Source table name:

        • For PostgreSQL data sources, use the Named schema field to provide the source table schema. Leave empty for the data sources that don't support schema and/or database abstractions.

        • Specify the initial Table name on the source.

      3. Under Target table name:

        • For PostgreSQL data sources, use the Named schema field to provide the target table schema. Leave empty for the data sources that don't support schema and/or database abstractions.

        • Specify the intended Table name on the target.

      Replace primary key

      This transformation allows you to reassign the primary key column on the target table.

      1. Under Tables, click + Tables and specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows transferring all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the table names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the table names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the table names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the table names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the table names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the table names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the table names containing characters other than digits.
        $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics.
        ^ | Matches the position before the table name's first character. This is useful to match database names. | ^monthly_sales returns all the tables from the monthly_sales database.
      2. Under Key columns names, specify the pairs of columns to replace, separated by commas, as follows:

        <column name at the source> <column name at the target>, 
        

        Warning

        Assigning two or more primary keys per table makes these tables incompatible with ClickHouse®.
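A quick way to sanity-check the pair syntax is to parse it the way the service presumably does. The helper below is hypothetical and only mirrors the `<source> <target>,` format shown above.

```python
def parse_key_pairs(spec: str):
    """Split '<source> <target>, <source> <target>' into (source, target) pairs."""
    pairs = []
    for chunk in spec.split(","):
        chunk = chunk.strip()
        if chunk:  # tolerate the trailing comma shown in the syntax above
            source, target = chunk.split()
            pairs.append((source, target))
    return pairs

# → [('legacy_id', 'id'), ('created_ts', 'created_at')]
parse_key_pairs("legacy_id id, created_ts created_at,")
```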

      Convert values to string

      This transformation allows you to convert a certain data column in a specified table to a string.

      1. Under Tables, click + Tables and specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows transferring all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the table names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the table names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the table names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the table names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the table names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the table names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the table names containing characters other than digits.
        $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics.
        ^ | Matches the position before the table name's first character. This is useful to match database names. | ^monthly_sales returns all the tables from the monthly_sales database.
      2. Under Columns, specify the following:

        • Included columns restricts the set of columns to transfer from the tables specified above.

        • Excluded columns allows transferring all columns except the specified ones.

        Set these column names as regular expressions:

        Collection of regular expression patterns to parse column names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the column names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the column names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the column names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the column names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the column names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the column names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the column names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the column names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the column names containing characters other than digits.
        $ | Matches the position after the column name's last character. | metrics$ returns the column names ending with metrics.
        ^ | Matches the position before the column name's first character. | ^monthly_sales returns the column names starting with monthly_sales.
      3. Under Key columns names, click + to add the name of a column containing the primary key.

      4. Under Non-key column names, click + to add a column without a primary key.

      The conversion will be applied to the columns listed in both sections.

      Convert data to raw JSON

      This transformation lets you convert a given data column in a specified table to raw JSON.

      1. Under Tables, specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows transferring all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the table names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the table names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the table names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the table names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the table names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the table names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the table names containing characters other than digits.
        $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics.
        ^ | Matches the position before the table name's first character. This is useful to match database names. | ^monthly_sales returns all the tables from the monthly_sales database.
      2. Under Columns, specify the following:

        • Included columns restricts the set of columns to transfer from the tables specified above.

        • Excluded columns allows transferring all columns except the specified ones.

        Set these column names as regular expressions:

        Collection of regular expression patterns to parse column names
        Pattern | Description | Example
        abc | An explicit series of characters. | test returns the column names containing test.
        . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc.
        \ | An escape character. Use it to match special characters. | \_ returns the column names containing an underscore.
        ? | The preceding character (or group of characters) is optional. | c?.n returns can, con, in, on, en, etc.
        + | The preceding character (or group of characters) can appear one or more times. | -+ returns the column names containing -, --, ---, etc.
        {n} | The preceding character (or group of characters) must appear exactly n times. | -{2} returns the column names containing --.
        {n,m} | The preceding character (or group of characters) must appear between n and m times. | _{1,3} returns the column names containing _, __, and ___.
        \w | An alphanumeric wildcard. Matches any alphanumeric character; the match is case-sensitive. | \w+ returns the column names containing letters and/or digits.
        \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character; the match is case-sensitive. | \W+ returns the column names containing characters other than letters or digits.
        \d | A digit wildcard. Matches any digit character. | \d+ returns the column names containing digits.
        \D | A non-digit wildcard. Matches any non-digit character. | \D+ returns the column names containing characters other than digits.
        $ | Matches the position after the column name's last character. | metrics$ returns the column names ending with metrics.
        ^ | Matches the position before the column name's first character. | ^monthly_sales returns the column names starting with monthly_sales.
      3. Under Key columns names, click + to add the name of a column containing the primary key.

      4. Under Non-key column names, click + to add a column without a primary key.

      The conversion will be applied to the columns listed in both sections.
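Conceptually, the transformation keeps the key columns as-is and packs the remaining listed columns into a single JSON document. The sketch below is an assumption about the output shape (the `data` field name is made up), not the service's exact schema.

```python
import json

def to_raw_json(row, key_columns, non_key_columns):
    """Keep key columns as-is; pack the listed non-key columns into one JSON string."""
    doc = {c: row[c] for c in key_columns}
    # The "data" field name is a hypothetical placeholder for the raw JSON column.
    doc["data"] = json.dumps({c: row[c] for c in non_key_columns})
    return doc

to_raw_json({"id": 7, "name": "Anna", "age": 31}, ["id"], ["name", "age"])
```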

      Sharding

      This transformation allows you to distribute the tables between multiple shards on the ClickHouse® data destination.

      1. Under Tables, click + Tables and specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows transferring all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names:

        | Pattern | Description | Example |
        |---------|-------------|---------|
        | abc | An explicit series of characters. | test returns the table names containing test. |
        | . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc. |
        | \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore. |
        | ? | Marks a character (or a group of characters) as optional. | c?.n returns can, con, in, on, en, etc. |
        | + | Marks a character (or a group of characters) as appearing one or more times. | -+ returns the table names containing -, --, ---, etc. |
        | {n} | Requires a character (or a group of characters) to appear exactly n times. | -{2} returns the table names containing --. |
        | {n,m} | Requires a character (or a group of characters) to appear between n and m times. | _{1,3} returns the table names containing _, __, and ___. |
        | \w | An alphanumeric wildcard. Matches any alphanumeric character. The match is case-sensitive. | \w+ returns the table names containing letters and/or digits. |
        | \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character. The match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits. |
        | \d | A digit wildcard. Matches any digit character. The match is case-sensitive. | \d+ returns the table names containing digits. |
        | \D | A non-digit wildcard. Matches any non-digit character. The match is case-sensitive. | \D+ returns the table names containing characters other than digits. |
        | $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics. |
        | ^ | Matches the position before the table name's first character. Useful to define database names. | ^monthly_sales returns all the tables from the monthly_sales database. |
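As an illustration, the include and exclude patterns above behave like standard regular expressions applied to table names. A minimal Python sketch (an assumption for illustration only; the service's exact matching details, such as the qualified-name format, may differ):

```python
import re

# Hypothetical table names in a "schema.table" format
tables = ["monthly_sales.orders", "monthly_sales.metrics", "test_data.events"]

# Included tables: everything from the monthly_sales database
included = [t for t in tables if re.search(r"^monthly_sales", t)]

# Excluded tables: anything ending with "metrics"
result = [t for t in included if not re.search(r"metrics$", t)]
print(result)  # ['monthly_sales.orders']
```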
      2. Under Columns, click + Columns and specify the following:

        • Included columns restricts the set of columns to transfer from the tables specified above.

        • Excluded columns allows you to transfer all columns except the specified ones.

      3. Enter the Count of shards between which you want to distribute the table data.
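A common way to picture sharded distribution is hashing a key and mapping it to a shard index. The sketch below is illustrative only; the service's actual shard-assignment algorithm isn't specified here:

```python
import zlib

SHARD_COUNT = 3  # corresponds to the "Count of shards" value

def shard_for(key: str) -> int:
    # Deterministic hash of a sharding key, mapped to a shard index
    return zlib.crc32(key.encode("utf-8")) % SHARD_COUNT

keys = ["user_1", "user_2", "user_3", "user_4"]
assignment = {k: shard_for(k) for k in keys}
```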

      Convert CDC data to raw JSON with history

      This transformation allows you to convert the change data capture (CDC) data to raw JSON with history.

      1. Under Tables, click + Tables and specify the following:

        • Included tables restricts the set of tables to transfer.

        • Excluded tables allows you to transfer all data except the specified tables.

        Set these table names as regular expressions:

        Collection of regular expression patterns to parse table names:

        | Pattern | Description | Example |
        |---------|-------------|---------|
        | abc | An explicit series of characters. | test returns the table names containing test. |
        | . | A character wildcard. Use it to match an expression with defined character positions. | t..t returns test, tent, tart, etc. |
        | \ | An escape character. Use it to match special characters. | \_ returns the table names containing an underscore. |
        | ? | Marks a character (or a group of characters) as optional. | c?.n returns can, con, in, on, en, etc. |
        | + | Marks a character (or a group of characters) as appearing one or more times. | -+ returns the table names containing -, --, ---, etc. |
        | {n} | Requires a character (or a group of characters) to appear exactly n times. | -{2} returns the table names containing --. |
        | {n,m} | Requires a character (or a group of characters) to appear between n and m times. | _{1,3} returns the table names containing _, __, and ___. |
        | \w | An alphanumeric wildcard. Matches any alphanumeric character. The match is case-sensitive. | \w+ returns the table names containing letters and/or digits. |
        | \W | A non-alphanumeric wildcard. Matches any non-alphanumeric character. The match is case-sensitive. | \W+ returns the table names containing characters other than letters or digits. |
        | \d | A digit wildcard. Matches any digit character. The match is case-sensitive. | \d+ returns the table names containing digits. |
        | \D | A non-digit wildcard. Matches any non-digit character. The match is case-sensitive. | \D+ returns the table names containing characters other than digits. |
        | $ | Matches the position after the table name's last character. | metrics$ returns the table names ending with metrics. |
        | ^ | Matches the position before the table name's first character. Useful to define database names. | ^monthly_sales returns all the tables from the monthly_sales database. |
      2. Under Key column names, click + to add the name of a column containing the primary key.

      3. Under Non-key column names, click + to add the name of a column that isn't part of the primary key.

      The conversion will be applied to the columns listed in both sections.

      SQL

      This transformer accepts the ClickHouse® SQL dialect and allows you to perform SQL-like in-memory data transformations. The solution is based on ClickHouse® Local.

      The source table inside ClickHouse® Local is named table, and its structure mimics the source table structure.

      Since each source change item (row) contains extra metadata, the service must match source and target data. Therefore, each row must have a key defined, and all keys must be unique within every batch. To achieve this, the service calls a collapse function.

      If the source keys can't be matched with the transformed data, the affected rows are marked as containing an error.

      When writing an SQL query, you must preserve the original key columns:
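The collapse step can be pictured as keeping only the latest change item per key within a batch, so that every key is unique before the transformation runs. A simplified sketch (the actual collapse function is internal to the service):

```python
# Each change item carries a key and a payload; later items win.
batch = [
    {"key": 1, "value": "a"},
    {"key": 2, "value": "b"},
    {"key": 1, "value": "c"},  # a later change for key 1
]

collapsed = {}
for row in batch:
    collapsed[row["key"]] = row  # last write per key wins

unique_rows = list(collapsed.values())
print(unique_rows)
```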

      SELECT
         parseDateTime32BestEffort(JSONExtractString(CloudTrailEvent, 'eventTime')) AS eventTime,
         JSONExtractString(CloudTrailEvent, 'http_request.user_agent') AS http_useragent,    
         JSONExtractString(CloudTrailEvent, 'errorMessage') AS error_message,
         JSONExtractString(CloudTrailEvent, 'errorCode') AS error_kind, 
         JSONExtractString(CloudTrailEvent, 'sourceIPAddress') AS network_client_ip,
         JSONExtractString(CloudTrailEvent, 'eventVersion') AS eventVersion,
         JSONExtractString(CloudTrailEvent, 'eventSource') AS eventSource,
         JSONExtractString(CloudTrailEvent, 'eventName') AS eventName,
         JSONExtractString(CloudTrailEvent, 'awsRegion') AS awsRegion,
         JSONExtractString(CloudTrailEvent, 'sourceIPAddress') AS sourceIPAddress,
         JSONExtractString(CloudTrailEvent, 'userAgent') AS userAgent,
         JSONExtractString(CloudTrailEvent, 'requestID') AS requestID,
         JSONExtractString(CloudTrailEvent, 'eventID') AS eventID,
         JSONExtractBool(CloudTrailEvent, 'readOnly') AS readOnly,
         JSONExtractString(CloudTrailEvent, 'eventType') AS eventType,
         JSONExtractBool(CloudTrailEvent, 'managementEvent') AS managementEvent,
         JSONExtractString(CloudTrailEvent, 'recipientAccountId') AS recipientAccountId,
         JSONExtractString(CloudTrailEvent, 'eventCategory') AS eventCategory,
         JSONExtractString(CloudTrailEvent, 'aws_account') AS account,
      
         JSONExtractString(CloudTrailEvent, 'userIdentity.type') AS userIdentity_type,
         JSONExtractString(CloudTrailEvent, 'userIdentity.principalId') AS userIdentity_principalId,
         JSONExtractString(CloudTrailEvent, 'userIdentity.arn') AS userIdentity_arn,
         JSONExtractString(CloudTrailEvent, 'userIdentity.accountId') AS userIdentity_accountId,
         JSONExtractString(CloudTrailEvent, 'userIdentity.accessKeyId') AS userIdentity_accessKeyId,
         
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.sessionIssuer.type') AS sessionIssuer_type,
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.sessionIssuer.principalId') AS sessionIssuer_principalId,
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.sessionIssuer.arn') AS sessionIssuer_arn,
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.sessionIssuer.accountId') AS sessionIssuer_accountId,
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.sessionIssuer.userName') AS sessionIssuer_userName,
         
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.webIdFederationData.federatedProvider') AS federatedProvider,
         
         JSONExtractString(CloudTrailEvent, 'userIdentity.sessionContext.attributes.creationDate') AS attributes_creationDate,
         JSONExtractBool(CloudTrailEvent, 'userIdentity.sessionContext.attributes.mfaAuthenticated') AS attributes_mfaAuthenticated,
         
         JSONExtractString(CloudTrailEvent, 'requestParameters.commandId') AS requestParameters_commandId,
         JSONExtractString(CloudTrailEvent, 'requestParameters.instanceId') AS requestParameters_instanceId,
         
         JSONExtractString(CloudTrailEvent, 'tlsDetails.tlsVersion') AS tlsDetails_tlsVersion,
         JSONExtractString(CloudTrailEvent, 'tlsDetails.cipherSuite') AS tlsDetails_cipherSuite,
         JSONExtractString(CloudTrailEvent, 'tlsDetails.clientProvidedHostHeader') AS tlsDetails_clientProvidedHostHeader
      FROM table
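To illustrate what the JSONExtractString calls in the query above do, here is a rough Python equivalent for one row. This is a sketch only: it assumes dotted paths are resolved by walking nested JSON objects and that missing fields yield an empty string, mirroring ClickHouse®'s default-value behavior:

```python
import json

def json_extract_string(doc: str, path: str) -> str:
    # Walk a dotted path through a JSON document, returning "" on a miss
    node = json.loads(doc)
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return ""
        node = node[part]
    return node if isinstance(node, str) else ""

event = json.dumps({
    "eventName": "RunInstances",
    "userIdentity": {"type": "IAMUser"},
})
print(json_extract_string(event, "eventName"))          # RunInstances
print(json_extract_string(event, "userIdentity.type"))  # IAMUser
```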
      
      dbt

      Apply your dbt project to the snapshot of the data transferred to ClickHouse®.

      For more information on how to apply dbt transformation to your data, see Transform your data in a Managed ClickHouse® cluster with dbt.

      1. Specify the address of the Git repository containing your dbt project. It must start with https://. The root directory of the repository must contain a dbt_project.yml file.

      2. Under Git branch, specify the branch or tag of the Git repository containing your dbt project.

      3. Provide the dbt profile name. The profile will be created automatically using the settings of the destination endpoint. The name must match the profile property in the dbt_project.yml file.

      4. From the dropdown list, select the Operation for your dbt project to perform. For more information, see the official dbt documentation.

    Deleting a transformation layer

    To delete a transformation layer, click Delete to the right of the transformation type dropdown menu.

  8. Click Submit.

You can create a transfer using the Transfer and Transfer endpoint resources of the DoubleCloud Terraform provider.

Tip

If you haven't used Terraform before, refer to Create DoubleCloud resources with Terraform for more detailed instructions.

Example provider and resource configuration:

# main.tf

terraform {
  required_providers {
    doublecloud = {
      source    = "registry.terraform.io/doublecloud/doublecloud"
    }
  }
}

provider "doublecloud" {
  authorized_key = file("authorized_key.json")
}

# Create a source endpoint resource
resource "doublecloud_transfer_endpoint" "example-source-endpoint" {
  name       = "example-source-endpoint"
  project_id = DOUBLECLOUD_PROJECT_ID                         # Replace with your project ID
  settings {
    ...                                                       # Add source endpoint configuration
  }
}

# Create a target endpoint resource
resource "doublecloud_transfer_endpoint" "example-target-endpoint" {
  name       = "example-target-endpoint"
  project_id = DOUBLECLOUD_PROJECT_ID                         # Replace with your project ID
  settings {
    ...                                                       # Add target endpoint configuration
  }
}

# Create a transfer resource
resource "doublecloud_transfer" "example-transfer" {
  name       = "example-transfer"
  project_id = DOUBLECLOUD_PROJECT_ID                          # Replace with your project ID
  source     = doublecloud_transfer_endpoint.example-source-endpoint.id
  target     = doublecloud_transfer_endpoint.example-target-endpoint.id
  type       = "SNAPSHOT_ONLY"
  activated = false
  transformation = {
    transformers = [
      {
        dbt = {
          git_repository_link = "https://github.com/doublecloud/tests-clickhouse-dbt.git"
          profile_name = "my_clickhouse_profile"
          operation = "run"
        }
      },
    ]
  }
}

To learn how to get the authorized_key.json file, refer to Create an API key. You can find the DoubleCloud project ID on the project settings page.

Tip

For a full list of available parameters, refer to the Transfer resource and Transfer endpoint resource schemas.

To create a transfer, use the TransferService create method. Pass the following parameters to create a working transfer:

  • source_id - the endpoint ID for the source endpoint.

    To find the endpoint ID, get a list of endpoints in the project.

  • target_id - the endpoint ID for the target endpoint.

  • name - the transfer name. Must be unique within the project.

  • project_id - the ID of the project in which you create a transfer. You can get this value on your project's information page.

  • type - specify your transfer type using the doublecloud.transfer.v1.TransferType model.


View this example on GitHub

from doublecloud.transfer.v1.transfer_pb2 import TransferType
from doublecloud.transfer.v1.transfer_service_pb2 import CreateTransferRequest


def create_transfer(svc, project_id: str, name: str, src_id: str, dst_id: str):
   return svc.Create(
      CreateTransferRequest(
            source_id=src_id, target_id=dst_id, name=name, project_id=project_id, type=TransferType.SNAPSHOT_ONLY
      )
   )

For more in-depth examples, check out the DoubleCloud API Python SDK repository.

func createTransfer(ctx context.Context, dc *dc.SDK, flags *cmdFlags, src string, dst string) (*operation.Operation, error) {
   op, err := dc.WrapOperation(dc.Transfer().Transfer().Create(ctx, &transfer.CreateTransferRequest{
      SourceId:  src,
      TargetId:  dst,
      Name:      *flags.name,
      ProjectId: *flags.projectID,
      Type:      transfer.TransferType_SNAPSHOT_ONLY,
   }))
   if err != nil {
      return op, err
   }
   err = op.Wait(ctx)
   return op, err
}

For more in-depth examples, check out the DoubleCloud API Go SDK repository.

Activate a transfer

  1. Open the Transfer page in the console.

  2. Click Activate.

To activate a transfer:

  1. Change the activated parameter in the transfer resource configuration to true:

    resource "doublecloud_transfer" "example-transfer" {
      name      = "example-transfer"
      ...
      activated = true
      ...
    }
    
  2. Update the transfer resource:

    terraform apply
    

Use the TransferService activate method and pass the transfer ID in the transfer_id request parameter.

To find the transfer ID, get a list of transfers in the project.


View this example on GitHub

from doublecloud.transfer.v1.transfer_service_pb2 import ActivateTransferRequest

def activate_transfer(svc, transfer_id: str):
   return svc.Activate(ActivateTransferRequest(transfer_id=transfer_id))

func activateTransfer(ctx context.Context, dc *dc.SDK, transferId string) (*operation.Operation, error) {
   op, err := dc.WrapOperation(dc.Transfer().Transfer().Activate(ctx, &transfer.ActivateTransferRequest{
      TransferId: transferId,
   }))
   if err != nil {
      return op, err
   }
   err = op.Wait(ctx)
   return op, err
}

For more in-depth examples, check out the DoubleCloud API Go SDK repository.

Deactivate a transfer

  1. Open the Transfer page in the console.

  2. Click the icon near the transfer name and select Deactivate.

To deactivate a transfer:

  1. Change the activated parameter in the transfer resource configuration to false:

    resource "doublecloud_transfer" "example-transfer" {
      name = "example-transfer"
      ...
      activated = false
      ...
    }
    
  2. Update the transfer resource:

    terraform apply
    

Use the TransferService deactivate method and pass the transfer ID in the transfer_id request parameter.

To find the transfer ID, get a list of transfers in the project.


View this example on GitHub

from doublecloud.transfer.v1.transfer_service_pb2 import DeactivateTransferRequest

def deactivate_transfer(svc, transfer_id: str):
   return svc.Deactivate(DeactivateTransferRequest(transfer_id=transfer_id))

func deactivateTransfer(ctx context.Context, dc *dc.SDK, transferId string) (*operation.Operation, error) {
   op, err := dc.WrapOperation(dc.Transfer().Transfer().Deactivate(ctx, &transfer.DeactivateTransferRequest{
      TransferId: transferId,
   }))
   if err != nil {
      return op, err
   }
   err = op.Wait(ctx)
   return op, err
}

For more in-depth examples, check out the DoubleCloud API Go SDK repository.

Monitor transfer status

You can monitor the health status of your transfer using the Status history timeline on the Overview page:


The timeline shows the status snapshots of your transfer as colored vertical bars. Currently, there are two possible status messages:

  • OK - the transfer is working normally.
  • ERROR - something is wrong with your transfer. Get further information from the logs.

In the top-right corner of the timeline, you can select the scale of the timeline: one hour (1h), one day (1d), or one week (1w). You can see the current time as a vertical red marker on the timeline.

To navigate the timeline:

  • Use the arrow buttons to move the time scale by the selected interval.
  • Use your mouse cursor to drag the highlighted interval along the timeline above the colored bars.

On the right of the status display, you can see the information card with the current transfer statistics:

  • The number of bytes read.
  • The number of data rows read. This parameter is used to calculate the transfer's resource consumption.

Transfer data from the AWS networks

Transfer requires access to the resources in AWS that are usually protected by a set of tools. Refer to the following guide to see how to peer networks and configure DoubleCloud and AWS to perform a successful transfer: Transfer data from a PostgreSQL RDS instance to ClickHouse®.

Update a transfer

  1. Open the Transfer page in the console.

  2. Select the transfer to update.

  3. Click Edit at the top-right of the page.

  4. Edit Basic and Transfer settings (you can't change the Source and Target endpoints at this point).

  5. Click Submit.

To update a transfer:

  1. Change the desired parameters in the transfer resource configuration:

    resource "doublecloud_transfer" "example-transfer" {
      name = "paid-traffic-data"
      ...
      transformation = {
        transformers = [
          {
            dbt = {
              git_repository_link = "https://github.com/doublecloud/tests-clickhouse-dbt.git"
              profile_name = "my_clickhouse_profile"
              operation = "run"
            }
          },
        ]
      }
    }
    
  2. Update the transfer resource:

    terraform apply
    

Use the TransferService update method and pass the transfer ID in the transfer_id request parameter.

To find the transfer ID, get a list of transfers in the project.

You can change the following parameters:

  • description - a new description for the transfer.

  • name - a new transfer name. It must be unique within the project.

Delete a transfer

  1. Open the Transfer page in the console.

  2. Select the transfer to delete.

  3. Click Delete at the top-right of the page.

  4. Confirm deletion.

  5. Click Delete.

To delete a transfer:

  1. Comment out or delete the transfer resource from the Terraform configuration files.

  2. Apply the configuration:

    terraform apply
    

Use the TransferService delete method and pass the transfer ID in the transfer_id request parameter.

To find the transfer ID, get a list of transfers in the project.


View this example on GitHub

from doublecloud.transfer.v1.transfer_service_pb2 import DeleteTransferRequest


def delete_transfer(svc, transfer_id: str):
   return svc.Delete(DeleteTransferRequest(transfer_id=transfer_id))

For more in-depth examples, check out the DoubleCloud API Python SDK repository.

func deleteTransfer(ctx context.Context, dc *dc.SDK, transferId string) (*operation.Operation, error) {
   op, err := dc.WrapOperation(dc.Transfer().Transfer().Delete(ctx, &transfer.DeleteTransferRequest{
      TransferId: transferId,
   }))
   if err != nil {
      return op, err
   }
   err = op.Wait(ctx)
   return op, err
}

For more in-depth examples, check out the DoubleCloud API Go SDK repository.