PostgreSQL connector
You can use the PostgreSQL connector in both source and target endpoints. In source endpoints, the connector retrieves data from PostgreSQL databases. In target endpoints, it inserts data into PostgreSQL databases.
Source endpoint
Requirements for source PostgreSQL databases
- The transferred data must be in tables, not views.
- Each table must have a primary key.
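Since views aren't transferred and every table needs a primary key, it can be useful to check the source up front. Below is a minimal sketch, assuming the psycopg2 driver is installed; the connection string and schema filter are placeholders, not part of the connector.

```python
# Minimal sketch (placeholder DSN): list regular tables without a primary key,
# which therefore don't meet the source endpoint requirements.
import psycopg2

QUERY = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS tc
  ON tc.table_schema = t.table_schema
 AND tc.table_name = t.table_name
 AND tc.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('pg_catalog', 'information_schema')
  AND tc.constraint_name IS NULL;
"""

with psycopg2.connect("postgres://user:password@localhost:5432/source_db") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for schema, table in cur.fetchall():
            print(f"{schema}.{table} has no primary key")
```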
To configure a source PostgreSQL endpoint, provide the following settings:
- Configure the connection:
  - Under Connection settings → Connection type, expand Hosts.
  - Click to add a host.
  - In Port, enter the connection port, or leave the default value of `5432`.
  - To encrypt data transmission, upload a `.pem` file with certificates under CA Certificate.
  - In Database, enter the name of the database associated with the user and password.
  - In User and Password, enter the user credentials to connect to the database.
Warning

The user must have a role with the `REPLICATION` attribute or be a superuser. Learn more
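For illustration, here is one way to create such a user with plain SQL, wrapped in a small Python script. The role name, password, and schema are placeholders; adjust the grants to the tables you actually transfer.

```python
# Minimal sketch (run as a superuser; role name, password, and schema are placeholders):
# create a login role with the REPLICATION attribute and read access to the data.
import psycopg2

STATEMENTS = [
    "CREATE ROLE transfer_user WITH LOGIN REPLICATION PASSWORD 'change-me'",
    "GRANT USAGE ON SCHEMA public TO transfer_user",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO transfer_user",
]

conn = psycopg2.connect("postgres://postgres:password@localhost:5432/source_db")
conn.autocommit = True  # run each statement outside an explicit transaction
with conn.cursor() as cur:
    for statement in STATEMENTS:
        cur.execute(statement)
conn.close()
```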
Standard PostgreSQL connection string reference

Typically, a PostgreSQL connection string looks as follows:

`postgres://<username>:<password>@<hostname>:<port>/<database>`

- `<username>:<password>`: (Optional) Authentication credentials.
- `<hostname>`: IP address or domain name (FQDN) of the PostgreSQL server.
- `<port>`: Port. Default is `5432`.
- `<database>`: Name of the database to connect to.
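If you already have such a connection string, its parts map directly onto the fields described above. A small illustration with placeholder values:

```python
# Minimal sketch: split a standard connection string (placeholder values) into the
# Host, Port, Database, User, and Password fields of the endpoint.
from urllib.parse import urlsplit

dsn = "postgres://transfer_user:change-me@pg.example.com:5432/sales"
parts = urlsplit(dsn)

print("Host:    ", parts.hostname)           # pg.example.com
print("Port:    ", parts.port or 5432)       # 5432 if omitted
print("Database:", parts.path.lstrip("/"))   # sales
print("User:    ", parts.username)
print("Password:", parts.password)
```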
- Configure the Table filter if you need to transfer only specific tables. If you don't specify any of the settings below, all tables will be transferred.
  - Included tables
    The service will transfer only the data from these tables. Specify the table names together with the schema in the `schema_name.target_table` format. Enter `schema_name.*` to select all tables of a schema.
  - Excluded tables
    The data from the tables on this list won't be transferred. Specify the table names together with the schema in the `schema_name.excluded_table` format. Enter `schema_name.*` to select all tables of a schema.
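For example, with hypothetical names, setting Included tables to `sales.*` and Excluded tables to `sales.audit_log` transfers every table in the `sales` schema except `sales.audit_log`.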
- Configure Schema migration:

Tip
The default schema migration settings usually let you successfully perform a transfer. Change the settings of the initial and final stages of the transfer only if necessary.

During the transfer process, the database schema is transferred from the source to the destination. The service uses the pg_dump utility for this.
The transfer is performed in two stages:
- Activation stage

This stage is performed on activation of the transfer, before snapshotting or replication, to create the schema on the target. You can select which parts of the schema to transfer at this stage. By default, they include the following:
- Sequences
- Owned sequences
- Current values for sequences
- Tables
- Primary keys
- Default values
- Views
- Functions
- Types
- Rules
- Collations
- Policies
- Deactivation stage

This stage is performed at the end of the transfer, upon its deactivation.
If the transfer runs continuously in replication mode (the Replication and the Snapshot and replication transfer types), the final stage of the transfer is performed only after replication stops. You can choose which parts of the schema to migrate at this stage.
At this stage, it's assumed that there is no write activity on the source when the transfer is deactivated. For complete reliability, the source is set to read-only mode. The database schema on the target is brought to a state consistent with the schema on the source.
We recommend including resource-intensive operations, such as index transfers, in this final stage: moving the indexes to the beginning of the transfer can slow it down.
Default objects migrated at the end of a transfer are the following:
- Foreign keys
- Constraints
- Indexes
- Triggers
Note
When the transfer is restarted during the replication phase, the table schemas on the target are preserved. In this case, the service will transfer only the table schemas that aren't in the target at the restart.
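The two stages are similar in spirit to the pre-data and post-data sections of pg_dump. If you want to preview which schema objects fall on each side for your database, a sketch like the one below can help; pg_dump must be available locally, the connection string is a placeholder, and the section split only approximates the stage lists above.

```python
# Minimal sketch (placeholder DSN; requires pg_dump on PATH): dump the schema in two
# sections, roughly matching the activation (pre-data) and deactivation (post-data) stages.
import subprocess

DSN = "postgres://user:password@localhost:5432/source_db"

for section in ("pre-data", "post-data"):
    dump = subprocess.run(
        ["pg_dump", "--schema-only", f"--section={section}", "--dbname", DSN],
        check=True, capture_output=True, text=True,
    ).stdout
    print(f"--- {section}: {len(dump.splitlines())} lines of DDL ---")
```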
- Specify Advanced settings:
  - Maximum WAL size for the replication slot. Here you can set the maximum size of the Write-Ahead Log kept for the replication slot (see the monitoring sketch after this list).
  - The name of the Database schema for auxiliary tables.
  - Check the Merge inherited tables box if you need to merge the contents of inherited tables.
  - Under Parallel table copying settings:
    - Split threshold in bytes
      This setting defines the minimum table size in bytes starting from which the service parallelizes the table copying. To the right of the input field, select the unit from the dropdown list, from B (bytes) to GB (gigabytes).
    - Maximum number of table parts
      The maximum number of parts into which the service will split a table when using sharded loading.
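To see how this limit relates to what the source is currently holding, you can query how much WAL each replication slot retains. A minimal sketch, assuming PostgreSQL 10 or later and the psycopg2 driver; the connection string is a placeholder:

```python
# Minimal sketch (PostgreSQL 10+; placeholder DSN): show how much WAL each
# replication slot on the source currently retains.
import psycopg2

QUERY = """
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
"""

with psycopg2.connect("postgres://user:password@localhost:5432/source_db") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for slot_name, retained_wal in cur.fetchall():
            print(slot_name, retained_wal)
```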
Partition replication for partitioned tables
If you need to replicate all the partitions of a partitioned table, do the following:
- Add the table in question to the Table filter → Included tables list.
- Check the Advanced settings → Merge inherited tables box.

This will result in the source endpoint extracting the partitioned table without partitions: the extracted items' table name will be set to the name of the partitioned table, even if an item represents a row of a partition.
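For example, if a hypothetical partitioned table `public.measurements` has monthly partitions such as `measurements_2024_01`, adding `public.measurements` to Included tables and checking Merge inherited tables makes every extracted row carry the table name `public.measurements`, including rows read from the partitions.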
For an example of using this endpoint in a real scenario, see Transfer data from a PostgreSQL RDS instance to ClickHouse®.
To create a PostgreSQL source endpoint with the API, use the endpoint.PostgresSource model.
PostgreSQL type | Transfer type |
---|---|
BIGINT | int64 |
INTEGER | int32 |
SMALLINT | int16 |
— | int8 |
— | uint64 |
— | uint32 |
— | uint16 |
— | uint8 |
— | float |
NUMERIC, REAL, DOUBLE PRECISION | double |
BIT (N), BIT VARYING (N), BYTEA, BIT, BIT VARYING | string |
CHARACTER VARYING, DATA, UUID, NAME, TEXT, INTERVAL, TIME WITH TIME ZONE, TIME WITHOUT TIME ZONE, CHAR, ABSTIME, MONEY | utf8 |
BOOLEAN | boolean |
DATE | date |
— | datetime |
TIMESTAMP WITHOUT TIME ZONE, TIMESTAMP WITH TIME ZONE | timestamp |
ARRAY, CHARACTER, CITEXT, HSTORE, JSON, JSONB, DATERANGE, INT4RANGE, INT8RANGE, NUMRANGE, POINT, TSRANGE, XML, INET, CIDR, MACADDR, OID, REST ... | any |
Target endpoint
To configure a target PostgreSQL endpoint, provide the following settings:
- Configure the connection:
  - Under Connection settings → Connection type, expand Hosts.
  - Click to add a host.
  - In Port, enter the connection port, or leave the default value of `5432`.
  - To encrypt data transmission, upload a `.pem` file with certificates under CA Certificate.
  - In Database, enter the name of the database associated with the user and password.
  - In User and Password, enter the user credentials to connect to the database.
Warning

The user must have a role with the `REPLICATION` attribute or be a superuser. Learn more
Standard PostgreSQL connection string reference

Typically, a PostgreSQL connection string looks as follows:

`postgres://<username>:<password>@<hostname>:<port>/<database>`

- `<username>:<password>`: (Optional) Authentication credentials.
- `<hostname>`: IP address or domain name (FQDN) of the PostgreSQL server.
- `<port>`: Port. Default is `5432`.
- `<database>`: Name of the database to connect to.
- Select a Cleanup policy to specify how data in the target database is cleaned up when a transfer is activated, reactivated, or reloaded.
Cleanup policy reference
- Disabled: Don't clean. Select this option if you only perform replication without copying data.
- Drop (default): Fully delete the tables included in the transfer. Use this option to always transfer the latest version of the table schema from the source to the target database.
- Truncate: Execute the TRUNCATE command for the target tables included in the transfer.
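For example, with the default Drop policy the target tables included in the transfer are dropped and recreated from the current source schema on every activation, while Truncate keeps the existing table definitions and only deletes the rows they contain.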
- If you want to transfer all the transactions in the source database to the target database with their boundaries preserved, expand Advanced settings and enable Save transaction boundaries.
Note
This feature is in the Preview stage.
To create a PostgreSQL target endpoint with the API, use the endpoint.PostgresTarget model.
Transfer type | PostgreSQL type |
---|---|
int64 | BIGINT |
int32 | INTEGER |
int16 | SMALLINT |
int8 | SMALLINT |
uint64 | BIGINT |
uint32 | INTEGER |
uint16 | SMALLINT |
uint8 | SMALLINT |
float | REAL |
double | DOUBLE PRECISION |
string | BYTEA |
utf8 | TEXT |
boolean | BOOLEAN |
date | DATE |
datetime | TIMESTAMP WITHOUT TIME ZONE |
timestamp | TIMESTAMP WITHOUT TIME ZONE |
any | JSONB |