DoubleCloud’s 14th Product Update
Hey everyone! Victor here with the latest news.
DynamoDB and Amazon Kinesis connectors
To assist users and customers migrating from Rockset, we have introduced DynamoDB and Amazon Kinesis connectors. We’ve observed that many users utilize DynamoDB alongside Rockset as a data source. Amazon Kinesis also facilitates CDC (Change Data Capture) from DynamoDB through data streams.
DynamoDB is also a popular choice for those building mobile applications, games, and other projects that require efficient data storage for lookup queries. However, it can be challenging to use for analytical purposes and to perform aggregations or analytical operations on large datasets. As a result, it’s often necessary to offload data to another storage solution that is more efficient for such tasks. ClickHouse is an ideal solution for these scenarios, and now setting up this integration is extremely easy using our Data Transfer service.
CDC and AWS Lambda transformations in DoubleCloud Data Transfer
We have significantly enhanced the transformation capabilities in our Data Transfer service by adding the ability to perform transformations on the fly, even in CDC pipelines. Under the hood, we are processing small batches and leveraging ClickHouse’s in-memory capabilities to execute SQL transformations. This can be applied to streams from Apache Kafka, changefeeds from PostgreSQL, or MongoDB, where you can flatten the schema into columns in ClickHouse.
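To make "flattening the schema into columns" concrete, here is a minimal Python sketch of the idea. It is only an illustration of the concept, not how Data Transfer implements it (the service runs SQL transformations in ClickHouse). The `flatten` helper, the underscore separator, and the sample document are all assumptions made for this example.

```python
from typing import Any


def flatten(doc: dict[str, Any], prefix: str = "", sep: str = "_") -> dict[str, Any]:
    """Turn a nested document into a flat mapping of column name -> value.

    Hypothetical helper for illustration only: nested keys are joined
    with `sep`, so {"user": {"name": "Ann"}} becomes {"user_name": "Ann"}.
    """
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the column prefix along.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# A made-up MongoDB-style document with two levels of nesting.
sample = {"id": 7, "user": {"name": "Ann", "geo": {"city": "Bern"}}}
row = flatten(sample)
```

Each key of the resulting `row` would correspond to one column in the target table.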
Additionally, we’ve introduced the ability to trigger AWS Lambda as a transformation step. You can write your own logic and set it up as an AWS Lambda function, which will receive raw data and return transformed or enriched data. This feature is particularly useful for performing enrichments, lookups to external sources, or executing complex transformations using your preferred language, such as TypeScript, Python, or Go. Here is a great example of how to achieve this.
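As a rough sketch of what such a function could look like in Python: the `handler(event, context)` signature is the standard AWS Lambda entry point, but the payload shape used here (a `records` list of plain objects, returned in the same shape) is an assumption for illustration and not the documented Data Transfer contract. The lookup table and the `country_code` / `country_name` field names are also hypothetical.

```python
from typing import Any

# Hypothetical lookup table; a real function might query an external source instead.
COUNTRY_BY_CODE = {"DE": "Germany", "SG": "Singapore"}


def enrich(record: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the record with a derived country_name field."""
    enriched = dict(record)
    code = record.get("country_code")
    enriched["country_name"] = COUNTRY_BY_CODE.get(code, "unknown")
    return enriched


def handler(event: dict[str, Any], context: Any) -> dict[str, Any]:
    """Lambda entry point: receives raw records, returns enriched records.

    Assumed payload shape (not confirmed by the service docs):
    {"records": [{...}, {...}]}
    """
    records = event.get("records", [])
    return {"records": [enrich(r) for r in records]}
```

Keeping the enrichment in a small pure function like `enrich` makes it easy to test locally before deploying the Lambda.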
DoubleCloud Data Visualization service update
- New UI that shows dependencies among selectors and charts
- Added dataset optimize join (required) field
- Improved calendar component in connections
- Added IMAGE markup formula
- Added title options for the indicator chart
Plus hundreds of other small bug fixes and improvements, which you can see here in versions from 16.03.2024 to 06.06.2024.
Chart-to-Chart filtration in Data Visualization
Chart-to-Chart filtration is helpful when you want to filter data in other charts based on values selected in a source chart. Essentially, these charts act as selectors within a dashboard. To enable this, activate the option in the chart settings. When you then click on values in the chart, all charts based on the same dataset are filtered accordingly.
This functionality is currently in preview.
New GCP regions
GCP usage is growing, and we have added new regions to better serve our users. This enables them to build analytics closer to their deployed applications and end-users.
Below is a list of new regions where you can deploy fully managed ClickHouse and Apache Kafka services:
- Belgium (europe-west1)
- Switzerland (europe-west6)
- Virginia in North America (us-east4)
- Singapore (asia-southeast1)
ClickHouse and Apache Kafka version updates
ClickHouse version 24.6 is now available on DoubleCloud. You can find the changelog here. Some of my favorite updates include memory optimizations and fixes for memory leaks. ClickHouse Inc. has also made a lot of improvements to the latest LTS version, 24.3, so we have made 24.3 the default option when you create a new ClickHouse cluster.
We’ve also added Apache Kafka versions 3.6 and 3.7. The most notable improvements include enhanced support for KRaft, extended monitoring capabilities with new metrics, and the introduction of the official Apache Kafka Docker image, which enables quicker testing and deployment.
Quality of life improvements
Here’s a list of some minor but potentially pivotal improvements that simplify workflows and day-to-day tasks:
- New progress label: We’ve added a cute progress label that displays the approximate cluster creation time.
- Enhanced Operation tab details: Instead of the generic “Cluster maintenance” message, the Operation tab now provides more detailed information, such as certificate rotations or minor version updates. You’ll also see events related to auto-scaling operations.
- BYOC private mode improvement: In BYOC private mode, we automatically configure S3 VPC endpoints to match the S3 region, reducing traffic costs.
- Data Visualization service API enhancements: We’ve increased support for large requests in the Data Visualization service API, enabling the deployment of larger dashboards or workbooks.
- Airbyte connector fixes: We committed fixes for Airbyte connectors, specifically for BigQuery and DynamoDB.
- More parameters for the TLS layer in ClickHouse: PR with details about that TLS improvement in ClickHouse.
- Billing information update: You can now specify an email for billing information.
Documentation
Below is a list of some new and useful documentation articles that are worth mentioning and reading: