Getting started with Apache Kafka®

Before you start

Your primary tool for interacting with DoubleCloud is the console. Set it up and configure it before moving on:

1. Go to the Clusters overview page.

2. If you already have an account, log in to DoubleCloud, or sign up if you're opening the console for the first time.

Warning

The steps below show how to set up kcat from Docker. For other connection options, see Connect to an Apache Kafka® cluster.

Docker

Pull the kcat image. This tutorial uses version 1.7.1, but you can use the latest one:

```shell
docker pull edenhill/kcat:1.7.1
```

Native kafkacat (DEB)

Install kafkacat from your Linux distribution's repository:

```shell
sudo apt install kafkacat
```
Create your cluster

A cluster in the Managed Service for Apache Kafka® is one or more broker hosts that store topics and their partitions.

Warning

During the trial period, you can create clusters with up to 8 cores, 32 GB of RAM, and 400 GB of storage. If you need to raise these quotas, don't hesitate to contact our support team.
1. Go to the Clusters overview page.

2. Click Create cluster in the upper-right corner of the page.

3. Select Apache Kafka®.

4. Choose a provider:

AWS

1. Under Resources:

   - Select the s1-c2-m4 preset for CPU, RAM capacity, and storage space to create a cluster with minimal configuration.

     Understand your Apache Kafka® resource preset

     A resource preset has the following structure:

     `<CPU platform>-c<number of CPU cores>-m<number of gigabytes of RAM>`

     There are three available CPU platforms:

     - g - ARM Graviton
     - i - Intel (x86)
     - s - AMD (x86)

     For example, the i1-c2-m8 preset means an Intel platform with a 2-core CPU and 8 gigabytes of RAM. You can see the availability of CPU platforms across our Managed Service for Apache Kafka® areas and regions.
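The naming scheme can be illustrated with a short Python sketch. The helper below is ours, not part of DoubleCloud's tooling, and it assumes the digit after the platform letter is a generation number that can be ignored for this purpose:

```python
import re

# Platform letters as listed in the preset naming scheme above.
PLATFORMS = {"g": "ARM Graviton", "i": "Intel (x86)", "s": "AMD (x86)"}

def parse_preset(preset_id: str) -> dict:
    """Split a preset ID like 's1-c2-m4' into platform, cores, and RAM."""
    match = re.fullmatch(r"([gis])\d+-c(\d+)-m(\d+)", preset_id)
    if not match:
        raise ValueError(f"unrecognized preset ID: {preset_id}")
    platform, cores, ram = match.groups()
    return {
        "platform": PLATFORMS[platform],
        "cores": int(cores),
        "ram_gb": int(ram),
    }

print(parse_preset("s1-c2-m4"))  # AMD (x86), 2 cores, 4 GB of RAM
print(parse_preset("i1-c2-m8"))  # Intel (x86), 2 cores, 8 GB of RAM
```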
   - Select the number of zones and brokers. In this tutorial, we create a cluster with 1 zone and 1 broker.

2. Under Basic settings:

   - Enter the cluster Name, for example, quickstart-cluster.
   - From the Version drop-down list, select the Apache Kafka® version the cluster will use. For most clusters, we recommend using the latest version.

3. Under Networking → VPC, specify the DoubleCloud VPC in which to locate your cluster. Use the default value in the previously selected region if you don't need to create this cluster in a specific network.

4. Click Submit.
Google Cloud

1. Under Resources, select the resource preset and the number of zones and brokers, as described in the AWS tab.

2. Under Basic settings:

   - Enter the cluster Name, for example, quickstart-cluster.
   - From the Version drop-down list, select the Apache Kafka® version the cluster will use. For most clusters, we recommend using the latest version.

3. Under Networking → VPC, specify the DoubleCloud VPC in which to locate your cluster. Use the default value in the previously selected region if you don't need to create this cluster in a specific network.

4. Click Submit.

Your cluster will appear with the Creating status on the Clusters page in the console. Setting everything up may take some time. When the cluster is ready, its state changes to Alive.

Click the cluster and you'll see the following page:
To create an Apache Kafka® cluster through the API, use the ClusterService Create method. The required parameters for a functional cluster:

- project_id - the ID of your project. You can find this value on your project's information page.
- cloud_type - aws.
- region_id - for this quickstart, use eu-central-1.
- name - quickstart-cluster.
- resources - specify the following from the doublecloud.kafka.v1.Cluster model:
  - resource_preset_id - for this quickstart, specify s1-c2-m4.
  - disk_size - 34359738368 bytes (32 GB).
  - broker_count - 1.
  - zone_count - 3.

You can also enable the schema registry for your cluster: use the schema_registry_config object within the ClusterService Create method.
```json
{
  "project_id": "<your Project ID>",
  "cloud_type": "aws",
  "region_id": "eu-central-1",
  "name": "quickstart-cluster",
  "resources": {
    "kafka": {
      "resource_preset_id": "s1-c2-m4",
      "disk_size": "34359738368",
      "broker_count": 1,
      "zone_count": 3
    }
  }
}
```
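As a sanity check on the numbers above: 32 GB expressed in bytes is 32 × 1024³ = 34359738368. The sketch below assembles the same request body as a plain Python dict; it's only an illustration and does not call the DoubleCloud API:

```python
import json

GIB = 1024 ** 3  # disk_size is given in bytes

def kafka_cluster_spec(project_id: str, disk_gb: int = 32) -> dict:
    """Build the ClusterService Create request body shown above."""
    return {
        "project_id": project_id,
        "cloud_type": "aws",
        "region_id": "eu-central-1",
        "name": "quickstart-cluster",
        "resources": {
            "kafka": {
                "resource_preset_id": "s1-c2-m4",
                "disk_size": str(disk_gb * GIB),  # 32 GB -> "34359738368"
                "broker_count": 1,
                "zone_count": 3,
            }
        },
    }

spec = kafka_cluster_spec("<your Project ID>")
print(json.dumps(spec, indent=2))
```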
Note

The DoubleCloud service creates the admin superuser and its password automatically. You can find both the User and the Password on the Overview tab of the cluster information page.
Create a topic

After you've created a cluster, you also need to create a topic for messages:

1. On the cluster's page, go to the Topics tab.

2. Click Create.

3. Under Topic Settings, specify the connection properties:

   - Cleanup policy - Delete. This policy deletes log segments when their retention time or log size reaches the limit.
   - Compression Type - Uncompressed. We don't need compression for this tutorial, so disable it.
   - Retention Bytes - 1048576 (1 MB).
   - Retention Ms - 600000 (10 minutes).

4. Specify the Basic Settings:

   - Name - the topic's name. Let's call it first-topic.
   - Partitions - the number of the topic's partitions. Set it to 1 to create the simplest topic.
   - Replication factor - the number of copies of a topic in a cluster. This parameter's value shouldn't exceed the overall number of brokers in the cluster. Set it to 1.

   Your topic should look as follows:

5. Click Submit.
Use the TopicService Create method and pass the following parameters:

- cluster_id - the ID of the cluster in which you want to create a topic. To find the cluster ID, get a list of clusters in the project.
- topic_spec - let's configure the required topic specifications:
  - name - specify the topic name, first-topic.
  - partitions - set the minimum number of partitions for this quickstart, 1.
  - replication_factor - go for the basic option here as well and specify 1.
  - topic_config_3 - use the doublecloud.kafka.v1.TopicConfig3 model to set further topic configuration for Apache Kafka® version 3 and above:
    - cleanup_policy - set the cleanup policy for the topic, in this case CLEANUP_POLICY_DELETE.
    - compression_type - we don't need compression for this tutorial, so specify COMPRESSION_TYPE_UNCOMPRESSED.
    - retention_bytes - 1048576 (1 MB).
    - retention_ms - 600000 (10 minutes).
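The parameters above can be gathered into a single request structure. The sketch below is illustrative only: it builds a plain Python dict using the field names from this section and does not call the DoubleCloud API:

```python
def topic_spec(cluster_id: str) -> dict:
    """Assemble the TopicService Create parameters described above."""
    return {
        "cluster_id": cluster_id,
        "topic_spec": {
            "name": "first-topic",
            "partitions": 1,
            "replication_factor": 1,
            "topic_config_3": {
                "cleanup_policy": "CLEANUP_POLICY_DELETE",
                "compression_type": "COMPRESSION_TYPE_UNCOMPRESSED",
                "retention_bytes": 1048576,  # 1 MB
                "retention_ms": 600000,      # 10 minutes
            },
        },
    }

print(topic_spec("<your cluster ID>"))
```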
Connect to your cluster

When you have a cluster and a topic in it, connect to the cluster and transfer a text message between a producer and a consumer:

1. Run a command that contains a connection string to create a consumer. You can copy the Connection string from the Overview tab on your cluster information page. The command has the following structure:

   Docker

   ```shell
   docker run --name kcat --rm -i -t edenhill/kcat:1.7.1 -C \
     -b <broker FQDN>:9091 \
     -t <topic name> \
     -X security.protocol=SASL_SSL \
     -X sasl.mechanisms=SCRAM-SHA-512 \
     -X sasl.username="admin" \
     -X sasl.password="<cluster password>" \
     -Z
   ```

   DEB-based

   ```shell
   kafkacat -C \
     -b <broker FQDN>:9091 \
     -t <topic name> \
     -X security.protocol=SASL_SSL \
     -X sasl.mechanisms=SCRAM-SHA-512 \
     -X sasl.username="admin" \
     -X sasl.password="<cluster password>" \
     -Z
   ```

   You'll see the following status message:

   ```
   % Reached end of topic first-topic [0] at offset 0
   ```

2. Execute the following command in a separate terminal instance to create a producer and push the data:

   Docker

   ```shell
   curl https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.json | docker run --name kcat --rm -i edenhill/kcat:1.7.1 -P \
     -b <broker FQDN>:9091 \
     -t <topic name> \
     -k key \
     -X security.protocol=SASL_SSL \
     -X sasl.mechanisms=SCRAM-SHA-512 \
     -X sasl.username="<username>" \
     -X sasl.password="<password>"
   ```

   DEB-based

   ```shell
   curl https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.json | kafkacat -P \
     -b <broker FQDN>:9091 \
     -t <topic name> \
     -k key \
     -X security.protocol=SASL_SSL \
     -X sasl.mechanisms=SCRAM-SHA-512 \
     -X sasl.username="<username>" \
     -X sasl.password="<password>"
   ```

3. If you've completed all the steps successfully, the terminal with the consumer will show the uploaded data:

   ```
   {
     "Hit_ID": 40668,
     "Date": "2017-09-09",
     "Time_Spent": "730.875",
     "Cookie_Enabled": 0,
     "Redion_ID": 11,
     "Gender": "Female",
     "Browser": "Chrome",
     "Traffic_Source": "Social network",
     "Technology": "PC (Windows)"
   }
   % Reached end of topic first-topic [0] at offset 1102
   ```
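The -X options in the kcat commands above are librdkafka configuration properties. A short Python sketch that composes the consumer invocation from such a property map (the broker FQDN and password values are placeholders):

```python
import shlex

def kcat_consumer_cmd(broker: str, topic: str, password: str) -> str:
    """Compose the kafkacat consumer command used above from a property map."""
    props = {
        "security.protocol": "SASL_SSL",      # TLS plus SASL authentication
        "sasl.mechanisms": "SCRAM-SHA-512",   # mechanism the cluster expects
        "sasl.username": "admin",             # superuser created with the cluster
        "sasl.password": password,
    }
    cmd = ["kafkacat", "-C", "-b", f"{broker}:9091", "-t", topic]
    for key, value in props.items():
        cmd += ["-X", f"{key}={value}"]
    cmd.append("-Z")  # treat empty messages as NULL
    return shlex.join(cmd)

print(kcat_consumer_cmd("<broker FQDN>", "first-topic", "<cluster password>"))
```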
Now you have an Apache Kafka® cluster with a working consumer and producer. See the links below to continue exploring: