Getting started with Apache Kafka®

To get started with the service:

Before you start

Your primary tool for interacting with DoubleCloud is the console. Set it up first, and then configure a command-line client before moving on.

  1. Go to the management console.

  2. Log in to DoubleCloud if you already have an account, or sign up if you're opening the console for the first time.

    Warning

    The steps below show how to set up kcat in Docker and kafkacat on DEB-based Linux distributions, but you can use other tools of your choice.

    For other connection options, see Connect to an Apache Kafka® cluster.

Docker

  1. Open your terminal.

  2. (Optional) Start Docker if needed:

    service docker start
    
  3. Pull the kcat image from Docker Hub. This tutorial uses version 1.7.1, but you can use the latest one:

    docker pull edenhill/kcat:1.7.1
    
DEB-based Linux

  1. Open your terminal.

  2. Install kafkacat:

    sudo apt install kafkacat
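
Either way, you can confirm the tool runs before moving on. A quick sanity check for either installation; each command simply prints the client version:

    # Docker: run kcat from the pulled image and print its version
    docker run --rm edenhill/kcat:1.7.1 -V

    # DEB-based: print the installed kafkacat version
    kafkacat -V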
    

Create your cluster

A cluster in the Managed Service for Apache Kafka® is one or more broker hosts where topics and their partitions are located.

Warning

During the trial period, you can create clusters with up to 8 cores, 32 GB RAM, and 400 GB storage. If you need to raise the quotas, don't hesitate to contact our support.

  1. Go to the console.

  2. Select Clusters from the list of services on the left.

  3. Click Create cluster in the upper-right corner of the page.

    1. Select Apache Kafka®.

    2. Choose a provider and a region.

    3. Under Resources:

      • Select the s1-c2-m4 preset of CPU, RAM, and storage to create a cluster with a minimal configuration.

      • Select the number of zones and brokers. In this tutorial, we create a cluster with 1 zone and 1 broker.

    4. Under Basic settings:

      • Enter the cluster Name, for example, Quickstart-cluster.

      • From the Version drop-down list, select the Apache Kafka® version the cluster will use. For most clusters, we recommend using the latest version.

    5. Under Networking, specify in which DoubleCloud VPC to locate your cluster. Use the default value in the previously selected region if you don't need to create this cluster in a specific network.

    6. Click Submit.

Your cluster will appear with the Creating status on the Clusters page in the console. Setting everything up may take some time. When the cluster is ready, its status changes to Alive.

Click on the cluster and you'll see the following page:

cluster-created

Note

The DoubleCloud service creates the superuser admin and its password automatically. You can find both the User and the Password in the Overview tab on the cluster information page.
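
Once the cluster is Alive, you can check that it accepts connections by listing its metadata. Below is a minimal sketch with kcat in Docker, using the admin credentials from the note above and the broker FQDN from the Overview tab (on DEB-based systems, replace the docker run ... part with kafkacat):

    # List cluster metadata (brokers and topics) to confirm the cluster is reachable
    docker run --rm -i -t edenhill/kcat:1.7.1 \
          -L \
          -b <broker FQDN>:9091 \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="admin" \
          -X sasl.password="<cluster password>"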

Create a topic

After you've created a cluster, you also need to create a topic for messages:

  1. On the cluster's page, go to the Topics tab.

  2. Click Create.

  3. Under Topic Settings, specify the topic properties:

    • Cleanup policy - Delete. This policy deletes log segments when their retention time or log size reaches the limit.

    • Compression Type - Uncompressed. We don't need compression for this tutorial, so leave it disabled.

    • Retention Bytes - 1048576 (1 MB).

    • Retention Ms - 600000 (10 minutes).

  4. Specify the Basic Settings:

    • Name
      A topic's name. Let's call it first-topic.

    • Partitions
      The number of the topic's partitions. Set it to 1 to create the simplest topic.

    • Replication factor
      Specifies the number of copies of a topic in a cluster. This parameter's value should not exceed the overall number of brokers in the cluster. Set it to 1.

    Your topic should look as follows:

    topic-configured

  5. Click Submit.
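
To double-check that the topic exists, you can rerun the metadata listing from the previous section, limiting the output to first-topic with the -t flag:

    # Same metadata check as before, limited to first-topic:
    # shows its partition count, leaders, and replicas
    docker run --rm -i -t edenhill/kcat:1.7.1 \
          -L \
          -b <broker FQDN>:9091 \
          -t first-topic \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="admin" \
          -X sasl.password="<cluster password>"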

Connect to your cluster

When you have a cluster and a topic in it, connect to the cluster and transfer a text message from a producer to a consumer:

  1. Create a consumer by running the following command with your cluster's connection details. You can find the Connection string on the Overview tab of your cluster information page. The command has the following structure:

    Docker:

    docker run --name kcat --rm -i -t edenhill/kcat:1.7.1 \
          -C \
          -b <broker FQDN>:9091 \
          -t <topic name> \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="admin" \
          -X sasl.password="<cluster password>" \
          -Z
    
    DEB-based Linux:

    kafkacat \
          -C \
          -b <broker FQDN>:9091 \
          -t <topic name> \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="admin" \
          -X sasl.password="<cluster password>" \
          -Z
    

    You will see the following status message:

    % Reached end of topic first-topic [0] at offset 0
    
  2. Execute the following command in a separate terminal instance to create a producer and push the data:

    Docker:

    curl https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.json \
          | docker run --name kcat-producer --rm -i edenhill/kcat:1.7.1 \
          -P \
          -b <broker FQDN>:9091 \
          -t <topic name> \
          -k key \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="<username>" \
          -X sasl.password="<password>"
    
    DEB-based Linux:

    curl https://doublecloud-docs.s3.eu-central-1.amazonaws.com/data-sets/hits_sample.json \
          | kafkacat \
          -P \
          -b <broker FQDN>:9091 \
          -t <topic name> \
          -k key \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="<username>" \
          -X sasl.password="<password>"
    
  3. If you've completed all the steps successfully, the terminal with the consumer will show the uploaded data:

     },
     {
          "Hit_ID": 40668,
          "Date": "2017-09-09",
          "Time_Spent": "730.875",
          "Cookie_Enabled": 0,
          "Redion_ID": 11,
          "Gender": "Female",
          "Browser": "Chrome",
          "Traffic_Source": "Social network",
          "Technology": "PC (Windows)"
     }
    ]
    % Reached end of topic first-topic [0] at offset 1102
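
The consumer in the first terminal keeps waiting for new messages. If you want to re-read everything already stored in the topic, a sketch like the following starts from the earliest offset and exits at the end of the topic (note the different container name, so it doesn't clash with the still-running consumer):

    # Consume first-topic from the beginning (-o beginning) and exit at the end (-e)
    docker run --name kcat-replay --rm -i -t edenhill/kcat:1.7.1 \
          -C \
          -b <broker FQDN>:9091 \
          -t first-topic \
          -o beginning \
          -e \
          -X security.protocol=SASL_SSL \
          -X sasl.mechanisms=SCRAM-SHA-512 \
          -X sasl.username="admin" \
          -X sasl.password="<cluster password>"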
    

Now you have an Apache Kafka® cluster with a working consumer and producer. See the links below to continue exploring!

See also