How To Set Up Apache Kafka With Docker?

1. Introduction

If you would like to learn how to set up Apache Kafka using Docker Compose then you’ve come to the right place 🙂

In this article, together we will:

  • see the bare minimum Docker Compose configuration we must provide to spin up an environment with exactly one Kafka broker and Zookeeper instance,
  • take a closer look at the required configurations,
  • verify our setup using Console Producer and Consumer tools.

2. What is Apache Kafka?

Before we start, I feel obliged to provide at least a brief explanation of what Apache Kafka and Zookeeper are. Please skip the following two sections if you want to get straight to the practical part of this tutorial.

Well, Apache Kafka by definition is an open-source distributed event streaming platform. Simply put, it’s a platform widely used to work with real-time streaming data pipelines and to integrate applications to work with such streams. It’s mainly responsible for:

  • publishing/subscribing to the stream of records,
  • their real-time processing,
  • and their ordered storage

But let’s just be honest: it’s almost impossible to describe Kafka in a couple of sentences, so I highly encourage you to visit the introduction page on the Apache Kafka website after this tutorial.

3. What is Apache ZooKeeper?

Or rather, the question should be: why do I even need another service to work with Apache Kafka?

Just like a zookeeper in a zoo takes care of the animals, Apache Zookeeper “takes care” of all the Kafka brokers in a cluster. It’s mainly responsible for:

  • tracking the cluster state, like which brokers are a part of the cluster, and sending notifications to Kafka in case of changes
  • topics configuration
  • controller election
  • access control lists and quotas

Nevertheless, please keep in mind that the Kafka team is constantly working on replacing ZooKeeper with Apache Kafka Raft (KRaft). To be more specific, version 2.8.0 introduced early access to this functionality, but at the time of publishing this article, it’s still not production-ready.


4. Bare Minimum Kafka Docker Compose

With all of that being said, let’s have a look at the bare minimum version of the docker-compose.yml file. This configuration is necessary to spin up exactly 1 Kafka and Zookeeper instance:

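version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.2.1
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.2.1
    container_name: kafka
    ports:
      - "8098:8098"
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:8098
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1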

Again, before we head to the next commands, let’s take a couple of minutes to understand what exactly is going on here.

4.1. Docker Images Version And Containers

Firstly, let’s take a look at this piece of code:

version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.2.1
    container_name: zookeeper
  kafka:
    image: confluentinc/cp-kafka:7.2.1
    container_name: kafka

As we can see, with this config we will spin up two services named kafka and zookeeper. The container_name setting gives our containers fixed names instead of letting Docker generate them. Finally, we use version 7.2.1 of the Confluent Community Docker Image for Apache Kafka and of the Confluent Docker Image for Zookeeper.

4.2. Additional Kafka Container Configs

As the next step, let’s have a quick look at these four lines:

ports:
  - "8098:8098"
depends_on:
  - zookeeper

As the name suggests, the ports setting is responsible for publishing ports, so with this mapping, port 8098 of our container will be accessible through port 8098 of the host machine. Additionally, depends_on makes sure that the zookeeper container is started before the kafka one. Please note that it only controls the start order and does not wait until Zookeeper is ready to accept connections.

5. Required Settings

As the next step, let’s take a closer look at the required settings for both Kafka and Zookeeper in our Docker Compose settings.

5.1. Required Zookeeper Settings

As we can see in our example, the only environment variable we specified for Zookeeper is ZOOKEEPER_CLIENT_PORT. It tells Zookeeper which port to listen on for client connections (Kafka, in our case).

Additionally, when running in clustered mode, we have to set the ZOOKEEPER_SERVER_ID (but it’s not the case here).

5.2. Required Confluent Kafka Settings

When it comes to the Confluent Kafka Docker image, here is the list of the required environment variables:

  • KAFKA_ZOOKEEPER_CONNECT – instructing Kafka where it can find the Zookeeper
  • KAFKA_ADVERTISED_LISTENERS – specifying the listener address (host and port) that the broker advertises and that clients can reach out to. Moreover, this value is sent to the Zookeeper
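
To make the shape of the advertised listeners value more tangible, here is a minimal Python sketch that splits it into its parts. It assumes the usual NAME://host:port format with comma-separated entries. The value PLAINTEXT://localhost:8098 is an assumption that matches the localhost:8098 bootstrap server used later in this tutorial, and the Listener type and parse_listeners function are hypothetical helpers, not part of Kafka or the Confluent image.

```python
from typing import List, NamedTuple


class Listener(NamedTuple):
    """One advertised listener entry (hypothetical helper type)."""
    name: str   # listener name, e.g. PLAINTEXT
    host: str   # hostname clients should connect to
    port: int   # port clients should connect to


def parse_listeners(value: str) -> List[Listener]:
    """Split a comma-separated NAME://host:port list into Listener tuples."""
    listeners = []
    for entry in value.split(","):
        name, sep, address = entry.strip().partition("://")
        host, colon, port = address.rpartition(":")
        if not sep or not colon:
            raise ValueError(f"malformed listener entry: {entry!r}")
        listeners.append(Listener(name, host, int(port)))
    return listeners
```

Calling parse_listeners("PLAINTEXT://localhost:8098") gives a single entry with the name PLAINTEXT, the host localhost and the port 8098, which is exactly the address we will pass to the command-line tools.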

Please keep in mind that in order to run only one instance of Kafka (a single-node cluster), we additionally have to set KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR to 1. Its default value is 3, and if we don’t change it, we will run into this error: org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1
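
The reason behind this error is that every replica of a partition has to live on a different broker, so a replication factor can never exceed the number of brokers. The rule can be sketched in a few lines of Python. This is only an illustration of the check, not Kafka's actual code, and check_replication_factor is a hypothetical name.

```python
def check_replication_factor(replication_factor: int, available_brokers: int) -> None:
    """Raise ValueError when there are not enough brokers to host all replicas."""
    if replication_factor > available_brokers:
        raise ValueError(
            f"Replication factor: {replication_factor} "
            f"larger than available brokers: {available_brokers}"
        )
```

With our single broker, check_replication_factor(1, 1) passes silently, whereas check_replication_factor(3, 1) raises an error with the same wording as the message above.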

6. Verify Kafka Docker Compose Config

With all of that being said, we can finally check whether everything works as expected.

6.1. Start Containers

As the first step, we have to start our containers:

docker compose up -d

The -d flag instructs Docker to run the containers in detached mode.

Of course, we can verify everything by running the docker ps command:

# example output:
CONTAINER ID IMAGE                           COMMAND                CREATED        STATUS       PORTS                            NAMES
38b41bc43676 confluentinc/cp-kafka:7.2.1     "/etc/confluent/dock…" 42 seconds ago Up 4 seconds 0.0.0.0:8098->8098/tcp, 9092/tcp kafka
45484184f330 confluentinc/cp-zookeeper:7.2.1 "/etc/confluent/dock…" 42 seconds ago Up 5 seconds 2181/tcp, 2888/tcp, 3888/tcp     zookeeper
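
If we would like to script this check instead of reading the docker ps output, a minimal Python sketch could poll the published port from the host. The wait_for_port function and its default timeouts are hypothetical. Please also keep in mind that a successful TCP connection only tells us that something accepts connections on that port, not that the broker has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError when the port is closed or unreachable
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

For the setup from this tutorial, we would call wait_for_port("localhost", 8098) after docker compose up -d and continue only when it returns True.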

6.2. Create Kafka Topic

Next, let’s create a new Kafka topic:

docker exec kafka kafka-topics --bootstrap-server localhost:8098 --create --topic example

As a word of explanation, kafka-topics is a shell script that runs the TopicCommand tool, a command-line tool that can manage and list topics in a Kafka cluster. Additionally, we have to pass the listener address (the one specified with KAFKA_ADVERTISED_LISTENERS) with --bootstrap-server.

When the topic is created we should see the following message: “Created topic example.”
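
If we would like to run this step from a script, for example to create several topics in a row, the same command can be wrapped in Python. This is just a sketch that assumes the container name and listener address used in this tutorial. Both function names are hypothetical, and create_topic needs the containers to be up and running.

```python
import subprocess
from typing import List


def kafka_topics_create_cmd(
    topic: str,
    bootstrap_server: str = "localhost:8098",
    container: str = "kafka",
) -> List[str]:
    """Build the argument list for the docker exec command shown above."""
    return [
        "docker", "exec", container,
        "kafka-topics",
        "--bootstrap-server", bootstrap_server,
        "--create",
        "--topic", topic,
    ]


def create_topic(topic: str) -> str:
    """Run the command and return its standard output; raises if it fails."""
    result = subprocess.run(
        kafka_topics_create_cmd(topic),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Passing the arguments as a list (instead of one shell string) means that topic names are never interpreted by a shell.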

6.3. Run Kafka Producer

Next, let’s run a console producer with the kafka-console-producer command:

docker exec --interactive --tty kafka kafka-console-producer --bootstrap-server localhost:8098 --topic example

Please keep in mind that this command will start the producer, which will wait for our input (you should notice the > sign). Please type a couple of messages, one per line, and hit Ctrl + D after you finish:

# hit Ctrl + D

6.4. Run Kafka Consumer

As the last step, let’s run a console consumer with the kafka-console-consumer command:

docker exec --interactive --tty kafka kafka-console-consumer --bootstrap-server localhost:8098 --topic example --from-beginning

This time, we should see our messages printed to the output. To finish, please hit Ctrl + C.


7. Kafka With Docker Compose Summary

And that would be all for this step-by-step guide on How To Set Up Apache Kafka With Docker Compose.

Let me know your thoughts in the comment section, or reach out through the contact form. I highly appreciate your thoughts, comments and feedback.

Take care brother/sister!
