How To Set Up Apache Kafka With Docker? | ninjasquad
If you would like to learn how to set up Apache Kafka using Docker Compose then you’ve come to the right place 🙂
In this article, together we will:
- see the bare minimum Docker Compose configuration we must provide to spin up an environment with exactly one Kafka broker and Zookeeper instance,
- take a closer look at the required configurations,
- verify our setup using Console Producer and Consumer tools.
2. What is Apache Kafka?
Before we start, I feel obliged to provide at least a brief explanation of what Apache Kafka and Zookeeper are. Please skip the following two paragraphs if you want to get straight to the practice part of this tutorial.
Well, Apache Kafka by definition is an open-source distributed event streaming platform. Simply put, it’s a platform widely used to work with real-time streaming data pipelines and to integrate applications to work with such streams. It’s mainly responsible for:
- publishing/subscribing to the stream of records,
- their real-time processing,
- and their ordered storage
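To make these three responsibilities a bit more tangible, here is a toy, in-memory sketch in Python. Please note that this is not the Kafka API: the ToyLog class and its publish and poll methods are names made up for this illustration, and it only mimics the idea of an ordered, append-only log that several readers consume independently.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for a single topic partition (NOT the Kafka API)."""

    def __init__(self):
        self._records = []                # ordered storage: records are only appended
        self._offsets = defaultdict(int)  # next offset to read, per consumer name

    def publish(self, value):
        """Append a record and return the offset it was stored at."""
        self._records.append(value)
        return len(self._records) - 1

    def poll(self, consumer):
        """Return the records this consumer has not seen yet, in publish order."""
        start = self._offsets[consumer]
        batch = self._records[start:]
        self._offsets[consumer] = len(self._records)
        return batch


log = ToyLog()
log.publish("lorem")
log.publish("ipsum")
print(log.poll("reader-a"))  # ['lorem', 'ipsum']
print(log.poll("reader-a"))  # [] (nothing new since the last poll)
print(log.poll("reader-b"))  # ['lorem', 'ipsum'] (independent reader, own offset)
```

The key idea to take away is that records are never removed when they are read. Each reader only remembers how far it got.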
But let’s be honest: it’s almost impossible to describe Kafka in a couple of sentences, so I highly encourage you to visit the Apache Kafka introduction page after this tutorial.
3. What is Apache ZooKeeper?
Or rather, the question should be: why do I even need another service to work with Apache Kafka?
Just like a zookeeper in a zoo takes care of the animals, Apache Zookeeper “takes care” of all the Kafka brokers in a cluster. It’s mainly responsible for:
- tracking the cluster state, like which brokers are a part of the cluster, and sending notifications to Kafka in case of changes
- topics configuration
- controller election
- access control lists and quotas
Nevertheless, please keep in mind that the Kafka team is constantly working on replacing ZooKeeper with Apache Kafka Raft (KRaft). To be more specific, version 2.8.0 introduced early access to this functionality, but at the moment of publishing this article, it’s still not production-ready.
4. Bare Minimum Kafka Docker Compose
With all of that being said, let’s have a look at the bare minimum version of the docker-compose.yml file. This configuration is necessary to spin up exactly 1 Kafka and Zookeeper instance:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.2.1
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
  kafka:
    image: confluentinc/cp-kafka:7.2.1
    container_name: kafka
    ports:
      - "8098:8098"
    depends_on:
      - zookeeper
    environment:
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:8098
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
Again, before we head to the next commands, let’s take a couple of minutes to understand what exactly is going on here.
4.1. Docker Images Version And Containers
Firstly, let’s take a look at this piece of code:
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.2.1
    container_name: zookeeper
    ...
  kafka:
    image: confluentinc/cp-kafka:7.2.1
    container_name: kafka
As we can see, with this config we will spin up two services named zookeeper and kafka. The container_name setting will set the specified names for our containers, instead of letting Docker generate them. Finally, we will use the 7.2.1 versions of the Confluent Community Docker Image for Apache Kafka and the Confluent Docker Image for Zookeeper.
4.2. Additional Kafka Container Configs
As the next step, let’s have a quick look at these four lines:
ports:
  - "8098:8098"
depends_on:
  - zookeeper
As the name suggests, the ports setting is responsible for publishing ports, so with this setting, port 8098 of our container will be accessible through port 8098 of the host machine. Additionally, depends_on will make sure that the zookeeper container is started before the kafka one. Please note that it only controls the start order and does not wait until Zookeeper is actually ready.
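Since depends_on only orders the start of the containers, a script that runs right after docker compose up may still find the broker port closed. As a minimal sketch (the wait_for_port helper and its parameters are my own names, not part of Docker or Kafka), we could poll the published port from the host like this:

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With our Compose file, the call would be wait_for_port("localhost", 8098). Keep in mind that an open port only tells us that something is listening. It does not prove that the broker has finished starting up.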
5. Required Settings
As the next step, let’s take a closer look at the required settings for both Kafka and Zookeeper in our Docker Compose settings.
5.1. Required Zookeeper Settings
As we can see in our example, the only environment variable we specified is ZOOKEEPER_CLIENT_PORT. It tells Zookeeper which port it should listen on for client connections (the client being Kafka in our case).
Additionally, when running in clustered mode, we have to set the ZOOKEEPER_SERVER_ID (but it’s not the case here).
5.2. Required Confluent Kafka Settings
When it comes to the Confluent Kafka Docker image, here is the list of the required environment variables:
- KAFKA_ZOOKEEPER_CONNECT – instructing Kafka where it can find the Zookeeper
- KAFKA_ADVERTISED_LISTENERS – specifying the listener address (protocol, hostname and port) that the broker advertises, so that clients know where to reach it. Moreover, this value is sent to the Zookeeper
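To see what the advertised listener value is made of, here is a small sketch that splits it into its parts. The parse_listener and parse_listeners helpers are hypothetical names used only for this illustration. They assume the name://host:port shape used in our Compose file, with several entries separated by commas.

```python
def parse_listener(listener):
    """Split 'NAME://host:port' into (name, host, port)."""
    name, _, address = listener.partition("://")
    host, _, port = address.rpartition(":")
    return name, host, int(port)


def parse_listeners(value):
    """Parse a comma-separated listener list into a list of tuples."""
    return [parse_listener(item.strip()) for item in value.split(",") if item.strip()]


print(parse_listeners("PLAINTEXT://localhost:8098"))
# [('PLAINTEXT', 'localhost', 8098)]
```

The host and port printed here are exactly what we later pass to the --bootstrap-server option.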
Please keep in mind that in order to run only one instance of Kafka (aka a single-node cluster), we additionally have to set KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR to 1. Its default value is 3, and if we don’t do that, we will run into this error:
org.apache.kafka.common.errors.InvalidReplicationFactorException: Replication factor: 3 larger than available brokers: 1
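The rule behind this error is simple arithmetic: a topic cannot have more replicas than there are brokers to hold them. The toy check below only mirrors that rule. The check_replication_factor function is made up for this example and is not how Kafka implements it.

```python
def check_replication_factor(replication_factor, available_brokers):
    """Toy version of the rule: no more replicas than available brokers."""
    if replication_factor > available_brokers:
        raise ValueError(
            f"Replication factor: {replication_factor} "
            f"larger than available brokers: {available_brokers}"
        )


check_replication_factor(1, available_brokers=1)  # passes: our single-node setup

try:
    check_replication_factor(3, available_brokers=1)  # the default value of 3
except ValueError as error:
    print(error)  # Replication factor: 3 larger than available brokers: 1
```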
6. Verify Kafka Docker Compose Config
With all of that being said, we can finally check whether everything works as expected.
6.1. Start Containers
As the first step, we have to start our containers:
docker compose up -d
The -d flag instructs Docker to run the containers in detached mode.
Of course, we can verify everything by running the docker ps command:
# example output:
CONTAINER ID   IMAGE                             COMMAND                  CREATED          STATUS         PORTS                              NAMES
38b41bc43676   confluentinc/cp-kafka:7.2.1       "/etc/confluent/dock…"   42 seconds ago   Up 4 seconds   0.0.0.0:8098->8098/tcp, 9092/tcp   kafka
45484184f330   confluentinc/cp-zookeeper:7.2.1   "/etc/confluent/dock…"   42 seconds ago   Up 5 seconds   2181/tcp, 2888/tcp, 3888/tcp       zookeeper
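If we would rather check this from a script than by eye, a rough sketch could look as follows. It relies on the --format option of docker ps with the .Names and .Status template fields. The helper names (docker_ps, parse_ps, missing_containers) are my own and purely illustrative.

```python
import subprocess

EXPECTED = ("kafka", "zookeeper")  # container_name values from the Compose file


def docker_ps():
    """Return 'name<TAB>status' lines for the running containers."""
    cmd = ["docker", "ps", "--format", "{{.Names}}\t{{.Status}}"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_ps(ps_output):
    """Turn the tab-separated output into a {name: status} dict."""
    result = {}
    for line in ps_output.splitlines():
        if line.strip():
            name, _, status = line.partition("\t")
            result[name] = status
    return result


def missing_containers(ps_output, expected=EXPECTED):
    """List the expected containers that are not reported as 'Up'."""
    up = {name for name, status in parse_ps(ps_output).items() if status.startswith("Up")}
    return sorted(set(expected) - up)
```

Calling missing_containers(docker_ps()) should give an empty list when both containers are up.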
6.2. Create Kafka Topic
Next, let’s create a new Kafka topic:
docker exec kafka kafka-topics --bootstrap-server localhost:8098 --create --topic example
As a word of explanation, kafka-topics is a shell script that allows us to run the TopicCommand tool. It’s a command-line tool that can manage and list topics in a Kafka cluster. Additionally, with --bootstrap-server we have to pass the broker address, which here matches the listener we specified with KAFKA_ADVERTISED_LISTENERS.
When the topic is created we should see the following message: “Created topic example.”
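When we want to create topics from a script, we can wrap the same command in a few lines of Python. This is only a sketch under the assumptions of this tutorial (a container named kafka and a listener on localhost:8098), and the function names are hypothetical. The --if-not-exists, --partitions and --replication-factor options belong to the kafka-topics tool itself.

```python
import subprocess


def kafka_topics_cmd(*args, container="kafka", bootstrap="localhost:8098"):
    """Build a 'docker exec ... kafka-topics' command as an argument list."""
    return ["docker", "exec", container, "kafka-topics",
            "--bootstrap-server", bootstrap, *args]


def create_topic_cmd(name, partitions=1, replication_factor=1):
    """Command that creates the topic, or does nothing if it already exists."""
    return kafka_topics_cmd(
        "--create", "--if-not-exists", "--topic", name,
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    )


def create_topic(name):
    """Run the command inside the container and return its output."""
    result = subprocess.run(create_topic_cmd(name), capture_output=True, text=True)
    return result.returncode, result.stdout.strip()
```

In the same way, kafka_topics_cmd("--list") builds the command that lists the existing topics.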
6.3. Run Kafka Producer
Next, let’s run a console producer with the kafka-console-producer command:
docker exec --interactive --tty kafka kafka-console-producer --bootstrap-server localhost:8098 --topic example
Please keep in mind that this command will start the producer, which will then wait for our input (you should notice the > sign). Please type a couple of messages and hit Ctrl + D after you finish:
>lorem
>ipsum
# hit Ctrl + D
6.4. Run Kafka Consumer
As the last step, let’s run a console consumer with the kafka-console-consumer command:
docker exec --interactive --tty kafka kafka-console-consumer --bootstrap-server localhost:8098 --topic example --from-beginning
This time, we should see our messages (lorem and ipsum) printed to the output. To finish, please hit Ctrl + C.
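The same produce and consume round trip can also be scripted without an interactive terminal. The sketch below is an assumption-laden example, not the only way to do it: it pipes the messages to the producer through stdin (docker exec -i, without --tty) and lets the consumer stop by itself thanks to the --max-messages and --timeout-ms options. All function names are made up for this illustration.

```python
import subprocess

CONTAINER = "kafka"           # container_name from the Compose file
BOOTSTRAP = "localhost:8098"  # listener address from the Compose file


def producer_cmd(topic):
    """Producer command; -i keeps stdin open so we can pipe messages in."""
    return ["docker", "exec", "-i", CONTAINER, "kafka-console-producer",
            "--bootstrap-server", BOOTSTRAP, "--topic", topic]


def consumer_cmd(topic, max_messages, timeout_ms=10000):
    """Consumer command that exits after max_messages or after the timeout."""
    return ["docker", "exec", CONTAINER, "kafka-console-consumer",
            "--bootstrap-server", BOOTSTRAP, "--topic", topic,
            "--from-beginning",
            "--max-messages", str(max_messages),
            "--timeout-ms", str(timeout_ms)]


def round_trip(topic, messages):
    """Send the messages, then read that many records from the beginning."""
    payload = "".join(message + "\n" for message in messages)
    subprocess.run(producer_cmd(topic), input=payload, text=True, check=True)
    # Note: with --from-beginning, older records in the topic are read first.
    result = subprocess.run(consumer_cmd(topic, len(messages)),
                            capture_output=True, text=True)
    return result.stdout.splitlines()
```

On a fresh example topic, round_trip("example", ["lorem", "ipsum"]) should give us back the two messages we sent.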
7. Kafka With Docker Compose Summary
And that would be all for this step-by-step guide on How To Set Up Apache Kafka With Docker Compose.
Let me know your thoughts in the comment section, or reach out through the contact form. I highly appreciate your thoughts, comments and feedback.
Take care brother/sister!