Consume Messages From Kafka Topics Using Python and Avro Consumer
Polling and deserializing with Kafka and Avro
Overview
This tutorial is an addition to another tutorial I recently wrote on how to produce Avro records to a Kafka topic.
In this tutorial, we will learn how to write an Avro consumer that is capable of polling messages from a Kafka topic and deserializing them based on the Avro schema.
You can download the code from this GitHub repo.
Avro Consumer
Since most of the groundwork was done in the initial commit of the aforementioned repo for the Avro producer, writing the consumer is pretty simple. All the dependencies are already covered by our producer code, so we can get started right away.
When we produce an Avro record to a Kafka topic, our producer encodes it against the Avro schema and serializes it into a byte array. On the other hand, when we consume the Avro record, our consumer needs to deserialize the byte array and decode it using the same Avro schema into text or an object that our human eyes can read.
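To make the byte-array part concrete, here is a small hand-rolled sketch of Avro's binary encoding for one simple two-field record. This is only for illustration: in practice, a library serializer does this work (and Confluent's serializer also prepends a small schema-ID header), but it shows what "serializing to a byte array" actually means.

```python
# A hand-rolled sketch of Avro binary encoding for a record with
# schema fields [{"name": "name", "type": "string"},
#                {"name": "age",  "type": "int"}].
# Illustration only; real producers/consumers use a library serializer.

def zigzag_encode(n: int) -> int:
    # Avro stores ints/longs zigzag-encoded so small negatives stay small.
    return (n << 1) ^ (n >> 63)

def zigzag_decode(n: int) -> int:
    return (n >> 1) ^ -(n & 1)

def write_long(n: int, out: bytearray) -> None:
    # Variable-length encoding: 7 bits per byte, high bit = "more follows".
    n = zigzag_encode(n)
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return

def read_long(buf: bytes, pos: int):
    shift = result = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return zigzag_decode(result), pos
        shift += 7

def serialize_user(record: dict) -> bytes:
    out = bytearray()
    name_bytes = record["name"].encode("utf-8")
    write_long(len(name_bytes), out)  # string = length + UTF-8 bytes
    out.extend(name_bytes)
    write_long(record["age"], out)    # int = zigzag varint
    return bytes(out)

def deserialize_user(payload: bytes) -> dict:
    length, pos = read_long(payload, 0)
    name = payload[pos:pos + length].decode("utf-8")
    pos += length
    age, _ = read_long(payload, pos)
    return {"name": name, "age": age}

payload = serialize_user({"name": "Ada", "age": 36})
print(payload)                    # b'\x06AdaH'
print(deserialize_user(payload))  # {'name': 'Ada', 'age': 36}
```

Note that the raw bytes carry no field names at all; the consumer can only recover the record because it decodes with the same schema the producer wrote with, which is exactly why the Schema Registry matters.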
Function to Consume Record from Kafka Topic
Alright, let’s go ahead and write our Avro consumer. Create a new Python file named consumer_record.py, and its content will be as follows:
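The file itself is not reproduced here, but a minimal sketch of what consumer_record.py could look like follows, assuming the confluent-kafka package's classic AvroConsumer API; the broker address, registry URL, group ID, and topic name are placeholders, not the repo's actual values:

```python
# consumer_record.py -- a hedged sketch, not the repo's exact code.
# Assumes confluent-kafka's classic AvroConsumer API; the addresses,
# group ID, and topic name below are placeholder values.

CONFIG = {
    "bootstrap.servers": "localhost:9092",
    "schema.registry.url": "http://localhost:8081",
    "group.id": "avro-consumer-group",
    "auto.offset.reset": "earliest",
}

def consume(topic: str = "my-avro-topic") -> None:
    # Imported inside the function so the sketch can be read and loaded
    # without the confluent-kafka dependency installed.
    from confluent_kafka.avro import AvroConsumer

    consumer = AvroConsumer(CONFIG)
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)  # wait up to 1 second for a record
            if msg is None:
                continue
            if msg.error():
                print(f"Consumer error: {msg.error()}")
                continue
            # AvroConsumer has already deserialized the key and value into
            # Python objects using the schema fetched from the registry.
            print(f"key={msg.key()} value={msg.value()}")
    except KeyboardInterrupt:
        pass
    finally:
        consumer.close()

# consume()  # uncomment to run against a live broker
```

The poll/deserialize loop is the heart of the file: everything else is configuration.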
Let’s go through the code above so we all understand what’s going on:
- Lines 7-14: Here, we set the configuration values for our consumer: the bootstrap servers, Schema Registry URL, consumer group ID, and auto-offset reset property. The auto-offset reset property tells our consumer where it should start polling for records. We set it to `earliest` so that it will start consuming from the beginning of the Kafka topic. If we set it to `latest` and there are already a few messages in the topic before this consumer is initialized and subscribed to the topic, those messages will be skipped.
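To make the difference concrete, here is a hypothetical pair of consumer configurations (addresses and group ID are placeholders) where only the auto-offset reset value differs:

```python
# Hypothetical consumer configs; addresses and group ID are placeholders.
base = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "demo-group",
}

# "earliest": a new consumer group replays the topic from the beginning.
from_start = {**base, "auto.offset.reset": "earliest"}

# "latest": a new consumer group only sees records produced after it
# subscribes; anything already in the topic is skipped.
from_now = {**base, "auto.offset.reset": "latest"}
```

Note that this setting only matters when the consumer group has no committed offsets yet; once offsets are committed, the consumer resumes from them regardless of this property.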