[CLI] Producer performance tests

In this kata, we will explore how to measure and optimize the performance of a Kafka producer. Kafka is a distributed streaming platform widely used for building real-time data pipelines and applications. Understanding the producer's throughput, latency, and resource utilization is critical for ensuring reliable and efficient data flow in production environments.

Through this exercise, you'll learn to use tools for performance tests, analyze the results, and identify potential bottlenecks. By the end, you'll be equipped with practical techniques to tune Kafka producer settings for maximum performance.

Implementation

  1. Create a file named producer.properties with the following content:
    • bootstrap.servers=localhost:9092
      #compression.type=
      #acks=
      #enable.idempotence=
      #linger.ms=
      #max.request.size=
      #batch.size=
      #buffer.memory=
  2. Download the producer-perf-tests-data.json file.
  3. Run the ./kafka-producer-perf-test.sh script with the following parameters:
    • --num-records 1000000
      --payload-file ./producer-perf-tests-data.json
      --topic topic_0
      --throughput -1
      --producer.config ./producer.properties
      --print-metrics
  4. Change the producer configuration in the ./producer.properties file:
    • bootstrap.servers=localhost:9092
      #compression.type=
      #acks=
      #enable.idempotence=
      linger.ms=500
      #max.request.size=
      batch.size=65536
      #buffer.memory=
  5. Run the test again and compare the results with the previous run.
  6. Tweak the other producer configuration parameters and rerun the test as many times as needed.
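
Steps 4 to 6 can be scripted so that each run uses its own properties file and leaves a log behind for comparison. The following is a minimal sketch, not part of the kata itself: the helper names (write_config, run_case, run_grid), the results/ directory, the PERF_SCRIPT variable, and the grid of linger.ms and batch.size values are all illustrative assumptions. The command-line flags are the ones from step 3.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, the results/ layout, the PERF_SCRIPT variable,
# and the value grid are illustrative assumptions, not part of the kata.

# Write a producer config file for one linger.ms / batch.size pair.
write_config() {
  local file="$1" linger_ms="$2" batch_size="$3"
  cat > "$file" <<EOF
bootstrap.servers=localhost:9092
linger.ms=$linger_ms
batch.size=$batch_size
EOF
}

# Run the perf test with the flags from step 3 and save its output to a log.
run_case() {
  local linger_ms="$1" batch_size="$2"
  local config="results/producer-l${linger_ms}-b${batch_size}.properties"
  local log="results/perf-l${linger_ms}-b${batch_size}.log"
  mkdir -p results
  write_config "$config" "$linger_ms" "$batch_size"
  "${PERF_SCRIPT:-./kafka-producer-perf-test.sh}" \
    --num-records 1000000 \
    --payload-file ./producer-perf-tests-data.json \
    --topic topic_0 \
    --throughput -1 \
    --producer.config "$config" \
    --print-metrics > "$log"
}

# Try every combination of the candidate values (example values only).
run_grid() {
  local linger_ms batch_size
  for linger_ms in 0 50 500; do
    for batch_size in 16384 65536 262144; do
      run_case "$linger_ms" "$batch_size"
    done
  done
}
```

To use it, save the functions to a file, source it from the directory that contains kafka-producer-perf-test.sh and the payload file, and call run_grid. Each log in results/ can then be compared with the others. Assuming the tool's final summary line reports percentile latencies, searching the logs for a string such as 99.9th should pull out one summary line per run.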