Nº04 · Processing
Apache Kafka
The nervous system for real-time data.
What is it?
Apache Kafka is a distributed event streaming platform. Producers publish events to topics and consumers read them in real time, without either side needing to know about the other. Kafka stores those streams durably, so multiple consumers can read them at their own pace.
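To make "durable" and "at their own pace" concrete, here is a toy in-memory model of a single topic. It is a sketch for intuition only: `ToyTopic`, `publish` and `poll` are hypothetical names invented for this example, not Kafka's API, and a real broker adds partitions, replication and disk storage on top of the same idea.

```python
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for one topic: an append-only list of events."""

    def __init__(self):
        self.log = []                    # events are kept; reading never removes them
        self.offsets = defaultdict(int)  # next position to read, tracked per consumer

    def publish(self, event):
        self.log.append(event)
        return len(self.log) - 1         # offset assigned to the new event

    def poll(self, consumer, max_events=1):
        """Return up to max_events unread events for this consumer and advance it."""
        start = self.offsets[consumer]
        batch = self.log[start:start + max_events]
        self.offsets[consumer] = start + len(batch)
        return batch


topic = ToyTopic()
for amount in (100.5, 20.0, 7.25):
    topic.publish({"amount": amount})

fast = topic.poll("analytics", max_events=3)  # reads everything at once
slow = topic.poll("billing", max_events=1)    # reads only the first event
print(len(fast), len(slow), len(topic.log))   # prints: 3 1 3
```

Each consumer keeps its own position in the same log, so a slow reader never blocks a fast one and no event disappears just because someone has read it.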
What is it for?
- Moving data in real time between services, databases and pipelines.
- Decoupling producers and consumers (one event, many readers).
- Feeding stream processing (Spark, Flink) or ingestion into a data lake.
When to use it / when not
Use it when you need a durable, high-throughput event bus, or event-driven architectures where several systems react to the same stream.
Think twice for purely batch data (a daily file doesn't need Kafka) or for a simple task queue, where a traditional message queue is lighter.
Get started in 1 minute
You need a running broker. The fastest way to try one locally is with Docker:
docker run -d -p 9092:9092 apache/kafka:latest
pip install confluent-kafka
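from confluent_kafka import Producer
# Connect to the local broker started with Docker
producer = Producer({"bootstrap.servers": "localhost:9092"})
# produce() is asynchronous: it only queues the message for sending
producer.produce("sales", key="ES", value='{"amount": 100.5}')
# flush() blocks until every queued message has been delivered or has failed
producer.flush()
print("Event published to topic 'sales'")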
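To see the other side of the stream, a consumer can read the event back. This is a minimal sketch, assuming the same local broker and the `sales` topic used by the producer snippet. The group name `sales-readers` and the helpers `decode_sale` and `consume_sales` are hypothetical names chosen for illustration, and the decoding assumes every message value is UTF-8 JSON.

```python
import json


def decode_sale(raw_value):
    """Turn raw message bytes into a dict (assumes UTF-8 JSON)."""
    return json.loads(raw_value.decode("utf-8"))


def consume_sales(max_messages=1, timeout_s=10.0):
    # Imported here so decode_sale stays usable without the client installed.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",
        "group.id": "sales-readers",      # hypothetical consumer group name
        "auto.offset.reset": "earliest",  # no stored offset yet: start from the oldest event
    })
    consumer.subscribe(["sales"])
    received = []
    try:
        while len(received) < max_messages:
            msg = consumer.poll(timeout_s)
            if msg is None:               # nothing arrived within the timeout
                break
            if msg.error():
                print("Consumer error:", msg.error())
                continue
            received.append(decode_sale(msg.value()))
    finally:
        consumer.close()                  # leave the group cleanly
    return received


# events = consume_sales()  # needs the broker from the Docker step to be running
```

Running the same function with a different `group.id` would read the topic again from the start, which is the "one event, many readers" behaviour in practice.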
Official documentation
The source of truth lives there. Here we orient you; the depth is up to you.
Open official docs ↗
Updated 2026-06-08