Real-Time Streaming Vocabulary

Tip: In streaming, delivery semantics matter. Exactly-once delivery is much harder to achieve than at-least-once, so know the trade-offs.


Exercise 1 of 5

The data engineer says: 'We publish order events to a Kafka topic — each partition is ordered, and consumers in the same group each read from different partitions in parallel.'

What is a Kafka topic partition?
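
The idea in the quote can be illustrated with a small in-memory sketch. It is not Kafka client code: the partition count, the `partition_for` and `publish` helpers, and the use of `zlib.crc32` as the hash are all assumptions made for illustration (Kafka's own partitioner uses a different hash). What it shows is that events sharing a key land in one partition, and that order is preserved within that partition.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for the topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition index. crc32 is a stand-in hash (assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(events):
    """Append each (key, value) event to the log of the partition its key maps to."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key)].append((key, value))
    return partitions


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
topic = publish(events)

# All events for one order sit in a single partition, in publish order.
order_1_log = [v for k, v in topic[partition_for("order-1")] if k == "order-1"]
```

Ordering is guaranteed only inside a partition. Events for different keys may sit in different partitions and be read in any relative order.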

Exercise 2 of 5

The architect explains: 'We use consumer groups — each group gets every message once, but within the group each partition is assigned to exactly one consumer. Add consumers to scale out.'

What is the purpose of a Kafka consumer group?
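
A rough sketch of partition assignment inside one group follows. The `assign_partitions` helper and the consumer names are hypothetical, and the round-robin rule is a simplification; real Kafka assignors (range, sticky, cooperative) use more involved strategies. The point is only that each partition has exactly one owner in the group.

```python
def assign_partitions(partitions, consumers):
    """Round-robin sketch: every partition goes to exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]

# Two consumers share four partitions.
two = assign_partitions(partitions, ["c1", "c2"])

# Six consumers for four partitions: two consumers end up with nothing to read.
six = assign_partitions(partitions, ["c1", "c2", "c3", "c4", "c5", "c6"])
idle = [c for c, owned in six.items() if not owned]
```

Adding consumers helps only up to the number of partitions. Beyond that point the extra consumers sit idle.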

Exercise 3 of 5

The streaming engineer says: 'We use event time, not processing time, for our windowed aggregations — this handles late-arriving events correctly with watermarks.'

What is the difference between event time and processing time in stream processing?
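
The sketch below illustrates event-time windowing with a watermark in plain Python. It is not Flink or Kafka Streams code. The window size, the lateness bound, and the function names are assumptions chosen for the example. Events arrive in processing order, but each is counted in the window its own timestamp belongs to. A window is emitted once the watermark passes its end, and anything arriving for an already closed window is dropped.

```python
from collections import defaultdict

WINDOW_SIZE = 10      # seconds; hypothetical tumbling window length
MAX_OUT_OF_ORDER = 5  # seconds of lateness the watermark tolerates (assumption)


def window_start(event_time: int) -> int:
    """Start of the tumbling window that contains event_time."""
    return event_time - (event_time % WINDOW_SIZE)


def count_by_event_time(events):
    """Count events per event-time window.

    events: (event_time, payload) pairs in arrival (processing) order.
    Returns (emitted windows, dropped late events, windows still open).
    """
    open_windows = defaultdict(int)
    emitted = {}
    dropped = []
    watermark = float("-inf")
    for event_time, payload in events:
        start = window_start(event_time)
        if start + WINDOW_SIZE <= watermark:
            dropped.append((event_time, payload))  # its window already closed
        else:
            open_windows[start] += 1
        # The watermark trails the largest event time seen so far.
        watermark = max(watermark, event_time - MAX_OUT_OF_ORDER)
        for s in sorted(open_windows):
            if s + WINDOW_SIZE <= watermark:
                emitted[s] = open_windows.pop(s)
    return emitted, dropped, dict(open_windows)


# Arrival order differs from event-time order: 8 arrives after 12, 3 after 17.
events = [(1, "a"), (4, "b"), (12, "c"), (8, "d"), (17, "e"), (3, "f"), (26, "g")]
emitted, dropped, still_open = count_by_event_time(events)
```

Here the event at time 8 arrives out of order but is still counted in window 0, because the watermark had not yet passed 10. The event at time 3 arrives after that window closed and is dropped.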

Exercise 4 of 5

The team debates delivery guarantees: 'At-least-once is simpler: we might process duplicates, but nothing is lost. Exactly-once requires idempotent producers and transactional writes, with consumers reading only committed data.'

What does exactly-once delivery guarantee in a streaming system?
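
The trade-off can be seen in a small simulation. This is not how Kafka implements transactions; it is a sketch of one common workaround, deduplicating by event ID on the consumer side, under the assumption that every event carries a unique `id`. All function names and the sample orders are hypothetical.

```python
def deliver_at_least_once(events, redeliver_ids):
    """Simulate at-least-once delivery: events listed in redeliver_ids arrive
    twice, as if an acknowledgement was lost and the send was retried."""
    for event in events:
        yield event
        if event["id"] in redeliver_ids:
            yield event


def naive_total(stream):
    """Sum amounts with no duplicate handling."""
    return sum(e["amount"] for e in stream)


def idempotent_total(stream):
    """Sum amounts, skipping any event ID that was already processed."""
    seen = set()
    total = 0
    for e in stream:
        if e["id"] in seen:
            continue
        seen.add(e["id"])
        total += e["amount"]
    return total


orders = [
    {"id": "o1", "amount": 10},
    {"id": "o2", "amount": 25},
    {"id": "o3", "amount": 5},
]

naive = naive_total(deliver_at_least_once(orders, {"o2"}))       # counts o2 twice
dedup = idempotent_total(deliver_at_least_once(orders, {"o2"}))  # counts o2 once
```

With the duplicate of `o2`, the naive sum is 65 while the deduplicating sum is 40. Nothing is lost in either case, but only the second result reflects each event exactly once.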

Exercise 5 of 5

The Flink developer explains: 'We checkpoint state every 30 seconds — if the job crashes, it restores from the last checkpoint and replays events from that offset. That gives us fault tolerance.'

What is checkpointing in stream processing?
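
The restore-and-replay cycle from the quote can be sketched as follows. This is a toy model, not Flink's checkpointing mechanism. The `CountingJob` class, the event-count interval (standing in for the 30-second timer), and the simulated crash are all assumptions for illustration. A checkpoint stores the state together with the input offset it corresponds to, so recovery can replay from exactly that offset.

```python
import copy

CHECKPOINT_EVERY = 3  # events; stands in for the 30-second interval (assumption)


class CountingJob:
    """Toy stateful job: counts occurrences of each key in a replayable log."""

    def __init__(self):
        self.state = {}
        self.offset = 0
        self.checkpoint = (0, {})  # (offset, state snapshot)

    def process(self, log, crash_at=None):
        while self.offset < len(log):
            if crash_at is not None and self.offset == crash_at:
                raise RuntimeError("simulated crash")
            key = log[self.offset]
            self.state[key] = self.state.get(key, 0) + 1
            self.offset += 1
            if self.offset % CHECKPOINT_EVERY == 0:
                # Snapshot the state together with the offset it reflects.
                self.checkpoint = (self.offset, copy.deepcopy(self.state))

    def restore(self):
        """Roll back to the last checkpoint; later progress is discarded."""
        offset, state = self.checkpoint
        self.offset = offset
        self.state = copy.deepcopy(state)


log = ["a", "b", "a", "c", "a", "b", "c"]
job = CountingJob()
try:
    job.process(log, crash_at=5)  # crashes after processing five events
except RuntimeError:
    job.restore()
restored_offset = job.offset      # back at the checkpoint taken after event 3
job.process(log)                  # replay from the checkpointed offset
final_state = job.state
```

The two events processed after the checkpoint but before the crash are processed again on replay. Because the state was rolled back along with the offset, the final counts match a run with no crash.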