Advanced Vocabulary #flink #stream-processing #event-time #stateful #big-data

Apache Flink DataStream API

Apache Flink's DataStream API enables stateful stream processing with event-time semantics and exactly-once guarantees. Master the vocabulary for watermark advancement and timer behavior, EmbeddedRocksDBStateBackend vs HashMapStateBackend trade-offs, keyBy partitioning, two-phase-commit (2PC) Kafka sink recovery, and allowedLateness for late-event handling.

1 / 5

A Flink job uses a KeyedProcessFunction that registers an event-time timer 10 seconds in the future. The watermark then does not advance for 5 minutes of wall-clock time. What happens to the timer?
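
A toy model can help when reasoning about this question. The sketch below is not Flink's implementation; the class EventTimeTimerSketch and its methods are hypothetical names used only for illustration. It assumes what the question is probing: event-time timers are driven by the watermark, not by the wall clock, so a timer fires only once the watermark reaches its timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of an event-time timer service (hypothetical, not Flink code).
class EventTimeTimerSketch {
    // Pending timer timestamps in milliseconds, kept sorted.
    private final TreeSet<Long> timers = new TreeSet<>();
    private long currentWatermark = Long.MIN_VALUE;

    // Stand-in for registering an event-time timer at a given timestamp.
    void registerTimer(long timestampMillis) {
        timers.add(timestampMillis);
    }

    // Advances the watermark and returns the timers that fire as a result.
    // Wall-clock time is never consulted.
    List<Long> advanceWatermark(long newWatermark) {
        List<Long> fired = new ArrayList<>();
        if (newWatermark <= currentWatermark) {
            return fired; // the watermark does not move backwards
        }
        currentWatermark = newWatermark;
        while (!timers.isEmpty() && timers.first() <= currentWatermark) {
            fired.add(timers.pollFirst());
        }
        return fired;
    }

    public static void main(String[] args) {
        EventTimeTimerSketch service = new EventTimeTimerSketch();
        service.registerTimer(10_000L);
        // However long we wait here, nothing fires until the watermark moves.
        System.out.println(service.advanceWatermark(9_999L));
        System.out.println(service.advanceWatermark(10_000L));
    }
}
```

In this model a stalled watermark leaves the timer pending indefinitely, which is why idle sources are a common cause of windows and timers that never fire.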

2 / 5

What is the purpose of Flink's RocksDB state backend (EmbeddedRocksDBStateBackend) compared to the HashMapStateBackend?

3 / 5

A developer calls the following:

stream.keyBy(event -> event.userId).window(TumblingEventTimeWindows.of(Time.minutes(5))).aggregate(new MyAggregateFunction())

What does keyBy do to the parallelism?
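
The routing behind keyBy can be sketched in a few lines. This is a simplified model under stated assumptions: the class KeyBySketch is hypothetical, and it uses the key's plain hashCode where Flink applies an additional murmur hash before taking the modulus. The shape of the mapping (key to key group, key group to subtask) follows Flink's key-group scheme, but the concrete subtask numbers will not match a real job.

```java
// Simplified model of key-to-subtask routing (hypothetical, not Flink code).
class KeyBySketch {
    // Maps a key to one of maxParallelism key groups.
    // Assumption: plain hashCode is used here; Flink hashes it again first.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Maps a key group to one of `parallelism` downstream subtasks.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        return keyGroupFor(key, maxParallelism) * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        int parallelism = 4; // unchanged by the routing itself
        for (String userId : new String[] {"alice", "bob", "alice"}) {
            System.out.println(userId + " -> subtask "
                + subtaskFor(userId, maxParallelism, parallelism));
        }
    }
}
```

The point of the sketch is that the number of subtasks is an input to the routing, not an output: the same key always lands on the same subtask, while the operator's parallelism stays whatever it was configured to be.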

4 / 5

A Flink job uses two-phase commit for exactly-once Kafka output. The pre-commit phase succeeds but the commit phase fails before acknowledgment. What happens on job recovery?
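
The recovery path in this scenario can be modelled with a small state machine. The sketch below is an illustration only: TwoPhaseCommitSketch and its methods are hypothetical names, and the in-memory sets stand in for the checkpointed transaction state and the external transactional system. It assumes the usual two-phase-commit contract, namely that pre-committed transactions are recorded in the checkpoint and that commit is idempotent so it can be retried after a restore.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of two-phase-commit recovery (hypothetical, not Flink code).
class TwoPhaseCommitSketch {
    // Transactions visible to downstream readers.
    final Set<String> committed = new HashSet<>();
    // Pre-committed transaction ids stored with the last completed checkpoint.
    final Set<String> pendingInCheckpoint = new LinkedHashSet<>();

    // Phase 1: flush the data and record the transaction in the checkpoint.
    void preCommit(String txnId) {
        pendingInCheckpoint.add(txnId);
    }

    // Phase 2: make the transaction visible. Idempotent, so retries are safe.
    void commit(String txnId) {
        committed.add(txnId);
        pendingInCheckpoint.remove(txnId);
    }

    // On restore, retry the commit for every pre-committed transaction.
    void recover() {
        for (String txnId : new ArrayList<>(pendingInCheckpoint)) {
            commit(txnId);
        }
    }

    public static void main(String[] args) {
        TwoPhaseCommitSketch sink = new TwoPhaseCommitSketch();
        sink.preCommit("txn-1");
        // Simulated crash: commit("txn-1") never completes.
        sink.recover();
        System.out.println(sink.committed.contains("txn-1"));
    }
}
```

One caveat the model leaves out: with Kafka, the retried commit only succeeds if the job recovers before the broker-side transaction timeout expires, so that timeout is usually sized against the expected checkpoint interval plus recovery time.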

5 / 5

What does the allowedLateness setting on a Flink window do?
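
A small classifier makes the effect of allowedLateness concrete. The class AllowedLatenessSketch and the Outcome names are hypothetical; the boundary arithmetic assumes Flink's convention that a window covering [start, end) has a maximum timestamp of end - 1, fires once the watermark reaches that timestamp, and keeps its state until the watermark reaches the maximum timestamp plus the allowed lateness.

```java
// Toy classifier for elements arriving at an event-time window
// (hypothetical, not Flink code).
class AllowedLatenessSketch {
    enum Outcome { ON_TIME, LATE_BUT_ACCEPTED, DROPPED }

    // windowEnd is exclusive; all values are in milliseconds.
    static Outcome classify(long windowEnd, long allowedLatenessMs, long watermark) {
        long maxTimestamp = windowEnd - 1;
        if (watermark < maxTimestamp) {
            return Outcome.ON_TIME; // window has not fired yet
        }
        if (watermark < maxTimestamp + allowedLatenessMs) {
            return Outcome.LATE_BUT_ACCEPTED; // window state still kept, fires again
        }
        return Outcome.DROPPED; // window state already cleaned up
    }

    public static void main(String[] args) {
        long windowEnd = 300_000L;      // a 5-minute window starting at 0
        long lateness = 60_000L;        // 1 minute of allowed lateness
        System.out.println(classify(windowEnd, lateness, 200_000L));
        System.out.println(classify(windowEnd, lateness, 320_000L));
        System.out.println(classify(windowEnd, lateness, 400_000L));
    }
}
```

In a real job, elements in the DROPPED category are discarded by default; they can instead be routed to a side output with sideOutputLateData if they need to be inspected or reprocessed.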