Build fluency in Apache Flink SQL: watermarks, window types, continuous queries, and connector DDL.
1 / 5
A data engineer asks why the Flink SQL job misses late events. In a PR review, you identify the missing WATERMARK definition. What does a watermark do in Flink SQL?
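A WATERMARK declaration (e.g. WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND) marks event_time as the event-time attribute and tells Flink how far event time has progressed, here tolerating events that arrive up to 5 seconds out of order. Flink closes a window and emits its result when the watermark passes the window end; events that arrive after that point are late and are dropped by default.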
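As a minimal sketch of where the declaration quoted above lives, the DDL below puts it inside a table definition. The table name `clicks`, its columns, and the `datagen` test connector are assumptions chosen for illustration, not part of the original scenario.

```sql
-- Hypothetical table; datagen generates random rows so the DDL can be tried locally.
CREATE TABLE clicks (
  user_id    STRING,
  event_time TIMESTAMP(3),
  -- The watermark trails the largest event_time seen so far by 5 seconds,
  -- so events up to 5 seconds out of order still reach their window.
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5'
);
```

A larger interval tolerates more disorder but delays every window result by the same amount.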
2 / 5
During a design review, the team debates using a tumbling window vs a hopping window for a 1-minute sales aggregate. The difference is:
Tumbling windows partition time into non-overlapping, equal-sized buckets. Hopping windows have a size and a slide; when slide is less than size, windows overlap and an event can belong to multiple windows.
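The two options can be compared side by side with Flink's windowing table-valued functions (available from Flink 1.13). This is a sketch under assumptions: the `sales` table, its columns, and the 30-second slide are hypothetical.

```sql
-- Hypothetical source table; datagen produces random rows for local testing.
CREATE TABLE sales (
  item_id    STRING,
  amount     DOUBLE,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10'
);

-- Tumbling: one result row per non-overlapping 1-minute bucket.
SELECT window_start, window_end, SUM(amount) AS total_sales
FROM TABLE(
  TUMBLE(TABLE sales, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end;

-- Hopping: 1-minute windows that start every 30 seconds.
-- HOP takes the slide first, then the size, so each sale lands in two windows.
SELECT window_start, window_end, SUM(amount) AS total_sales
FROM TABLE(
  HOP(TABLE sales, DESCRIPTOR(event_time), INTERVAL '30' SECOND, INTERVAL '1' MINUTE))
GROUP BY window_start, window_end;
```

For a per-minute sales total where each sale should be counted once, the tumbling form is the natural fit; the hopping form suits a rolling figure that refreshes more often than its window length.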
3 / 5
A Flink SQL job needs to write aggregated results to an external database. In a PR, you use INSERT INTO sink_table SELECT ... FROM source_table. A colleague asks what kind of statement this creates in streaming mode:
In Flink SQL streaming mode, INSERT INTO ... SELECT submits a continuous query as a long-running job. With an unbounded source it processes events as they arrive and keeps writing results to the sink table until the job is cancelled or fails.
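A sketch of such a pipeline is shown below. Everything specific in it is an assumption for illustration: the table names, the columns, and the JDBC URL and credentials are placeholders, and the JDBC sink also needs the Flink JDBC connector and a database driver on the classpath.

```sql
-- Hypothetical unbounded source (datagen stands in for a real stream).
CREATE TABLE source_table (
  amount     DOUBLE,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10'
);

-- Hypothetical sink; URL, table name, and credentials are placeholders.
CREATE TABLE sink_table (
  window_start TIMESTAMP(3),
  window_end   TIMESTAMP(3),
  total_sales  DOUBLE
) WITH (
  'connector'  = 'jdbc',
  'url'        = 'jdbc:postgresql://localhost:5432/analytics',
  'table-name' = 'sales_per_minute',
  'username'   = 'flink',
  'password'   = 'change-me'
);

-- Submits a streaming job that stays running and appends one row
-- per closed 1-minute window.
INSERT INTO sink_table
SELECT window_start, window_end, SUM(amount) AS total_sales
FROM TABLE(
  TUMBLE(TABLE source_table, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end;
```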
4 / 5
In a standup, a teammate says session windows are better for user-activity tracking than tumbling windows. You agree and explain that a session window in Flink SQL closes when:
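Session windows have no fixed size. Events for the same key stay in one session as long as each arrives within the session gap of the previous one. The window closes when no further event arrives within that gap; in event time, that is when the watermark passes the last event's timestamp plus the gap.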
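The query below is a sketch of per-user sessions with a 30-minute gap. It uses the older group-window form (SESSION in GROUP BY), which newer Flink releases mark as legacy in favor of a SESSION table-valued function; the `user_activity` table, its columns, and the gap length are assumptions.

```sql
-- Hypothetical activity stream; short random user ids so that keys repeat.
CREATE TABLE user_activity (
  user_id    STRING,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '5',
  'fields.user_id.length' = '1'
);

-- One row per user session; a session ends after 30 minutes without events.
SELECT
  user_id,
  SESSION_START(event_time, INTERVAL '30' MINUTE) AS session_start,
  SESSION_END(event_time, INTERVAL '30' MINUTE)   AS session_end,
  COUNT(*) AS events
FROM user_activity
GROUP BY user_id, SESSION(event_time, INTERVAL '30' MINUTE);
```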
5 / 5
A Flink SQL CREATE TABLE statement defines a Kafka connector. A review comment asks what this DDL statement does at runtime. The correct answer is:
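CREATE TABLE in Flink SQL registers table metadata only. It stores the schema and connector config (topic, format, startup offset mode) in the catalog; it does not create the Kafka topic or move any data. Flink uses the definition to bind SQL queries to the external Kafka system when a query reads from or writes to the table.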
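A sketch of such a definition follows. The topic, broker address, consumer group, and columns are placeholders assumed for illustration, and the Kafka SQL connector must be on the classpath.

```sql
-- Registers metadata in the catalog; nothing is read from Kafka yet.
CREATE TABLE orders (
  order_id   STRING,
  amount     DOUBLE,
  event_time TIMESTAMP(3),
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',                                   -- placeholder topic
  'properties.bootstrap.servers' = 'localhost:9092',    -- placeholder broker
  'properties.group.id' = 'orders-reader',              -- placeholder group
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);
```

Data starts flowing only when a query such as SELECT or INSERT INTO references `orders`.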