Master the MergeTree engine, sparse primary index, columnar compression, and materialised views for real-time analytics
0 / 5 completed
1 / 5
What is ClickHouse designed for?
ClickHouse: is a column-oriented OLAP database built for analytics at scale — billions of rows, fast aggregations, and real-time dashboards. It trades the per-row transactional strengths of OLTP systems for extremely fast columnar scans, compression, and vectorised execution suited to analytical reporting and observability data.
2 / 5
What is the role of the MergeTree engine family?
MergeTree: stores rows in immutable parts sorted by the ORDER BY (sorting key). Background merges combine small parts into larger ones, keeping data sorted and compact. This sorted layout enables efficient range scans and a sparse primary index, forming the foundation for variants like ReplacingMergeTree and SummingMergeTree.
3 / 5
How does the sparse primary index in ClickHouse work?
Sparse index: ClickHouse indexes one entry per granule (a block of rows, by default 8192) instead of every row. A query on the sorting key uses these marks to skip to the relevant granules and scan only those, so the primary index stays small and memory-resident even for tables with billions of rows.
4 / 5
Why does ClickHouse achieve strong compression?
Compression: columnar storage groups homogeneous values, so specialised codecs (e.g. Delta, DoubleDelta, Gorilla) plus LZ4/ZSTD compress them very effectively. High compression reduces disk I/O — often the bottleneck in analytics — which both saves storage and speeds up scans because fewer bytes are read.
5 / 5
What is a materialised view in ClickHouse used for?
Materialised views: in ClickHouse, a materialised view runs on INSERT, transforming new rows and writing them to a target table (often a SummingMergeTree or AggregatingMergeTree). This continuously maintains pre-aggregated summaries so dashboard queries read small, pre-computed tables instead of scanning raw events.