Use materialized views
A materialized view calculates intermediate aggregate states for data from a source table according to the SELECT
query, which is specified in the view definition, and saves the results to its internal table or to a separate target table. Aggregate states in the view are updated automatically each time the source table receives new data. Materialized views allow you to optimize the execution time of complex queries with aggregates that access a large number of data records — precomputation of totals reduces the amount of data needed to be processed for SELECT
query execution.
Create materialized view
Syntax overview
To create a materialized view, use the CREATE MATERIALIZED VIEW
query. The basic syntax is:
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [<db_name>.]<mat_view_name> [ON CLUSTER <cluster_name>]
[TO [<db_name>.]<target_table>]
[ENGINE = <engine>]
[POPULATE]
AS SELECT
...
FROM [<db_name>.]<source_table>
...;
Below are clauses used to define a materialized view.
TO |
Target table to which the materialized view should forward the data |
ENGINE |
Engine of the materialized view’s internal table where aggregated data should be stored if a separate target table has not been explicitly specified in the materialized view definition via the |
POPULATE |
Indicates that data existing in the source table when the materialized view is being created should be automatically loaded into the view. It is not recommended to use |
AS SELECT … FROM … |
Query that reads new data inserted into the source table and determines how to transform that data for loading into the materialized view (or into the target table, if any). The You can change the |
NOTE
Any changes to data existing in the source table (such as update, delete, etc.) do not affect the materialized view. |
In ADQM, materialized views look the same as normal tables. For example, they are listed in the result of the SHOW TABLES
query.
To delete a materialized view, run the DROP VIEW
or DROP TABLE
query.
Example
This example shows how to create a materialized view to pre-aggregate data and speed up subsequent SELECT
query executions.
-
Create a source table to store pageview data:
CREATE TABLE page_views ( time DateTime, page_name String, time_on_page UInt64 ) ENGINE = MergeTree ORDER BY time;
-
Create a target table to store pageview counts and average pageview time by month:
CREATE TABLE page_views_monthly ( page_name String, month Date, page_view_count UInt64, avg_time_on_page AggregateFunction(avg, UInt64) ) ENGINE = SummingMergeTree() ORDER BY (page_name, month);
A column with average pageview time is of the AggregateFunction data type that allows storing partially aggregated data. The SummingMergeTree table engine used in the example requires this data type for all aggregates except
sum
orcount
. To work with materialized views, you can also use other table engines. For example, use the AggregatingMergeTree engine that works with aggregate functions only (this means that if you set the AggregatingMergeTree engine in the query above, thepage_view_count
column should also be of the AggregateFunction type). -
Create a materialized view:
CREATE MATERIALIZED VIEW page_views_monthly_mv TO page_views_monthly AS SELECT toDate(toStartOfMonth(time)) AS month, page_name, count() as page_view_count, avgState(time_on_page) AS avg_time_on_page FROM page_views GROUP BY page_name, month ORDER BY page_name, month;
When data is inserted into the
page_views
source table, the materialized view will transform it as specified in theSELECT
definition and send the result to thepage_views_monthly
target table. To insert intermediate aggregate states into a target table’savg_time_on_page
column of the AggregateFunction type, use the appropriate aggregate function with theState
suffix (avgState
) in the view’sSELECT
definition.If you run the
SHOW TABLES
query at this point, the output will include the created tables and materialized view:┌─name──────────────────┐ │ page_views │ │ page_views_monthly │ │ page_views_monthly_mv │ └───────────────────────┘
-
Populate the source table with test data. For example, randomly generate 1 million views for 4 pages:
INSERT INTO page_views SELECT toDate('2023-01-01 00:00:00') + (rand() % 182) AS time, ['landing', 'products', 'docs', 'about'][toInt32(randBinomial(4, 0.01)) + 1], rand() % 15 FROM numbers(1000000);
-
Compare the speed of the
SELECT
query execution when fetching data from the source table with raw input data and from the table with pre-aggregated monthly data.
Select data from thepage_views
table:SELECT page_name, toDate(toStartOfMonth(time)) AS month, count() AS page_view_count, round(avg(time_on_page), 2) AS avg_time_on_page FROM page_views GROUP BY page_name, month ORDER BY page_name, month;
┌─page_name─┬──────month─┬─page_view_count─┬─avg_time_on_page─┐ │ about │ 2023-02-01 │ 2 │ 7 │ │ about │ 2023-03-01 │ 1 │ 7 │ │ docs │ 2023-01-01 │ 95 │ 7.21 │ │ docs │ 2023-02-01 │ 93 │ 6.42 │ │ docs │ 2023-03-01 │ 102 │ 6.72 │ │ docs │ 2023-04-01 │ 84 │ 7.35 │ │ docs │ 2023-05-01 │ 101 │ 7.18 │ │ docs │ 2023-06-01 │ 92 │ 7.72 │ │ docs │ 2023-07-01 │ 5 │ 8.6 │ │ landing │ 2023-01-01 │ 163242 │ 7.01 │ │ landing │ 2023-02-01 │ 147695 │ 7 │ │ landing │ 2023-03-01 │ 163930 │ 6.99 │ │ landing │ 2023-04-01 │ 158155 │ 7.01 │ │ landing │ 2023-05-01 │ 163263 │ 6.99 │ │ landing │ 2023-06-01 │ 158840 │ 7 │ │ landing │ 2023-07-01 │ 5396 │ 7.02 │ │ products │ 2023-01-01 │ 6667 │ 7.05 │ │ products │ 2023-02-01 │ 6099 │ 6.99 │ │ products │ 2023-03-01 │ 6589 │ 7.02 │ │ products │ 2023-04-01 │ 6391 │ 7.01 │ │ products │ 2023-05-01 │ 6428 │ 6.98 │ │ products │ 2023-06-01 │ 6517 │ 7.04 │ │ products │ 2023-07-01 │ 213 │ 7.48 │ └───────────┴────────────┴─────────────────┴──────────────────┘ 23 rows in set. Elapsed: 0.023 sec. Processed 1.00 million rows, 28.04 MB (43.43 million rows/s., 1.22 GB/s.)
Select data from the
page_views_monthly
table — reading from this table will be 7 times faster than from the source table (with more data for selection, the difference will be even more noticeable). Note that theSELECT
query uses the same aggregate function as for data insertion, but with theMerge
suffix — it is required to combine a set of intermediate aggregate states and return the result of complete data aggregation (see AggregateFunction):SELECT page_name, month, sum(page_view_count) AS page_view_count, round(avgMerge(avg_time_on_page), 2) AS avg_time_on_page FROM page_views_monthly GROUP BY page_name, month ORDER BY page_name, month;
┌─page_name─┬──────month─┬─page_view_count─┬─avg_time_on_page─┐ │ about │ 2023-02-01 │ 2 │ 7 │ │ about │ 2023-03-01 │ 1 │ 7 │ │ docs │ 2023-01-01 │ 95 │ 7.21 │ │ docs │ 2023-02-01 │ 93 │ 6.42 │ │ docs │ 2023-03-01 │ 102 │ 6.72 │ │ docs │ 2023-04-01 │ 84 │ 7.35 │ │ docs │ 2023-05-01 │ 101 │ 7.18 │ │ docs │ 2023-06-01 │ 92 │ 7.72 │ │ docs │ 2023-07-01 │ 5 │ 8.6 │ │ landing │ 2023-01-01 │ 163242 │ 7.01 │ │ landing │ 2023-02-01 │ 147695 │ 7 │ │ landing │ 2023-03-01 │ 163930 │ 6.99 │ │ landing │ 2023-04-01 │ 158155 │ 7.01 │ │ landing │ 2023-05-01 │ 163263 │ 6.99 │ │ landing │ 2023-06-01 │ 158840 │ 7 │ │ landing │ 2023-07-01 │ 5396 │ 7.02 │ │ products │ 2023-01-01 │ 6667 │ 7.05 │ │ products │ 2023-02-01 │ 6099 │ 6.99 │ │ products │ 2023-03-01 │ 6589 │ 7.02 │ │ products │ 2023-04-01 │ 6391 │ 7.01 │ │ products │ 2023-05-01 │ 6428 │ 6.98 │ │ products │ 2023-06-01 │ 6517 │ 7.04 │ │ products │ 2023-07-01 │ 213 │ 7.48 │ └───────────┴────────────┴─────────────────┴──────────────────┘ 23 rows in set. Elapsed: 0.003 sec.
Modify materialized view
Use the ALTER TABLE…MODIFY QUERY
clause to modify a SELECT
query that was specified when a materialized view was created. This command is not recommended for materialized views without target tables assigned via the TO
clause as it does not change the structure and column definitions of a materialized view’s internal table.
Before executing the ALTER TABLE…MODIFY QUERY
query, enable the option: SET allow_experimental_alter_materialized_view_structure=1
.
Example
-
Use the ALTER TABLE…ADD COLUMN clause to add the
browser
column to thepage_views
source table and to thepage_views_monthly
target table (columns of a materialized view and its target table do not have to match, so changing the target table’s structure will not break data insertion):ALTER TABLE page_views ADD COLUMN browser String;
ALTER TABLE page_views_monthly ADD COLUMN browser String after month, MODIFY ORDER BY (page_name, month, browser);
-
Insert data into the source table:
INSERT INTO page_views SELECT toDate('2023-01-01 00:00:00') + (rand() % 182) AS time, ['landing', 'products', 'docs', 'about'][toInt32(randBinomial(4, 0.01)) + 1], rand() % 15, ['firefox', 'safary', 'chrome'][(number % 3) + 1] FROM numbers(1000000);
If you read data from the target table at this stage, you will see that the
browser
column is empty:SELECT page_name, month, browser, sum(page_view_count) AS page_view_count, round(avgMerge(avg_time_on_page), 2) AS avg_time_on_page FROM page_views_monthly GROUP BY page_name, month, browser ORDER BY page_name, month ASC, browser;
┌─page_name─┬──────month─┬─browser─┬─page_view_count─┬─avg_time_on_page─┐ │ about │ 2023-01-01 │ │ 1 │ 14 │ │ about │ 2023-02-01 │ │ 3 │ 6.67 │ │ about │ 2023-03-01 │ │ 2 │ 8.5 │ │ about │ 2023-05-01 │ │ 1 │ 12 │ │ docs │ 2023-01-01 │ │ 203 │ 7.13 │ │ docs │ 2023-02-01 │ │ 175 │ 6.91 │ │ docs │ 2023-03-01 │ │ 185 │ 7.21 │ │ docs │ 2023-04-01 │ │ 175 │ 7.15 │ │ docs │ 2023-05-01 │ │ 182 │ 7.53 │ │ docs │ 2023-06-01 │ │ 190 │ 7.94 │ │ docs │ 2023-07-01 │ │ 8 │ 9.88 │ │ landing │ 2023-01-01 │ │ 326986 │ 7.01 │ │ landing │ 2023-02-01 │ │ 295510 │ 6.99 │ │ landing │ 2023-03-01 │ │ 327021 │ 7 │ │ landing │ 2023-04-01 │ │ 316761 │ 7 │ │ landing │ 2023-05-01 │ │ 327207 │ 7 │ │ landing │ 2023-06-01 │ │ 316732 │ 6.99 │ │ landing │ 2023-07-01 │ │ 10702 │ 7.05 │ │ products │ 2023-01-01 │ │ 13224 │ 7.07 │ │ products │ 2023-02-01 │ │ 12103 │ 7.02 │ │ products │ 2023-03-01 │ │ 13225 │ 6.99 │ │ products │ 2023-04-01 │ │ 12807 │ 6.98 │ │ products │ 2023-05-01 │ │ 13156 │ 6.95 │ │ products │ 2023-06-01 │ │ 13000 │ 7.01 │ │ products │ 2023-07-01 │ │ 441 │ 6.95 │ └───────────┴────────────┴─────────┴─────────────────┴──────────────────┘
-
To send data from the new column of the source table to the corresponding column of the target table, you need to change the
SELECT
query of the materialized view. To do this, run theALTER TABLE…MODIFY QUERY
command:SET allow_experimental_alter_materialized_view_structure=1;
ALTER TABLE page_views_monthly_mv MODIFY QUERY SELECT toDate(toStartOfMonth(time)) AS month, page_name, browser, count() AS page_view_count, avgState(time_on_page) AS avg_time_on_page FROM page_views GROUP BY page_name, month, browser ORDER BY page_name, month, browser;
-
Add data to the source table again:
INSERT INTO page_views SELECT toDate('2023-01-01 00:00:00') + (rand() % 182) AS time, ['landing', 'products', 'docs', 'about'][toInt32(randBinomial(4, 0.01)) + 1], rand() % 15, ['firefox', 'safary', 'chrome'][(number % 3) + 1] FROM numbers(1000000);
Make sure the data from the
browser
column is now inserted into the target table:SELECT page_name, month, browser, sum(page_view_count) AS page_view_count, round(avgMerge(avg_time_on_page), 2) AS avg_time_on_page FROM page_views_monthly GROUP BY page_name, month, browser ORDER BY page_name, month ASC, browser;
┌─page_name─┬──────month─┬─browser─┬─page_view_count─┬─avg_time_on_page─┐ │ about │ 2023-01-01 │ │ 1 │ 14 │ │ about │ 2023-02-01 │ │ 3 │ 6.67 │ │ about │ 2023-02-01 │ safary │ 1 │ 7 │ │ about │ 2023-03-01 │ │ 2 │ 8.5 │ │ about │ 2023-04-01 │ firefox │ 1 │ 1 │ │ about │ 2023-05-01 │ │ 1 │ 12 │ │ about │ 2023-05-01 │ chrome │ 1 │ 14 │ │ about │ 2023-05-01 │ safary │ 1 │ 6 │ │ docs │ 2023-01-01 │ │ 203 │ 7.13 │ │ docs │ 2023-01-01 │ chrome │ 37 │ 6.92 │ │ docs │ 2023-01-01 │ firefox │ 38 │ 5.58 │ │ docs │ 2023-01-01 │ safary │ 29 │ 5.79 │ │ docs │ 2023-02-01 │ │ 175 │ 6.91 │ │ docs │ 2023-02-01 │ chrome │ 30 │ 6.77 │ │ docs │ 2023-02-01 │ firefox │ 26 │ 6.65 │ │ docs │ 2023-02-01 │ safary │ 18 │ 5.89 │ │ docs │ 2023-03-01 │ │ 185 │ 7.21 │ │ docs │ 2023-03-01 │ chrome │ 42 │ 7.83 │ │ docs │ 2023-03-01 │ firefox │ 31 │ 7.65 │ │ docs │ 2023-03-01 │ safary │ 33 │ 5.88 │ │ docs │ 2023-04-01 │ │ 175 │ 7.15 │ │ docs │ 2023-04-01 │ chrome │ 28 │ 8.25 │ │ docs │ 2023-04-01 │ firefox │ 25 │ 6.32 │ │ docs │ 2023-04-01 │ safary │ 26 │ 6.58 │ │ docs │ 2023-05-01 │ │ 182 │ 7.53 │ │ docs │ 2023-05-01 │ chrome │ 32 │ 7.03 │ │ docs │ 2023-05-01 │ firefox │ 20 │ 7.45 │ │ docs │ 2023-05-01 │ safary │ 34 │ 6.59 │ │ docs │ 2023-06-01 │ │ 190 │ 7.94 │ │ docs │ 2023-06-01 │ chrome │ 35 │ 8.37 │ │ docs │ 2023-06-01 │ firefox │ 29 │ 6.69 │ │ docs │ 2023-06-01 │ safary │ 27 │ 7.15 │ │ docs │ 2023-07-01 │ │ 8 │ 9.88 │ │ docs │ 2023-07-01 │ chrome │ 2 │ 2 │ │ docs │ 2023-07-01 │ firefox │ 1 │ 1 │ │ docs │ 2023-07-01 │ safary │ 1 │ 1 │ │ landing │ 2023-01-01 │ │ 326986 │ 7.01 │ │ landing │ 2023-01-01 │ chrome │ 54763 │ 6.99 │ │ landing │ 2023-01-01 │ firefox │ 54665 │ 7.02 │ │ landing │ 2023-01-01 │ safary │ 54294 │ 7 │ │ landing │ 2023-02-01 │ │ 295510 │ 6.99 │ │ landing │ 2023-02-01 │ chrome │ 48944 │ 7.03 │ │ landing │ 2023-02-01 │ firefox │ 49031 │ 7.03 │ │ landing │ 2023-02-01 │ safary │ 49298 │ 7 │ │ landing │ 2023-03-01 │ │ 327021 │ 7 │ │ landing │ 2023-03-01 │ chrome │ 54381 │ 6.98 │ │ landing │ 2023-03-01 │ firefox │ 54433 │ 7.03 │ │ landing │ 2023-03-01 │ safary │ 54556 │ 7.01 │ │ landing │ 2023-04-01 │ │ 316761 │ 7 │ │ landing │ 2023-04-01 │ chrome │ 52931 │ 6.95 │ │ landing │ 2023-04-01 │ firefox │ 52706 │ 6.99 │ │ landing │ 2023-04-01 │ safary │ 53075 │ 6.99 │ │ landing │ 2023-05-01 │ │ 327207 │ 7 │ │ landing │ 2023-05-01 │ chrome │ 54538 │ 6.99 │ │ landing │ 2023-05-01 │ firefox │ 54733 │ 7.02 │ │ landing │ 2023-05-01 │ safary │ 54618 │ 7.02 │ │ landing │ 2023-06-01 │ │ 316732 │ 6.99 │ │ landing │ 2023-06-01 │ chrome │ 52763 │ 6.97 │ │ landing │ 2023-06-01 │ firefox │ 52955 │ 7 │ │ landing │ 2023-06-01 │ safary │ 52795 │ 7.01 │ │ landing │ 2023-07-01 │ │ 10702 │ 7.05 │ │ landing │ 2023-07-01 │ chrome │ 1762 │ 7.08 │ │ landing │ 2023-07-01 │ firefox │ 1747 │ 6.98 │ │ landing │ 2023-07-01 │ safary │ 1767 │ 7.09 │ │ products │ 2023-01-01 │ │ 13224 │ 7.07 │ │ products │ 2023-01-01 │ chrome │ 2208 │ 7.04 │ │ products │ 2023-01-01 │ firefox │ 2210 │ 6.99 │ │ products │ 2023-01-01 │ safary │ 2273 │ 6.99 │ │ products │ 2023-02-01 │ │ 12103 │ 7.02 │ │ products │ 2023-02-01 │ chrome │ 1970 │ 7.1 │ │ products │ 2023-02-01 │ firefox │ 1978 │ 6.88 │ │ products │ 2023-02-01 │ safary │ 1870 │ 7.01 │ │ products │ 2023-03-01 │ │ 13225 │ 6.99 │ │ products │ 2023-03-01 │ chrome │ 2189 │ 6.97 │ │ products │ 2023-03-01 │ firefox │ 2186 │ 6.94 │ │ products │ 2023-03-01 │ safary │ 2251 │ 7.18 │ │ products │ 2023-04-01 │ │ 12807 │ 6.98 │ │ products │ 2023-04-01 │ chrome │ 2252 │ 6.91 │ │ products │ 2023-04-01 │ firefox │ 2210 │ 7.1 │ │ products │ 2023-04-01 │ safary │ 2089 │ 7.04 │ │ products │ 2023-05-01 │ │ 13156 │ 6.95 │ │ products │ 2023-05-01 │ chrome │ 2214 │ 7 │ │ products │ 2023-05-01 │ firefox │ 2188 │ 6.94 │ │ products │ 2023-05-01 │ safary │ 2151 │ 6.98 │ │ products │ 2023-06-01 │ │ 13000 │ 7.01 │ │ products │ 2023-06-01 │ chrome │ 2134 │ 6.89 │ │ products │ 2023-06-01 │ firefox │ 2054 │ 7.18 │ │ products │ 2023-06-01 │ safary │ 2062 │ 6.91 │ │ products │ 2023-07-01 │ │ 441 │ 6.95 │ │ products │ 2023-07-01 │ chrome │ 77 │ 6.65 │ │ products │ 2023-07-01 │ firefox │ 67 │ 7.27 │ │ products │ 2023-07-01 │ safary │ 64 │ 7.16 │ └───────────┴────────────┴─────────┴─────────────────┴──────────────────┘