Discover more from Matt Rickard
Cache invalidation is hard. Even if it's not really "cache invalidation." The problem is that you often want denormalized data from your relational databases. But complex joins and large amounts of data can make those queries expensive (in terms of both time and dollar cost).
The answer is often an incremental approach. A materialized view provides an up-to-date cached table of the denormalized data. They've been around in some form since 1998 (Oracle 8). You can manually implement them with triggers and state functions, but those solutions aren't generalizable.
The industry seems to be backfilling popular database products with support for martialized views. BigQuery added support in 2020. Snowflake
The recent few years of innovation have been built off two papers:
Differential dataflow (2013)
Out of this research, there's been a few different startups (e.g., Readyset, Materialized) that implement a common wire protocol (Postgres/MySQL) and add support for materialized views via one of these methods.