Matt Rickard

Materializing Results

blog.matt-rickard.com

Sep 9, 2022
Cache invalidation is hard, even when it's not really called "cache invalidation." The problem is that you often want denormalized data from your relational databases, but complex joins and large amounts of data can make those queries expensive (in both time and dollar cost).

The answer is often an incremental approach. A materialized view provides an up-to-date cached table of the denormalized data. Materialized views have been around in some form since 1998 (Oracle 8). You can implement them manually with triggers and state functions, but those solutions aren't generalizable.
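To make the manual approach concrete, here is a minimal sketch of a trigger-maintained summary table, using Python's built-in `sqlite3` module. The `orders` and `order_totals` tables and the trigger names are hypothetical, chosen only for illustration. Note that it handles inserts and deletes but not updates, and every new query shape would need its own hand-written triggers, which is why this doesn't generalize.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT NOT NULL,
    amount INTEGER NOT NULL
);

-- Hand-rolled "materialized view": one row per customer with running totals.
CREATE TABLE order_totals (
    customer TEXT PRIMARY KEY,
    order_count INTEGER NOT NULL,
    total INTEGER NOT NULL
);

-- Apply each inserted order to the cached totals.
CREATE TRIGGER orders_after_insert AFTER INSERT ON orders BEGIN
    INSERT OR IGNORE INTO order_totals (customer, order_count, total)
        VALUES (NEW.customer, 0, 0);
    UPDATE order_totals
        SET order_count = order_count + 1, total = total + NEW.amount
        WHERE customer = NEW.customer;
END;

-- Undo a deleted order, dropping the row once no orders remain.
CREATE TRIGGER orders_after_delete AFTER DELETE ON orders BEGIN
    UPDATE order_totals
        SET order_count = order_count - 1, total = total - OLD.amount
        WHERE customer = OLD.customer;
    DELETE FROM order_totals
        WHERE customer = OLD.customer AND order_count = 0;
END;
""")

conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 10), ("bob", 5), ("alice", 7)],
)
conn.execute("DELETE FROM orders WHERE customer = 'bob'")

# Cheap read from the cached table...
totals = conn.execute(
    "SELECT customer, order_count, total FROM order_totals ORDER BY customer"
).fetchall()

# ...which should match the full recomputation it replaces.
recomputed = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
```

Reads hit `order_totals` directly instead of re-running the aggregate, at the cost of extra work on every write.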

The industry seems to be backfilling popular database products with support for materialized views. BigQuery added support in 2020. Snowflake

The last few years of innovation have built on two papers:

  • Differential dataflow (2013)

  • Noria: dynamic, partially-stateful data-flow for high-performance web applications (2018)

Out of this research have come a few startups (e.g., ReadySet, Materialize) that implement a common wire protocol (Postgres/MySQL) and add support for materialized views via one of these methods.
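The shared idea behind these systems is to push changes through a dataflow instead of recomputing the query. The toy below is only a sketch of that idea, not the algorithm from either paper: it maintains the equivalent of `SUM(orders.amount) GROUP BY users.country` over an inner join, updating the result from each insert (`diff = +1`) or retraction (`diff = -1`). The class name, schema, and method names are all made up for illustration, and it assumes user ids are unique.

```python
from collections import defaultdict


class RevenueByCountry:
    """Incrementally maintains total order amount per user country."""

    def __init__(self):
        self.country_of = {}              # user_id -> country (join index)
        self.spend_of = defaultdict(int)  # user_id -> sum of that user's orders
        self.view = defaultdict(int)      # country -> total (the "view")

    def on_order(self, user_id, amount, diff):
        """Apply an order insert (diff=+1) or retraction (diff=-1)."""
        delta = amount * diff
        self.spend_of[user_id] += delta
        country = self.country_of.get(user_id)
        if country is not None:
            # Only orders with a matching user contribute (inner join).
            self.view[country] += delta

    def on_user(self, user_id, country, diff):
        """Apply a user insert (diff=+1) or retraction (diff=-1)."""
        if diff > 0:
            self.country_of[user_id] = country
        else:
            del self.country_of[user_id]
        # The user's existing orders enter or leave the join result.
        self.view[country] += diff * self.spend_of[user_id]


v = RevenueByCountry()
v.on_user(1, "US", +1)
v.on_user(2, "DE", +1)
v.on_order(1, 30, +1)
v.on_order(2, 20, +1)
v.on_order(1, 12, +1)
v.on_order(2, 20, -1)   # retract an earlier order
v.on_order(3, 99, +1)   # order arrives before its user row
v.on_user(3, "US", +1)  # the late user pulls that order into the view
```

Each change costs work proportional to the change itself rather than to the size of the tables. The real systems generalize this to arbitrary operators and handle concerns this sketch ignores, such as ordering, consistency, and how much state to keep.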
