Columnar Storage as Database Normalization: Understanding the Relational Foundation
By ibobev
Summary
The article explains that columnar storage in databases is essentially a form of normalization within the relational model, not a completely foreign concept. It demonstrates how row-oriented data can be transformed into column-oriented format through normalization techniques, showing that columnar storage maintains relational properties while optimizing for analytical queries. The author clarifies that this transformation is about reorganizing data representation rather than abandoning relational principles.
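The transformation the summary describes can be sketched in a few lines. This is a minimal illustration under assumptions, not the article's own code: the sample data, the `id` key, and the helper names `decompose` and `reconstruct` are all hypothetical. Each non-key attribute becomes its own two-column relation of (key, value) pairs, and joining those relations on the key recovers the original rows.

```python
# Hypothetical sample data; table and column names are illustrative only.
rows = [
    {"id": 1, "name": "Ada", "age": 36},
    {"id": 2, "name": "Grace", "age": 45},
    {"id": 3, "name": "Edsger", "age": 72},
]


def decompose(rows, key="id"):
    """Split a row-oriented table into one (key, value) relation per attribute."""
    columns = {}
    for row in rows:
        for attr, value in row.items():
            if attr == key:
                continue
            columns.setdefault(attr, []).append((row[key], value))
    return columns


def reconstruct(columns, key="id"):
    """Join the per-attribute relations back together on the shared key."""
    merged = {}
    for attr, pairs in columns.items():
        for k, value in pairs:
            merged.setdefault(k, {key: k})[attr] = value
    return [merged[k] for k in sorted(merged)]


columns = decompose(rows)

# An analytical query touches only the relation it needs.
total_age = sum(value for _, value in columns["age"])

# The decomposition is lossless: the join gives back the original rows.
round_trip = reconstruct(columns)
```

Here the aggregate over `age` never reads the `name` values, which is the analytical benefit the summary points to, while `reconstruct` shows that nothing relational was given up along the way.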
Key quotes
Something I didn't understand for a while is that the process of turning row-oriented data into column-oriented data isn't a totally bespoke, foreign concept in the realm of databases.
It's still of the relational abstraction. Or can be.
This representation has some nice properties.
It's easy to add a new row: we can just construct a row:
