Chapter 5 Storage Models & Compression¶
DATABASE WORKLOADS¶
- On-Line Transaction Processing (OLTP)
- Fast operations that only read/update a small amount of data each time.
- On-Line Analytical Processing (OLAP)
- Complex queries that read a lot of data to compute aggregates.
- Hybrid Transaction + Analytical Processing (HTAP)
- OLTP + OLAP together on the same database instance.
OBSERVATION
The relational model does not specify that the DBMS must store all a tuple's attributes together in a single page.
This may not actually be the best layout for some workloads…
OLTP¶
On-line Transaction Processing:
- Simple queries that read/update a small amount of data that is related to a single entity in the database.
This is usually the kind of application that people build first.
OLAP¶
On-line Analytical Processing:
- Complex queries that read large portions of the database spanning multiple entities.
You execute these workloads on the data you have collected from your OLTP application(s).
DATA STORAGE MODELS¶
The DBMS can store tuples in different ways that are better for either OLTP or OLAP workloads.
We have been assuming the n-ary storage model (aka "row storage") so far this semester.
N-ARY STORAGE MODEL (NSM)
The DBMS stores all attributes for a single tuple contiguously in a page.
Ideal for OLTP workloads where queries tend to operate on an individual entity, and for insert-heavy workloads.
(The slides walk through a query and an insert on NSM pages.)
The problem: a query may need only a few columns, but under NSM the disk stores all of a tuple's columns together, so the DBMS reads data it does not need.
Advantages:
- Fast inserts, updates, and deletes.
- Good for queries that need the entire tuple.

Disadvantages:
- Not good for scanning large portions of the table and/or a subset of the attributes.
DECOMPOSITION STORAGE MODEL (DSM)¶
The DBMS stores the values of a single attribute for all tuples contiguously in a page.
Also known as a "column store"
Ideal for OLAP workloads where read-only queries perform large scans over a subset of the table’s attributes.
Advantages:
- Reduces the amount of wasted I/O because the DBMS only reads the data that it needs.
- Better query processing and data compression (more on this later).

Disadvantages:
- Slow for point queries, inserts, updates, and deletes because of tuple splitting/stitching; modifications cost more.
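To make the difference concrete, here is a minimal Python sketch, with made-up table contents, of the same tuples laid out row-wise (NSM) and column-wise (DSM):

```python
# Three tuples of a hypothetical table with attributes (id, name, balance).
rows = [(1, "alice", 100), (2, "bob", 250), (3, "carol", 75)]

# NSM: all attributes of a tuple are stored contiguously.
nsm_page = [attr for row in rows for attr in row]
# -> [1, 'alice', 100, 2, 'bob', 250, 3, 'carol', 75]

# DSM: all values of one attribute are stored contiguously.
dsm_pages = {
    "id":      [r[0] for r in rows],
    "name":    [r[1] for r in rows],
    "balance": [r[2] for r in rows],
}

# An OLAP-style aggregate touches only the "balance" column under DSM,
# whereas under NSM the DBMS reads every attribute of every tuple.
total = sum(dsm_pages["balance"])   # 425
```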
REAL-WORLD DATA CHARACTERISTICS
COMMON SENSE
- Data sets tend to have highly skewed distributions for attribute values.
- Data sets tend to have high correlation between attributes of the same tuple.
OBSERVATION
- I/O is the main bottleneck if the DBMS fetches data from disk during query execution.
- The DBMS can compress pages to increase the utility of the data moved per I/O operation.
- The key trade-off is speed vs. compression ratio:
  - Compressing the database reduces DRAM requirements.
  - It may decrease CPU costs during query execution.
DATABASE COMPRESSION¶
Goal #1: Must produce fixed-length values.
- Only exception is var-length data stored in separate pool.
Goal #2: Postpone decompression for as long as possible during query execution.
- Also known as late materialization.
Goal #3: Must be a lossless scheme.
- When a DBMS uses compression, it is always lossless because people don't like losing data.
- Any kind of lossy compression must be performed at the application level.
COMPRESSION GRANULARITY¶
Choice #1: Block-level
- Compress a block of tuples for the same table.
Choice #2: Tuple-level
- Compress the contents of the entire tuple (NSM-only).
Choice #3: Attribute-level
- Compress a single attribute within one tuple (overflow).
- Can target multiple attributes for the same tuple.
Choice #4: Column-level
- Compress multiple values for one or more attributes stored for multiple tuples (DSM-only).
In this part we focus mainly on block-level and column-level compression.
BLOCK-LEVEL¶
Note:
- The DBMS must decompress data first before it can be read and (potentially) modified.
- This limits the "scope" of the compression scheme.
- These schemes also do not consider the high-level meaning or semantics of the data.
Ideally, we want the DBMS to operate on compressed data without decompressing it first.
COLUMN-LEVEL¶
Methods
- Run-length Encoding
- Bit-Packing Encoding
- Bitmap Encoding
- Delta Encoding
- Incremental Encoding
- Dictionary Encoding
RUN-LENGTH ENCODING
Compress runs of the same value in a single column into triplets:
- The value of the attribute.
- The start position in the column segment.
- The # of elements in the run.
Requires the columns to be sorted intelligently to maximize compression opportunities.
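A minimal sketch of the triplet scheme in Python; the `rle_encode` helper is illustrative, not any particular system's API:

```python
def rle_encode(column):
    """Compress runs of equal values into
    (value, start position, run length) triplets."""
    triplets = []
    for pos, value in enumerate(column):
        if triplets and triplets[-1][0] == value:
            v, start, length = triplets[-1]
            triplets[-1] = (v, start, length + 1)
        else:
            triplets.append((value, pos, 1))
    return triplets

# A sorted column maximizes run lengths:
print(rle_encode(["M", "M", "M", "F", "F", "M"]))
# [('M', 0, 3), ('F', 3, 2), ('M', 5, 1)]
```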
BIT-PACKING ENCODING
When the values for an attribute are always smaller than the attribute's declared largest size, store them as a smaller data type.
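A sketch of the idea, assuming a column declared as 64-bit integers whose values happen to fit in one byte (the `bit_pack` helper is hypothetical):

```python
import struct

def bit_pack(values):
    """Pack values into 1 byte each when they all fit,
    instead of the declared 8-byte integer type."""
    if all(0 <= v < 256 for v in values):
        return struct.pack(f"{len(values)}B", *values)  # 1 byte per value
    return struct.pack(f"{len(values)}q", *values)      # 8 bytes per value

packed = bit_pack([13, 191, 56])
print(len(packed))  # 3 bytes instead of 24
```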
MOSTLY ENCODING
Bit-packing variant that uses a special marker to indicate when a value exceeds the largest size, and maintains a separate look-up table to store those values.
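A sketch of a mostly-8 variant, with a made-up sentinel value and a positional look-up table for the exceptions:

```python
SENTINEL = 255  # marker meaning "look this value up in the exception table"

def mostly8_encode(values):
    """Store most values in one byte; values that do not fit go
    into a separate look-up table keyed by position."""
    packed, exceptions = [], {}
    for pos, v in enumerate(values):
        if 0 <= v < SENTINEL:
            packed.append(v)
        else:
            packed.append(SENTINEL)
            exceptions[pos] = v
    return packed, exceptions

print(mostly8_encode([3, 7, 99999, 12]))
# ([3, 7, 255, 12], {2: 99999})
```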
BITMAP ENCODING
Store a separate bitmap for each unique value of an attribute, where an offset in the vector corresponds to a tuple:
- The i-th position in the bitmap corresponds to the i-th tuple in the table.
- Typically segmented into chunks to avoid allocating large blocks of contiguous memory.
Bitmap encoding has a serious limitation, though: it only pays off when the attribute's cardinality is low, as the following example shows. :(
Example: assume a table with 10 million tuples and a zipcode attribute, with roughly 43,000 zip codes in the US.
- As plain 32-bit integers: 10,000,000 × 32 bits = 40 MB.
- As bitmaps: 10,000,000 × 43,000 bits = 53.75 GB.
Every time the application inserts a new tuple, the DBMS must extend 43,000 different bitmaps.
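A minimal sketch of bitmap encoding; it uses Python lists of 0/1 flags for readability, whereas a real DBMS packs the bits into machine words:

```python
def bitmap_encode(column):
    """Build one bitmap per distinct value; bit i is set iff
    tuple i has that value."""
    bitmaps = {}
    for i, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[i] = 1
    return bitmaps

print(bitmap_encode(["M", "F", "M", "M"]))
# {'M': [1, 0, 1, 1], 'F': [0, 1, 0, 0]}
```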
DELTA ENCODING
Record the difference between values that follow each other in the same column.
- Store the base value in-line or in a separate look-up table.
- Combine with RLE to get even better compression ratios (see the recap and sketch below).
RLE RECAP
RLE (Run-Length Encoding) is a simple lossless compression method.
It replaces consecutive repeated characters (or data values) with a single value plus a repetition count. For example:
- Input sequence: AAAAAABBBCCDAA
- RLE-compressed: 6A3B2C1D2A
Combining delta encoding with RLE improves the compression ratio further: record the difference between consecutive values in a column, then run RLE over those differences. For example:
- Original data: 10, 10, 10, 12, 12, 12, 12, 15
- Differences: 0, 0, 0, 2, 0, 0, 0, 3
- RLE over the differences: 3x0, 1x2, 3x0, 1x3
The differences take far less space to store, and running RLE over them improves the compression significantly.
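The worked example above translates directly into code. A sketch that stores the base value separately and feeds the differences into the `rle_encode` helper from earlier:

```python
def delta_encode(column):
    """Store the base value separately, then each value as the
    difference from its predecessor."""
    base = column[0]
    deltas = [cur - prev for prev, cur in zip(column, column[1:])]
    return base, deltas

base, deltas = delta_encode([10, 10, 10, 12, 12, 12, 12, 15])
print(base, deltas)  # 10 [0, 0, 2, 0, 0, 0, 3]
# Feeding the deltas into rle_encode() from earlier compresses the runs of 0s:
# rle_encode(deltas) -> [(0, 0, 2), (2, 2, 1), (0, 3, 3), (3, 6, 1)]
```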
INCREMENTAL ENCODING
Type of delta encoding that avoids duplicating common prefixes/suffixes between consecutive tuples. This works best with sorted data.
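A sketch of prefix-based incremental encoding over a sorted list of strings, encoding each value as a (shared-prefix length, suffix) pair; the word list is only an illustration:

```python
def prefix_encode(sorted_values):
    """Encode each string as (shared-prefix length, remaining suffix)
    relative to its predecessor. Works best on sorted data."""
    encoded, prev = [], ""
    for value in sorted_values:
        shared = 0
        limit = min(len(prev), len(value))
        while shared < limit and prev[shared] == value[shared]:
            shared += 1
        encoded.append((shared, value[shared:]))
        prev = value
    return encoded

print(prefix_encode(["rob", "robbed", "robbing", "robot"]))
# [(0, 'rob'), (3, 'bed'), (4, 'ing'), (3, 'ot')]
```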
DICTIONARY COMPRESSION
Build a data structure that maps variable-length values to a smaller integer identifier.
Replace those values with their corresponding identifier in the dictionary data structure.
- Need to support fast encoding and decoding.
- Need to also support range queries.
Most widely used compression scheme in DBMSs.
A dictionary needs to support two operations:
- Encode/Locate: For a given uncompressed value, convert it into its compressed form. (uncompressed -> compressed)
- Decode/Extract: For a given compressed value, convert it back into its original form. (compressed -> uncompressed)
No magic hash function will do this for us.
ORDER-PRESERVING ENCODING
Assign codes so that compressed values sort in the same order as the original values, which lets the DBMS evaluate range predicates directly on compressed data:
- Step-1: Map each value directly to a code.
- Step-2: Deduplicate and sort the values, then assign codes in sorted order.
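A minimal sketch of an order-preserving dictionary built exactly as in Step-2 (the class and method names are illustrative):

```python
class OrderPreservingDict:
    """Deduplicate and sort the values, then assign codes in sorted
    order so that code order matches value order."""

    def __init__(self, column):
        self.values = sorted(set(column))               # Step-2: dedupe + sort
        self.codes = {v: i for i, v in enumerate(self.values)}

    def encode(self, value):    # uncompressed -> compressed
        return self.codes[value]

    def decode(self, code):     # compressed -> uncompressed
        return self.values[code]

d = OrderPreservingDict(["andrea", "matt", "andy", "prashanth", "matt"])
print(d.encode("matt"), d.decode(2))   # 2 matt
# Because codes preserve sort order, a range predicate such as
# name >= 'andy' becomes code >= d.encode('andy') on compressed data.
```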
CONCLUSION¶
- It is important to choose the right storage model for the target workload
- OLTP (transactional access): row store
- OLAP (analytics): column store
- DBMSs can combine different approaches for even better compression
- Dictionary encoding is probably the most useful scheme because it does not require pre-sorting
Problem #1: How the DBMS represents the database in files on disk. (✅)
Problem #2: How the DBMS manages its memory and moves data back and forth from disk. (Next Class)