LOB/BLOB Chain Formats

When a column value exceeds the space available on an INDEX page, InnoDB stores the overflow data on dedicated BLOB or LOB pages. Understanding these formats is necessary for inspecting externally stored columns and tracing data chains across pages.

When Does InnoDB Use External Storage?

InnoDB stores column data externally when:

  • A row's total size exceeds roughly half the page size (keeping at least two rows per page for B+Tree efficiency)
  • The column is a large BLOB, TEXT, JSON, or LONGBLOB value
  • The row format is DYNAMIC or COMPRESSED (MySQL 5.7+), which stores only a 20-byte external pointer on the INDEX page

The INDEX page record contains a 20-byte BLOB reference (also called an "extern field ref") pointing to the first overflow page:

BLOB Reference (20 bytes on INDEX page):
+----------+---------+----------+--------+
| space_id | page_no | offset   | length |
| 4 bytes  | 4 bytes | 4 bytes  | 8 bytes|
+----------+---------+----------+--------+

Old-Style BLOB Pages (Pre-8.0)

Three page types handle old-style externally stored data:

TypeValueDescription
BLOB10Uncompressed overflow data
ZBLOB11First page of a compressed BLOB chain
ZBLOB212Subsequent pages of a compressed BLOB chain

BLOB Page Header (8 bytes)

Each old-style overflow page has a simple header at FIL_PAGE_DATA (byte 38):

OffsetSizeFieldDescription
04part_lenNumber of data bytes stored on this page
44next_page_noNext overflow page number (FIL_NULL = last page)

The data immediately follows the 8-byte header. With the default 16K page size, each BLOB page carries up to 16,338 bytes of payload (16384 - 38 header - 8 BLOB header).

Chain Traversal

Old-style BLOB chains are singly linked lists:

INDEX page          BLOB page 5        BLOB page 12       BLOB page 20
+------------+      +------------+      +------------+      +------------+
| extern ref |----->| part_len   |      | part_len   |      | part_len   |
| page_no=5  |      | next=12    |----->| next=20    |----->| next=NULL  |
+------------+      | data...    |      | data...    |      | data...    |
                    +------------+      +------------+      +------------+

Walk the chain by reading next_page_no until it equals FIL_NULL (0xFFFFFFFF).

New-Style LOB Pages (MySQL 8.0+)

MySQL 8.0 introduced a richer LOB format with MVCC support at the LOB level. This format uses four new page types:

TypeValueDescription
LOB_FIRST22First page of an uncompressed LOB
LOB_DATA23Data page in an uncompressed LOB
LOB_INDEX24Index page linking LOB data pages
ZLOB_FIRST25First page of a compressed LOB
ZLOB_DATA26Data page in a compressed LOB
ZLOB_FRAG27Fragment page for small compressed LOBs
ZLOB_FRAG_ENTRY28Fragment entry index page
ZLOB_INDEX29Index page for compressed LOB

LOB First Page Header (12 bytes)

The first page of a new-style LOB contains metadata at FIL_PAGE_DATA:

OffsetSizeFieldDescription
01versionLOB format version
11flagsLOB flags
24data_lenTotal uncompressed data length
66trx_idTransaction ID that created the LOB

After the header, the first page contains LOB index entries that describe where the data chunks reside.

LOB Index Entries (60 bytes each)

LOB index entries form a doubly-linked list and describe individual data chunks:

OffsetSizeFieldDescription
06prev_nodePrevious entry (page_no + offset)
66next_nodeNext entry (page_no + offset)
121versionsNumber of versions of this entry
146trx_idTransaction that created this chunk
204trx_undo_noUndo record number
244page_noPage containing the data for this chunk
284data_lenLength of data on the referenced page
324lob_versionLOB version number

This structure enables MVCC for LOBs -- different transactions can see different versions of the same LOB by following different index entry chains.

LOB Data Page Header (11 bytes)

Each LOB_DATA page has a minimal header:

OffsetSizeFieldDescription
01versionLOB format version
14data_lenBytes of data stored on this page
56trx_idTransaction that wrote this page

Compressed LOBs

Compressed LOB pages (ZLOB_FIRST, ZLOB_DATA) use the same header layout as their uncompressed counterparts. The data_len field in the ZLOB first page header represents the total uncompressed length, while the data stored on individual ZLOB_DATA pages is zlib-compressed.

ZLOB_FRAG pages handle small compressed LOBs that fit within a single page fragment, avoiding the overhead of a full LOB chain.

Inspecting LOB Chains with inno

# Show all LOB pages in a tablespace
inno pages -f table.ibd -t blob

# Inspect LOB chain starting from a specific page
inno pages -f table.ibd --lob-chain -p 5

# JSON output for programmatic analysis
inno pages -f table.ibd --lob-chain -p 5 --json

The web UI Pages tab includes a LOB Chain panel that visualizes the chain when you select a LOB start page (BLOB, ZBLOB, ZBLOB2, LOB_FIRST, or ZLOB_FIRST).

Source Reference

  • LOB page parsing: src/innodb/lob.rs -- BlobPageHeader, LobFirstPageHeader, LobDataPageHeader, LobIndexEntry
  • LOB chain traversal: src/innodb/lob.rs -- walk_blob_chain(), walk_lob_chain()
  • WASM LOB chain: src/wasm.rs -- analyze_lob_chain()
  • MySQL source: storage/innobase/lob/lob0lob.cc, lob0first.cc