Introduction
This post introduces the fundamentals of PostgreSQL’s data page (block) mechanism and how data retrieval works from a high-level perspective (as of PostgreSQL version 16). By understanding physical data storage and access methods, you will gain valuable insight into the core scan methods that PostgreSQL uses. This blog serves as a foundational guide for anyone looking to dive deeper into the internal workings of PostgreSQL.
Internal Layout of PostgreSQL Page Files
PostgreSQL organizes each data file into fixed-size units known as pages (also called blocks). By default, each page is 8,192 bytes (8 KB) in size. These are the characteristics of a page:
- has a sequentially growing block number (starting from 0)
- is created and appended (aka extended) by PostgreSQL when the existing pages become full
A page contains a series of header fields that describe the properties of the page, followed by a series of tuples. The most important ones to know are:
- pd_lsn (Log Sequence Number) – Indicates the WAL (Write-Ahead Logging) location of the last change made to this page. This is crucial for crash recovery and replication, ensuring that the page’s contents are in sync with the WAL stream.
- pd_checksum – Stores a checksum of the page to detect corruption.
- pd_flags – Bitmask flags describing the status of the page, such as whether the page contains deleted tuples, has been pruned, etc.
- pd_lower – Offset to the end of the line pointer array. It indicates where the free space begins—from the top downward.
- pd_upper – Offset to the start of the tuple area. It shows where the free space ends—from the bottom upward.
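To make the header layout concrete, here is a minimal Python sketch that unpacks these fields from a raw 8 KB page, assuming a little-endian machine and the field order of PostgreSQL’s PageHeaderData struct. The synthetic “freshly initialized” page below is illustrative only, not a real data file:

```python
import struct

PAGE_SIZE = 8192
# pd_lsn, pd_checksum, pd_flags, pd_lower, pd_upper,
# pd_special, pd_pagesize_version, pd_prune_xid
HEADER_FMT = "<QHHHHHHI"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes

def parse_page_header(page: bytes) -> dict:
    (pd_lsn, pd_checksum, pd_flags, pd_lower,
     pd_upper, pd_special, pd_psv, pd_prune_xid) = struct.unpack_from(HEADER_FMT, page, 0)
    return {
        "pd_lsn": pd_lsn,
        "pd_checksum": pd_checksum,
        "pd_flags": pd_flags,
        "pd_lower": pd_lower,
        "pd_upper": pd_upper,
        # the gap between pd_upper and pd_lower is the free space
        "free_space": pd_upper - pd_lower,
    }

# Synthetic empty heap page: header only, no line pointers or tuples yet,
# so pd_lower points just past the header and pd_upper to the page end.
empty = struct.pack(HEADER_FMT, 0, 0, 0, HEADER_SIZE,
                    PAGE_SIZE, PAGE_SIZE, PAGE_SIZE | 4, 0)
empty += b"\x00" * (PAGE_SIZE - len(empty))
hdr = parse_page_header(empty)
```

On a real cluster, the pageinspect extension’s page_header() function reports the same fields without hand-parsing the file.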
You can treat a tuple as a row of a table containing multiple column values. Tuples are referenced by line pointers, each pointing to the beginning of a tuple’s data. The image below shows that PostgreSQL preallocates each page and appends tuples from the end toward the beginning.
This architectural design (tuples growing from the bottom, line pointers growing from the top) enables efficient space reuse within a page while maintaining fast tuple access.
The gap between pd_lower and pd_upper represents the free space, which PostgreSQL tracks and utilizes to insert new tuples without having to allocate new pages unnecessarily.

TID-Based Tuple Location
PostgreSQL organizes each data file into pages and arranges data within each page to enable efficient access, as described earlier.
A quick recap here:
- Heap Tuples: These are the actual row data, stored starting from the bottom of the page and stacked upwards.
- Line Pointer Array (Item Pointers): Located at the top of the page, this is an array of 4-byte pointers. Each pointer references a heap tuple at the bottom of the page. This is the basis of TID as described next.
- Tuple Identifier (TID): A TID consists of a block number and an offset number. It provides a direct reference to a specific tuple’s location within a table.
PostgreSQL uses TIDs internally to quickly locate row data. Indexes rely heavily on TIDs to efficiently point to the exact location of a tuple on disk, enabling fast lookups. For example, a TID value with block number = 0 and offset = 2 of a table having OID = 16390 under the database OID 16384, means that the tuple data is available at the zeroth block of data file base/16384/16390 at line pointer 2 (see image below).
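This mapping can be sketched in Python. The helper name below is hypothetical, and RELSEG_SIZE reflects PostgreSQL’s default 1 GB segment size (131,072 blocks of 8 KB); on a real cluster you would resolve the path with pg_relation_filepath() instead:

```python
BLOCK_SIZE = 8192
RELSEG_SIZE = 131072  # blocks per 1 GB segment file (default build setting)

def tid_to_location(db_oid: int, rel_filenode: int, block: int, offset: int):
    """Map a TID (block, offset) to the data file and the byte position
    of the page containing the tuple. Illustrative sketch only."""
    segment = block // RELSEG_SIZE                      # large tables are split into segments
    path = f"base/{db_oid}/{rel_filenode}" + (f".{segment}" if segment else "")
    page_start = (block % RELSEG_SIZE) * BLOCK_SIZE     # byte offset of the page
    return path, page_start, offset                     # offset = line pointer number (1-based)

# The example from the text: TID (0, 2) in table OID 16390, database OID 16384
path, page_start, lp = tid_to_location(16384, 16390, 0, 2)
```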

Sequential Scan Method
Building on the earlier sections: a table is a chain of fixed-size pages, and each page has line pointers that point to tuples. Every tuple has a TID (block number + offset). A sequential scan simply walks the pages in order (0, 1, 2, …) and, on each page, walks every line pointer to access the tuples.
- What it is: PostgreSQL reads the table from start to finish, page by page, tuple by tuple.
- When it’s used: You want to read many or all rows (e.g., SELECT * FROM mytable;).
- How it behaves: It reads pages in order, which is friendly to storage; for big tables, PostgreSQL can split the work across parallel workers.
Pros and Cons:
- Pros: Simple, steady, and great for large reads.
- Cons: It touches every page, so it’s inefficient if you only need a few rows.
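The scan loop described above can be sketched over a toy in-memory model of a table; the names and data structures here are illustrative, not PostgreSQL’s actual implementation:

```python
# Toy model: a "table" is a list of pages; each page is a list of tuple
# slots reachable through line pointers (None marks an unused/dead slot).
def seq_scan(table, predicate):
    for block_no, page in enumerate(table):           # walk pages 0, 1, 2, ...
        for lp, tup in enumerate(page, start=1):      # walk line pointers 1, 2, ...
            if tup is not None and predicate(tup):
                yield (block_no, lp), tup             # TID (block, offset) + tuple

table = [
    [{"a": 1}, {"a": 5}],          # block 0
    [{"a": 5}, None, {"a": 9}],    # block 1
]
hits = list(seq_scan(table, lambda t: t["a"] == 5))
```

Note how every page and every line pointer is visited, regardless of how many rows actually match.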
This image illustrates the process of a sequential scan in PostgreSQL:

Index Scan + Fetch Method
Unlike a sequential scan, an index scan + fetch traverses the index structure to find the index entries that satisfy the search condition (the WHERE clause provided). Each index entry normally contains a TID value that PostgreSQL can use to quickly fetch the tuple data and return it to the user.
Using a B-tree index as an example, each index entry stores a key and a TID. For a query like SELECT * FROM mytable WHERE a = 5;, PostgreSQL looks up 5 in the index, collects the matching TIDs, and jumps straight to those tuples in the table to return them.
The benefit is obvious: it skips the rest of the table, so you avoid a full scan when you only need a small amount of data out of a large table.
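The lookup-then-fetch flow can be sketched with a sorted list standing in for the B-tree’s leaf entries (a real B-tree descends internal pages to reach the leaves; the binary search here is only a stand-in for that descent, and all names are illustrative):

```python
import bisect

# Toy index: sorted (key, tid) pairs, mimicking B-tree leaf entries.
def index_scan_fetch(index, table, key):
    keys = [k for k, _ in index]
    i = bisect.bisect_left(keys, key)        # "descend" to the first matching entry
    while i < len(index) and index[i][0] == key:
        block, lp = index[i][1]              # TID stored in the index entry
        yield table[block][lp - 1]           # jump straight to the tuple via its TID
        i += 1

table = [
    [{"a": 1}, {"a": 5}],   # block 0
    [{"a": 5}, {"a": 9}],   # block 1
]
index = [(1, (0, 1)), (5, (0, 2)), (5, (1, 1)), (9, (1, 2))]
rows = list(index_scan_fetch(index, table, 5))
```

Only the two matching tuples are fetched; the rest of the table is never touched.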
The image below illustrates the process of an index scan + fetch:

Tuple Visibility
Up until now, we have discussed the heap tuple storage structure and common scan methods, but we have not yet mentioned Tuple Visibility. This is one of the most important concepts in PostgreSQL (and in most relational database systems, in fact), relating to transaction isolation and consistency.
PostgreSQL uses MVCC (Multi-Version Concurrency Control) so readers don’t block writers and everyone sees a consistent snapshot of the database. Every row version (a “tuple”) carries two transaction IDs:
- xmin (who created it)
- xmax (who deleted or superseded it)
When a query starts, it takes a snapshot of which transactions are in progress. A row is visible to that query if its creator committed before the snapshot and it hasn’t been deleted by a transaction that committed before the snapshot. Updates create a new row version with a fresh xmin and mark the old version’s xmax, so concurrent readers can still see the old data.

VACUUM later removes row versions that no active snapshot can see, and uses the visibility map to mark pages where all rows are visible (enabling fast index-only scans). Hint bits cache commit status on rows to avoid repeated lookups, and freezing replaces very old transaction IDs to prevent wraparound. Together, these pieces give PostgreSQL its “readers don’t wait on writers” behavior while cleaning up obsolete versions in the background.
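A heavily simplified sketch of this visibility rule, ignoring hint bits, command IDs, and subtransactions (all names are illustrative, and commit state is modeled as plain sets rather than the real commit log):

```python
def is_visible(tup: dict, snapshot: dict, committed: set) -> bool:
    """Is this row version visible to a query using the given snapshot?
    Simplified model: only xmin/xmax and the snapshot's in-progress set."""
    xmin, xmax = tup["xmin"], tup["xmax"]
    # The creator must have committed and not be in progress for this snapshot.
    created = xmin in committed and xmin not in snapshot["in_progress"]
    if not created:
        return False
    if xmax == 0:                       # never deleted or superseded
        return True
    # The deleter must also have committed (and not be in progress) to hide the row.
    deleted = xmax in committed and xmax not in snapshot["in_progress"]
    return not deleted

snapshot = {"in_progress": {105}}       # transaction 105 is still running
committed = {100, 102}
old_version = {"xmin": 100, "xmax": 105}  # being updated by in-flight xact 105
new_version = {"xmin": 105, "xmax": 0}    # the replacement row from xact 105
```

A reader with this snapshot still sees the old version and not the new one, which is exactly the "readers don’t wait on writers" behavior described above.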
We will dive deeper into this topic in later posts, so please stay tuned.
Summary
This blog illustrated the fundamental concepts of PostgreSQL’s data storage structure and its common scan methods, so you now have a clearer picture of how PostgreSQL stores your data and the steps it takes to scan your data set and find the rows you want in the shortest amount of time possible.
Hi, I’m Cary Huang — a tech enthusiast based in Canada. I’ve spent years working with complex production systems and open-source software. Through TechBuddies.io, my team and I share practical engineering insights, curate relevant tech news, and recommend useful tools and products to help developers learn and work more effectively.





