
Introduction
Multi-Version Concurrency Control (MVCC) is the foundation of PostgreSQL's ability to handle concurrent transactions without locking. Unlike traditional databases that use write locks, PostgreSQL maintains multiple versions of data, allowing readers and writers to operate simultaneously.
The Tuple Header Structure
Every row in PostgreSQL (called a "tuple") has a header that contains critical MVCC information:
-- Tuple header fields (internal):
-- xmin: Transaction ID that created this tuple
-- xmax: Transaction ID that deleted/updated this tuple
-- cmin: Command ID within creating transaction
-- cmax: Command ID within deleting transaction
-- ctid: Physical location of this tuple version
-- View tuple information
SELECT xmin, xmax, cmin, cmax, ctid, *
FROM your_table
WHERE id = 1;Transaction IDs and Wraparound
PostgreSQL uses 32-bit transaction IDs (about 4 billion). While this seems large, it can wrap around. PostgreSQL handles this with a special "frozen" transaction ID (FrozenTransactionId).
-- Check transaction ID age
SELECT datname, age(datfrozenxid)
FROM pg_database;
-- Freeze old tuples to prevent wraparound
VACUUM FREEZE your_table;
-- Set autovacuum freeze parameters
ALTER TABLE your_table
SET (autovacuum_freeze_max_age = 2000000000);How Snapshots Work
A snapshot determines which tuple versions are visible to a transaction. When you start a transaction, PostgreSQL captures the current transaction state:
-- Get current snapshot
SELECT txid_current(), txid_current_snapshot();
-- Snapshot format: xmin:xmax:xip_list
-- Example: 100:100:0,99,98
-- xmin = 100 (oldest active tx)
-- xmax = 100 (no future transactions yet)
-- xip_list = 0,99,98 (in-progress txs)
-- The snapshot says:
-- - All txs < 100 are committed
-- - Txs 98, 99 are in-progress
-- - Tx 100 is the current transactionVisibility Rules
PostgreSQL follows specific rules to determine if a tuple is visible:
- If xmin is in progress: Not visible (unless same transaction)
- If xmin is committed after snapshot: Not visible
- If xmax is committed and same tx as xmin: Not visible
- If xmax = 0 (not deleted): Visible
- If xmax is in progress: Not visible
-- Examine visibility for a specific tuple
SELECT
xmin::text::bigint as creating_tx,
xmax::text::bigint as deleting_tx,
CASE WHEN xmax = 0 THEN 'alive' ELSE 'deleted' END as status,
*,
pg_column_size(*) as row_size
FROM users
WHERE id = 1;
-- Check which transactions are visible
SELECT xmin, xmax, * FROM t;Update is Delete + Insert
In PostgreSQL, an UPDATE doesn't modify the existing tuple. Instead, it:
- Marks the old tuple as deleted (sets xmax to current transaction)
- Creates a new tuple with the new values
-- What happens during UPDATE:
-- Original tuple: xmin=50, xmax=0
-- After UPDATE (tx 100):
-- Old tuple: xmin=50, xmax=100 (marked deleted)
-- New tuple: xmin=100, xmax=0 (new version)
-- This is why VACUUM is essential!
-- Dead tuples accumulate until vacuumed
-- Monitor dead tuples
SELECT schemaname, relname, n_dead_tup,
last_vacuum, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;VACUUM and Visibility Map
PostgreSQL uses two important maps to track which pages can be vacuumed:
- Visibility Map (VM): Tracks which pages are all-visible
- Free Space Map (FSM): Tracks available space in pages
-- Check visibility map
SELECT
relname,
relkind,
pg_size_pretty(pg_table_size(oid)) as size,
n_live_tup,
n_dead_tup,
last_autovacuum
FROM pg_class
JOIN pg_stat_user_tables
ON oid = relid
WHERE relname LIKE 'users%';
-- Manual vacuum (reclaims space, updates FSM)
VACUUM your_table;
-- VACUUM FULL (rewrites table, locks heavily)
VACUUM FULL your_table;Production Best Practices
- Monitor dead tuples: High dead tuple counts indicate vacuum issues
- Configure autovacuum: Tune based on write patterns
- Use appropriate vacuum settings: For high-write tables, reduce vacuum threshold
- Plan for transaction ID wraparound: Monitor age() on critical tables
Conclusion
Understanding MVCC is essential for PostgreSQL DBA. It explains why your tables grow, why VACUUM is critical, and how PostgreSQL achieves its excellent concurrency characteristics. Proper MVCC management is key to maintaining performance at scale.
ByteHubble Engineering
Database Intelligence Platform
Helping teams build better databases with AI-powered insights.