Transaction & Concurrency Control
Transactions and concurrency in DBMS ensure data consistency and integrity when multiple operations run simultaneously.
What is a Transaction?
A transaction is a sequence of database operations treated as one logical unit of work. Either all operations succeed, or none of them take effect.
Example:
- Debit money from one account
- Credit money to another account
If either step fails, the database must rollback to the original state.
Why Transactions Are Necessary
Databases are shared systems with many users, queries, and possible failures.
Without transactions:
- Data could be corrupted by partial updates
- Crashes could leave the database inconsistent
- Concurrent users could overwrite each other’s work
Transactions ensure operations remain safe, consistent, and predictable.
ACID Properties
Transactions are governed by ACID, the core correctness guarantees of a DBMS.
Atomicity - A transaction is all or nothing. If one operation fails, all changes are undone.
Consistency - A transaction takes the database from one valid state to another valid state. All constraints, rules, and invariants must hold after commit.
Isolation - Concurrent transactions must not interfere with each other. Each transaction should behave as if it were running alone, even when many run in parallel.
Durability - Once a transaction commits, its changes persist even after crashes or restarts This is usually achieved using logs and persistent storage.
Transaction States
A transaction typically moves through these states:
- Active - executing operations
- Partially Committed - last statement executed
- Committed - changes made permanent
- Failed - error occurred
- Aborted - rolled back to original state
Understanding states helps reason about failures and recovery.
Concurrency Problems
When transactions run simultaneously, several anomalies can occur.
Dirty Read - One transaction reads data written by another uncommitted transaction. If the writer rolls back, the reader has seen invalid data.
Non-Repeatable Read - A transaction reads the same row twice and gets different values because another transaction modified it in between.
Phantom Read - A transaction re-runs a query and sees new rows inserted by another transaction. These problems arise purely due to concurrency, not bad queries.
Lost Update - Two transactions update the same data. One update overwrites the other. Result: one change silently disappears.
Concurrency Control Techniques
DBMS uses concurrency control to prevent these anomalies.
Lock-Based Protocols
Transactions acquire locks before accessing data.
- Shared lock (S) - for reading
- Exclusive lock (X) - for writing
Rules ensure that No conflicting operations occur and Consistency is preserved.
Downside: locks can cause deadlocks. Deadlocks - A deadlock occurs when T1 waits for T2 while T2 waits for T1. DBMS detects deadlocks and resolves them by aborting one transaction.
Timestamp-Based Protocols
Each transaction gets a timestamp. Operations are ordered based on timestamps to ensure correctness without locks. Useful in systems where conflicts are rare.
MVCC (Multi-Version Concurrency Control)
Instead of overwriting data:
- DBMS keeps multiple versions
- Readers see older committed versions
- Writers create new versions
Used by modern databases like PostgreSQL. MVCC improves read performance and reduces lock contention.
Isolation Levels
SQL defines isolation levels that control how much isolation a transaction gets.
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read | Lost Update |
|---|---|---|---|---|
| Read Uncommitted | ✅ | ✅ | ✅ | ✅ |
| Read Committed | ❌ | ✅ | ✅ | ✅ |
| Repeatable Read | ❌ | ❌ | ✅ | ✅ |
| Serializable | ❌ | ❌ | ❌ | ❌ |
Higher isolation = fewer anomalies, but lower concurrency. So only choose Serializable if system needs strict correctness and not concurrency.
Transaction Querying in SQL
Transactions are controlled using:
BEGIN;
-- SQL operations
COMMIT;
Or, on failure:
ROLLBACK;
Savepoints allow partial rollbacks inside a transaction.
BEGIN;
INSERT INTO orders VALUES (...);
SAVEPOINT sp1;
INSERT INTO payments VALUES (...);
ROLLBACK TO sp1; -- undo only the payment insert
COMMIT; -- commit the rest
Recovery & Logging
To guarantee durability and atomicity, a DBMS relies on logging mechanisms:
- Write-ahead logging (WAL) - All changes are recorded in a log before being applied to the database.
- Redo logs - Used to reapply committed changes that might not have been fully written to disk before a failure.
- Undo logs - Used to roll back incomplete or uncommitted transactions.
How Recovery Works?
On restart after a crash, the DBMS scans the logs. Redo is applied to all committed transactions that weren’t fully persisted. Undo is applied to any transactions that were incomplete or rolled back.
Conclusion
So to sum up the chapter
- Transactions → Correctness
- Concurrency Control → Safe parallelism
- Recovery → Crash tolerance
Together, these mechanisms form the backbone of modern database systems. They ensure that thousands of users can read and write data simultaneously without corruption or inconsistency.