ACID in Database – I for Isolation
この記事の目次
Isolation
Overview
In the realm of database management systems, ACID is a fundamental set of properties designed to ensure reliable transactions. ACID stands for Atomicity, Consistency, Isolation, and Durability. In this blog, we will focus on Isolation, one of the core principles, and explore how it is maintained in MySQL.
What is Isolation?
Isolation is a crucial property in database systems that guarantees each transaction operates independently from others. Essentially, it ensures that transactions happening at the same time do not interfere with one another. This property provides mechanisms to manage how changes made by one transaction are visible to others and prevents unintended data conflicts and inconsistencies. By controlling the visibility of updates, Isolation helps maintain a stable and predictable environment where transactions can proceed without negatively impacting the results of others, ensuring that each transaction’s work remains isolated and unaffected by concurrent operations.
We have something called Read phenomena
In database systems, particularly when dealing with concurrent transactions, several read phenomena can occur. These phenomena arise from the interactions between transactions and can impact the accuracy and consistency of the data read during transactions. Understanding how they work can help us a lot.
1/ Dirty read
A dirty read occurs when a transaction reads data that has been modified by another transaction but not yet committed. If the other transaction is rolled back, the data read will be invalid, leading to inconsistencies.
For example:
— Transaction 1 START TRANSACTION; UPDATE accounts SET balance = balance – 100 WHERE account_id = 1; — Transaction 2 START TRANSACTION; SELECT balance FROM accounts WHERE account_id = 1; — Transaction 1 ROLLBACK; — Reverts the change made to the balance — Transaction 2 — The value read by Transaction 2 was never committed and is now inconsistent |
Dirty reads can lead to incorrect data being used for decision-making because the changes might be undone, making the data unreliable.
2/ Non-repeatable read
Non-repeatable reads occur when a transaction reads the same data multiple times and sees different values due to changes made by other transactions between the reads.
For example:
— Transaction 1 START TRANSACTION; SELECT balance FROM accounts WHERE account_id = 1; — Transaction 2 START TRANSACTION; UPDATE accounts SET balance = balance + 100 WHERE account_id = 1; COMMIT; — Transaction 1 SELECT balance FROM accounts WHERE account_id = 1; — The second read returns a different value than the first read |
Non-repeatable reads can cause inconsistencies within a single transaction, as it may rely on data that changes unpredictably during its execution.
3/ Phantom Read
Phantom reads occur when a transaction reads a set of rows that match a certain condition, but another transaction inserts, deletes, or updates rows that match the same condition before the first transaction completes. This results in the first transaction seeing different sets of rows on subsequent reads.
For example:
— Transaction 1 START TRANSACTION; SELECT COUNT(*) FROM orders WHERE order_date = ‘2024-08-15’; — Transaction 2 START TRANSACTION; INSERT INTO orders (order_id, order_date) VALUES (2, ‘2024-08-15’); COMMIT; — Transaction 1 SELECT COUNT(*) FROM orders WHERE order_date = ‘2024-08-15’; — The count may differ from the first read due to the insertion by Transaction 2 |
Phantom reads can affect aggregate results and queries that depend on the number of rows meeting certain criteria, leading to potential discrepancies in the results.
4/ Repeatable Read
While Repeatable Read is an isolation level designed to address non-repeatable reads by ensuring that if a transaction reads a value, it will see the same value if it reads it again within the same transaction, it may still be susceptible to phantoms.
For example:
— Transaction 1 START TRANSACTION; SELECT balance FROM accounts WHERE account_id = 1; — Transaction 2 START TRANSACTION; UPDATE accounts SET balance = balance + 100 WHERE account_id = 1; COMMIT; — Transaction 1 SELECT balance FROM accounts WHERE account_id = 1; — The value remains consistent for the repeatable read, but other phenomena like phantoms can still occur |
Repeatable Reads ensure consistency for individual rows but may not fully address issues related to the insertion of new rows that meet the query criteria.
And Isolation levels
When multiple transactions are happening at the same time in a database, it’s crucial to manage how they interact with each other to keep data accurate and consistent. This is where isolation levels come into play. They define how one transaction sees the changes made by others
1/ Read uncommitted
This is the most relaxed level of isolation. It allows transactions to see changes made by other transactions even if those changes haven’t been saved permanently yet.
What you might see is:
Dirty Reads: You might read data that another transaction is still working on and hasn’t committed. If that transaction rolls back, your data will be incorrect.
Non-repeatable Reads: The data you read can change if another transaction updates it before you’re done.
Phantom Reads: New rows added by other transactions might show up in your results if you run the same query again.
2/ Read committed
This level ensures you only read data that has been fully committed. It prevents you from seeing uncommitted changes from other transactions.
What you might see is:
Dirty Reads: Prevented; you won’t see changes that aren’t committed yet.
Non-repeatable Reads: Can happen; data might change if another transaction updates it before you finish.
Phantom Reads: Can still occur; new rows added by others might appear in your results.
3/ Repeatable read
This level makes sure that if you read a piece of data during a transaction, you’ll get the same data if you read it again later in the same transaction. It prevents non-repeatable reads but may not prevent all issues.
What you might see is:
Dirty Reads: Prevented.
Non-repeatable Reads: Prevented; data remains the same within the transaction.
Phantom Reads: Can occur; new rows added by others may affect your results.
4/ Serializable
This is the strictest isolation level. It makes transactions behave as if they were executed one after another, rather than concurrently. This prevents dirty reads, non-repeatable reads, and phantom reads.
Use this level when data accuracy is critical, and you need to ensure that no other transactions can interfere with your results, even if it impacts performance.
In summary of Isolation levels
Read Uncommitted: Allows dirty reads, non-repeatable reads, and phantoms. Suitable for high performance with acceptable inconsistencies.
Read Committed: Prevents dirty reads but allows non-repeatable reads and phantoms. Balances consistency and performance.
Repeatable Read: Ensures consistency for reads within a transaction but may have phantoms. Good for scenarios needing stable reads.
Serializable: Highest level of isolation, preventing all read phenomena. Ensures the highest data accuracy but can affect performance.
Conclusion
Understanding the Isolation property in database transactions is essential for maintaining data integrity and consistency in environments where multiple transactions occur simultaneously. Isolation ensures that the operations of one transaction do not interfere with those of another, which is crucial for accurate and reliable data management.
Each isolation level offers a different balance between data consistency and performance:
Read Uncommitted provides minimal protection against inconsistencies but offers the highest performance.
Read Committed prevents dirty reads but can still allow non-repeatable reads and phantoms.
Repeatable Read ensures consistent data reads within a transaction and mitigates non-repeatable reads, though it may still experience phantom reads.
Serializable offers the highest level of isolation by eliminating all types of read phenomena, ensuring complete consistency at the cost of potential performance trade-offs.
By selecting the appropriate isolation level you can tailor the database behavior to meet your specific needs, whether it’s prioritizing performance, consistency, or a balance of both.
Thank you for reading.
Other blogs: ACID in databases – A for Atomicity
カテゴリー: