http://www.mssqltips.com/sqlservertip/2977/demonstrations-of-transaction-isolation-levels-in-sql-server/?utm_source=dailynewsletter&utm_medium=email&utm_content=headline&utm_campaign=20130826
Demonstrations of Transaction Isolation Levels in SQL Server
By: Derek Colley | Read Comments (5) | Related Tips: More > Locking and Blocking |
Problem
You are looking at ways of ensuring atomicity and consistency in your database by ensuring transactional integrity. You may want to avoid locks, or ensure problems such as lost updates or dirty reads cannot occur, or otherwise ensure the quality of your data. Transactional isolation is a narrow but somewhat complex subject that might be ideal for your needs.
Solution
This article will cover the five transaction isolation settings - READ UNCOMMITTED, READ COMMITTED (locking), READ COMMITTED (snapshot), REPEATABLE READ and SERIALIZATION. You'll find short explanations on the differences between them with emphasis on practical code examples to demonstrate the effects at different isolation levels. Finally, you'll find some links at the bottom for further reading. Please execute the code in 'Setting up the Test Environment' below in your development database context to get started, or read the 'ACID: Review' section below for a bit of background in the four principles of an RDBMS system.
ACID: Review
Before I dive into transaction isolation, it's worth taking a brief look at four of the fundamental principles underlying relational database management systems. These four principles are referred to as 'ACID', and each letter is an acronym for one property of RDBMS systems that is non-negotiable for the sake of the integrity of the system. These are:
- INSERT INTO dbo.MyTestTable VALUES (1)
- INSERT INTO dbo.MyTestTable VALUES (9)
- INSERT INTO dbo.MyTestTable VALUES ('Hello')
- INSERT INTO dbo.MyTestTable VALUES (3),(3),(3)
Atomic(ity) - The principle that each transaction is 'all-or-nothing', i.e. it either succeeds or it fails, regardless of external factors such as power loss or corruption. On failure or success, the database is left in either the state in which it was in prior to the transaction or a new valid state. The transaction becomes an indivisible unit.
Consistency - The principle that the database executes transactions in a consistent manner, obeying all rules (constraints). For example, consider the following table:
Now consider the following valid transactions that will leave the database in a consistent state:
But the following statements, if executed and allowed to modify data, will leave the database in an inconsistent state, since they violate some constraint (or allowed datatype) of the defined table. Hence an error is returned:
Isolation - This property means that each transaction is executed in isolation from others, and that concurrent transactions do not affect the transaction. This property level is variable, and as this article will discuss, SQL Server has five levels of transaction isolation depending on the requirements of the database.
Durability - This property means that the data written to the database is durable, i.e. it is guaranteed to be in storage and will not arbitrarily be lost, changed or overwritten unless specifically requested. More formally, it means that once a transaction is committed, no event can 'un-commit' the transaction - it is written and cannot be changed retrospectively unless by another transaction.
Setting up the Test Environment
Experiment 1: Read using READ UNCOMMITTED
READ UNCOMMITTED is the most optimistic concurrency isolation option available in SQL Server. It allows a transaction to get the value in a row even when locks are present on the row/object or it hasn't yet been committed to disk. Reads like this are also known as 'dirty reads' since they effectively read from the transaction log rather than disk or cache - the data is unpersisted. (Note if no concurrent transactions are occurring, the read will occur from disk/cache). To show the effects of READ UNCOMMITTED, we can open a transaction as follows:
Now select the value that's being updated using the following (in a separate query window):
Note the value for empSalary reflects the current *uncommitted* value. You can view the intent lock on the key (empID) and the intent exclusive locks on the object containers (the page on which the row is located and the object) imposed by the UPDATE statement using the following:
Now rollback the transaction to reset the EmpSalary for this employee to 22000.00:
Experiment 2: Read using READ COMMITTED (snapshot)
There are two levels of READ COMMITTED isolation, which are locking and snapshot. Locking is the most straightforward, and simply means that once an UPDATE transaction is open, exclusive and intent-exclusive locks are taken out on the page, key range (if appropriate) and object. When reading the row using READ COMMITTED while using locking, the SELECT query used will hang until the value of LOCK_TIMEOUT (session-level parameter, if set) has expired, at which point an error will be returned.
If the value of the database-level option READ_COMMITTED_SNAPSHOT is False, locking mode for READ COMMITTED transactions is the default option. If it is True, then snapshot is the default option unless overridden by the READCOMMITTEDLOCK table hint.
Here's a demonstration of READ COMMITTED isolation with locking, by using the table hint in a database with READ_COMMITTED_SNAPSHOT ON. You don't need the table hint if this value is OFF:
Now in a separate query window:
The query will hang as it is waiting for the key lock on EmpID to be released. Allow the query to execute by issuing in your first window:
READ COMMITTED with snapshot is different from locking. The UPDATE statement above will take the same locks on the objects, but with the SELECT transaction session specifying snapshot isolation, the row returned will be the last row version before the UPDATE statement began. This row version is stored internally and represents the last consistent state of the row. Logically it follows that if you are using row versioning, this capability must be DB-wide, since otherwise the transaction with the UPDATE statement would not know to maintain a version of the row before issuing the UPDATE. Therefore, to use snapshot isolation the option must be set using the ALTER DATABASE statement (note that all database user connections will be killed when doing this).
Note About Row Versioning
Row versioning is an internal feature used by SQL Server to maintain recent copies of rows that have been changed, for the purposes of maintaining table consistency and ensuring better isolation from reads or writes of transactions that concurrently access the same rows. Row versioning, also called 'Row-Level Versioning (RLV)' was first introduced in SQL Server 2005. Historical rows are kept in the 'version store', inside TEMPDB, and each row that has been 'versioned' has a row pointer added to it which allows the query engine to locate the versioned row. Interestingly, the 'inserted' and 'deleted' tables used with triggers and the OUTPUT clause (to name two uses) use a similar method of versioning. There are performance sacrifices made when using this level of transaction isolation - please see the note under 'Next Steps' for more information.
Below is an example of using the READ COMMITTED with snapshot isolation:
Now in a separate query window:
The query will return the last consistent row of data. Note the empSalary column is 22000.00 despite the transaction being open and the update written (but uncommitted). This is correct, and the SELECT is reading from the previous row version, not the present state of the row. This is compliant with the C in ACID - consistency.
Experiment 3: Read using SNAPSHOT isolation
For all intents and purposes, reads using READ COMMITTED - snapshot and SNAPSHOT are almost identical - but not identical. There are some differences when it comes to details and behavior. READ COMMITTED - snapshot will read the most recent consistent row version since the start of the statement being issued, where snapshot isolation will read the most consistent row version since the transaction started. This can cause problems with concurrent transactions since SELECTs inside the transaction that occur later than the COMMIT time of the UPDATE transaction will return an incorrect value. Likewise, update conflicts can occur for the same reason when concurrent updates are attempted. See Experiments 4 and 5 for demonstrations of this, and please refer to Kalen Delaney's excellent text, 'SQL Server 2008 Internals' (Microsoft, 2009, Ch: 10 Pgs: 641-643) for an in-depth discussion of this behavior.
To use snapshot isolation, you must first enable the feature as follows:
Now start the UPDATE again, and issue the SELECT in a separate query window like so:
Now in a separate query window:
You will note that, like READ COMMITTED, the correct snapshot of the data row is returned, yielding empSalary = 22000, which is consistent and correct. Rollback the transaction.
Experiment 4: Read using REPEATABLE READ
The REPEATABLE READ isolation level is similar to the READ COMMITTED isolation level, in that it guarantees the output of uncommitted transactions won't be read by other concurrent transactions. However, if a separate concurrent transaction commits before the first one, it is possible to read the same row twice within the transaction and obtain different values. Likewise it is possible that additional 'phantom' rows could be present depending on the behavior of the concurrent transaction.
Execute the following:
Now while this is executing, execute the following in a separate query window:
Note the results below. Despite the two SELECTs being in one explicit transaction, the empSalary value differs between the individual statements in that transaction. The next isolation level helps to solve this problem.
REPEATABLE READ is the isolation level to use if read requests (note: not updates) are returning inconsistent data *within one transaction*, and consists of a superset of the READ COMMITTED isolation type features (i.e. it encapsulates READ COMMITTED characteristics). Here is an example of using REPEATABLE READ when a concurrent UPDATE is occurring:
Run the below while the above is executing:
You'll notice that the UPDATE transaction is waiting on the SELECT transaction, and that the SELECT transaction yields the correct data if the transaction consistency as a whole is considered. Interestingly though, this still doesn't hold true for phantom rows - it's possible to insert rows into a table and have the rows returned by a calling SELECT transaction even under the REPEATABLE READ isolation level.
Run the below while the above is executing:
To counter this problem, we need to use the SERIALIZABLE isolation level - the toughest of the bunch.
Experiment 5: Serializable Isolation
SERIALIZABLE has all the features of READ COMMITTED, REPEATABLE READ but also ensures concurrent transactions are treated as if they had been run in serial. This means guaranteed repeatable reads, and no phantom rows. Be warned, however, that this (and to some extent, the previous two isolation levels) can cause large performance losses as concurrent transactions are effectively queued. Here's the phantom rows example used in the previous section again but this time using the SERIALIZABLE isolation level:
Run the below while the above is executing:
Next Steps
I highly recommend Chapter 10 ('Transactions and Concurrency') in 'SQL Server 2008 Internals' by Kalen Delaney (Microsoft Press, 2009) as it gives a far more thorough grounding in transactional consistency and related topics than I can here. There's also some links below containing relevant material to broaden your understanding. Thanks for reading!
Note About Performance
I hope this article has been relevant and interesting, but I cannot finish without writing about performance. The reason there are five variable levels of transaction isolation in SQL Server is so the DBA or developer can tailor the isolation level to the type of query (and frequency of query) being performed. Generally, the more pessimistic the isolation level (SERIALIZABLE, READ COMMITTED - locking), the worse the performance of the query operating under that scope. This is plain to see when you consider one example, READ COMMITTED - locking, which forces other queries to wait for the resources being held for the first query. This can cause significant performance delays along the application stack, potentially leading to timeouts or other errors.
However, reducing transaction isolation levels to the most optimistic (READ UNCOMMITTED) is not necessarily a good idea under all circumstances either, for the reasons demonstrated in this article. Some systems, such as finance / banking systems, require absolute data integrity and for this to be maintained, the isolation principle is paramount. You should choose carefully the level of transaction isolation depending on what is required from the query or queries you are writing. Your queries, and associated isolation levels, should always be tested in a suitable test/development environment before deployment to production.
- Snapshot Isolation in SQL Server 2005 (mssqltips): http://www.mssqltips.com/sqlservertip/1081/snapshot-isolation-in-sql-server-2005/
- SQL Server SERIALIZABLE Isolation Level and Duplicate Key Insertion Attempts (mssqltips):http://www.mssqltips.com/sqlservertip/2250/sql-server-serializable-isolation-level-and-duplicate-key-insertion-attempts/
- Isolation Levels in the Database Engine (Books Online): http://msdn.microsoft.com/en-us/library/ms189122(v=sql.105).aspx
Comments
Post a Comment