Little Learnings : Concurrency control and locking

The purpose of concurrency control is to prevent two different users (or two different connections by the same user) from trying to update the same data at the same time. Concurrency control can also prevent one user from seeing out-of-date data while another user is updating the same data.

The following examples explain why concurrency control is needed. For both examples, suppose that your checking account contains $1,000. During the day you deposit $300 and spend $200 from that account. At the end of the day your account should have $1,100.

Example 1: No concurrency control
At 11:00 AM, bank teller #1 looks up your account and sees that you have $1,000. The teller subtracts the $200 check, but is not able to save the updated account balance ($800) immediately.
At 11:01 AM, another teller #2 looks up your account and still sees the $1,000 balance. Teller #2 then adds your $300 deposit and saves your new account balance as $1,300.
At 11:09 AM, bank teller #1 returns to the terminal, finishes entering and saving the updated value that is calculated to be $800. That $800 value writes over the $1300.
At the end of the day, your account has $800 when it should have had $1,100 ($1000 + 300 - 200).
Example 2: Concurrency control
When teller #1 starts working on your account, a lock is placed on the account.
When teller #2 tries to read or update your account while teller #1 is updating your account, teller #2 will not be given access and gets an error message.
After teller #1 has finished the update, teller #2 can proceed.
At the end of the day, your account has $1,100 ($1000 - 200 + 300).
In Example 1, the account updates are done simultaneously rather than in sequence and one update write overwrites another update. In Example 2, to prevent two users from updating the data simultaneously (and potentially writing over each other's updates), the system uses a concurrency control mechanism.

concurrency control mechanisms

The pessimistic concurrency control mechanism is based on locking. A lock is a mechanism for limiting other users' access to a piece of data. When one user has a lock on a record, the lock prevents other users from changing (and in some cases reading) that record. Optimistic concurrency control mechanism does not place locks but prevents the overwriting of data by using timestamps.

PESSIMISTIC vs. OPTIMISTIC concurrency control

Locks and lock modes
A lock is a mechanism for preventing two or more users from doing conflicting operations at the same time. Operations conflict if at least one of the operations involves updating the data (via UPDATE, DELETE, INSERT, ALTER TABLE, and so on). If all the operations are read-only operations (such as SELECT), then there is no conflict.

Pessimistic concurrency control (or pessimistic locking) is called "pessimistic" because the system assumes the worst — it assumes that two or more users will want to update the same record at the same time, and then prevents that possibility by locking the record, no matter how unlikely conflicts actually are.
The locks are placed as soon as any piece of the row is accessed, making it impossible for two or more users to update the row at the same time. Depending on the lock mode (shared, exclusive, or update), other users might be able to read the data even though a lock has been placed. For more details on the lock modes, see Lock modes: shared, exclusive, and update.

Optimistic concurrency control (or optimistic locking) assumes that although conflicts are possible, they will be very rare. Instead of locking every record every time that it is used, the system merely looks for indications that two users actually did try to update the same record at the same time. If that evidence is found, then one user's updates are discarded and the user is informed.
For example, if User1 updates a record and User2 only wants to read it, then User2 simply reads whatever data is on the disk and then proceeds, without checking whether the data is locked. User2 might see slightly out-of-date information if User1 has read the data and updated it, but has not yet committed the transaction.

Choosing concurrency control mechanism
In most scenarios, optimistic concurrency control is more efficient and offers higher performance. When choosing between pessimistic and optimistic locking, consider the following:

Pessimistic locking is useful if there are a lot of updates and relatively high chances of users trying to update data at the same time.
For example, if each operation can update a large number of records at a time (the bank might add interest earnings to every account at the end of each month), and two applications are running such operations at the same time, they will have conflicts.

Pessimistic concurrency control is also more appropriate in applications that contain small tables that are frequently updated. In the case of these so-called hotspots, conflicts are so probable that optimistic concurrency control wastes effort in rolling back conflicting transactions.

Optimistic locking is useful if the possibility for conflicts is very low – there are many records but relatively few users, or very few updates and mostly read-type operations.

Locks and lock modes
A lock is a mechanism for preventing two or more users from doing conflicting operations at the same time. Operations conflict if at least one of the operations involves updating the data (via UPDATE, DELETE, INSERT, ALTER TABLE, and so on). If all the operations are read-only operations (such as SELECT), then there is no conflict.

Little Learnings

Search This Blog

Thursday, April 28, 2011

Concurrency control and locking

No comments:

Post a Comment