Saturday, November 16, 2013

Secrets of the ANTs Data Server 02 -- Lock-Free


ADS was billed as lock-free, and there is an important sense in which it was lock-free. I’ll explain that in a minute, but first I want to explain how it was not lock-free. The word “lock” seems to mean something different to database people than it does to general programmers who work on parallel systems. To non-database programmers there is really no such thing as a truly lock-free program with parallel threads of execution (I’m using “thread” in a generic sense in this section). There are always going to be critical sections --places in the code where two threads might access the same data-- and you have to do something to lock out other threads when one thread is in the critical section.

ADS makes extensive use of lock-free data structures, but it still has  several critical sections and it uses spinlocks to guard those critical sections. Spin locks can waste processing time if not used carefully and under some conditions can cause starvation --a situation where one thread is trying to get in, but other threads keep jumping in first. However, used judiciously, spin locks can be very effective and they were used properly in ADS.

There is another kind of lock that you can’t avoid in a SQL engine like ADS that allows arbitrary user-supplied queries. In such systems, you need some sort of flag that tells you when one transaction has modified a particular value, so if another transaction tries to modify the same value before the first transaction has committed, then this is prevented.

However, what SQL database people typically call locks is more complicated than this. In many SQL databases there is a complex hierarchy of locks that are used during normal UPDATE, INSERT and DELETE statements to prevent inconsistent updating of data and to do so while also avoiding the two typical problems that locks cause: starvation and deadlock.

Starvation is what happens when multiple threads of execution are waiting on the same lock and one of the threads never manages to get the lock. Every time the thread that has the lock releases it, some other thread grabs it and the poor starving thread --for some reason-- is never fast enough or never in the right position to get it.  Traditional database  locks prevent this from happening by using a first-in-first-out queue of threads that are waiting on one particular lock. Threads get in line and get to take the lock in the same order that they originally requested it.

Deadlock happens when there is a cycle of locks like this: thread A has lock 1 and is blocked on lock 0 so thread A can never make progress until lock 0 becomes available. Thread B has lock 2 and is blocked on lock 1. And thread C has lock 0 and is blocked on lock 2. Every one of the threads is blocked, waiting for another thread to release a lock, and every one of the threads is holding the lock that another thread needs to continue. No thread will ever make progress because they are all blocked on a lock that will never be released because the thread that holds it is also blocked.

Traditional database locks typically avoid deadlock by having a set of locks at different levels of exclusion such as “read lock”, “write lock”, “intended write lock”, etc. and a set of rules for taking locks and promoting locks arranged so that if all of the threads follow the rules, then deadlock is impossible.

ADS does not have these sorts of traditional database locks for normal database operations. ADS does use a shared lock/exclusive lock strategy on various schema objects and internal data structures to keep them from being changed while someone is using them, but this isn’t normal database work. During normal database operations (UPDATE, INSERT, DELETE) ADS does block changes to data cells that have already been changed by another transaction (a cell is one column in one row) but it does not have multiple lock levels, does not have any kind of lock data structure (for database updates), and does not queue up transactions that are waiting on a cell. I’ll describe that in the section on MVCC.

ADS also does not use a traditional lock structure to guarantee exclusive access when updating internal data structures like indexes that need to be updated when a cell is updated, nor does it use a lock-free data structure for on-disk indexes. I’ll describe how it does that in the section on phased execution.


No comments:

Post a Comment