Saturday, January 10, 2009

Novell eDirectory DIB Lock - How It Works - Part 2

3) Threads doing LDAP update operations typically do not hold the dib lock for very long, probably only milliseconds. The real issue you are facing is probably a background thread, called the "checkpoint thread", which periodically runs to flush all dirty blocks (blocks which have been modified in cache but not yet written to disk) from cache to disk to establish a database checkpoint. In order to do this, it must obtain the dib lock. While it is holding the lock, all other threads that want to do LDAP update operations will be blocked.

The amount of time it takes for the checkpoint thread to flush dirty blocks to disk (and hence, the amount of time it holds the dib lock) depends on how much dirty cache there is. The more dirty cache there is, the longer it takes.

The checkpoint thread wakes up every second to see if there are dirty cache blocks that need to be written to disk. If no other thread holds the dib lock, and none is waiting to obtain it, the checkpoint thread obtains the lock and starts writing out dirty blocks. If another thread requests the dib lock while it is writing, the checkpoint thread will usually give up the lock immediately so that update operations do not have to wait for it to finish writing out all dirty blocks. However, if the checkpoint thread has not been able to complete a checkpoint (that is, write out all dirty blocks) for a certain period of time, it will not release the dib lock, but will continue writing until all dirty blocks have been written and a checkpoint established.

This "certain period of time" is called the checkpoint interval. The term "checkpoint interval" is a misnomer. It seems to suggest how often the checkpoint thread wakes up and does a checkpoint, but that is NOT what it is. The checkpoint thread is continuously waking up and attempting to complete a checkpoint. The checkpoint interval is simply the longest time that the checkpoint thread will allow to go by without completing a checkpoint. If the last completed checkpoint was too long ago (more than the number of seconds specified by the checkpoint interval), the checkpoint thread holds onto the dib lock and completes the checkpoint before giving up the lock. We sometimes refer to this as "forcing a checkpoint."
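To make the yield-or-force rule concrete, here is a minimal Python sketch of a single wake-up of the checkpoint thread. It is only a toy model of the behavior described above, not FLAIM code: the function name `checkpoint_pass` and all of its parameters are made up for illustration, it assumes the thread has already obtained the dib lock, and it treats "usually gives up the lock immediately" as "always".

```python
def checkpoint_pass(dirty_blocks, seconds_since_checkpoint,
                    checkpoint_interval, writer_arrives_after=None):
    """Model one pass of the checkpoint thread while it holds the dib lock.

    dirty_blocks             -- number of dirty blocks waiting in cache
    seconds_since_checkpoint -- time since the last COMPLETED checkpoint
    checkpoint_interval      -- longest time allowed without a completed checkpoint
    writer_arrives_after     -- hypothetical: an update thread asks for the lock
                                once this many blocks have been written
                                (None means nobody asks)

    Returns (blocks_written, checkpoint_completed).
    """
    # "Forcing a checkpoint": the last completed checkpoint is too old,
    # so the thread keeps the lock no matter who is waiting.
    forced = seconds_since_checkpoint > checkpoint_interval

    written = 0
    while written < dirty_blocks:
        writer_waiting = (writer_arrives_after is not None
                          and written >= writer_arrives_after)
        if writer_waiting and not forced:
            # Give up the dib lock; the checkpoint stays incomplete.
            return written, False
        written += 1  # stands in for writing one dirty block to disk
    return written, True
```

With 100 dirty blocks and a 60 second interval, a writer that shows up after 10 blocks interrupts the pass if the last checkpoint completed 5 seconds ago, but has to wait for all 100 blocks if it completed 61 seconds ago.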

If you reduce the checkpoint interval, the checkpoint thread will "force" a checkpoint more often. This means that it will generally have fewer dirty blocks to write out - because not as much dirty cache can build up in the shorter interval. There is probably a better way to keep dirty cache from building up, though. There are two settings in the _ndsdb.ini file that you can set to control dirty cache buildup: "maxdirtycache" and "lowdirtycache". For example:

maxdirtycache=30000000

lowdirtycache=0

These settings tell the checkpoint thread to not allow more than 30 MB (roughly) of dirty cache to build up. Whenever it sees that more than 30 MB of dirty cache has accumulated, it will lock the dib and write it all out (down to zero - the number specified by the lowdirtycache setting). By setting maxdirtycache to the right value, you make the checkpoint thread force a checkpoint more frequently, but write out a smaller amount of dirty cache each time. This, in effect, reduces the length of time the checkpoint thread holds the dib lock whenever it forces a checkpoint.

Note that this does NOT reduce the overall amount of writing that must be done - it just spreads it out over time - amortizes it, so to speak. Also, although increasing checkpoint frequency "spreads out the writes", it may not improve overall throughput. In fact, overall throughput may actually decrease some, because there is now not as much opportunity for a "piggybacking" effect - which is where multiple update operations update the same block before it is written to disk. Because the checkpoint thread is writing more frequently, a given block that is updated by multiple update operations may now be written to disk multiple times instead of being written out once.
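The trade-off between shorter lock holds and lost piggybacking can be seen in a small simulation. This is a toy model in Python, not a description of FLAIM internals: the function name, the 4096-byte block size, and the assumption that the threshold is checked after every update are all invented for illustration.

```python
def simulate_dirty_cache(updates, max_dirty_bytes, low_dirty_bytes=0,
                         block_size=4096):
    """Replay a list of updated block ids against a dirty-cache limit.

    Returns (forced_checkpoints, blocks_written, largest_flush), where
    largest_flush is the most blocks written in any single flush and is a
    stand-in for the longest time the dib lock is held.
    block_size is an assumed value, not taken from the source.
    """
    dirty = {}  # dirty block ids, oldest first (dicts keep insertion order)
    forced_checkpoints = 0
    blocks_written = 0
    largest_flush = 0

    def flush(down_to_bytes):
        # Write out the oldest dirty blocks until dirty cache <= down_to_bytes.
        written = 0
        while dirty and len(dirty) * block_size > down_to_bytes:
            del dirty[next(iter(dirty))]
            written += 1
        return written

    for block_id in updates:
        # Updating an already-dirty block costs no extra write: piggybacking.
        dirty[block_id] = True
        if len(dirty) * block_size > max_dirty_bytes:
            written = flush(low_dirty_bytes)
            forced_checkpoints += 1
            blocks_written += written
            largest_flush = max(largest_flush, written)

    # Whatever is still dirty at the end goes out in one final checkpoint.
    written = flush(0)
    blocks_written += written
    largest_flush = max(largest_flush, written)
    return forced_checkpoints, blocks_written, largest_flush


# Twelve updates that keep hitting the same four blocks.
updates = [0, 1, 2, 3] * 3
print(simulate_dirty_cache(updates, max_dirty_bytes=4 * 4096))  # (0, 4, 4)
print(simulate_dirty_cache(updates, max_dirty_bytes=1 * 4096))  # (6, 12, 2)
```

With the larger limit, the four blocks are each written once, in a single flush of four blocks. With the smaller limit, no flush is larger than two blocks, but there are six forced checkpoints and twelve block writes in total, because every block goes to disk again each time it is touched.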

I don't know what value you should set maxdirtycache to. You will probably want to do some experimenting. It will be different for every system - depends a lot on your disk system and how efficient it is.

Well, that is probably more than you wanted to get, but I wanted to make sure you had sufficient understanding to try various things intelligently.

As an extra note, you should know that we did some analysis and discovered that there are things that could be done to write out dirty blocks to disk more efficiently. FLAIM does not currently keep the disk channel as busy as it could. On Windows, Andy Hodgkinson implemented some changes that produced dramatic improvements in how quickly FLAIM can write out dirty blocks - a 10- to 20-fold improvement. I believe those same changes could be made on other platforms as well. Currently, FLAIM is being transitioned to India, so the changes would most likely need to be done by engineers in Bangalore. I don't know when or if these changes will make it into a shipping version of eDir.
