Sunday, August 28, 2005

More on Transaction Manager

After many hours of debugging, I finally have a version of the Transaction Manager that successfully handles the BitMgr Test. The BitMgr is modeled after the OneBit Resource Manager described in Transaction Processing: Concepts and Techniques. It uses a single data page containing a set of bits, each of which can be either on or off. Despite its simple design, the BitMgr allows many of the features of a transactional system to be tested.
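To give a feel for how little is needed, here is a rough sketch of a one-bit resource manager in Python. This is not the actual BitMgr code: the names (BitMgrSketch, LogRecord, BitPage) are invented for illustration, the log is just an in-memory list, and rollback simply undoes every update of a transaction without the undo-next bookkeeping that ARIES uses.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    lsn: int
    txid: int
    bit: int
    old: bool             # value before the update (used for undo)
    new: bool             # value after the update (used for redo)
    is_clr: bool = False  # compensation record: redo-only, never undone


@dataclass
class BitPage:
    bits: list
    page_lsn: int = 0     # LSN of the last log record applied to this page


class BitMgrSketch:
    """Hypothetical one-page resource manager; not the real BitMgr."""

    def __init__(self, nbits=8):
        self.page = BitPage([False] * nbits)
        self.log = []

    def _apply(self, txid, bit, value, is_clr=False):
        # Log first, then change the page and stamp it with the record's LSN.
        rec = LogRecord(len(self.log) + 1, txid, bit,
                        self.page.bits[bit], value, is_clr)
        self.log.append(rec)
        self.page.bits[bit] = value
        self.page.page_lsn = rec.lsn
        return rec.lsn

    def set_bit(self, txid, bit, value):
        return self._apply(txid, bit, value)

    def rollback(self, txid):
        # Undo the transaction's updates in reverse order, logging each
        # undo as a compensation record so that it can be redone later.
        to_undo = [r for r in self.log if r.txid == txid and not r.is_clr]
        for rec in reversed(to_undo):
            self._apply(txid, rec.bit, rec.old, is_clr=True)

    def redo(self, page):
        # Replay only the records that the page has not seen yet.
        for rec in self.log:
            if rec.lsn > page.page_lsn:
                page.bits[rec.bit] = rec.new
                page.page_lsn = rec.lsn
```

Even a toy like this exercises logging, undo, and the page LSN check during redo, which is why the one-bit design makes a useful test case.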

Although the ARIES algorithm is described in great detail by its inventors, there are still a few areas that are alluded to but not fully described in the paper. For example:

  1. A post commit action is something that must be done after a successful commit. An example of a post commit action is the deleting of a container after it has been dropped. Since deleting a container is not recoverable, it is necessary to defer this action until it is definitely known that the transaction will commit. The challenge is how to ensure that post commit actions are properly executed, despite failures that may disrupt the commit processing.
  2. ARIES uses the LSN within a page to track the status of an update. This does not map well to actions that are not tied to a page, or to actions that must be redone unconditionally at system restart. As an example, when a container is created, the system should log it in such a way that at restart the container is recreated if it was not successfully created before. The way to do this is to maintain the status of the container in a page. However, this page must itself reside in some container, so we have a recursive situation. Another example is the action of opening a container, which must be redone at system restart unconditionally.
  3. If a transaction has rolled back to a Savepoint, then any locks acquired after the Savepoint was established can be released. However, how can we determine which locks are safe to release? ARIES suggests using the LSN as the Savepoint marker, but the LSN alone cannot be used to determine which locks are safe to release. For instance, read locks are not related to LSNs.
  4. The transaction manager interacts with many other modules, and it is a challenge to ensure both high concurrency and deadlock avoidance between it and those modules.
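For the third point, one possible approach is for the transaction to keep its own list of locks in acquisition order, and for the Savepoint to remember a position in that list alongside the LSN. The sketch below is only an illustration of that idea, not a description of what my Transaction Manager does. TxSketch, Savepoint and their methods are made-up names, the undo of log records is left out, and lock upgrades are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Savepoint:
    lsn: int       # log position, as ARIES suggests
    lock_pos: int  # number of locks held when the savepoint was taken


class TxSketch:
    """Hypothetical transaction object that tracks lock acquisition order."""

    def __init__(self):
        self.last_lsn = 0
        self.locks = []  # (resource, mode) pairs in acquisition order

    def acquire(self, resource, mode):
        # A lock that is already held keeps its original position, so it
        # survives a rollback to any later savepoint. Upgrades are ignored.
        if any(r == resource for r, _ in self.locks):
            return
        self.locks.append((resource, mode))

    def log_update(self):
        # Stand-in for writing a log record; only the LSN is modelled.
        self.last_lsn += 1
        return self.last_lsn

    def savepoint(self):
        return Savepoint(self.last_lsn, len(self.locks))

    def rollback_to(self, sp):
        # Undoing the log records written after sp.lsn is omitted here.
        # Every lock acquired after the savepoint can now be released,
        # including read locks that never produced a log record.
        released = self.locks[sp.lock_pos:]
        del self.locks[sp.lock_pos:]
        return released
```

A short usage example, where the read lock on row2 has no LSN associated with it but is still released:

```python
tx = TxSketch()
tx.acquire("row1", "X")
tx.log_update()
sp = tx.savepoint()
tx.acquire("row2", "S")   # read lock, no log record
tx.acquire("row3", "X")
tx.log_update()
released = tx.rollback_to(sp)
```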
