Sunday, December 20, 2009

Network client server SimpleDBM

I am happy to report that an initial implementation of the network client/server has been checked in. This will allow applications to use a thin SimpleDBM client that connects to a remote SimpleDBM server.

The client interface is a simplified version of the Database API.

The network protocol is based on a simple request/reply model. The messages are passed in the SimpleDBM serialization format.
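
To make the request/reply model concrete, here is a minimal sketch of what a single exchange over a plain socket might look like, assuming a simple length-prefixed framing. The class, method and framing below are illustrative assumptions for this post, not the actual wire protocol; the real messages are encoded using the SimpleDBM serialization format.

import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.Socket;

public class RequestReplyClientSketch {
    // Sketch only: send one request and read one reply, each framed as a
    // 4-byte length prefix followed by the serialized payload.
    public static byte[] sendRequest(String host, int port, byte[] serializedRequest)
            throws IOException {
        Socket socket = new Socket(host, port);
        try {
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            DataInputStream in = new DataInputStream(socket.getInputStream());
            // send the request
            out.writeInt(serializedRequest.length);
            out.write(serializedRequest);
            out.flush();
            // read the reply, framed the same way
            int replyLength = in.readInt();
            byte[] reply = new byte[replyLength];
            in.readFully(reply);
            return reply;
        } finally {
            socket.close();
        }
    }
}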

I looked at the possibility of using a third-party NIO framework for SimpleDBM. However, all the existing frameworks seemed unnecessarily complicated, due to their desire to be generic, so in the end I wrote my own NIO server. It was easier than I expected, though it is still early days and I have not tested all the failure scenarios yet.
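
For readers unfamiliar with NIO, the sketch below shows the general shape of a single-threaded, selector-based accept/read loop. It is a bare-bones illustration of the technique, not the actual SimpleDBM network server, and it omits message framing, reply writing and error handling.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class NioServerSketch {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.socket().bind(new InetSocketAddress(8000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        while (true) {
            selector.select(); // block until a channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    // register the new connection for read events
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // read whatever bytes are available; a real server would
                    // assemble a complete request and send back a reply
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) < 0) {
                        client.close();
                    }
                }
            }
        }
    }
}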

There is no attempt to optimise the network traffic in the current implementation. The main goal right now is to prototype the client interface and get it to a satisfactory level. Optimisation can happen once the interface is stable.

The current implementation does not have any security constraints; security must be handled by the client application.

An example of how the client interacts with the server is shown in the snippet below:


Properties properties = parseProperties("test.properties");
// start a session
SessionManager sessionManager = new SessionManager(properties, "localhost", 8000);
TypeFactory ff = sessionManager.getTypeFactory();
Session session = sessionManager.openSession();
try {
    // create a table definition
    TypeDescriptor employee_rowtype[] = { ff.getIntegerType(), /* pk */
            ff.getVarcharType(20), /* name */
            ff.getVarcharType(20), /* surname */
            ff.getVarcharType(20), /* city */
            ff.getVarcharType(45), /* email address */
            ff.getDateTimeType(), /* date of birth */
            ff.getNumberType(2) /* salary */
    };
    TableDefinition tableDefinition = sessionManager.newTableDefinition(
            "employee", 1, employee_rowtype);
    tableDefinition.addIndex(2, "employee1.idx", new int[] { 0 }, true, true);
    tableDefinition.addIndex(3, "employee2.idx", new int[] { 2, 1 }, false, false);
    tableDefinition.addIndex(4, "employee3.idx", new int[] { 5 }, false, false);
    tableDefinition.addIndex(5, "employee4.idx", new int[] { 6 }, false, false);
    // create table
    session.createTable(tableDefinition);
    // now let's insert/update a row
    session.startTransaction(IsolationMode.READ_COMMITTED);
    boolean success = false;
    try {
        Table table = session.getTable(1);
        Row tableRow = table.getRow();
        tableRow.setInt(0, 1);
        tableRow.setString(1, "Joe");
        tableRow.setString(2, "Blogg");
        tableRow.setDate(5, getDOB(1930, 12, 31));
        tableRow.setString(6, "500.00");
        table.addRow(tableRow);
        TableScan scan = table.openScan(0, null, false);
        try {
            Row row = scan.fetchNext();
            while (row != null) {
                System.out.println("Fetched row " + row);
                row.setString(6, "501.00");
                scan.updateCurrentRow(row);
                row = scan.fetchNext();
            }
        } finally {
            scan.close();
        }
        success = true;
    } finally {
        if (success) {
            session.commit();
        } else {
            session.rollback();
        }
    }

    // now delete all the rows
    session.startTransaction(IsolationMode.READ_COMMITTED);
    success = false;
    try {
        Table table = session.getTable(1);
        TableScan scan = table.openScan(0, null, false);
        try {
            Row row = scan.fetchNext();
            while (row != null) {
                System.out.println("Deleting row " + row);
                scan.deleteRow();
                row = scan.fetchNext();
            }
        } finally {
            scan.close();
        }
        success = true;
    } finally {
        if (success) {
            session.commit();
        } else {
            session.rollback();
        }
    }
} catch (Exception e) {
    e.printStackTrace();
} finally {
    session.close();
}

Sunday, October 11, 2009

Update on Data Dictionary implementation

I described the implementation of the data dictionary in a previous post. I have updated the implementation to include support for dropping tables.

I stated in my last post that the creation of table definitions would be handled as post-commit actions. In the end this wasn't necessary, and the table creation is instead logged against page 0 of the virtual container 0. This container is pre-created with a single page when a new database is created, which allows arbitrary actions to be logged against the page (0,0). Logging as a post-commit action would not have worked, because the definitions must be created before any container that depends on them is created.

I also fixed issues with the logging of table and container creation, so that these are now written as undoable log records; in the event that the transaction aborts, the table creation is undone. I will describe the changes in a separate post.

The action to drop a table definition is handled as a post-commit action, because we want to avoid dropping any object until we are sure that the transaction will commit.
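
The general shape of this pattern is sketched below; the interface and class names are made up for illustration and are not the actual SimpleDBM types.

import java.util.ArrayList;
import java.util.List;

// Sketch only: destructive work is queued with the transaction and executed only
// after the commit is durable; on rollback the queued actions are simply discarded.
interface PostCommitAction {
    void execute();
}

class TransactionSketch {
    private final List<PostCommitAction> postCommitActions =
            new ArrayList<PostCommitAction>();

    void schedulePostCommitAction(PostCommitAction action) {
        postCommitActions.add(action);
    }

    void commit() {
        // 1. write and flush the commit log record (omitted here)
        // 2. only then perform destructive actions such as dropping a table definition
        for (PostCommitAction action : postCommitActions) {
            action.execute();
        }
    }

    void rollback() {
        // nothing was dropped prematurely, so there is nothing to undo
        postCommitActions.clear();
    }
}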

Saturday, July 04, 2009

From Subversion to Mercurial

The SimpleDBM source code repository has been migrated from Subversion to Mercurial. For a somewhat rude and over-the-top presentation on the benefits of distributed version control systems over traditional centralized systems, see Linus Torvalds's presentation on Git.

Sunday, April 12, 2009

What next?

The SimpleDBM core database engine has been out for a while, and although I am still working on improving the robustness of the code, writing more test cases, and so on, it is also time to start thinking about what comes next.

A couple of areas interest me.

A simple SQL layer, so that it becomes easier to write code against the database. Unlike other projects, I am not interested in a full-blown SQL implementation, just enough to cut down on the procedural code one must write. My current thinking is a very basic set of SQL statements for manipulating data in a single table: no joins and the like.

A network server and client, so that instead of embedding SimpleDBM inside the application, it can be accessed remotely; embedded implementations are hard to use in a multi-user environment. I am looking at projects like Netty and MINA as the underlying TCP/IP framework, so that I can concentrate on the functional aspects rather than writing all the networking code.

Tuesday, March 31, 2009

Dwarfs standing on the shoulders of giants

As I mentioned in a previous post, when building SimpleDBM I have used design ideas picked up from various technical research papers as well as from other open source projects. I would like to list some of those influences here and acknowledge my indebtedness.

Papers from IBM's System-R research project have influenced the role of the database engine in SimpleDBM. I have named the core engine RSS in honor of the System-R Relational Storage System (RSS).

A general reference that has proved invaluable is the classic Transaction Processing: Concepts and Techniques, by Jim Gray and Andreas Reuter. The Lock Manager and the Buffer Manager modules are based upon descriptions in this book.

The Lock Manager also uses ideas from the open source project Shore; the handling of lock conversions and the deadlock detector are based upon code from that project.

The Transaction Manager is based upon the algorithms published in the ARIES series of papers by C. Mohan of IBM. I am also indebted to Mohan for the implementation of lock isolation modes and of free space management in containers.

The BTree implementation is based upon the algorithms described by Ibrahim Jaluta et al. in the paper Concurrency control and recovery for balanced B-link trees.

The idea of using write-ahead log records for the normal execution of forward actions is taken from Apache Derby. This is a neat idea that ensures that normal processing and recovery use the same code for performing updates to the database: essentially, all updates are executed through log records, even during normal execution.
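
A minimal sketch of this pattern is shown below; the interfaces and names are invented for illustration and are not the actual Derby or SimpleDBM classes.

// Sketch only: the redo() method of a log record is the single code path that
// modifies a page, and it is invoked both during normal execution and during recovery.
interface Page {
    void setBytes(int offset, byte[] data);
}

interface Redoable {
    void redo(Page page);
}

class InsertRecordLog implements Redoable {
    final int offset;
    final byte[] payload;

    InsertRecordLog(int offset, byte[] payload) {
        this.offset = offset;
        this.payload = payload;
    }

    public void redo(Page page) {
        // the only place where the page is actually modified
        page.setBytes(offset, payload);
    }
}

class ForwardActionSketch {
    void insert(Page page, int offset, byte[] payload) {
        InsertRecordLog logRec = new InsertRecordLog(offset, payload);
        writeToLog(logRec); // write-ahead: the log record is persisted first
        logRec.redo(page);  // normal execution reuses the same redo path as recovery
    }

    void writeToLog(Redoable logRec) {
        // append the record to the write-ahead log (omitted in this sketch)
    }
}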

Projects that I have learned from but that haven't directly influenced the implementation include:

Delayed release

The next BETA release of SimpleDBM has been delayed due to my desire to re-factor some of the code. The re-factoring has been focused on the following:
  • Removing all singletons, even Loggers. Unfortunately, Log4J and other logging packages all use singletons, so this is not completely possible unless I replace the entire logging package.
  • Ensuring that the serialization mechanism uses constructor-based object initialization rather than retrieving state after construction (see the sketch at the end of this post).
  • The above enables the use of final fields in many objects, allowing them to be immutable.
All of these changes are designed to increase robustness, and also to ensure that multiple instances of SimpleDBM can co-exist within the same JVM, and even the same Class Loader, without conflict.
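
As an illustration of the constructor-based initialization mentioned above, here is a minimal sketch; the class, the buffer layout and the method names are invented for the example and are not actual SimpleDBM types.

import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Sketch only: the object reads its state in the constructor, so all fields can be
// final and the object immutable. The layout below is an assumption for illustration.
final class EmployeeKey {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    private final int id;
    private final String name;

    // construct directly from a serialized buffer instead of "create empty, then populate"
    EmployeeKey(ByteBuffer buf) {
        this.id = buf.getInt();
        byte[] bytes = new byte[buf.getShort()];
        buf.get(bytes);
        this.name = new String(bytes, UTF8);
    }

    // write the state back out in the same layout
    void store(ByteBuffer buf) {
        buf.putInt(id);
        byte[] bytes = name.getBytes(UTF8);
        buf.putShort((short) bytes.length);
        buf.put(bytes);
    }
}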