A couple of years ago I was with a project designing a system to manage Investor Meetings.
An Investor must have a valid Pass in order to join a Meeting. Each Pass has a unique serial number.
It was a major issue and lots of things were tried to ensure that different Passes did not have the same number and that numbers were not “lost”. The system had to be “fast”, being able to create lots of Passes for different Meetings for different Organizations at the same time. Serializing transactions on central database tables was not an option.
For some reason, a simple “SELECT … FOR UPDATE” would not do the trick.
At one point it was decided to use database sequences. Each meeting would have a sequence for generating the Pass-numbers.
This proved to a bad idea too, as creating and destroying sequences are relatively slow operations and the system experienced deadlocks from time to time. Also, sequences are generally not rolled back when transactions are, and Pass-numbers would be “lost”.
I decided I would try to figure out why the simple “SELECT … FOR UPDATE” did not work and what database I could actually get it to work with.
This blog-entry is about my findings with different databases: DB2, Oracle, PostgreSQL, Apache Derby and HSQLDB.
Test database setup
The setup contains 3 tables: meeting, pass and pass_next.
A row in the meeting table represents a Meeting. In the actual system this table is very central and contains lots of information related to the Meeting. It is being accessed by lots of different transactions from different parts of the application, so we do not want to serialize access to any rows in this table.
A row in the pass table represents a Pass. Each Pass must have a unique serial number, each Meeting having its own sequence starting from 1.
A row in the pass_next table represents the next number to be used for a Pass for a specific Meeting. This table is managed only by the transactions generating the corresponding rows in the Pass table and it is acceptable for those transactions to hold a brief lock on its “own” row in the pass_next table.
DDL below:
CREATE TABLE meeting ( id INTEGER NOT NULL, info CHARACTER VARYING(50) NOT NULL, CONSTRAINT meeting_pkey PRIMARY KEY (id) ); CREATE TABLE pass ( id INTEGER NOT NULL, meeting_id INTEGER NOT NULL, pass_number INTEGER NOT NULL, CONSTRAINT pass_pkey PRIMARY KEY (id), CONSTRAINT fk_pass_mtng FOREIGN KEY (meeting_id) REFERENCES meeting (id), CONSTRAINT uk_pass_mtng_nmbr UNIQUE (meeting_id, pass_number) ); CREATE TABLE pass_next ( meeting_id INTEGER NOT NULL, pass_number INTEGER NOT NULL, CONSTRAINT pass_next_pkey PRIMARY KEY (meeting_id), CONSTRAINT fk_pass_nxt_mtng FOREIGN KEY (meeting_id) REFERENCES meeting (id) ); |
REPEATABLE READ
The isolation level “REPEATABLE READ” should in principle do the trick. This isolation level ensures that if a given transaction reads the same row a different time it would get the same result. In other words, other transactions are not allowed to commit updates to the row in the meantime – or at least those updates are not visible…
“REPEATABLE READ” with “SELECT … FOR UPDATE” should ensure the transaction gets a “write-lock” on the row and everything should be fine.
Unfortunately, not all databases support the “REPEATABLE READ” isolation level, and the performance is not as good as with “READ COMMITTED”, which is the most common isolation level.
“SERIALIZEABLE” is out of the question, as each transaction would basically hold a lock on the entire table for the duration of the transaction…
In other words, we need to find a solution that works with isolation level “READ COMMITTED”…
Updating a row
The test takes 4 different approaches for updating the pass_next row:
- a simple SELECT followed by an UPDATE
- URS: a simple SELECT where the update is done in the result-set (“updateable resultset”)
- SFU: a SELECT … FOR UPDATE followed by an UPDATE
- SFU+URS: a SELECT … FOR UPDATE where the update is done in the result-set.
Not all the databases supports updateable result-sets or “SELECT … FOR UPDATE”, these tests are skipped for those.
Test results
Simple | URS | SFU | SFU+URS | |
---|---|---|---|---|
DB2 | Failed | 16,52 | Failed | 12,47 |
DB2 (rr) | Failed | 12,9 | 12,69 | 9,57 |
PostgreSQL | Failed | Failed | 25,52 | 54,12 |
HSQLDB | Failed | Skipped | Skipped | Skipped |
HSQLDB (rr) | Failed | Skipped | Skipped | Skipped |
Oracle | Failed | Failed | 45,45 | 46,92 |
Derby | Failed | 22,41 | Failed | 17,2 |
Derby (rr) | Failed | 16,83 | 4,02 | 13,61 |
NOTES:
- (rr) means isolation level “REPEATABLE READ”.
- Timings cannot be compared among databases vendors, as they have not been tested on the same hardware: DB2, HSQLDB and Derby was local instances, Oracle and PostgreSQL remote on different hardware.
- HSQLDB is used in “in-memory” mode. It is tested only because the JUnit tests done in the continuous-integration build in the above mentioned project used it.
Driver/product information:
DB2
jdbc:db2:TST com.ibm.db2.jcc.DB2Driver DB2/NT SQL09050 IBM DB2 JDBC Universal Driver Architecture 3.50.152 |
PostgreSQL
jdbc:postgresql:db_test org.postgresql.Driver PostgreSQL 8.3.3 PostgreSQL Native Driver PostgreSQL 8.3 JDBC4 with SSL (build 603) |
HSQLDB
jdbc:hsqldb:mem:test org.hsqldb.jdbcDriver HSQL Database Engine 1.8.0 HSQL Database Engine Driver 1.8.0 |
Oracle
jdbc:oracle:thin:@judby:1521:tst oracle.jdbc.driver.OracleDriver Oracle Oracle Database 11g Release 11.1.0.0.0 - Production Oracle JDBC driver 11.1.0.6.0-Production |
Derby
jdbc:derby:test;create=true org.apache.derby.jdbc.EmbeddedDriver Apache Derby 10.4.2.0 - (689064) Apache Derby Embedded JDBC Driver 10.4.2.0 - (689064) |
Conclusion(s)
There is not a single conclusion but several. Lets take the most important first:
It is possible to implement simple row-locking with most databases, using JDBC and with isolation level READ COMMITTED.
If the solution must be portable between all the above mentioned databases (except HSQLDB), you must use SFU+URS: Do SELECT … FOR UPDATE and use an updateable result-set.
If your database supports REPEATABLE READ (or you are using PostgreSQL), it is possible to do a SELECT … FOR UPDATE followed by a normal UPDATE… For PostgreSQL and Derby, this even looks as the fastest solution
How
SFU – SELECT … FOR UPDATE is done simply by appending “FOR UPDATE” to your SQL (some ORM tools actually lets you specify it in the meta-data), e.g.:
SELECT pass_number, meeting_id FROM pass_next WHERE meeting_id=? FOR UPDATE |
URS – Updateable result-set:
PreparedStatement ps = con.prepareStatement("... FOR UPDATE", ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE); ps.setInt(1, meetingId); ResultSet rs = ps.executeQuery(); if (rs.next()) { int nxt = rs.getInt(1) + 1; rs.updateInt(1, nxt); rs.updateRow(); } else { // insert... } con.commit(); // resource cleanup... |
The source for the entire test-setup can be downloaded here: Source for the DBLockTest.
Note that you need commons-logging and Apache POI.
This was really helpful. Saved my day.