Flying out of the gate with Derby

This article was originally published in VSJ, which is now part of Developer Fusion.
The need for easily storing and retrieving data is a universal application requirement. If you have developed software for any length of time, the chances are you have needed the service of a relational database system time and time again. Yet, as a Java developer, your choices are rather limited. You can go with a tried-and-tested commercial RDBMS, most likely not written in Java, although usually accessible through 100% Java JDBC connectors. These commercial products often require extensive setup, the continual service of a database administrator for operation and tuning, and will likely require hefty licensing fees for each client that you will deploy.

On the other hand, you may elect to go with an open source Java-based RDBMS implementation. These products are usually designed for embedded application. Support for SQL standard is typically limited, and support for RDBMS features minimal. But their liberal licenses, ease of administration, and trivial code integration often makes up for their lack of features.

It would be ideal to have an open source, 100% Java, easy-to-administer RDBMS that can deliver on some of the features of the commercial products. This article takes a hands-on look at the Apache Derby project – an RDBMS that delivers the best of both worlds.

Derby – a fairy-tale product cycle

Picture for a moment the perfect lifecycle for a commercial-grade software product. Initially, it should be conceived by an entrepreneurial startup; born out of passion, where the designers won’t take “no” for an answer, and innovative solutions rule the day. Once the product has proved itself capable of solving problems for a sizeable customer base, the small company should then be acquired by a larger company. Taking the software into its embrace, the large company will then “harden” the product by adding quality assurance, broadening its scope, adding robust enhancements, and writing high quality documentation. Once the product has reached its peak adoption, then the large company should hand the product over to open source, with a super liberal license for anyone to adopt and modify. You, as the beneficiary can then take this robust code base and include it into your own product, commercial or otherwise, and benefit from all of the blood and sweat that has gone into the code base.

Of course, the ideal software product lifecycle above doesn’t sound like something that would happen in a capitalistic free economy. But once or twice in a lifetime, it does!

The ideal product lifecycle described above is exactly what happened with Apache Derby. Derby was conceived by a small company called Cloudscape; fortified with real-world armour when Cloudscape was sold to Informix; acquired its high-end engineering features during its tenure at “big blue” when IBM acquired it; and now exists as an open source project with one of the most liberal free-to-use licenses around.

Extensive and impressive feature list

Derby’s idyllic upbringing has provided it with an amazing features list. There are many advanced features that are simply not available with other open source Java databases. A partial list of features that sets Derby apart from other Java RDBMS includes:
  • 100% Java implementation
  • 100% Java type 4 JDBC driver
  • SQL92E standard support with many features of SQL 99
  • ACID transaction support with full transaction isolation
  • J2EE support via JNDI, connection pooling, and XA
  • views, temporary tables and savepoints
  • BLOB and CLOB data types
  • row and table locking
  • cost-based query optimization
  • server-side referential constraints
  • triggers and stored procedures
  • store Java code in database, for server-side functions, triggers or stored procedures
  • can work off read-only database on CD-ROM
  • data import and export
  • fast database encryption option
In the code example, you will explore the use of referential constraints, stored procedures, and in-database Java code storage.

Two flexible modes of operation

Derby can work in either an embedded mode, or in the full client/server mode. Figure 1 illustrates the embedded mode of operation.

Figure 1
Figure 1: Embedded mode

In Figure 1, you can see that the Derby engine is embedded within your application. When operating in the embedded mode, your application is accessing the database storage directly and exclusively. This means that no other application will be able to access the database at the same time. The main advantage of operating in the embedded mode is the fact that no network or server setup is required. Since your application includes the Derby engine, users do not even need to be aware that you are using a relational database.

There is no tricky configuration or proprietary APIs to learn when using Derby in the embedded mode. In fact, you only need to:

  1. Make sure that you have the Derby JARs in your application CLASSPATH
  2. Use the embedded JDBC driver
  3. Write your database access code to JDBC
That’s correct, there are no Derby-specific APIs at all. Just code your application to JDBC and it will work fine with Derby. In fact, if you already have JDBC code, chances are that you can switch it to use Derby embedded quite readily.

The network (client/server) mode of operation should be familiar to anyone who had used ORACLE, DB2, SQL Server or MySQL. Figure 2 illustrates this mode of operation.

Figure 2
Figure 2: Client/server mode

In Figure 2, an independent Java process is running Derby as a network server, listening to client connections over the network. This Derby network server has exclusive access to the database storage and can accept multiple simultaneous client connections. This allows multiple users to access the RDBMS at the same time over a network.

While not as convenient as the embedded mode of operation (i.e. you must run and manage a server process separately, perhaps on a different machine), Derby’s client-server mode can provide a viable solution when you need multi-user access.

In Figure 2, note that the client applications are coded to JDBC. There are no Derby-specific APIs. In fact, the very same JDBC application running in embedded mode in Figure 1 can be run in client-server mode with minimal configuration or code change.

Setting up a Derby database

To fully appreciate the ease of setup and use of Derby, you need to experience it for yourself. First, download and install derby. Next, make sure your CLASSPATH environment variable includes:
derby_installation_directory\lib\
	derby.jar
derby_installation_directory\lib\
	derbytools.jar
You will want to edit the supplied sp.bat file to set your derby_installation_directory, and then run it to set up your CLASSPATH.

Once you have set the CLASSPATH, change to a working directory. The database will be created within this directory. Copy the createdb.sql file from the db\createdb directory of the code distribution to this working directory. Next, start the command line interface (called ij) to Derby:

java org.apache.derby.tools.ij
You can also use the ij.bat supplied under the batch subdirectory, be sure to edit the batch file to add your own derby_installation_directory. This should start the command line utility, using the embedded Derby engine. The prompt you see will be similar to:
ij version 10.0
ij>
You can try typing in “help;” and see the commands available. To create the database and table that will be used, just run the createdb.sql file. Use the ij command:
run 'createdb.sql';
The run command will load a script file and execute the Derby commands within it, similar to a batch file for Windows. The content of createtb.sql is shown below.
connect
	'jdbc:derby:vsjdb;create=true';
drop table orders;
drop table custs;
create table custs
	(id char (5) not null,
		name char(40) not null,
			primary key(id));
create table orders
	(id char(8) not null,
		custid char(5) not null,
			total integer,
				primary key(id,custid));
insert into custs values (
	'1', 'John Smith');
insert into custs values (
	'2', 'Mary Todd');
insert into orders values(
	'0001', '1', 39999);
insert into orders values(
	'0002', '1', 2999);
insert into orders values(
	'0003', '1', 1904);
insert into orders values(
	'0004', '2', 3232);
insert into orders values(
	'0005', '2', 109900);
You need to have a semi-colon at the end of every line, exactly the same as if you entered the commands from the command line. The first line of the script is the connect command, with a connection URL that specifies the embedded Derby driver. The connect command in the script is:
connect
	'jdbc:derby:vsjdb;create=true';
The use of jdbc:derby indicates a Derby driver using JDBC protocol, while vsjdb is the name of the databse – to be created and accessed through the embedded driver. The create=true attribute indicates that the database should be created if it does not exist.

This createtb.sql script will create a database called vsjdb (if it does not exist), and then create a custs and orders table in this database. It will also populate the table with two customers and five orders. If you have worked with SQL on other databases, the SQL statements in the script should look very familiar to you.

You can check your working directory, and you will now see a vsjdb sub-directory. This is where the data is kept. A database can be backed up by simply making a copy of this subdirectory.

Table 1 and Table 2 summarise the two tables.

Table 1: The custs table
Field Name Data Type Note
Id char(5) primary key
Name char(40)  

Table 2: The orders table
Field Name Data Type Note
Id char(8) primary key (compound)
Custid char(5) primary key (compound)
Total int  

Working with RDBMS data

To see the customer records, use this SQL SELECT statement, at the ij prompt:
select * from custs;
To see the orders, use the command:
select * from orders;
To delete a customer, try the command:
delete from custs where id='1';
You can select the rows of the custs table again and see that the customer has been deleted. To delete all the orders associated with this customer, use the command:
delete from orders where custid='1';
You can now check the orders table and see that all orders from customer #1 have been deleted.

Before you leave ij, delete all the records from the database:

delete from custs;
delete from orders;
This will empty the database for the next part – writing customer Java code.

Use the exit command to exit from ij:

exit;
Since the connect URL specified the embedded driver, ij had exclusive access to the database all this time.

Embedded Derby in your own Java application

Time to rev up our Java compiler and write some code. This code will put Derby to a little test, giving you some confidence that the RDBMS is capable of dealing with real-world volumes of data.

The code presented below, called FillTable.java, will create 1000 customers with 10 orders each – a grand total of 10,000 orders. You can find the source of FillTable.java under the embed subdirectory.

import java.sql.*;
public class TableFiller {
	Connection conn = null;
	PreparedStatement insertCust
		= null;
	PreparedStatement insertOrder
		= null;
The declarations above are standard JDBC declarations. conn holds the JDBC connection. insertCust and insertOrder will hold the SQL statement for inserting records to the custs and orders table respectively. PreparedStatement is used to give the RDBMS a chance for optimization.

Next, in the code below, driverName contains the class name of the JDBC driver to load and use. This is the EmbeddedDriver. The connection URL is in url. You can see that an absolute path is used to locate the database (note that you always use forward slash as path separator). If you are always going to run this program from your working directory, you can just use “jdbc:derby:vsjdb”. You can also change the location of where Derby looks for the database files by setting the derby.system.home property (for example, java -Dderby.system.home=c:\mydbdir org.apache.derby.tools.ij).

String driverName =
"org.apache.derby.jdbc.EmbeddedDriver";
String url =
	"jdbc:derby:c:/derby1021/vsjdb";
The loadDrivers() method loads the JDBC driver class and creates obtains a JDBC connection from Derby. The two INSERT statements are then prepared.
public void loadDrivers()
	throws SQLException,
	ClassNotFoundException {
	Class.forName(driverName);
	conn = DriverManager.getConnection(
		url);
	insertCust = conn.prepareStatement(
		"INSERT INTO custs VALUES(
		?, ?)");
	insertOrder =
		conn.prepareStatement(
		"INSERT INTO orders VALUES(
		?, ?, ?)");
}
The main() method of TableFiller creates an instance of TableFiller, and call its addCustomer() method to create 1000 customers. It also calls the addOrder() to create the 10,000 orders. To indicate progress, it prints the customer number and ‘.’ to the system output.
public static void main(String args[])
	throws Exception {
	TableFiller tf = new TableFiller();
	tf.loadDrivers();
	int orderNum = 1;
	for (int i=1; i<1001; i++) {
		String custNum = "" + i;
		tf.addCustomer(
			custNum, "customer #" + i);
		System.out.print("\n"
			+ custNum);
		for (int j=1; j<11; j++) {
			tf.addOrder( ""+ orderNum,
				custNum, j * 99);
			orderNum++;
			System.out.print(".");
		}
	}
	tf.closeAll();
}
The addCustomer() method parameterizes and executes the insertCust PreparedStatement.
public void addCustomer(
	String id, String custname)
	throws SQLException {
	insertCust.setString(1, id);
	insertCust.setString(2, custname);
	insertCust.executeUpdate();
}
The addOrder() method parameterizes and executes the insertOrder PreparedStatement.
public void addOrder(String id,
	String custid, int total)
	throws SQLException {
	insertOrder.setString(1, id);
	insertOrder.setString(2, custid);
	insertOrder.setInt(3, total);
	insertOrder.executeUpdate();
}
closeAll() cleans up by closing the two PreparedStatement and the JDBC connection, telling Derby to release any held resources.
public void closeAll()
	throws SQLException {
	insertCust.close();
	insertOrder.close();
	conn.close();
}
}
As the customers and orders are created, the program will print the customer number to the screen. A number from 1 to 100 will appear as each customer is created, and a ‘.’ is printed for each order created.

To compile the code, just make sure you have set your CLASSPATH, and then use the standard Java compilation command:

javac TableFiller.java
Make sure you don’t have ij or another Derby session running – you need exclusive access to the database in embedded mode. After successful compilation, make sure you have copied the TableFiller.class file into your working directory, then run TableFiller from there:
java TableFiller
You should see the status being printed as the records are created in the database.

Verifying the created records

Now, to see the customers and orders that you’ve created, make sure you are in the working directory and log into ij again. Connect to the vsjdb database.
connect 'jdbc:derby:vsjdb';
Then use the following SELECT to see all the 1,000 customers.
select * from custs;
…and use the following SELECT to see all 10,000 orders.
select * from orders;
Of course, using standard JDBC coding similar to FillTable.java, you can integrate Derby’s relational database access functionality into any of your projects. The ability to easily and speedily handle production volume of data makes Derby a great choice if you need an embedded database for your project. Derby’s liberal Apache license allow you to bundle it with your project, even if you decided to sell your product commercially.

Exploring advanced Derby

Thanks to its big shop heritage during its tenure in IBM, Derby has a wealth of advanced functionality – typically only found in a high-end RDBMS. Figure 3 reveals several interesting features.

Figure 3
Figure 3: Database features in Derby

In Figure 3, the database storage contains data tables, schemas, relations information – as all relational databases must. In addition, you can see that it can also contain triggers, stored procedures, and even Java code. These advanced features will be explored in the rest of this article. Derby’s network mode of operation will also be used.

Setting up network mode of operations

To access Derby’s network server mode, you first need to add the following JAR to your CLASSPATH.
derby_installation_directory\lib\derbynet.jar
You will also need to download a network JDBC driver for client access. If you are using the IBM DB2 JDBC Universal driver, you will also need to add the following JARs to your CLASSPATH:
derby_installation_directory\lib\db2_jcc.jar
derby_installation_directory\lib\db2_jcc_license_c.jar
The sp.bat file in the batch subdirectory of the code distribution includes these JARs in the CLASSPATH.

To start the network server, use the command:

java org.apache.derby.drda.NetworkServerControl start
You can also use the netsvr.bat file in the batch subdirectory.

By default, the server will listen on TCP port 1527 for incoming client requests. You can use the “-p <port number>” argument to change the port.

If you need to shutdown the server, use the command:

java org.apache.derby.drda.NetworkServerControl shutdown
Or you can use the netstop.bat file in the batch subdirectory.

The server is now ready for our next stage of exploration.

Automating cascading delete

Now that you have over ten thousand orders, you need to make sure that the data in the database is not corrupted. Fixing data by hand is out of the question.

Let me illustrate a potential problem, which centres around the deletion of a customer.

Just delete the customer from the custs table using an SQL statement such as:

delete from custs where id='838';
This will delete customer number 838 from the database. Excellent!

If this is all you do, the data in the database is officially corrupted. Why? It is because now there are 10 orders in the database with no associated customer!

This can cause all sorts of applications to break. Any application that fetches order information and tries to print customer information will now have to deal with an exception. These orders will never show up if we tabulate all the customers and their orders.

In RDBMS terms, the referential integrity of the database is violated. This is because a customer record is deleted, without the deletion of the orders that refer to it.

Therefore, what we really wanted to do is this:

delete from custs where id='838';
delete from orders where custid='838';
Furthermore, we want to make sure the two delete commands are executed in the same transaction, so that they either both succeed or both fail.

This is called cascading delete, because the deletion of the customer record leads to the deletion of the orders.

Derby has the ability to handle referential integrity.

Derby triggers

One way to ensure referential integrity is to define a trigger. A trigger is code or SQL statement that will execute within the database server when data is modified or accessed. Derby supports the definition of triggers (using SQL or Java), and we can write code to delete the associated orders whenever a customer record is deleted. However, with Derby, there is a much simpler way.

It is possible to tell Derby, using standard SQL DDL (data definition) syntax, that there is a reference between the orders and custs tables. You can also tell Derby to perform the cascading delete automatically.

Adding referential constraint

To add the automatic cascading delete, modify the original createdb.sql. See the createdbref.sql file in the db\ref subdirectory of the distribution. You need to make a change to the table definition of the orders table. This change is shown in blue below:
create table orders
	(id char(8) not null,
	custid char(5) not null
	references custs (id) on delete
	cascade, total integer,
	primary key(id,custid));
Another difference is in the connection URL to database. This time, the network driver is specified. Note that the network driver requires a user and password, the default user id of APP and password of APP will work for us:
connect 'jdbc:derby:net://localhost/
	vsjdb:user=APP;password=APP;';
You can now create this new set of tables that enforce referential constraints. Make sure you have copied the createdbref.sql file to your working directory. Start ij again. Now, delete all of the records and create the new tables using the command:
run 'createdbref.sql';
After you’ve created the table, you can run FillTable.java again to populate the tables with 10,000 orders. Use the modified version in the network subdirectory. The modification to original TableFiller.java is shown in blue below:
import java.sql.*;
import java.util.Properties;
public class TableFiller {
	Connection conn = null;
	PreparedStatement insertCust
		= null;
	PreparedStatement insertOrder
		= null;
	String driverName=
		“com.ibm.db2.jcc.DB2Driver”;
	String url =
		“jdbc:derby:net://localhost/
		vsjdb:user=APP;password=APP;”;
	...
The driver loaded will now be the network driver (com.ibm.db2.jcc.DBDriver), and the URL has been changed to access the network JDBC driver as well. Otherwise, the code is exactly the same as the original TableFiller.java. If you are writing your own JDBC code, you can make these strings external configuration parameters, read in from a text file, and you will be able switch between embedded and network RDBMS with no code change required.

Multi-user concurrent access

Now, to observer multi-user connections, start a session of ij in another console window, and connect to the server using the command:
connect 'jdbc:derby:net://localhost/
	vsjdb:user=APP;password=APP;';
In the original console window, compile and run the new TableFiller.java. This will start the creation of the 1,000 customer records and 10,000 order records. While this is happening, go back to the ij window and issue commands such as:
select * from custs;
…and:
select * from orders;
You will be able to see all the customers and orders being created at the time when you issue the command. The network server allows multi-user concurrent access to the data.

Testing cascading delete

Once you have created the data records, try the following in ij:
select * from custs where id='700';
…followed by:
select * from orders where
	custid='700';
You will see the customer records, and the 10 orders of this customer.

Now, try to delete the customer, using the command:

delete from custs where id='700';
This will delete the customer record, cascading to the deletion of the associated orders. Now try the above selects again, you will see that all 10 orders have been deleted.

Writing a Derby stored procedure

As the last experiment, you will create a Derby stored procedure in the Java programming language. You will also store the code to the stored procedure within the Derby database – making it easy to distribute the data and code together.

The stored procedure will be called deleteAll(), and as its name implies, it will delete all records in the database. The Java code for this stored procedure is located in the stored sub-directory of the code distribution, and is called CleanTables.java. The code is reproduced below:

import java.sql.*;
public class CleanTables {
	public static void delAll ()
		throws SQLException {
		Connection conn =
		DriverManager.getConnection(
			"jdbc:default:connection");
		Statement delCusts =
			conn.createStatement();
		delCusts.executeUpdate(
			"delete from custs");
		delCusts.close();
		conn.close();
		return;
	}
}
You can observe that the stored procedure is a public static method in a Java class. The JDBC connector used by the code is obtained via the url:
jdbc:default:connection
This tells the Derby server to supply a default connection to the code. Derby will manage this connection. The actual statement that is executed is:
delete from custs;
Since cascading deletion is in effect, deleting everything from the custs table will also delete all orders from the orders table.

Storing Java code within the Derby database

To place the code into the Derby database, you need to first compile it and create a JAR file. Compile the code for CleanTables.java:
javac CleanTables.java
Create a JAR:
jar cvf vsjstproc.jar *.class
Or you can use the makejar.bat file to do this. Copy the resulting vsjstproc.jar into the working directory.

From the working directory, you need to install the JAR file into the database. Use the following command in ij (make sure your session is connected via the network driver):

call sqlj.install_jar('vsjstproc.jar',
	'APP.cleanTable', 0);
This command actually calls a stored procedure that will install the vsjproc.jar and gives it an internal name of cleanTable within the database. APP refers to the application schema. Having the JAR inside the database is very convenient. When you copy the database, the code will move with it.

If you ever need to remove this JAR from the database, you can use:

call sqlj.remove_jar(
	'APP.cleanTable', 0);
You only need the internal name in your remove call.

After installing the JAR file into the database, you need to tell Derby to look at the JAR when it needs to load Java code. This is done through setting a property via a stored procedure:

call syscs_util.syscs_set_database_
	property('derby.database.
	classpath', 'APP.cleanTable');
This will cause Derby to look inside the APP.cleanTable JAR file when loading Java classes. You are finally ready to define the stored procedure called deleteAll(). Use the following command:
create procedure deleteAll() parameter
	style java language java modifies
	sql data external name
	'CleanTables.delAll';
Derby will be able to find the CleanTable class from the APP.cleanTable JAR and locate the static delAll() method in that class. If you need to remove this stored procedure later, you can use the command:
drop procedure deleteAll;
With the stored procedure in place, you can now delete everything in the two tables by simply calling the stored procedure deleteAll() in ij:
call deleteAll();
After executing this stored procedure, you can verify that the two tables are now empty.

Conclusions

Derby is a feature-rich relational database system that can be integrated into your projects today.

The flexibility to support both embedded mode and client-server mode of operation allows it to adapt to diverse deployment scenarios. Its 100% Java orientation provides straightforward code integration, and enjoys the same ‘deploy anywhere’ benefit as your Java application. Its speedy execution on production data volume, together with support for advanced features such as referential integrity and stored projects, makes it ideal even for sophisticated data access requirements. Last but not least, its liberal Apache licence lets you bundle it freely into your products.

The open source mine has many hidden gems. If you don’t take a second look at Apache Derby, you may just have overlooked a diamond.


Sing Li is a consultant, trainer and freelance writer, who specializes in Java, web applications, distributed computing and peer-to-peer technologies. His recent publications include Early Adopter JXTA, Professional JINI and Professional Apache Tomcat, all published by Wrox Press.

Downloading and installing Derby

You can find Derby in the incubator area of the Apache site.

Once you have unzipped the distribution, note your base directory, this is your derby_installation_directory. You will need to use this directory in setting your classpath.

Downloading the network JDBC driver for Derby

In order to access Derby in network mode, you will need the network JDBC driver. IBM has donated this driver to the Derby project, but at the time of writing it is not yet part of the standard distribution.

For the time being, you will need to download the IBM DB2 JDBC Universal driver for Apache Derby Network Server. Follow the instruction at the URL to set your CLASSPATH.

You might also like...

Comments

About the author

Sing Li United States

Sing Li has been writing software, and writing about software for twenty plus years. His specialities include scalable distributed computing systems and peer-to-peer technologies. He now spends ...

Interested in writing for us? Find out more.

Contribute

Why not write for us? Or you could submit an event or a user group in your area. Alternatively just tell us what you think!

Our tools

We've got automatic conversion tools to convert C# to VB.NET, VB.NET to C#. Also you can compress javascript and compress css and generate sql connection strings.

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.” - Donald Knuth