Dependency inversion and the factory design patterns

This article was originally published in VSJ, which is now part of Developer Fusion.
Developers have been getting excited, and occasionally scratching their heads, about design patterns ever since (or even before) Gamma, Helm, Johnson and Vlissides (the “Gang of Four”, or GoF) published their seminal work, Design Patterns: Elements of Reusable Object-Oriented Software.

Some of the difficulty that developers experience when getting to grips with design patterns stems from the way that they are presented: with an emphasis on UML diagrams and discussions of “forces” and “intents”. This is all very reasonable, but there are times when a practical example, using the technology that the developer is working with, can clarify the patterns and show their beauty very succinctly. This is the approach I will take in this article, examining a representative problem and showing how the application of a few of the creational design patterns solves it.

Defining the problem

At a fundamental level, design patterns represent canned experience that you can exploit when faced with a problem that has been solved before. Consequently, for this article to work it needs some code and a problem.

The starter code

To highlight the use of the factory patterns, I’ve chosen to focus on a simple and fairly common piece of code that you might find in a data access layer:
public List<Customer> GetCustomerByCountry(
	string country)
{
	List<Customer> customers = new List<Customer>();
	SqlConnection con =
		new SqlConnection( GetConnectionString());
	SqlCommand cmd = new SqlCommand(
		"SELECT * FROM Customers WHERE Country = @Country", con );
	SqlParameter param =
		new SqlParameter( "@Country", country );
	cmd.Parameters.Add( param );
	SqlDataReader reader = null;
	try
	{
		con.Open();
		reader = cmd.ExecuteReader();
		while( reader.Read() )
		{
			// Read customer data
			customers.Add( new Customer( ... ) );
		}
	}
	finally
	{
		if( reader != null ) reader.Close();
		con.Close();
	}
	return customers;
}
The code above is a reasonably typical example of a classic ADO.NET data access method, but I’ve deliberately omitted two parts simply because they’re not relevant to the problem at hand, namely:
  1. Code to read the connection string from a configuration file, which would be the expected norm and which is encapsulated in the GetConnectionString() method
  2. The code that populates the Customer object (whose type is also not shown) from the returned data
So with our starter code in place, it’s time to introduce the problem.

The problem

Assuming that you own the above code, picture the scene: your project manager calls you into the office and drops the small bombshell that the code has to be updated to work with a range of databases. Putting that another way, you need to decouple the code from the specific System.Data.SqlClient types. So you stroll back to your desk and start analysing the code to see how you can accommodate this new requirement. It’s highly likely that after a couple of minutes you would identify the three main problems with the code as it currently stands, namely:
  1. You need to replace the current SqlXXXX variables in the code with higher-level, database-agnostic, abstractions.
  2. You have to avoid the use of the new operator, since that requires you to specify a database-specific type to be instantiated.
  3. The @parameterName syntax is not consistent across all implementations of SQL.
Fortunately for you, the ADO.NET team has already designed their code using some common design patterns, which you can exploit in your proposed solution. Specifically, they have used three patterns that address the problem at hand and provide you with the answer, namely:
  1. Factory Method
  2. Abstract Factory
  3. Simple Factory, which happens not to be a GoF pattern

Working with abstractions

The vast majority of the common design patterns involve working with abstractions, and abstract base classes or interfaces in particular. This is an essential requirement when decoupling code, which is at the heart of the problem that you’re currently in the process of solving. Specifically, it allows the consumer (the data access code shown above) to be programmed against the abstractions, rather than against the concrete implementations. In effect, this is the same behaviour that we experienced with COM programming, with clients and servers being written against interfaces, enabling us to select an implementation at run-time based on a CLSID or ProgID.

Fortunately (or rather, by good design) the ADO.NET managed provider architecture is designed around just such a suite of abstract base classes, including DbCommand, DbConnection and DbParameter, all of which can be found in the System.Data.Common namespace. Each managed provider consists of a set of concrete types that are derived from these abstract base classes. For example, SqlConnection derives from DbConnection and SqlCommand derives from DbCommand.

By now you should realise that if you can rewrite the code using the abstract types in place of the currently hardcoded SqlClient types, then the system should be able to work with many different database engines using a single code base.

The need for factories

Working with abstractions is only possible if you can avoid lines of code that create objects using their concrete types, as shown below:
SqlConnection con =
	new SqlConnection();
Clearly, this can be trivially replaced with:
DbConnection con =
	new SqlConnection();
However, this would still leave the code tied to a specific managed provider, because the concrete type is still being used directly. What is required, therefore, is a way of encapsulating the construction of an object, and that is precisely why we use factories.

Factory Method

So let’s examine the first pattern that is going to help you solve the problem at hand: Factory Method. As its name implies, the Factory Method pattern is related to object construction. It defines an approach whereby the type of an object that is being constructed is determined by the type of the object that is making it.

If that sounds like a bit of a mouthful, you can find a clear example of this design pattern in the ExecuteReader() method of the DbCommand class, which is shown in the following code snippet:

public abstract class DbCommand
{
	public abstract DbDataReader
		ExecuteReader();
	... // other methods elided
	    // for clarity
}
public class SqlCommand : DbCommand
{
	public override DbDataReader
		ExecuteReader()
	{
		SqlDataReader reader;
		... // creation code elided
		return reader;
	}
}
public class OleDbCommand : DbCommand
{
	public override DbDataReader
		ExecuteReader()
	{
		OleDbDataReader reader;
		... // creation code elided
		return reader;
	}
}
public abstract class DbDataReader
		{ ... }
public class SqlDataReader :
	DbDataReader
		{ ... }
public class OleDbDataReader :
	DbDataReader
		{ ... }
You can see the UML representation of this code, and therefore of the Factory Method pattern, in Figure 1.

Figure 1: The Factory Method Pattern in UML using the actual ADO.NET types

As you can see from the code, each concrete command class creates its own provider-specific data reader that is derived from the abstract DbDataReader type. This means that you can now write code like this:

DbCommand cmd;
cmd = ... ; // to be covered next
DbDataReader reader;
reader = cmd.ExecuteReader();
...
The code above is looking promising, as you’re now programming against the DbDataReader and DbCommand abstractions, but you’re still faced with the problem of how to create the appropriate database-specific command and connection objects. Given that you need to make multiple, related objects it’s time to introduce the second pattern: Abstract Factory.
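Before moving on, the shape of Factory Method can be seen in isolation in the minimal sketch below. It deliberately avoids the real ADO.NET classes so that it compiles on its own: Reader, Command and their SqlLike/OleDbLike subclasses are hypothetical stand-ins for DbDataReader, DbCommand and the provider-specific types, and are not part of any library.

```csharp
using System;

// Hypothetical stand-in for DbDataReader (not a real ADO.NET type).
public abstract class Reader
{
	public abstract string Describe();
}

public sealed class SqlLikeReader : Reader
{
	public override string Describe() => "reader created by SqlLikeCommand";
}

public sealed class OleDbLikeReader : Reader
{
	public override string Describe() => "reader created by OleDbLikeCommand";
}

// Hypothetical stand-in for DbCommand: ExecuteReader() is the factory method.
public abstract class Command
{
	public abstract Reader ExecuteReader();
}

public sealed class SqlLikeCommand : Command
{
	// The concrete command decides which concrete reader is built.
	public override Reader ExecuteReader() => new SqlLikeReader();
}

public sealed class OleDbLikeCommand : Command
{
	public override Reader ExecuteReader() => new OleDbLikeReader();
}

public static class FactoryMethodDemo
{
	// The caller is written purely against the abstractions.
	public static string Run( Command cmd )
	{
		Reader reader = cmd.ExecuteReader();
		return reader.Describe();
	}

	public static void Main()
	{
		Console.WriteLine( Run( new SqlLikeCommand() ) );
		Console.WriteLine( Run( new OleDbLikeCommand() ) );
	}
}
```

Note that Run() never names a concrete reader type; the only place a concrete type appears is where the command itself is created, which is the gap that the next pattern closes.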

Abstract Factory

When discussing factories, it becomes easy to think that a factory makes a single product. However, this is not always the case. For example, there might be a Ford plant churning out lots of different types of spare parts for Ford cars, and a Toyota factory creating lots of spare parts for Toyota cars. The concept of a factory that creates many different, but related, parts is encapsulated in the GoF’s Abstract Factory pattern.

The first stage in working with the Abstract Factory pattern is to define the abstractions for the individual parts that the factory will make. In the case of ADO.NET (and in particular restricting the scope to our problem), these are the DbCommand, DbParameter and DbConnection types. With these in place it becomes easy to see how the ADO.NET team reached their definition for the abstract factory type, as follows:

public abstract class
System.Data.Common.DbProviderFactory
{
	public abstract DbConnection
		CreateConnection();
	public abstract DbCommand
		CreateCommand();
	public abstract DbParameter
		CreateParameter();
	... // other items elided
	    // for clarity
}
Each managed provider has its own derived version of DbProviderFactory, such as SqlClientFactory, OleDbFactory and OracleClientFactory. Again, for completeness you can see the UML diagram for the Abstract Factory Pattern in Figure 2.

Figure 2: The Abstract Factory Pattern in UML using the actual ADO.NET types

You should also recognise that the CreateConnection(), CreateCommand() and CreateParameter() methods are each individually manifestations of the Factory Method pattern. This is very much how you would expect an Abstract Factory to be defined, as the type of the objects that will be made will depend on the precise type of factory object that is used in the manufacturing process.

So how does using the Abstract Factory pattern affect the code? Well, the essential parts of the code now look like this:

DbProviderFactory factory = ... ;
// to be covered later
DbConnection con =
	factory.CreateConnection();
con.ConnectionString = ... ;
// read from configuration file
DbCommand cmd =
	factory.CreateCommand();
cmd.CommandText = ...;
cmd.Connection = con;
DbDataReader reader = null;
try
{
	con.Open();
	reader = cmd.ExecuteReader();
	while( reader.Read() )
	{
	}
}
finally
{
	if( reader != null )
		reader.Close();
	con.Close();
}
As you can see, the code is almost completely independent of any database-specific code. There is, however, one final step in the migration process so that the code becomes database-agnostic: creating the initial factory. For that, you need to deal with a final, non-GoF pattern: Simple Factory.
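Before turning to that, here is a small self-contained sketch of the Abstract Factory idea itself: one factory object hands out a whole family of related parts. The types below (Connection, Command, ProviderFactory and the SqlLike/OracleLike families) are hypothetical stand-ins chosen for illustration; they only mimic the roles of DbConnection, DbCommand and DbProviderFactory.

```csharp
using System;

// Hypothetical abstract products (stand-ins for DbConnection and DbCommand).
public abstract class Connection
{
	public abstract string Family { get; }
}

public abstract class Command
{
	public abstract string Family { get; }
}

// One family of concrete products.
public sealed class SqlLikeConnection : Connection
{
	public override string Family => "SqlLike";
}

public sealed class SqlLikeCommand : Command
{
	public override string Family => "SqlLike";
}

// A second family of concrete products.
public sealed class OracleLikeConnection : Connection
{
	public override string Family => "OracleLike";
}

public sealed class OracleLikeCommand : Command
{
	public override string Family => "OracleLike";
}

// Hypothetical abstract factory (stand-in for DbProviderFactory).
public abstract class ProviderFactory
{
	public abstract Connection CreateConnection();
	public abstract Command CreateCommand();
}

public sealed class SqlLikeFactory : ProviderFactory
{
	public override Connection CreateConnection() => new SqlLikeConnection();
	public override Command CreateCommand() => new SqlLikeCommand();
}

public sealed class OracleLikeFactory : ProviderFactory
{
	public override Connection CreateConnection() => new OracleLikeConnection();
	public override Command CreateCommand() => new OracleLikeCommand();
}

public static class AbstractFactoryDemo
{
	// Everything made by one factory belongs to the same family.
	public static string Describe( ProviderFactory factory )
	{
		Connection con = factory.CreateConnection();
		Command cmd = factory.CreateCommand();
		return con.Family + "/" + cmd.Family;
	}

	public static void Main()
	{
		Console.WriteLine( Describe( new SqlLikeFactory() ) );
		Console.WriteLine( Describe( new OracleLikeFactory() ) );
	}
}
```

The point of the sketch is that Describe() cannot accidentally mix a connection from one family with a command from another, because both come from the same factory instance.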

Simple Factory

In short, a Simple Factory is a method that takes in some form of identifier and which returns an object of the appropriate type. The great benefit of Simple Factory is that it enables you to remove any final dependencies in the code, as it abstracts away the instantiation of an object.

The ADO.NET team use this technique to allow us to create our factory objects, via the DbProviderFactories.GetFactory() method, which is defined as follows:

public static class
	DbProviderFactories
{
	public static DbProviderFactory
		GetFactory(string providerName)
	{
		... // implementation elided
		    // for clarity
	}
}
The GetFactory() method takes a parameter specifying the name of the managed provider for which you want to create the factory, so for example you would use the string “System.Data.SqlClient” when using SQL Server and “System.Data.OracleClient” when using Oracle. Of course, such strings are ideally stored in the application’s (or web) configuration file, commonly either as part of the <connectionStrings> element or in an <appSettings> section.
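To make the idea concrete, the sketch below shows one way a Simple Factory of this kind could be written. It is not the implementation of DbProviderFactories, which is elided above; the ProviderFactories class, the registry dictionary and the "Example.*" provider names are all assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for DbProviderFactory.
public abstract class ProviderFactory
{
	public abstract string Name { get; }
}

public sealed class SqlLikeFactory : ProviderFactory
{
	public override string Name => "SqlLike";
}

public sealed class OracleLikeFactory : ProviderFactory
{
	public override string Name => "OracleLike";
}

// Hypothetical Simple Factory: maps an identifier to a concrete factory.
public static class ProviderFactories
{
	// Illustrative registry; the keys are made-up provider names.
	private static readonly Dictionary<string, Func<ProviderFactory>> Registry =
		new Dictionary<string, Func<ProviderFactory>>( StringComparer.OrdinalIgnoreCase )
		{
			{ "Example.SqlLike", () => new SqlLikeFactory() },
			{ "Example.OracleLike", () => new OracleLikeFactory() },
		};

	public static ProviderFactory GetFactory( string providerName )
	{
		if( providerName == null )
			throw new ArgumentNullException( nameof( providerName ) );

		if( Registry.TryGetValue( providerName, out Func<ProviderFactory> create ) )
			return create();

		throw new ArgumentException(
			"Unknown provider: " + providerName, nameof( providerName ) );
	}
}

public static class SimpleFactoryDemo
{
	public static void Main()
	{
		// In a real application the string would come from configuration.
		ProviderFactory factory = ProviderFactories.GetFactory( "Example.SqlLike" );
		Console.WriteLine( factory.Name );
	}
}
```

The only place that knows about concrete factory types is the registry inside the Simple Factory, so the calling code depends on nothing but a string and the abstract ProviderFactory type.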

Thus the code is finally converted into:

<!-- In the configuration file -->
<connectionStrings>
	<clear />
	<add name="nwind" connectionString="..."
		providerName="System.Data.SqlClient" />
</connectionStrings>

// The revised source code
public List<Customer> GetCustomerByCountry(
	string country )
{
	ConnectionStringSettings settings =
		ConfigurationManager.ConnectionStrings["nwind"];
	List<Customer> customers = new List<Customer>();
	DbProviderFactory factory =
		DbProviderFactories.GetFactory(
		settings.ProviderName );
	DbConnection con = factory.CreateConnection();
	con.ConnectionString = settings.ConnectionString;
	DbCommand cmd = factory.CreateCommand();
	cmd.Connection = con;
	cmd.CommandText =
		"SELECT * FROM Customers WHERE Country = @Country";
	DbParameter param = factory.CreateParameter();
	param.ParameterName = "@Country";
	param.Value = country;
	param.DbType = DbType.String;
	cmd.Parameters.Add( param );
	DbDataReader reader = null;
	try
	{
		con.Open();
		reader = cmd.ExecuteReader();
		while( reader.Read() )
		{
			// Read customer data
			customers.Add( new Customer( ... ) );
		}
	}
	finally
	{
		if( reader != null ) reader.Close();
		con.Close();
	}
	return customers;
}
This code now has no dependencies on any provider-specific types, which means that it would theoretically be possible to simply change the connection string and provider name within the configuration file in order to work with a different database. As far as the general coding challenge is concerned, the three patterns have worked admirably. However, in this specific situation of working with databases there is a pitfall: the SQL dialects that each database uses.

The SQL dialect trap

Unfortunately, the code presented above still contains a database-specific dependency, even though it would appear to be provider neutral. The problem is that different database engines take different approaches to specifying parameters in queries. The snippets below show how the query would need to be represented for three common database engines:

SQL Server

SELECT * FROM Customers
	WHERE Country = @Country

Microsoft Access

SELECT * FROM Customers
	WHERE Country = ?

Oracle

SELECT * FROM Customers
	WHERE Country = :Country
Clearly, this is a major inconvenience when writing database-agnostic code, but it is by no means the only issue that you will come across, as there are subtleties of implementation throughout SQL. So does this represent an insurmountable problem? Not necessarily. One solution that is presented in the MSDN Library documentation is to use string concatenation to prepare commands, rather than use parameterised queries. Personally, I believe that the security implications of that approach might be too unpalatable for most development (and certainly security) teams. At the very least it would require extensive validation of the input in order to prevent SQL injection attacks.

An alternative springs to mind, however. If you chose to use stored procedures, a decision which assumes that stored procedures are available on all database engines, then only the parameter name remains an issue; the command string itself is embedded within the database. To handle this you could, of course, trivially develop a Simple Factory or Factory Method implementation that returns an appropriate parameter name given an input string.
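As a rough sketch of such a helper, the code below maps a provider name to a parameter marker, following the three styles listed earlier. The ParameterMarkers class and its For() method are hypothetical names, and treating "System.Data.OleDb" as the provider used for Microsoft Access is an assumption of this sketch; a real implementation would need to be checked against each provider's documentation.

```csharp
using System;

// Hypothetical helper: returns the parameter marker a given provider expects.
public static class ParameterMarkers
{
	public static string For( string providerName, string baseName )
	{
		if( string.IsNullOrEmpty( baseName ) )
			throw new ArgumentException( "A parameter name is required.", nameof( baseName ) );

		switch( providerName )
		{
			case "System.Data.SqlClient":
				return "@" + baseName;   // named, @ prefix
			case "System.Data.OracleClient":
				return ":" + baseName;   // named, : prefix
			case "System.Data.OleDb":
				return "?";              // positional, name is not used
			default:
				throw new NotSupportedException(
					"No parameter marker is known for provider: " + providerName );
		}
	}
}

public static class ParameterMarkersDemo
{
	public static void Main()
	{
		// Only the marker is concatenated; the value is still bound as a parameter.
		string sql = "SELECT * FROM Customers WHERE Country = "
			+ ParameterMarkers.For( "System.Data.OracleClient", "Country" );
		Console.WriteLine( sql );
	}
}
```

Because the helper only ever emits a marker derived from a name chosen by the developer, and never the user-supplied value, it does not reintroduce the SQL injection risk described above.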

Other issues arising from the new code

Setting aside the issues of SQL dialects, are there any other issues with the new code? There is perhaps one slightly unappealing aspect of this code: the code doesn’t feel quite as “slick” as the provider-specific code. You might have noticed that creating the command object takes a few more lines of code, because both the connection and the command text have to be specified by setting properties. This is probably not a major issue, but it does highlight one point: patterns typically have some “negative” consequences that you will have to deal with.

The other direct issue that arises from the use of these particular patterns is that you immediately become constrained to using the methods and properties that are defined on the abstract classes (or interfaces). It might certainly be tempting to perform a downcast to a derived type, but the moment that you do that you lose all of the benefits that these patterns provide. Clearly, therefore, patterns should not be adopted when these negative consequences would break the application or when they massively outweigh any benefits.

The OO gobbledygook

You knew that there had to be some OO gobbledygook in an article on design patterns, and it comes now. There’s actually a guiding principle in OO design that summarises what we’ve done, and it’s called the Dependency Inversion Principle.

When we started out, we had a fairly classical dependency from the high-level component, which contained our GetCustomerByCountry() method, on the low-level SQL Server managed provider component, as shown in Figure 3.

Figure 3: The initial dependency

The Dependency Inversion Principle states that you should always depend on abstractions, no matter whether you are a high-level or a low-level component. Figure 4 shows the dependencies once the code has been reworked with the factory patterns.

Figure 4: The inverted dependencies

As you can see, all the dependency arrows point towards the abstract base classes defined in the System.Data.Common namespace, and thus the dependency on the managed provider(s) has been inverted.

Conclusion

This article has shown how using some simple design patterns can improve the flexibility of your code. The solution to our problem, how to remove the dependencies in our code on a specific set of types, was provided through the use of three patterns: Factory Method, Abstract Factory and Simple Factory.

The common theme across these patterns is that the factories provide methods that return abstract type references, such as DbProviderFactory or DbConnection. This enabled us to program against the abstractions, thus removing the dependencies. Of course, real concrete types are being instantiated, but as that object construction is encapsulated inside the factories our code remains clean and dependency free.

What makes our solution even more appealing in this particular case is that the ADO.NET team has already done most of the work for us by pre-empting the problem and providing a set of abstract classes that enables us to use these patterns without reinventing the wheel. This just goes to show the power of a design built around the Dependency Inversion Principle and a few creational patterns.


David Wheeler is a freelance trainer and consultant, specialising in skills transfer to teams working with Microsoft .NET. He helps moderate Microsoft’s ASP.NET forums, is a regular speaker at DevWeek, and will also be presenting sessions on how to use Design Patterns in .NET at Software Architect 2007. He can be contacted at [email protected].
