CLR and SQL Server 2005

This article was originally published in VSJ, which is now part of Developer Fusion.
For a developer, SQL Server 2005 must look like a candy store: no matter what you are interested in, it is chock-full of goodies. For a .NET developer one of the biggest new features is the inclusion of the Common Language Runtime (CLR) inside the SQL Server. This allows you as a developer to write stored procedures, user-defined functions and triggers in your favorite CLR language. In addition you can also create user-defined types and user-defined aggregates using the CLR. In this article we’ll look at the implications of this for a developer, but first we need to look briefly at how the integration of the CLR has been done.

SQL Server as a CLR host

By including the CLR inside SQL Server, SQL Server acts as a host of the CLR much like IIS and ASP.NET. Having databases host what can be regarded as Virtual Machines (VM’s) is not something new for SQL Server. Other database vendors have been doing it for a while. What is different, though, is that most other vendors are hosting the VM out of process, whereas SQL Server hosts the CLR in the same memory space as SQL itself.

By hosting the CLR in-process we are achieving very good performance, but, as every developer knows, running something in-process also exposes the risk of rogue code bringing down the hosting process. In the case of IIS/ASP.NET this is not that a big problem, as this would just cause a recycling of the worker process. However, recycling the equivalent of the worker process in SQL Server would mean restarting the server. Obviously we can’t do this with SQL Server, as we need SQL Server to be up and running as close to 24/7 as possible.

To overcome this problem, the hosting mechanism relies on new features in CLR 2.0, which gives the host much more control over the environment than what was possible in CLR 1.x. In essence, SQL Server has total control over the CLR, memory allocation, creation of threads etc. If things go seriously wrong SQL Server has the ability to tear down CLR application domains and even close down the CLR process.

This is one reason why using the CLR hosted inside SQL Server (SQLCLR) is much safer than running extended stored procedures (XP’s) or the stored procedures used to interact with COM objects (sp_OACreate etc). The other reason is that even when we write CLR code, which can potentially do anything, the DBA can limit what the code is allowed to do. This is achieved by assignment of permissions during creation time.

Creation of SQLCLR code

The code you want to use inside SQLCLR has first to be compiled into an assembly (dll), and then uploaded to a SQL Server database. By uploading the assembly it is stored in a system catalog. Later, when executing some methods in the assembly, the assembly is loaded from that system catalog, instead of from the file system or the Global Assembly Cache (GAC).

The upload is done through the new T-SQL syntax: CREATE ASSEMBLY. Imagine that you have compiled the source code in this listing:

using System;
using Microsoft.SqlServer.Server;
public class Math {
	[SqlFunction]
	public static SqlDouble ClrFact(SqlDouble x) {
		SqlDouble y = 1.0;
		while(x > 1.0) {
			y *= x;
			x -= 1;
		}
		return y;
	}
}
…into an assembly called math.dll, and this assembly is located in the c:\assemblies directory.

The syntax to upload the assembly to the Northwind database in SQL Server looks like this.

use [Northwind];
go
CREATE ASSEMBLY [MathAsm]
FROM 'c:\assemblies\math.dll'
WITH PERMISSION_SET = SAFE
go
You can see how the assembly is uploaded from the file system, and it is assigned the SAFE permission set. You do not necessarily need to create the assembly from disk, the FROM statement can also be the binary representation of the assembly. For the complete syntax of how to upload an assembly I suggest the CREATE ASSEMBLY topic in SQL Server Books OnLine (BOL).

So we have now uploaded the assembly to SQL Server and we are ready to use it. Well, not really; because we need to create a T-SQL wrapper around the methods first as there is no such thing as a C#/VB.NET/your_language.NET stored procedure, user-defined function or trigger. This also implies that the .NET method’s parameter/return type(s) need to be SQL compatible. In addition the method has to be public static.

To create a T-SQL object against the method(s) you use the “normal” CREATE PROCEDURE/FUNCTION/TRIGGER syntax, but with a little “twist”. Instead of defining the method body in your CREATE statement, you point to your CLR method by defining the assembly, class and method name. This is accomplished by using the EXTERNAL NAME syntax in your CREATE STATEMENT. In the listing below, you can see the syntax to create a function called FactCLR from the method in the source code (above) for math.dll:

use [Northwind];
go
CREATE FUNCTION FactClr(@x FLOAT)
RETURNS FLOAT
EXTERNAL NAME MathAsm.Math.ClrFact
go
Notice how the name of the wrapper function does not have to be the same as the name of the CLR function. You should also be aware that the class name and method name in EXTERNAL NAME are case sensitive.

Visual Studio 2005 and SQLCLR code

As you can see from the above, there are quite a few steps to do when uploading code to SQL Server, especially if you have many methods to create. To make this easier for the developer Visual Studio 2005 introduces a new project type, SQL Server Project, which provides code templates that make it easy for developers to write code for the SQLCLR.

The project type allows the developer to automatically upload the compiled assembly to the database. The deployment also automatically creates in the database the procedures, functions, triggers, types and aggregates defined in the assembly based on custom attributes (SqlProcedure, SqlFunction, SqlTrigger, etc.) with which the developer adorns his code. An example of this is the method in the source code (above) for math.dll, which has a SqlFunction attribute. If the developer used Visual Studio 2005 and the SQL Server Project, the method would automatically be created as a user-defined function in SQL Server when the developer deployed the assembly through Visual Studio.

One big part of development is debugging. Visual Studio 2005 provides the database developer with a seamless debugging of code running in SQL Server. If the developer uses the SQL Server Project, he can just press F5 and automatically step into his code inside SQL Server.

Usage of CLR code in SQL server

Having seen how to use the SQLCLR above, the question is now when to use it? A common misconception when people first heard about the SQLCLR was that T-SQL would be dead in the water, and we should now write all our code using the SQLCLR. This is absolutely not true; the T-SQL language is far from dead and in SQL Server 2005 there are significant enhancements to the language. Some exciting new features are:
  • Structured exception handling using try and catch blocks
  • The ability to write recursive queries to traverse hierarchies
  • New analytical functions
In a database application where we do data manipulation we can divide the operations into two main parts; the declarative part where we use the query language for SELECT/INSERT/UPDATE/DELETE statements, and the procedural part with WHILE statements, assignments, triggers, cursors etc. The T-SQL language is highly effective for declarative statements, and takes full advantage of the power of the query processor. However, T-SQL is less effective for procedural programming, and we should try to avoid procedural constructs if they can be expressed declaratively.

Bearing this in mind, we should try to avoid using CLR for declarative coding, but use it as an effective alternative for logic that cannot be expressed declaratively. Some examples where CLR code can be used instead of T-SQL are:

  • Operations which need to access resources outside of SQL Server. CLR has many system libraries which makes it very easy and extremely efficient to access external resources: databases, file system, web etc.
  • Where we can utilize the system libraries for operations inside SQL Server: string manipulation, data validation through regular expressions, etc.
Another scenario is where you need to perform complex calculations on a per-row basis over data in database tables. Here you will benefit greatly from the fact that the CLR code is compiled into and executed as true machine code. The method in the source code (above) for math.dll is an example of this. This method could be re-written as T-SQL code:
create function sqlFact(@x float)
returns float
as
begin
	declare @y float;
	set @y = 1.0;
	while (@x > 1)
	begin
		set @y = @x * @y;
		set @x = @x - 1;
	end
	return @y
end;
These two methods would be used to calculate values based on a column in a table and return a result for each row.

Here is an example of how these methods would be executed, and you can see how we are measuring the time it takes to perform the calculations:

declare @start datetime
set @start = getdate()
select dbo.sqlFact(quantity)
from [order details]
select datediff(ms,
	@start, getdate())
go
declare @start datetime
set @start = getdate()
select dbo.FactClr(quantity)
from [order details]
select datediff(ms,
	@start, getdate())
go
In the example the functions are executed over the quantity column in the Order Details table in the Northwind database. On the machine I used to run these functions, the CLR method ran about six times faster than the T-SQL function!

Other examples when the use of CLR code can be beneficial are:

  • As a replacement for Extended Store Procedures (XPs)
  • Where custom aggregations need to be done over data. Examples of these types of aggregations can be statistical calculations, which you would perform with T-SQL cursors if the CLR wasn’t available. For these types of calculations you can use the new user-defined aggregates (UDA’s) in the SQLCLR.
  • When you need to extend the SQL type system with custom types. This would be done by the user-defined type (UDT) construct. Examples where you would use UDT’s are custom date or time types and currency types.

Summary

The SQLCLR feature in SQL Server 2005 gives database developers another weapon in their arsenal, and it will allow them to develop even more powerful and efficient applications. However, it needs to be used with care, and should not be seen as a replacement for T-SQL.


Niels Berglund is a member of DevelopMentor’s technical research, curriculum development and teaching staff, and is co-author of its database courses. He specializes in the .NET system software and database areas, and is co-author of the first book released for SQL Server 2005: A First Look at SQL Server 2005 for Developers.

You might also like...

Comments

Contribute

Why not write for us? Or you could submit an event or a user group in your area. Alternatively just tell us what you think!

Our tools

We've got automatic conversion tools to convert C# to VB.NET, VB.NET to C#. Also you can compress javascript and compress css and generate sql connection strings.

“Better train people and risk they leave – than do nothing and risk they stay.” - Anonymous