Sleight of page

This article was originally published in VSJ, which is now part of Developer Fusion.
The term “sleight of hand” describes a secret manipulation used by a magician or card sharp. It is also known as “prestidigitation”, a splendid but underused word in my opinion. This article is about a number of techniques in ASP.NET used to substitute a different page in the response for the one the user originally requested, usually without their knowledge. Hence for my title I have coined the phrase “sleight of page”.

Why substitute?

When you enter a URL into a browser a request is sent to the server pointed at by the host part of the URL. I’m going to call the server (or group of servers) to which the IP address is routed the “physical” location. In the simple case, this server parses the remainder of the URL and returns the appropriate page, usually based on a file path under the server root which I’ll call the physical path, as shown in Figure 1.

Figure 1: The physical path

In many scenarios the server that you want to service the request may not be the server at the physical address:

  • When moving a site from one server to another
  • In load balancing where multiple servers handle the same site
  • Domain registration systems which point many domains at a single server

There are also scenarios where you might wish to respond to a URL that doesn’t match the physical location of the resource you intend to handle the request with:

  • A friendly URL published in a magazine or newspaper
  • A link or bookmark that links to an old version of the site
  • Dynamically generated content

The oldest trick in the book

HTTP redirection is a browser-based solution to this problem. It works by sending a response that directs the browser to the correct location (see Figure 2).

Figure 2: Using redirection

This can be accomplished by configuring the server (IIS supports redirection on Virtual Directories), or in ASP.NET by calling Response.Redirect. This replaces the standard response with one containing a redirect instruction to the browser, and if the endResponse parameter is set it terminates processing of the request immediately:

Response.Redirect(
	"http://www.vsj.co.uk/", true);

One advantage of this method is that the form can compose the target URL in any way it wishes, for example by parsing a query string, and then plays no further role in handling the request. Redirection is also commonly offered to less technical website owners as a way of pointing a domain at a URL without having to modify DNS records. Changes to the redirect take effect immediately (they do not have to propagate through DNS; see the box The Domain Name System).

The main drawback to HTTP redirection is that most browsers update the URL in the address bar when they receive a redirect, which may not be desirable. If you have paid a domain registration company for the URL “www.mycompany.com” you may not be pleased that the address the user sees after entering the site is “www.mywebcontractor.com/clients/mycompany”. Bookmarks will also be made to the target address rather than to your domain, so if the hosting provider changes in the future they may end up being out of date!
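
As a rough illustration of composing the redirect target from a query string, the sketch below builds a URL from an article number. The class name RedirectTargets, the BuildArticleUrl helper and the decision to fall back to the site root for invalid input are all assumptions made for this example; the display.asp URL pattern is the one used for VSJ articles elsewhere in this article.

```csharp
using System;

// Hypothetical helper: composes the URL that a redirect page would pass
// to Response.Redirect. It has no dependency on System.Web, so the string
// handling can be checked in isolation.
public static class RedirectTargets
{
    private const string SiteRoot = "http://www.vsj.co.uk/";

    // articleNo is the raw query string value,
    // e.g. Request.QueryString["ArticleNo"].
    public static string BuildArticleUrl(string articleNo)
    {
        int id;
        // Accept only a positive integer. Anything else falls back to the
        // site root (an assumption) rather than being echoed into the URL.
        if (string.IsNullOrEmpty(articleNo)
            || !int.TryParse(articleNo, out id)
            || id <= 0)
        {
            return SiteRoot;
        }
        return SiteRoot + "dotnet/display.asp?id=" + id;
    }
}
```

In a page, the result would be handed straight to the redirect call, for example Response.Redirect(RedirectTargets.BuildArticleUrl(Request.QueryString["ArticleNo"]), true).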

In the frame

Domain registration companies have long since recognised the limitations of HTTP redirection, but still like instant updates and allowing the user to enter a URL rather than just a server address.

The most common alternative approach uses frames to mask the redirected URL. The effect of this is shown in Figure 3, and the code for a basic redirection page is in Listing 1.

Listing 1: Frame for Redirection

<html>
	<frameset>
		<frame src="http://www.vsj.co.uk/"
			frameborder="0"
			scrolling="no"/>
		<noframes>
		<a href="http://www.vsj.co.uk/">
Click here to transfer to the new site</a>
		</noframes>
	</frameset>
</html>

Figure 3: Masking redirection

There was a time when not all browsers supported frames, but now only a few mobile browsers still have limitations in this area. A noframes section can provide a fallback, albeit not a very graceful one.

This mechanism can work well for nearly any type of website. Links within the site will update only the frame (and thus not the URL in the address bar), although this can be overridden using a target="_top" attribute on the link. In theory it is possible to implement deep links (i.e. links where a path and/or file name follow the domain in the original URL) using this method, although few domain hosts offer this in practice. This is usually seen as the main limitation of this method, especially since bookmarks are made to the address in the address bar, and thus to the homepage.

Listings 2 and 3 show the mark-up and code respectively for redirecting to VSJ articles by article number. You would need a separate ASP.NET form for each type of page you wished to redirect using this method, so you could end up replicating most of the structure of the target site.

Listing 2: Page for Deep Redirection

<html xmlns="http://www.w3.org/1999/xhtml">
	<frameset frameborder="0">
		<frame src="<%# InitialUrl %>" />
	</frameset>
</html>

Listing 3: Code for Deep Redirection

using System;
using System.Web;
public partial class FramesDeep : System.Web.UI.Page
{
	// Bound into the frame's src attribute by the mark-up in Listing 2
	public string InitialUrl;
	protected void Page_Load(object sender, EventArgs e)
	{
		// Test with URL "FramesDeep.aspx?ArticleNo=610"
		// The value is URL-encoded so it cannot inject extra parameters
		InitialUrl = "http://www.vsj.co.uk/dotnet/display.asp?id="
				+ Server.UrlEncode(Request.QueryString["ArticleNo"]);
		DataBind();
	}
}

Page forwarding

Both of the previous techniques rely on the browser to make two requests, the first to get the redirect or frame page, and the second to get the real content. This may be an acceptable overhead to redirect from a domain to a homepage, but becomes hopelessly inefficient if used frequently within a web site. When redirecting to pages within an ASP.NET site an additional option is available in the form of the HttpServerUtility.Transfer function. This populates the response with the output of a specified page on the site instead of that from the currently executing page (see Figure 4).

Figure 4: Page forwarding

This example shows how a page might redirect when faced with a fatal error:

// Placeholder: replace with a real check of whether the service is available
bool serviceAvailable = false;
if ( !serviceAvailable )
{
	Server.Transfer(
		"Error.aspx?msg="
		+ Server.UrlEncode(
		"Service Not Available, please try again later" ) );
}

This technique can be used to transition from one form to another within the site too, but since the URL in the browser does not change it can cause unexpected behaviour if the user hits the back button. Cross-page posting may be a more appropriate way to achieve this (see my previous article, Posting Across Pages).

If the target of the transfer is a static file (e.g. a static HTML error page) then you can use HttpResponse.TransmitFile, which avoids the overhead of processing a second page by writing the contents of a file directly into the HTTP response. You will need to ensure that the HTTP headers (in particular the content type) are configured appropriately for the file being returned.
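
One way to keep the headers consistent with the file being returned is to derive the content type from the file extension. The following is only a sketch: the StaticContentTypes class, its ForPath method, the short list of extensions and the application/octet-stream fallback are assumptions chosen for illustration, not part of ASP.NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper that picks a Content-Type header value for a static
// file based on its extension. The table is deliberately small.
public static class StaticContentTypes
{
    private static readonly Dictionary<string, string> Known =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".html", "text/html" },
            { ".htm", "text/html" },
            { ".css", "text/css" },
            { ".gif", "image/gif" },
            { ".jpg", "image/jpeg" },
            { ".png", "image/png" }
        };

    public static string ForPath(string path)
    {
        // Path.GetExtension returns null for a null path and "" when the
        // path has no extension; neither is present in the table.
        string extension = Path.GetExtension(path);
        string contentType;
        if (extension != null && Known.TryGetValue(extension, out contentType))
        {
            return contentType;
        }
        // Assumed fallback for anything not listed above.
        return "application/octet-stream";
    }
}
```

A page could then set Response.ContentType = StaticContentTypes.ForPath("~/Error.html") before writing the file to the response with Response.TransmitFile(Server.MapPath("~/Error.html")).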

URL rewriting

Using Server.Transfer causes two pages to execute on the server, and so has some server side overhead. It also only works where the URL points at an ASP.NET page that can invoke Server.Transfer.

ASP.NET provides extensibility in the form of modules implementing the IHttpModule interface. These modules can intercept requests (and responses) at an early stage in the ASP.NET processing pipeline and apply changes to them, including rewriting the URL. The process is shown in Figure 5. Although this still results in the execution of two entities, an HttpModule has a far lower overhead than a page.

Figure 5: Intercepting requests

Listing 4 shows the code for a simple HttpModule that detects requests for a specific URL (“Home.aspx”) and rewrites the path so that a new version of the page can be served. To add this HttpModule to your application you also need to add it to your web.config file. The way you do this depends on whether you are using IIS6 or IIS7; Listing 5 shows both variants along with an extra tag that allows the two configurations to co-exist. This is useful in projects that may be deployed on either IIS version.

Listing 4: Simple HttpModule

using System;
using System.Web;
public class NewSiteModule : IHttpModule
{
	public NewSiteModule()
	{ }
	#region IHttpModule Members
	public void Dispose()
	{ }
	public void Init( HttpApplication context )
	{
		context.BeginRequest += new EventHandler(
			context_BeginRequest );
	}
	void context_BeginRequest(object sender, EventArgs e)
	{
		HttpApplication app = (HttpApplication)sender;
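		// Match only paths ending in "/home.aspx" (case-insensitive) so that
		// pages such as NewHome.aspx are not caught as well
		if (app.Context.Request.Path.EndsWith("/home.aspx",
			StringComparison.OrdinalIgnoreCase))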
		{
			app.Context.RewritePath( "~/NewHome.aspx" );
		}
	}
	#endregion
}

Listing 5: Installing an HttpModule

<?xml version="1.0"?>
<configuration>
	<appSettings/>
	<connectionStrings/>
	<system.web>
		<compilation debug="false" />
		<authentication mode="Windows" />
		<!-- START IIS6 CONFIG -->
		<httpModules>
			<add name="NewSiteModule" type="NewSiteModule"/>
		</httpModules>
		<!-- END IIS6 CONFIG -->
	</system.web>
	<!-- START IIS7 CONFIG -->
	<system.webServer>
		<!-- Including this tag allows you
			to have BOTH IIS6 and IIS7 config tags -->
		<validation
			validateIntegratedModeConfiguration="false"/>
		<modules>
			<add name="NewSiteModule" type="NewSiteModule" />
		</modules>
	</system.webServer>
	<!-- END IIS7 CONFIG -->
</configuration>

URL rewriting with an HttpModule has the advantage that it is completely transparent (neither the browser nor the page that is served is ever aware that the substitution has taken place) and that the page the URL refers to doesn’t have to exist on the filesystem. ASP.NET will handle the request (since it has a .aspx extension) and the HttpApplication.BeginRequest event fires before ASP.NET checks whether the page exists!

It is possible to mirror the structure of a whole site in this way. I worked with one customer who had moved their site from an in-house implementation to one provided by an external supplier using a third-party framework. We were able to ensure that any bookmarks to old pages still worked by having a HttpModule look up an XML file (you could also use a database) containing mappings from old URLs to new ones.
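
A lookup of this kind might be sketched as follows. The LegacyUrlMap class, the XML layout (a mappings root containing map elements with old and new attributes) and the case-insensitive matching are assumptions made for the example; they are not the format used on the customer site described above.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical old-to-new URL table, loaded once from XML of the assumed form:
//   <mappings>
//     <map old="/news.aspx" new="~/Articles/News.aspx" />
//   </mappings>
public class LegacyUrlMap
{
    private readonly Dictionary<string, string> map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public LegacyUrlMap(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        foreach (XElement element in doc.Root.Elements("map"))
        {
            string oldPath = (string)element.Attribute("old");
            string newPath = (string)element.Attribute("new");
            // Skip incomplete entries rather than failing the whole load.
            if (!string.IsNullOrEmpty(oldPath) && !string.IsNullOrEmpty(newPath))
            {
                map[oldPath] = newPath;
            }
        }
    }

    // Returns true and sets target when requestPath has a mapping.
    public bool TryGetTarget(string requestPath, out string target)
    {
        target = null;
        return requestPath != null && map.TryGetValue(requestPath, out target);
    }
}
```

Inside a BeginRequest handler such as the one in Listing 4, the module would call TryGetTarget with app.Context.Request.Path and, on success, pass the result to app.Context.RewritePath. Loading the table once and keeping it in memory keeps the per-request cost to a single dictionary lookup.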

It is also possible to provide friendly URLs for your users in this way. On another site product pages are all served using a single page with a query string: “Product.aspx?ProductName=MyProduct”. To make more presentable URLs for customers, I use a HttpModule to rewrite URLs like “Product/MyProduct.aspx” to invoke the real Product.aspx page.
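
The translation from the friendly form to the real page can be expressed as a small pattern match. This is a minimal sketch rather than the code used on that site: the ProductUrlRewriter class, the TryRewrite method and the exact regular expression are assumptions for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical translator from "/Product/MyProduct.aspx" to
// "~/Product.aspx?ProductName=MyProduct".
public static class ProductUrlRewriter
{
    // Captures the final path segment (without ".aspx") that follows a
    // "/Product/" folder at the end of the path.
    private static readonly Regex FriendlyPattern = new Regex(
        @"/Product/([^/]+)\.aspx$",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static bool TryRewrite(string requestPath, out string rewrittenPath)
    {
        rewrittenPath = null;
        if (requestPath == null)
        {
            return false;
        }
        Match match = FriendlyPattern.Match(requestPath);
        if (!match.Success)
        {
            return false;
        }
        // Escape the product name so it is safe to use as a query string value.
        rewrittenPath = "~/Product.aspx?ProductName="
            + Uri.EscapeDataString(match.Groups[1].Value);
        return true;
    }
}
```

A module would call TryRewrite from its BeginRequest handler and pass the rewritten path to app.Context.RewritePath. Requests for Product.aspx itself do not match the pattern, so the real page is left alone.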

Some search engine crawlers don’t cope well with query strings, so as well as providing a user-friendly URL this technique helps to provide a search-engine friendly one.

Ensuring your HttpModule is invoked

In order for an HttpModule to modify a request, it must be in the request processing pipeline. As we have already seen, you can place your HttpModule in the ASP.NET processing pipeline simply by adding it to your web.config file. This isn’t the full story, however, as ASP.NET doesn’t get involved in handling all web requests. How requests are routed to ASP.NET depends on whether you are using the ASP.NET Development Server or IIS and, in the latter case, on the IIS version.

IIS 6
In IIS6, ASP.NET is only invoked if the file extension of the URL is specifically mapped to it. These mappings are established in the “Internet Information Services” configuration tool. Open the “Properties” page for the site, and on the “Virtual Directories” tab click the “Configuration” button. The “Application Configuration” dialog should appear as shown in Figure 6.

Figure 6: The Application Configuration dialog

By default the standard ASP.NET extensions such as .aspx, .asmx and .ashx all map to ASP.NET. You can also see some classic ASP mappings here. Anything that does not appear in this list will be handled by native IIS without any recourse to managed code, and this usually results in the URL being treated as a reference to a static file.

If you wish to rewrite the URL of a resource type that is usually handled by native IIS (e.g. .html or .gif), then you need to add it to this list. The best way to determine the “Executable Path” you should use is to copy it from an entry already mapped to ASP.NET. The HTTP Verbs handled should usually include GET, HEAD and POST, and optionally DEBUG.

If we rewrite the URL to point to an ASP.NET page or handler (.aspx, .ashx), all will be well, as ASP.NET understands how to handle such requests. If, however, we wish to rewrite the URL to point to a static resource in a different location (e.g. rewriting to reference a different .jpg or .html resource) we have a problem, because ASP.NET does not inherently know how to handle requests of these types. We can tell it how to handle these requests by adding an httpHandlers section to the web.config file and mapping to an appropriate handler:

<httpHandlers>
	<add verb="GET" path="*.jpg"
	type="System.Web.StaticFileHandler"/>
</httpHandlers>

IIS 7
For ASP.NET developers, IIS7 is a huge step forward from its predecessors. This is because ASP.NET is much more tightly integrated with the server than in previous versions, where it was just one possible request processor integrated using ISAPI. In IIS7 ASP.NET is integrated into the request processing pipeline, and you can configure ASP.NET HttpModules to be invoked for all requests, even if they will not ultimately be handled by ASP.NET. To enable this feature, add a runAllManagedModulesForAllRequests attribute to the <modules> tag in your web.config file:

<modules runAllManagedModulesForAllRequests="true">
	<add name="NewSiteModule"
		type="NewSiteModule" />
</modules>

In IIS7 there is no need to add a StaticFileHandler as there was in IIS6 because our HttpModule is part of the main IIS request processing pipeline (we have NOT had to direct the request out of IIS and into ASP.NET) and the main IIS request processing pipeline already handles requests as static files by default (just like the main IIS6 pipeline).

ASP.NET Development Server
The Development Server is launched by Visual Studio to enable testing and debugging of websites on machines that don’t have IIS installed. This server is NOT derived from IIS and works entirely in the managed space. All requests pass through the ASP.NET processing pipeline, and those not routed elsewhere are processed using the ASP.NET StaticFileHandler. This means that you don’t need to do any specific configuration to get ASP.NET to handle requests for non-ASP.NET file types. A common source of deployment problems arises from this, as sites that work perfectly in the development server will need additional configuration to work on IIS. The Development Server in VS2005 uses an IIS6 style web.config file; the VS2008 version can be configured to use IIS6 or IIS7 style configuration files.

Alternative to HttpModules

Inserting an HttpModule into the processing pipeline is an extremely flexible way of changing the resource that will be processed in response to a particular request URL. The disadvantage of this approach is that your HttpModule will be loaded and invoked for each request, whether or not there is a redirection associated with it. This clearly introduces a small overhead to the processing of each request, even if your implementation is very efficient.

ASP.NET supports mapping certain resource names to a HttpHandler based on a wildcard. We already looked at adding a StaticFileHandler (one of the built in ASP.NET HttpHandlers) to cope with redirected requests to file types not normally handled by ASP.NET. For example, if your ASP.NET site is replacing a previous PHP implementation, you could use the IHttpHandler implementation shown in Listing 6 to redirect all requests for pages ending with .php (e.g. old links, bookmarks) to your new homepage.

Listing 6: PHPHandler

using System;
using System.Web;
public class PHPHandler :
	IHttpHandler
{
	public void ProcessRequest(
		HttpContext context )
	{
		context.Server.Transfer(
			"~/NewHome.aspx" );
	}
	public bool IsReusable
	{
		get { return true; }
	}
}

I have implemented this handler in a .cs file in the App_Code directory rather than as a .ashx handler file because the latter have their class names mangled during compilation and so cannot easily be referenced for installation. You can install the HttpHandler by adding the following to your web.config file:

<httpHandlers>
	<add verb="GET" path="*.php"
		type="PHPHandler"/>
</httpHandlers>

Note that this maps all requests with the .php extension to the handler even if they refer to a subfolder, and whether or not the subfolder exists. Note also that the order of handlers is important: each wildcard will be evaluated in turn in the order the handlers appear, and the first matching handler will be invoked.

Of course, it is possible to build more sophisticated handlers that perform a complex transformation on the URL or use an XML file or data table to map the request URL onto a corresponding page in the new site.

Figure 7: HTTPHandler

Figure 7 shows how handler invocation works in this case, where we transfer to another .aspx page. The HttpHandler could instead return content directly, either by generating it internally or by using Response.TransmitFile to return static content.

Conclusion

In this article I have demonstrated many different techniques that can be used to break the strict linkage between request URLs and the resources used to service those requests. Many of the subjects I have touched on, such as the use of HttpModules and HttpHandlers, have wide applicability and I hope that my brief introduction will inspire you to explore them further!


Ian Stevenson has been developing Windows software professionally for 10 years, in areas ranging from WDM device drivers through to rapid-prototyping of enterprise systems. Ian currently works as a consultant in the software industry, and can be contacted at [email protected].

The Domain Name System

The Domain Name System (or DNS) translates domain names such as “www.vsj.co.uk” to the corresponding IP address, 89.151.107.226 in this case. HTTP requests are always made to IP addresses, so when the URL a browser needs to render contains a domain name, it asks the configured DNS server to translate.

The configured DNS server may be on the corporate network, or may be provided by the ISP. In either case, it will refer the request up a hierarchy of DNS servers until a server that knows the IP address is found.

If every DNS lookup were processed in this way, it would introduce big delays to request processing, so each server in the hierarchy caches lookup information that passes through it. These cached addresses are often retained for 24 or even 48 hours. When the master DNS record is updated, users may still receive cached lookups of the old address for up to this time, while the new address propagates through the system. This is known as DNS propagation delay.

For more information, see Wikipedia’s DNS page.
