Asynchronous HttpWebRequest

Since there was just a posting on HttpWebRequest, and since it just so happens that I've been having fun with that very class today, I thought I'd add my two cents regarding this wicked cool type.

I have a collection of URLs, a couple of hundred for example, and need to query each URL to determine if the site still exists and, if so, when it was last modified. Doing this synchronously could take a couple of minutes depending on bandwidth and traffic.

However, by scanning the URLs asynchronously, you can build a much more responsive and user-friendly application.

There are three important points that need to be integrated to completely solve this:

  1. Scan the list in a new thread so your UI stays responsive.
  2. Use HttpWebRequest.BeginGetResponse() to initiate an asynchronous request.
  3. Use ThreadPool.RegisterWaitForSingleObject() to register a timeout delegate for unresponsive Web requests.

Yo! Hands off the Thread, Man!

Starting up a new thread is very simple using the .NET Framework System.Threading namespace. We create a new thread, mark it to run in the background and kick it off.

Thread t = new Thread(new ThreadStart(ScanSites));
t.IsBackground = true;
t.Start();

As you might guess, the ScanSites method (a custom method shown below) will run on a thread separate from the Windows Forms UI thread. The user will be able to interact with the application without noticing the background process chugging along (hopefully).

Reach Out and Touch Someone

A Web request begins its life in a fairly mundane way. You first need to create a new request. The static Create() method, which HttpWebRequest inherits from WebRequest, returns a reference typed as the abstract WebRequest class; for http and https URIs the object behind it is an HttpWebRequest. You can modify its properties before calling BeginGetResponse().

The BeginGetResponse() method is typical of many other asynchronous kick-off routines: it requires a callback delegate and a user-defined state argument.

  private void ScanSites ()
  {
    // for each URL in the collection...
    WebRequest request = HttpWebRequest.Create(uri);
    request.Method = "HEAD";

    // RequestState is a custom class to pass info
    RequestState state = new RequestState(request, data);

    IAsyncResult result = request.BeginGetResponse(
      new AsyncCallback(UpdateItem), state);

    // PLACEHOLDER: See below...
  }

  private void UpdateItem (IAsyncResult result)
  {
    // grab the custom state object
    RequestState state = (RequestState)result.AsyncState;
    WebRequest request = state.request;
    try
    {
      // get the response; EndGetResponse throws a WebException
      // if the request failed or was aborted
      HttpWebResponse response =
        (HttpWebResponse)request.EndGetResponse(result);
      // process the response...
      response.Close();  // release the connection when done
    }
    catch (WebException)
    {
      // mark the URL as unreachable...
    }
  }
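The listing above relies on a RequestState class that is never shown. Here is a minimal sketch of what it might look like, assuming only the two things the listing actually uses: a public field named request and a constructor that takes the request plus an arbitrary data object. The data field name and its object type are my assumptions; use whatever per-URL information your callback needs.

```csharp
using System.Net;

// Hypothetical state object handed to BeginGetResponse() and returned
// to the callback through IAsyncResult.AsyncState.
class RequestState
{
    // The pending request, kept so the callback can call EndGetResponse()
    // and a timeout handler can call Abort().
    public readonly WebRequest request;

    // Caller-defined payload (assumed name), e.g. the list item to update.
    public readonly object data;

    public RequestState(WebRequest request, object data)
    {
        this.request = request;
        this.data = data;
    }
}
```

Keeping the request inside the state object is what lets the callback and the timeout routine find it again without any shared fields.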

So far, so good. But what's up with the PLACEHOLDER comment? Read on my friend. Read on...

Careful with that Axe Eugene!

Although the WebRequest class has a Timeout property, it is ignored when using asynchronous requests. So we need to set up our own timer to keep an eye on lengthy HTTP calls. If a call takes too long, we should jump in and abort it, probably marking the URL as suspicious (for example, the site is down or no longer exists).

Here's an example. Replace the PLACEHOLDER comment above with the following call. Then add the ScanTimeoutCallback routine somewhere in your class.

  ThreadPool.RegisterWaitForSingleObject(
    result.AsyncWaitHandle,
    new WaitOrTimerCallback(ScanTimeoutCallback),
    state,
    (30* 1000),  // 30 second timeout
    true
    );
 
  private static void ScanTimeoutCallback (
    object state, bool timedOut)
  {
    if (timedOut)
    {
      RequestState reqState = (RequestState)state;
      if (reqState != null)
        reqState.request.Abort();
    }
  }
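One consequence worth keeping in mind: aborting a request does not silence the callback. The AsyncCallback still fires, and EndGetResponse() then throws a WebException, which the .NET documentation describes as carrying the status WebExceptionStatus.RequestCanceled. The sketch below shows one way to turn such failures into a short label for the URL list. ScanOutcome, Describe, and the label strings are hypothetical names of my own, not part of the framework.

```csharp
using System;
using System.Net;

static class ScanOutcome
{
    // Hypothetical helper: maps a failed request to a short status label.
    public static string Describe(WebException ex)
    {
        switch (ex.Status)
        {
            case WebExceptionStatus.RequestCanceled:
                // Abort() was called, e.g. by the timeout callback.
                return "aborted (timed out)";
            case WebExceptionStatus.NameResolutionFailure:
                return "host not found";
            case WebExceptionStatus.ProtocolError:
                // The server answered, but with an error status such as 404.
                var http = ex.Response as HttpWebResponse;
                return http != null
                    ? "HTTP " + (int)http.StatusCode
                    : "protocol error";
            default:
                return "failed: " + ex.Status;
        }
    }
}
```

A callback could wrap its EndGetResponse() call in a try/catch and pass the caught WebException to Describe() to decide how to mark the site.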

Not too hard, eh? We've covered all three points with pretty straightforward code. Multi-threaded programming in Windows Forms is trivial, but required for a good user experience. Sending asynchronous HTTP requests is equally trivial; just remember to construct a custom state object containing all the relevant information you may need within the AsyncCallback. And, finally, remember to abort requests that refuse to complete. A little code goes a long way.

Comments

  1. 08 Jan 2010 at 09:26

    Steve, thanks for the article. It was really helpful. I am working on a project that uses the web request technique, but I'm facing a tough problem. My application is a web scraping application. The website I'm scraping has a combo box list which, of course, fires a do post back on change. I want to use the web request to fire the do post back event with the value I want and extract the resulting page source. Is that possible?

    Thanks in Advance, Happy holidays Beheiry

  2. 16 Aug 2009 at 12:21

    Thanks for this article but I have a question concerning making asynchronous http requests. Is it possible to call the same script say [ http://buzzme.com/index.aspx?message=tasty ] three thousand times (3,000) asynchronously to submit different message parameters without it breaking? ... and at the same time getting the response of each request and logging the output somewhere in a db.

  4. 18 Feb 2009 at 18:37

    It took me about 4 hrs of searching around the interwebz to finally wrap my head around this concept. After digging through incomplete and just plain wrong code, I ended up putting together a complete working example from information found in a few different posts. The code below is not commented, as the original poster in this forum did a great job of explaining; too bad he left out some classes and how to handle the cross-threaded response. Enjoy. Hope this makes someone's learning experience easier than mine was.

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;
    using System.Threading;
    using System.Net;
    using System.IO;

    namespace sampleApp
    {
      // declaration missing from the post as published;
      // inferred from how the delegate is used below
      public delegate void DelegateAddString(String str);

      public partial class Form1 : Form
      {
        List<string> list = new List<string>();
        public DelegateAddString t_DelegateAddString;

        public Form1()
        {
          InitializeComponent();
          t_DelegateAddString = new DelegateAddString(this.TbAddString);
        }

        private void Form1_Load(object sender, EventArgs e)
        {
          list.Add("http://google.com");
          list.Add("http://yahoo.com");
          list.Add("http://msn.com");
          list.Add("http://aol.com");
        }

        private void TbAddString(String str)
        {
          textBox1.Text += (str);
        }

        private void button1_Click(object sender, EventArgs e)
        {
          Thread t = new Thread(ScanSites);
          t.IsBackground = true;
          t.Start();
        }

        private void ScanSites()
        {
          foreach (string uri in list)
          {
            WebRequest request = HttpWebRequest.Create(uri);
            request.Method = "GET";
            object data = new object();
            RequestState state = new RequestState(request, data, uri);
            IAsyncResult result = request.BeginGetResponse(
              new AsyncCallback(UpdateItem), state);
            ThreadPool.RegisterWaitForSingleObject(
              result.AsyncWaitHandle,
              new WaitOrTimerCallback(ScanTimeoutCallback),
              state,
              (30 * 1000),
              true);
          }
        }

        private void UpdateItem(IAsyncResult result)
        {
          String str;
          RequestState state = (RequestState)result.AsyncState;
          WebRequest request = (WebRequest)state.Request;
          HttpWebResponse response =
            (HttpWebResponse)request.EndGetResponse(result);
          Stream s = (Stream)response.GetResponseStream();
          StreamReader readStream = new StreamReader(s);
          string dataString = readStream.ReadToEnd();
          // read the headers before closing the response
          string lastMod = String.Empty;
          if (response.Headers["last-modified"] != null)
            lastMod = response.Headers["last-modified"];
          str = "Read: " + state.SiteUrl + ": "
            + response.ContentLength.ToString()
            + " bytes. Last-Mod: " + lastMod;
          readStream.Close();
          s.Close();
          response.Close();
          Thread.Sleep(400);
          Invoke(t_DelegateAddString, new Object[] { str });
        }

        private static void ScanTimeoutCallback(object state, bool timedOut)
        {
          if (timedOut)
          {
            RequestState reqState = (RequestState)state;
            if (reqState != null)
              reqState.Request.Abort();
            MessageBox.Show("aborted- timeout");
          }
        }

        class RequestState
        {
          public WebRequest Request; // holds the request
          public object Data;        // store any data in this
          public string SiteUrl;     // holds the UrlString to match up results (Database lookup, etc).

          public RequestState(WebRequest request, object data, string siteUrl)
          {
            this.Request = request;
            this.Data = data;
            this.SiteUrl = siteUrl;
          }
        }
      }
    }

  5. 22 May 2007 at 20:17

    "Careful with that Axe Eugene" - I love that song.

  6. 30 Mar 2006 at 18:45

    Hi Steven

    I've got the following problem and your proposal seems to be the solution:

    I want to do an asynchronous request, where I have to post a large amount of data to some old ASP component. "Asynchronous" for me means starting the post and immediately continuing with my workflow while the huge amount of data is posted in the background...

    Therefore I call, as suggested in the MSDN help:

    // Start the asynchronous request.

    IAsyncResult result = ( IAsyncResult ) httpRequest.BeginGetResponse ( new AsyncCallback ( RespCallback ), state );

    // this line implements the timeout, if there is a timeout, the callback fires and the request becomes aborted

    ThreadPool.RegisterWaitForSingleObject (
    result.AsyncWaitHandle, new WaitOrTimerCallback ( TimeoutCallback ),
    httpRequest, DefaultTimeout, true );

    Now I set my breakpoint in the RespCallback() method and here's the problem:
    after BeginGetResponse() the process first jumps to RespCallback() and then jumps to RegisterWaitForSingleObject()

    So this is not really asynchronous...
    Did I miss something or is this general behaviour?

    After trying your solution (performing this request in an extra thread) I saw that this would be the right way, but I have some questions left:

    If you start this request in an extra thread, why do you make the request asynchronous?
    I think asynchronicity here is achieved by the extra thread which runs in the background.

    Is there a way to achieve the asynchronicity I want without this extra thread (I'm not sure that what I want really matches the meaning of "asynchronicity" used by BeginGetResponse() ... EndGetResponse())?


    thanx in advance and nice greetings
      Bernd

Steven Cohn