Creating a Generic Site-To-Rss Tool

Wrapping Up

Link Prefix

One property in the class needs to be explained. LinksPrefix contains the prefix prepended to each news item link that is discovered. When harvesting the HTML, notice that not all links in a site are “full” links; usually they are “partial” links that point to some place in the same site. In cases such as these (and .netWire is one), we specify the LinksPrefix as http://www.dotnetwire.com to make the links of the news items “full” again.
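To make the idea concrete, here is a minimal sketch of how a prefix can turn a partial link into a full one. MakeFullLink is a hypothetical helper written for illustration, not a member of the RSSCreator class, and the sample path /news/item.asp?id=1 is invented. The StringComparison overload of StartsWith assumes .NET 2.0 or later.

```vbnet
Imports System

Module LinkPrefixSketch
    ' Hypothetical helper (not part of RSSCreator): prepends the prefix
    ' only when the harvested link is a partial link.
    Function MakeFullLink(ByVal linksPrefix As String, ByVal link As String) As String
        If link.StartsWith("http://", StringComparison.OrdinalIgnoreCase) OrElse _
           link.StartsWith("https://", StringComparison.OrdinalIgnoreCase) Then
            Return link ' already a full link, leave it alone
        End If

        ' Join prefix and partial link with exactly one slash between them.
        Return linksPrefix.TrimEnd("/"c) & "/" & link.TrimStart("/"c)
    End Function

    Sub Main()
        ' Prints: http://www.dotnetwire.com/news/item.asp?id=1
        Console.WriteLine(MakeFullLink("http://www.dotnetwire.com", "/news/item.asp?id=1"))

        ' Prints: http://example.com/a
        Console.WriteLine(MakeFullLink("http://www.dotnetwire.com", "http://example.com/a"))
    End Sub
End Module
```

Checking for an existing scheme first is a design choice of this sketch; it keeps the helper safe on sites that mix full and partial links.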

RFC Date Formats

RSS 2.0 requires the publish date to be formatted as an RFC822 date. We are using the RFC1123 format, which returns essentially the same result: it is the RFC822 layout with a four-digit year, which RSS 2.0 accepts.
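As a rough illustration, the RFC1123 format is available in .NET through the standard "r" format specifier. The sketch below assumes .NET 2.0 or later (for DateTimeKind); ToRfc1123 is a hypothetical helper name, and the article does not show how the RSSCreator class formats its dates internally.

```vbnet
Imports System

Module Rfc1123Sketch
    ' Hypothetical helper: formats a date the way an RSS pubDate expects.
    Function ToRfc1123(ByVal value As DateTime) As String
        ' "r" always appends "GMT" but does not convert the time itself,
        ' so the value is converted to UTC before formatting.
        Return value.ToUniversalTime().ToString("r")
    End Function

    Sub Main()
        ' A fixed UTC timestamp so the output is reproducible.
        Dim published As New DateTime(2003, 6, 15, 14, 30, 0, DateTimeKind.Utc)

        ' Prints: Sun, 15 Jun 2003 14:30:00 GMT
        Console.WriteLine(ToRfc1123(published))
    End Sub
End Module
```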

Using the Generic Class with .netWire

It's ready. Let's use it! Here's a simple snippet that uses the class to parse .netWire and return an XML RSS feed from it:

' Point the class at the page to parse.
Dim rss As RSSCreator.RSSCreator = _
    New RSSCreator.RSSCreator("http://www.dotnetwire.com")

With rss
    ' .netWire uses partial links, so prefix them with the site URL.
    .LinksPrefix = rss.UrlToParse

    ' A single pattern; the string is split with & _ because a VB.NET
    ' string literal cannot span several lines.
    .RegexPattern = "<p\s*class=""clsNormalText""><a\shref=""(?<link>.*)?" & _
        "(""\s*target=""newwindow"")(.|\n)*?>(?<title>.*\n?.*)?" & _
        "(</a><br>\s*\n*)(?<description>(.|\n)*?)(<br>(.|\n)*?>)" & _
        "(?<category>.*)?\.\s*(?<pubDate>.*)?(\.</span>)"

    ' Feed-level metadata.
    .RSSFeedName = "The unofficial .NetWire RSS feed"
    .RssFeedLink = "http://www.DotNetWire.com"
    .RssFeedDescription = "A basic feed that parses the .NetWire site"
    .RssFeedCopyright = "Copyright 2003 Roy Osherove"

    Return .GetRss()
End With
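The named groups in the pattern (link, title, description, and so on) are what map each match to the fields of a news item. The sketch below shows that mechanism on its own, using System.Text.RegularExpressions directly. The markup and the pattern here are invented and far simpler than the real .netWire page, ExtractItems is a hypothetical helper, and the generic List assumes .NET 2.0 or later. It does not reproduce the internals of the RSSCreator class.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module NamedGroupsSketch
    ' Hypothetical pattern for hypothetical markup. Each (?<name>...)
    ' group captures one field of a news item.
    Const ItemPattern As String = _
        "<p class=""item""><a href=""(?<link>[^""]*)"">(?<title>[^<]*)</a><br>" & _
        "(?<description>[^<]*)</p>"

    ' Returns one "link | title | description" string per matched item.
    Function ExtractItems(ByVal html As String) As List(Of String)
        Dim items As New List(Of String)
        For Each m As Match In Regex.Matches(html, ItemPattern)
            items.Add(m.Groups("link").Value & " | " & _
                      m.Groups("title").Value & " | " & _
                      m.Groups("description").Value)
        Next
        Return items
    End Function

    Sub Main()
        Dim html As String = _
            "<p class=""item""><a href=""/news/1"">First title</a><br>First description</p>" & _
            "<p class=""item""><a href=""/news/2"">Second title</a><br>Second description</p>"

        ' Prints two lines, one per news item found in the sample markup.
        For Each item As String In ExtractItems(html)
            Console.WriteLine(item)
        Next
    End Sub
End Module
```

Trying a pattern against a small, known piece of markup like this is a quick way to check the group names before pointing it at a live page.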

What's in the Download?

The download contains several projects:

  • RssCreator: Library with the source for the RSSCreator class.
  • MakeRss: Simple ASP.NET project that retrieves a feed for the .netWire site.
  • SiteToRss: Simple WinForms utility for testing regular expressions against various sites.

About the author

Roy Osherove Israel

Roy Osherove has spent the past 6+ years developing data driven applications for various companies in Israel. He's acquired several MCP titles, written a number of articles on various .NET topic...
