Microsoft Indexing Service and ASP.NET

Building your Search Query

Once you have a connection you can execute queries against it. The syntax used to query the Indexing Service is a limited subset of SQL, documented in the Indexing Service section of MSDN.

For example, the search I use on the .NET portion of this site looks like this:

select doctitle, filename, vpath, rank, characterization
from scope('"/dotNet"')
where FREETEXT(Contents, 'searchText')
and filename <>'search.aspx'
order by rank desc
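
In practice the 'searchText' placeholder in the query above is replaced with whatever the user typed, so the text has to be made safe to embed in a string literal. The following is a minimal sketch of that step, written in Python purely for illustration. The function names are hypothetical, and it assumes that a single quote inside a literal is escaped by doubling it, as in standard SQL; check the Indexing Service documentation before relying on this.

```python
def escape_literal(text):
    """Double single quotes so the text can sit inside a quoted literal.

    Assumption: the Indexing Service SQL dialect follows the standard SQL
    convention of doubling single quotes.
    """
    return text.replace("'", "''")


def build_search_query(search_text, scope_path="/dotNet", exclude_file="search.aspx"):
    """Assemble the article's sample query around user-supplied search text."""
    return (
        "select doctitle, filename, vpath, rank, characterization\n"
        f"from scope('\"{scope_path}\"')\n"
        f"where FREETEXT(Contents, '{escape_literal(search_text)}')\n"
        f"and filename <> '{escape_literal(exclude_file)}'\n"
        "order by rank desc"
    )


print(build_search_query("what's new in ASP.NET"))
```

The sketch does not validate scope_path; a path containing quote characters would need separate handling.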

select parameters

Let's examine some of the columns that Index Server tables contain.

  • Access
    This is the date the document was last accessed.
  • AllocSize
    This is the disk space currently allocated to the document.
  • Attrib
    This is the set of file system attributes (read only, system, etc.) currently set on the file.
  • Characterization
    This is the document abstract, if available. This is configurable as part of the index properties, where you can also set the number of characters in the abstract. The abstract is then produced from the document body.
  • ClassID
    This is the OLE ClassID for the document.
  • Contents
    The complete contents of a file in the index. This can be queried on but not retrieved as part of the select clause.
  • Create
    This is the creation date of the document.
  • DocAuthor
    This is the document author, if the document provides this metadata. Office documents, Adobe PDFs and media files generally have this property; HTML, XML, ASP and ASP.NET documents do not.
  • DocTitle
    This is the document title, extracted from the document metadata in Office documents, or from the <title> tag in markup documents.
  • FileIndex
    This is the unique identifier of the file.
  • FileName
    This is the document filename.
  • HitCount
    This is the number of times the search term appears in your document.
  • Path
    This is the full physical path to the document, including the file name, for example c:\inetpub\wwwroot\examples\example.aspx
  • Rank
    This is an indication of relevance. Index Server ranks its results between 0 and 1000. The higher the rank, the more relevant Index Server believes the document is to the search criteria.
  • ShortFileName
    This is the 8.3 short DOS file name of the document.
  • Size
    This is the document size.
  • USN
    This is the update sequence number, an NTFS attribute.
  • VPath
    This is the virtual path to the document, including the file name, relative to the root of the web site the index is for, for example /examples/example.aspx. If more than one virtual path exists to a file, Index Server chooses the virtual path it believes best matches the query.
  • WorkID
    This is the internal ID Index Server uses for each file.
  • Write
    This is the date the file was last written to.

Additional columns are provided for OLE-aware documents, such as those produced by Microsoft Office. These include DocAppName, DocAuthor, DocCharCount, DocComments, DocCreateDTM, DocEditTime, DocKeywords, DocLastAuthor, DocLastPrinted, DocLastSaveDTM, DocPageCount, DocRevNumber, DocSecurity, DocSubject, DocTemplate, DocTitle and DocWordCount.
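
Because Rank runs from 0 to 1000, a results page will often want to show it as a percentage. Here is a minimal sketch of that conversion, in Python for illustration only. The function name and the sample rows are hypothetical and do not come from a real query.

```python
def rank_to_percent(rank):
    """Convert a Rank value (0 to 1000) into a percentage (0.0 to 100.0)."""
    if not 0 <= rank <= 1000:
        raise ValueError(f"rank out of range: {rank}")
    return rank / 10.0


# Hypothetical result rows, shaped like the columns selected in the sample query.
rows = [
    {"filename": "a.aspx", "rank": 412},
    {"filename": "b.aspx", "rank": 1000},
]

# Show the most relevant documents first, as "order by rank desc" would.
for row in sorted(rows, key=lambda r: r["rank"], reverse=True):
    print(f"{row['filename']}: {rank_to_percent(row['rank']):.1f}%")
```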

from parameters

Now that we know what we can select, we need to examine where we are selecting from. The sample query above contains this from clause:

from scope('"/dotNet"')


There are a few methods to limit your search.

You can use the scope() function, as shown in the example. This is the main component of the from clause. The scope function takes zero or more comma-separated scope arguments. A scope argument combines a Traversal_Type and a Path, and each Scope_Argument must be surrounded by single quotes. You can also call scope with an empty argument list, (). This is the default scope and effectively sets the scope to start at the virtual root of your web site ( / ).

As an alternative to using scope(), you can use any one of a set of predefined views that Index Server provides. You can reference one of these predefined views in the from clause by specifying the View_Name.

For example,

select filename from scope()

This returns all file names in the current index, with no limitation on the directories to search.

select filename
from scope('shallow traversal of "D:\inetpub\wwwroot\examples"',
'deep traversal of "/examples2"',
'hierarchical traversal of "/examples3"')

In this example we have three scope arguments, with three different Traversal_Type values and three different paths, one physical and two virtual.

A shallow traversal searches the resources in the specified folder, but not in any of the subfolders.

A deep traversal searches against any and all subfolders of the given folder, all the way to the bottom of the folder hierarchy.

A hierarchical traversal searches against folder resources in a specified folder. A hierarchical traversal search can be used for a task such as determining the folder hierarchy of a specified folder.
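
The scope() syntax described above is easy to get wrong by hand, since each argument needs single quotes on the outside and double quotes around the path. The sketch below shows one way to assemble the clause from (traversal type, path) pairs. It is written in Python for illustration only, the helper names are hypothetical, and the list of traversal types is simply the three named in this article.

```python
# Traversal types as named in this article (assumed to be the complete list).
TRAVERSAL_TYPES = ("shallow", "deep", "hierarchical")


def scope_argument(traversal, path):
    """Format one scope argument: single quotes outside, double quotes around the path."""
    if traversal not in TRAVERSAL_TYPES:
        raise ValueError(f"unknown traversal type: {traversal}")
    return f"'{traversal} traversal of \"{path}\"'"


def scope_clause(*arguments):
    """Join zero or more (traversal, path) pairs into a scope(...) call."""
    return "scope(" + ", ".join(scope_argument(t, p) for t, p in arguments) + ")"


print("select filename from " + scope_clause(
    ("shallow", r"D:\inetpub\wwwroot\examples"),
    ("deep", "/examples2"),
    ("hierarchical", "/examples3"),
))
```

Calling scope_clause() with no arguments produces scope(), the default scope.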

select filename from EXTENDED_WEBINFO

In this example we can see one of the pre-defined views, EXTENDED_WEBINFO. The pre-defined views are documented in MSDN.
