Researching search

I’ve had "searching" on my mind lately but in a very narrow focus. Specifically, I have a Brownfield application for a client I dig up every so often whenever new features need to be added to it. (Off topic: As soon as I can re-brand the sucker, it’ll make a dandy little sample app for some Brownfield-related posts and webcasts. There, I said it on the Internet so the guilt will be that much greater if I don’t follow through.)

One of the key components of this application is a document repository that you can search both by meta-criteria and by a full-text search of the documents (see screenshot).

The meta-criteria part is boring. Bunch of lookup tables in a SQL Server database and I’m building up a SQL query manually in the search process. (I *did* say this was a Brownfield application.)

The full text search is done through a kind of poor-man’s search index. I’m using a custom catalog with Microsoft’s Indexing Service, available on Windows Server 2000 and later.

To do this, and I’m going to skim partially because it’s not really the focus of the post but mostly because I think it’s outdated, I create a catalog that indexes the document folder. Then, I make the catalog accessible to SQL Server so I can combine queries to it with queries to the metadata. Here’s the SQL to do so:

EXEC sp_addlinkedserver <linkedServerName>, 'Index Server', 'MSIDXS', '<name of catalog>'

Now I can write queries like the following:

SELECT DocumentID, Title, Date, Filename
FROM doc_list_view dl,
OpenQuery( <linkedServerName>, 
   'SELECT Filename FROM SCOPE( ) 
    WHERE CONTAINS( Contents, ''<queryTerm>'' ) '
) q
WHERE Date BETWEEN 'January 1, 1850' AND 'December 31, 2008'
AND q.Filename = dl.Filename
ORDER BY Date DESC, Title

Easy as raccoon pie, yesno? The nice thing about this is that it automatically indexes all Microsoft Office docs, and PDFs are handled by installing Adobe Reader 8.0 (though on 64-bit that's a separate download).
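One wrinkle with the query above: as far as I know, OPENQUERY won't take a variable for its query argument, so the search term has to be spliced into the string before the statement is sent. That means the term sits inside a literal that is itself inside a literal. Here is a minimal sketch of the escaping in Python, purely for illustration. The function names are made up, the post's actual application code isn't shown here, and it assumes the Indexing Service query language escapes a single quote by doubling it, the same way T-SQL does.

```python
def escape_for_openquery(term: str) -> str:
    """Escape a search term that sits in a literal nested inside OPENQUERY."""
    # First level: the literal inside the Indexing Service query
    # (assumption: a single quote is escaped by doubling it).
    inner = term.replace("'", "''")
    # Second level: that whole query is itself a T-SQL string literal,
    # so every quote has to be doubled once more.
    return inner.replace("'", "''")


def build_fulltext_sql(linked_server: str, term: str) -> str:
    """Build the combined query from the post for one search term."""
    # The linked server name should come from configuration, never from
    # user input. This check is a crude guard, not a full validator.
    if not linked_server.replace("_", "").isalnum():
        raise ValueError("unexpected linked server name")
    return (
        "SELECT DocumentID, Title, Date, Filename\n"
        "FROM doc_list_view dl,\n"
        f"OpenQuery( {linked_server},\n"
        "   'SELECT Filename FROM SCOPE( )\n"
        f"    WHERE CONTAINS( Contents, ''{escape_for_openquery(term)}'' ) '\n"
        ") q\n"
        "WHERE q.Filename = dl.Filename\n"
        "ORDER BY Date DESC, Title"
    )


if __name__ == "__main__":
    # "DocCatalog" is a hypothetical linked server name.
    print(build_fulltext_sql("DocCatalog", "O'Brien"))
```

A single quote in the term ends up as four quotes in the final statement, one doubling per level of nesting. The date filter from the original query is left out to keep the sketch short; it can go through ordinary query parameters since it lives outside the OPENQUERY string.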

Fast-forward to 2008 and the Indexing Service is no longer turned on by default on either Vista or Windows Server 2008. Posed a problem for the client who runs it locally on his laptop but it was easily remedied simply by turning the service on. For my own newly-minted Server 2008 machine, Indexing Service and the new Windows Search can’t run together. And I’m loath to enable the older technology, so it’s time to update if I can.

That led me to this page which confuses me to no end. Starting with the terminology: Windows Search, Windows Desktop Service, Instant Search, Windows Indexing Service, Indexed Search, MSN Desktop Search are the terms used to discuss various incarnations of the technology. And that’s not including codenames or different versions of the same technology. Adding to my addlement is when Microsoft changes terminology from one page to the next. One example is this page which, in the section on the History of Windows Search, leads you to the download page for Windows Search for Windows Server 2003. That one is titled: Description of Windows Desktop Search 3.01.

So now I’m facing the unenviable, but oddly ironic, task of having to search through the myriad of search options for a solution that will run on Windows Vista, Windows Server 2008, and Windows Server 2003 (where the main site is hosted). At first glance, it seems Windows Desktop Search is the way to go for Server 2003 as it appears the new Windows Search is API-compatible with it.

In any case, this’ll make one heckuva sample for the Brownfield series on managing integration with external systems. Because as you’ve probably guessed by now, there are no automated (or even manual) tests for this whole process.

Kyle the Researched

  • Kyle Baley

    @Andy: Actually, I was going to mention SharePoint/WSS but left it out hoping someone would comment on it. I did consider it but it’s not really an option due to the cost/benefit. We need only to index a folder with documents, not create an entire portal. Plus, as you mention, it needs to run locally on Vista.

  • Andy S

    Don’t forget SharePoint Enterprise Search. It’s not really appropriate for a locally hosted solution, but definitely something to think about in an enterprise deployment.