Monday, December 28, 2009

Tip: a difference in metadata alone does not make document items different for search

In my last post I described an approach to set up your managed properties collection as part of initializing the enterprise search experience. To test the outcome of this initialization, I next uploaded a couple of documents and arbitrarily filled in some of the metadata fields. Actually, I uploaded the same dummy test document multiple times, renaming it each time. To my surprise, enterprise search consistently returned only 1 of the uploaded documents. The cause is that the search crawl detected that the renamed documents are in fact duplicates of each other: a different file name and different metadata values are not enough to make them count as different items. And by default, enterprise search via the SearchCoreResults webpart does not include duplicates in its results. The tip is therefore: to properly test enterprise search in your application, make sure to upload documents that really differ (that is, differ in their document content) to the crawled content source.
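If you need a quick batch of test documents that really differ in content, a small script can generate them. The following is only a minimal sketch in Python: the function names, file names and text layout are my own choices for illustration, and the SHA-256 digest is just a stand-in to check that the file contents differ. It is not the duplicate detection that the search crawler itself applies, which may well be more tolerant of small differences.

```python
import hashlib
import uuid
from pathlib import Path


def write_distinct_documents(target_dir, count):
    """Write `count` text files whose body text differs from file to file."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        # Each paragraph carries a random token, so the bodies differ
        # substantially and not only in file name or metadata.
        body = "\n".join(
            f"Test document {i}, paragraph {p}: {uuid.uuid4().hex}"
            for p in range(20)
        )
        path = target / f"search-test-{i:02d}.txt"
        path.write_text(body, encoding="utf-8")
        paths.append(path)
    return paths


def content_digest(path):
    """Hash of the file content only; the file name plays no role."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()
```

Calling `write_distinct_documents("search-test-docs", 10)` gives ten files to upload to the crawled document library. A renamed copy of one of them yields the same `content_digest` as the original, which mirrors the situation I ran into: a new name, but the same document.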

