Tuesday, December 29, 2009

Peculiarities for handling space-character in field names for query and search: ManagedProperty name vs Field / ColumnName

For better human readability, and thus understanding by the end users, it is often desired to use spaces in the DisplayName of SiteColumns and ContentTypes FieldRefs: "The columnname". But beware, this gives some peculiarities with the different approaches to query and display the SharePoint content...

Enterprise Search

Upon the first crawl of the content source, the column ends up in a CrawledProperty with the name "The_x0020_columnname". This can be mapped to a ManagedProperty. The name of this property may not include any spaces; here it could be "TheColumnName". With Enterprise Search, one can now query and select on the property 'TheColumnName'. For instance, query via the AdvancedSearchBox webpart, and display the results via the SearchCoreResults webpart. This is rather handy: both the XML-based query (SQL-CAML) and the rendering of the query result (XSLT) can skip the special handling of the space character.

CAML-based site content query

When querying and displaying site data via the ContentQueryWebPart, one must select and display on the actual field / column name. That is, in the CommonViewFields specification, the actual column name must be used. Due to the space in the name, CAML space-mapping must be applied, resulting in 'The_x0020_columnname'. Example:
<property name="CommonViewFields" type="string">
Title,String;The_x0020_columnname,Choice;Editor,User
</property>
This is all pretty well-known SharePoint stuff.
Something that is less well known is that yet another translation must be applied to successfully display the CQWP query results via XSLT rendering. The CAML space sequence '_x0020_' must itself be translated into '_x005F_x0020_', because the underscore that starts the sequence is in turn escaped as '_x005F_'. So for instance:
<td class="ms-vb" nowrap="">
    <xsl:value-of select="@The_x005F_x0020_columnname"/>
</td>

Thus...

To summarize: when you put a space in the displayname of a field, this may wind up in 4 different 'names' within the query/display pipeline:
  1. The actual displayname itself: "The columnname"
  2. The managed property name for within enterprise search: "TheColumnName"
  3. The query fieldspecification within CQWP: "The_x0020_columnname"
  4. The query result rendering used by CQWP: "The_x005F_x0020_columnname"
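
The two encoded names in items 3 and 4 can be derived mechanically from the display name. The following is a minimal sketch in JavaScript, assuming that only the space character needs encoding (as in this post); the function names toCamlFieldName and toXsltAttributeName are made up for illustration and are not part of any SharePoint API. The managed property name of item 2 is not derived; it is whatever name you chose when creating the mapping.

```javascript
// Sketch only: covers the space character, not the other characters
// that SharePoint may encode in a field name.

// "The columnname" -> "The_x0020_columnname" (name used in CommonViewFields)
function toCamlFieldName(displayName) {
  return displayName.replace(/ /g, "_x0020_");
}

// "The_x0020_columnname" -> "The_x005F_x0020_columnname" (attribute name in the XSLT)
// The underscore that starts an _xHHHH_ escape sequence is itself escaped as _x005F_.
function toXsltAttributeName(camlFieldName) {
  return camlFieldName.replace(/_(?=x[0-9A-Fa-f]{4}_)/g, "_x005F_");
}

const camlName = toCamlFieldName("The columnname");
const xsltName = toXsltAttributeName(camlName);
console.log(camlName); // The_x0020_columnname
console.log(xsltName); // The_x005F_x0020_columnname
```

A name without spaces, such as Title, passes through both functions unchanged.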

Monday, December 28, 2009

Tip: difference in metadata only does not qualify document items as different

In my last post I described the approach to set up your managed properties collection as part of initializing the enterprise search experience. To test the outcome of this initialization, I next uploaded a couple of documents, and arbitrarily filled in some of the metadata fields. Actually, I uploaded the same dummy / test document multiple times, each time renaming it. To my surprise, enterprise search then consistently returned only 1 of the uploaded documents. The cause is that the search crawl detected that the renamed documents are actually the same, i.e. duplicates. And by default, enterprise search via the SearchCoreResults webpart does not include duplicates. The tip is therefore: to properly test the enterprise search in your application, make sure to upload different documents (that is, with differences in the document content) to the crawled content source.

Monday, December 21, 2009

Automated approach for initializing Enterprise Search experience

Whenever you want to utilize the Enterprise Search functionality in your application, you must take care of its correct initialization: set up a content source, administer its crawl schedule, create the searchable managed properties, and set up a Search Scope. Although the different steps can be done manually via the SharePoint GUI (a combination of Central Admin and your own application), this is less workable within the context of an ALM-based project. Your application is then (re)deployed multiple times, and to different environments (development, test, staging, production). Each time the manual installation/initialization steps would need to be repeated. This is cumbersome, and thus error-prone. A better approach (as always) is to strive for a fully automated initialization of the enterprise search. I’ve applied this several times via the approach outlined here:
  • Create a new Feature, with a FeatureReceiver code-behind.
  • In the activation method of the feature, do the following steps:

    Create the content source

  1. Use the Search Object Model to create the content source.
  2. If applicable, administer include and exclude rules.
  3. Create the crawl schedules: full and incremental.

    Create managed properties

    Important to realize here is that a managed property can only be created if the crawled property to map it to is available.

  4. Make sure the crawled content source contains at least one item, by either adding a dummy list item (for regular Lists), or uploading a dummy document (for a document library).
  5. Use the content type definition(s) to determine the fields of the searchable content, and assign a non-empty value to each field.
  6. Initiate a full crawl, in order to let the crawler create crawled properties for each of the content type fields.
  7. After the full crawl, loop through the collection of determined content type fields, and for each field create a Managed Property of the proper type and map it to the automatically created crawled property.
  8. Remove the dummy content of step 4.

    Create the Search Scope

  9. Use the Search Object Model API to create a Search Scope.
  • In the deactivation method of the feature, reverse the actions of the feature activation event as far as is proper.
What is proper depends on the situation / application. Normally, you would implement in the feature deactivation a full restore to the status before the feature activation. Here that means removal of the managed properties, crawled properties, search scope and content source. However, when you delete the content source, you typically undo more than strictly the feature activation. In a production situation, the content source has been crawled again and again, building up the index administration. Upon deletion of the content source, you also lose all this hard work of content crawling and indexing. Feature deactivation would thus not only undo the feature activation itself, but also the work done later. Whether this is appropriate depends on the application and content specifications. All content can be recrawled. However, for a large and complex set (documents, PDFs, TIFF files, LOB data via BDC…) this can be time-consuming, and during the required full crawl the application's search cannot return complete results.

Saturday, December 19, 2009

WS-I members completed Basic Security Profile 1.1

This week, SAP, Microsoft, IBM, Intel and Layer 7 completed the WS-I Basic Security Profile 1.1 by successfully demonstrating the interoperability between their platforms based on the profile implementations of at least 4 WS-I member companies. Check out this SAP Community Network blog for more background information.
Example of interoperability test results

Sunday, December 13, 2009

SAP Influencer Summit '09 exhibits evidence of the Microsoft connection

The above blog reports on the SAP Influencer Summit held last week in Boston. At this summit, the mid-term future directions of SAP as an IT and solutions company were presented to an audience of 275 analysts and IT influencers.
Noteworthy in the context of SAP / Microsoft interoperability are the following observations:
  • Silverlight was formally named as the user interface surface of choice over Adobe’s Flex, and Sharepoint & Office interoperability is clearly seen as the path forward over that horizon. The dev environment is similarly on the .net side of the ocean.
  • Excel and Crystal Reports (and SAP’s Xcelsius) are similarly foundational components to analytics reporting and dashboarding.
Makes me wonder: is SAP finally making a renewed stand on the integration and interoperability of the SAP and Microsoft platforms + products? The fact that SAP has so far given effectively no noise or attention to DUET Enterprise, recently announced by Microsoft, is at the least confusing in light of the messages given at the summit. I guess we'll still have to wait and see whether and in which direction(s) the 2 companies will interoperate and partner.

Saturday, December 12, 2009

Tip: use Lookup instead of Choice field for (semi)fixed set of defined values

Often, your SharePoint information architecture identifies some fields with a defined set of allowed values. A common approach is then to apply the Choice field type for these. For a truly fixed set of possible values this is a very sensible approach. For instance, for the data type sex we'll have 'male', 'female', and, well, 'unknown' in case of doubt. However, more often than not the set is only (semi)fixed: departments within a company, the customer base of an IT consultancy, etc. In such situations, Choice is neither flexible nor user-friendly towards the functional managers of the application. Each modification requires a change to the definition of the Choice-based field. And one must then not forget to propagate this change to the lists on which the field / site column has been applied (directly, or via a content type). All this is more technical work than SharePoint functional management. A better approach is to use the Lookup field type, and refer to another list that holds the currently known set of defined values (master data, actually). Whenever this set needs to change, functional management only has to adjust the master data list. Actually, this is just sane data model normalisation. As usual in such a context, besides defining the set of allowed values it is also possible to augment them with more details: for instance, the department name (the allowed value), and in another list column more details about the department.
Something you'll have to take into account when applying this data modelling is a peculiarity upon provisioning the SiteColumns. A Lookup SiteColumn is administered in the SharePoint content database with the Guid of the SharePoint list it refers to. This results in a problem when provisioning the SiteColumns the standard SharePoint way, via a Feature with the field specifications in XML. The ID of the referenced list is typically unknown at coding/specification time, and will differ per provisioned environment. This is a known issue, and so there is also a known resolution. I'm not going to describe it here, or even try to take the credit for it. Frankly, I've used a blog entry by Chris O'Brien, Creating Lookup columns as a feature, as the starting point for solving this provisioning-order issue.

Tuesday, December 1, 2009

Cosmetic extension / modification of AdvancedSearchBox UI

A customer required several modifications with respect to the screen behaviour of the standard MOSS AdvancedSearchBox control:
  1. Hide the resulttype picker
  2. Initially display 3 property-filter rows, instead of the default 1
  3. Automatically pre-select in these 3 rows a specific property to filter on
  4. For properties with a limited and fixed set of allowed values, let the user select from these via a dropdown control
The changes are all cosmetic; the customer is satisfied with the query construction + evaluation of the AdvancedSearchBox.
My first idea was to realize these cosmetic changes by inheriting from AdvancedSearchBox, and overriding the rendering of the control. However, this approach is not possible because Microsoft made the class sealed. Effectively this rules out addressing the customer requirements via a server-side solution. I briefly considered re-engineering the AdvancedSearchBox in a dedicated webpart of my own. However, that is not advisable: it's no small thing to deliver the total of the AdvancedSearchBox functionality, and to make sure it's tested and robust. And moreover, why should you even build your own control when the SharePoint application platform provides you with this rich control? It only makes you vulnerable to updates of the SharePoint platform: service packs and patches, and next year the upgrade to SharePoint 2010.
So I had to come up with another approach, in which the AdvancedSearchBox is still utilized, but its appearance and UI behaviour are slightly modified. Well, in essence the UI of the control runs within the browser context. Another possible approach is therefore to make the UI modifications on-the-fly client-side, applying JavaScript / jQuery to alter the control via the Document Object Model. The elegance of such an approach is also that it preserves the out-of-the-box SharePoint platform functionality. I'm not allowed to expose the source code. But I can outline the essence:
  • attach your own method to the body.onload event
  • in this method, use client scripting to hide the resulttype row, and initially display up to 3 property-filter rows. Pre-select a specific property in each row
  • at runtime, construct a Select object via the client DOM, and position it next to the standard Input element used for entering the filter value
  • at runtime, replace the onchange callback function of the property Select element, and extend it with logic to display either the standard property-value Input element, or the Select variant in case the actively selected property has a fixed set of values
  • propagate the selection made in the property-value Select element to the standard property-value Input element
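
Since the real source cannot be shown, here is only a rough sketch of the last three bullets for a single property-filter row, in plain JavaScript. Everything named here is an assumption for illustration: the FIXED_VALUES map, the Department property and its values, and the functions controlFor and enhanceRow. How to locate the property picker and the value Input of the AdvancedSearchBox in the page is deliberately left out, as it depends on the rendered markup.

```javascript
// Hypothetical map: property name -> fixed set of allowed values.
const FIXED_VALUES = {
  Department: ["Finance", "HR", "IT"],
};

// Decide which control to show for the actively selected property.
function controlFor(propertyName, fixedValues) {
  return Object.prototype.hasOwnProperty.call(fixedValues, propertyName)
    ? "select"
    : "input";
}

// Wire up one property-filter row.
// doc:            the browser document
// propertySelect: the standard property picker (a select element) of the row
// valueInput:     the standard input element in which the filter value is typed
function enhanceRow(doc, propertySelect, valueInput, fixedValues) {
  // Construct the extra dropdown and position it right after the standard input.
  const valueSelect = doc.createElement("select");
  valueInput.parentNode.insertBefore(valueSelect, valueInput.nextSibling);

  function sync() {
    const property = propertySelect.value;
    const useSelect = controlFor(property, fixedValues) === "select";
    valueSelect.style.display = useSelect ? "" : "none";
    valueInput.style.display = useSelect ? "none" : "";
    if (useSelect) {
      // Refill the dropdown with the allowed values of this property.
      valueSelect.innerHTML = "";
      fixedValues[property].forEach(function (value) {
        const option = doc.createElement("option");
        option.value = value;
        option.text = value;
        valueSelect.appendChild(option);
      });
      // Propagate the initial selection to the standard input.
      valueInput.value = valueSelect.value;
    }
  }

  // Extend, rather than drop, any onchange handler already set on the picker.
  const original = propertySelect.onchange;
  propertySelect.onchange = function (event) {
    if (typeof original === "function") original.call(this, event);
    sync();
  };

  // Propagate each later selection to the standard input.
  valueSelect.onchange = function () {
    valueInput.value = valueSelect.value;
  };

  sync();
}
```

In the page, enhanceRow would be called from the load handler once per displayed row, passing document and the two elements of that row. Because the chosen value is always copied into the standard Input element, the AdvancedSearchBox itself keeps building and evaluating the query as it does out of the box.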
Example of the resulting modified UI + behaviour: