SharePoint Search Security explained with focus on the BDC and protocol handlers.
This is a repost of an original article I wrote for the SharePoint search site:
Business Data Catalog (BDC) – The BDC is a generic set of components provided with SharePoint Portal that makes it easier to integrate external LOB (line-of-business) systems, such as a CRM solution, into SharePoint. The BDC provides several configurable Web Parts for listing and displaying the data, a search indexer (protocol handler), and a search query security trimmer. The BDC works from XML definition files that define the connections and entities in the LOB system. For example, say I want to integrate the standard MS Adventure Works database into SP. I would write an XML file that includes the following, overly simplified, information:
- Connection information – server, db, username, password
- Entity Definition
- Product
- · Finder Method (returns list of instances) - SELECT ProductID, Name, ProductNumber, ListPrice FROM Product
- · Specific Finder Method (returns exactly one instance) - SELECT ProductID, Name, ProductNumber, ListPrice FROM Product WHERE ProductID = @ProductID
- · IDEnumerator Method (enumerator for search crawler – same as finder in this case) - SELECT ProductID, Name, ProductNumber, ListPrice FROM Product
- · Access Checker Method (for query time security) - SELECT CAST(Rights as bigint) FROM Customers WHERE CustomerId = @CustomerId and UserName = @currentuser;
From the above information (and other information not shown), SP can display products in a list Web Part using the Finder method and display a selected product in a form using the Specific Finder method. It uses the IDEnumerator and Specific Finder methods to crawl the products and add them to the search index. At query time, SP takes the returned search results and applies the Access Checker method to each item to see whether I have permission to access it. NOTE: the BDC does not support indexing unstructured data such as files in a document management system.
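To make the division of labour between the four methods concrete, here is a rough Python sketch. It is not BDC code and not the XML definition format: it uses an in-memory SQLite database as a stand-in for the LOB system, the sample rows are invented, and the ProductRights table and all function names are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical stand-in for the LOB database. Column names follow the
# simplified queries above; the rows and the ProductRights table are invented.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Product (ProductID INTEGER PRIMARY KEY, Name TEXT,
                          ProductNumber TEXT, ListPrice REAL);
    INSERT INTO Product VALUES (1, 'Road Bike', 'BK-R01', 1200.0),
                               (2, 'Helmet', 'HL-U02', 35.0);
    CREATE TABLE ProductRights (ProductID INTEGER, UserName TEXT, Rights INTEGER);
    INSERT INTO ProductRights VALUES (1, 'alice', 1), (2, 'alice', 1), (2, 'bob', 1);
""")

COLUMNS = "ProductID, Name, ProductNumber, ListPrice"

def finder():
    """Finder: returns a list of instances (feeds a list Web Part)."""
    return db.execute(f"SELECT {COLUMNS} FROM Product ORDER BY ProductID").fetchall()

def specific_finder(product_id):
    """Specific Finder: returns exactly one instance (feeds a detail form)."""
    return db.execute(f"SELECT {COLUMNS} FROM Product WHERE ProductID = ?",
                      (product_id,)).fetchone()

def id_enumerator():
    """IDEnumerator: hands the crawler the ids it should visit."""
    return [row[0] for row in db.execute("SELECT ProductID FROM Product ORDER BY ProductID")]

def access_checker(product_id, user):
    """Access Checker: asks the LOB system, at query time, about one item."""
    row = db.execute("SELECT Rights FROM ProductRights WHERE ProductID = ? AND UserName = ?",
                     (product_id, user)).fetchone()
    return row is not None and row[0] != 0

def crawl():
    """Crawl: enumerate the ids, then fetch each instance into the index."""
    return {pid: specific_finder(pid) for pid in id_enumerator()}

def search(index, term, user):
    """Query: match against the index first, then trim each hit individually."""
    hits = [pid for pid, row in index.items() if term.lower() in row[1].lower()]
    return [index[pid] for pid in hits if access_checker(pid, user)]

index = crawl()
print(search(index, "e", "bob"))
```

Note that the index is built without any knowledge of who may see what; all of the security work happens in `search`, one database query per hit.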
Protocol Handlers – Protocol handlers are used by the SP indexer to connect to external and internal systems, crawl their data, and add it to the query index. Custom protocol handlers implement the standard ISearchProtocol COM interface and are created by third parties such as Hummingbird and Interwoven so that SP can index their proprietary data systems. Microsoft's internal protocol handlers (SharePoint, Lotus Notes, BDC, File System, HTTP, HTTPS) DO NOT implement the standard ISearchProtocol COM interface and are proprietary.
Security Trimmers – A new addition to the SharePoint Search system (added in the final release in November), the ISecurityTrimmer interface is at this time used primarily to apply security to BDC search results. Security Trimmers are .NET DLLs that are registered and associated with specific data in the search index. The SP query engine uses them to check the access rights of items right before they are returned to the user, and only for items that have a registered Security Trimmer. Because the interface is new, only a few trimmers exist, including the one for BDC data, but many companies like Interwoven may be exploring them as a means of ensuring real-time security on search results.
Discussion:
From the above definitions you should be able to get an idea of how the BDC works and what Protocol Handlers and Security Trimmers are, but a more detailed comparative discussion is definitely warranted.
The SP Search system implements two forms of security by which search results are trimmed:
The first form of security is standard ACLs (access control lists). This is the most familiar form, as it is how the Windows file system determines whether you have access to a document. During the crawl, the ACLs of the items being added to the index are determined and stored along with each item. The query engine then uses these ACLs, stored in the search database, to decide quickly whether a user should be allowed to see an item in the results. This security method has been the standard in the MS Search products for a while and is very fast. When connecting a new system (like Documentum), a custom protocol handler would be created that knows how to map Documentum security to standard Windows ACLs. For instance, if a user in Documentum is allowed to access a particular document, the protocol handler needs to map that user ID to a valid Active Directory user, create a read-privilege ACL entry for that user, and add it to the item's ACL. Every user and group would need to be mapped and added in this way for each item to ensure proper security. Note: This security model has a flaw: if the security changes in the source system, the change is not picked up until the next incremental crawl, which may not happen for hours or days. Also, the BDC (Business Data Catalog) does NOT support ACL-based security, and prior to the addition of the next form of security, the real-time Security Trimmer, the BDC had no security for its search results.
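The crawl-time ACL model and its staleness problem can be illustrated with a small Python sketch. This is a toy model, not the SharePoint index or a real protocol handler: the `dctm://` URLs, the user-mapping table, and every function name here are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical mapping from source-system user ids to Active Directory accounts.
USER_MAP = {"dctm_jsmith": "CORP\\jsmith", "dctm_mlee": "CORP\\mlee"}

@dataclass(frozen=True)
class IndexedItem:
    url: str
    title: str
    acl: frozenset  # AD accounts with read access, captured at crawl time

def crawl(source_items):
    """Protocol-handler role: map source permissions to AD accounts and
    store the resulting ACL in the index next to each item."""
    index = []
    for item in source_items:
        acl = frozenset(USER_MAP[u] for u in item["readers"] if u in USER_MAP)
        index.append(IndexedItem(item["url"], item["title"], acl))
    return index

def query(index, term, user):
    """Query-engine role: trim using only the stored ACLs. This is fast
    because the source system is never contacted at query time."""
    return [i.url for i in index
            if term.lower() in i.title.lower() and user in i.acl]

source = [
    {"url": "dctm://docs/1", "title": "Budget plan", "readers": ["dctm_jsmith", "dctm_mlee"]},
    {"url": "dctm://docs/2", "title": "Budget review", "readers": ["dctm_jsmith"]},
]

index = crawl(source)
before = query(index, "budget", "CORP\\mlee")

# Access is revoked in the source system, but the index is not re-crawled yet.
source[0]["readers"].remove("dctm_mlee")
stale = query(index, "budget", "CORP\\mlee")

# Only the next crawl brings the stored ACLs back in line with the source.
index = crawl(source)
after = query(index, "budget", "CORP\\mlee")

print(before, stale, after)
```

Between the revocation and the re-crawl, `stale` still contains the document, which is exactly the window the Note above describes.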
The second form of security is called real-time Security Trimming and is completely separate from the ACL-based security above. It can be applied in conjunction with ACL security as an added check, ensuring that security changes made since the last crawl are honored, or, as in the case of the BDC, it can be the primary and only means of security. After the search results for a query are compiled, each item is compared against a set of rules to determine whether it has a Security Trimmer registered. The items that do (such as BDC items) are grouped into an array and passed to their respective trimmers. Each Security Trimmer validates the security of its items and returns an array with a simple true or false for each item, indicating whether it is allowed. Depending on how a Security Trimmer is written, it can become a performance bottleneck: the BDC trimmer validates each and every item individually, which could mean hundreds of database queries. Because search results can combine items from multiple sources, more than one Security Trimmer may be involved in a single query.
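The grouping and trimming flow can be sketched as follows. This is illustrative Python, not the .NET interface: the URL-prefix rule table, the permission set, and all names are assumptions, and the `lookups` list exists only to make the per-item cost visible.

```python
# Hypothetical permission store standing in for the LOB system.
ALLOWED = {("bdc://products/1", "alice"), ("bdc://products/2", "alice"),
           ("bdc://products/2", "bob")}

lookups = []  # records every simulated back-end check

def per_item_trimmer(urls, user):
    """Checks every item individually (one simulated query each) and
    returns one True/False per item, in the same order as the input."""
    verdicts = []
    for url in urls:
        lookups.append(url)
        verdicts.append((url, user) in ALLOWED)
    return verdicts

# Rule table: results whose URL starts with the prefix go to that trimmer.
TRIMMERS = {"bdc://": per_item_trimmer}

def trim_results(results, user):
    """Group results by registered trimmer, ask each trimmer once about its
    whole group, and drop the items it rejects. Items with no registered
    trimmer pass through untouched."""
    groups = {}
    for url in results:
        for prefix in TRIMMERS:
            if url.startswith(prefix):
                groups.setdefault(prefix, []).append(url)
                break
    denied = set()
    for prefix, urls in groups.items():
        verdicts = TRIMMERS[prefix](urls, user)
        denied.update(u for u, ok in zip(urls, verdicts) if not ok)
    return [u for u in results if u not in denied]

results = ["bdc://products/1", "http://intranet/page", "bdc://products/2"]
print(trim_results(results, "bob"), len(lookups))
```

Even though the trimmer receives its items as one batch, this particular trimmer still performs one lookup per item, which is where the cost adds up on large result sets. A trimmer that resolved the whole batch in a single back-end call would avoid that.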