SHAREPOINTSearch.com

Looking to search custom metadata in PDFs

Last post 08-02-2007 7:30 AM by Bruce28277. 2 replies.
  • 07-31-2007 9:44 PM

    Looking to search custom metadata in PDFs

    I have PDFs with custom metadata on a network file share, and I have MOSS set up to crawl the folder. How can I enable users to search on the custom metadata embedded in each PDF?

  • 08-01-2007 11:44 PM In reply to

    Re: Looking to search custom metadata in PDFs

    You will need an IFilter that can extract the XMP metadata that you created. ifiltershop has one: http://www.ifiltershop.com/pdfplusfilter.html.

    I am rambling here, sorry, but I think a PDF brain dump is necessary. The default Adobe IFilter doesn't extract XMP data such as Dublin Core. You can trick the hidden IFilter that gets installed with the latest Adobe Reader into acting as the SharePoint IFilter, but it isn't that stable. Foxit Software (listed on our site) has a new PDF IFilter, but there is nothing in their docs about supporting XMP metadata. Also, the Adobe IFilter is single-instance, which means only one PDF file can be filtered at a time, killing crawl performance. So go with the ifiltershop PDF filter, and if performance is an issue, ask them to release an XMP wrapper version that works with the Foxit one.

    Once you have crawled a few documents, stop the crawl. Go into managed properties and create new ones, or map existing ones, for your newly crawled metadata. Then run a full crawl.

    Now that you have the metadata in the search index, you can customize the advanced search web part to include it. Here are the steps:

    1. Go to the advanced search form and edit the web part.

    2. Edit the Properties XML string and add your new managed properties, like <PropertyRef Name="MYManagedProp"/>. NOTE: the name is case sensitive.

    You should now be able to search on your fields.
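
    To make step 2 concrete, here is a rough Python sketch of that edit done on a copy of the Properties XML string. The sample XML is a trimmed-down stand-in whose layout (a PropertyDefs section plus ResultType elements holding PropertyRef entries) is an assumption, so compare it with what your web part actually contains. The helper name add_managed_property, the display name, and the idea of adding a matching PropertyDef alongside the PropertyRef are illustrative and not taken from the steps above.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the web part's Properties XML (assumed layout).
SAMPLE_PROPERTIES_XML = """<root>
  <PropertyDefs>
    <PropertyDef Name="Author" DataType="text" DisplayName="Author"/>
  </PropertyDefs>
  <ResultTypes>
    <ResultType DisplayName="All Results" Name="default">
      <Query/>
      <PropertyRef Name="Author"/>
    </ResultType>
  </ResultTypes>
</root>"""


def add_managed_property(properties_xml, name, display_name, data_type="text"):
    """Return the XML with `name` added as a PropertyDef and as a PropertyRef.

    Hypothetical helper. `name` must match the managed property exactly,
    including case.
    """
    root = ET.fromstring(properties_xml)

    # Declare the property once, if a PropertyDefs section exists.
    defs = root.find("PropertyDefs")
    if defs is not None:
        known = {d.get("Name") for d in defs.findall("PropertyDef")}
        if name not in known:
            ET.SubElement(defs, "PropertyDef", {
                "Name": name,
                "DataType": data_type,
                "DisplayName": display_name,
            })

    # Reference the property from every result type that lacks it.
    for result_type in root.iter("ResultType"):
        refs = {r.get("Name") for r in result_type.findall("PropertyRef")}
        if name not in refs:
            ET.SubElement(result_type, "PropertyRef", {"Name": name})

    return ET.tostring(root, encoding="unicode")


updated = add_managed_property(
    SAMPLE_PROPERTIES_XML, "MYManagedProp", "My Managed Prop"
)
print(updated)
```

    Paste the printed XML back into the web part's properties in place of the original string. Running the helper a second time on its own output adds nothing new, so it is safe to re-run while you experiment.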

    I would recommend using Ontolica if you want more capabilities in customizing the interface. You can create lookups and picklists fairly easily.

    Christopher Even
    SHAREPOINTSearch.com founder
  • 08-02-2007 7:30 AM In reply to

    Re: Looking to search custom metadata in PDFs

    Thanks for the excellent reply, no rambling at all. Your description of the resolution is spot on. After doing some additional searching and emailing a few contacts, I was able to find out the following:

    According to Foxit Tech support, their iFilter is currently not capable of emitting custom metadata. I suspect that Adobe's iFilter does not support this feature either; however I have not contacted them to find out.

    There is an iFilter, PDF+ iFilter, from ifiltershop.com that DOES support emitting custom XMP data. They have directions in their help/readme file to make it work.

    "In order to set up PDF+ IFilter to index custom XMP schema you will need to add information about this schema to the IFilter registry settings as described in "Support for custom XMP schemas" section of README file. "

    The directions in that section were not as clear as I would like, but no fear: they were willing to take a sample PDF from me and create the required REG file. I am still waiting to get that back to test, but it looks promising.

    Once the PDF+ iFilter is properly configured and I have forced another full crawl, I should be able to map to the newly exposed metadata items and proceed from there.

    I'm going to post a follow-up once the iFilter has been configured.

SHAREPOINTSearch.com is not affiliated with or endorsed by the Microsoft Corporation. See our Terms, Conditions and Privacy Statements
SharePoint is a trademark of the Microsoft Corporation.