Solr integration with Nutch

Aug 20, 2009 in nutch, Search, solr

“requestHandler” notes for the solrconfig.xml file:

— The fields to highlight are listed in hl.fl:

<str name="hl.fl">text features name</str>

— Per-field highlighting settings use the f.<field>.hl.<parameter> form:

<str name="f.name.hl.alternateField">name</str>
<str name="f.name.hl.fragsize">0</str>
<str name="f.text.hl.fragmenter">regex</str>
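
For context, these entries normally sit inside the defaults list of a requestHandler in solrconfig.xml. The sketch below is modelled on the stock example configuration; the handler name "dismax" and the defType line are assumptions here, so compare against your own file before copying anything.

```xml
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Assumed handler setup; only the hl.* lines come from the notes above -->
    <str name="defType">dismax</str>
    <!-- Fields to highlight; used only when the request sets hl=true -->
    <str name="hl.fl">text features name</str>
    <!-- If no query term matches in "name", return the field itself -->
    <str name="f.name.hl.alternateField">name</str>
    <!-- fragsize 0: do not fragment, highlight the whole field value -->
    <str name="f.name.hl.fragsize">0</str>
    <!-- Use the fragmenter registered under the name "regex" for "text" -->
    <str name="f.text.hl.fragmenter">regex</str>
  </lst>
</requestHandler>
```

Highlighting stays off unless the request passes hl=true or the handler sets it as a default.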

— The alternate ‘nutch’ configuration is:

(See http://www.lucidimagination.com/blog/2009/03/09/nutch-solr/)

— Fields:

<str name="hl.fl">title url content</str>

— Field values:

<str name="f.content.hl.fragmenter">regex</str>
<str name="f.title.hl.alternateField">title</str>
<str name="f.title.hl.fragsize">0</str>
<str name="f.url.hl.alternateField">url</str>
<str name="f.url.hl.fragsize">0</str>

— To map a parser to a file type:

— Map the file's MIME type to a parser plugin in conf/parse-plugins.xml.

— If the MIME type is new, define it in conf/mime-types.xml.
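
The two steps above can be sketched as follows. This is a rough illustration only: the MIME type application/x-example, the extension "example", the plugin id parse-example and the class org.example.ExampleParser are all hypothetical, and the element layout is assumed to follow the stock files shipped with Nutch releases of this period, so check it against the files in your own conf directory.

```xml
<!-- conf/parse-plugins.xml: route the MIME type to a parser plugin -->
<parse-plugins>
  <mimeType name="application/x-example">
    <!-- hypothetical plugin id -->
    <plugin id="parse-example" />
  </mimeType>
  <aliases>
    <!-- hypothetical extension class for the plugin id above -->
    <alias name="parse-example" extension-id="org.example.ExampleParser" />
  </aliases>
</parse-plugins>
```

```xml
<!-- conf/mime-types.xml: declare the new MIME type and its file extension -->
<mime-types>
  <mime-type name="application/x-example" description="Hypothetical example format">
    <ext>example</ext>
  </mime-type>
</mime-types>
```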