Solr facet distinct count

With Solr, you sometimes want to know the distinct facet count, e.g. to be able to paginate over the facet results.

The distinct count of the facets is available with Solr 4.0 (as of writing).

The Solr patch SOLR-2242 helps here by providing the number of facet terms.
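To illustrate why the distinct count matters for pagination, here is a minimal sketch in plain Java. It assumes the count is already known, and it pages through the facet values with Solr's facet.limit and facet.offset parameters. The class FacetPaging, its method names and the field name "category" are hypothetical and not part of Solr or SolrJ.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper (not part of SolrJ): paging arithmetic for facet values.
final class FacetPaging {

    // Number of pages needed to list distinctCount facet values, rounded up.
    static int pageCount(long distinctCount, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (int) ((distinctCount + pageSize - 1) / pageSize);
    }

    // Query-string fragment that requests one zero-based page of facet values.
    static String pageParams(String field, int page, int pageSize) {
        try {
            return "facet=true"
                    + "&facet.field=" + URLEncoder.encode(field, "UTF-8")
                    + "&facet.limit=" + pageSize
                    + "&facet.offset=" + (page * pageSize);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(pageCount(95, 10));             // 10
        System.out.println(pageParams("category", 2, 10)); // third page of facet values
    }
}
```

Without the distinct count there is no way to compute pageCount, so the UI cannot tell how many facet pages exist.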

Solr – Tomcat UTF-8 Encoding

To support Unicode characters with Solr and Tomcat, you need an additional setting on the Connector element in Tomcat's server.xml:



<Connector ... URIEncoding="UTF-8"/>
...
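A small sketch of what goes wrong without this setting, assuming a Tomcat release whose connector decodes URIs as ISO-8859-1 by default (as older releases do). It uses only the JDK's URLEncoder and URLDecoder; the class name UriEncodingDemo and the sample word are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo: the same percent-encoded bytes decoded with two charsets.
final class UriEncodingDemo {

    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String query = "caf\u00e9";                 // "cafe" with an acute accent
        String onTheWire = encode(query, "UTF-8");  // caf%C3%A9
        System.out.println(onTheWire);
        // Decoding with the charset the client used gives the query back.
        System.out.println(decode(onTheWire, "UTF-8").equals(query));
        // Decoding as ISO-8859-1 turns the two UTF-8 bytes into two characters.
        System.out.println(decode(onTheWire, "ISO-8859-1").equals(query));
    }
}
```

The search term reaches Solr intact only when the connector decodes with the same charset the client used to encode it.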



Solr – Clean up Index


Sometimes you want to delete all the records from the Solr index without deleting the index directory.
This can be done by requesting an HTTP URL:
http://host:port/solr/core/update?stream.body=<delete><query>*:*</query></delete>&commit=true
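When the URL is built in code rather than pasted into a browser, the delete command <delete><query>*:*</query></delete> passed in stream.body has to be percent-encoded. A minimal sketch with the JDK's URLEncoder follows; the class name DeleteAllUrl and the localhost:8983 address are assumptions for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the delete-all update URL for a given core URL.
final class DeleteAllUrl {

    // Solr's XML update message that deletes every document.
    static final String DELETE_ALL = "<delete><query>*:*</query></delete>";

    static String build(String coreUrl) {
        try {
            return coreUrl + "/update?stream.body="
                    + URLEncoder.encode(DELETE_ALL, "UTF-8")
                    + "&commit=true";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Assumed address; replace with your own host, port and core.
        System.out.println(build("http://localhost:8983/solr/core"));
    }
}
```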



Or by posting XML data:

<delete><query>*:*</query></delete>
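Posting the XML can be sketched with nothing but the JDK's HttpURLConnection. This is only an outline under assumptions: the message posted is Solr's delete-all command <delete><query>*:*</query></delete>, the update URL is supplied by the caller (with commit=true appended if the delete should become visible at once), and the class name DeleteAllPost is hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: POSTs the delete-all message to a Solr update handler.
final class DeleteAllPost {

    static final String DELETE_ALL_XML = "<delete><query>*:*</query></delete>";

    // Request body as UTF-8 bytes.
    static byte[] body() {
        return DELETE_ALL_XML.getBytes(StandardCharsets.UTF_8);
    }

    // Sends the body and returns the HTTP status code reported by the server.
    static int post(String updateUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(updateUrl).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body());
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A call would look like DeleteAllPost.post("http://host:port/solr/core/update?commit=true"), with host, port and core replaced by real values.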

Cleaning the index using SolrJ:

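import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

SolrServer server = null;
try {
    server = new CommonsHttpSolrServer(masterIndexUrl);
    // Delete every document, then commit so the change becomes visible
    server.deleteByQuery("*:*");
    server.commit(true, true);   // waitFlush, waitSearcher
    server.optimize(true, true); // merge segments after the mass delete
} catch (Exception e) {
    // Undo uncommitted changes; server is still null if the constructor failed
    if (server != null) {
        try {
            server.rollback();
        } catch (Exception e1) {
            // Rollback failed as well; nothing more can be done here
        }
    }
}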

Solr Sort feature

Users often want to sort on fields such as the document title and do not get the expected results.

A few key things to take into account when using fields for sorting in Solr:

  • Sorting does not work well on multivalued or tokenized fields. (multiValued="false")
  • The field should be marked as indexed to enable sorting. (indexed="true")

From the Solr documentation:
Sorting can be done on the "score" of the document, or on any multiValued="false" indexed="true" field provided that field is either non-tokenized (ie: has no Analyzer) or uses an Analyzer that only produces a single Term (ie: uses the KeywordTokenizer)
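The single-term requirement can be pictured with a toy model in plain Java. This is not Solr's or Lucene's analysis code: keywordTerms and whitespaceTerms are hypothetical stand-ins for a keyword-style analyzer and a whitespace tokenizer, and the title is invented.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model (not Lucene): how many terms a field value produces.
final class SortKeyDemo {

    // Stand-in for a keyword-style analyzer: the whole value is one term.
    static List<String> keywordTerms(String value) {
        return Collections.singletonList(value);
    }

    // Stand-in for a whitespace tokenizer: one term per word.
    static List<String> whitespaceTerms(String value) {
        return Arrays.asList(value.trim().split("\\s+"));
    }

    // In this model a value is usable as a sort key only if it yields one term.
    static boolean sortable(List<String> terms) {
        return terms.size() == 1;
    }

    public static void main(String[] args) {
        String title = "Zen and the Art";
        System.out.println(sortable(keywordTerms(title)));    // true
        System.out.println(sortable(whitespaceTerms(title))); // false
    }
}
```

With four terms for one title there is no single value to order the document by, which is why a separate untokenized field is usually used for sorting.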







Chaining Solr copyField

Solr does not allow chaining of copyField directives, and copying does not recurse.


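e.g. with hypothetical fields a, b and c: if a is copied to b and b is copied to c, then c receives only the values sent directly to b, never the values copied from a.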

   









The Solr documentation at http://wiki.apache.org/solr/SchemaXml#Copy_Fields says:

The copy is done at the stream source level and no copy feeds into another copy.


So the destination of one copyField cannot act as the source of another copyField.
The copyField source must be an actual field that receives a value directly; copies do not cascade.
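The behaviour can be mimicked with a toy model in plain Java. This is not Solr's implementation, only a sketch of the rule that every copy reads from the incoming document and never from the result of another copy. The class CopyFieldModel and the field names a, b and c are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model (not Solr code): copy rules applied without chaining.
final class CopyFieldModel {

    // Each rule is a {source, dest} pair. Values are read from the input only.
    static Map<String, List<String>> applyCopies(Map<String, List<String>> input,
                                                 String[][] rules) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : input.entrySet()) {
            out.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        for (String[] rule : rules) {
            List<String> sourceValues = input.get(rule[0]); // never looks at 'out'
            if (sourceValues == null) {
                continue; // source was not sent with the document: nothing to copy
            }
            out.computeIfAbsent(rule[1], k -> new ArrayList<>()).addAll(sourceValues);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc =
                Collections.singletonMap("a", Collections.singletonList("x"));
        String[][] rules = { {"a", "b"}, {"b", "c"} };
        // b gets the value of a, but c stays absent because b was not in the input.
        System.out.println(applyCopies(doc, rules));
    }
}
```

To populate c from a in this model, a third rule copying a directly to c is needed, which mirrors the usual workaround of declaring a separate copyField from the original source.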