Susan and Allahbaksh work for Infosys Technologies Ltd. in Bangalore, India.
Apache Solr (a name derived from "Searching on Lucene") is an indexing and search server with a web-services-like API. It is an enterprise-ready, Lucene-based search server that supports faceted search, hit highlighting, index replication, and distributed search, and returns search results in multiple output formats (XML/XSLT and JSON). By default, Solr exposes indexing and searching through a Representational State Transfer (REST) style HTTP interface. At the time of writing, the latest version of Solr is 1.4.0.
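To make the HTTP interface concrete, here is a minimal Python sketch that only builds a search URL; it does not contact a server. It assumes the default Jetty port (8983) and the /solr context path used later in this article. The field name "title" is hypothetical and depends on your schema; q, wt, and rows are standard Solr query parameters.

```python
from urllib.parse import urlencode

# Assumption: Solr runs locally on the default Jetty port with the /solr context path.
SOLR_BASE = "http://localhost:8983/solr"


def select_url(query, fmt="json", rows=10, base=SOLR_BASE):
    """Build a URL for Solr's /select search handler.

    q    : the query string (Lucene query syntax)
    wt   : the response writer, e.g. "json" or "xml"
    rows : the maximum number of hits to return
    """
    params = urlencode({"q": query, "wt": fmt, "rows": rows})
    return f"{base}/select?{params}"


# "title" is a hypothetical field name used only for illustration.
print(select_url("title:lucene"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching documents in the requested format once a server is running and documents have been indexed.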
Advantages of Solr
Solr is an open source Apache project that uses Lucene internally and possesses the following properties:
- Solr can replicate an index on multiple servers. Replication is OS independent, so you can replicate the same index on Windows as well as on Linux. Moreover, the indexes can be configured to stay in sync: if any of the indexes is modified, the other copies of that index are updated automatically.
- Solr supports distributed search by spreading the index across several machines and merging their results.
- Indexing and searching can be done simultaneously.
- Solr features an XML-based schema for managing indexed fields.
- Solr supports faceted searching (a sub-search on a result set).
This article covers the basics of using Solr: setup, indexing, and querying. Advanced topics such as faceted search, replication, and distributed search will be covered in a separate article.
Setting up Solr
Download Solr from http://apache.inetbridge.net/lucene/solr/1.4.0/apache-solr-1.4.0.zip. Installing Solr merely requires extracting the zip contents into a folder, which will be referred to by the environment variable SOLR_HOME.
Solr ships with a Jetty servlet container by default. We can start the Jetty server by running start.jar (java -jar start.jar) from the "example" sub-folder of SOLR_HOME at the command prompt.
SOLR_HOME here is D:\apache-solr-1.4.0.
This starts the Jetty servlet container on port 8983 (the default port). We can then access the Solr admin interface at http://localhost:8983/solr/admin. By default, Solr stores its configuration and index data in the "example\solr" folder.
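The startup step can be sketched in Python as follows. This is only an illustration: it prints the command and the admin URL rather than launching anything, and it assumes the usual java -jar start.jar invocation and the SOLR_HOME value used in this article. The helper names jetty_start and admin_url are made up for this sketch.

```python
from pathlib import PureWindowsPath

# SOLR_HOME as used in this article; adjust to your own extraction folder.
SOLR_HOME = PureWindowsPath(r"D:\apache-solr-1.4.0")


def jetty_start(solr_home):
    """Return the working directory and command for the bundled Jetty.

    Assumption: the server is started with `java -jar start.jar`
    from the "example" sub-folder of SOLR_HOME.
    """
    example_dir = solr_home / "example"
    return example_dir, ["java", "-jar", "start.jar"]


def admin_url(host="localhost", port=8983):
    """Return the admin page URL for a Solr instance (8983 is the default port)."""
    return f"http://{host}:{port}/solr/admin"


cwd, cmd = jetty_start(SOLR_HOME)
print(f"cd {cwd} && {' '.join(cmd)}")
print(admin_url())
```

To actually launch the server from a script, the same values could be passed to subprocess.Popen(cmd, cwd=str(cwd)), using a native path type for the machine the script runs on.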
We can also deploy Solr as a web application in any servlet container, such as Jetty, Resin, or Tomcat. Assuming JETTY_HOME is the location where Jetty is installed, the procedure to deploy in Jetty is:
- 1. Copy solr.war from SOLR_HOME/example/webapps to JETTY_HOME/webapps.
- 2. Start the Jetty server with the Java system property -Dsolr.solr.home pointing to the configuration and index location of Solr, as in Figure 1.
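The two deployment steps above might be scripted roughly as follows. This is a sketch under stated assumptions, not the exact procedure shown in Figure 1: the helper names deploy_war and jetty_command are made up, the path /opt/solr-home is a placeholder, and the command assumes a Jetty installation that is started with java -jar start.jar from JETTY_HOME.

```python
import shutil
from pathlib import Path


def deploy_war(solr_home, jetty_home):
    """Step 1: copy solr.war from the Solr distribution into Jetty's webapps folder."""
    src = Path(solr_home) / "example" / "webapps" / "solr.war"
    dst = Path(jetty_home) / "webapps" / "solr.war"
    shutil.copy2(src, dst)
    return dst


def jetty_command(solr_config_dir):
    """Step 2: build the Jetty start command, run from JETTY_HOME.

    -Dsolr.solr.home tells Solr where its configuration and index live.
    Assumption: this Jetty version starts with `java -jar start.jar`.
    """
    return ["java", f"-Dsolr.solr.home={solr_config_dir}", "-jar", "start.jar"]


# "/opt/solr-home" is a placeholder for your own configuration and index folder.
print(" ".join(jetty_command("/opt/solr-home")))
```

Keeping the copy step and the start command in separate functions means the same command builder can be reused for any folder passed as solr.solr.home.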