
Magento - Apache Solr Integration - Part II (setup)

Continuing with the short series about integrating Magento with Apache Solr (started in this post), we'll now see how to set up and put Solr to work together with Magento.

Installing Solr

Solr integration has been available since Magento Enterprise 1.8. For now, it is available only in the Enterprise edition, not in the Community edition.
The first thing to do is download Apache Solr. Grab it from here and extract the archive (a typical location on Linux would be /usr/local/share/).

Note: to be able to run Solr, you will need Java installed on your server. I'm assuming you already have that. If you don't, just google how to install it; instructions are everywhere...

Once Solr is downloaded and extracted, we can test whether it runs. Just go to [solr-path]/example and type the following in a terminal:

java -jar start.jar

After executing that line, the terminal should start spitting information similar to this:

solr2.png

What's this? This is Solr running inside Jetty, which is how it ships when you download it. You can run it in other servlet containers (such as Tomcat), but that is outside the scope of this post. And, hey, Jetty is as cool as Tomcat, okay? :)

This terminal must be kept open while we use Solr. Of course, in a production environment you wouldn't start Solr this way; you would make it start automatically each time the server boots. Take a look here for more info on how to do that.

One last check to make sure that Solr is running on your machine: go to localhost:8983/solr in your browser. You should get this:

solr3.png

Like it says: Welcome!
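If you prefer to run this check from a script instead of the browser, here is a minimal Python sketch. It assumes the default URL mentioned above (localhost:8983/solr); the function name solr_is_up is just an illustration, not part of Solr or Magento.

```python
import urllib.request


def solr_is_up(base_url="http://localhost:8983/solr", timeout=5.0):
    """Return True if an HTTP GET on the Solr base URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, timeouts and refused connections all derive from OSError.
        return False
```

Calling solr_is_up() while the Jetty terminal is open should return True; once you stop Solr it should return False.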

Configuring Solr for Magento

Great, Solr is working. Now, what do we need to do to make it work with Magento? One of the keys to working with Apache Solr is defining the right XML config files. The two basic files are:

  • solrconfig.xml: contains most of the parameters for configuring Solr itself.
  • schema.xml: contains all of the details about which fields your site is using, how these fields should be added to the index, and how they should be returned for queries.

These files need to be configured according to the job we want Solr to do. Fortunately, the Magento team has already prepared them for us. All we need to do is copy the Solr conf directory from our Magento Enterprise installation (remember, 1.8 or higher) and replace the original Solr conf directory with it.

In Magento, the folder is located in: [magento-instance-root]/lib/Apache/Solr/conf.
In Solr, the folder is located in [Solr-instance-root]/example/solr/conf.
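The copy can be scripted. The following is only a sketch in Python, using the two paths given above. Keeping the original directory as a backup named conf.orig is my own assumption (you can simply delete it instead), and replace_solr_conf is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def replace_solr_conf(magento_root, solr_root, backup_suffix=".orig"):
    """Replace Solr's conf directory with the one shipped inside Magento."""
    source = Path(magento_root) / "lib" / "Apache" / "Solr" / "conf"
    target = Path(solr_root) / "example" / "solr" / "conf"
    if not source.is_dir():
        raise FileNotFoundError(f"Magento conf directory not found: {source}")
    if target.exists():
        # Assumption: keep the stock configuration next to the new one as a backup.
        backup = target.with_name(target.name + backup_suffix)
        if backup.exists():
            raise FileExistsError(f"Backup already exists: {backup}")
        target.rename(backup)
    shutil.copytree(source, target)
    return target
```

For example, replace_solr_conf("/var/www/magento", "/usr/local/share/solr") would do the job for an installation in those (made-up) locations.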

The trick is just to copy the directory from Magento and replace the one in Solr. That's it! If you look closely, you'll see that the directory contains not just these two files, but also a bunch of others, most of them called "protwords_??.txt" or "spellings_??.txt" (where ?? is a two-letter language code, such as EN, ES, etc.). Solr uses these files to handle searches in specific languages, and they allow a lot of fine-tuning of your searches. Out of the box, the Magento config comes with settings for using many different languages with Solr.
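To see which languages a given conf directory covers, you could scan it for those file names. This is a small Python sketch that relies only on the naming pattern described above (protwords_??.txt and spellings_??.txt); configured_languages is a hypothetical helper, and the actual set of files may differ between Magento versions.

```python
import re
from pathlib import Path

# Matches e.g. protwords_en.txt or spellings_ES.txt and captures kind and language code.
LANGUAGE_FILE = re.compile(r"^(protwords|spellings)_([A-Za-z]{2})\.txt$")


def configured_languages(conf_dir):
    """Map each language code found in conf_dir to the kinds of files present for it."""
    found = {}
    for path in Path(conf_dir).iterdir():
        match = LANGUAGE_FILE.match(path.name)
        if match:
            kind, code = match.group(1), match.group(2).lower()
            found.setdefault(code, set()).add(kind)
    return found
```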

After replacing the directory, stop Solr (if it is still running) by pressing Ctrl+C in the terminal. Then start it again (java -jar start.jar). This time, you'll see some new stuff:

solr4.png

Those are all good signs. Solr has picked up its new configuration and is now ready to talk to Magento.
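If you restart Solr often while experimenting, the stop/start cycle can be wrapped in a few lines of Python. This is a rough sketch and not a replacement for a proper service setup: it runs the same java -jar start.jar command from the example directory, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def solr_start_command(solr_root):
    """Return the command and working directory used to start the bundled Jetty."""
    return ["java", "-jar", "start.jar"], Path(solr_root) / "example"


def start_solr(solr_root):
    """Start Solr as a child process and return the Popen handle."""
    command, working_dir = solr_start_command(solr_root)
    return subprocess.Popen(command, cwd=working_dir)


def stop_solr(process, timeout=30):
    """Ask the child process to stop and wait until it exits."""
    # terminate() sends SIGTERM on POSIX, which is not the same signal as Ctrl+C (SIGINT).
    process.terminate()
    process.wait(timeout=timeout)
```

A restart is then stop_solr(process) followed by process = start_solr(solr_root).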

Enabling Solr In Magento

Solr is ready, and it knows about Magento's dialects and tastes. It is eager to work with Magento, but Magento still isn't aware of Solr's existence! Let's change that and make them good friends.
That is done in the Magento admin site. Go to the System menu, then the Configuration option. In the left panel, click on Catalog. Finally, in the options that appear in the central panel, choose Catalog Search. You'll get this screen:

Selection_006_0.png

 

Here, we can tell Magento to use Solr, by selecting it in the Search Engine dropdown. Once we do that, the options change to this:

Selection_008_0.png

The configuration is very simple. Just provide the right info about Solr's server, port, and (if needed) authentication. We are running Solr on the same machine as Magento, which is why we left the default options selected (localhost, port 8983). Running Solr on a dedicated server could be a very good idea if your site has lots of traffic.
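To make clear how these settings fit together, here is a small Python sketch that builds the Solr URL from host and port and attaches credentials to a request. Two assumptions of mine: that the path is "solr" (as in the localhost:8983/solr address used earlier) and that the authentication in front of Solr is HTTP Basic. The helper names are hypothetical.

```python
import base64
import urllib.request


def solr_url(host="localhost", port=8983, path="solr"):
    """Build the base URL from the same pieces the admin form asks for."""
    return f"http://{host}:{port}/{path.strip('/')}"


def solr_request(url, username=None, password=None):
    """Create a request object, adding an HTTP Basic header when a username is given."""
    request = urllib.request.Request(url)
    if username is not None:
        credentials = f"{username}:{password or ''}".encode("utf-8")
        token = base64.b64encode(credentials).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request
```

The resulting object can be passed to urllib.request.urlopen() to check the connection from the Magento server's point of view.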

Now, let's click on that colorful Test Connection button:

Selection_009_0.png

Good stuff! Magento now relies on Solr, and Solr is ready for Magento. We are just one step away from offering a much better search experience to our customers.

Indexing Magento information in Solr

The data must be sent to Solr so it can work its magic. That is very simple! We just need to rebuild the Magento indexes in the usual way. Go to System, Index Management, and reindex them.
In the Solr console, you'll see a lot of activity while the reindexing runs, with things like this:

root%40abressan-nb%3A-usr-local-share-so

It is not crucial to understand what Solr is saying here, but it is reassuring to see that there is movement. That's the signal that the information is being sent to Solr.
Go and grab a cup of coffee while Magento and Solr share the info. 
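Once the reindex has finished, you can ask Solr how many documents it holds. The Python sketch below assumes that the standard /select handler with wt=json is enabled in the configuration you are running, which I have not verified for every Magento version; the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def parse_num_found(body):
    """Extract the total hit count from a Solr JSON response body."""
    return int(json.loads(body)["response"]["numFound"])


def count_indexed_docs(base_url="http://localhost:8983/solr", timeout=10.0):
    """Query for all documents with rows=0, so only the count is returned."""
    query = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    with urllib.request.urlopen(f"{base_url}/select?{query}", timeout=timeout) as response:
        return parse_num_found(response.read().decode("utf-8"))
```

A count of zero after reindexing would suggest that the data never reached Solr.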

Checking stuff in the Front End

Now, go to your store page, and start searching using Solr. If your catalog is very large, you should notice the performance improvement right away. And you will also have new features available (assuming you enabled them in the search configuration earlier). For example, this is a screenshot of the Suggestions feature:

Selection_011.png

Out of the box, you get the following features working with Magento:

  • The Product search
  • The Navigation search (if you click "Furniture", Solr can return the products in this category).
  • The Faceted Search
  • The Suggestions
  • Search Recommendations (you can associate terms with other terms. Good for a "You may also like" feature for your customers).

Oddly, it seems that search autocomplete does not go through Solr. I guess the Magento team could not implement that in time, but I imagine it will be coming in the near future (can anyone on the Magento team confirm this?)

Hoping this has been helpful!

Coming up in this series, we'll publish how to index custom information in Solr from Magento, and how to create an Ajax UI to allow the users to search that info. Stay tuned!

Author

Managing Partner
Aldo works as a general mentor for the development teams, keeping in direct contact with programming and design.

Comments

I'd like to see this available for Community Edition. Any ideas on getting this to work?

This is a great tutorial for Solr with Magento. Thanks, P.Kannan

Hi Aldo, Thanks for this post. It's very helpful. I have a question regarding replacing the conf directory of Solr with the Magento Solr conf directory. You mentioned that: In Magento, the folder is located in: [magento-instance-root]/lib/Apache/Solr/conf. In Solr, the folder is located in [Solr-instance-root]/example/solr/conf. We need to copy Magento's conf dir to Solr. However, our Solr server is separate and also needs to be integrated with other systems. If I copy Magento's conf dir to the Solr conf dir, will it affect the other systems? Thanks,

Hi Mruni! Yeah, this could interfere with other systems. I'm not sure how you are planning to share Solr between several systems (will you have a single index with the data of all of them? will you separate the info through queries, or do you actually want to share the index between all of them?), but you can definitely separate things and avoid this problem. This is the way I have done it:
a) Copy the directory [Solr-instance-root]/example, with all its content, into the same parent folder, but under a different name. For example: [Solr-instance-root]/index-magento
b) To clean things up and avoid carrying over the previous index, delete the content of the data directory, located (following my example) in [Solr-instance-root]/index-magento/solr/data. You'll find two directories there, index and spellchecker: delete them both. Solr will generate them afresh when you start it.
c) Now we need to make this 'index-magento' instance start on a different port than the original one, so the two don't clash. That is done through this file: [Solr-instance-root]/index-magento/etc/jetty.xml. Open it, look for the previous port (if it was not changed, it should be 8983), and change it to a new, different port. The line to modify is this: <Set name="port"><SystemProperty name="jetty.port" default="8984"/></Set> In this example, I already changed it to 8984.
d) Replace [Solr-instance-root]/index-magento/solr/conf with the conf folder from Magento.
e) Now you can start this Solr instance side by side with the original one, and point your application(s) to it through its specific port.
Hope this helps! Thank you!
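For anyone who wants to script steps a) to c) of this reply, here is a rough Python sketch. It assumes the jetty.xml port line looks like the one quoted in the reply; clone_solr_instance is a hypothetical helper, and step d) (copying Magento's conf directory) is left out.

```python
import re
import shutil
from pathlib import Path

# Matches the default port inside <SystemProperty name="jetty.port" default="..."/>.
PORT_PATTERN = re.compile(r'(<SystemProperty\s+name="jetty\.port"\s+default=")\d+(")')


def clone_solr_instance(solr_root, name="index-magento", port=8984):
    """Copy the example instance, drop its old index data and move it to a new port."""
    source = Path(solr_root) / "example"
    target = Path(solr_root) / name
    shutil.copytree(source, target)  # step a: copy the whole instance
    for leftover in ("index", "spellchecker"):  # step b: remove the old index data
        shutil.rmtree(target / "solr" / "data" / leftover, ignore_errors=True)
    jetty_xml = target / "etc" / "jetty.xml"  # step c: change the Jetty port
    text, replaced = PORT_PATTERN.subn(
        rf"\g<1>{port}\g<2>", jetty_xml.read_text(encoding="utf-8")
    )
    if replaced == 0:
        raise ValueError(f"No jetty.port setting found in {jetty_xml}")
    jetty_xml.write_text(text, encoding="utf-8")
    return target
```

The original example directory is left untouched, so both instances can run side by side.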

Thanks for the detailed step wise solution. I will try it out.

Hi, so I set Solr up and everything worked PERFECTLY - it's awesome. Then after a week, it stopped returning results all of a sudden. Anyone know why this would happen? Is there some sort of script that updates the index? I'll appreciate any feedback.

Hi Joel! I can think of some possible causes: are you starting Solr in your server boot process? Could it be that the server went down for some reason and Solr was not started again? I would check in the admin whether the Solr server can be contacted (the colorful button part in this post). It is weird that you don't get any results, though: Magento tries to get the search results from Solr, but if Solr does not answer, Magento falls back to the default MySQL search. You should still have results even if Solr is down (slower, of course); only the Solr-specific features ("did you mean...", related searches, etc.) stop working.

Hi, I have a custom Vendor Extension on my system. I have configured SOLR for Magento and it is working fine for products. However, it is not searching information related to Vendor or CMS pages. What steps do I need to take so that SOLR includes Vendor related info? Thanks.

Hi... well, how to make Solr handle custom info is pretty much what this series of posts is about. Take a look especially at parts III and IV. Hope that helps a bit, at least with the overall idea of what it takes.

Hi Aldo, The only fields that Magento seems to be sending to Solr are: status, timestamp, store_id, id, short_description, sku, price, name, in_stock, description_en, fulltext_en, attr_select_tax_class_id, attr_select_status. Plenty of other fields that are available through SOAP such as URL, Images, key words, categories, etc. are not available in solr. Do you have an idea on how to enable them so they show up in the Solr index? Thanks. Max

Hi Aldo, Thanks for this post. Everything worked for me as described above, except the final step (Checking stuff in the Front End): searching from the front end is not working for me, and it doesn't show suggestions either. My Magento app and Solr app are located on two different servers. Can you please assist me with this? Is it necessary to map the database here? Looking forward to your response. Thanks, Rajendra

Hi, folks. We ended up doing the exact same thing for Magento CE version 1.4. Using a custom Java app, we extracted what we needed directly from Magento's MySQL database tables and inserted them into Solr after having configured the Solr schema to reflect Magento's product and category fields. We also hooked into Magento's event-observer system so that some PHP code is run whenever anyone performs a catalog reindex (from the Magento admin site). All the PHP code does is write a particular file to a particular directory on the filesystem. A daemon watches for the presence of this file and, if it sees it, triggers a Solr reindex (and also memcached flush, in our case). It's working great so far in our testing and we hope to go live with it on our new site within the next month. One extra note: we configured our Solr schema to store (but not index) certain fields like product URL and product thumbnail, so that when presenting search results we don't need to make any extra calls to either Magento or MySQL.
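The trigger-file mechanism described in this comment could look roughly like the following Python sketch. It is only an illustration of the idea, not the commenter's actual daemon: the function names and the polling interval are assumptions, and on_trigger stands for whatever starts the Solr reindex.

```python
import time
from pathlib import Path


def consume_trigger(trigger_file, on_trigger):
    """Run on_trigger once if the trigger file exists; return whether it ran."""
    path = Path(trigger_file)
    if not path.exists():
        return False
    # Remove the file before running, so a request written during the run is kept.
    path.unlink()
    on_trigger()
    return True


def watch(trigger_file, on_trigger, interval=5.0):
    """Poll for the trigger file forever, checking every `interval` seconds."""
    while True:
        consume_trigger(trigger_file, on_trigger)
        time.sleep(interval)
```

The PHP observer only has to create the file; the watcher picks it up on its next pass.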
