Rails Against The Machine

Just a mind dump. Why are you even reading this?

Thursday, 17 January 2008

 

Acts as solr (reprise)

OK, time to revisit acts_as_solr.

Prerequisites:

* A functioning Ruby on Rails application, e.g. the depot one.
* Java SDK 1.5 or above

Part 1: download and install acts as solr

Go to the root of your Rails application (e.g. cd ...\depot) and then type:

ruby script/plugin install svn://svn.railsfreaks.com/projects/acts_as_solr/trunk


Part 2: get solr running.

Now, according to the tutorial on the acts_as_solr web site, all you do is type rake solr:start. However, on a Windows machine you will probably see the following error message:

rake aborted!
Bad file descriptor - connect(2)
(See full trace by running task with --trace)


This is because the existing rake task uses ‘fork’, which is not available on Windows.
There are two solutions to this. The first is to start the Solr servlet manually.

From the depot\vendor\plugins\acts_as_solr\solr directory, run:
java -jar start.jar



The servlet should start and can be seen at: http://localhost:8982/solr/admin/
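
If you would rather confirm from Ruby that something is actually listening before going further, a minimal sketch could look like this. The helper name port_open? is hypothetical (it is not part of acts_as_solr), and 8982 is simply the port from the URL above.

```ruby
require 'socket'

# Hypothetical helper, not part of acts_as_solr: returns true if a TCP
# connection to host:port succeeds, false if the connection fails.
def port_open?(host, port)
  socket = TCPSocket.new(host, port)
  socket.close
  true
rescue SystemCallError
  # Covers Errno::ECONNREFUSED as well as the Errno::EBADF that Windows reported above.
  false
end
```

Calling port_open?('localhost', 8982) should then come back true once the servlet is up.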


The longer-term solution comes from the Web on Rails blog, which provides a new rake task that works on Windows. The hack is as follows:

desc 'Starts Solr on Windows. Options accepted: RAILS_ENV=your_env, PORT=XX. Defaults to development if none.'
task :start_win do
  begin
    # If something already answers on the Solr port, do not start a second copy.
    n = Net::HTTP.new('localhost', SOLR_PORT)
    n.request_head('/').value

  rescue Net::HTTPServerException # responding
    puts "Port #{SOLR_PORT} in use"

  rescue Errno::EBADF, Errno::ECONNREFUSED # not responding
    Dir.chdir(SOLR_PATH) do
      puts "Starting #{ENV['RAILS_ENV']} Solr on port #{SOLR_PORT}."
      # exec replaces the rake process with Java, so nothing after this line would run.
      exec "java -Dsolr.data.dir=solr/data/#{ENV['RAILS_ENV']} -Djetty.port=#{SOLR_PORT} -jar start.jar"
    end
  end
end


Add it to vendor/plugins/acts_as_solr/lib/tasks/solr.rake, and start the Solr server on Windows by issuing the following command:
rake solr:start_win 
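
Because exec replaces the rake process, the task above never hands the prompt back. On Ruby 1.9 or later, Process.spawn is one way to launch a child without fork, and it works on Windows too. This is only a sketch: start_in_background is a hypothetical helper, and the java command in the trailing comment is adapted from the task above.

```ruby
require 'rbconfig'

# Hypothetical helper: launches a command as a child process without fork
# and returns its pid straight away. Needs Ruby 1.9+ for Process.spawn.
def start_in_background(*command)
  Process.spawn(*command)
end

# Harmless demonstration: run a tiny Ruby child and wait for it to finish.
pid = start_in_background(RbConfig.ruby, '-e', 'exit 0')
Process.wait(pid)
puts "child #{pid} finished with status #{$?.exitstatus}"

# For Solr you would pass the java command instead, for example
#   start_in_background('java', "-Djetty.port=#{SOLR_PORT}", '-jar', 'start.jar')
# and call Process.detach(pid) rather than waiting for it.
```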



OK, let's roll and add search to some models.


class Job < ActiveRecord::Base
  acts_as_solr :include => [:category], :fields => [:name, :description, :resolution]
  belongs_to :category
  belongs_to :user
end

class Category < ActiveRecord::Base
  has_many :jobs
  acts_as_solr :include => [:jobs]
end


What is neat about this is that it does model association indexing, which means you can include any :has_one, :has_many, :belongs_to and :has_and_belongs_to_many association to be indexed. So when we search for jobs we can also match on their category.

Let's try it:

The default behaviour of acts_as_solr is to index model objects automatically upon save or update of a record. But if you have existing data you need to go to the Rails console and rebuild the index:
ruby script/console
Job.rebuild_solr_index
Category.rebuild_solr_index


If you see => true, everything is OK; if not, something is wrong.
Is the port that Solr is running on the same as the one specified in config/solr.yml?
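
To compare the two ports without guessing, you could read the URL out of the config file. This sketch assumes config/solr.yml holds one url entry per environment, which is how I remember the plugin's default file looking, so check your own copy. The YAML is inlined so the snippet runs on its own, and configured_solr_port is a made-up name.

```ruby
require 'yaml'
require 'uri'

# Assumed layout of config/solr.yml, inlined so the example is self-contained.
SAMPLE_SOLR_YML = <<-EOS
development:
  url: http://localhost:8982/solr
test:
  url: http://localhost:8981/solr
EOS

# Hypothetical helper: returns the port configured for the given environment.
def configured_solr_port(yaml_text, env)
  config = YAML.load(yaml_text)
  URI.parse(config[env]['url']).port
end

puts configured_solr_port(SAMPLE_SOLR_YML, 'development')   # prints 8982
```

If that number differs from the port Solr was started on, the rebuild will fail to connect.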

So let's now try a search:
@results = Job.find_by_solr("code monkey")


So everything works in principle. Now let's add a search box to our application.

Paginating search

Add the following file to lib in your Rails project (as solr_pagination.rb) and make sure it is loaded by adding require 'solr_pagination' to config/environment.rb.

module ActsAsSolr
  module PaginationExtension

    def paginate_search(query, options = {})
      options, page, per_page = wp_parse_options!(options)
      # Empty collection, used only to work out the offset for this page.
      pager = WillPaginate::Collection.new(page, per_page, nil)
      options.merge!(:offset => pager.offset, :limit => per_page)
      result = find_by_solr(query, options)
      # Build the real collection now that the total hit count is known.
      returning WillPaginate::Collection.new(page, per_page, result.total_hits) do |pager|
        pager.replace result.docs
      end
    end

  end
end

module ActsAsSolr::ClassMethods
  include ActsAsSolr::PaginationExtension
end


This is adapted from a similar hack for acts_as_ferret. What it does is create a new WillPaginate::Collection (the collection class from the will_paginate plugin) for the requested page. Subsequent calls then retrieve a new collection but with a different offset.
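
The offset arithmetic is the whole trick, so here it is on its own. This is plain Ruby with no will_paginate involved; page_window is a hypothetical helper that mirrors the :offset and :limit options handed to find_by_solr above.

```ruby
# Hypothetical helper: turns a 1-based page number into the :offset/:limit
# pair that a paginated query needs.
def page_window(page, per_page)
  page = [page.to_i, 1].max   # treat nil or 0 as the first page
  { :offset => (page - 1) * per_page, :limit => per_page }
end

p page_window(1, 10)   # first page starts at offset 0
p page_window(3, 10)   # third page starts at offset 20
```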

Our controller is something like:

def search
  begin
    @jobs = Job.paginate_search params[:query_string], :page => params[:page], :per_page => 10
    @query_string = params[:query_string]
  rescue
    # handle any errors here
    # flash[:notice] = 'There was a problem with your query'
    @jobs = Job.paginate :page => params[:page]
  end
end


while our view is something like:

<% form_tag :action => 'search' do %>
<%= text_field_tag :query_string , @query_string %>
<%= submit_tag "Search" %>
<% end %>
<p><b><%= pluralize @jobs.total_entries, 'job' %></b> found.</p>
<table>
<%= render :partial => 'job', :collection => @jobs %>
</table>

<%= will_paginate @jobs, :params =>{:query_string=> @query_string} -%>


The only issue is that we need to make sure will_paginate passes the query string as an additional parameter; otherwise we will get no results when we click 'next page'.

will_paginate @jobs, :params =>{:query_string=> @query_string} 


We can stop here if we want straightforward search, but let's be a little more interesting and try to implement live search. To do this we change the controller action to include a response to an Ajax request.

def search
  begin
    @jobs = Job.paginate_search params[:query_string], :page => params[:page], :per_page => 3
    @query_string = params[:query_string]
  rescue
    # handle any errors here
    # flash[:notice] = 'There was a problem with your query'
    @jobs = Job.paginate :page => params[:page]
  end
  respond_to do |format|
    format.html do
      # If the request is a live search just return the rendered search results
      # rather than the whole page
      render :action => 'live_search.html.erb', :layout => false if request.xml_http_request?
    end
  end
end


So for an Ajax request it basically just returns another view called "live_search". We then set an observer on the search field so that it calls this action and replaces the search results.

 <%= observe_field(:query_string, :url =>{ :controller => :jobs, :action => :search }, :frequency => 0.5, :update => :search_results, :with => "'query_string=' + escape(value)") %>


The problem with this is that you get a call to your server and Solr every time the user types, which isn't exactly sustainable. So what we really want is a local search over the existing results every time the user types, and then a Solr search if:

1) The user presses enter
2) Local search returns no results
3) Perhaps when the user presses space
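
Those three rules, written out as a sketch. In a real page this logic would live in the browser's JavaScript; it is in Ruby here only to match the rest of the post. The names filter_locally and ask_solr? and the :enter / :space key symbols are invented for illustration.

```ruby
# Filter the results already on the page by a case-insensitive substring match.
def filter_locally(cached_names, typed)
  needle = typed.strip.downcase
  cached_names.select { |name| name.downcase.include?(needle) }
end

# Decide whether this keystroke should go back to the server (and Solr).
def ask_solr?(key, local_matches)
  key == :enter ||            # 1) the user pressed enter
    local_matches.empty? ||   # 2) nothing matched locally
    key == :space             # 3) perhaps on space as well
end

cached  = ['Code monkey', 'Code reviewer', 'Project manager']
matches = filter_locally(cached, 'code')
p matches
p ask_solr?(:letter, matches)
p ask_solr?(:letter, filter_locally(cached, 'xyz'))
```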

Something like: http://pushrod.wordpress.com/2007/12/18/solving-the-live-searchslow-mongrel-process-problem/

One last tip: if you want fuzzy matching (which will be robust to typos), append a tilde ("~") to your query string.

search_results=Product.find_by_solr(query+"~", :scores => true)
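
One caveat: in Solr's (Lucene) query syntax the tilde applies to the single term in front of it, so query+"~" only fuzzes the last word of a multi-word query. Here is a small sketch that marks every word instead. fuzzy_query is a hypothetical helper, not part of acts_as_solr, and it does no escaping of special query characters.

```ruby
# Hypothetical helper: appends "~" to each whitespace-separated term so that
# every word in the query is matched fuzzily.
def fuzzy_query(query)
  query.split.map { |term| "#{term}~" }.join(' ')
end

puts fuzzy_query('code monkey')   # prints code~ monkey~
```

You would then pass fuzzy_query(query) to find_by_solr in place of query+"~".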

Yay!

BTW, don't be a doofus: before you deploy, make sure your Solr server cannot be accessed from outside!
