Rails Against The Machine

Just a mind dump. Why are you even reading this?

Thursday, 17 January 2008

Acts as solr (reprise)

OK, time to revisit acts_as_solr.

Prerequisites:

* A functioning Ruby on Rails application, e.g. the depot one.
* Java SDK 1.5 or above

Part 1: download and install acts as solr

Go to the root of your Rails application (e.g. cd ...\depot) and then type:

ruby script/plugin install svn://svn.railsfreaks.com/projects/acts_as_solr/trunk

Part 2: get solr running.

Now, according to the tutorial on the acts_as_solr web site, all you do is type rake solr:start. However, on a Windows machine you will probably see the following error message:

rake aborted!
Bad file descriptor - connect(2)
(See full trace by running task with --trace)

This is because the existing rake task uses ‘fork’, which is not available on Windows.
There are two solutions to this. The first is to start the Solr servlet manually from the plugin's solr directory:

cd depot\vendor\plugins\acts_as_solr\solr
java -jar start.jar

The servlet should start and can be seen at: http://localhost:8982/solr/admin/
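
If you would rather check from Ruby than from the browser, here is a minimal sketch that just tests whether anything is accepting connections on the Solr port. The helper name port_open? is made up for this example, and 8982 is simply the port from the URL above. It only tells you that something is listening, not that it is definitely Solr.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper: true if something accepts TCP connections on host:port.
def port_open?(host, port, seconds = 2)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue SystemCallError, SocketError, Timeout::Error
  # Refused, unreachable or too slow. A refused connection that shows up as
  # Errno::EBADF (as in the rake error above) is a SystemCallError too.
  false
end

puts port_open?('localhost', 8982) ? 'Something is listening on 8982' : 'Nothing is listening on 8982'
```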

The longer-term solution comes from the Web on Rails blog, which provides a new rake task that works on Windows. The hack is as follows:

  desc 'Starts Solr on Windows. Options accepted: RAILS_ENV=your_env, PORT=XX. Defaults to development if none.'
  task :start_win do
    begin
      n = Net::HTTP.new('localhost', SOLR_PORT)
      n.request_head('/').value

    rescue Net::HTTPServerException # something is already responding on the port
      puts "Port #{SOLR_PORT} in use"

    rescue Errno::EBADF # nothing is responding, so start Solr
      Dir.chdir(SOLR_PATH) do
        puts "Starting #{ENV['RAILS_ENV']} Solr on port #{SOLR_PORT}."
        # exec replaces the rake process, so nothing after this line would run.
        # Solr stays in the foreground of this console until you stop it.
        exec "java -Dsolr.data.dir=solr/data/#{ENV['RAILS_ENV']} -Djetty.port=#{SOLR_PORT} -jar start.jar"
      end
    end
  end

Add it to vendor/plugins/acts_as_solr/lib/tasks/solr.rake, and start the Solr server on Windows by issuing the following command:

rake solr:start_win

OK, let's roll and add search to some models:

class Job < ActiveRecord::Base
  acts_as_solr :include => [:category], :fields=>[:name,:description,:resolution]
  belongs_to :category
  belongs_to :user
end

class Category < ActiveRecord::Base
  has_many :jobs
  acts_as_solr :include => [:jobs]
end

What is neat about this is that it does model association indexing: you can include any :has_one, :has_many, :belongs_to or :has_and_belongs_to_many association in the index. So when we search for jobs, we can also match on their category.

Let's try it:

The default behaviour of acts_as_solr is to index model objects automatically whenever a record is saved or updated. But if you have existing data, you need to go to the Rails console and rebuild the index:

ruby script/console
Job.rebuild_solr_index
Category.rebuild_solr_index

If you see => true, everything is OK; if not, something is wrong.
Is the port that Solr is running on the same as the one specified in config/solr.yml?
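
A quick way to answer that question is to read the port straight out of the config. This is only a sketch: it assumes solr.yml has one url entry per environment, and it uses an inline sample string instead of the real file so it runs anywhere. The production port in the sample is illustrative, and configured_solr_port is a made-up helper name.

```ruby
require 'yaml'
require 'uri'

# Stand-in for config/solr.yml (assumed layout: one url per environment).
SAMPLE_SOLR_YML = <<EOS
development:
  url: http://localhost:8982/solr
production:
  url: http://localhost:8983/solr
EOS

# Hypothetical helper: the port Rails would talk to in the given environment.
def configured_solr_port(yaml_text, env)
  config = YAML.load(yaml_text)
  URI.parse(config[env]['url']).port
end

puts configured_solr_port(SAMPLE_SOLR_YML, 'development')
```

In a real project you would pass File.read('config/solr.yml') instead of the sample string and compare the result with the port in the URL where the Solr admin page answers.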

So let's now try to search:

@results = Job.find_by_solr("code monkey")

So everything works in principle. Now let's add a search box to our application.

paginating search

Add the following file to lib in your Rails project (as lib/solr_pagination.rb, to match the require) and make sure it is loaded by adding require 'solr_pagination' to config/environment.rb.

module ActsAsSolr
  module PaginationExtension   

    # Runs a Solr query and wraps the current page of hits in a WillPaginate::Collection.
    def paginate_search(query, options = {})
      options, page, per_page = wp_parse_options!(options)
      # Throwaway collection, used only to turn page/per_page into an offset.
      pager = WillPaginate::Collection.new(page, per_page, nil)
      options.merge!(:offset => pager.offset, :limit => per_page)
      result = find_by_solr(query, options)
      returning WillPaginate::Collection.new(page, per_page, result.total_hits) do |pager|
        pager.replace result.docs
      end
    end


  end
end

module ActsAsSolr::ClassMethods
  include ActsAsSolr::PaginationExtension
end

This is adapted from a similar hack for acts_as_ferret. What it does is create a new WillPaginate::Collection (the class will_paginate uses to hold one page of results). Subsequent calls then retrieve a new collection, but with a different offset.
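
To make the offset business concrete, here is a small standalone sketch of the arithmetic, with no Rails or will_paginate needed. The helper names solr_window and total_pages are invented for this example; the real work in the module above is done by WillPaginate::Collection.

```ruby
# Hypothetical helper: the :offset/:limit pair to send to Solr for a 1-based page.
def solr_window(page, per_page)
  page = [page.to_i, 1].max # treat nil or 0 as the first page
  { :offset => (page - 1) * per_page, :limit => per_page }
end

# Hypothetical helper: how many pages are needed to show total_hits results.
def total_pages(total_hits, per_page)
  (total_hits.to_f / per_page).ceil
end

p solr_window(1, 10)
p solr_window(3, 10)
p total_pages(25, 10)
```

So with 10 results per page, page 3 starts at offset 20, and 25 hits need 3 pages.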

Our controller is something like:

  def search
    begin
      @jobs = Job.paginate_search params[:query_string], :page => params[:page], :per_page => 10
      @query_string = params[:query_string]
    rescue
      # handle any errors here
      # flash[:notice] = 'There was a problem with your query'
      @jobs = Job.paginate :page => params[:page]
    end
  end

while our view is something like:

<% form_tag :action => 'search' do %>
<%= text_field_tag :query_string , @query_string %>
<%= submit_tag "Search" %>
<% end %>
<p><b><%= pluralize @jobs.total_entries, 'job' %></b> found.</p> 
<table>
<%= render :partial => 'job', :collection => @jobs %>
</table>

<%= will_paginate @jobs, :params =>{:query_string=> @query_string} -%>

The only issue is that we need to make sure will_paginate passes the query string as an additional parameter; otherwise we will get no results when we click 'next page'.

will_paginate @jobs, :params =>{:query_string=> @query_string}

We can stop here if we want straightforward search, but let's be a little more interesting and try to implement live search. To do this, we change the controller action to include a response to an Ajax request.

  def search
    begin
      @jobs = Job.paginate_search params[:query_string], :page => params[:page], :per_page => 3
      @query_string = params[:query_string]
    rescue
      # handle any errors here
      # flash[:notice] = 'There was a problem with your query'
      @jobs = Job.paginate :page => params[:page]
    end
    respond_to do |format|
      format.html do
        # If the request is a live search, just return the rendered search results
        # rather than the whole page
        render :action => 'live_search.html.erb', :layout => false if request.xml_http_request?
      end
    end
  end

So it basically just returns another view called "live_search". We then set an observer on the search field so that it calls this action and replaces the search results.

 <%= observe_field(:query_string, :url =>{ :controller => :jobs, :action => :search }, :frequency => 0.5, :update => :search_results, :with => "'query_string=' + escape(value)") %>

The problem with this is that you get a call to your server and Solr every time the user types, which isn't exactly sustainable. So what we really want is a local search over the existing results every time the user types, and a Solr search only if:

1) The user presses enter
2) Local search returns no results
3) Perhaps when the user presses space
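
The three rules above can be sketched as a small decision function. In a real page this logic would live in JavaScript in the browser; it is written in Ruby here only to stay in the same language as the rest of the post. Everything in it (local_matches, ask_solr?, the :enter and :space symbols, matching on titles) is an assumption made up for illustration.

```ruby
# Hypothetical: filter the results already on the page, no server round trip.
def local_matches(cached_titles, query)
  needle = query.strip.downcase
  cached_titles.select { |title| title.downcase.include?(needle) }
end

# Hypothetical: decide whether this keystroke should go to Solr.
def ask_solr?(query, last_key, cached_titles)
  return false if query.strip.empty?                        # nothing to search for
  return true if last_key == :enter                         # 1) user pressed enter
  return true if local_matches(cached_titles, query).empty? # 2) nothing found locally
  last_key == :space                                        # 3) perhaps on space
end

cached = ['Code monkey wanted', 'Rails developer']
puts ask_solr?('code', :other, cached) # local results exist, so stay local
puts ask_solr?('java', :other, cached) # nothing local, so go to Solr
```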

Something like: http://pushrod.wordpress.com/2007/12/18/solving-the-live-searchslow-mongrel-process-problem/

One last tip: if you want fuzzy matching (which will be robust to typos), append a tilde "~" to your query string:

search_results=Product.find_by_solr(query+"~", :scores => true)
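
One thing to watch: in the Lucene query syntax that Solr uses, the tilde only applies to the term right before it, so "code monkey~" makes only "monkey" fuzzy. A naive sketch for making every word fuzzy follows. The helper name fuzzify is made up, and it does not try to handle quoted phrases or operators like AND and OR.

```ruby
# Hypothetical helper: append "~" to every whitespace-separated term
# that does not already end in one.
def fuzzify(query)
  query.split.map { |term| term.end_with?('~') ? term : "#{term}~" }.join(' ')
end

puts fuzzify('code monkey')
```

You would then pass fuzzify(query) to find_by_solr instead of query+"~".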

Yay!

BTW, don't be a doofus: before you deploy, make sure your Solr server cannot be accessed from outside!

posted by ~J # 14:13
