Extensible Search with Data::SearchEngine

Searching Is Important

Search is the special sauce in the webscale sauce. Wait, that's not right. That's NoSQL. Search is the feature that makes the glorious expanse of the internet usable. It's come to the point that typing a keyword into your search input box and pressing enter is faster than typing in a domain name. As such, search is an important feature in the microcosms we create as web developers. It's one of those things that we've likely all tried to implement to varying degrees in our careers. I've implemented quite a few search backends in my time. Those culminated in the use of Solr for work projects. Even with Solr in my toolbelt I occasionally find myself in need of a quick and dirty search that has no outside dependencies. This means I'll have to roll something with SQL and maybe Search::QueryParser. Regardless of how I choose to implement the search in my web application I know that I'll abstract it using Data::SearchEngine.


Data::SearchEngine has two simple goals:

Stay out of your way when implementing a search.
Provide a consistent API for using search results.

Accomplishing this goal requires some buy in on your part. It's a bit more expensive up front but I can tell you from personal experience that your search backend will grow and change in any reasonably web application. It's a good investment to take some time to design yours well.


We'll throw together a dead simple search using DBIx::Class to illustrate how easy it is to build a new backend for Data::SearchEngine. Then we'll demonstrate using it in a web application so you can see the advantages of using an abstraction layer atop your search implementation.

  package Data::SearchEngine::MySearch;
  use Moose;
  use Data::Page;

  with 'Data::SearchEngine';

  sub search {
    my ($self, $query) = @_;

	# Simple search!
	my $rs = $resultset->search({ name => { 'like' => '%'.$query.'%' } });

	# Make a results object from the rows we get back!
	my $result = Data::SearchEngine::Results->new(
        query => $query,
        pager => Data::Page->new(
			# You should really customize these based on the results of the
			# query, but this will do for now
			current_page => 1,
			entries_per_page => 10,
			total_entries => $rs->count,

	# Add the "hits" to the results object!  Note that the data goes into a
	# 'values' hashref.
	while(my $row = $rs->next) {
            values => {
                name => $row->name,
                description => $row->description
			# You could customize this, if necessary!
            score => 1

	# Give back the results object!
    return $result;

That's it! The code is really simple but the gist is: Get a resultset from DBIC and iterate over it, adding each row to the result!


I'll admit: that wasn't super-exciting. The power comes in using the search results in your application. Here's a simple TT example:

    [% FOREACH item = results.items %]
    <li>[% item.get_value('name') %]</li>
    [% END %]

The Item objects are stored in a simple list so that you can iterate over them. You get data from them using a get_value method. This allows you to add arbitrary fields to fit your implementation.

What's the Big Deal?

The big deal is that if you want to add a new search feature, backend implementation or refactor your code you don't have to change anything in your interface. As the author of a Data::SearchEngine implementation you can even implement common features such as faceting or spellchecking and expose a common interface. When your needs outgrow your homegrown DBIC implementation you can move up to Solr and not have to change any code in your view!

It should also be noted that – through the power of MooseX::Storage – the Results object is totally serializable! This means it's easy to cache!


Search implementations often start out simple and grow over time. Eventually this growth yields a messy pile of code that nobody remembers how to read any more. Data::SearchEngine can help you add search to your web application in a forward thinking way.

AUTHOR gphat: Cory G Watson <gphat@cpan.org>