A classically trained search engine

Image source: http://www.flickr.com/photos/jazbeck/

No, Amazon, when I say “Beethoven’s 2nd” I am not looking for a kids film about a freaking St Bernard.

As most of us who have ever tried it will know, searching for classical music online is a hit-and-miss affair.  There’s a reason for this.  Most search engines decide which results are relevant to you by matching words in your search query with words that appear in a particular web page, document or record.  The more words matched within a given document, the more ‘relevant’ a search engine believes it to be and the higher up the list of search results it will go.
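To make that concrete, here is a minimal Python sketch of this kind of word matching. It is a toy illustration, not how any real search engine is implemented: the function names and the three catalogue entries are invented for the example.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into words and numbers."""
    return re.findall(r"\w+", text.lower())

def naive_score(query, document):
    """Count how many query words also appear in the document."""
    doc_words = set(tokenize(document))
    return sum(1 for word in tokenize(query) if word in doc_words)

def naive_search(query, documents):
    """Return documents with at least one matching word, best score first."""
    scored = [(naive_score(query, doc), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    return [doc for score, doc in sorted(scored, key=lambda pair: -pair[0])]

# Hypothetical catalogue entries, for illustration only.
catalogue = [
    "Beethoven's 2nd (family film)",
    "Beethoven: Symphony No. 2 in D major",
    "Mozart: Piano Concerto in A major",
]

for title in naive_search("Beethoven's 2nd", catalogue):
    print(title)
```

Because the film title shares more words with the query than the symphony does, the film comes out on top. The scorer has done exactly what it was asked to do; it simply has no idea what the words mean.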

The problem with this type of search is that it is fundamentally ‘dumb’. The search engine has no built-in understanding of the words you use in your search query. It doesn’t know that a “piano concerto” is a musical form, that “Mozart” is a composer, that “A major” is a key, and that the “A major piano concerto by Mozart” is not the same thing as “a major piano concerto by Mozart”.

This is not terribly surprising.  Most search engines have no idea who you are, or what you are going to type into them.  In the US last year, the top ten Google searches included “Boston marathon”, “government shutdown”, “VMAs”, “new pope” and “Mayweather vs Canelo”.  That’s a pretty diverse range of topics, and no search engine could be expected to have an intelligent understanding of each of them.

The problem is worsened quite substantially when numbers are introduced into a query. This is pretty much impossible to avoid with classical music, whether you’re looking for “beethoven piano concerto 5”, “schoenberg 5 pieces for orchestra” or “corelli’s 12 violin sonatas op 5”. Because a search engine doesn’t really understand the terms you type into it, it cannot deduce from the context what sort of number you’re looking for. Instead, it just searches for matching numbers within its database.

The result is that a search for “beethoven symphony no 5” might yield an album of (say) Beethoven’s Symphony No. 7 coupled with the Piano Concerto No. 5. From the search engine’s perspective, that’s not a bad match. From yours, it could definitely be bettered.
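The coupling problem is easy to reproduce. The short Python sketch below uses two made-up album titles (they are assumptions for illustration, not real catalogue records) and checks which query words turn up in each.

```python
import re

def tokens(text):
    """Return the set of lower-cased words and numbers in the text."""
    return set(re.findall(r"\w+", text.lower()))

def matched(query, title):
    """Return the query words that also occur somewhere in the title."""
    return tokens(query) & tokens(title)

query = "beethoven symphony no 5"

# Hypothetical album titles, for illustration only.
wanted = "Beethoven: Symphony No. 5"
coupling = "Beethoven: Symphony No. 7 / Piano Concerto No. 5"

# Compare how many query words each title contains.
print(len(matched(query, wanted)), len(matched(query, coupling)))
```

Both titles contain every word in the query, so a pure word matcher scores them identically. Nothing in the matching tells it that the “5” in the second title belongs to the piano concerto rather than the symphony.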

This inability to intelligently comprehend what you are looking for is why traditional search engines don’t really work for classical. Completely unambiguous search terms such as the following yield completely different sets of results, even on the two biggest online music retailers:

Search Results

Good luck guessing which of those results sets has the particular recording you’re looking for in it!

Solving this problem has been one of Lelio’s key aims over the last 18 months.  Our search engine was built with classical music in mind from the very beginning.  We were able to do this because, unlike a traditional search engine, we know what type of information our users are likely to be looking for.  It is then just a matter of programming the search engine to intelligently interpret queries about that information.

So for instance, when you type “beethoven symphony 3” into the Lelio search box, it understands that Beethoven is a composer, a symphony is a musical form, and 3 (in this context) probably refers to a particular instance of that musical form. If you typed in “beethoven symphony 55”, however, it knows that there is no 55th symphony and instead checks to see whether there is a matching symphony under Op. 55 – which, in this case, there is.
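For the curious, the logic described here might be sketched in Python roughly as follows. This is not Lelio’s actual code: the `interpret` function, the tiny vocabulary and the single-composer catalogue are all assumptions made for the sake of a short example. Only the opus numbers of Beethoven’s nine symphonies are real.

```python
# Illustrative data only: Beethoven's symphonies (number -> opus number).
BEETHOVEN_SYMPHONY_OPUS = {
    1: 21, 2: 36, 3: 55, 4: 60, 5: 67, 6: 68, 7: 92, 8: 93, 9: 125,
}

# A deliberately tiny vocabulary for this sketch.
COMPOSERS = {"beethoven"}
FORMS = {"symphony"}

def interpret(query):
    """Label each query word as composer, form or number, then resolve the number."""
    composer = form = number = None
    for word in query.lower().split():
        if word in COMPOSERS:
            composer = word
        elif word in FORMS:
            form = word
        elif word.isdigit():
            number = int(word)
    if composer is None or form is None or number is None:
        return None  # outside the scope of this toy catalogue

    # First reading: the number identifies the symphony itself.
    if number in BEETHOVEN_SYMPHONY_OPUS:
        return f"Symphony No. {number}, Op. {BEETHOVEN_SYMPHONY_OPUS[number]}"

    # Fallback reading: there is no such symphony, so try it as an opus number.
    for ordinal, opus in BEETHOVEN_SYMPHONY_OPUS.items():
        if opus == number:
            return f"Symphony No. {ordinal}, Op. {opus}"
    return None

print(interpret("beethoven symphony 3"))
print(interpret("beethoven symphony 55"))
```

In this sketch both queries resolve to the same work, because the second one falls through to the opus lookup once the engine has established that there is no Symphony No. 55.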

Key to the efficacy of this model is ensuring that every term in our database is unique.  It’s no good having different records for “Tchaikovsky”, “Tschaikowski” and “Tchaik” (as his mates call him), when they are all synonyms for the same composer.  Likewise, Lelio will accept any of the following variants of the word “symphony”:

  • symphony
  • sinfonie
  • symphonie
  • sinfonia
  • sin
  • sinf
  • sym
  • symph

As far as Lelio’s search is concerned, all of these terms mean the same thing (and in the next release, it will also recognise misspelled variants, just in case you have chubby fingers like me). This is what allows us to return such a consistent set of results.
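One simple way to get this behaviour is a lookup table that maps every variant to a single canonical term before the search runs. The Python sketch below uses the variants listed above and the Tchaikovsky spellings mentioned earlier. The table layout and the `normalise` function are assumptions for illustration, not a description of how Lelio stores its data.

```python
# Variants of "symphony" taken from the list above.
FORM_SYNONYMS = {
    "symphony": ["symphony", "sinfonie", "symphonie", "sinfonia",
                 "sin", "sinf", "sym", "symph"],
}

# Composer spellings mentioned earlier in the post.
COMPOSER_SYNONYMS = {
    "tchaikovsky": ["tchaikovsky", "tschaikowski", "tchaik"],
}

def build_lookup(*synonym_tables):
    """Flatten the tables into one variant -> canonical term dictionary."""
    lookup = {}
    for table in synonym_tables:
        for canonical, variants in table.items():
            for variant in variants:
                lookup[variant] = canonical
    return lookup

LOOKUP = build_lookup(FORM_SYNONYMS, COMPOSER_SYNONYMS)

def normalise(query):
    """Replace every known variant in the query with its canonical term."""
    return [LOOKUP.get(word, word) for word in query.lower().split()]

print(normalise("Tschaikowski sinfonie 6"))
print(normalise("tchaik sym 6"))
```

Once both queries have been reduced to the same canonical terms, they cannot help but return the same results, however the user chose to spell them.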

Why is this consistency so important, you ask? Remember how the other week I blogged about Lelio’s mission to revive classical music sales? Well, it turns out that in the world of sales, positioning is everything. Just look at this graph showing the click-through rate of Google’s search results:

Optify analysis of Google top 20 search results click-through rate

What this graph demonstrates is that someone is almost three times as likely to click on the first search result they are offered as on the second, and over 16 times as likely as on the tenth. If someone wants to buy something, it pays to ensure you’ve positioned it smack in the middle of their sight line.

With Lelio, we take things one step further. If our search engine considers your search to be wholly unambiguous, we don’t even display a list of search results: we just take you straight to the item you’re looking for.  At the moment, this is limited to composers or works, but artists and recordings will be coming in the next release.  The idea is to save you valuable time which can be better spent deciding whether you want the Furtwängler recording or the Klemperer.  (Top tip: get both!)
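As a rough Python sketch of that decision, assume that “wholly unambiguous” means the search produced exactly one match, and that each match is a `(kind, name)` pair. Both are assumptions made for this example, as is the `respond` function itself.

```python
# Kinds of item that currently get the direct treatment, per the text above.
DIRECT_KINDS = {"composer", "work"}

def respond(matches):
    """Decide whether to jump straight to an item or show a results list.

    `matches` is a list of (kind, name) pairs; this shape is hypothetical.
    """
    if len(matches) == 1 and matches[0][0] in DIRECT_KINDS:
        return ("redirect", matches[0])
    return ("list", matches)

print(respond([("work", "Beethoven: Symphony No. 3")]))
print(respond([("work", "Beethoven: Symphony No. 3"),
               ("work", "Beethoven: Piano Concerto No. 3")]))
```

A single matching work triggers a redirect, while two candidates (or a single match of a kind not yet supported, such as a recording) fall back to the ordinary results list.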

At any rate, we think this technology works really well and removes a big degree of randomness from the process of searching for classical music.  As I said in my last post, however, we know our prototype search engine is extremely limited – but we hope you enjoy playing around with it anyway and seeing what it could be capable of in the future.  If you have any feedback, we’d love to hear it – and please do share this blog using the buttons at the top of the post if you like what we’re doing.
