Category Archives: Search Engines

Coming soon to a classical search engine near you. . .

I thought I’d just write a quick note to tell you what we have planned for the site in the next couple of weeks.

Our development team is currently working on the latest version of our prototype website, which represents a significant upgrade to the pretty limited version of the site currently at leliomusic.com.  The new version will look superficially similar, but much has gone on under the hood.  Here’s what you can expect.

  • A significant increase in the number of recordings available.  We currently list about 150 recordings.  This will increase to around 2,000 in the next release.  This is still a very long way away from being comprehensive – and you may notice that the actual selection is a little eccentric! – but it’s a step in the right direction.  More to the point, we will be able to add more albums on a regular basis.
  • Search for a much wider range of compositions.  The current prototype only lets you search for symphonies; the new release will open searches up to all musical forms, with the exception of songs and opera (which have some slight technical challenges involved).
  • iTunes links: We will now feature buy links to both Amazon and iTunes!  If you use them, we will actually make a very small amount of money – imagine that!
  • Support for misspellings: At the moment, if you type Mozarf instead of Mozart, you get no results back.  In the new version, misspelled words will automatically take you to the correct page (with an option to go back to a general search results page if we got it wrong).
  • Artist imagery (we hope!): At the moment the site only features images of composers.  We are hoping to get hold of a stack of artist publicity photos to use on the site.  It won’t be entirely comprehensive, but should go some way towards making the site more visually appealing.
  • Better album titles: Currently, album titles are being pulled from Amazon’s data, leading to situations such as a listing for Beethoven’s Symphony No. 3, “Choral”.  The new site will replace these titles with automatically generated names based on the works featured and our own data.  This should be much cleaner and more accurate.  (You would not believe the amount of SQL code sitting behind this seemingly simple task, though!)
  • Lots of minor look-and-feel tweaks: Stuff you won’t even notice, but which will make the site look just a little more polished.
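For the curious, here is a rough idea of how the misspelling support in the list above could work.  This is only a sketch using Python's standard `difflib` module, not our actual code: the composer list, the `resolve_composer` name and the 0.8 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical list of known composer names (illustration only).
KNOWN_COMPOSERS = ["mozart", "beethoven", "tchaikovsky", "schoenberg", "corelli"]


def resolve_composer(term, cutoff=0.8):
    """Return (composer, was_corrected) for a possibly misspelled term.

    Returns (None, False) when nothing is close enough, in which case a
    caller could fall back to a general search results page.
    """
    term = term.lower()
    if term in KNOWN_COMPOSERS:
        return term, False
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    matches = difflib.get_close_matches(term, KNOWN_COMPOSERS, n=1, cutoff=cutoff)
    if matches:
        return matches[0], True
    return None, False


corrected = resolve_composer("Mozarf")
```

The `was_corrected` flag is what would let a page offer the "go back to general search results" option whenever a correction was applied.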

We’re expecting to release all of this in early February, so stay tuned for more.  And, as always, if you like what we’re doing, please share this blog on Facebook or Twitter or really any social media outlet you like.

A classically trained search engine

Image source: http://www.flickr.com/photos/jazbeck/

No, Amazon, when I say “Beethoven’s 2nd” I am not looking for a kids film about a freaking St Bernard.

As most of us who have ever tried it will know, searching for classical music online is a hit-and-miss affair.  There’s a reason for this.  Most search engines decide which results are relevant to you by matching words in your search query with words that appear in a particular web page, document or record.  The more words matched within a given document, the more ‘relevant’ a search engine believes it to be and the higher up the list of search results it will go.

The problem with this type of search is that it is fundamentally ‘dumb’. The search engine has no built-in understanding of the words you use in your search query.  It doesn’t know that a “piano concerto” is a musical form, that “Mozart” is a composer, that “A major” is a key, and that the “A major piano concerto by Mozart” is not the same thing as “a major piano concerto by Mozart”.

This is not terribly surprising.  Most search engines have no idea who you are, or what you are going to type into them.  In the US last year, the top ten Google searches included “Boston marathon”, “government shutdown”, “VMAs”, “new pope” and “Mayweather vs Canelo”.  That’s a pretty diverse range of topics, and no search engine could be expected to have an intelligent understanding of each of them.

The problem is worsened quite substantially when numbers are introduced into a query. This is pretty much impossible to avoid with classical music, whether you’re looking for “beethoven piano concerto 5”, “schoenberg 5 pieces for orchestra” or “corelli’s 12 violin sonatas op 5”.  Because a search engine doesn’t really understand the terms you type into it, it cannot deduce from the context what sort of number you’re looking for.  Instead, it just searches for matching numbers within its database.

The result is that a search for “beethoven symphony no 5” might yield an album of (say) Beethoven’s Symphony No. 7 coupled with the Piano Concerto No. 5.  From the search engine’s perspective, that’s not a bad match.  From yours, it could definitely be bettered.
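To see why, here is a minimal sketch of the word-matching approach described above.  It is a deliberately naive scorer, not the ranking code of any real search engine, and the album titles and function names are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, title):
    """Count how many query tokens also appear somewhere in the title."""
    title_tokens = set(tokenize(title))
    return sum(1 for token in tokenize(query) if token in title_tokens)


# Hypothetical album titles (illustration only).
albums = [
    "Beethoven: Symphony No. 5",
    "Beethoven: Symphony No. 7; Piano Concerto No. 5",
    "Mozart: Piano Concerto No. 21",
]

query = "beethoven symphony no 5"
scores = {title: overlap_score(query, title) for title in albums}
```

Both Beethoven albums end up with the same score, because the scorer has no idea that the “5” on the second album belongs to the piano concerto rather than the symphony.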

This inability to intelligently comprehend what you are looking for is why traditional search engines don’t really work for classical.  Completely unambiguous search terms such as the following yield completely different sets of results, even on the two biggest online music retailers:

[Figure: Search Results]

Good luck guessing which of those results sets has the particular recording you’re looking for in it!

Solving this problem has been one of Lelio’s key aims over the last 18 months.  Our search engine was built with classical music in mind from the very beginning.  We were able to do this because, unlike a traditional search engine, we know what type of information our users are likely to be looking for.  It is then just a matter of programming the search engine to intelligently interpret queries about that information.

So for instance, when you type “beethoven symphony 3” into the Lelio search box, it understands that Beethoven is a composer, a symphony is a musical form, and 3 (in this context) probably refers to a particular instance of that musical form.  If you typed in “beethoven symphony 55”, however, it knows that there is no 55th symphony and instead checks to see whether there is a matching symphony under Op. 55 – which, in this case, there is.
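In code, that reasoning might look something like the sketch below.  It is an illustration under simplifying assumptions rather than Lelio's real implementation: the `CATALOGUE` structure and the `interpret` function are hypothetical, and the only data included is the set of opus numbers for Beethoven's nine symphonies.

```python
# Hypothetical catalogue: (composer, form) -> {work number: opus number}.
CATALOGUE = {
    ("beethoven", "symphony"): {
        1: 21, 2: 36, 3: 55, 4: 60, 5: 67, 6: 68, 7: 92, 8: 93, 9: 125,
    },
}
COMPOSERS = {composer for composer, _ in CATALOGUE}
FORMS = {form for _, form in CATALOGUE}


def interpret(query):
    """Classify each token, then resolve the number by work number or opus."""
    tokens = query.lower().split()
    composer = next((t for t in tokens if t in COMPOSERS), None)
    form = next((t for t in tokens if t in FORMS), None)
    number = next((int(t) for t in tokens if t.isdigit()), None)

    works = CATALOGUE.get((composer, form), {})
    if number in works:
        # The number names a work directly, e.g. Symphony No. 3.
        return {"composer": composer, "form": form,
                "number": number, "matched_by": "number"}
    for work_number, opus in works.items():
        if opus == number:
            # No such work number, but an opus number matches.
            return {"composer": composer, "form": form,
                    "number": work_number, "matched_by": "opus"}
    return None


direct = interpret("beethoven symphony 3")
via_opus = interpret("beethoven symphony 55")
```

Both queries resolve to the Third Symphony here; the second gets there through the opus number because there is no Symphony No. 55 in the catalogue.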

Key to the efficacy of this model is ensuring that every term in our database is unique.  It’s no good having different records for “Tchaikovsky”, “Tschaikowski” and “Tchaik” (as his mates call him), when they are all synonyms for the same composer.  Likewise, Lelio will accept any of the following variants of the word “symphony”:

  • symphony
  • sinfonie
  • symphonie
  • sinfonia
  • sin
  • sinf
  • sym
  • symph

As far as Lelio’s search is concerned, all of these terms mean the same thing (and in the next release, it will also recognise misspelled variants, just in case you have chubby fingers like me).  This is what allows us to return such a consistent set of results.
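One simple way to picture this is a lookup table that maps every accepted variant to a single canonical term.  The sketch below assumes exactly that; the variants come from the examples in this post, while the `SYNONYMS` table and `normalise` function are hypothetical names rather than anything from our codebase.

```python
# Canonical term -> accepted variants (taken from the examples in this post).
SYNONYMS = {
    "symphony": ["symphony", "sinfonie", "symphonie", "sinfonia",
                 "sin", "sinf", "sym", "symph"],
    "tchaikovsky": ["tchaikovsky", "tschaikowski", "tchaik"],
}

# Invert the table so each variant points at its one canonical term.
CANONICAL = {
    variant: canonical
    for canonical, variants in SYNONYMS.items()
    for variant in variants
}


def normalise(query):
    """Replace every known variant in the query with its canonical term."""
    return [CANONICAL.get(token, token) for token in query.lower().split()]


tokens = normalise("Tchaik sinf 6")
```

Because every variant collapses to the same term before any matching happens, “Tchaik sinf 6” and “Tschaikowski Symphonie 6” are treated as the same query.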

Why is this consistency so important, you ask?  Remember how the other week I blogged about Lelio’s mission to revive classical music sales?  Well, it turns out that in the world of sales, positioning is everything.  Just look at this graph showing the clickthrough rate of Google’s search results:

Optify analysis of Google top 20 search results click-through rate

What this graph demonstrates is that someone is almost three times as likely to click on the first search result they get offered as on the second – and over 16 times more likely than on the tenth.  If someone wants to buy something, it pays to ensure you’ve positioned it smack in the middle of their sight line.

With Lelio, we take things one step further. If our search engine considers your search to be wholly unambiguous, we don’t even display a list of search results: we just take you straight to the item you’re looking for.  At the moment, this is limited to composers or works, but artists and recordings will be coming in the next release.  The idea is to save you valuable time which can be better spent deciding whether you want the Furtwängler recording or the Klemperer.  (Top tip: get both!)
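The rule itself is simple enough to sketch in a few lines.  This is an illustration only: the `route` function and the shape of its return value are assumptions, and it treats "unambiguous" as meaning exactly one candidate, which is a simplification.

```python
def route(candidates):
    """Go straight to the item if there is exactly one candidate,
    otherwise fall back to showing a list of results."""
    if len(candidates) == 1:
        return {"action": "redirect", "target": candidates[0]}
    return {"action": "list", "results": list(candidates)}


single = route(["Beethoven: Symphony No. 3"])
several = route(["Beethoven: Symphony No. 3", "Beethoven: Symphony No. 5"])
```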

At any rate, we think this technology works really well and removes a big degree of randomness from the process of searching for classical music.  As I said in my last post, however, we know our prototype search engine is extremely limited – but we hope you enjoy playing around with it anyway and seeing what it could be capable of in the future.  If you have any feedback, we’d love to hear it – and please do share this blog using the buttons at the top of the post if you like what we’re doing.

Lost in translation

The former president of EMI Classics once told me that classical music had a natural advantage over other forms of music: since a lot of it is instrumental, it travels extremely well.  There are no language barriers.  Beethoven’s Fifth Symphony is just as comprehensible in São Paulo or Kyoto or Reykjavik as it is in Vienna, where it was composed.

Putting aside obvious exceptions such as opera for a moment, I tend to agree with this idea.  Classical music is the most abstract of all the arts.  Literature, fine art, theatre, dance and film all started out as essentially representational art forms.  Music has always appealed to some other, less literal part of our brains.

So the task of having to describe it in literal terms is somewhat antithetical to its nature.  For most of classical music’s existence, composers have rebelled against the notion of ascribing concrete meaning to their music.  That’s the reason a work as elementally powerful as Beethoven’s Fifth is known by its form, Symphony, and its number, 5, rather than by some reductive nickname such as “Fate” or “On Deafness” or “The Napoleonic Wars Really Suck”.

However, that essentially abstract nature doesn’t help us much when we’re trying to find music, because the way we communicate – with search engines, just as with one another – is through the medium of verbal language.  With search engines we don’t even have the advantage of being able to ask them what the name of that piece that goes “da-da-da-daaa” is – at least not yet!

So we’re stuck with written language, and that’s where things start to go awry.  The other day, a friend told me I should check out Jos van Immerseel’s recording of Rimsky-Korsakov’s Scheherazade.  So this morning on my way home from the gym, I thought I’d try and find it on Spotify and give it a listen.

I was looking for it on my iPhone, which gives me the option of searching Tracks, Albums or Artists.  As I wanted to listen to the whole album, I searched under Albums.  No results.  What a shame, thought I.  But then it occurred to me that perhaps I should look under Tracks.  Still no results, but this time a suggestion: “Did you mean immerseel sheherazade?”

I had to look closely at my screen to see how that differed from what I’d asked for.  When I tapped the suggestion it brought up the following set of tracks:

  • Shéhérazade, Op. 35: La mer et…
  • Shéhérazade, Op. 35: Le réci…
  • Shéhérazade, Op. 35: Le jeune…
  • Shéhérazade, Op. 35: La fête à…

etc.

It was clearly a French release of the album that had been left untranslated for an English-speaking audience.  The problem with this is that Shéhérazade is, to the English-speaking world, a completely different piece by Ravel, a set of orchestral songs not to be confused with the symphonic suite by Rimsky-Korsakov.

It didn’t help that Spotify’s iPhone app had also handily removed Jos van Immerseel’s name from the metadata.  Instead, underneath each track was the mysterious term “Anima Eterna”.  In other words, if I hadn’t known the opus number of Rimsky-Korsakov’s piece and the fact that Jos van Immerseel’s orchestra is named Anima Eterna, I’d have had absolutely no idea that I’d actually found what I was looking for.

This is a classic example of how badly classical music suffers from shoddy metadata.  In Lelio’s master database, we have a way around this: every piece of data is multilingual.  We don’t have Scheherazade in there yet, but try searching for something like “Die Geschöpfe des Prometheus”.  You should get The Creatures of Prometheus Overture by Beethoven.  Once we turn on multilingual support, German speakers will be able to do this in reverse.
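As a rough illustration of the idea, each work can carry its title in several languages, with a single index that maps any of those titles back to the same record.  The sketch below assumes that layout; the `WORKS` structure, the `beethoven-op43` identifier and the `lookup` function are hypothetical and do not reflect our actual schema.

```python
# Hypothetical multilingual record: work id -> {language code: title}.
WORKS = {
    "beethoven-op43": {
        "en": "The Creatures of Prometheus",
        "de": "Die Geschöpfe des Prometheus",
    },
}

# Every title in every language points back to the same work id.
TITLE_INDEX = {
    title.casefold(): work_id
    for work_id, titles in WORKS.items()
    for title in titles.values()
}


def lookup(query, display_lang="en"):
    """Find a work by its title in any language; show it in display_lang."""
    work_id = TITLE_INDEX.get(query.casefold())
    if work_id is None:
        return None
    return WORKS[work_id][display_lang]


english = lookup("Die Geschöpfe des Prometheus")
german = lookup("the creatures of prometheus", display_lang="de")
```

Searching in German returns the English title and vice versa, which is the "in reverse" behaviour mentioned above.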

In the meantime, do check out Immerseel’s Scheherazade or Shéhérazade or, as Rimsky himself would have known it, Шехерезада.  It’s worth waging battle with a search engine for.