Brian Whitman @ variogr.am

Why music ID resolution matters to every music fan on Facebook

Facebook’s music announcement a couple of weeks ago was a huge land grab, an audacious move to get itself ensconced as the nexus of that music platform I’ve been talking about. On paper and on stage the service looked game changing: all your music players and services in one place, neatly collected with your friends to help you navigate the massive world of music. My engineering team at The Echo Nest and I have probably spent as much time thinking about that massive world of music as anyone on earth, so I thought I’d try putting it through its paces.

Facebook’s recently-launched music service now shows every music fan why the crazy and complicated world of music ID resolution matters to all of us. The more social our music activity becomes, the more music data becomes relevant to music fans every day.

Though Facebook music has only been live for a couple of weeks, Facebook is clearly struggling with some well-worn challenges in music ID resolution — problems I’ve been dealing with for many years now. Below are some examples highlighting the promise of Facebook music and some common music ID resolution problems they’ll need to fix to really deliver on the promise…

The holy grail — ‘universal song ID’

The only way social music features can work is if the song you want to hear actually plays.

If you’re listening to a song in Spotify, it will broadcast to Facebook and all your friends see what you are listening to in real time[0]. If they want to play along, your friends click a simple play button to hear it themselves. No Spotify? No problem: Facebook launched with a huge array of music content partners (with some conspicuous elephants missing) and, if that song is available in your choice of player, it will play with no “friction.” You’ll see something nice like:

Clicking it opens the song in your selected player (in this case Rdio, which just needs to be open in your browser). This is inconceivably great for music consumers: there are a lot of great competing services, all with unique features and cost structures, and more choice is always good. I may want to use MOG because it has excellent Echo Nest-powered discovery features. Or Rdio or Spotify because they allow 3rd party mobile apps. Or Slacker or iHeartRadio for a radio experience.

What song is this?

This would all be music-world-changing stuff, if it worked. When I first played with Facebook Music, I tried a simple example. I put on the most terrible popular song I could think of in my Spotify player:

And then I asked Facebook to “Play in Rdio.” I heard something sort of like it but not exactly:

Here Facebook has decided that Rdio’s version of “Your Body is a Wonderland” is a sounds-like version from “The Hit Crew.” I cannot think of a worse fate: hearing something worse than John Mayer when you click on a link that says John Mayer. (Consider clicking on a Google search result for your dentist’s office phone number and getting your ex-girlfriend instead.)

At The Echo Nest, we know “The Hit Crew” all too well; they crank out “soundslike” versions of tunes. This is a great example of a basic music resolving problem: every song in any reasonably sized catalog has dozens of karaoke versions, covers, instrumentals, yoga mixes, etc. For Facebook to resolve a top 20 single to its sounds-like version is pretty ugly. What’s going on?
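One way to avoid that kind of mismatch can be sketched in a few lines of Python: require the artist to match as well as the title, skip candidates that look derivative, and return nothing rather than a sounds-like. The marker list, the track dictionaries and their IDs are all hypothetical, and this is not a description of how Facebook or any partner actually resolves songs.

```python
import re

# Hypothetical phrases that often flag non-original recordings.
# A real catalog would need a far richer signal than a word list.
DERIVATIVE_MARKERS = re.compile(
    r"\b(karaoke|tribute|instrumental|made famous by|in the style of)\b",
    re.IGNORECASE,
)


def normalize(text):
    """Lowercase, drop punctuation, collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def resolve_track(artist, title, candidates):
    """Return the first candidate matching BOTH artist and title, or None.

    Candidates whose artist or title carries a derivative marker are skipped.
    Returning None is deliberate: no link is better than the wrong song.
    """
    want_artist, want_title = normalize(artist), normalize(title)
    for cand in candidates:
        if DERIVATIVE_MARKERS.search(cand["title"]) or DERIVATIVE_MARKERS.search(cand["artist"]):
            continue
        if normalize(cand["artist"]) == want_artist and normalize(cand["title"]) == want_title:
            return cand
    return None


# Made-up catalog entries for illustration.
candidates = [
    {"id": "rdio-1", "artist": "The Hit Crew", "title": "Your Body Is a Wonderland"},
    {"id": "rdio-2", "artist": "John Mayer", "title": "Your Body Is A Wonderland"},
]
match = resolve_track("John Mayer", "Your Body is a Wonderland", candidates)
```

Matching on title alone would happily return the first entry; insisting on the artist is what keeps The Hit Crew out.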

Moving on, that Glasser song up top? Later that day I clicked on another friend listening to the same song, but it was different this time:

Wait a minute, what’s the difference? I obviously know what’s going on — the Spotify “Apply” is from a compilation while the Rdio version is on the full length release. If you click on the release version on Spotify, you can resolve the song to any other service for your friends and get recommendations, or participate in the social listening experience. But if you click on the compilation version (which is the default when typing Glasser Apply in the Spotify search box), you get nothing. The result: the song you hear might as well have been something you recorded in your basement last night, even though Rdio has what you were hoping for:

This is another common ID resolution problem. Facebook likely isn’t working from a canonical database of songs or artists, but rather from loose references to them in their own data and their partners’ data. And the glue linking those ID structures together is brittle, making for risky connections and some strange user experiences when translating across services.
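The compilation-versus-album case can be illustrated with a small Python sketch that keys a recording on artist and title and ignores the release entirely, using duration as a sanity check. The field names, the three-second tolerance, the release labels and the durations are assumptions made up for the example.

```python
import re


def norm(text):
    """Lowercase, drop punctuation, collapse whitespace."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", text.lower())).strip()


def recording_key(track):
    """Key a track by artist and title only; the release is deliberately ignored."""
    return (norm(track["artist"]), norm(track["title"]))


def find_equivalent(track, catalog, tolerance_s=3):
    """Return a catalog track that looks like the same recording on any release."""
    key = recording_key(track)
    for other in catalog:
        close_enough = abs(other["duration_s"] - track["duration_s"]) <= tolerance_s
        if recording_key(other) == key and close_enough:
            return other
    return None


# Hypothetical metadata: the same song reached through two different releases.
spotify_track = {"artist": "Glasser", "title": "Apply",
                 "release": "compilation", "duration_s": 270}
rdio_catalog = [
    {"artist": "Glasser", "title": "Apply",
     "release": "full-length", "duration_s": 271},
]
equivalent = find_equivalent(spotify_track, rdio_catalog)
```

A join on (artist, title, release) would miss this pair, which is roughly the behavior described above.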

Why it matters

Accurate resolving is the necessary backbone of social music.

These examples show that ID resolution isn’t just the plumbing underneath a social music experience — it is the foundation of any good music service that allows sharing. If songs don’t play when they should or link to the wrong song, people can’t talk about them.

Even imagining a world in which there is only Spotify in Facebook, consider the following realities: any real music fan will want to connect with other types of services: radio players like iHeartRadio, video services like VEVO, reviews, blogs, biographies, artist photos, games, publishing platforms like Soundcloud, and so on.

We obsess about this problem. I’m guessing Facebook is obsessing about it now, because it introduces friction millions (or billions) of times per day into what Facebook wants to be a “frictionless” experience. The more social your music activity, the more you’ll agree that any decent social music service needs to know that two slightly differently spelled artists may be the same artist. Or that the radio edit of a song can be played in place of the single version.
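Both of those needs can be sketched with nothing but the Python standard library: fold accents and punctuation before comparing artist names, allow a little spelling slack with `difflib.SequenceMatcher`, and strip version suffixes from titles. The 0.9 threshold and the suffix list are arbitrary assumptions, not tuned values from any real system.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical list of suffixes that mark an alternate version of the same song.
VERSION_SUFFIX = re.compile(
    r"\s*(?:\(|\[|-\s)\s*(?:radio edit|single version|album version)\s*(?:\)|\])?\s*$",
    re.IGNORECASE,
)


def fold(text):
    """Strip accents, lowercase, and drop everything but letters, digits, spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9 ]", "", no_accents.lower()).strip()


def same_artist(a, b, threshold=0.9):
    """True when two artist names are equal after folding, or nearly so."""
    return SequenceMatcher(None, fold(a), fold(b)).ratio() >= threshold


def base_title(title):
    """Remove a trailing version marker such as '(Radio Edit)'."""
    return VERSION_SUFFIX.sub("", title).strip()
```

Fuzzy matching like this is only safe for spelling variants. It should never be the thing that decides whether two credits are the same artist, which is why real systems lean on curated alias data as well.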

As implemented, the v.1 Facebook music experience is like comparing snowflakes with a ruler. Right now it impacts the user experience, but the effects could definitely get worse as more users and more services join the fray.

For me, clicking around Facebook Music these days is tough. It’s rare that anything I’d want to hear gets resolved properly:

Of course, The Echo Nest knows the song exists in Rdio, as does Rdio’s API. Why doesn’t Facebook?

Artist pages and context

As you can probably tell, song resolving is bothering me enough; but the Facebook music application I was most excited about was the addition of content to its massive database of context. Facebook users or page owners can tag artist names in their wall posts and events and Facebook will helpfully make that artist playable if it knows about it:

Here, on a page for the venue Brighton Music Hall, Facebook made a hover link for the band Tapes ‘n Tapes and made it playable by choosing a random music service that can play songs by that artist for you. This is very similar to Google’s old “Music OneBox,” which aggregated MySpace, iLike, LaLa and a lot of other websites you don’t use anymore. Great for listeners, (maybe) great for services, great for the bands. But here’s another area where ID resolution problems make the user experience fall down.

It doesn’t take long to find bands that Facebook simply doesn’t “know” about, which is fascinating given the breadth and depth of their user entered and maintained “community page” and fan page structures. For example, one post down on the Brighton Music Hall page is a note about the great Dirty Beaches (Echo Nest JSON):

Where is the player? Spotify has a lot of Dirty Beaches, as do many of the services I tried. It appears that any relatively recent or independent band[1] simply does not get the player, no matter what services can support playback. This is very sad for the musicians and for listeners like me.

It’s clear and not particularly surprising that Facebook has trouble determining the identity of musicians on its own site— even those that have well groomed artist pages supplied by management with download widgets and tour details.

One of the Echo Nest rites of passage is for an engineer to automatically uncover details about the classic rock group “The Band” from the internet. Our recent push to know all we can about Facebook artists and their pages was a two-month, three-engineer effort that uncovered a cascading series of Facebook pitfalls, among which “The Band” was actually an easy one. One Echo Nest employee wrote me a surreal late night email asking me to make sure he was still sane, as the various Facebook data gathering APIs appeared to be non-deterministic: successive calls would return completely different results[2]. I was grimly delighted to see that Facebook’s own engineers faced the same problems we did. For some reason, it looks like we do a much better job at resolving Facebook page IDs. Again, this could definitely change as the Facebook service matures: they got a lot done in a short amount of time and Facebook music just launched. Here’s our Facebook data about “The Band,” returned from a simple search for “The Band,” which links to the very professional and official page:

Facebook seems to want to take you here, a twilight dead letter zone of people talking about “the band” in other contexts. It’s not a page or a community page and therefore does not let cross-service resolving or context work. It looks like an SEO trap and seems to have conned 2,503 confused people:

You can see this in their music app whenever you see a band name all alone with no other information at the top. They also have trouble with another favorite around here, The The[3], the post-punk 80s band that makes a perfect “Artist Resolving 101” pop quiz question:

These are not easy problems to solve. A huge class of Facebook artist resolving issues seems to come down to “merges”: artists that may be known by different names, aliases, nicknames, side projects or foreign language names. We maintain a huge database of musical aliases (“Led Zep,” “BEP” etc.) as well as collaboration names and misspellings, since perfect resolving against text and search is something we work very hard on. But Facebook doesn’t like seeing “Tom Jobim” because they only know him as “Antonio Carlos Jobim”:

For example, any music service worth its salt has spent countless hours debating whether to assign Tom Petty and Tom Petty and the Heartbreakers the same database ID (the answer is no, by the way).
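The distinction between merging aliases and keeping related credits apart can be sketched as a tiny alias index in Python. The artist IDs below are invented for illustration and are not Echo Nest, Facebook or any other service's identifiers.

```python
# Hypothetical canonical IDs, each with the names that should resolve to it.
ARTISTS = {
    "artist:led-zeppelin": ["Led Zeppelin", "Led Zep"],
    "artist:black-eyed-peas": ["The Black Eyed Peas", "BEP"],
    "artist:antonio-carlos-jobim": ["Antonio Carlos Jobim", "Tom Jobim"],
    # Related, but deliberately NOT merged: two separate IDs.
    "artist:tom-petty": ["Tom Petty"],
    "artist:tom-petty-and-the-heartbreakers": [
        "Tom Petty and the Heartbreakers",
        "Tom Petty & The Heartbreakers",
    ],
}


def build_alias_index(artists):
    """Map every known name (case-insensitively) to exactly one canonical ID."""
    index = {}
    for artist_id, names in artists.items():
        for name in names:
            key = name.casefold()
            if index.get(key, artist_id) != artist_id:
                raise ValueError(f"alias {name!r} maps to two different artists")
            index[key] = artist_id
    return index


def resolve_artist(name, index):
    """Return the canonical ID for a name, or None if the name is unknown."""
    return index.get(name.strip().casefold())


index = build_alias_index(ARTISTS)
```

With this in place, “Tom Jobim” and “Antonio Carlos Jobim” land on the same ID, while the two Petty credits stay separate.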

But when I listen to “Free Fallin’” in Spotify (Petty solo), Spotify gets it right, yet I am not allowed to hear it in Rdio because it doesn’t match up to their (incorrect) assignment of the song to the Heartbreakers:

And this is “Free Fallin’”, a song that is taught to sixth graders and I am pretty sure is the state anthem of California. Same goes for other popular artists who’ve performed with and without named backing groups:

A real world example of this[4]: a particular dear, sensitive friend in London was having a late night Jason Molina bender broadcast on his Facebook feed. This is an inspiring use of social music and where Facebook will eventually shine: seeing what my friend is listening to, I can listen along and maybe tell him to put down the bottle of gin and go to sleep. However, I can’t hear it:

Why? Rdio had tons of Molina and of course that particular song. It’s because Facebook didn’t do a good job of resolving[5]: to them, Jason Molina is “Songs: Ohia,” the name of one of his side projects. And of course neither Rdio nor any other service has a song called “Get Out Get Out Get Out” by Songs: Ohia, because it doesn’t exist. This is a Jason solo song. Here’s how Facebook got so confused:

Facebook seems to rely strongly on Wikipedia for much of its artist data (their “Community Pages” are CC licensed WP copies), and Wikipedia’s editors auto-redirect Songs: Ohia to Jason. So somewhere in the depths of Facebook’s graph database is a pointer that goes the other way. This pretty much invalidates Jason’s chances of ever getting social music love on Facebook. I doubt it’ll get fixed any time soon. But maybe this one will:

That’s Selena Gomez. She’s not Facebook Music compatible, for a different reason: she has a double life. Selena is listed in Facebook as an “Actor/Director,” not a “Musician/Artist.” You can’t click through from a friend’s listen to a music service, and you can see her page from any stream activity. Can you guess why? To Facebook, she doesn’t make music. This affects a lot of edge-straddling pop stars, with some notable exceptions. I noticed that “Glee Cast” was manually fixed, as was Kraftwerk (who were a “Local Business” until a month ago[6]), but comedians and musicals are still denied access to the social music party.
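The difference between gating on a page category and gating on what the catalogs actually contain can be sketched in Python as follows. The category string comes from the example above; the function names, the page dictionary and the track counts are all made up.

```python
MUSIC_CATEGORY = "Musician/Artist"


def playable_by_category(page):
    """The behavior that seems to be in place: only music pages get a player."""
    return page["category"] == MUSIC_CATEGORY


def playable_by_catalog(page, track_counts):
    """Alternative: a page is playable if any service has tracks for that name.

    The page category is ignored on purpose.
    """
    counts = track_counts.get(page["name"], {})
    return any(n > 0 for n in counts.values())


# Hypothetical page record and per-service track counts.
page = {"name": "Selena Gomez", "category": "Actor/Director"}
track_counts = {"Selena Gomez": {"spotify": 40, "rdio": 35}}
```

Under the first rule the page gets no player; under the second it does, with no manual category fix needed.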

What could they do and what should happen

Quite a lot of Facebook’s resolving issues could be fixed by ingesting catalog in a “musical” way, not just treating strings such as artist names or song titles as database IDs, as they seem to be doing. There are some pretty well-known approaches Facebook could take to fix these problems. They can use audio fingerprinting, for example (of course, I know of a couple, even an open source one). They can work on mapping artist names and song titles together a bit more intelligently, as we’ve been doing for a while: one of The Echo Nest’s main services is Project Rosetta Stone, an “ID space” resolution system that can quickly identify songs or artists in any platform: generate an eMusic ID from a Spotify URL, or a MOG ID from an audio fingerprint, or any combination possible. Facebook could have merged the data feeds in some slightly intelligent way to match songs across releases. Or some of their millions of users could edit or automatically de-reference the metadata[7]. It’s clear that none of this happened.
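The general shape of an “ID space” translation can be sketched as a crosswalk through one canonical ID. To be clear, this is not Rosetta Stone's API or data model; the class, the namespaces and every ID below are hypothetical and only illustrate the idea of going from one service's ID to another's.

```python
class Crosswalk:
    """Translate IDs between services by way of a single canonical ID."""

    def __init__(self):
        self._to_canonical = {}    # (namespace, foreign_id) -> canonical_id
        self._from_canonical = {}  # canonical_id -> {namespace: foreign_id}

    def link(self, canonical_id, namespace, foreign_id):
        """Record that a service-specific ID refers to the canonical song."""
        self._to_canonical[(namespace, foreign_id)] = canonical_id
        self._from_canonical.setdefault(canonical_id, {})[namespace] = foreign_id

    def translate(self, namespace, foreign_id, target_namespace):
        """Return the target service's ID for the same song, or None."""
        canonical_id = self._to_canonical.get((namespace, foreign_id))
        if canonical_id is None:
            return None
        return self._from_canonical[canonical_id].get(target_namespace)


# Made-up identifiers for one song known to three services.
xwalk = Crosswalk()
xwalk.link("song:0001", "spotify", "spotify:track:abc")
xwalk.link("song:0001", "rdio", "t12345")
xwalk.link("song:0001", "mog", "987")
rdio_id = xwalk.translate("spotify", "spotify:track:abc", "rdio")
```

The useful property is that adding a new service means linking it once to the canonical IDs, not building a pairwise mapping to every other service.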

Social music is the future of music. Facebook is pushing this future forward more than anyone. It’s clear that Facebook has some trouble ahead of them in the resolving space and I’m sure they are obsessing about it as much as I do. They are going to have to get down with the one big database of music scenario sooner rather than later. Facebook has more users and data than anyone and I’d love to see a concerted effort to build a true world of connected music via the new Open Graph. But there’s a lot more work to do before that promise is realized.

- brian@echonest.com / @bwhitman

Thanks to EVB and PBL and JL and MO for editing help

[0] Spotify actually had the nerve to call it “scrobbling,” which I doubt the inventor of the scrobble, Last.fm, is very happy about, as Facebook’s integration is a clear competitor.

[1] I can’t prove it yet, but I think this feature may be limited to artists appearing on an older dump of MusicBrainz, which is too bad as they only have 600,000 artists and are relatively slow to update.

[2] Try this (failing) query for The The. You’ll either get The Bible or the Simpsons as the top result, which I found very apropos. Maybe the engs pit memcache servers against each other, “I don’t like fighting either. Get here first” style.

[3] I’ve heard this is a Google interview question, and The The is still to this day one of the first bands I type into a new music service, after Blitter and Various Artists.

[4] As no one really needs help listening to Tom Petty; it is somewhat an inherent, instinctual property of the world that Tom Petty vibrates speakers.

[5] You can see Facebook’s resolved name if you click on a song title in Facebook Music after hovering over a music listen status.

[6] They surely have a fan in Facebook engineering.

[7] It’s likely this is happening to some extent. If I play a song in Rdio that Facebook did not previously “know” about, without Facebook’s help, then the next time I try it in Facebook it appears as an option. I’m not sure if this persists to other users.