Brian Whitman @ variogr.am

The audio fingerprinting at the Echo Nest FAQ

One of the most popular APIs at the Echo Nest is for our two open audio fingerprinting systems: ENMFP and Echoprint. Between the two, we currently process queries for over 40 million songs a week, almost 70 fingerprints a second. We announced ENMFP two years ago and Echoprint one year later, and it’s been going very fast since then. The world seemed to need an open music identification service and we were happy to provide it.

I co-developed both FP technologies (ENMFP using Tristan’s Echo Nest Analyze and Echoprint with Dan Ellis at Columbia) and often get the collected questions from our developers, customers and interested parties. There are a lot of good questions that we haven’t answered very well anywhere else on our website or developer docs, so I thought I would collect them all in one place in sort of a living document on my blog.

Echo Nest fingerprinting FAQs

Last updated: July 22 2012

Some definitions to help:

  • code generator (or often “codegen”): a piece of software that runs on a computer, mobile device or server that computes “codes” from an audio stream meant for sending to a fingerprint lookup server for a possible match.
  • query server: a stack that receives codes and searches through data to find a match
  • matching data: the set of resolvable song codes that a query server can match, given query codes. Matching data must be computed on the whole song as query codes can be from anywhere in the song.

What fingerprinting technologies does the Echo Nest provide?

We have two: ENMFP and Echoprint.

How much do they cost? Can I use them anywhere? Commercially? What about the data or the server?



Can they fix my (or my users’) metadata on files on a hard drive?

For ENMFP, absolutely. A matching database covering most music is available to you. You can quickly test this out on your computer’s music. First, download the ENMFP codegen and run this:

Then, run this python script or something like it to look up each code:
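A minimal sketch of such a lookup, not necessarily the script referred to here. It assumes the codegen prints a JSON list whose entries carry a "code" string and a "metadata" object with "filename" and "version" fields, and that song/identify accepts api_key, code and version as query parameters at the URL shown. The helper names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: song/identify lives at this URL and takes api_key, code and
# version as query parameters.
IDENTIFY_URL = "http://developer.echonest.com/api/v4/song/identify"


def build_identify_url(api_key, code, version):
    """Build the GET URL for looking up one code string."""
    params = urllib.parse.urlencode(
        {"api_key": api_key, "code": code, "version": version}
    )
    return IDENTIFY_URL + "?" + params


def codes_from_codegen(json_text):
    """Yield (filename, code, version) for each usable codegen entry.

    Assumption: entries without a "code" field are files the codegen
    could not process, so they are skipped.
    """
    for entry in json.loads(json_text):
        if "code" not in entry:
            continue
        meta = entry.get("metadata", {})
        yield meta.get("filename"), entry["code"], meta.get("version")


def identify(api_key, code, version):
    """Send one lookup and return the decoded JSON response."""
    url = build_identify_url(api_key, code, version)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

To use it, save the codegen output to a file, pass its contents to codes_from_codegen, and call identify once per code with your own API key.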

(Obviously, you’d want to thread this call out in real-world use.) There, you’ve built a scan and match service! Sell it for $50 each on the Mac App Store.

For Echoprint, yes, although the matching database is still small.

I have a ton of audio (I’m a streaming service or something). Can I just run my own Echoprint server and codegen and compute my own codes and never talk to you again?

Yes. In fact, a lot of very big companies are doing this.

How much of the original audio do I need to match? Do I have to start at the beginning?

For ENMFP, we suggest 50 codes worth of audio, which is usually between 15 and 20 seconds. For Echoprint, we suggest 20 seconds worth of audio. The audio can be from anywhere within the song.

Once I have audio, how fast can I identify a file?

ENMFP’s codegen computes in roughly 20x real time — for a 30s sample it will take 1.5 seconds, for example. Echoprint’s codegen is roughly 1000x real time.
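The arithmetic behind those figures is just the audio duration divided by the speed-up factor. A small sketch, using the rough factors quoted above (the function and constant names are only for illustration):

```python
# Rough speed-up factors quoted above; real numbers depend on the machine.
ENMFP_FACTOR = 20
ECHOPRINT_FACTOR = 1000


def codegen_seconds(audio_seconds, realtime_factor):
    """Estimate how long the codegen takes for a clip of the given length."""
    return audio_seconds / realtime_factor


enmfp_time = codegen_seconds(30, ENMFP_FACTOR)          # 1.5 seconds
echoprint_time = codegen_seconds(30, ECHOPRINT_FACTOR)  # about 0.03 seconds
```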

The server side varies with load and database size, but we aim to keep both within a 500ms response and support 50 queries a second per server. Those booting their own servers can easily shard or mirror multiple servers behind a load balancer.
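For a rough idea of how many query servers a given load implies, here is a back-of-the-envelope sketch. It assumes the 50 queries a second per server target holds and ignores headroom, redundancy and traffic peaks. The function names are hypothetical.

```python
import math

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def queries_per_second(queries_per_week):
    """Convert a weekly query count into an average per-second rate."""
    return queries_per_week / SECONDS_PER_WEEK


def servers_needed(qps, per_server_qps=50):
    """Minimum number of servers to sustain an average query rate."""
    return math.ceil(qps / per_server_qps)


# 40 million queries a week averages out to roughly 66 a second,
# which needs at least two servers at 50 queries a second each.
average_qps = queries_per_second(40_000_000)
minimum_servers = servers_needed(average_qps)
```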

Can I make it work like Shazam or SoundHound?

This usually means “will over-the-air queries work,” as in a bar or noisy place. For ENMFP, no. ENMFP is only supported on files or “clean audio.” For Echoprint, this is the intent, although as of right now we do not support that.

Echoprint is newer. Will ENMFP still be supported?

Yes. The only thing that may change is the name, as it’s getting annoying to talk about two separate fingerprints. Even once we have a real catalog backing Echoprint, ENMFP still has some good properties that are useful for a lot of our customers.

Are there any whitepapers or more information on how they work?

Very short ones: here’s Echoprint’s and here’s ENMFP’s.

Can I run ENMFP or Echoprint on a mobile device?

There are two ways to run a fingerprint on mobile: one is to compute the codes on the device and query the server; the other is to send a low-bandwidth audio stream to another server, which computes the codes and then sends them to the query server. Both ENMFP and Echoprint can run in the latter mode.

ENMFP is too compute-intensive to run the codegen on the device, and we do not provide a codegen that runs on any mobile platform.

Echoprint can easily compute codes on a device, and we provide an example Xcode project for iOS.

I tried song/identify and got no results!!

The main problems are almost always one of the following:

  1. You have Echoprint codes but are not specifying the version number (it has to be 4.0 or higher) in the call.
  2. You have ENMFP codes but are specifying a version number of 4.0 or higher.
  3. You are running the example codegen for either and your version of FFmpeg is not set up to decode MP3 files.
  4. You are using Echoprint and the match is not in our 200,000-song test database.
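The first two causes can be caught before sending a query. A minimal sketch, assuming the only rule is the 4.0 boundary described in the list above (Echoprint at 4.0 or higher, ENMFP below it). The function names are made up for illustration.

```python
def fingerprint_family(version):
    """Guess which system a version number belongs to.

    Assumption: versions of 4.0 and up mean Echoprint, anything lower
    means ENMFP.
    """
    return "echoprint" if float(version) >= 4.0 else "enmfp"


def version_mismatch(codegen_name, version):
    """Return True if the version does not fit the codegen that made the code."""
    return fingerprint_family(version) != codegen_name.lower()
```

For example, version_mismatch("Echoprint", 3.15) is True, which corresponds to the first cause in the list.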

What type of audio does fingerprinting work on?

ENMFP and Echoprint were originally intended to provide fingerprinting for musical audio, and that is primarily what they are used for (for some definition of “music”). However, several of our users have also reported success stories about using Echoprint on speech audio.

Can audio fingerprinting be used to detect cover versions or live versions of songs? What determines the same song?

No. ENMFP and Echoprint are designed to identify the “same” rendition or recording of a particular song. For the purposes of audio fingerprinting, cover versions and live versions are considered to be “different”, and hence queries will be identified as such.

A song is defined by being the same recording or having the same master recording. Both fingerprints will consider a remastered version the same song, will consider a radio edit the same song, and will consider different encodings (e.g. MP3 at different bitrates) the same song.

Where can I go for help?

If you are using either FP system, best to:

  1. The server stacks for ENMFP and Echoprint are very similar, and it’s very possible a dedicated developer could make one work like the other, but this is not supported.