The Echo Nest joins Spotify

We’re very excited to announce that The Echo Nest is joining Spotify, starting today! We can’t imagine a better partner for our next chapter. Spotify shares the intense care for the music experience that was the founding principle of our company, and it’s clearly winning the hearts and minds of music fans around the globe. Our dedicated team of engineers, scientists, music curators, business, and product people are utterly electrified with the potential of bringing our world-leading music data, discovery, and audience understanding technology directly to the biggest music streaming audience out there.

Together, we’re going to change how the world listens.

We started this company nine years ago in a kitchen at the MIT Media Lab, our dissertation defenses looming. We never wanted to do anything but fix how people were discovering music. None of the technologies in those days were capable of understanding music at scale. We were each working on separate approaches that, when combined, could really do that. All the while, we were watching the world of music change around us. We knew some version of Spotify was coming, and that the real power was in that beautiful moment when you found a new band or song to love. Every decision we’ve made since then, including today’s announcement, was made from that vantage point of care and often insane passion.

Starting a company is a bit crazy. You get the idea you can build a family from scratch and let them loose on the problem that drives you. We moved into an empty room in Somerville, MA in 2005, were soon joined by our CEO Jim Lucchese, and then grew a team of around 70 people, all through the power of communicating our one big idea. It’s hard to overstate how special this place is. With the team we have, we always have every expectation we can do whatever it takes in the service of music. We’ve written a lot of code, we’ve invented technology that will power the future of music for decades to come, we manage reams of data, and we work with everyone in the business. But the true power of this place stems from the people: an amazing family, fully dedicated to building the future of music.

We had such great help on the way. Tristan & Brian’s advisor at MIT and one of the fathers of computer music, Barry Vercoe, supported us through seed investment when we graduated, and when Jim joined, we brought on our dear friends Andre and Dorsey Gardner at Fringe Partners. As we grew, we tapped the great support of Elliot at Commonwealth, Antonio at Matrix and then Jeff at Norwest. And in between was the help and support from dozens of family and friends. We couldn’t have done it without them.

Obviously, moving from behind the curtain to the front stage comes with its own share of questions and challenges. We’ve been lucky enough to work with a wide range of creative companies and independent developers who showed the world what could be done with our technology. They helped us craft and refine our product to where it is today. We look forward to working with partners to embrace the new opportunity to build apps and services using The Echo Nest and Spotify. As we explore this new direction, we’ll help each other move forward.

When we began talking with our longtime friends at Spotify about working together, it became clear how much they share our vision: care for the cause of music at scale. We spent our first weeks together just giddy at the potential of all that special Echo Nest magic working directly with the world’s best place for music. You’re about to see some great stuff from the new Echo Nest-enabled Spotify, and we’re excited to hear what you think. We’re all staying in town, our API stays up, and every single person at our company will continue to focus on building the future of music. Talk to you soon; we’ve got some work to do.

For more information, see our press release.

Brian, Tristan, and Jim

with Aaron, Elissa, Tim, Paul, Matt, Mark, Joe, Eliot, Kurt, David, Dave, Amanda, David, Connor, Shane, Ned, Owen, Ellis, Andreas, Glenn, Joe, Dan, Nick, Aaron, Chris, Aaron, Stu, Kevin, Jason, Ajay, Michelle, Jyotsna, James, Hunter, Erich, Andrew, Nicola, Scott, John, Matt, Matt, Eric, Dylan, Eli, Michael, Adam, Alex, Colin, Jonathan, Marni, Smith, Krystle, Eric, Ben, Conor, Victor, Ryan, Bo, Michael, Athena, Chris, Gurhan, Peter, Kate, Bo, Scott, Jared, Darien, Matt, and Wayne.

Talk about A Singular Christmas at the Automatic Music Hackathon

I gave a talk about my A Singular Christmas at the Automatic Music Hackathon last week. Here’s what it looked like and what I said.

A Singular Christmas

Pretend that you’re new here, and you want to know what a bird is. You’re lucky: lots of people know what a bird is. They can show you a bird. This is Hilary Putnam’s linguistic division of meaning, semantic externalism. If you see enough things labeled “bird,” you start to get a handle on what makes a bird a bird. They’ve got a beak or a certain color, they land on a branch, they spread their wings and fly.

A Singular Christmas

The only way I’ve ever understood anything is by endlessly imagining all its forms and presentations. Watch what’s similar and what surprises you. See enough of the same thing, and you can make a little machine to describe it. Snowflakes maybe are circles, except when they’re not. Sometimes fractal edges, sometimes straight, sometimes a number describing the fractalness. Sparkles on the edges, a water droplet from the microscope? So any new snowflake is a set of machines you can add up. Circle plus fractal edge plus sparkles equals your own snowflake.

A Singular Christmas

We’ve all done this. We treat pictures like this, movies, touch. And of course every sound you hear these days is a series of multipliers of a basis function, spit out a speaker so fast you can’t hear the buzz. Add a bunch of component bits together to get your creativity or expression. Rehydrate the vectors into a speaker or screen again, and you probably don’t even notice.

A Singular Christmas

It’s in Pentland’s eigenfaces, so many years ago. You probably walked in the path of a dozen cameras trying this trick on your own face on the way over here tonight. Your phone has it built in, tries to tell if you’re smiling or maybe if you’re someone the government should know about.

A Singular Christmas

But it fails more often than you can imagine. Vision guys call this registration. For a computer to get what something is, it’s got to line up. Keep the eyes in the same pixel. If someone is bigger than someone else. Or an outlier, like Facebook deciding your fishbowl is your grandmother. This is where we’re still better. We don’t normally confuse people with objects, and you only need to do that once. It turns out computers like skipping over repetitive things, and we appreciate those. It turns out computers get confused by loud noises.

A Singular Christmas

I try to make this work better. I like when it fails, often, better than when it works really well. The algorithm annealing into a steady state has to be our culture’s greatest art. That we even had the hubris to encode our senses into a square floating point matrix of numbers. And that we even think that representation is good enough to understand the underlying thing.

A Singular Christmas

I mostly do it with music. People know pretty much everything about every song ever, and there’s databases where you can get the pitch of the tenth guitar note, and what people said about it. Imagine the entire universe describing a song. And then you have the audio, too, and some computer-understandable description of all the events in the song.

A Singular Christmas

I’ve been doing it for a while; this was 2003, a thing called “Eigenradio.” It took every radio station I could get in a live stream at the time, at once, and figured out how to do basis computation and resynthesis in a sort of live stream back. The idea was to be “computer music.” Not music made on a computer, because everything is. But music for computers. What they think music actually is. It mostly sounded like this:

A Singular Christmas

It took a lot of effort to do something like this. I taught myself how cluster computing worked, and scammed MIT into spending far too much money on something that would be a free tier on a cloud provider these days. The power kept going out. But the project was my favorite kind of irony, the one where the joke is nowhere near as funny as the reality it pokes at.

A Singular Christmas

I have this whole other life that I’m not going to get into, but it involves knowing about music. Consider Christmas song detection. Thought experiment: imagine someone who doesn’t know Christmas. You play them a bunch of Christmas songs: will they see a connection? Is there something innately Christmas about the music? Bells? Wide open melodies like a rabbit hopping on a piano? My theory was, if I could synthesize Christmas music from an analysis of all the Christmas music I could find, and people thought it sounded Christmas-y, we’d have cracked the code, we could have a Singular Christmas.

A Singular Christmas

A Singular Christmas

Do you want to know the magic trick? Doing this taught me one important lesson: synthesis is just fast composition. Computer people love to hate themselves because everything is so easy. But we all make things, often beautiful things, even if we didn’t mean to. Even if “the data did it” or you just threw a bunch of Matlab functions together or it only started sounding good when you started panning the sine waves into different channels. You’re composing.

A Singular Christmas

A Singular Christmas

This thing got everywhere. By far the most successful creative thing I’ve done. I was on the BBC on Christmas Eve, exasperatedly spelling out “eigenanalysis.” Pitchfork reviewed it, I got 4 stars. The MIT sysadmins and I had a big fight over its bandwidth. This excited Canadian man, on the radio.

A Singular Christmas

My favorite things are the emails. Every December, right around now, they start slowly rolling in. How this album is the only thing they listen to during the holidays. How it means Christmas to them. I’m still working on this stuff, as a sort of hedge against my more mundane realities. I want to show the world there’s beauty in the act of understanding.

Very large scale music understanding talk @ NAE Frontiers

A few years ago I gave this talk at the very impressive NAE “Frontiers of Engineering” conference at the invitation of my more successful academic friends, and noticed they had published the transcript. A rare look at one of the reasons The Echo Nest exists, from my perspective:

Presented at NAE Frontiers of Engineering, 2010


Scientists and engineers around the world have been attempting something undeniably impossible– and yet, no one could ever question their motives. Laid bare, the act of “understanding music” by a computational process feels offensive. How can something so personal, so rooted in context, culture and emotion, ever be discretized or labeled by any autonomous process? Even the ethnographical approach – surveys, interviews, manual annotation – undermines the raw effort by the artists, people who will never understand or even perhaps take advantage of what is being learned and created with this research. Music by its nature resists analysis. I’ve led two lives in the past ten years– first as a “very long-tail” musician and artist, and second as a scientist turned entrepreneur that currently sells “music intelligence” data and software to almost every major music streaming service, social network and record label. How we got there is less interesting than what it might mean for the future of expression and what we believe machine perception can actually accomplish.

In 1999 I moved to New York City to begin graduate studies at Columbia working on a large “digital government” grant, parsing decades of military documents to extract the meaning of the acronyms and domain specific words. At night I would swap the laptops in my bag and head downtown to perform electronic music at various bars and clubs. As much as I tried to keep them separate, the walls came down between them quickly when I began to ask my fellow performers and audience members how they were learning about music. “We read websites,” “I’m on this discussion board,” “A friend emailed me some songs.” Alongside the concurrent media frenzy on peer to peer networks (Napster was just ramping up) was a real movement in music discovery– technology had obviously been helping us acquire and make music, but all of a sudden it was being used to communicate and learn about it as well. With the power of the communicating millions and the seemingly limitless potential of bandwidth and attention, even someone like me could get noticed. Suitably armed with an information retrieval background alongside an almost criminal naiveté regarding machine learning and signal processing, I quit my degree program and began to concentrate full time on the practice of what is now known as “music information retrieval.”

The fundamentals of music retrieval descend from text retrieval. You are faced with a corpus of unstructured data: time-domain samples from audio files or score data from the composition. The tasks normally involve extracting readable features from the input and then learning a model from the features. In fact, the data is so unstructured that most music retrieval tasks began as blind roulette wheels of prediction: “is this audio file rock or classical” [Tzanetakis 2002] or “does this song sound like this one” [Foote 1997]. The seductive notion that a black box of some complex nature (most with hopeful success stories baked into their names– “neural networks,” “bayesian belief networks,” “support vector machines”) could untangle a mess of audio stimuli to approach our nervous and perceptual systems’ response is intimidating enough. But that problem is so complex and so hard to evaluate that it distracts the research from the much more serious elephantine presence of the emotional connection underlying the data. A thought experiment: the science of music retrieval is rocked by a massive advance in signal processing or machine learning. Our previous challenges in label prediction are solved– we can now predict the genre of a song with 100% accuracy. What does that do for the musician, what does that do for the listener? If I knew a song I hadn’t heard yet was predicted “jazz” by a computer, it would perhaps save me the effort of looking up information about the artist, who spent years of their life defining their expression in terms of or despite these categories. But it doesn’t tell me anything about the music, about what I’ll feel when I hear it, about how I’ll respond or how it will resonate with me individually and within the global community. We’ve built a black box that can neatly delineate other black boxes, at no benefit to the very human world of music.
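As a toy illustration of that extract-features-then-learn-a-model pipeline (not how any production system works), here is a minimal sketch assuming librosa for MFCC features and scikit-learn for the classifier; the file paths and labels are placeholders you would supply yourself.

```python
# Toy sketch of the classic "rock or classical?" pipeline: extract audio
# features, then learn a model from them. librosa and scikit-learn are
# assumed; file paths and labels are placeholders you supply yourself.
import numpy as np
import librosa
from sklearn.svm import SVC

def song_features(path, sr=22050):
    """Summarize a whole song as the mean and std of its MFCC frames."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape (13, n_frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_genre_classifier(labeled_files):
    """labeled_files: iterable of (audio_path, genre_label) pairs."""
    X = np.array([song_features(path) for path, _ in labeled_files])
    y = np.array([label for _, label in labeled_files])
    return SVC(kernel="rbf").fit(X, y)

# usage (placeholder paths):
#   clf = train_genre_classifier([("a.mp3", "rock"), ("b.mp3", "classical"), ...])
#   print(clf.predict([song_features("mystery.mp3")]))
```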

The way out of this feedback loop is to somehow automatically understand reaction and context the same way we could with perception. The ultimate contextual understanding system would be able to gauge my personal reaction and mindset to music. It would know my history, my influences and also understand the larger culture hovering around the content. We are all familiar with the earliest approaches to contextual understanding of music – collaborative filtering, a.k.a. “people who buy this also buy this” [Shardanand 1995] – and we are also just as familiar with its pitfalls. Sales or activity based recommenders only know about you in relationship to others– their meaning of your music is not what you like but what you’ve shared with an anonymous hive. The weakness of the filtering approaches becomes vivid when you talk to engaged listeners: “I always see the same bands,” “there’s never any new stuff” or “this thing doesn’t know me.” As a core reaction to the senselessness of the filtering approaches I ended up back at school and began applying my language processing background to music– we started reading about music, not just trying to listen to it. The idea was that if we could somehow approximate even one percent of the data that communities generate about music on the internet– they review it, they argue about it on forums, they post about shows on their blog, they trade songs on peer to peer networks– we could start to model cultural reaction at a large scale. [Whitman 2005] The new band that collaborative filtering would never touch (because they don’t have enough sales data yet) and acoustic filtering would never get (because what makes them special is their background, or their fanbase, or something else impossible to calculate from the signal) could be found in the world of music activity, autonomously and anonymously.

Alongside my co-founder, whose expertise is in musical approaches to signal analysis [Jehan 2005], I left the academic world to start a private enterprise, “The Echo Nest.” We are now thirty people, a few hundred computers, one and a half million artists, over ten million songs. The scale of this data has been our biggest challenge: each artist has an internet footprint of on average thousands of blog posts, reviews, forum discussions, all in different languages. Each song is comprised of thousands of indexable events and the song itself could be duplicated thousands of times in different encodings. Most of our engineering work is in dealing with this magnitude of data– although we are not an infrastructure company we have built many unique data storage and indexing technologies as a byproduct of our work. The set of data we collect is necessarily unique: instead of storing the relationships between musicians and listeners, or only knowing about popular music, we compute and aggregate a sort of internet-scale cache of all possible points of information about a song, artist, release, listener or event. We began the company with the stated goal to index everything there is about music. And over these past five years we have built a series of products and technologies that take the best and most practical parts from our music retrieval dissertations and package them cleanly for our customers. We sell a music similarity system that compares two songs based on their acoustic and their cultural properties. We provide tempo, key and timbre data (automatically generated) to mobile applications and streaming services. We track artists’ “buzz” on the internet and sell reports to labels and managers.

The core of the Echo Nest remains true to our dogma: we strongly believe in the power of data to enable new music experiences. Since we crawl and index everything, we’re able to level the playing field for all types of musicians by taking advantage of the information given to us by any community on the internet. Work in music retrieval and understanding requires a sort of wide-eyed passion combined with a large dose of reality. The computer is never going to fully understand what music is about, but we can sample from the right sources and do it often enough and at a large enough scale that the only thing in our way is a leap of faith from the listener.


Is your movie and music preference related?

Heart of Glass

I’m a music person: I’m a musician, I pack up all my life experiences through the lens of records and bands, and I’ve spent 15 years of my life building the world’s best automated music recommender. I think there’s something terribly personal about music that other forms of “media” (books, movies, television, articles and – recent entry alert – applications) can’t touch. A truly great song only takes a minute and forty-four seconds to experience, and then you can hit the repeat button. I could hear “Outdoor Miner” 31.7 times on my walk to work every morning if I wanted to. But I can’t watch one of my favorite movies, Werner’s “Heart of Glass,” even once on my walk to work, and to be honest, more than once a year is a bit much. I’d have to be staring at my phone or through some scary glasses. And it’s a distracting world, far too much to fit into the diorama of the brain: dozens of actors, scenes, sounds, props and story. I don’t know if I attach memories or causal emotion to movies: they try to explicitly tell me how to feel, not suggest it obliquely or provide a soundtrack to a reality. And worst of all, it’s a mood killer to give a fledgling romantic partner a mix “DVD-box-set.”

But certainly, my preference in film (or that I even call them films – like some grad student) has to tell me something about myself, or even my other tastes. If we knew someone’s movie preference, could we build a better music playlist for them? Or can we help you choose a movie by knowing more about your music taste? I recently poked out of my own bubble of music recommendation and music retrieval to see if there were any correlations we could make use of.

Recommending in general


The way the Echo Nest has done music recommendation is actually quite novel and deserves a quick refresher: we don’t do what most other companies or technologies do. Amazon, Last.fm, iTunes Genius and many others use statistics of your activity to determine what you like: if you listen to Can, and so does a stranger, but that stranger also loves Cluster and the system presumes you don’t know about them, you might get recommended Cluster. But that approach doesn’t know anything about music, and it constantly fails in its own naïve way:

Colin Powell recommendation from Britney Spears
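For contrast, the activity-only logic boils down to something like this minimal item-item co-occurrence sketch; the play data is invented, and the point is that nothing in it knows anything about the music itself.

```python
# Minimal item-item co-occurrence recommender: the "people who listened to X
# also listened to Y" logic described above, with no knowledge of the music.
from collections import defaultdict
from itertools import combinations

# listener -> set of artists they played (toy, invented data)
plays = {
    "listener_a": {"Can", "Cluster", "Neu!"},
    "listener_b": {"Can", "Cluster"},
    "listener_c": {"Can", "Britney Spears"},
}

co_counts = defaultdict(lambda: defaultdict(int))
for artists in plays.values():
    for a, b in combinations(sorted(artists), 2):
        co_counts[a][b] += 1
        co_counts[b][a] += 1

def recommend(artist, top_n=3):
    """Rank artists by raw co-occurrence with the seed artist."""
    ranked = sorted(co_counts[artist].items(), key=lambda kv: -kv[1])
    return [name for name, _ in ranked[:top_n]]

print(recommend("Can"))  # -> ['Cluster', 'Neu!', 'Britney Spears']
```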

Instead of relying on that brittle world of correlated activity, we’ve first built a music understanding system that gets what music is: what people say about it and what it sounds like, and that platform also happens to recommend things and give you playlists. We use all of that data to power top-rate discovery for tons of services you use every day: Rdio, Sirius XM, Nokia’s MixRadio, iHeartRadio, MTV, the Infinite Jukebox. We don’t just know that you like a song, we know what the key of that song is, how many times people called it “sexy” in the past week on blogs, and what instruments are in it. We also know, through the anonymized Taste Profile, how often you (and the world) listened, what time of day, what songs you like to listen to before and after, and how diverse your taste is.

The reason this is useful is we don’t want to just build a thing that knows that “people that like The Shins also like Garden State,” we want to go deeper. We want our models to understand the underlying music, not just the existence of it. We also want to show correlations between styles and other musical descriptors and types of films, not just artists. Facebook could (and it probably tries to) build a music “recommender” by just checking out the commonalities of what people like, but we want to look deeply at the problem, not the surface area of it.

Experimental setup

The Echo Nest is currently pulling in hundreds of musical activity data points a second, through our partners and our large scale crawls of the web and social media. A recent push on our underlying Taste Profile infrastructure nets us new data on the listeners themselves – specifically, anonymously collected and stored demographic and non-music media preferences. Through all of this we know the favorite artists and movies for a large set of Taste Profiles (if you’re a developer, you can store non-musical data using our Taste Profile Key-Value API and manipulate and predict new features using our alpha Taste Profile predict API.) For the purposes of this experiment, we limited our world to 50,000 randomly chosen Taste Profiles that had movie and music preference data.

Musical attributes for ABBA

Each artist was modeled using Echo Nest cultural attributes: a sparse vector of up to 100,000 “terms” that describe the music in the Taste Profile, weighted by their occurrence. If someone constantly listens to the new James Holden record, and I mean, over and over again, kind of annoyingly, we weight terms like “bedroom techno” and “melodic” along with the acoustically derived terms – its energy, danceability and so on – higher than terms from songs they’ve just heard once or twice. The output vector is a human-targeted cultural description of their favorite music, with helpful floating point probabilities P(X|L) for each term denoting: “How likely would it be for this listener to describe their taste as ‘X’”?
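A rough sketch of how such a listener vector might be assembled; the term lists, weights, and play counts below are invented for illustration, and the real vocabulary runs to tens of thousands of terms.

```python
# Hypothetical sketch: fold per-artist descriptive terms into one
# play-count-weighted, normalized "cultural vector" for a listener.
from collections import Counter

# artist -> {term: weight} (invented weights standing in for real attributes)
artist_terms = {
    "James Holden": {"bedroom techno": 0.9, "melodic": 0.7, "energy": 0.6},
    "ABBA":         {"europop": 0.9, "danceability": 0.8, "melodic": 0.5},
}
# listener's play counts (invented)
play_counts = {"James Holden": 48, "ABBA": 3}

taste = Counter()
for artist, plays in play_counts.items():
    for term, weight in artist_terms.get(artist, {}).items():
        taste[term] += plays * weight

total = sum(taste.values())
taste_vector = {term: value / total for term, value in taste.items()}
# taste_vector now approximates P(listener would describe their taste as term)
print(sorted(taste_vector.items(), key=lambda kv: -kv[1])[:3])
```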

The movie data was a bit harder, noting for the record that we are a music data provider run by some musicians who happened to be good with computers. I deployed a small crack team (the CTO and his imaginary friends) to build a mini “Echo Nest for movies,” cataloging (for now) 5,000 of the most popular films along with their descriptors culled from descriptions and reviews in a similar way as we’ve done for music. I determined their genres, lead actors, key attributes and cultural vectors to train models against.

Top movie attributes for “The Godfather”

Predictions

By training thousands of correlative models between the sparse music vectors and the various target ground truth of the movie attributes (which were in reality far less diverse and dense) we are able to quickly surface high-affinity relationships between various types of music and types of movies.

KL divergence doing its thing (from Wikipedia)

I used a multi-class form of the support vector machine, regularized least-squares classification (which you can read about in an old paper of mine), to train the thousands of models. RLSC is fine with sparse vectors and unbounded amounts of output classes, and we also ended up with a linear kernel, which made the training step very light – likely due to the low rank of the movie features.
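For the curious, RLSC with a linear kernel amounts to solving one regularized linear system whose factorization is shared across every one-vs-all target, which is what keeps training thousands of models cheap. A minimal numpy sketch with random stand-in data (sizes and the regularizer are arbitrary):

```python
# Minimal one-vs-all RLSC with a linear kernel: one system solve is shared
# by every output class, which keeps thousands of models cheap.
import numpy as np

rng = np.random.default_rng(0)
n_listeners, n_terms, n_movies = 500, 2000, 50   # stand-in sizes
X = rng.random((n_listeners, n_terms)) * (rng.random((n_listeners, n_terms)) < 0.01)
Y = (rng.random((n_listeners, n_movies)) < 0.05).astype(float)  # liked-movie targets

lam = 1e2                              # regularization, chosen arbitrarily
K = X @ X.T                            # linear kernel, n_listeners x n_listeners
C = np.linalg.solve(K + lam * np.eye(n_listeners), Y)  # dual coefficients, all classes at once

def predict(x_new):
    """Score every movie class for a new sparse music-taste vector."""
    return (X @ x_new) @ C             # k(x_new, X_i) weighted by dual coefficients

scores = predict(rng.random(n_terms) * 0.01)
print("top movie indices:", np.argsort(-scores)[:5])
```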

I evaluated the models in two ways: the first I’ll call a “discriminant classifier” – this will list the most useful sources of information (KL divergence) for a given music source – and the second is a “ranked classifier” – given popularity features, what would give the least surprise for the classifier. There are good reasons for the two methods: the former is more statistically correct, but ignores that most people have never heard of most things, while the latter gives us safe bets that carry less explicit information.1 As we see every day with music, a computer’s idea of “information” rarely has much to do with things like the success of “Fast & Furious 6.”
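Here is a toy rendering of the difference between the two, with all probabilities and document frequencies invented: a KL-style term rewards surprising movies, while the popularity-smoothed score floats safe bets to the top.

```python
# Sketch of the two ranking flavors described above (all numbers invented):
# "discriminant" ranks by an information term while "ranked" smooths the
# same conditional scores by how often each movie is mentioned at all.
import math

# P(movie | fans of the artist), P(movie) overall, and document frequency
movies = {
    "Toy Story":     (0.20, 0.22,  0.9),
    "Paid in Full":  (0.04, 0.002, 0.05),
    "New Jack City": (0.03, 0.003, 0.06),
}

def discriminant(p_cond, p_marg):
    # pointwise KL-style contribution: how surprising is this movie for these fans?
    return p_cond * math.log(p_cond / p_marg)

def ranked(p_cond, doc_freq, alpha=0.5):
    # popularity-smoothed score: safe bets float to the top
    return p_cond * (doc_freq ** alpha)

for name, (p_c, p_m, df) in movies.items():
    print(f"{name:15s} discriminant={discriminant(p_c, p_m):+.3f} "
          f"ranked={ranked(p_c, df):.3f}")
```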

For example, I am able to ask it both: “If an average person likes Jay-Z, what are their favorite movies” (ranked) and “Which movie can I assume predicts the liking of Jay-Z”? (discriminant). They are:

Ranked                    Discriminant
Toy Story                 Get Rich or Die Tryin’
Step Brothers             Paid in Full
Buddy The Elf             Scary Movie 4
Harry Potter (series)     Shottas
Jackass                   Juice
Superbad                  New Jack City
Fight Club                Friday After Next
Movie predictions for fans of Jay-Z

You can see the difference: the left side is the safe bets (everyone likes Toy Story! everyone likes Jay-Z!) and the right side is the less known but more useful results. So you don’t think I’m pulling a Shawshankr2 on you, here’s the list for a different artist:

Ranked                    Discriminant
Dirty Dancing             Pure Country
Toy Story                 8 Seconds
The Blind Side            Country Strong
Twilight (series)         Valentine’s Day
The Notebook              Sweet Home Alabama
Finding Nemo              Letters to Juliet
Dear John                 The Vow
Movie predictions for fans of Tim McGraw

We can also bulk this up by features of the movie; here are the top musical artists correlated with movies with a heavy crime element:

Ranked                    Discriminant
Jimi Hendrix              Ghostpoet
The Beatles               Amazing Blondel
The Rolling Stones        Ian Anderson
Jay-Z                     Doseone
The Who                   Young Gunz
Bob Dylan                 Mandrill
Pink Floyd                Pato Banton
Artist predictions for fans of crime movies

Seeing the Amazing Blondel there just amazes me: we track two and a half million artists and it’s those guys that like crime movies? The data can’t lie.

The Amazing Blondel

We also looked up which movies our term computations considered “pornographic” or “adult” (they know it when they see it), and which artists their fans prefer:

Ranked                      Discriminant
Linkin Park                 The Receiving End of Sirens
The Beatles                 Haste the Day
The Rolling Stones          The Dillinger Escape Plan
Deftones                    The Mars Volta
Limp Bizkit                 Far East Movement
Korn                        Rediscover
Rage Against the Machine    Imogen Heap
Artist predictions for fans of adult movies

Fans of “Christian metalcore”-ers Haste the Day and of Imogen Heap, we’re onto you. We don’t judge.

Overall

We did a lot more analysis, more of which you can see over on The Echo Nest’s new Musical Identity site, including breakdowns of different genres of films:

Sci-fi vs. Fantasy

The goal of all of this is to understand more about music and the identity we express through our affinity. We’re getting closer with a lot of these large scale analyses of different forms of media and demographic and psychographic predictions solely from preference. But it’s also going to help us elsewhere: being able to recommend you that one song or artist with not much information is what we do, and the more we can predict from what we used to think of as orthogonal sources, the better.


  1. For the scientists getting mad: the ranked classifier applies a smoothed weight of terms by their document frequency – the number of times we saw a movie being mentioned. 

  2. The more precise movie recommender with the worst recall 

How music recommendation works — and doesn’t work

When you see an automated music recommendation do you assume that some stupid computer program was trying to trick you into something? It’s often what it feels like – with what little context you get with a suggestion on top of the postmodern insanity of a computer understanding how you should feel about music – and of course sometimes you actually are being tricked.

Amazon’s recommendations for Abbey Road

No one is just learning that if they like a Beatles album, they may also like five others. Amazon is not optimizing for the noble work of raising independent artists’ profiles to the public, and they’re definitely not optimizing for a good musical experience. They’re statistically optimizing to make more money, to sell you more things. Luckily this is the fruit fly of music recommendation, the late night infomercial quality of a music discovery experience that also might dry your lettuce if you spin it fast enough. And I doubt Amazon would ever claim otherwise.

The rest of the field has gotten pretty far since then and we’ve now got tons of ways to discover music using actual qualities of the music or social cues of what your friends are listening to. But I still hate seeing examples like the above. I hate thinking there are forces at work in music discovery that don’t have listeners’ best interests at heart and I want to make them better. I want to walk through all the ways music recommenders work or don’t, and concentrate of course on the one I know best – The Echo Nest’s – which you’re probably using even if you don’t already know it. And most importantly, I want to talk about what we can do next.

Before I get into it, a brief history of who I am: I’ve been working on music recommenders and music retrieval since 1999, academically and in industry. In 2005 I started The Echo Nest with my co-founder Tristan. We power most online music services’ discovery using a very interesting series of algorithms that is sort of the Voltron-figure of our two dissertations and the hard work of our 50 employees in Boston, SF, NYC and London. And we’ve been on a bit of a tear – just in the past year alone we’ve announced that we’re powering music discovery features for eMusic, Twitter, EMI, iHeartRadio, Rdio, Spotify, VEVO and Nokia – with some new heavy hitters not yet announced – to add to our existing customer base that includes MTV, MOG, and the BBC. And through our API we have tens of thousands of developers making independent apps like Discovr, KCRW, Muzine, Raditaz, Swarm, SpotON and hundreds more.

We’ve been a quiet company for a while and with all this great news comes a lot of new confusion about what we do and how it compares to other technologies. Journalists like to pin us as the “machine” approach to understanding music next to the “human” of our nearest corollary (not competitor) in the space – Pandora. This is somewhat unfair and belies the complexity of the problem. Yes, we use computer programs to help manage the mountains of music data, but so does everyone, and the way we get and use that data is just as human as anything else out there.

I’ll go into technologies like collaborative filtering, automatic content based recommendation, and manual approaches used by Pandora or All Music Guide (Rovi). I’ll show that no matter what the computational approach ends up being, the source data – how it knows about music – is the most important asset in creating a reliable useful music discovery service.

What is recommendation? What is it good for?

Musicians are competing for an audience among millions of others trying just as hard. And it’s not the listener’s fault if they miss out on something that will change their lives – these days, anyone can gain access to a library of over 15 million songs on demand for free. To a musician turned computer scientist (as I and so many of my colleagues are) this is the ultimate hidden variable problem. If there were something “intelligent” that could predict a song or artist for a person, both sides (musician and listener) would win. Music is amazing, there’s a ton of data, and it’s very far from solved.

But anyone in the entire field of music technology has to treat music discovery with respect: it’s not about the revenue of the content owner, it’s not about the technology, it’s not about click through rates, listening hours or conversion. The past few years have shown us over and over that filters and guides are invaluable for music itself to coexist with the new ways of getting at it. We track over 2 million artists now – I estimate there are truly 50 million, most of them currently active. Every single one of them deserves a chance to get their art heard. And while we can laugh when Amazon suggests you put a Norah Jones CD in your cart after you buy a leaf blower, the millions of people that idly put on Pandora at work and get excited about a new band they’ve never heard deserve a careful look. Recommendation technology is powering the new radio and we have a chance to make it valuable for more than just the top 5 percent of musicians.

When people talk about “music recommendation” or “music discovery” they usually mean one of a few things:

  • Artist or song similarity: an anonymous list of similar items to your query. You can see this on almost any music service. Without any context, this is just a suggestion of what other artists or songs are similar to the one you are looking at. Formally, this is not truly a recommendation as there is no user model involved (although since a query took the user to the list, I still call these a recommendation. It’s a recommendation in the sense that a web search result is.)
  • Personalized recommendation: Given a “user model” (your activity on a service – plays, skips, ratings, purchases) a list of songs or artists that the service does not think you know about yet that fits your profile.
  • Playlist generation: Most consumers of music discovery are using some form of playlist generation. This is different from the above two in that they receive a list of items in some order (usually meant to be listened to at the time.) The playlist can be personalized (from a user model) or not, and it can be within catalog (your own music, ala iTunes Genius’s or Google’s Instant Mix playlists) or not (Pandora, Spotify’s or Rdio’s radio, iHeartRadio.) The playlist should vary artists and types of songs as it progresses, and many rely on some form of steering or feedback (thumbs up, skips, etc.)
Where popular services sit in discovery
              Personalized      Anonymous
Playlist      Pandora           Rdio radio
Suggestions   Amazon            All Music Guide

These are three very different ways of doing music discovery, but for every technology and approach I know of, they are simply applications on top of the core data presented in different ways. For example, at the Echo Nest we do quite a bit to make our playlists “radio-like” using our observed statistics, acoustic features and a lot of QA but that sort of work is outside the scope of this article – all three of our similarity API, taste profiles (personalized recommendation) API and playlist API start with the same knowledge base culled from acoustic and text analysis of music.

However, the application means a lot to the listener. People seem to love playlists and radio-style experiences, even if the data driving both that and the boring list of songs to check out are the same. One of the great things about working at the Echo Nest is seeing the amazing user interfaces and experiences people put on top of our data. Listeners want to hear music, and they want to trust the service and have fun doing it. And conversely, a Pandora completely powered by Echo Nest data would feel the same to users but would have far better scale and results and thus add to the experience. Because of this very welcome sharding of discovery applications, it’s less helpful to talk about these applications directly and more helpful to talk about “what the services know” about music – how they got to the result that Kreayshawn and Uffie are similar, no matter where it appeared in the radio station or suggestion or what user model led them there. We can leave the application and experience layer to another lengthy blog post.

My (highly educated, but please know I have no direct inside information except for Echo Nest of course) guesses on the data sources are:

How popular services know about music
Service           Source of data
Pandora           Musicologists take surveys
Songza            Editors or music fans make playlists
Last.fm           Activity data, tags on artists and songs, acoustic analysis[1]
All Music Guide   Music editors & writers
Amazon            Purchase & browsing history
iTunes Genius     Purchase data, activity data from iTunes[2]
Echo Nest         Acoustic analysis, text analysis

There are many other discovery platforms but this list covers the widest swath of approaches. Many services you interact with use either these platforms directly (Last.fm, Echo Nest, AMG all license data or give away data through APIs) or use similar enough approaches that it wouldn’t be worth going into them in detail.

From this list we’re left with a few major music knowledge approaches: (1) activity data, (2) critical or editorial review, (3) acoustic analysis, and (4) text analysis.

The former two are self-describing: you can learn about music by the activity around it (listens, plays, purchases) – Kreayshawn and Uffie are considered similar if the same people buy their singles or rate them highly – or you can learn about music by critical review, what humans have been pretty good at for some time. Of course, encoding activity (via collaborative filtering or taste mining) and critical review (via surveys or direct entry) into a database is a relatively recent art.

The latter two, acoustic and textual analysis, were developed by the field as a reaction to the failures of the first two. I’ll go into much greater detail on those as it’s how Echo Nest does its magic.

Care and Scale

The dominating principle of the Echo Nest discovery approach from day one has been “care and scale.” When Tristan and I started the company in 2005 we were two guys with fresh PhDs on music analysis and some pretty good technological solutions; Tristan’s in the acoustic analysis realm (a computer taking a signal and making sense of it) and mine in the data mining and language analysis space (understanding what people are saying and doing with music.) We surveyed the landscape at the time for discovery and found that almost every one suffered from either a lack of care or scale, sometimes (and often) both. The entire impetus of doing a startup (not an easy choice for two scientists and anyone that has met us knows we are not the “startup type”) was that we thought we had something between the two of us that could fix those two problems.

Care & Scale

Scale is easy to explain: you have to know about as much music as possible to make good recommendations. If you don’t know about an up and coming artist, you can’t recommend them. If you only analyze or rate or understand the popular stuff, you by default fail at discovery. Manual discovery approaches by their nature do not scale. We track over two million artists and over 30 million songs and there is no way a manually curated database can reach that level of knowledge. Even websites that are volunteer- or community-edited run up against the limits of the community that takes part – we count only a little over 130,000 artist pages on Wikipedia. Pandora recently crossed the 1 million song barrier, and it took them 10 years to get there. Try any hot new artist in Pandora and you’ll get the dreaded:

Pandora not knowing about YUS

This is Pandora showing its lack of scale. They won’t have any information for YUS for some time and may never unless the artist sells well. This is bad news and should make you angry: why would you let a third party act as a filter on top of your very personal experiences with music? Why would you ever use something that “hid” things from you?

Activity data approaches (such as Last.fm and Amazon and iTunes Genius) also suffer from a slighter scale problem that manifests itself in a different way. It’s trivial to load a database of music into an activity data-based discovery engine (such as collaborative filtering or social tags.) I’ve often gone after such naive approaches to music discovery publicly. If a website or store has a list of user data (user A bought / listened to song Y at time Z) any bright engineer will immediately go into optimization mode. There’s almost a duplicitous ease of recommending music to people poorly. I recently was shopping for a specific type of transistor for a project on a parts and components website and found they, too, had turned on the SQL join that allowed “recommendations” on their site based on activity data:

Pathological filtering

Other than activity not making much sense in a discovery context, by default these systems suffer a “popularity bias,” where a lot of music simply doesn’t have enough activity data yet collected to be considered a recommendation match. Activity based systems can only know what people have told them explicitly, and this often makes it hard for less-popular artists to be recommended.

Care is a trickier concept and one we’ve tried very hard to define and encode into our engineering and product. I translate it as “is this useful for the musician or listener?” A great litmus test for care in music discovery is to check the similar artists or songs to The Beatles. Is it just the members of the Beatles and their side projects? For almost all services that use musical activity data, it will be:

Top artist similars are all members of the Beatles

Certainly a statistically correct result[3], but not a musically informative one. There is so much that user data can tell you about listening habits, but blindly using it to inform discovery belies a lack of care about the final result. Care is neatly handled by using social, manual or editorial approaches, as humans are pretty good at treating music properly. But when using more statistical or signal processing approaches that know about more music at scale, care has to be factored in somehow. Most purely signal processing approaches (such as Mufin here) fall down as badly on care as activity data approaches do:

Mufin expressing so little care about Stairway to Heaven

Care is a layer of quality assurance, editing and sanity checks, real-world usage and analysis and, well, care, on top of any systematic results. You have to be able to stand by your results and fix them if they aren’t useful to either musicians or listeners. Your WTF count has to be as low as possible. We’ve spent a lot of time embedding care into our process and while we’re always still working, we’re generally pleased with how our results look.

Without both care and scale you’ve got a system that a listener can’t trust and that a musician can’t use to find new fans. You’ve failed both of your intended audiences and you might as well not try at all.

Care & Scale of common approaches

Text Analysis

Echo Nest cultural vectors

I started doing music analysis work in 1999 at the NEC Research Institute in Princeton, NJ (I had scammed them into an internship and then a full time job by being very persistent.) NEC was then full of the top tier of data mining, text retrieval, machine learning and natural language processing (NLP[4]) scientists; I had the great fortune to work with guys like Steve Lawrence, Gary Flake and David Waltz, and Vladimir Vapnik even moved into my tiny office after I left for MIT.

I was there while figuring out what to do with myself after abruptly quitting my PhD program in NLP at Columbia. I was a musician at the time, playing a lot of shows at various warehouse spaces or the late, lamented “Brownies,” places where 20 people might show up and 10 would know who you were. There was a lot of excitement about “the future of music” – far more than there is today, as somehow we felt that the right forces would win and quickly. I logged onto Napster for the first time from a DSL connection and practically squealed in delight as a song could be downloaded faster than the time it would take to listen to it. It was a turning point for music access, but probably a step back for music discovery. We were still stuck with this:

Napster in 2001

The search was abysmal: a substring match on ID3v1 tags (32 characters each for artist, title, release and a single byte for genre) or filename (usually “C:\MUSIC\MYAWES~1\RAPSONG.MP3”) and there was no discovery beyond clicking on other users’ names and seeing what they had on their hard drives. I would make my music available but of course, no one would ever download it because there was no way for them to find it. A fellow musician friend quickly took to falsely renaming his songs as “remixes” by better known versions of himself: “ARTIST – TITLE – APHEX TWIN MIX” and reported immediate success.

At the time I was a member of various music mailing lists, USENET groups and frequent visitor of a new thing called “weblogs” and music news and review sites. I would read these voraciously and try to find stuff based on what people were talking about. To me, while listening to music is intensely private (almost always with headphones alone somewhere), the discovery of it is necessarily social. I figured there must be a way to take advantage of all of this conversation, all the excited people talking about music in the hopes that others can share in their discovery – and automate the process. Could a computer ‘read’ all that was going on across the internet? If just one person wrote about my music on some obscure corner of the web, then the system could know that too.

This is scale with care: real people feeding information into a large automated system from all different sources, without having to fill out a survey or edit a wiki page or join a social network.

After almost ten years of data mining, language and music research (first at NECI, then a PhD at MIT at the Media Lab), The Echo Nest currently is the only music understanding service that takes this approach. And it works. We crawl the web constantly, scanning over 10 million music related pages a day. We throw away spam and non-music related content through filtering, we try to quickly find artist names in large amounts of text and parse the language around the name. Every word anyone utters on the internet about music goes through our systems that look for descriptive terms, noun phrases and other text, and those terms bucket up into what we call “cultural vectors” or “top terms.” Each artist and song has thousands of daily changing top terms. Each term has a weight associated, which tells us how important the description is (roughly, the probability that someone will describe music as that term.) We don’t use a fixed dictionary for this, we are able to understand new music terms as quickly as they are uttered, and our system works in many Latin-derived languages across many cultures.
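A hypothetical, stripped-down version of that term-weighting step might look like the following; the sentences, the background frequencies, and the filtering are all stand-ins for a much bigger pipeline.

```python
# Stripped-down sketch of turning crawled sentences about an artist into
# weighted "top terms". The sentences, term list, and weighting are all
# illustrative; the real pipeline handles spam, many languages, and scale.
import math
import re
from collections import Counter

crawled = [
    "Their new record is all melodic bedroom techno and late-night energy.",
    "Saw them live last week, so much energy, very danceable and melodic.",
]
# background document frequencies for common words (invented numbers)
background_df = {"melodic": 0.01, "energy": 0.05, "danceable": 0.008, "new": 0.4}

counts = Counter()
for sentence in crawled:
    for word in re.findall(r"[a-z']+", sentence.lower()):
        counts[word] += 1

top_terms = {}
for word, count in counts.items():
    if word not in background_df:   # toy stand-in for real phrase/term filtering
        continue
    # tf-idf style: frequent in the artist's documents, rare in general text
    top_terms[word] = count * math.log(1.0 / background_df[word])

for term, weight in sorted(top_terms.items(), key=lambda kv: -kv[1]):
    print(f"{term:10s} {weight:.2f}")
```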

On top of this statistical NLP, we also pull in structured data from a number of partners and community access sites like Wikipedia or Musicbrainz. We apply the same frequency and vector approach to this knowledge-base style data: if Wikipedia lists the location of an artist as NYC and the label partner as New York, NY and their Facebook page has “EVERYWHERE ON TOUR 2012”, we have to figure out which is the right answer to index. Often the cultural vectors on structured data become a synthesis of all the different data sources.

When a query for a similar artist or a playlist comes into our system, we take the source artist or song, grab its cultural vectors, and use those in real time to find the closest match. This is not easy to do at scale, and over the years we’ve done quite a lot of “big data” work to make this tractable. We don’t cache this data because it changes so often – the global conversation around music is very finicky and artists make overnight changes to their sound.
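The matching itself is conceptually simple, some flavor of vector similarity over those term weights, even if doing it fresh over millions of artists is not. A toy version with invented vectors:

```python
# Toy version of the query step: cosine similarity between sparse
# cultural vectors (term -> weight dicts). The hard part in production is
# doing this fresh, at scale, without caching; none of that is shown here.
import math

def cosine(a, b):
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

catalog = {   # artist -> cultural vector (invented weights)
    "Kreayshawn":      {"bay area": 0.8, "rap": 0.9, "internet": 0.7},
    "Uffie":           {"electro": 0.8, "rap": 0.6, "internet": 0.8},
    "Amazing Blondel": {"acoustic": 0.9, "medieval": 0.8, "folk": 0.7},
}

def most_similar(query_artist, top_n=2):
    query = catalog[query_artist]
    scored = [(cosine(query, vec), name)
              for name, vec in catalog.items() if name != query_artist]
    return sorted(scored, reverse=True)[:top_n]

print(most_similar("Kreayshawn"))
```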

A lot of useful data naturally falls out of cultural analysis of music: the quantity of conversation is used to inform our “hotttnesss” and familiarity data points, representing how popular the artist is now on the internet and overall how well known they might be. We can use the crawled text anonymously as sort of a proxy for listener data without having to get it from a playback service. And the index of documents that relate to artists or songs is of course valuable to a lot of our customers in a feed or search context – showing news or reviews about artists their users are interested in.

Acoustic Analysis

Echo Nest acoustic analysis view

The internet is not the Library of Babel we envision it to be, and quite often many “lower-rank” (less popular) musicians are left out of the “cultural universe” we crawl. Also, the description of music necessarily leaves out things that actually describe the music – a Google blog search for Rihanna illustrates the problem well: many popular artists’ descriptions are skewed towards the celebrity angle and while this is certainly a valid thing to know about a musician, it’s not all we need to know. Lastly, internet discussion of music tends to concentrate on artists, not songs (although there is sometimes talk of individual songs on music blogs.) These three issues (and common sense) require us to figure out if we can understand how a song sounds as well as how the artist and song are represented by listeners. And if we are going to follow care and scale, we’ve got to do this automatically, with a computer doing the job of the careful listening.

Can a computer really listen to music? A lot of people have promised it can over the years, but I’ve (personally) never heard a fully automated recommendation based purely on acoustic analysis that made any sense – and I’ve heard them all, from academic papers to startups to our own technology to big-company efforts. And that has a lot to do with the expectations of the listener. There are certain things computers are very good and fast at doing with music, like determining the tempo or key, or how loud it is. Then there are harder things that will get better as the science evolves, like time signature detection, beat tracking over time, transcription of a dominant melody, and instrument recognition. But even if a computer were to predict all of these features accurately, does that information really translate into a good recommendation? Usually not – and we’ve shown over the years that people’s expectation of “similar” – either in a playlist or a list of artists or songs – trends heavily towards the cultural side, something that no computer can get at simply by analyzing a signal.

But it does turn out that acoustic analysis has a huge part to play in our algorithms. People expect playlists to be smooth and not jump around too much. Quiet songs should not be followed with loud metal benders (unless the listener asked for that.) For jogging, the tempo should steadily increase. Most coherent mixes should keep the instrumentation generally stable. Songs should flow into one another like a DJ would program them, keeping tempo or key consistent. And there’s a ton we haven’t figured out yet on the interface side. Could a “super dorky query interface” work for music recommendation, where a listener can filter by dominant key or loudness dynamics? Maybe with the right user experience. An early product out of the Echo Nest[5] was an “intelligent pause button” that Tristan whipped up that would compose a repeating segment out of the part of the song you were in or just play the song roughly forever (check out an automated 10 minute MP3 re-edit of a Phoenix song) – which a few years later became Paul’s amazing Infinite Jukebox – these experiments are fascinating precursors to a new listening experience that might become more important than discovery itself.

The Echo Nest audio analysis engine (PDF) contains a suite of machine listening processes that can take any audio file and output both low-level (such as the time of when every beat starts) and high-level (such as the overall “danceability”) information for any song in the world. We analyze all the music we work with, and developers can upload their own audio to see everything we compute on a track via our API. Our analysis starts by pretending it was an ear: it will model the frequencies and loudness of a musical signal much the same way perceptual codecs like MP3 or AAC do. It then segments the audio into small pieces – roughly 200ms to 4s, depending on how fast things are happening in the song. For each segment we can tell you the pitch (in a 12-dimensional vector called chroma), the loudness (in an ADSR-style envelope) and the timbre, which is another 12 dimensional vector that represents the sound of the sound – what instruments there are, how noisy it is, etc. It also tracks beats across the signal, in subdivisions of the musical meter called tatums, and then per beat and bar, alongside larger song-level structure we call sections that denote choruses, intros, bridges and verses.
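To make the shape of that output concrete, here is a rough sketch of the kind of structure the description above implies; the field names are illustrative, not the exact analyzer schema.

```python
# Rough sketch of the shape of the analysis output described above.
# Field names are illustrative only, not the exact analyzer schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Segment:
    start: float                 # seconds
    duration: float              # roughly 200 ms to 4 s
    pitches: List[float]         # 12-dim chroma vector
    timbre: List[float]          # 12-dim "sound of the sound" vector
    loudness_start: float        # dB, start of an ADSR-style envelope
    loudness_max: float          # dB

@dataclass
class TrackAnalysis:
    tempo: float                 # BPM
    key: int                     # 0 to 11, pitch class
    tatums: List[float] = field(default_factory=list)    # finest metrical grid
    beats: List[float] = field(default_factory=list)
    bars: List[float] = field(default_factory=list)
    sections: List[float] = field(default_factory=list)  # intro/verse/chorus boundaries
    segments: List[Segment] = field(default_factory=list)

# e.g. a remix tool could shuffle analysis.segments and re-render them
```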

That low level information can be combined through some useful applications of machine learning that Tristan and his team have built over the years to “understand” the song at a higher level. We emit song attributes such as danceability, energy, key, liveness, and speechiness, which aim to represent the aboutness of the song in single floating point scalars. These attributes are either heuristically or statistically observed from large testbeds: we work with musicians to label large swaths of ground truth audio against which to test and evaluate our models. Our audio analysis can be seen as an automated lead sheet or a computationally understandable overview of the song: how fast it is, how loud it gets, what instruments are in it. The data within the analysis is so fine grained that you can use it as a remix tool – it can chop up songs by individual segments or beats and rearrange them without anyone noticing.

We don’t use either type of data alone to do recommendations. We always filter the world of music through the cultural approaches I showed above and then use the acoustic information to order or sort the results by song. A great test of a music recommender is to see how it deals with heavy metal ballads – you normally would expect other ballads by heavy metal bands. This requires a combination of the acoustic and cultural analysis working in concert. The acoustic information is obviously also useful for playlist generation and ordering, or keeping the mood of a recommendation list coherent.
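In sketch form, the heavy-metal-ballad test calls for a two-stage combination roughly like the following; every helper and threshold here is a placeholder, not the production system.

```python
# Sketch of the two-stage combination described above: cultural similarity
# picks the candidate pool, acoustic attributes order it. The helper
# functions and thresholds are placeholders, not the production system.

def recommend_songs(seed_song, catalog, cultural_sim, acoustic, top_n=20):
    # 1) Cultural filter: only songs whose artists live in the same
    #    cultural neighborhood as the seed (e.g. other heavy metal bands).
    pool = [s for s in catalog
            if s != seed_song and cultural_sim(seed_song.artist, s.artist) > 0.6]

    # 2) Acoustic ordering: among those, prefer songs that *sound* like the
    #    seed (a ballad stays with ballads: similar energy, tempo, loudness).
    def acoustic_distance(s):
        a, b = acoustic(seed_song), acoustic(s)
        return sum((a[k] - b[k]) ** 2 for k in ("energy", "tempo", "loudness"))

    return sorted(pool, key=acoustic_distance)[:top_n]
```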

What’s next

I’ve used every single automated music recommendation platform, technology or service. It’s obviously part of my job and it’s been astounding to watch the field (both academically and commercially) mature and test new approaches. We’ve come a long way from RINGO and while the Echo Nest-style system is undoubtedly the top of the pack these days as far as raw quality of automated results go, there’s still quite a lot of room to grow. I’ve been noticing two trends in the space that will certainly heat up in the years to come:

Social – filtering collaborative filtering

This is my jam

Social music discovery, as embodied by some of my favorite music services such as This is my jam, Swarm.fm, Facebook’s music activity ticker and real-time broadcasting services like the old Listening Room and its modern successor Turntable.fm, often has little or no automated music discovery. Friend-to-friend music recommendations enabled by social networks are extremely valuable in music discovery (and I personally rely on them quite often), but they are not recommendation engines: they cannot automatically predict anything without explicit social signals, and so they fail to scale. The most amazing and useful feature of a recommendation system is that it can find things for you that you wouldn’t otherwise have come across.

Even though that puts this kind of service outside the scope of this article, that doesn’t mean we should ignore the power of social recommendations. There’s something very obvious in the social fabric of these services that makes personal recommendations more valuable: people don’t like computers telling them what to do.

There are some interesting new services that mix the social aspects of recommendation with automated measures. This is my jam’s “related jams” is a first useful crack at this, as is Spotify’s recent “Discover” feature. You can almost see these as extensions of your social graph: if your friends haven’t yet caught on to Frank Ocean, there might be signals that show they will get there soon, and cultural filtering can get us there. And a lot of the power of social recommendation – that it comes from your friends – tells the story better than a raw list of “artists we think you would like.”

Listener intelligence

When do you listen to music? Is it in the morning on your way to work? Is it on the weekends, relaxing at home? When you do, how often do you listen to albums versus individual songs in a playlist? Do you idly turn on an automated radio station, or do you have your own playlists? Which services do you use and why? If it’s raining, do you find yourself putting on different music? The scariest thing about all the music recommendation systems I’ve gone over (including ours, as of right now) is that none of them looks at this necessary listener context.

There is a lot going on in this space, both internal stuff at the Echo Nest we can’t announce yet and new products I’m seeing come out of Spotify and Facebook. We’re throwing our weight behind the taste profile – the API that represents musical activity on our servers. It’s sort of the scrobble 2.0: it represents playback activity along with all the necessary context around it – your behaviors and patterns, your collection, your usage across services and maybe even domains. We’re even publishing APIs to do bulk analysis of that activity to surface attributes like “mainstreamness” or “taste-freeze,” the average active year of your favorite artists. This is more than activity mining as collaborative filtering sees it: it’s understanding everything we can about the listener, well beyond just predicting taste from purchase or streaming activity. All of these attributes and analyses might be part of the final frontier of music recommendation: knowing enough to really understand both the music and the listener it’s directed to.
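
As an illustration only (the event format and field names here are invented for the sketch, not the taste profile API), computing something like taste-freeze from raw playback activity is simple once you have the data:

from statistics import mean

def taste_freeze(events, artist_active_year, top_n=50):
    """events: playback records like {"artist": "...", "plays": 12}.
    artist_active_year: artist name -> the year that artist was most active."""
    plays = {}
    for e in events:
        plays[e["artist"]] = plays.get(e["artist"], 0) + e["plays"]
    favorites = sorted(plays, key=plays.get, reverse=True)[:top_n]
    return mean(artist_active_year[a] for a in favorites if a in artist_active_year)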

– Brian (brian@echonest.com)

Thanks to EVB for the edits


  1. I know they do a ton of acoustic analysis there but I don’t know if it’s used in radio or similar artists/songs. Probably.  ↩

  2. They use Gracenote’s CDDB data as well but I don’t know to what extent it appears in Genius.  ↩

  3. You can even claim that since you are listening to the Beatles, you are also listening to the individual members at the same time but let’s not get too ahead of the curve here  ↩

  4. Whenever I say NLP I mean the real NLP, the natural language processing NLP, not the creepy pseudoscience one.  ↩

  5. Released eventually as the Earworm example in Echo Nest Remix.  ↩

Music data talk at Velocity EU

I gave a talk at Velocity EU in London a couple of weeks ago on how some pieces of the Echo Nest work, operationally:

(Direct link on speakerdeck)

We started Echo Nest seven years ago, and I couldn’t imagine a stranger time to begin a technology-focused venture. We began with two self-built 2U rack machines in a closet at MIT and within two years were moving everything to first dozens and then almost thousands of virtualized hosts running on a few different cloud providers. And then a year or two ago we pulled it all back to physical hardware again. We’re in a weird spot between offline data processor and real-time API provider, and none of the oft-repeated hype platforms ever worked for us.

We’ve been around before and during much of Hadoop, Solr, “NoSQL”, EC2, API-as-PR, the rise of mobile, Apple on Intel, Hacker News and sharded MySQL as a key-value store. And we’re still at it, making money in an industry that would rather keep it to itself. If it’s not obvious, I’m really proud of what we built and of the small team that works their asses off to keep it running while making some really cool new stuff.

There are a lot of stories to tell and this is just the first one, on how we’ve abused text indexing to do quite a lot of our work. If you’ve ever wondered how the musical data soufflé avoids burning, here it is. Thanks to John & Kellan for giving me a chance to present this.

1980s pen plotters of the future

Early 1980s pen plotters are amazing tools that are still very useful today. There’s something completely transfixing about a mechanical device moving an actual pen on paper versus the smelly black box of the laser printer. And if you’re trying to draw lines or curves, or like the effect of actual ink touching paper (not sprayed on in microdots), there’s no other way. Luckily, there are some great tools out there for making plotters work with modern hardware and modern file formats (PDFs), and the hardware itself, while finicky and aging, is cheap.

Hardware

You’re going to have to start with the right hardware. The Chiplotle! plotter FAQ is great for this; I’ve added my notes below:

  • A USB-to-serial adapter or a serial port on your computer. I’ve had trouble using Chiplotle! with the very common Prolific serial adapters (it might be the drivers), but if you have something that works it’ll probably be fine. I use an ex-Keyspan one.
  • Any HPGL-compatible pen plotter. eBay is almost always your best bet, unless you live near the MIT Flea Market. Make sure the plotter supports HPGL over a serial connection; if it says HPIB or GPIB, do not get it. Very common eBay finds are the Roland DXYs or the HP 7475a. If in doubt, check the Chiplotle! list of supported plotters (although keep in mind similar model #s will also work; for example the DXY-1150 and 1150a act as a DXY1300 in Chiplotle!.) I normally pay $50 or so for a single-sheet plotter like the DXY-1150. Make sure you get a power supply with the plotter; that’s often the hardest thing to find, and none of them use anything standard.
  • A plotter serial cable. The only place you’ll need to pull out a soldering iron. You can buy plotter serial cables on eBay but it’s easier to just make your own from a DB25 male to DB9 female cable. You have to re-route a few wires but it’s easy to do.
  • Pens & paper. Your eBay plotter will likely come with a plastic bag full of dried-out pens or, if you’re lucky, boxed ones. There’s a huge variety of pen types (for different papers or thicknesses, felt vs. fine point), or you can fashion your own if you’re wily. For paper, the average desktop plotter can take up to 11 x 17" paper or just normal printer paper (make sure to set the DIP switches on the back if you don’t use the full size.) I have a nice stack of artist vellum with matching vellum pens.

Have all that? Now, find a place to put the plotter, run the plotter cable from the serial port on the back of the plotter to your USB adapter, and load up some paper. Depending on the plotter, the paper might be held in place electrostatically, or the bed may be magnetized and expect a metal tab to hold the paper down (which you obviously did not get in the eBay shipment; a metal ruler or something similar works instead.) If it’s a wide-format plotter, the paper will have to be on a roll and the vertical axis is driven by the roller mechanism, like a receipt printer (these are notoriously fiddly; I would avoid them unless you really need 3-foot-wide paper.) You then load up a pen (in the pen holder off to the side, not the plotter head – the beginning of every HPGL file tells the plotter to pick up the pen from the holder.) Now, let’s plot.

Software

In 2008 I was bitten by the plotter bug all of a sudden. I was trying to draw a smooth bezier curve robotically and was looking at various servo or motor solutions when I stumbled on the community of folks who have adapted Roland pen plotters into vinyl-cutting CNC machines. I found myself intensely bidding on my first plotter against a familiar eBay username. After I lost, I confirmed my suspicions: I was in competition with my dear friend Douglas Repetto of CMC & dorkbot fame. Not only was he also independently plotter-crazed, he was working on a Python module for HPGL control called, well, Chiplotle!. Maybe there was something in the water that week.

Chiplotle!

Chiplotle is obviously the best and only way to reliably control a plotter from a modern computer. It does quite a lot of work for you: it manages the command set of your plotter, buffers output so it doesn’t overflow and start drawing random straight lines, provides an interactive terminal where you can “live draw,” and handles a bunch of other necessary stuff. Although you can import chiplotle into your own program and control your plotter programmatically, I tend to use just one piece of Chiplotle – the plot_hpgl_file script that it installs.
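
If you do want to drive the plotter from your own code, the Chiplotle hello-world looks roughly like this. I’m going from memory of its tutorial, so treat the exact names as an assumption and check the Chiplotle docs:

# Roughly the Chiplotle hello-world, from memory of its tutorial -- verify the
# function and module names against the Chiplotle documentation before relying on this.
from chiplotle import *

plotter = instantiate_plotters()[0]    # auto-detects the plotter on your serial port
plotter.write(shapes.circle(500))      # draw a circle, 500 plotter units in radius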

HPGL is like the wizened ancestor of PDF: an HPGL file is simply ASCII text commands to draw lines and curves, choose pens, and so on, and it is the plotter’s native language. If you want, you can ignore Chiplotle! altogether and just cat an HPGL file to your serial port at 9600 baud, 8N1. This will work fine for the first few commands, but eventually the plotter’s internal buffer (mine is 512 bytes) will overflow. plot_hpgl_file takes care of all of this. The first time you run it, it will attempt to detect which plotter you have on which serial port. Then it will slowly spit out the HPGL commands and make sure the plotter is acknowledging them.
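
If you insist on skipping Chiplotle!, a crude pyserial version of what plot_hpgl_file does might look like this. Note the pacing delay is a guess, and real flow control is exactly the thing Chiplotle! handles for you:

import time
import serial                                  # pyserial

# Stream HPGL in small pieces so the plotter's tiny internal buffer
# (512 bytes on mine) never overflows. The sleep is crude pacing, not
# real handshaking -- tune it for your plotter or just use plot_hpgl_file.
with open("output.hpgl", "rb") as hpgl, \
     serial.Serial("/dev/ttyUSB0", 9600, bytesize=8,
                   parity=serial.PARITY_NONE, stopbits=1, timeout=1) as port:
    while True:
        chunk = hpgl.read(256)                 # stay well under the buffer size
        if not chunk:
            break
        port.write(chunk)
        port.flush()
        time.sleep(0.5)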

My workflow for the project I’m on now is to generate PDF files programmatically using the amazing ReportLab Python PDF toolkit, convert them to HPGL using pstoedit, and plot that. It is as simple as:

python your_pdf_generator.py file.pdf
pstoedit -f hpgl file.pdf > output.hpgl
plot_hpgl_file.py output.hpgl

Obviously I could have my Python program control the plotter directly, but as ink and paper add up, you will want a step in between to make sure your art is OK, and the PDF step is natively viewable on any platform. Since PDF and HPGL share a lot of common ancestry, the curveTos, lineTos and moveTos carry over with no loss of quality. There’s no rasterizing step: if you generate a curve programmatically with ReportLab, it will be the same curve on the paper in the plotter.
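
For reference, the PDF-generating half of that pipeline can be as small as this ReportLab sketch (the coordinates are made up, but the moveTo/lineTo/curveTo calls are the real ReportLab path API):

from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

# Draw one bezier curve and one straight line into a PDF. pstoedit then turns
# these path commands into HPGL with no rasterizing step in between.
c = canvas.Canvas("file.pdf", pagesize=letter)
p = c.beginPath()
p.moveTo(72, 72)                        # units are points (1/72 inch)
p.curveTo(150, 400, 350, 400, 500, 72)  # one smooth bezier curve
p.moveTo(72, 500)
p.lineTo(500, 500)
c.setLineWidth(0.5)
c.drawPath(p, stroke=1, fill=0)
c.showPage()
c.save()

Run the pstoedit and plot_hpgl_file steps shown above on file.pdf and the same curve lands in ink on paper.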

Owls