
notes.variogr.am

My name is Brian Whitman. I am a lapsed scientist and sound artist, currently co-founder/CTO at The Echo Nest, a music intelligence company in Somerville, MA. As I work on various scaling and media search problems, with detours into art projects, I'll be posting details here in the hope that I can learn from others. I'd always like to hear from you if you are working on similar things.

May 26th, 2009 @ 6:01 pm

Why is NLTK so slow, people?

AMZN small instance (snail style)

### Took 92.35s to parse 10005 words 351 sentences (76.64% passed.) 0.26s per sentence.

Mac Pro

### Took 26.47s to parse 10005 words 351 sentences (76.64% passed.) 0.08s per sentence.

Where “parse” is pos_tag plus an NP chunker (RegexpParser w/ our own grammar). Most of the time goes to pos_tag.
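If you want to check that split yourself, here is a rough sketch of timing the two stages separately. The `time_stages` helper and the `toy_tag` / `toy_chunk` stand-ins are hypothetical names made up for illustration; this is not the script that produced the numbers above, and it does not try to reproduce the pass-rate figure.

```python
import time

def time_stages(sentences, tag, chunk):
    """Time the tagging and chunking stages separately.

    `sentences` is a list of token lists; `tag` and `chunk` are callables.
    """
    tag_s = 0.0
    chunk_s = 0.0
    words = 0
    for tokens in sentences:
        words += len(tokens)
        t0 = time.perf_counter()
        tagged = tag(tokens)      # stage 1: part-of-speech tagging
        t1 = time.perf_counter()
        chunk(tagged)             # stage 2: NP chunking
        t2 = time.perf_counter()
        tag_s += t1 - t0
        chunk_s += t2 - t1
    return {"sentences": len(sentences), "words": words,
            "tag_s": tag_s, "chunk_s": chunk_s}

# Hypothetical stand-ins so the sketch runs without NLTK installed.
def toy_tag(tokens):
    return [(tok, "NN") for tok in tokens]

def toy_chunk(tagged):
    return [tok for tok, pos in tagged if pos == "NN"]

sample = [["NLTK", "is", "slow"], ["most", "work", "is", "tagging"]]
stats = time_stages(sample, toy_tag, toy_chunk)
total = stats["tag_s"] + stats["chunk_s"]
print("### Took %.2fs to parse %d words %d sentences. %.2fs per sentence."
      % (total, stats["words"], stats["sentences"], total / stats["sentences"]))
print("###   tagging %.4fs, chunking %.4fs" % (stats["tag_s"], stats["chunk_s"]))
```

With NLTK installed, you would pass `nltk.pos_tag` as `tag` and the `parse` method of an `nltk.RegexpParser` built from your own grammar as `chunk`.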
