Solr Tips Of The Day

(Part 1 of many from my Fake Solr Consultancy Service. I plan to call it Lexatexy or BARISMO.) Oh, and I will only consult for coffee companies or yacht club membership sites. In my dream world LEXATEXY does not do actual work; he says “Hm” a lot and is not under any deadlines or commitments. He is the text retrieval kin of Steve Miller’s titular “Joker.”

YOUR COMMIT RULE OF THUMB

LEXATEXY: Add a single test document to your index. Commit. Does it take longer than 10s?

Happy customer: No, it took 10 minutes.

LEXATEXY: Hm. How many documents do you have?

Happy customer: 10 million.

LEXATEXY: That’s a lot for a single index, but it still shouldn’t take 10 minutes. Do you ever optimize?

Happy customer: I tried that once and I couldn’t query for four hours.

LEXATEXY: Rsync your data somewhere else, boot a new server on it, optimize it there, then sync it back.

Happy customer: Wow, that is annoying. And hey, when I do it I get merge exceptions.

LEXATEXY: Yeah, that sucks, but now your commits are better, I bet. I bet what happened is the server crashed and restarted while you were adding lots of documents, and you have duplicates, and the only way out is to re-index. (Did you know this bug is fixed as of Solr 1.3? That wasn’t my problem today.) You can try the java org.apache.lucene.index.CheckIndex tool with -fix on, but back up the index first, because -fix throws away any segment it can’t read, documents and all. Then you should be fine for a while. Until it happens again.
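
For anyone who wants to try LEXATEXY’s commit rule of thumb without a stopwatch, here is a rough Python sketch. It is not anything official: the URL, the "id" field name, and the helper names (post_xml, time_test_commit, commit_is_healthy) are all my assumptions, and it presumes a single-core Solr with the XML update handler at /solr/update and a schema whose only required field is "id".

```python
import time
import urllib.request
from xml.sax.saxutils import escape

# Assumption: single-core Solr with the XML update handler at this URL.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def post_xml(xml, url=SOLR_UPDATE_URL):
    """POST one XML update message to Solr and return the raw response body."""
    req = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def time_test_commit(post=post_xml, doc_id="lexatexy-test-doc", clock=time.monotonic):
    """Add one throwaway document, then commit; return the commit time in seconds.

    Assumption: the schema's unique key is a field named "id" and no other
    field is required.
    """
    post('<add><doc><field name="id">%s</field></doc></add>' % escape(doc_id))
    start = clock()
    post("<commit/>")
    return clock() - start


def commit_is_healthy(seconds, threshold=10.0):
    """The rule of thumb: a single-document commit should not take longer than 10s."""
    return seconds <= threshold
```

Against a live server you would call time_test_commit() and feed the result to commit_is_healthy(). The test document stays in the index until you delete it, so pick an id you will not mind seeing again.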
