Monday, September 24, 2012

Riak Sanity Check, Take One

Before we could officially recommend Riak to the management team, we ran a series of sanity checks to make sure that the recommendation was sound and would not come back to bite us in the rear.

Testing Scenario

Our sanity testing environment consists of 20 million objects on a single physical node. After loading the 20 million objects into Riak 1.1, we ran a series of get, link-walking, and free-text search operations from 10 concurrent threads.

We were not interested in the actual performance numbers; we were looking for obvious bottlenecks and abnormal behavior.
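A minimal sketch of this kind of harness is shown here, assuming Python and a stand-in `fake_fetch` function in place of a real Riak client call. The names `run_sanity_check`, `timed_call`, `fake_fetch`, and the `slow_after` threshold are hypothetical and are not taken from our actual test code. It only counts failures and slow requests, since the point is to spot stalls rather than to benchmark.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(fetch, key):
    """Run one request and return (key, elapsed_seconds, error_or_None)."""
    start = time.monotonic()
    try:
        fetch(key)
        return key, time.monotonic() - start, None
    except Exception as exc:  # a timeout or connection error counts as a failure
        return key, time.monotonic() - start, exc


def run_sanity_check(fetch, keys, threads=10, slow_after=1.0):
    """Issue fetch(key) for every key from `threads` concurrent workers.

    Returns a summary dict; the goal is to spot stalls and errors,
    not to produce a benchmark number.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map keeps results in the same order as `keys`.
        results = list(pool.map(lambda k: timed_call(fetch, k), keys))
    failed = [k for k, _, err in results if err is not None]
    slow = [k for k, t, err in results if err is None and t > slow_after]
    return {"total": len(results), "failed": failed, "slow": slow}


def fake_fetch(key):
    """Hypothetical stand-in for a real Riak GET; replace with a client call."""
    if key % 7 == 0:
        raise TimeoutError("simulated read timeout")
    return {"key": key}


if __name__ == "__main__":
    summary = run_sanity_check(fake_fetch, range(1, 50))
    print(summary["total"], "requests,", len(summary["failed"]), "failed")
```

Swapping `fake_fetch` for a function that performs a real get, link walk, or search request would turn this into a rough version of the check described above.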

Test Results

We were surprised by the poor results of our sanity checks: Riak consumed 100% of CPU and 75% of memory, and queries did not return in any reasonable amount of time.

Needless to say, we were somewhat concerned. This is where Riak's excellent support came into play. Riak's Developer Advocate, Brian, worked with us and came up with an excellent diagnosis:

"LevelDB holds data in a 'young level' before compacting this data into numbered levels (Level0, Level1, Level2, …) whose maximum space grows as their number increases. In levelDB before Riak 1.2, the compaction operation from the young level to sorted string tables was a blocking operation. Compaction and read operations used the same scheduler thread to do their work, so if a compaction operation was occurring, that partition would be effectively locked. Riak 1.2 moves compaction off onto its own scheduler thread so the vnode_worker (which is responsible for GET/PUT operations) will not be waiting for compaction to complete before read requests can be serviced by the levelDB backend (write requests are dropped in the write buffer, independent of compaction).

In your scenario, the bulk load operation caused a massive amount of compaction. Most likely, because of the large number of objects you loaded, there was compaction occurring on all 64 partitions for a while after your write operation completed. The input data sat in the write buffer and young level, and eventually compaction moved them to their appropriate levels, but during this time read operations will time out."

So our bulk load caused contention on the single thread that handles both compaction and reads for each partition.
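To make the diagnosis concrete, here is a toy model of one partition, written as a sketch in Python. It is not how Riak or levelDB is implemented; the function `read_latencies`, the task tuples, and the 30-second compaction and 10 ms read durations are all made-up assumptions chosen only to illustrate the queuing effect of sharing one thread versus giving compaction its own thread.

```python
def read_latencies(tasks, separate_compaction_thread):
    """Toy FIFO model of one partition (illustrative only, not Riak code).

    tasks: list of (arrival_time, kind, duration) sorted by arrival_time,
           where kind is "read" or "compaction".
    Returns the latency (finish - arrival) of every read, in order.
    """
    free_at = {"read": 0.0, "compaction": 0.0}  # when each worker is next idle
    latencies = []
    for arrival, kind, duration in tasks:
        # With a shared thread, both kinds of work queue behind one worker.
        worker = kind if separate_compaction_thread else "read"
        start = max(arrival, free_at[worker])
        free_at[worker] = start + duration
        if kind == "read":
            latencies.append(free_at[worker] - arrival)
    return latencies


# Hypothetical workload: one 30 s compaction, then three 10 ms reads.
tasks = [
    (0.0, "compaction", 30.0),
    (1.0, "read", 0.01),
    (2.0, "read", 0.01),
    (3.0, "read", 0.01),
]

shared = read_latencies(tasks, separate_compaction_thread=False)
split = read_latencies(tasks, separate_compaction_thread=True)
print("shared thread:  ", [round(x, 2) for x in shared])
print("separate thread:", [round(x, 2) for x in split])
```

In this model, every read that arrives during the compaction waits for it to finish when the thread is shared, while with a separate compaction thread each read takes only its own 10 ms. With a client-side timeout shorter than the compaction, the shared case would look like the timeouts we observed.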

We can't upgrade to 1.2 just yet because of a bug in search, so we will have to wait for the next release.

Summary

Overall, we are not pleased with the results of our sanity checks, but we are satisfied with the explanation.
