Monday, September 24, 2012

Riak Cluster, Sanity Check, Part Two

In this post, we continue our sanity check of the Riak cluster.

Setting up Riak Cluster

I have used HBase, Cassandra, and MongoDB in the past, so I was braced for a long, laborious effort to set up a five-node Riak cluster, but I was pleasantly surprised by how easy it was. I also found the command-line tools extremely helpful for diagnosing cluster status.

Sanity Check

Our sanity check consists of loading 20 million objects into the cluster and then executing a series of Get, link-walking, and free-text search operations across 10 concurrent threads, while physical nodes are brought down and back up in the middle of the performance test.

Use HA Proxy for load balancing and failover

We couldn't get the Java client's cluster client to work for us, so we switched to HAProxy for failover and load balancing, and it has worked out really well.
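For reference, a minimal HAProxy configuration along these lines might look like the sketch below. The node addresses are illustrative; 8098 is Riak's default HTTP port, and /ping is its HTTP health-check endpoint.

```
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend riak_http
    bind *:8098
    default_backend riak_nodes

backend riak_nodes
    balance roundrobin
    option httpchk GET /ping        # Riak answers 200 OK when the node is up
    server riak1 10.0.0.1:8098 check
    server riak2 10.0.0.2:8098 check
    server riak3 10.0.0.3:8098 check
    server riak4 10.0.0.4:8098 check
    server riak5 10.0.0.5:8098 check
```

The httpchk health check is what lets HAProxy stop routing to a node we have taken down, instead of continuing to send it traffic.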

When we brought down a Riak node during the performance test, some of the in-flight queries failed but succeeded immediately when we retried the same method call.
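The retry pattern itself is simple. Here is a minimal, generic sketch (this is not the Riak Java client's API; the helper class and names are illustrative) of wrapping a query in a retry so that HAProxy can route the second attempt to a healthy node:

```java
import java.util.concurrent.Callable;

// Illustrative retry helper: wrap each Get or search call so that a failure
// caused by a node going down is retried, letting HAProxy route the retry
// to a surviving node.
public class RetryHelper {

    // Run the call; on failure, retry up to maxRetries more times.
    public static <T> T withRetry(Callable<T> call, int maxRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e; // likely a connection error from the downed node
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate a query that fails once (node going down) and then succeeds.
        final int[] attempts = {0};
        String result = withRetry(() -> {
            if (attempts[0]++ == 0) {
                throw new RuntimeException("connection refused");
            }
            return "value";
        }, 2);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

In our testing, a single retry was enough to recover every failed in-flight query.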

Test Result

Our sanity check performed flawlessly when physical nodes were brought down during the testing. In-flight queries failed but recovered right after we retried the same query, and there was little degradation from losing a physical node.

However, a big surprise came when we brought a physical node back up during the performance testing: the overall run time increased six-fold. Baffled, I reached out to Brian at Basho, and here is his explanation:
"The reason I ask is the http/pb API listener will start up before Riak KV has finished starting. So, while these nodes will be available for requests they will be slow and build up queues as they start up needed resources. You can use the riak-admin wait-for-service riak_kv <nodename> command from any other node to check if riak_kv is up."

My follow-up question:
"Yes, I have my performance test running, through a HA proxy that maps to 5 nodes. I understand that the initial response from the starting up node will be slow, but it is impacting the entire performance, which consists of 10 testing threads. I would rather have the node not accepting request until it is ready than accepting requests but queuing them up to drag down the cluster performance."

And Brian's response:
"I completely agree, there is currently work being done on separating the startup into two separate processes. One which will reply with 503 to all requests and then dies when KV is started and ready to take requests. Currently, even if a request is not sent to a node directly, in the case of a PUT, the data will be forwarded to the node even though KV is not ready to accept it. Depending on your W quorum this could result in a long response time for requests as the coordinating node waits for a response from the starting up node.  Currently there is no method to make a node 'invisible' to other nodes in the cluster without bringing it down or changing the cookie in vm.args."

Summary

We are pleased with our overall sanity-check results, but there are definitely kinks that need to be worked out, and I am glad the Riak team is on top of things.
