Storm's Major Components
Storm consists of four major components:
- ZooKeeper
- zeroMQ
- jzmq (java bridge for zeroMQ)
- Storm
Start here if one wants to read more about Storm.
Installing Storm in Development Environment
To set up Storm in a development environment, follow the instructions here.
Install Storm Cluster on Mac
Our challenge, however, is to install a Storm cluster on a single Mac. Here is a good article on how to accomplish that, with a single exception: instead of installing the jzmq mentioned in the article, one must install this branch of jzmq for Mac.
More on this topic in later sections.
Start and Test Storm Cluster
Start ZooKeeper
sudo ./zkServer.sh start
Start Storm Cluster
Depending on where one configures Storm to store its temporary data, one might need to start Storm through sudo.
Start Nimbus
sudo ./storm nimbus
Start Supervisor
sudo ./storm supervisor
Start UI
sudo ./storm ui
Check Storm UI and Verify Storm cluster is up and Running
Go to http://localhost:8080 and one should see the Storm UI screen.
Build and Deploy Storm-Starter to Storm Cluster
Download Storm-Starter
Download storm-starter from GitHub:
git clone https://github.com/nathanmarz/storm-starter.git
Build and Package Storm-Starter
cd storm-starter
mvn -f m2-pom.xml package
This will produce storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar under the target directory.
Deploy and Run a Topology
storm jar storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar storm.starter.WordCountTopology WordCountTopology
If everything deploys correctly, one should see the following messages:
[main] INFO backtype.storm.StormSubmitter - Jar not uploaded to master yet. Submitting jar...
10 [main] INFO backtype.storm.StormSubmitter - Uploading topology jar storm-starter-0.0.1-SNAPSHOT.jar to assigned location: /usr/local/var/run/zookeeper/data/nimbus/inbox/stormjar-19e2b1f9-df68-4105-b409-b4190d3d4efa.jar
76 [main] INFO backtype.storm.StormSubmitter - Successfully uploaded topology jar to assigned location: /usr/local/var/run/zookeeper/data/nimbus/inbox/stormjar-19e2b1f9-df68-4105-b409-b4190d3d4efa.jar
76 [main] INFO backtype.storm.StormSubmitter - Submitting topology wordCountToplogy in distributed mode with conf {"topology.workers":3,"topology.debug":true}
431 [main] INFO backtype.storm.StormSubmitter - Finished submitting topology: wordCountToplogy
Now go back to the Storm UI, and the newly deployed topology should show up.
Even though the topology shows up in the UI, this doesn't mean Storm is working properly. We need to drill down to the topology level and verify that the numbers of emitted and transferred messages are greater than zero.
Stop or kill a Topology
./storm kill wordCountTopology
The name passed to kill must match the topology name exactly as it appears in the Storm UI.
Where things could go wrong
Many things can go wrong when setting up a Storm cluster, from an incompatible zeroMQ/jzmq pairing to file permission issues, and they can cause countless hours of frustration and searching the Internet. Here are some of the problems I ran into and how I managed to resolve them.
Log files are your Friend
The log files for nimbus, supervisor, and ui are placed under the STORM_HOME/logs directory. In addition to nimbus.log, supervisor.log, and ui.log, one should find worker-6700.log, worker-6701.log, worker-6702.log, etc. Go through these log files to make sure they contain no errors or exceptions. If there is a file permission error, one should be able to spot it in one of them.
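As a rough aid for this log check, the shell sketch below lists the log files that mention an error or exception. The function name scan_storm_logs is hypothetical and not part of Storm; it assumes STORM_HOME is set to the Storm install directory and that problems are logged with the word ERROR or Exception.

```shell
# Hypothetical helper, not part of Storm: list log files that mention
# an error or exception. Assumes STORM_HOME points at the Storm install.
scan_storm_logs() {
  # Use the directory given as the first argument, else STORM_HOME/logs.
  log_dir="${1:-${STORM_HOME:-.}/logs}"
  # -l prints only the names of matching files; -E allows the alternation.
  grep -lE 'ERROR|Exception' "$log_dir"/*.log 2>/dev/null
  # grep exits non-zero when nothing matches; treat that as a clean result.
  return 0
}
```

Calling scan_storm_logs with no argument checks STORM_HOME/logs; an empty result only means that neither word appears, so it is still worth reading the worker logs by hand.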
zeroMQ and jzmq compatibility
The other hard-to-track-down issues are mostly related to zeroMQ and the jzmq bridge.
Only Use zeroMQ version 2.1.7
If one gets an invalid parameter exception for zeroMQ, the wrong zeroMQ version is used.
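One way to confirm which zeroMQ version is installed is sketched below. The function name check_zmq_version is hypothetical, and the lookup assumes that pkg-config is available and that the zeroMQ install registered a libzmq entry with it; if either assumption does not hold, pass the version in by hand.

```shell
# Hypothetical helper: compare the installed zeroMQ version with 2.1.7.
# Assumption: pkg-config is installed and knows about libzmq.
check_zmq_version() {
  required="2.1.7"
  # An explicit argument overrides the pkg-config lookup.
  found="${1:-$(pkg-config --modversion libzmq 2>/dev/null || true)}"
  if [ "$found" = "$required" ]; then
    echo "ok: zeroMQ $found"
  else
    echo "mismatch: found '${found:-unknown}', expected $required"
    return 1
  fi
}
```

For example, check_zmq_version 2.1.7 reports a match, while any other version string makes the function return a non-zero status.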
Only Use the Correct jzmq for Mac
Instead of installing the jzmq mentioned in the article, one must install this branch of jzmq for Mac.
What happens when my worker process keeps crashing
If the supervisor keeps reporting that the worker process is getting killed, and there is no exception in any of the log files, check for hs_err_pid<pid>.log files (<pid> is the process id) under the STORM_HOME/bin directory or the directory from which Storm was launched. If one finds such files, chances are there is some incompatibility between zeroMQ and the jzmq bridge. The hs_err_pid<pid>.log file should have all the details.
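The search for JVM crash reports can be sketched as a small shell function. The name find_jvm_crash_logs is hypothetical; the directories to search are passed as arguments, and only files directly inside each directory are considered.

```shell
# Hypothetical helper: look for JVM crash reports (hs_err_pid<pid>.log)
# directly inside each directory given as an argument.
find_jvm_crash_logs() {
  for dir in "$@"; do
    # Skip arguments that are not existing directories.
    if [ -d "$dir" ]; then
      find "$dir" -maxdepth 1 -name 'hs_err_pid*.log'
    fi
  done
}
```

A typical call would be find_jvm_crash_logs "$STORM_HOME/bin" "$PWD", which covers both locations mentioned above.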
Another option is to run the worker command by hand and see whether it starts up or crashes.
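To get hold of the worker command, one possibility is to pull it out of the supervisor log. The sketch below rests on an assumption that should be checked against one's own supervisor.log: that the supervisor writes a line containing "Launching worker with command: " followed by the full java command. The function name last_worker_command is hypothetical.

```shell
# Hypothetical helper: print the most recent worker launch command found
# in a supervisor log. Assumption: the log contains lines with the text
# "Launching worker with command: " followed by the full java command.
last_worker_command() {
  log_file="${1:-${STORM_HOME:-.}/logs/supervisor.log}"
  marker='Launching worker with command: '
  # Keep only the text after the marker, then take the last match.
  sed -n "s/.*${marker}//p" "$log_file" | tail -n 1
}
```

The printed command can then be pasted into a terminal and run directly, so that any crash output appears on the console instead of being lost.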