Overview
Introduction
On the road to resource optimisation and cluster efficiency, Mesos is a good move for distributed frameworks like Hadoop or MPI (see the paper from Berkeley). But what about other technologies like Storm and real-time processing? It turns out that Storm can run on top of Mesos with a custom Storm distribution published by Nathan Marz : storm-mesos. Let's see how to do it.
Note : another framework is under development at Yahoo and the prototype is available on github : storm-yarn.
Pre-requisites
- Make sure you have JDK 1.6 (see these notes)
- A mesos 0.11.0 cluster up and running (see these notes)
- A quorum of zookeeper 3.4.5 (see admin notes)
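Before going further, it can help to confirm that the command-line tools used in this post are reachable. The following is a minimal sketch, not part of storm-mesos: the helper name `missing_tools` is hypothetical, and it only checks that each command is on the PATH, not which version is installed.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that is NOT found on the PATH. Versions are not checked.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Tools that appear in the commands of this post.
missing=$(missing_tools java git wget mvn tar)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All tools found"
fi
```

The Mesos cluster and the ZooKeeper quorum still have to be verified separately, since this check looks only at local binaries.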
Environment
- OS : Debian 6.0.6 x64
Download
- storm-0.8.2.zip (project site) : distributed and fault-tolerant realtime computation framework
- storm-mesos (from github) : the distribution to run Storm on top of Mesos
- lein (from github) : a shell script to manage Leiningen, a tool to automate Clojure projects
Quick download to your storm home directory :
$ wget https://dl.dropbox.com/u/133901206/storm-0.8.2.zip
$ git clone git://github.com/isnoopy/storm-mesos.git
$ wget https://raw.github.com/technomancy/leiningen/stable/bin/lein
Install lein
- Place it on your $PATH :
$ mv ~/lein ~/bin/.
$ echo 'export PATH=$PATH:/home/storm/bin' >> ~/.bashrc
$ source ~/.bashrc
- Set it to be executable :
$ chmod 755 ~/bin/lein
- Install leiningen (v2.2.0) :
$ lein self-install
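When a line such as the PATH export is added to ~/.bashrc, quoting matters: inside double quotes `$PATH` is expanded at the moment `echo` runs, while single quotes keep it literal so that it is expanded at each login. The following is a minimal sketch of an append that is also safe to run twice; the helper name `append_once` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact
# line is not already present.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -- "$line" "$file"; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For example, `append_once 'export PATH="$PATH:$HOME/bin"' ~/.bashrc` writes the export once, however many times the setup is repeated.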
Prepare storm-mesos
- Copy mesos-0.11.0.jar and protobuf-2.4.1.jar from the mesos build into the lib/ folder :
$ cd storm-mesos
$ mkdir lib
$ cp ../mesos-0.11.0/build/protobuf-2.4.1.jar lib/.
$ cp ../mesos-0.11.0/build/src/mesos-0.11.0.jar lib/.
- Copy the storm distribution into the lib/ folder :
$ cp ../storm-0.8.2.zip lib/.
- Update the description (version number etc.) of storm-mesos in project.clj
- Update the storm configuration file storm.yaml. For this post, every daemon runs on localhost and the configuration is the following :
## Default configuration for standalone mode (every daemon on one node with default settings)
# Path to mesos distribution built properly
java.library.path: "native:/usr/local/mesos-0.11.0/build/src/.libs"
# hostname:port of the mesos master node
mesos.master.url: "localhost:5050"
# in cluster ENV, change it to a globally accessible directory (HDFS or NFS etc.)
mesos.executor.uri: "/usr/local/storm-mesos-0.8.2-SNAPSHOT.tgz"
# hostnames of zookeeper nodes (the port is set separately with storm.zookeeper.port, default 2181)
storm.zookeeper.servers:
- "localhost"
# hostname of nimbus node
nimbus.host: "localhost"
# full path of storm local working directory
storm.local.dir: "/usr/local/storm-local"
Note : at this point storm-mesos-0.8.2-SNAPSHOT.tgz has not been built yet. However, the storm configuration file needs its location now, in mesos.executor.uri. The file storm-mesos-0.8.2-SNAPSHOT.tgz is the same for every node. In standalone mode a local path is enough, but in cluster mode a globally accessible path (HDFS or NFS for example) is required.
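Because mesos.executor.uri is set before the archive exists, it is easy to end up with a path that does not match the deployed file. The following is a minimal sketch for checking this in standalone mode, where the value is a local path. The helper names `yaml_value` and `check_executor_uri` are hypothetical, and the parsing handles only flat, single-line `key: value` entries like the ones above; it is not a YAML parser and cannot check an HDFS location.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a top-level "key: value"
# entry from a simple YAML file, with surrounding double quotes removed.
yaml_value() {
    key=$1
    file=$2
    # Escape dots in the key so that they match literally in the regex.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    sed -n "s/^${pattern}:[[:space:]]*\"\{0,1\}\([^\"]*\)\"\{0,1\}[[:space:]]*\$/\1/p" "$file" | head -n 1
}

# Hypothetical helper: report whether the executor archive named in
# the given storm.yaml exists as a local file.
check_executor_uri() {
    uri=$(yaml_value mesos.executor.uri "$1")
    if [ -f "$uri" ]; then
        echo "executor archive found: $uri"
    else
        echo "executor archive missing: $uri"
        return 1
    fi
}
```

Running `check_executor_uri storm.yaml` after the deployment step should report the archive as found; before the build it is expected to report it as missing.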
- Update dependencies :
$ lein deps
- Create the pom.xml :
$ lein install
- Compile with maven :
$ mvn clean compile install
- Build storm-mesos :
$ ./bin/build-release.sh lib/storm-0.8.2.zip
- Deploy the distribution to the path declared in mesos.executor.uri of storm.yaml :
$ sudo cp storm-mesos-0.8.2-SNAPSHOT.tgz /usr/local/.
- Unpack the distribution to run the Nimbus (this operation is done only on the master node running the Nimbus). For standalone mode, unpack it in the home directory:
$ cd ~
$ tar xzvf storm-mesos/storm-mesos-0.8.2-SNAPSHOT.tgz
- Ensure that zookeeper is running on localhost and run the nimbus as root :
$ cd ~/storm-mesos-0.8.2-SNAPSHOT
$ sudo ./bin/storm-mesos nimbus
- Mesos UI overview (standalone) : http://localhost:5050
- Run the storm ui :
$ cd ~/storm-mesos-0.8.2-SNAPSHOT
$ ./bin/storm-mesos ui
- Storm UI overview (standalone) : http://localhost:8080
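The build and deployment steps above (from `lein deps` to copying the archive into /usr/local) can be chained so that a failure stops the sequence. The following is a minimal sketch under that assumption, meant to be run from the storm-mesos directory; the helper names `run` and `build_and_deploy` and the `DRY_RUN` variable are hypothetical and are not part of storm-mesos.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build and deployment commands from this post, in order.
# The && chain stops at the first command that fails.
build_and_deploy() {
    run lein deps &&
    run lein install &&
    run mvn clean compile install &&
    run ./bin/build-release.sh lib/storm-0.8.2.zip &&
    run sudo cp storm-mesos-0.8.2-SNAPSHOT.tgz /usr/local/.
}
```

Setting `DRY_RUN=1` before calling `build_and_deploy` prints the five commands without executing them, which is a cheap way to review the sequence before a real run.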