More on the Raspberry Pi cluster – concluding post


Alright, so, I have graduated as of writing this post. I should have followed up my last story about the Raspberry Pis with more detail on how it went, so today I am writing a little more about it to conclude.

Cluster of Raspberry Pis

Fig 1. Cluster of Raspberry Pi and Artik boards set up in the lab

We were able to set up a cluster of around 10 Raspberry Pis: 9 slaves and one master with a public IP, all connected to a 1G switch. The setup was not as complicated as we had expected, partly because I had some prior experience setting up a cluster. We ran our algorithms (I/O-intensive and CPU-intensive) on the cluster using dummy data and recorded the results. Next, we decided to virtualize the Pis and see the effect on performance. Our initial plan was to install KVM and spin up virtual machines, but it turned out that we could not install a hypervisor on a Pi (it was not supported at the kernel level). We then explored Docker and got it to work on a single Pi. This was nice! We started working on getting Docker Swarm up and running, and this turned out to be complicated. Getting Docker containers to talk to each other across a cluster is not trivial and, surprisingly, most of the documentation we found was not very helpful. Eventually, we were able to run the same algorithms with each Pi running a Docker instance. We expected performance to take a significant hit, but that was not the case: the numbers we got with and without virtualization were comparable.
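To make "comparable" a little more concrete, here is a minimal sketch of how such a comparison could be quantified. It is not the harness we used; the function names `time_workload` and `relative_overhead` are made up for illustration, and the workload is assumed to be any Python callable that runs the job once.

```python
import time
from statistics import median


def time_workload(workload, repeats=5):
    """Run a callable several times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    # The median is less sensitive to a single slow run than the mean.
    return median(samples)


def relative_overhead(bare_metal_s, container_s):
    """Fractional slowdown of the containerized run relative to bare metal.

    0.05 means the container run took 5% longer; a negative value means it was faster.
    """
    return (container_s - bare_metal_s) / bare_metal_s
```

An overhead close to zero for both the I/O-intensive and CPU-intensive jobs is what "comparable" would look like under this measure.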

The last part of our project involved measuring the power drawn by the cluster (in watts), and hence the energy it consumed, and working out what could be done to minimize it. Most of our PCs are tuned to run at the maximum CPU speed. So, for example, a PC with a CPU capable of performing 1,800,000,000 clock cycles per second will typically be tuned to run at that speed, and this is defined in the kernel. Running at the maximum clock speed also means consuming more energy. The technique of varying a processor's voltage and clock speed to trade performance for power is called DVFS (dynamic voltage and frequency scaling). A Pi 3 can run at a maximum CPU speed of 1.2 GHz but does not allow setting CPU speeds other than 600 MHz and 1.2 GHz. Hence, we decided to do our testing on the Artik boards (discussed in the previous post). Artik boards have a low-power DVFS control hardware block that allows the frequency to be varied in steps between 500 MHz and 1.3 GHz. We decided to vary this and note the energy consumed using a tool called WattsApp Pro. We ran the Spark code on the Artik cluster at frequencies from 500 MHz to 1.3 GHz (in intervals of 100 MHz) and noted the time taken to finish the jobs along with the energy consumed. We found that 1.1 GHz was the sweet spot: the cluster consumed the least energy there while finishing all the Spark jobs in almost the same time as at the top speed.
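The selection logic behind that sweep can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code we ran: the `Run` record, the `sweet_spot` helper, and the 10% slowdown tolerance are all hypothetical, and the numbers in the usage example are made-up placeholders rather than our measurements. The one physical fact it relies on is that energy in joules equals average power in watts times time in seconds.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    freq_mhz: int       # CPU frequency the boards were set to
    runtime_s: float    # wall-clock time to finish all jobs
    avg_power_w: float  # average power reported by the meter

    @property
    def energy_j(self) -> float:
        # energy (J) = average power (W) * time (s)
        return self.avg_power_w * self.runtime_s


def sweep_frequencies(lo_mhz: int = 500, hi_mhz: int = 1300, step_mhz: int = 100) -> List[int]:
    """Frequencies to test: 500 MHz to 1.3 GHz in 100 MHz steps by default."""
    return list(range(lo_mhz, hi_mhz + 1, step_mhz))


def sweet_spot(runs: List[Run], slowdown_tolerance: float = 0.10) -> Run:
    """Lowest-energy run among those at most `slowdown_tolerance` slower than the fastest run."""
    fastest = min(r.runtime_s for r in runs)
    acceptable = [r for r in runs if r.runtime_s <= fastest * (1 + slowdown_tolerance)]
    return min(acceptable, key=lambda r: r.energy_j)


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT the measured results.
    example = [Run(500, 100.0, 2.0), Run(900, 60.0, 2.5), Run(1300, 50.0, 4.0)]
    best = sweet_spot(example, slowdown_tolerance=0.25)
    print(best.freq_mhz, best.energy_j)
```

The tolerance is the design choice that matters here: without it, the lowest-energy setting could be one that takes far longer to finish, which is not what "almost the same time as at the top speed" is after.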

Our class paper can be found HERE. If you have any questions about anything in this project, please feel free to email me. I have not gone into the details of the project in this post, but I am happy to answer any specific questions you may have.