H1B denial and stuff


Hey guys!

Been a relatively long time since my last post. I have been a bit busy – moved to a new city and a new team at NetApp. Why did I do that? I had to, since USCIS denied my H1B visa for my last role at NetApp. Why did they do that? My theory is that it’s mostly random. Two people at my company with the exact same job profile and the exact same lawyers – one person gets their H1B approved while the other gets theirs denied. You can probably guess that I am the person whose H1B got denied in that story. My take on all this is that USCIS doesn’t really follow any strict, set criteria for approving or denying H1Bs. A lot depends on which officer is processing your application. This is not ideal, but I got no say in it! Anyway, I got my H1B approved this time with my new job profile.

I was going to join MongoDB when I got my H1B denial but later decided to stay back at NetApp. That was mostly because of my wonderful ex-manager Jeff, who was super helpful and understanding and found me an equally good position within NetApp. Now I work on a data migration tool called XCP – think of it as rsync on steroids. I live in Santa Clara and my office is not too far either.

But the worst part of living in the Bay Area is the rent you end up paying! My rent shot up by almost 2.5x compared to what I was paying in NC. So, that kind of sucks but otherwise, I am still exploring the Bay Area and it isn’t too bad 🙂

More later! I am at a coffee house writing this post and just drank some really good coffee!

P.S. If anyone’s got questions around H1B denials and stuff, I am your man!



“The sea’s only gifts are harsh blows, and occasionally the chance to feel strong. Now I don’t know much about the sea, but I do know that that’s the way it is here. And I also know how important it is in life not necessarily to be strong but to feel strong. To measure yourself at least once. To find yourself at least once in the most ancient of human conditions. Facing the blind deaf stone alone, with nothing to help you but your hands and your own head.”

~Christopher McCandless

Coin change problem


This question has been asked a lot in interviews and is one of the classical dynamic programming questions.

Problem statement: Given a bunch of coins of various denominations, find the number of ways they can sum up to a given value. Assume that there’s an infinite supply of each coin.

For example:
Denominations = {1, 2, 3} and Sum = 4. Possible ways = {1, 1, 1, 1}, {1, 1, 2}, {1, 3}, {2, 2}.

Without going into any more details, I am just going to give you the code on how to do that using:

    1. Brute force
    2. Memoization (top-down)
    3. Tabulation (bottom-up)

LINK to the code.

If you have any questions, do let me know in the comments section.

Happy coding!

Kadane’s algorithm


If you have seen interview questions like “find the maximum sum of any contiguous sub-array in an array”, then you should read further.

We have discussed a few dynamic programming questions on this website. This post talks about yet another one (arguably). Let’s take an example – in the array {-2, -3, 4, -1, -2, 1, 5, -3}, what is the maximum sum that you can obtain by summing up any contiguous elements in it? Well, you can try every possible sub-array and build an O(n*n) solution to do that. But can we do better? Let’s see what Mr. Kadane proposed.

For every i, let’s try to calculate the maximum sum possible up to that i.

Array = {-2, -3, 4, -1, -2, 1, 5, -3}; i will go from 0 to 7. Let’s keep two variables:
sum_at_i = 0
max_sum = 0

At i = 0;
sum_at_i = sum_at_i + array[i] = -2
Since sum_at_i < 0, we re-initialize:
sum_at_i = 0
max_sum = 0

At i = 1;
sum_at_i = sum_at_i + array[i] = -3
Since sum_at_i < 0, we re-initialize:
sum_at_i = 0
max_sum = 0

At i = 2;
sum_at_i = sum_at_i + array[i] = 4
Since sum_at_i > 0 and sum_at_i > max_sum; we make:
sum_at_i = 4
max_sum = 4

At i = 3;
sum_at_i = sum_at_i + array[i] = 4 – 1 = 3
Since sum_at_i > 0 but sum_at_i < max_sum; we don’t change max_sum:
sum_at_i = 3
max_sum = 4

At i = 4;
sum_at_i = sum_at_i + array[i] = 3 – 2 = 1
Since sum_at_i > 0 but sum_at_i < max_sum; we don’t change max_sum:
sum_at_i = 1
max_sum = 4

At i = 5;
sum_at_i = sum_at_i + array[i] = 1 + 1 = 2
Since sum_at_i > 0 but sum_at_i < max_sum; we don’t change max_sum:
sum_at_i = 2
max_sum = 4

At i = 6;
sum_at_i = sum_at_i + array[i] = 2 + 5 = 7
Since sum_at_i > 0 and sum_at_i > max_sum; we make:
sum_at_i = 7
max_sum = 7

At i = 7;
sum_at_i = sum_at_i + array[i] = 7 – 3 = 4
Since sum_at_i > 0 but sum_at_i < max_sum; we don’t change max_sum:
sum_at_i = 4
max_sum = 7

And there we go! Our answer is 7. (But can you also find the indices of the sub-array?)

You can find the corresponding C++ code for the above algorithm HERE.

Now that you have found the maximum sum, can you also find the minimum sum? Give it a try.


1. GeeksforGeeks 2017. “Largest Sum Contiguous Subarray”. https://www.geeksforgeeks.org/largest-sum-contiguous-subarray/. [Online; accessed 27-March-2018].
2. Kartik Kukreja 2013. “Everything Under The Sun”. https://kartikkukreja.wordpress.com/2013/06/17/kadanes-algorithm/. [Online; accessed 27-March-2018].

How to Control Your Emotions During a Difficult Conversation


Today I read an article on how to maintain your composure and keep calm during an argument. With whatever little experience I have, I have seen people lose their cool more often than not, which comes back to bite them in the arse later.

Have a read in your free time and see if it helps you. I think no matter where you are in your career or personal life, you can learn a few things from the article. It won’t take more than 5-6 minutes to read the entire piece – LINK.

I won’t back down


There are some songs that you instantly connect with, that pump you up, that give you goosebumps. But I also think that the effect of a song on a person depends largely on the mood or the circumstances they are going through. Anyway, while watching Grey’s Anatomy (I think season 8, last episode), I found one such song.

Ugly Numbers


An ugly number is a number whose only prime factors are 2, 3, and 5. 1 is considered to be ugly by convention.

For example, 4 (2×2), 6 (2×3), and 20 (2×2×5) are ugly numbers but 7 (1×7) and 28 (2×2×7) are not.

A simple way to calculate if a number is ugly or not would be to keep dividing a given number by 2, 3, and 5 and if we get 1 at the end, the number is ugly. Consider 300 for example.

300 – keep dividing by 2 for as long as it is divisible -> 150, 75
75 – keep dividing by 3 for as long as it is divisible -> 25
25 – keep dividing by 5 for as long as it is divisible -> 5 -> 1

Since we got 1 at the end, 300 is an ugly number.

So, what’s the time complexity of such an algorithm? It would be O(n×n) with O(1) space complexity.

Can we do better? Whenever optimization comes to my head, I think of one of the two things – dynamic programming or graphs. Let’s take another look at this problem.

1. What’s the first ugly number? It’s 1. Alright, good enough.

2. What’s next? It would be one of the three – 1×2, 1×3, 1×5 i.e. (2, 3, 5). What’s the smallest? It’s 2. Thus, 2 is our next ugly number.

3. What’s next? It would be one of the three – 2×2, 1×3, 1×5 i.e. (4, 3, 5). What’s the smallest? It’s 3. Thus, 3 is our next ugly number. Note that we didn’t multiply 2 by 1 again but instead multiplied 2 by 2. This is because we had already used 2 in step 2. Now that we have used 3 in this step, we will do the same whenever we have to use 3 again in the future.

4. What’s next? Minimum of (2×2, 2×3, 1×5) = 4. Thus, 4 is our next ugly number.

5. We keep repeating the procedure until we reach the nth ugly number we are looking for.

HERE is some code that solves it using both brute force and dynamic programming.

If you want to see the time difference between the two, figure 1 shows the time to calculate 1500th ugly number using brute force (more than 36 seconds) vs. dynamic programming (less than 2 seconds).


Fig 1. Time to find out the 1500th ugly number using brute force vs. dynamic programming

More on dynamic programming later. If you have any questions, let me know in the comments section.


1. GeeksforGeeks 2017. “Ugly Numbers”. https://www.geeksforgeeks.org/ugly-numbers/. [Online; accessed 12-Feb-2018].
2. jmty0083, LeetCode 2017. “Ugly Number II”. https://leetcode.com/problems/ugly-number-ii/discuss/69364/My-16ms-C++-DP-solution-with-short-explanation. [Online; accessed 12-Feb-2018].

Get HTTPS for free!


There are several non-profit certificate authorities out there now that provide free SSL/TLS certificates. One of them is Let’s Encrypt. I recently moved from HTTP to HTTPS using THIS, which basically uses Let’s Encrypt’s certificate authority to issue free certificates.

You will need to edit your Nginx/Apache configuration files later on and also add DNS records (CAA, etc.), but it’s all worth it! After all, you get FREE HTTPS! 🙂

Stay secure!

Problem of Longest Increasing Subsequence (LIS)


Let’s explain the problem with an example. Assume the array is {10, 22, 9, 33, 21, 50, 41, 60}. In this case, there are several increasing subsequences like:

1. {10, 22, 33, 50, 60}
2. {10, 33, 50, 60}
3. {10, 21, 50, 60}
4. {10, 21, 41, 60}
and so on. The idea is to find the maximum length of the increasing subsequence in the main array.

The problem can be efficiently solved using dynamic programming but let’s first try to solve it using brute force. With brute force, we will make all the possible combinations and find out which one is the longest. Let’s start with the first element in our main array (10) and see how that goes.

Starting with 10, we either include it or we exclude it. Let’s include it first, in which case the next element in the LIS has to be bigger than 10. We have 22, which is bigger than 10. We can have two cases here: one, we include 22 in our LIS and proceed further, and two, we don’t include 22 and proceed further. Next, for each of the two cases above, we check the next element, which is 9 in this case. Since 9 is smaller than 10, there is just one case here, which is to not include it. Next, we have 33 in the main array. Again, we have two cases: one, we include 33 and proceed further, and two, we exclude 33 from our LIS and proceed further. We keep doing this till we reach the end of the main array. At the end, we take the maximum achieved by either including or excluding and return that as the answer. The time complexity of this brute force method is O(2^n) and the space complexity is O(n) for the recursion stack.

The problem with the brute force method is that often, the same computation is being done more than once. We can solve this problem using dynamic programming (overlapping sub-problems and optimal sub-structure). Let’s see how we do that in THIS video.

HERE is some code that solves it using both brute force and dynamic programming. Note that there is another method using binary search that solves this problem in O(nlog(n)) time complexity and we may cover that in a later post.

To find out how to print the LIS elements as well, refer to THIS piece of code. The idea is similar; it just needs a small modification and keeping a vector of vectors.


1. Tushar Roy 2015. “Longest Increasing Subsequence”. https://www.youtube.com/watch?v=CE2b_-XfVDk. [Online; accessed 24-Jan-2018].
2. vinod23, LeetCode 2017. “Longest Increasing Subsequence”. https://leetcode.com/problems/longest-increasing-subsequence/solution/. [Online; accessed 24-Jan-2018].
3. GeeksforGeeks 2017. “Dynamic Programming | Set 3 (Longest Increasing Subsequence)”. https://www.geeksforgeeks.org/longest-increasing-subsequence/. [Online; accessed 24-Jan-2018].

Dynamic Programming – Optimal Sub-structure


In THIS post, we covered the overlapping sub-problems part of dynamic programming. Now we will talk about what we mean by “optimal sub-structure”.

Assume we have a graph like below. What is the shortest path from node A to node D? It is A->C->D. The optimal sub-structure property means that if a node (C) lies on the shortest path between two nodes (A and D), then the shortest path between A and D is the shortest path from the source node (A) to the intermediary node (C) plus the shortest path from the intermediary node (C) to the destination node (D). Note that if we were to find the longest path between two nodes, this property does not hold (try drawing a graph and see for yourself why).

Shortest path example

Fig 1. Shortest path problem holds optimal sub-structure property of dynamic programming

In the next few posts, we are going to solve a few DP questions to understand our concepts better.


1. GeeksforGeeks 2017. “Dynamic Programming | Set 2 (Optimal Substructure Property)”. https://www.geeksforgeeks.org/dynamic-programming-set-2-optimal-substructure-property/. [Online; accessed 14-Jan-2018].

Persistent block device naming


In the last couple of months or so, I have been playing with building a Hadoop cluster. This also involved dealing with lots of JBODs and configuring them properly, which is pretty straightforward. To mount a new disk permanently, you generally add an entry in fstab – all good till now. So, my fstab entries looked something like below. See the HDFS mount points.

fstab file

Fig 1. Entries for the mounted points in the fstab file

The problem started when, sometimes, I would reboot my servers and some of them wouldn’t come back up, getting stuck at boot in a safe-mode-like state. Only a root user could then go in and troubleshoot. More often than not, if I rebooted that server again, it came back just fine. But this was really annoying when a bunch of servers wouldn’t come back up every time I rebooted. I decided to do a little research into this.

Storage devices are usually managed by the sd driver. That’s why you see names like sda, sdb, etc. for the disks. The letter that comes after “sd” identifies the device, and the number that comes after that (e.g. sda“1”) identifies the partition on it. Whenever the sd driver detects a device, it assigns it the next available name. What this also means is that when you reboot a server, the sd driver may detect a disk earlier than it did the last time you rebooted, and therefore it can assign it a different name this time. Thus, what was sda1 last time may become sdb4 this time, and the entries you have in fstab won’t work, causing the system to not boot up properly. This can happen for various reasons mentioned in [1].

For all these reasons, you really shouldn’t use sd names when referring to devices in your fstab file. There are several other ways to map the correct device using:

  1. by-label
  2. by-uuid
  3. by-id and by-path
  4. by-partlabel
  5. by-partuuid
  6. Static device names with Udev

I am not going to discuss all of them here except by-uuid, which is what I ended up using. If you issue an “lsblk -f” on your server, you will see the NAME, FSTYPE, LABEL, UUID, and MOUNTPOINT columns. Note down the UUID for the concerned device and add it to your fstab file. It would look something like below.

fstab file

Fig 2. Entries for the mounted points in the fstab file using UUID

Now, reboot your server all you want – the UUIDs for the disks won’t change, ever, and you will have persistent device naming. This made me realize how small things like these can save you so much time and make sure you are not going crazy when sda1 becomes sdb4 on its own 😀


1. RedHat 2017. “PERSISTENT NAMING”. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/persistent_naming. [Online;accessed 22-Nov-2017].

Dynamic Programming – Intro


I’ll confess – I always had (or maybe still have) issues solving dynamic programming (DP) questions. I mean, it’s just complicated, not as straightforward to understand. And how do we even know that a given question can be solved using dynamic programming?

There are two things to look for in order to understand if a given problem can be solved using dynamic programming.

  1. Are there “overlapping sub-problems”?
  2. Is there a defined structure or a “optimal sub-structure”?

If the above two conditions hold true for any given problem, it can usually be solved using dynamic programming. In this post, I will talk about overlapping sub-problems and the two common approaches to solve such problems. In the next post, I will talk about what is an optimal sub-structure.

Think of how a Fibonacci number is calculated using recursion. Let’s take the example of fib(5). In calculating fib(5), we end up calculating fib(3), fib(2), and fib(1) more than once. If we were to calculate a much bigger Fibonacci number, there would be a lot of redundant calculations. If there was a way to calculate things only once and look them up as needed (look-up of O(1)), we would save a lot of time!

There are two ways to calculate things only once: a) have a look-up table and build it on demand, also called memoization or the top-down approach; or b) have a look-up table and build it completely beforehand, also called tabulation or the bottom-up approach.

HERE is some code that calculates the fibonacci series using recursion, memoization, and tabulation.

The figure below shows the time taken to calculate the 45th Fibonacci number using recursion, memoization, and tabulation. It took 11.32 seconds using recursion whereas only 1.71 seconds and 1.43 seconds using memoization and tabulation respectively. Indeed a great performance improvement!

Time to run fibonacci 45

Fig 1. Time taken to calculate fib(45) using 3 different approaches


Reference: GeeksforGeeks 2017. “Dynamic Programming | Set 1 (Overlapping Subproblems Property)”. https://www.geeksforgeeks.org/dynamic-programming-set-1/. [Online; accessed 14-Jan-2018].

GoDaddy to Linode and LAMP to LEMP


So, I have been using GoDaddy’s servers to host this website for over two years now but I wanted to move away from it for several reasons. Some of them being lack of ssh access to a shared GoDaddy server, cost, and flexibility.

Specifically talking about the cost and other factors, below are some numbers.

Name    | Type                  | Cost for 3 months | CPUs | Memory  | Network                      | Storage
GoDaddy | Economy Linux Hosting | $24               | 1    | 512 MB  | Unmetered (conditions apply) | Unmetered (conditions apply)
Linode  | Linode 1GB            | $15               | 1    | 1024 MB | 1 TB                         | 20 GB SSD

It was a no-brainer to move to Linode and I was just being lazy about transferring the site, but I finally did it. Yay! Also, I wanted to use a different web stack – Nginx instead of Apache and MariaDB instead of MySQL. Honestly, I don’t know much about the advantages of one over the other but I still wanted to give it a try. Using Nginx over Apache did prove to be a bit of a learning curve and I had to manually tweak some configurations, but it wasn’t that bad.

Anyway, if you have any questions about the migration, do leave a comment. Btw, I just realized that I had disabled the comments section, which I am going to re-enable now.

Also, btw, I forgot to post about the fact that I started working for NetApp full-time as a Technical Marketing Engineer. Yeah, that position was new for me too but very briefly, I work on things like Hadoop, Spark, big data problems, etc. Anyway, more on that later.

What am I gonna live for?


I haven’t been doing anything productive, I feel, for the last so many weeks. This feeling has started to get more intense of late, especially today. Feels like I am wasting so much of my time when I could be using it to do things I really want! Anyway, after watching Grey’s Anatomy continuously for several hours now, the last song in the last episode of Season 4 just gave me a kick to get off my lazy ass from that couch, listen to the song again, and do something productive! I think I have been missing my music in the last so many months. Good music really helps push me, I think. Anyway, here is the song!

Toilet: Ek Prem Katha


Recently, a friend of mine suggested that we watch this new movie called Toilet: Ek Prem Katha. I do like watching movies on social issues, especially those concerning India, so I agreed. At the end of it, I came out of the movie theater more baffled/irritated than anything else. Was it an entertaining movie with an incredible social message or a hidden propaganda and siding with one of the major political parties in India?

I felt uneasy when Toilet: Ek Prem Katha, a Reliance Company production, started by applauding the Prime Minister and naming him for his “Swachh Bharat Abhiyan”. I admire the initiative of “Swachh Bharat Abhiyan” but specifically naming the current Prime Minister and applauding such schemes as his individual effort seems a bit like buttering to me. It would have been easier to digest had the movie simply stated that it appreciates the steps taken by the government in spreading awareness among the citizens.

The movie started off as a romantic story between Akshay Kumar and Bhumi Pednekar, with songs that I found too many and unnecessary. During the intermission (thankfully a much needed break for my train of thought), I discussed with my friend how Akshay Kumar had recently received a National Award for the movie Rustom. Akshay Kumar is one of the finest actors in Bollywood and has successfully maintained his fan base for decades. However, Rustom, arguably, is not his best work, and he also happens to be a Canadian citizen. It is true that a person does not need to be an Indian citizen to win a National Award. But it raises questions when a person receives the National Award for a sub-par performance in the same year that other hits like Dangal (Aamir Khan) were released, and is also seen as a big supporter of the ruling government (particularly in 2015 during the “award-wapsi” controversy). Given the chain of events, I found it fishy that Akshay Kumar, a Canadian citizen, was given a National Award for a sub-par performance and then played the lead in a movie that commends and praises the Prime Minister for “his” “Swachh Bharat Abhiyan”.

Unfortunately, the second half of the movie did more to confirm my doubts that this movie is pro-government propaganda, by inserting “good ideas” like “demonetization” and “6 million toilets built in the last 3 years”. It must be a pure coincidence that our honorable Prime Minister was elected exactly 3 years ago too! And then, of course, there is a corruption scandal that is supposed to be addressed, a scam that they again specifically mention happened 4 years ago. Yes, there was a toilet scam under the previous government, and I would criticize that as much as I can, but can you connect the dots here? A movie that was supposed to spread awareness among people to not defecate in the open ends up endorsing the Prime Minister, even for things unrelated to the movie, and ridiculing the previous government. Toilet: Ek Prem Katha clearly abuses the power it holds to connect with millions of people and wastes the chance to be a movie that stood up for a social cause without any prejudice.

I smell a rat when a company called Viacom-18 (owned 50% by Reliance), led by Mukesh Ambani, one of the wealthiest people in the world and supposedly a close aide of the Prime Minister, produces a movie led by an actor who received an award he did not deserve. No surprise that this movie is going to run tax-free in all the states led by the current government. I feel like a stupid, helpless, and gullible Indian when I watch mainstream movies like these that are meant to subtly manipulate us by endorsing some political party. We sure are better than this! Watching the movie reminded me of the quote “If you state a lie often enough, it becomes the truth”. Was this movie supposed to create a good image of the government among its citizens? Was it an intentional pro-government propaganda movie produced by Reliance, which is trying to maintain a productive relationship with the current government? It is quite irresponsible, and a shame, that a respected form of art and mass media would side with one political party and try to exploit the audience through such movies.

I hope to be able to praise and give credit where it is due, and I hope that I can rationally criticize where it is needed as well. A bigger hope is to see my fellow Indians be smarter than just taking in whatever such movies feed us, and to start a respectful discussion. No doubt the current government has done some good work, but there are also a lot of areas where it could have done much better. It is high time that we become smart Indian citizens who read a lot, research, discuss, respect one another, and become smart voters rising above the boundaries of religion, region, and caste.

Some random quotes

I am about to erase my whiteboard after a long time, but before I do that, here are some of the quotes I like on there.

“The only place success comes before work is in the dictionary.”

Vince Lombardi

“Judge yourself by how hard you train not by how well you perform.”

Don’t know who said it

“Hard work beats talent when talent fails to work hard.”

Kevin Durant

“When things go well, everyone can pretend to be fantastic.”


“Eventually you get what you want, but you must want it enough.”

Don’t know who said it

“When things go wrong in life that is when it truly shows who you are.”


“The axe forgets what the tree remembers.”

African Proverb

More on the Raspberry Pi cluster – concluding post


Alright, so, I have graduated as of writing this post. I should have written up my Raspberry Pi story in more detail earlier, but I have decided to write a little more about it today and conclude.

Cluster of Raspberry Pis

Fig 1. Cluster of Raspberry Pi and Artik boards setup in the lab

So, we were able to set up a cluster of around 10 Raspberry Pis, 9 of which were slaves and one was the master with a public IP. All were connected to a 1G switch, and the setup was not as complicated as we thought in the beginning, partly because I had some prior experience setting up a cluster. We ran our algorithms (I/O intensive and CPU intensive) on the cluster using dummy data and recorded the results. Next, we decided to virtualize the Pis and see the effect on performance. Our initial plan was to install KVM and spin up virtual machines. As it turned out, you cannot install a hypervisor on a Pi (not supported at the kernel level). Next, we explored Docker and got it to work on a single Pi. This was nice! We then started working on getting Docker Swarm up and running, and this turned out to be complicated. Getting Docker containers to talk to each other in a cluster is not trivial and, surprisingly, most of the documentation we found was not very helpful. Eventually, we were able to run the same algorithms on the Pis, each running a Docker instance. We expected the performance to take a significant hit, but that was not the case – the numbers we got with and without virtualization were comparable.

The last part of our project involved measuring the energy consumed (in watts) in running the cluster of Pis and what could be done to minimize it. Most of our PCs are tuned to run at the maximum CPU speed. So, for example, a PC with a CPU capable of performing 1,800,000,000 clock cycles per second will always be tuned to run at that speed, and this is defined in the kernel. Running at the maximum CPU speed also means consuming more energy. Btw, this method of varying the power and speed of a computer is called DVFS (dynamic voltage and frequency scaling). A Pi3 model can run at a maximum CPU speed of 1.2 GHz but does not allow setting CPU speeds other than 600 MHz and 1.2 GHz. Hence, we decided to do our testing on the Artik boards (discussed in the previous post). Artik boards have a low-power DVFS control hardware block that allows frequency variation in continuous steps between 500 MHz and 1.3 GHz. We decided to vary this and note the energy consumed using a tool called WattsApp Pro. We ran the Spark code on the Artik cluster from 500 MHz to 1.3 GHz (at intervals of 100 MHz) and noted the time taken to finish the jobs along with the energy consumed. We found that 1.1 GHz was the sweet spot where the cluster consumed the least power while running all the jobs in almost the same time as at top speed.

Our class paper can be found HERE. If you have any questions about anything in this project, please feel free to email me. I have not gone into the details of the project in this post but happy to answer any specific questions you may have.

Attempt to build a Spark cluster on Raspberry Pi1


We started off building a Spark cluster using 10 Raspberry Pi1 model B+ boards, since that is what was available in the lab at that point. For reference, a Pi1 model B+ has 512 MB of memory and a single-core ARMv6 CPU at 700 MHz. We made one of these Pis the Spark master and the rest workers. Spark version v2.1.0-rc1 was used.

The initial tests included simple word count and sorting examples. We expected them to run slowly because of memory constraints, and that is what happened. Running word count and sorting algorithms on a text file of less than 10 words took like an hour. More complicated algorithms never ran to completion. We decided to halt our testing and order Pi3 models. Along with the Pi3 models, we also ordered 5 x Artik10 boards. Each Artik10 board has 2 GB memory, 8 threads, 1.3 GHz, and an ARMv7 Processor rev 3. In comparison, a Pi3 has 1 GB memory, 4 threads, 1.2 GHz, and an ARMv7 Processor rev 4 (v7l).

I am not diving into more technical, low level details on how we setup a Spark cluster but if you have specific questions, please leave comments. I would gladly help.

Building a Spark Cluster on Raspberry Pi


I have been a bit lazy of late in updating this blog. I have decided to write a little in a bunch of posts rather than one super long post, since that scares me 🙂

As a part of my Operating Systems class that I took in the fall of 2016 under Dr. Butt, I worked in a team of 5 on a project involving Fog Computing. The exact title of the project was “Effects of Latency, Virtualization, Frequency, and Security on Fog Computing”. So, just a brief introduction about the project in this post. Will add more details in the next one.


Eraser: A Dynamic Data Race Detector for Multithreaded Programs


As a part of my OS class this semester (Fall 2016), we have been reading a bunch of very interesting papers. I wish I could blog about every paper I read for this class, but I don’t get the time :/ Anyway, here is a brief summary of this paper for noobs.

Paper: Eraser: A Dynamic Data Race Detector for Multithreaded Programs

Authors: Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, Thomas Anderson

Date: 1997

What is it about? The basics….

  1. The paper attacks the problem of data race in multithreaded programs.
  2. Debugging multithreaded programs can be difficult. Basic errors in synchronization can produce timing-dependent data races that can take “several weeks” to debug.
  3. Previous work based on Lamport’s ‘happens-before’ relation checks that conflicting memory accesses from different threads are separated by synchronization events.
  4. The work in this paper improves upon the ‘happens-before’ algorithm by simply checking that all shared memory accesses follow a consistent locking discipline.
  5. The paper argues that for many programs, its approach is “simpler, more efficient, and more thorough” at catching races than the ‘happens-before’ algorithm.
  6. This work also improves upon the work done by Hoare (monitors), in that monitors don’t protect against data races in programs with dynamically allocated shared variables.

Some key things….

  1. The paper proposes a new algorithm to detect data races called “The Lockset Algorithm”.
  2. “The Lockset Algorithm” enforces a locking discipline: some lock protects every shared variable, i.e., a thread must hold that lock whenever it accesses the variable.
  3. It improves upon the locking discipline and extends “The Lockset Algorithm” to accommodate initialization, read-shared data, and read-write locks.
  4. Developed a program annotation to allow users to eliminate false reports.
  5. Experimental stuff

    1. Eraser was implemented for the DIGITAL Unix operating system on the Alpha processor using the ATOM binary modification system.
    2. Applications typically slowed down by a factor of 10 to 20 while using Eraser.
    3. While programming some tests that contained common synchronization errors, Eraser detected the data race errors correctly.
    4. Eraser also successfully tackled large multithreaded servers written by researchers at DEC SRC: the HTTP server and indexing engine from AltaVista, the Vesta cache server, and the Petal distributed disk system.
    5. Eraser found undesirable race conditions in three of the four server programs, and in many of the undergraduate homework problems. It also produced false alarms that were suppressed using annotations.
    6. The authors mention, “Eraser’s testing is not very sensitive to the scheduler interleaving.” To prove this, they did two experiments on Ni2 and Vesta.
The above is only a summary and doesn’t contain all the details mentioned in the paper. For a thorough review, please go through the entire paper. If you have any questions, feel free to email me at fabidi89@vt.edu.

Old days!


Aata hai yaad mujhko guzra hua zamana
Woh baagh ki baharein vo sabka chehchahana

Aazaadiyaan kahan vo apne ghosle ki
Apni khusi se aana apni khushi se jaana

Lagti hai chot dil pe aate hain yaad jis dam
Shabnam ke aansuon par kaliyon ka muskurana

Vo pyaari pyaari soorat vo kaamni si moorat
Aabaad jiske dumse tha mera aashiyaana

Aati nhin sadaein uski mayray qafas mein
Hoti meri rihayi ai kaash mere bas mein

Kya badnaseeb hoon main ghar ko taras rha hoon
Saathi to hain watan mein main qaid mein parda hoon

Aayi bahaar kaliyaan phoolon kee hans rahi hain
Main is andhere ghar mein qismat ko ro rha hoon

Iss qaid ka ilaahi dukhrda kise sunau
Dar hai yahin qafas mein main gham se mar na jaau

Jabse chaman chhoota hai yeh haal ho gya hai
Dil gham ko kha rha hai, gham dil ko kha rha hai

Gaana ise samajh kar khush ho na sunne waale
Dukhe hue dilon ki paryaad se sada hai

Azaad mujhko karde O qaid karne waale
Main be-zubaan hoon qaidi tu chhor kar dua le


I have to die one day


Koi ummed bar nhin aati
Koi soorat nazar nhin aati

Maut ka ek din muaiyyan hai
Neend kyun raat bhar nhin aati

Aage aati thi haal-e-dil pe hasi
Ab kisi baat par nhin aati

Jaanta hoon sawaab-e-taa’at-o-zahad
Par tabiyat idhar nhin aati

Hai kuch aisi hi baat jo chup hoon
Warna kya baat kar nhin aati

Kyu na cheekhun ki yaad karte hain
Meri aawaaz gar nhin aati

Daagh-e-dilgar nazar nhin aata
Bu bhi ai chaargaar nhin aati

Hum wahan hain jahan se humko bhi
Kuch hamari khabar nhin aati

Marte hain aarzoo mein marne ki
Maut aati hai par nhin aati

Kaaba kis muh se jaoge Ghalib
Sharam tumko magar nhin aati

-Mirza Ghalib