Skip to main content

Discoveries

I never thought about research as a really engrossing activity right until you actually discover or uncover some facts that you never might have thought were there. I could only talk about my thesis because I have just begun analyzing results I had been gathering for a semester or so now already.

It seems that what has been written before about load balancing and processing distribution have a couple of stringent assumptions that may or may not be met in real life: first would be that the processors are the same (setup is homogeneous), and that global balance should be quantifiable. The first assumption is necessary to simplify the task of load balancing -- and stick as much as possible to solving the bin packing problem. However, in reality the bins don't all get emptied at the same rates, and the rate is not constant. The capacity is however, infinite (i.e. a processor can work forever or at least until it can perform as expected) but that hardly changes the setting of the problem.

So in the bin packing problem, if you want to be able to fit in items of different sizes and keep the sum of the sizes in each bin minimal, you'd most probably think that there should be a simple algorithm to solve the problem. I have read about the bin packing algorithm, and a modification which actually involves resorting the bins as they are being filled. However, the assumption here would be that you know the sizes of the items beforehand, and that there is a finite number of items to be distributed among the bins. The assumption of "finiteness" is necessary, otherwise putting the items in any manner into the bins wouldn't matter much. The assumption of knowing the sizes of the items to be put in the bins in real life would sound normal -- however, in the computing world, you never know what you have at hand.

I chose the prime number finder as a test problem for my proposed algorithm because primarily of the difference in the amount of work being done as you move from one number to another. Maybe that didn't make sense, but I'll try to outline it as simply as I can.

To find the prime numbers from 1 to N (where N > 5), you can either use the sieve of Erasthotenes (I hope I spelled that correctly) OR you check each and every number from 1 to N whether that number is prime or not. If N was a really large number, then the sieve would be faster but would use more memory than if you checked the primality of each number. However, checking the primality of a number in the worst case would be in exponential time (there are other ways, but that is not my concern). Aside from that, checking whether x is prime (where 3 < x < N - 2 and is ODD) would take less time than say whether x+2 is prime.

To check whether x is prime, there would be three basic steps: 1) if 0 < x <= 3 then x is PRIME; 2) if x is even then it is COMPOSITE; 3) check whether x is divisible by every ODD number from 3 to the cieling of the sqare root of x (CIEL(SQRT(x))) -- if it is, then it is COMPOSITE otherwise it is PRIME. In C it would be:

int isprime(int x) {
int i;
if ( x <= 3 ) return 1;
if ( x % 2 == 0) return 0;
for (i = 3; i*i <= x; i+=2)
if ( x % i == 0 ) return 0;
return 1;
}

NOW, to know whether 99983 is prime would take more time than say knowing whether 16 is prime. Then think of all the numbers from 1 to 10 million to be the items to be distributed to P bins... do you sort the numbers per value, or would you be able to know beforehand the length of time it will take a number to be determined prime or not? How then would you distribute the numbers being the items among P bins such that you do the most work at the shortest amount of time (i.e. the distribution is optimal)?

Dividing the 10 million numbers evenly among the number of slaves in a master-slave architecture doesn't necessarily achieve global load balance because the slaves don't do the same amount of work knowing that they all have to deal with different numbers. A mesh wouldn't help either because as in the master slave model, each node in the mesh would have different numbers to deal with. And then we all know that half the numbers would be even, but the number of primes in an interval wouldn't be easy to calculate beforehand (i might be wrong, but I haven't thought it out yet).

So what am I getting at here? Based on results, automatic load balancing (giving a node some work, and let it do the work until it needs more, on which time it will be given more work to do) would be a feasible solution to similar problems BUT the number of nodes and the amount of numbers to give each node would have a significant effect on the overall performance of the system. Even distribution of items in problems like these do not necessarily yield the best results in terms of time and resource maximization (throughput).

And it's so nice to know that you have data to back up your claim.

Keep chilled... Until next time.

(Technical Paper pending, but should be available to interested parties.)

Comments

Popular posts from this blog

Futures and Options III: Economics, Journalism, or Computer Science

I realise it's been a year since my previous post on this blog, and I've found myself having very little time to do another "brain dump" on the subject of my early choices in life. With that in mind (and as I'll be traveling again soon) I get to think a little more and reflect on a few of the things that have happened.

Much like the previous post, this one's set in high school -- where I was part of the swimming team, in a band, had been programming with Turbo Pascal, Java, and then C++ later on, and was about to make a choice that would literally change the course of my life. This one is about the choices I made, and the ones that were made for me.

Note: This is part 3 of a series about my early choices in life which have gotten me to where I am today. I would greatly appreciate your feedback and thoughts, as well as for your reading through this series!


Writing Again

It's 2019 and I just realised that I've not written on this blog for a long while. I feel a little bad about this so I'm picking it back up again. More importantly, I've limited my social media to just Twitter (I've deleted all my Facebook-related accounts) and will be writing more on the blog instead of engaging in other social media sites. If you want to reach me directly, you can also reach me through my keybase.io account for encrypted communication. If you have my phone number, you can also contact me through Signal. Quite a number of things have happened in the past few years and here's a quick update on things that I can share:

I've been working on XRay, a function call tracing system now part of the LLVM project. This took a good two and some years of my time at Google.Most recently I've moved to the Chrome Operations Team still here in Google Sydney. I can't give specifics yet of what I'll be working on, so stay tuned.There've been c…

Rant: Despair and Hopelessness

This weekend I had the chance to do a Google+ hangout with my father in the Philippines. He and I don't talk often but we do have a very good relationship. My dad is cool like that. In this hangout we talked about a few things happening in the Philippines and I've gotten the feeling that my homeland is getting ever deeper into economic disrepair, and that the politics to which I've come to be hopeless on is beyond repair. I've wanted to get something off my chest that's been bothering me for a while now, so if you would indulge me please read on.

Background

I grew up in a part of the Philippines where the land is fertile, there are thriving industries, and there's a certain sense of abundance and stability. This part of the Philippines has good schools, good employment opportunities (mostly industrial and service industries), good investment opportunities (real-estate and agricultural), and good potential for growth. This was true when I was young and this is tr…