
Discoveries

I never thought of research as a really engrossing activity until you actually discover or uncover facts you never suspected were there. I can only speak about my thesis, since I have just begun analyzing the results I have been gathering for about a semester now.

It seems that much of what has been written about load balancing and processing distribution rests on a couple of stringent assumptions that may or may not hold in real life: first, that the processors are identical (the setup is homogeneous), and second, that global balance is quantifiable. The first assumption simplifies the task of load balancing -- it keeps the problem as close as possible to bin packing. In reality, though, the bins don't all empty at the same rate, and the rate is not constant. Capacity, on the other hand, is effectively infinite (a processor can keep working for as long as it performs as expected), but that hardly changes the setting of the problem.

So in the bin packing problem, where you want to fit items of different sizes into bins while keeping the sum of the sizes in each bin as small as possible, you'd probably think there should be a simple algorithm to solve it. I have read about the classic bin packing heuristics, including a variant that re-sorts the bins as they are being filled. The assumption there, however, is that you know the sizes of the items beforehand, and that there is a finite number of items to distribute among the bins. The finiteness assumption is necessary; otherwise, how you put the items into the bins would hardly matter. Knowing the sizes of the items in advance sounds normal in real life -- but in the computing world, you never know what you have at hand.

I chose the prime number finder as a test problem for my proposed algorithm primarily because of the difference in the amount of work required as you move from one number to another. That may not make sense yet, but I'll try to outline it as simply as I can.

To find the prime numbers from 1 to N (where N > 5), you can either use the sieve of Eratosthenes OR check each and every number from 1 to N for primality. If N is really large, the sieve is faster but uses more memory than checking each number individually. Trial division, however, takes time exponential in the number of digits in the worst case (there are faster primality tests, but those are not my concern here). Aside from that, checking whether x is prime (where 3 < x < N - 2 and x is ODD) can take a very different amount of time than checking, say, x + 2.
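For reference, the sieve mentioned above trades per-number work for one big array: cross out the multiples of each surviving number, and whatever is left is prime. A minimal sketch (the function name is my own):

```c
#include <stdlib.h>

/* Sieve of Eratosthenes: returns how many primes are <= n.
 * Uses one byte per number, so memory grows linearly with n --
 * the memory cost the text above refers to. */
int count_primes_upto(int n) {
    if (n < 2) return 0;
    char *composite = calloc(n + 1, 1);
    int count = 0;
    for (int i = 2; i <= n; i++) {
        if (composite[i]) continue;
        count++;                       /* i survived: it is prime */
        for (long j = (long)i * i; j <= n; j += i)
            composite[j] = 1;          /* cross out multiples of i */
    }
    free(composite);
    return count;
}
```

For example, `count_primes_upto(100)` returns 25.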

To check whether x is prime, there are three basic steps: 1) if x < 2 it is NOT prime, while if 2 <= x <= 3 it is PRIME; 2) if x is even it is COMPOSITE; 3) check whether x is divisible by any ODD number from 3 up to the ceiling of the square root of x (CEIL(SQRT(x))) -- if it is, it is COMPOSITE, otherwise it is PRIME. In C it would be:

int isprime(int x) {
    int i;
    if (x < 2) return 0;          /* 0 and 1 are not prime */
    if (x <= 3) return 1;         /* 2 and 3 are prime */
    if (x % 2 == 0) return 0;     /* even numbers > 2 are composite */
    for (i = 3; i * i <= x; i += 2)   /* odd candidates up to sqrt(x) */
        if (x % i == 0) return 0;
    return 1;
}

NOW, determining whether 99983 is prime takes more time than, say, determining whether 16 is. Then think of all the numbers from 1 to 10 million as the items to be distributed into P bins... do you sort the numbers by value, or can you know beforehand how long it will take to determine whether a number is prime? How, then, do you distribute the numbers among the P bins so that the most work gets done in the shortest time (i.e. the distribution is optimal)?

Dividing the 10 million numbers evenly among the slaves in a master-slave architecture doesn't necessarily achieve global load balance, because the slaves don't do the same amount of work: each one deals with different numbers. A mesh wouldn't help either, because as in the master-slave model, each node in the mesh has different numbers to deal with. And while we all know that half the numbers are even, the number of primes in an interval isn't easy to calculate beforehand (I might be wrong, but I haven't thought it through yet).

So what am I getting at here? Based on my results, automatic load balancing (give a node some work, let it run until it needs more, then hand it the next batch) is a feasible solution to problems like this, BUT the number of nodes and the amount of work given to each node have a significant effect on the overall performance of the system. Even distribution of items in problems like these does not necessarily yield the best results in terms of time and resource maximization (throughput).

And it's so nice to know that you have data to back up your claim.

Keep chilled... Until next time.

(Technical Paper pending, but should be available to interested parties.)
