Distributed computing, as the name suggests, involves complex computational tasks distributed over several computers. These computers are connected over a network for communication and message passing, and need not belong to a single organisation or region. Distributed computing is a very broad concept, loosely defined to describe any network of computers, of any type or number, that perform individual tasks assigned to them in order to achieve a larger goal. There is usually a master-worker hierarchy. For instance, SETI@home (Search for Extraterrestrial Intelligence) allows anyone with a computer to contribute its processing capability to the project. A distributed computing environment need not connect only physical computers; it can equally consist of multiple processes on a single machine, or even multiple threads of a single program.
Amdahl's Law
Also known as Amdahl's argument, it is used to find the maximum expected improvement to an overall system when only part of the system is improved. In parallel computing, it predicts the theoretical maximum speedup obtainable with multiple processors. In essence, the law lets us probe how much of a computation can feasibly be parallelized, and which portion of it cannot be economically parallelized.
The law is based on the observation that any program consists of two parts: a parallel portion and a sequential portion. The parallel portion, as the name suggests, can be parallelized, while the sequential portion cannot.
Let A be the program and P the fraction of it that is parallelizable; then 1 - P is the non-parallelizable fraction. If the number of processors is N and the execution time of A on a single processor is 1 unit, then
Time to execute sequential portion = 1 - P
Time to execute parallel portion = P/N
Maximum speedup S = 1 / ((1 - P) + P/N)
As N approaches infinity, S approaches 1 / (1 - P).
Try calculating S for various values of P, such as 10%, 50%, 90%.
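As a quick illustration, here is a minimal Python sketch of that calculation; the function name amdahl_speedup and the choice of N = 8 processors are illustrative assumptions, not part of the law itself.

    # Minimal sketch of Amdahl's Law: S = 1 / ((1 - P) + P/N).
    # The function name and the choice of N = 8 are illustrative only.

    def amdahl_speedup(p: float, n: int) -> float:
        """Maximum speedup for parallel fraction p on n processors."""
        return 1.0 / ((1.0 - p) + p / n)

    for p in (0.10, 0.50, 0.90):
        limit = 1.0 / (1.0 - p)   # speedup bound as N grows without limit
        print(f"P = {p:.0%}: S(N=8) = {amdahl_speedup(p, 8):.2f}, "
              f"limit 1/(1-P) = {limit:.2f}")

Note that even with P = 90%, the speedup can never exceed 10x, no matter how many processors are added.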
Based on Amdahl's Law, only highly parallel programs benefit substantially from parallel computing: as the calculation of the speedup S shows, a parallel application can never run faster than the limit imposed by its sequential portion, 1 / (1 - P). The law applies when N is variable while the problem size is held constant. When the problem size is not constant, we apply Gustafson's Law instead.
Gustafson's Law
Gustafson's Law considers the case where the problem size grows with the number of processors. Assume the parallel program runs for 1 unit of time on N processors, of which 1 - P is the sequential portion and P is the parallel portion. The time to execute the sequential version of the same program on a single-processor machine is then (1 - P) + P·N.
Maximum speedup S = (1 - P) + P·N
Let sp = 1 - P be the sequential portion; then S = sp + (1 - sp)·N.
As N approaches infinity, and assuming the sequential portion sp diminishes, S approaches N; in other words, the achievable speedup scales with the number of processors.
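As with Amdahl's Law above, a small Python sketch makes the scaled-speedup behaviour easy to see; the parallel fraction P = 0.90 and the processor counts used here are illustrative assumptions, not values from the law itself.

    # Minimal sketch of Gustafson's Law: S = (1 - P) + P * N,
    # assuming the parallel run takes 1 unit of time on N processors.

    def gustafson_speedup(p: float, n: int) -> float:
        """Scaled speedup for parallel fraction p on n processors."""
        return (1.0 - p) + p * n

    for n in (8, 64, 1024):
        print(f"N = {n:>4}: S = {gustafson_speedup(0.90, n):.1f}")

Unlike the Amdahl limit, this speedup keeps growing roughly in proportion to N, because the parallel part of the workload is assumed to grow with the machine.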