Uni-processor and Parallelization using Threads

It was a fine evening at IIT Kanpur. Five of us from Samsung Software India Center, Noida, are here for a training course on “Program Optimization for Multi-core Architecture”.

A small discussion started on the use of threads and on using them to parallelize code already implemented on a uni-processor machine. (We were actually using OpenMP. For those who don't know OpenMP, it is a set of compiler directives that make your program parallel by spawning threads, which run on multiple processors when they are available.) To stress the point again: the discussion was about parallelization on a single core, not on multi-core.

There were arguments and counter-arguments, with most people saying that threads can speed up any serial (sequential) program. I was the only person arguing against this. I strongly believed that threads help speed up a program only if it performs I/O. If it does not, the sequential program is always faster than the threaded one.
The argument supporting this was that thread switching always consumes some processor cycles (perhaps on the order of 4 to 100 microseconds per switch).

After some heated discussion we all agreed on one thing. It can be formulated as:

“A 100% CPU-intensive program cannot be sped up just by parallelizing it, given that all other conditions, such as memory access latency and CPU speed, remain the same for the sequential and parallel programs”.

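This follows because, on a single core, the threads cannot actually run at the same time: the total CPU work stays the same, and the overhead of context switching is added on top of it. That overhead will reduce performance, at least by a little.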
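To put rough numbers on this, here is a minimal back-of-the-envelope sketch in Python. It is only a simplified model, not a measurement: the function names, the 1 second of CPU work, and the 1000 context switches are assumptions chosen for illustration, and the per-switch cost uses the 4 to 100 microsecond range mentioned above.

```python
def sequential_time(cpu_work_s):
    """Time for a purely CPU-bound program run sequentially (hypothetical model)."""
    return cpu_work_s


def threaded_time_single_core(cpu_work_s, n_switches, switch_cost_s):
    """Time for the same work split across threads on ONE core.

    The threads cannot overlap on a single core, so the CPU work is
    unchanged and every context switch adds pure overhead.
    """
    return cpu_work_s + n_switches * switch_cost_s


if __name__ == "__main__":
    work = 1.0        # assumed: 1 second of pure CPU work
    switches = 1000   # assumed: number of context switches during the run
    for cost in (4e-6, 100e-6):  # the 4 to 100 microsecond range from the text
        seq = sequential_time(work)
        thr = threaded_time_single_core(work, switches, cost)
        print(f"switch cost {cost * 1e6:.0f} us: "
              f"sequential {seq:.4f} s, threaded {thr:.4f} s")
```

Under this model the threaded version can never be faster than the sequential one. At best it ties, when there are no switches at all.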

This is an important observation when deciding whether to parallelize a sequential program in a uni-processor environment. Programs that are both I/O and CPU intensive are the ideal candidates for threads, because one thread can use the CPU while another is waiting for its I/O to finish.
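The I/O side of the argument can be illustrated with a small sketch. It uses Python's standard `threading` module rather than OpenMP, and `time.sleep` stands in for a blocking I/O call. The task count, the wait time, and the function names are all assumptions made up for this example.

```python
import threading
import time

IO_WAIT_S = 0.05  # assumed duration of one blocking "I/O" call
N_TASKS = 4       # assumed number of independent I/O tasks


def fake_io_task():
    # time.sleep stands in for a blocking read or write; the CPU is idle here.
    time.sleep(IO_WAIT_S)


def run_sequential():
    """Run the tasks one after another; the waits add up."""
    start = time.perf_counter()
    for _ in range(N_TASKS):
        fake_io_task()
    return time.perf_counter() - start


def run_threaded():
    """Run each task in its own thread; the waits overlap."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io_task) for _ in range(N_TASKS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential():.3f} s")
    print(f"threaded:   {run_threaded():.3f} s")
```

Because the waiting threads do not need the CPU, the threaded run should take roughly one wait period instead of four, even on a single core. The exact timings will vary from machine to machine.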