Amdahl's Law and The Maximum Theoretical Speedup Achievable By Parallelism

We’ve been learning about theoretical parallelism in my concepts of parallel and distributed systems class, and I found this concept of a maximum theoretical speedup to be fascinating.

So, I decided to write an article on it.

What is Amdahl’s Law?

Any program can be divided into two parts:

Parallel – This is the portion of the program that can be run simultaneously
Sequential – This is the portion of a program that has to be run sequentially (line by line)

Restaurant Analogy

Imagine you run a restaurant with one person cooking.

So, you are the owner, and you have one chef.

That chef works pretty hard.

But, what if you multiply that chef by 10?

So, you borrow my clone machine and now you’ve got 10 chefs cooking in parallel.

It would be a lot faster than just having one person cooking, wouldn’t it?

You’d expect to get 10x more dishes cooked, due to the 10x more people, right?

We call that “10x more” the speedup.

The speedup is essentially the answer to the question of: “If we add more chefs, how much faster will we get food made?”

* Note that in reality, having 10x more chefs does not necessarily imply that we’ll get 10x more output. This is because in reality, things happen and these 10 more chefs will have to communicate with one another (having only 1 chef didn’t require this communication), spend time getting ingredients, and all sorts of other things that can happen when you have a room with 10 chefs making dishes.

But, it will definitely be faster to have 10 more chefs than to have just 1.

What if we had 100 chefs?

What if we had 1000?

Well, as you add more and more chefs, at some point, adding any more won’t help as much, due to the law of diminishing marginal utility.

That is, 1,000 chefs in a room may have to communicate with each other (imagine how inefficient it would be)!

Maximum Theoretical Parallelism in Computer Science

Now, instead of chefs, let’s say that we have a program, where 50% of it can be parallelized and 50% must remain sequential.

This means that 50% of the program can run concurrently, via multithreading.

Remember how we were able to add 10x more chefs? That’s similar to saying “we can run 50% of the program (the parallelized portion) at the same time in 10 different processors.”

But what if we had 100 different processors? What if we had 1,000?

Just like the chefs, adding more processors will help, but it will create additional overhead.

Just as the chefs have to communicate with each other, the program might have to communicate with many threads.

This communication introduces overhead, which makes the program run slower.

One question is “theoretically, just how many more processors can we add until we can’t speed the program up anymore?”

To answer this question, I introduce to you Amdahl’s Law.

In the context of parallelism in computer science, Amdahl’s Law states that there is a maximum theoretical speedup.

Now, how much we can speedup our program depends on $p$ , the number of processors.

Having 100 processors run our program will yield a faster result than just having 2.

Mathematically,

\displaystyle\psi = \frac{1}{f + \frac{(1-f)}{p}}

Where: $\psi$ is the maximum theoretical speedup

$f$ is the proportion of the program that must remain sequential.

$1-f$ is the proportion of the program that can be parallelized.

In our program from before, we said that 50% can be parallelized and 50% must remain sequential.

Thus, according to Amdahl’s Law, we have:

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{f + \frac{(1-0.5)}{p}}

Say we have $p = 2$ processors.

Then,

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{0.5 + \frac{(1-0.5)}{2}} = \frac{4}{3}

This is around 1.33x.

This means that, if we have a program where $0.5=50%$ can be parallelized, the maximum theoretical speedup we can with $p = 2$ processors that we have is 1.33x.

In simple words, if we applied parallelism into 50% of our code, with 2 processors, it will be at most 133% faster.

Implications of Amdahl’s Law

Amdahl’s Law with 50% parallelism

Now, let’s try to use Amdahl’s Law and some single variable calculus to understand this law a bit better.

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}}

Let’s continue with the proportion of sequential code, $f = 0.5$ .

But what if we try to increase the number of processors, $p$ ?

With $p = 2$ , we had $\psi$ roughly equal to 1.33.

What if we let p = 10?

With 10 processors running parallel on 50% of our code, the maximum theoretical speedup is:

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{0.5 + \frac{(1-0.5)}{10}} \approx 1.81

What if $p = 100$ ?

With 100 processors running parallel on 50% of our code, the maximum theoretical speedup is:

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{0.5 + \frac{(1-0.5)}{100}} \approx 1.98

With 1000 processors running parallel on 50% of our code, the maximum theoretical speedup is:

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{0.5 + \frac{(1-0.5)}{1000}} \approx 1.99

Clearly,

\displaystyle\lim_{p\to\infty} \psi = 2

That means that, theoretically, if we let the number of processors, $p$ approach $\infty$ , the maximum theoretical speed up, according to Amdahl’s Law, is 2.

In other words, even if we had an infinite number of processors, we could theoretically increase the speed of our program by 2x.

Increasing the Parallel Portion

Now, what if we were to increase the part of our code that could be run in parallel?

Let’s say that the proportion of sequential code, $f = 0.1$ , and therefore $f = 0.9$ .

Thus, 10% of the code can run sequentially, but 90% of it can run in parallel.

With $p = 2$ processors…

\psi = \displaystyle\frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{0.1 + \frac{(1-0.1)}{2}} \approx 6.66

Wow!

That means that, with 2 processors, if 90% of the code is parallelized, we can speed up our program by 6.66x!

Previously, with 2 processors and 50% parallelism, our speedup was only 1.33.

With 1000 processors running parallel on 90% of our code, the maximum theoretical speedup is:

\psi =\displaystyle \frac{1}{f + \frac{(1-f)}{p}} =\frac{1}{0.1 + \frac{(1-0.1)}{1000}} \approx 9.99

As you can see,

\displaystyle\lim_{p\to\infty} \psi = 10

Key Takeaways of Amdahal’s Law

Thus, Amdahl’s Law teaches us that increasing the proportion of code that may be parallelized results in a higher maximum theoretical speedup than trying to cram more processors $p$ .

In other words, making your program more parallel compatible is much better than trying to cram in more processors.

Thanks for reading!

Post Views: 5