Branch predictor facts for kids

Kids Encyclopedia Facts

A branch predictor is like a super-smart fortune teller inside a computer's microprocessor (the main brain of the computer). Its job is to guess which way a computer program will go next, especially when it reaches a "fork in the road" – like an if-then-else statement.

Think of a computer program as a recipe. Sometimes, the recipe says, "If the cake is golden, then take it out of the oven; otherwise, keep baking." The computer needs to know what to do next *before* it actually checks if the cake is golden. If it waits, it wastes time.

Figure 1: This picture shows how a computer's "pipeline" works. It's like an assembly line for instructions.

Modern computers work super fast by using an "assembly line" for instructions, called a pipeline (see Figure 1). Each step in the pipeline does a different part of processing an instruction. If the computer has to stop and wait for a guess to be confirmed, the whole assembly line pauses, which slows things down.

The branch predictor tries to guess the right path. If its guess is correct, the computer keeps working smoothly and quickly. If the guess is wrong, the computer has to throw away the work it started based on the wrong guess and restart with the correct path. This is called a branch misprediction, and it causes a small delay. In today's powerful microprocessors, this delay can be like losing 10 to 20 ticks of the computer's internal clock! So, having a good branch predictor is really important for making computers fast.

When a computer sees a new "fork in the road" instruction for the first time, the branch predictor doesn't have much to go on. But it remembers what happened last time. If it sees the same "fork" many times, it can use that history to make a better guess. For example, it might learn that a certain "fork" usually goes one way, or that it switches back and forth in a pattern.

Branch prediction is about guessing *which way* to go (taken or not taken). It's different from branch target prediction, which guesses *where* to go if a jump is taken. Often, these two types of guessing are combined in the same computer parts.

How Branch Predictors Work
- Static Branch Prediction
- Dynamic Branch Prediction
History
See also

How Branch Predictors Work

Static Branch Prediction

Static prediction is the simplest way to guess. It doesn't use any past information about how the program ran. Instead, it makes a guess based only on the instruction itself.

Early computers often used a simple static prediction: they always guessed that a "jump" instruction would not be taken. So, they would always get ready for the next instruction in line. If the jump *was* taken, the computer would then quickly switch to the correct new place.

A slightly smarter static prediction is to guess that "backward" jumps (which often happen in loops) will be taken, and "forward" jumps will not. This helps with loops, which usually repeat many times.

Some processors even let programmers add "hints" to the code to help the static predictor.

Dynamic Branch Prediction

Dynamic branch prediction is much smarter because it uses information gathered while the program is actually running. It remembers what happened before to make better guesses.

One-Level Branch Prediction

A simple dynamic predictor uses a "saturating counter." Imagine a tiny counter that remembers the last few times a branch was taken or not taken.

Figure 2: This diagram shows how a 2-bit counter changes its guess. It needs two "wrong" guesses in a row to completely change its mind.

A 1-bit counter is like a light switch: it just remembers the very last outcome (taken or not taken).
A 2-bit counter is better. It has four states (like "strongly not taken," "weakly not taken," "weakly taken," "strongly taken"). As shown in Figure 2, it needs to see the branch go the "other way" twice before it completely changes its prediction. This helps it avoid changing its mind too quickly if a branch just occasionally goes the opposite way.

These counters are stored in a special table, and the computer uses the instruction's location in memory to find the right counter for that specific "fork in the road."

Two-Level Predictor

Imagine if the decision at a "fork in the road" depends on what happened at the *last two* "forks." A two-level predictor can handle this! It's like remembering a pattern.

Figure 3: This shows how a two-level predictor works. It uses a "history" of past decisions to pick the best guess from a table.

This type of predictor remembers the history of the last few times a branch was taken or not taken. For example, if it remembers the last two outcomes (taken or not taken), it has four possible "history patterns" (like "not taken, not taken" or "taken, not taken"). For each of these patterns, it has a separate 2-bit counter (like the one in Figure 2).

So, if the history was "not taken, not taken," it uses one counter. If the history was "taken, not taken," it uses a different counter. This allows it to learn complex patterns, like a branch that is taken every third time (001001...). This smart method was invented by T.-Y. Yeh and Yale Patt in 1991 and is used in most modern microprocessors.

Local Branch Prediction

A local branch predictor keeps a separate history for *each* specific "fork in the road" instruction. This means it learns the unique behavior of each individual branch. For example, the Intel Pentium III processor used this method, remembering the last 4 outcomes for each branch.

Global Branch Prediction

Instead of a separate history for each branch, a global branch predictor uses one shared history for *all* "forks in the road" in the program. The idea is that sometimes the outcome of one branch might depend on what happened at a *different* branch earlier.

The challenge is that this shared history can get "diluted" with information from many different branches, which might not always be helpful. However, it can be very effective if branches often influence each other. Many modern processors, like those from AMD and Intel's Core series, use global branch prediction.

Hybrid Predictor

A hybrid predictor is like having several different types of fortune tellers, and then another "meta-fortune teller" that decides which of the other fortune tellers is usually the best at guessing for a particular situation. It combines the strengths of different prediction methods to get the best overall accuracy.

Loop Predictor

Loops are special parts of code that repeat many times. A "fork in the road" that controls a loop (like "keep going until you reach 100") has a very predictable pattern: it's taken many times, then not taken just once. A special loop predictor can easily spot this pattern and make very accurate guesses for loops.

Indirect Branch Predictor

Most "forks in the road" have only two paths. But an indirect jump instruction can choose from *many* different paths. Some processors have special predictors for these, often using a two-level adaptive predictor that can handle more complex choices.

Prediction of Function Returns

When a computer program calls a function (like a mini-program), it needs to remember where to come back to after the function is done. Many microprocessors have a special "return stack buffer" that helps predict where the program should return, making this process very fast.

Neural Branch Prediction

This is a very advanced type of branch prediction that uses artificial intelligence (AI) ideas, similar to how brains learn. It can learn very complex patterns and make highly accurate predictions. The first commercial use of a neural branch predictor was in AMD's Piledriver processors. This method is gaining popularity because it can use a lot of past information without needing a huge amount of computer resources.

History

The idea of predicting branches isn't new! The IBM 7030 Stretch computer, designed in the late 1950s, already tried to guess which way some branches would go. However, if it guessed wrong, it caused delays.

Two-bit predictors (like the 2-bit saturating counter we talked about) were introduced in the late 1970s.

Branch prediction became much more important in the 1990s with the rise of powerful superscalar processors like the Intel Pentium and DEC Alpha 21064. These processors could do many things at once, so a wrong guess could really slow them down.

In 2018, a major security issue called Spectre was discovered. It showed that clever attackers could sometimes use the leftover information from branch mispredictions to steal private data from computers. This led to new ways to protect against such attacks.