How CPUs are Designed and Built

What is going on inside the billions of transistors that make your gear work?

What Does a CPU Actually Do?

This includes processor cores, the memory hierarchy, branch prediction, and more.

First, we need a basic definition of what a CPU does.

It only understands 1s and 0s, so we need a way to represent code in this format.

An Instruction Set Architecture (ISA) is the set of instructions that the CPU is built to understand and execute.

Some of the most common ISAs are x86, MIPS, ARM, RISC-V, and PowerPC.

These ISAs can be broken up into two main categories: fixed-length and variable-length.

Fixed-length encoding is different from x86, which uses variable-length instructions.

In x86, instructions can be encoded in different ways and with different numbers of bits for different parts.
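To make the difference between fixed-length and variable-length encodings concrete, here is a minimal sketch. The 4-byte width and the `TOY_LENGTHS` opcode table are assumptions made up for illustration; they do not describe real x86 or any real ISA.

```python
def split_fixed(code: bytes, width: int = 4) -> list[bytes]:
    """Fixed-length: every instruction starts at a multiple of `width`."""
    if len(code) % width:
        raise ValueError("code size is not a multiple of the instruction width")
    return [code[i:i + width] for i in range(0, len(code), width)]


# Hypothetical toy encoding: the first byte (opcode) decides the total length.
TOY_LENGTHS = {0x01: 1, 0x02: 3, 0x03: 6}


def split_variable(code: bytes) -> list[bytes]:
    """Variable-length: each instruction must be inspected to find the next one."""
    instructions = []
    i = 0
    while i < len(code):
        length = TOY_LENGTHS[code[i]]  # unknown opcodes raise KeyError
        instructions.append(code[i:i + length])
        i += length
    return instructions
```

With a fixed width, the decoder can locate every instruction boundary up front. With the variable-length toy encoding, the start of one instruction is only known after the previous one has been examined, which is part of what makes such decoders harder to build.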

People generally believe there are a few thousand x86 instructions, but the exact number isn’t public.

Despite differences among the ISAs, they all carry essentially the same core functionality.

Now we are ready to turn our computer on and start running stuff.

There are many types of instructions, including arithmetic instructions, branch instructions, and memory instructions.

Most modern processors are 64-bit, which means that the size of each data value is 64 bits.
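As a rough illustration of what a 64-bit data value implies, the sketch below mimics arithmetic in a 64-bit register using Python integers. The helper names `add_u64` and `to_signed64` are made up for this example.

```python
MASK64 = (1 << 64) - 1  # largest unsigned 64-bit value


def add_u64(a: int, b: int) -> int:
    """Add two values the way a 64-bit register would, discarding overflow bits."""
    return (a + b) & MASK64


def to_signed64(value: int) -> int:
    """Reinterpret a 64-bit pattern as a two's-complement signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value
```

Adding 1 to the largest 64-bit value wraps around to 0 here, and the all-ones bit pattern reads as -1 when interpreted as signed.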

For example, addition is very fast, while division may take tens of cycles and loading from memory may take hundreds of cycles.

Rather than stalling the entire processor while one slow instruction finishes, most modern processors execute out-of-order.
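The benefit can be sketched with a toy scheduling model. Everything here is an assumption for illustration: one instruction is issued per cycle, the latencies are invented, and `deps` lists the indices of earlier instructions whose results are needed. Real schedulers are far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    latency: int                      # cycles until the result is available
    deps: list[int] = field(default_factory=list)


def in_order_cycles(program: list[Instr]) -> int:
    """Each instruction waits for its inputs AND for all older instructions to issue."""
    finish: list[int] = []
    prev_start = -1
    for ins in program:
        ready = max((finish[d] for d in ins.deps), default=0)
        start = max(prev_start + 1, ready)
        finish.append(start + ins.latency)
        prev_start = start
    return max(finish, default=0)


def out_of_order_cycles(program: list[Instr]) -> int:
    """Each cycle, issue the oldest instruction whose inputs are already available."""
    finish: dict[int, int] = {}
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        for i in pending:
            if all(d in finish and finish[d] <= cycle for d in program[i].deps):
                finish[i] = cycle + program[i].latency
                pending.remove(i)
                break
        cycle += 1
    return max(finish.values(), default=0)


# A slow load, an add that needs it, and two instructions that do not.
program = [
    Instr("load r1", 100),
    Instr("add r2, r1", 1, deps=[0]),
    Instr("mul r3", 3),
    Instr("add r6, r3", 1, deps=[2]),
]
```

In this model the in-order machine finishes `program` in 105 cycles, while the out-of-order machine finishes in 101 because the independent multiply and add complete while the load is still outstanding.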

In addition to out-of-order execution, typical modern processors employ what is called a superscalar architecture.

It may also be waiting on hundreds more to begin their execution.

So that many instructions can execute at once, processors have several copies of each pipeline stage inside.

One common implementation of this is called Simultaneous Multithreading (SMT), which Intel markets as Hyper-Threading.

To accomplish this carefully choreographed execution, a processor has many extra elements in addition to the basic core.

The two biggest and most beneficial are the caches and the branch predictor.

What sets caches apart, though, is their access latency and speed.

Even though RAM is extremely fast, it is orders of magnitude too slow for a CPU.

Without caches, our processors would grind to a halt.

Processors typically have three levels of cache that form what is known as a memory hierarchy.

Above the caches in the hierarchy are small registers that store a single data value during computation.

These registers are the fastest storage in your system.

If it is, the data can be quickly accessed in just a few cycles.

If it is not present, the CPU will check the L2 and subsequently search the L3 cache.
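The lookup order described above can be sketched as a small simulation. The capacities, the cycle counts, and the LRU replacement policy are assumptions chosen for illustration, not figures for any particular processor.

```python
from collections import OrderedDict


class Level:
    """One cache level holding up to `capacity` lines, evicting the least recently used."""

    def __init__(self, name: str, capacity: int, latency: int):
        self.name, self.capacity, self.latency = name, capacity, latency
        self.lines: OrderedDict[int, bool] = OrderedDict()

    def lookup(self, addr: int) -> bool:
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            return True
        return False

    def fill(self, addr: int) -> None:
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line


def access(levels: list[Level], addr: int, memory_latency: int = 200):
    """Check L1, then L2, then L3, then RAM; return (where it hit, total cycles)."""
    cycles = 0
    for depth, level in enumerate(levels):
        cycles += level.latency
        if level.lookup(addr):
            for upper in levels[:depth]:  # copy the line into the faster levels
                upper.fill(addr)
            return level.name, cycles
    cycles += memory_latency
    for level in levels:
        level.fill(addr)
    return "RAM", cycles
```

A first access to an address goes all the way to RAM and pays every latency on the way; repeating the access hits in L1 and costs only the L1 latency.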

The caches are implemented in a way that they are generally transparent to the core.

The L2 caches are usually a few hundred kilobytes.

Branch instructions are similar to “if” statements for a processor.

These branch instructions are extremely common and can make up roughly 20% of all instructions in a program.

To address this issue, all modern high-performance processors employ a technique called speculation.

This means the processor keeps track of branch instructions and predicts whether a branch will be taken or not.

These branch predictors are among the earliest forms of machine learning, as they adapt to branch behavior over time.

If a predictor makes too many incorrect guesses, it adjusts to improve accuracy.
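One classic way to get this adaptive behavior is a 2-bit saturating counter per branch. The sketch below is a textbook scheme, not a description of any specific processor's predictor; the class name, the starting state, and the use of a dictionary keyed by branch address are assumptions for illustration.

```python
class TwoBitPredictor:
    """States 0-1 predict not taken, states 2-3 predict taken."""

    def __init__(self):
        self.counters: dict[int, int] = {}  # branch address -> state 0..3

    def predict(self, pc: int) -> bool:
        return self.counters.get(pc, 1) >= 2  # unseen branches start weakly not taken

    def update(self, pc: int, taken: bool) -> None:
        state = self.counters.get(pc, 1)
        # Move one step toward the actual outcome, saturating at 0 and 3.
        self.counters[pc] = min(state + 1, 3) if taken else max(state - 1, 0)


def accuracy(predictor: TwoBitPredictor, pc: int, outcomes: list[bool]) -> float:
    """Fraction of outcomes predicted correctly, training after each branch."""
    correct = 0
    for taken in outcomes:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(outcomes)
```

Because the counter needs two wrong guesses in a row to flip its prediction, a loop branch that is taken nine times and then falls through once costs only one misprediction per loop exit once the counter has warmed up.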

Decades of research into branch prediction techniques have led to accuracies exceeding 90% in modern processors.

The now-infamous Spectre attacks exploit the side effects of speculative execution that follows a mispredicted branch.

Attackers can use specially crafted code to trick the processor into speculatively executing instructions that leak sensitive memory data.

The architecture of modern processors has advanced dramatically over the past few decades.

Innovations and clever design have resulted in more performance and a better utilization of the underlying hardware.

That being said, the fundamental principles of how processors work remain consistent across all designs.

This overview and first part of the series covers most of the basics of how processors work.