T04 Dec 18, 2025 3 min read

CPU

The general-purpose processor that runs programs, executes machine instructions, and coordinates most system work.

Definition

A CPU (central processing unit) is the general-purpose processor that executes machine instructions and drives most of what we call “running a program”.

In most systems, the CPU is where control flow lives: branching, calling functions, handling interrupts, interacting with the OS, and coordinating I/O.

  • Related: core, thread (execution context), instruction set architecture (ISA), cache
  • Often contrasted with: GPU

What “runs on the CPU” means

When an executable starts, the operating system loads it and begins running its machine instructions on one or more CPU cores. At that point it’s a running program (a process): an executing instance with memory and state.
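As a tiny, concrete sketch (illustrative C++, assuming a standard toolchain; not tied to any particular system): once the OS has loaded the compiled executable, its instructions run on whichever cores the scheduler assigns, and the running instance can ask how many hardware threads the CPU exposes.

    #include <cstdio>
    #include <thread>

    // Minimal sketch: after the OS loads this executable, these instructions
    // execute on one or more CPU cores chosen by the scheduler.
    int main() {
        // hardware_concurrency() may return 0 if the count is unknown.
        unsigned hw_threads = std::thread::hardware_concurrency();
        std::printf("hardware threads reported by this machine: %u\n", hw_threads);
        return 0;
    }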

Key concepts (high level)

You don’t need microarchitecture expertise to use the CPU concept correctly, but it helps to know the main levers:

  • Cores and parallelism: modern CPUs have multiple cores, so execution can happen simultaneously across them.
  • Caches: CPUs hide memory latency with a cache hierarchy, so memory-access locality matters for performance (see the sketch after this list).
  • Branching and prediction: unpredictable branches can stall pipelines. Control-flow-heavy code behaves differently from straight-line numeric code.
  • Vector/SIMD instructions: CPUs can perform the same operation on multiple data elements at once, but typically at smaller scale than a GPU.
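Here is a minimal sketch of the cache/locality point (illustrative C++; the matrix size and timings are arbitrary and will vary by machine). Both loops perform the same additions, but the row-wise loop walks memory sequentially while the column-wise loop strides across it, so the second is typically much slower on large matrices.

    #include <chrono>
    #include <cstdio>
    #include <vector>

    // Sum the same row-major matrix two ways: sequential (cache-friendly)
    // vs strided (cache-hostile) access.
    int main() {
        const std::size_t n = 4096;            // illustrative size
        std::vector<double> m(n * n, 1.0);     // row-major storage

        auto time_sum = [&](bool row_major) {
            auto start = std::chrono::steady_clock::now();
            double sum = 0.0;
            for (std::size_t i = 0; i < n; ++i)
                for (std::size_t j = 0; j < n; ++j)
                    sum += row_major ? m[i * n + j]   // sequential access
                                     : m[j * n + i];  // strided access
            auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(
                          std::chrono::steady_clock::now() - start).count();
            std::printf("%s: sum=%.0f in %lld ms\n",
                        row_major ? "row-wise" : "column-wise", sum,
                        static_cast<long long>(ms));
        };

        time_sum(true);    // walks memory in order
        time_sum(false);   // jumps across rows
        return 0;
    }

The point is not the exact numbers; it is that identical arithmetic can cost very different amounts depending on how it moves through the cache hierarchy.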

What “branchy” means (and why it matters)

When code is described as branchy, it means it contains lots of conditional decisions: if/else, switch, early returns, and data-dependent control flow.

Branchy code is not automatically “bad”; it’s often just the natural shape of business logic. The important detail is predictability:

  • If the CPU can usually guess which branch will be taken (the same path repeats), it stays fast.
  • If branch outcomes vary unpredictably with input data, the CPU may mispredict and waste work, which can reduce throughput.

This is one reason the same algorithm can be fast on one workload and slow on another: the code is the same, but the branch behavior changes.
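Here is a minimal sketch of that effect (illustrative C++; results vary by machine, and an optimizing compiler may replace the branch with branch-free code, which hides the gap). The same counting loop runs over a sorted and a shuffled copy of the same data, so only the branch predictability differs.

    #include <algorithm>
    #include <chrono>
    #include <cstdint>
    #include <cstdio>
    #include <random>
    #include <vector>

    // Count elements at or above a threshold; the branch outcome depends on data.
    static std::uint64_t count_large(const std::vector<int>& v, int threshold) {
        std::uint64_t hits = 0;
        for (int x : v) {
            if (x >= threshold) ++hits;   // data-dependent branch
        }
        return hits;
    }

    // Time one run and report it.
    static void run(const char* label, const std::vector<int>& v) {
        auto start = std::chrono::steady_clock::now();
        std::uint64_t hits = count_large(v, 128);
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                      std::chrono::steady_clock::now() - start).count();
        std::printf("%s: hits=%llu in %lld us\n", label,
                    static_cast<unsigned long long>(hits),
                    static_cast<long long>(us));
    }

    int main() {
        std::vector<int> shuffled(1 << 22);
        std::mt19937 rng(42);
        std::uniform_int_distribution<int> dist(0, 255);
        for (int& x : shuffled) x = dist(rng);

        std::vector<int> sorted = shuffled;
        std::sort(sorted.begin(), sorted.end());  // makes the branch predictable

        run("sorted  ", sorted);    // one long run of each outcome: predicts well
        run("shuffled", shuffled);  // effectively random outcome: mispredicts often
        return 0;
    }

On typical hardware the sorted run is faster because the predictor sees one long stretch of “not taken” followed by one long stretch of “taken”, while the shuffled run forces it to guess on nearly every iteration.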

How the CPU relates to compilation and runtimes

  • Compilation and linking often produce a binary that contains machine code for a specific CPU architecture.
  • JIT compilation can generate CPU-specific machine code at runtime, based on profiling and execution behavior.
  • The runtime of a language may manage memory and execution details, but the CPU still executes the resulting machine instructions.
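As a small illustration of the first point (a hypothetical saxpy.cpp; the build commands assume GCC or Clang and are just one way to do this): the same source can be turned into different machine code depending on the target CPU and flags.

    // saxpy.cpp (hypothetical file name). The machine code this compiles to
    // depends on the target CPU architecture and compiler flags, e.g.:
    //   g++ -O2 -c saxpy.cpp                 // generic code for the default target
    //   g++ -O2 -march=native -c saxpy.cpp   // may use the build machine's SIMD instructions
    // The source is identical; only the emitted CPU instructions differ.
    void saxpy(float a, const float* x, float* y, int n) {
        for (int i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];   // compilers often auto-vectorize loops like this
    }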

CPU vs GPU (the boundary)

A GPU is optimized for throughput on data-parallel workloads (many similar operations at once). A CPU is optimized for general-purpose control flow and orchestration. Many systems use both: CPU for coordination and “everything else”, GPU for specific parallel kernels.

Mini-scenario

If a service spends most of its time routing requests, parsing data, applying rules, and talking to the OS/network, it’s primarily CPU work (control flow + orchestration). If a small part of the service performs heavy numeric operations on large arrays (image transforms, matrix multiplications), that hotspot may be a candidate for GPU offload.
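A sketch of that split (illustrative C++; names like handle_request and blur_image are made up for this example): the request handler is branchy orchestration that stays on the CPU, while the numeric loop is the kind of hotspot that might justify GPU offload only if profiling shows it dominates.

    #include <cstddef>
    #include <cstdio>
    #include <string>
    #include <vector>

    // The numeric hotspot: one simple operation applied across a large array
    // (a 3-tap average standing in for an image transform).
    std::vector<float> blur_image(const std::vector<float>& pixels) {
        std::vector<float> out(pixels.size(), 0.0f);
        for (std::size_t i = 1; i + 1 < pixels.size(); ++i)
            out[i] = (pixels[i - 1] + pixels[i] + pixels[i + 1]) / 3.0f;
        return out;
    }

    struct Request {
        std::string path;
        std::vector<float> pixels;
    };

    // Control-flow-heavy orchestration: routing, validation, rules. CPU work.
    std::string handle_request(const Request& req) {
        if (req.path.empty())      return "400 bad request";
        if (req.path == "/health") return "200 ok";
        if (req.path != "/blur")   return "404 not found";
        if (req.pixels.empty())    return "400 no image";

        std::vector<float> out = blur_image(req.pixels);  // the data-parallel part
        return "200 blurred " + std::to_string(out.size()) + " pixels";
    }

    int main() {
        Request req{"/blur", std::vector<float>(1024, 0.5f)};
        std::printf("%s\n", handle_request(req).c_str());
        return 0;
    }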