Million Instructions Per Second (MIPS)

2025-06-22
3 min read

Overview

MIPS stands for “Million Instructions Per Second.” It is a measure of computer performance that indicates how many machine instructions a processor can execute in one second.
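
As a formula, that is just instruction count divided by execution time (in millions), or equivalently clock rate divided by average cycles per instruction (CPI). A minimal sketch in Python; the 5 MHz and 15-cycles-per-instruction figures below are illustrative assumptions, not measured values:

```python
def mips(instruction_count: int, seconds: float) -> float:
    """MIPS = instructions executed / (execution time in seconds * 10**6)."""
    return instruction_count / (seconds * 1e6)

def mips_from_clock(clock_hz: float, cpi: float) -> float:
    """Equivalent form: MIPS = clock rate / (CPI * 10**6),
    where CPI is the average number of clock cycles per instruction."""
    return clock_hz / (cpi * 1e6)

# Illustrative numbers only: a 5 MHz part averaging ~15 cycles per
# instruction lands near the ~0.33 MIPS often quoted for the 8086.
print(f"{mips_from_clock(5e6, 15):.2f} MIPS")
```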

Example

  • A vintage Intel 8086 CPU (introduced in 1978) → ~0.33 MIPS
  • A modern ARM Cortex-A76 → ~10,000+ MIPS (estimated)

Unlike FLOPS (which specifically measures floating-point operations), MIPS counts all types of processor instructions.

What’s an “Instruction”?

In this context, an instruction is a low-level command a CPU understands (a short code sketch after the list gives a feel for this), like:

  • Data movement (loading/storing values)
  • Integer arithmetic (addition, subtraction with whole numbers)
  • Logic operations (AND, OR, NOT)
  • Control flow (jumps, branches, loops)
  • Memory operations
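
Actual machine instructions live below what most high-level languages expose, but Python’s standard `dis` module prints an analogous instruction stream (Python bytecode, not CPU machine code) that gives a feel for the granularity:

```python
import dis

def add(a, b):
    return a + b

# Each output line is one low-level instruction: load the two
# arguments, add them, return the result.
dis.dis(add)
```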

MIPS vs FLOPS

Table. MIPS vs FLOPS
Metric | Measures                     | Typical Use                | Example
MIPS   | Integer/machine instructions | General-purpose computing  | OS, compilers, system apps
FLOPS  | Floating-point operations    | Scientific/AI/ML workloads | AI models, physics, simulations
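
To make the FLOPS side concrete, here is a rough way to estimate it with NumPy. The standard 2n³ operation count for an n × n matrix multiply is the only assumption; the result mostly reflects the speed of your installed BLAS library:

```python
import time
import numpy as np

n = 2048
a = np.random.rand(n, n)
b = np.random.rand(n, n)

start = time.perf_counter()
a @ b                              # ~2 * n**3 floating-point operations
elapsed = time.perf_counter() - start

print(f"~{2 * n**3 / elapsed / 1e9:.1f} GFLOPS")
```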

Why MIPS Isn’t Always Reliable

MIPS was historically a popular benchmark for comparing processor performance, especially in the 1980s and 1990s. However, it has some significant limitations:

  • Different instructions take different amounts of time to execute
  • Simple instructions (like moving data) execute faster than complex ones (like division)
  • Modern processors use techniques like pipelining and parallel execution that make instruction counting less meaningful
  • The mix of instruction types varies greatly between different programs

For example: a CPU might boast “2000 MIPS,” but that tells you little about how fast it renders video or trains a neural net.
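
A back-of-the-envelope sketch of why: take a hypothetical 2 GHz CPU whose peak rating assumes one-cycle instructions, then feed it a workload with a more realistic mix. All CPI values below are invented for illustration:

```python
clock_hz = 2e9                     # hypothetical 2 GHz CPU

peak_mips = clock_hz / 1e6         # 2000 MIPS if every instruction took 1 cycle

# (fraction of instructions, cycles per instruction) -- invented values
mix = {
    "move/logic": (0.50, 1),
    "int add":    (0.30, 1),
    "branch":     (0.15, 3),
    "divide":     (0.05, 40),
}
avg_cpi = sum(frac * cpi for frac, cpi in mix.values())   # 3.25
effective_mips = clock_hz / (avg_cpi * 1e6)

print(f"peak: {peak_mips:.0f} MIPS, on this mix: {effective_mips:.0f} MIPS")
```

Same chip, same nominal rating, and yet roughly a 3x gap once the instruction mix changes.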

Modern performance benchmarks tend to use more realistic measures like actual program execution times, FLOPS for scientific computing, or specialized benchmarks that test real-world workloads rather than just raw instruction throughput.

Modern Performance Benchmarks

There isn’t a single universal benchmark that replaced MIPS, but rather a variety of specialized benchmarks depending on the use case:

For General Computing Performance:

  • SPEC benchmarks (like SPEC CPU): These run actual application workloads including compression, compilation, scientific computing, and other real-world tasks (a toy version of the idea appears after this list)
  • Geekbench: Popular cross-platform benchmark that tests CPU and memory performance with practical workloads
  • Cinebench: Focuses on rendering performance, useful for creative workloads
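
In the same spirit as those suites, a workload-based measurement times a real task end to end rather than counting instructions. A minimal sketch using only the standard library (the payload and repeat counts are arbitrary choices):

```python
import timeit

# Time a small but real workload: compressing data with zlib.
setup = "import zlib; payload = b'some repetitive payload ' * 10_000"
best = min(timeit.repeat("zlib.compress(payload)", setup=setup,
                         repeat=5, number=10))
print(f"best of 5 runs: {best:.3f}s for 10 compressions")
```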

For Gaming and Graphics:

  • 3DMark: Tests GPU performance with game-like graphics workloads
  • Frame rates in actual games: Often the most practical measure for gaming performance

For AI/Machine Learning:

  • FLOPS: Especially relevant for training and running neural networks
  • MLPerf: Standardized benchmarks for ML training and inference across different hardware
  • Tokens per second: Decoding throughput for language models
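
Tokens per second is simple enough to measure yourself. A minimal sketch, where `generate` stands in for whatever model API you actually call (`fake_generate` below is a dummy so the snippet runs on its own):

```python
import time

def tokens_per_second(generate, prompt: str) -> float:
    """Time one generation and divide token count by elapsed seconds."""
    start = time.perf_counter()
    tokens = generate(prompt)      # assumed to return a list of tokens
    elapsed = time.perf_counter() - start
    return len(tokens) / elapsed

def fake_generate(prompt):
    time.sleep(0.5)                # stand-in for real decoding work
    return ["tok"] * 128

print(f"{tokens_per_second(fake_generate, 'hello'):.0f} tokens/s")
```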

For Mobile Devices:

  • AnTuTu: Comprehensive mobile benchmark testing CPU, GPU, memory, and storage
  • Battery life tests: Runtime measured under specific, repeatable workloads

For Servers/Data Centers:

  • TPC benchmarks (Transaction Processing Performance Council): Standardized tests of database and transaction-processing performance
  • LINPACK: For high-performance computing (used to rank supercomputers)
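
The real LINPACK (HPL) is a heavily tuned distributed benchmark, but its core idea fits in a few lines: solve a dense system Ax = b and divide the conventional (2/3)n³ + 2n² operation count by the elapsed time. A toy single-machine version with NumPy:

```python
import time
import numpy as np

n = 4000
a = np.random.rand(n, n)
b = np.random.rand(n)

start = time.perf_counter()
np.linalg.solve(a, b)              # LU factorization + triangular solves
elapsed = time.perf_counter() - start

flops = (2 / 3) * n**3 + 2 * n**2  # conventional LINPACK operation count
print(f"~{flops / elapsed / 1e9:.1f} GFLOPS")
```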

The key shift has been from simple instruction counting to workload-based benchmarks that measure how fast systems complete actual tasks that users care about. This gives a much more realistic picture of real-world performance than abstract metrics like MIPS ever could.