Explain the physical and virtual address caches

Physical vs Virtual Address Caches (Easy Explanation)

In advanced computer architecture, a cache stores recently used data so the CPU can access it faster than going to main memory. The key design choice is which address the cache uses: the virtual address (before translation) or the physical address (after translation). This choice impacts speed, complexity, and correctness.

Key Terms (Quick Refresher)

  • Virtual address (VA): The address generated by the CPU as a program runs.
  • Physical address (PA): The real location in RAM after address translation.
  • TLB (Translation Lookaside Buffer): A small cache that quickly translates VA to PA.
  • Page: A fixed-size block of virtual memory; common size is 4 KB.
  • Page offset: The lower bits of the address that do not change during translation.
  • Cache indexing/tagging: Bits used to choose a set (index) and identify a line (tag).
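To make the page/offset split concrete, here is a minimal Python sketch (the 4 KB page size is just the common example from the list above; the function name is mine):

```python
PAGE_SIZE = 4096                               # 4 KB page (common default)
PAGE_OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # log2(4096) = 12

def split_address(va: int) -> tuple[int, int]:
    """Split a virtual address into (virtual page number, page offset).

    Only the page number is translated by the TLB/page tables;
    the offset bits pass through translation unchanged.
    """
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & (PAGE_SIZE - 1)
    return vpn, offset

vpn, offset = split_address(0x12345)
print(hex(vpn), hex(offset))   # 0x12 0x345
```

The fact that the offset survives translation untouched is exactly what the VIPT design (below) exploits.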

Physically Addressed Cache (PIPT: Physically Indexed, Physically Tagged)

How it works: The CPU first translates the virtual address to a physical address using the TLB, then uses the physical address to index the cache and compare tags.

  • Advantages:
    • No synonym/alias problems (each physical address maps to exactly one place in the cache, so a line is never duplicated).
    • Simpler coherence with other cores and DMA (everything is in physical space).
    • Strong protection correctness (permissions are checked during translation, before the cache responds).
  • Disadvantages:
    • Extra latency: must get the physical address first (TLB must be accessed before or alongside the cache).
    • Potentially longer critical path for L1, which needs to be very fast.
  • Typical use: Lower levels like L2/L3 caches are often PIPT.
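The translate-then-lookup order can be sketched as a toy model (a Python illustration under simplifying assumptions: a dict-based TLB and a direct-mapped cache, not a real design):

```python
PAGE_OFFSET_BITS = 12    # 4 KB pages
LINE_BITS = 6            # 64 B lines
INDEX_BITS = 6           # 64 sets

tlb = {0x12: 0x80}       # toy TLB: virtual page 0x12 -> physical frame 0x80

def translate(va: int) -> int:
    """TLB lookup: replace the virtual page number, keep the offset."""
    vpn = va >> PAGE_OFFSET_BITS
    offset = va & ((1 << PAGE_OFFSET_BITS) - 1)
    return (tlb[vpn] << PAGE_OFFSET_BITS) | offset

def pipt_lookup(cache: dict, va: int) -> bool:
    pa = translate(va)                                   # step 1: translate first (serial)
    index = (pa >> LINE_BITS) & ((1 << INDEX_BITS) - 1)  # step 2: physical index
    tag = pa >> (LINE_BITS + INDEX_BITS)                 # step 3: physical tag
    if cache.get(index) == tag:
        return True                                      # hit
    cache[index] = tag                                   # miss: fill the line
    return False

c = {}
print(pipt_lookup(c, 0x12345))   # False (cold miss)
print(pipt_lookup(c, 0x12345))   # True  (hit after fill)
```

Note that `translate` must finish before the cache is touched at all: that serialization is exactly the extra latency listed as the main PIPT disadvantage.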

Virtually Addressed Cache (VIVT: Virtually Indexed, Virtually Tagged)

How it works: Uses the virtual address directly to index and tag the cache. Translation can happen later, after the cache lookup.

  • Advantages:
    • Very low access latency (no translation needed on the critical path).
    • Simpler hit path for small, fast caches.
  • Disadvantages:
    • Synonym/alias problem: Two different virtual addresses can refer to the same physical memory; both could end up as separate cache lines with inconsistent data.
    • Homonym problem: The same virtual address in different processes may refer to different physical locations; context switches can cause wrong hits unless the cache is flushed or tagged with ASIDs.
    • Complex invalidation and OS/hardware policies needed (e.g., cache flush on remap).
  • Typical use: Rare in modern general-purpose CPUs due to correctness and maintenance overhead.
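The homonym hazard described above can be demonstrated with a toy model (a hedged Python sketch: keying a dict purely by virtual address stands in for a VIVT cache with no ASIDs and no physical tags):

```python
# Toy VIVT cache keyed purely by virtual address: no ASID, no physical tag.
vivt_cache = {}

def vivt_read(va: int, process_memory: dict) -> str:
    """Return cached data for a VA, filling from the process's memory on a miss."""
    if va not in vivt_cache:
        vivt_cache[va] = process_memory[va]   # fill on miss
    return vivt_cache[va]

# Two processes map the SAME virtual address to DIFFERENT physical data.
proc_a_memory = {0x4000: "A's data"}
proc_b_memory = {0x4000: "B's data"}

print(vivt_read(0x4000, proc_a_memory))  # A's data (filled by process A)
# Context switch to process B without flushing the cache:
print(vivt_read(0x4000, proc_b_memory))  # A's data -- WRONG hit (homonym)

vivt_cache.clear()                        # the classic fix: flush on context switch
print(vivt_read(0x4000, proc_b_memory))  # B's data (correct after flush)
```

Tagging each line with an ASID instead of flushing would keep both entries distinct while preserving the fast hit path.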

Hybrid Design: VIPT (Virtually Indexed, Physically Tagged)

How it works: Index the cache using bits from the page offset (which are the same in VA and PA), so indexing can occur in parallel with the TLB lookup. After translation returns the PA, compare the physical tag for correctness.

  • Advantages:
    • Fast access like VIVT (parallel TLB and cache indexing).
    • Correctness like PIPT (physical tags eliminate homonyms; synonyms disappear entirely when the index fits within the page offset).
  • Constraint (important): The number of index bits must fit inside the page offset so that indexing is translation-independent.
Formulas:
Block offset bits = log2(cache line size)
Page offset bits  = log2(page size)
Max index bits    = Page offset bits - Block offset bits

Example:
Page size  = 4 KB  => 12 page offset bits
Line size  = 64 B  => 6 block offset bits
Max index bits = 12 - 6 = 6 => up to 64 sets indexed using only untranslated offset bits

If associativity = 8 ways:
Max L1 size = sets × ways × line size = 64 × 8 × 64 B = 32 KB
  • Typical use: Most modern CPUs implement L1 caches as VIPT to get speed and correctness. L2/L3 are often PIPT.
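The sizing arithmetic above generalizes to a small helper (a Python sketch; the function and parameter names are mine):

```python
import math

def max_vipt_l1_bytes(page_size: int, line_size: int, ways: int) -> int:
    """Largest VIPT cache whose index bits fit inside the page offset.

    Max index bits = log2(page size) - log2(line size), so
    max size = sets * ways * line = (page / line) * ways * line = page * ways.
    """
    index_bits = int(math.log2(page_size)) - int(math.log2(line_size))
    sets = 1 << index_bits           # e.g. 64 sets for 4 KB pages, 64 B lines
    return sets * ways * line_size   # algebraically equal to page_size * ways

print(max_vipt_l1_bytes(4096, 64, 8) // 1024, "KB")   # 32 KB, matching the example
```

The simplification to page_size × ways also explains the two escape hatches mentioned later: to grow a VIPT L1 you must raise either the associativity or the page size.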

Design Trade-offs and Practical Considerations

  • Latency: VIVT and VIPT can start lookup before translation; PIPT may add TLB latency unless optimized.
  • Synonyms and homonyms:
    • VIVT suffers both; requires flushes, page coloring, or ASIDs.
    • VIPT with physical tags fixes homonyms; synonyms are eliminated outright if the index stays within the page offset. If index bits spill above the offset, the remaining synonyms can be handled by OS policies (page coloring) or hardware checks.
    • PIPT avoids both by design.
  • Context switches:
    • VIVT may need cache flushes unless using ASIDs (Address Space IDs).
    • VIPT and PIPT usually avoid full flushes; tags or PAs disambiguate processes.
  • Coherence and I/O: Physical caches (PIPT, VIPT with PA tags) simplify coherence and DMA because all identities are in physical space.
  • Security and protection: Physical tagging ensures cache hits respect permissions checked during translation.
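Page coloring, mentioned above as the OS-side synonym fix, amounts to a simple constraint: all mappings of one physical page must land in the same group of cache sets. A hedged Python sketch (the "color" definition is the standard sets-per-page grouping; names are mine):

```python
def page_color(addr: int, cache_size: int, ways: int, page_size: int) -> int:
    """Which group of cache sets a page's lines fall into.

    If cache_size / ways > page_size, some index bits lie above the page
    offset; pages sharing a color index the same sets. The OS avoids
    synonyms by giving every alias of a physical page the same color.
    """
    colors = (cache_size // ways) // page_size   # bytes per way / page size
    if colors <= 1:
        return 0   # index fits in the page offset: no coloring needed
    return (addr // page_size) % colors

# 64 KB, 4-way, 4 KB pages -> 16 KB per way -> 4 colors.
print(page_color(0x0000, 64 * 1024, 4, 4096))  # 0
print(page_color(0x1000, 64 * 1024, 4, 4096))  # 1
```

For the 32 KB 8-way example earlier, colors = 1, which is just another way of saying that cache needs no coloring at all.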

Where Each Fits in a Modern CPU

  • L1 caches: Commonly VIPT to allow parallel TLB and cache lookup with physical tag verification; size often limited (e.g., around 32 KB with 4 KB pages and 64 B lines) unless high associativity or larger pages are used.
  • L2/L3 caches: Typically PIPT for simplicity and scalability.
  • Pure VIVT: Mostly historical or used in specialized/embedded designs with careful OS support.

Quick Summary

  • PIPT (physical cache): Simple and correct; slightly slower for L1 without parallel tricks.
  • VIVT (virtual cache): Fast but complex due to synonyms/homonyms; needs ASIDs/flushes.
  • VIPT (hybrid): Best of both for L1: fast indexing with physical tags; size constrained by page offset.