13 The Abstraction: Address Spaces

13.3 The Address Space

The address space of a process contains all of the memory state of the running program (e.g., its code, stack, and heap).

13.4 Goals

Goals of the virtual memory (VM) system design:

  • Transparency
  • Efficiency
  • Protection

14 Interlude: Memory API

15 Mechanism: Address Translation

15.3 Dynamic (Hardware-based) Relocation

When a program is loaded, the OS sets base and bounds registers.

physical address = virtual address + base

15.4 Hardware Support: A Summary

The required hardware support (a sketch of the translate-and-check step follows the list):

  • privileged mode
  • base/bounds registers
  • translate virtual address and check if within bounds
  • privileged instructions to update base/bounds registers
  • privileged instructions to register exception handlers
  • ability to raise exceptions
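
A minimal C sketch of the translate-and-check step, with illustrative register values; translate() models in software what the MMU does in hardware on every memory reference:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static void raise_exception(void) {  /* stand-in for the CPU raising a fault */
    fprintf(stderr, "protection fault\n");
    exit(1);                         /* the OS typically kills the process */
}

static uint32_t base   = 32768;  /* set by the OS at load time / context switch */
static uint32_t bounds = 16384;  /* size of the process's address space */

/* What the MMU does on every memory reference. */
uint32_t translate(uint32_t vaddr) {
    if (vaddr >= bounds)
        raise_exception();       /* out of bounds */
    return vaddr + base;         /* physical address = virtual address + base */
}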

15.6 Summary

The problem: all of the unused space between the stack and the heap is wasted; because it sits inside the process's allocated region, this waste is called internal fragmentation.

16 Segmentation

16.1 Segmentation: Generalized Base/Bounds

Segment: contiguous portion of the address space of a particular length.

Segmentation allows the OS to place each segment in a different part of physical memory.

The hardware structure in the MMU: three base and bounds register pairs (one per segment: code, heap, and stack).

16.2 Which Segment Are We Referring To?

The explicit approach: chop up the address space into segments based on the top few bits of the virtual address.
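
A sketch of the explicit approach, assuming a 14-bit virtual address whose top two bits select the segment (seg_base[] and seg_bounds[] stand in for the MMU's register pairs):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SEG_SHIFT 12
#define SEG_MASK  0x3000u
#define OFF_MASK  0x0FFFu

static uint32_t seg_base[4];    /* base registers, one per segment */
static uint32_t seg_bounds[4];  /* the matching bounds registers */

uint32_t seg_translate(uint32_t vaddr) {
    uint32_t seg    = (vaddr & SEG_MASK) >> SEG_SHIFT;  /* top bits pick the segment */
    uint32_t offset = vaddr & OFF_MASK;                 /* low bits are the offset */
    if (offset >= seg_bounds[seg]) {
        fprintf(stderr, "protection fault\n");          /* hardware raises an exception */
        exit(1);
    }
    return seg_base[seg] + offset;
}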

The implicit approach: the hardware determines the segment by noticing how the address was formed (e.g., an address derived from the program counter is in the code segment; one based off the stack pointer is in the stack segment).

16.3 What About The Stack?

The difference: it grows backwards (toward lower addresses).

Use an extra hardware bit per segment to indicate which way it grows; for a negative-growing segment, the hardware forms a negative offset by subtracting the maximum segment size from the offset.
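
A worked example, assuming a 16KB address space with 4KB maximum segment size: virtual address 15KB has segment bits 11 (stack) and offset 3KB; the negative offset is 3KB - 4KB = -1KB, so with a stack base of 28KB the physical address is 28KB - 1KB = 27KB.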

16.4 Support for Sharing

16.5 Fine-grained vs. Coarse-grained Segmentation

16.6 OS Support

The operation to support segmentation:

  • On context switch: save and restore the segment registers
  • Support growing heap segment (sbrk())
  • Handle external fragmentation (physical memory quickly becomes full of little holes of free space, making it difficult to allocate new segments, or to grow existing ones)

17 Free-Space Management

17.1 Assumptions

Internal fragmentation: the allocator hands out chunks of memory bigger than requested; the unused space inside such a chunk is wasted.

17.2 Low-level Mechanisms

Splitting and Coalescing

Splitting: when a free chunk is larger than the request, split it in two, return the first piece to the caller, and keep the remainder on the free list.

Coalescing: when a freed chunk is contiguous with existing free chunks, merge them back into a single larger chunk.

Tracking The Size Of Allocated Regions

Most allocators store a little extra information in a header block kept in memory just before the handed-out chunk, to track the size of the allocated region.

typedef struct {
    int size;
    int magic;  /* for integrity checking */
} header_t;
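
Given this header, free(ptr) can locate it with simple pointer arithmetic, since it sits immediately before the chunk handed to the user. A minimal sketch (my_free() and MAGIC are illustrative names):

#include <assert.h>
#include <stddef.h>

#define MAGIC 1234567

void my_free(void *ptr) {
    header_t *hptr = (header_t *)ptr - 1;          /* step back over the header */
    assert(hptr->magic == MAGIC);                  /* integrity check */
    size_t total = hptr->size + sizeof(header_t);  /* full size of the region */
    (void)total;  /* a real allocator would now return this region to the free list */
}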

Embedding A Free List
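
The list is built inside the free space itself: the first bytes of each free chunk hold a small node describing it. A minimal sketch of the node type:

typedef struct __node_t {
    int size;               /* bytes of free space in this chunk */
    struct __node_t *next;  /* next free chunk in the list */
} node_t;

So a chunk's first bytes hold a header_t while allocated and a node_t while free.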

Growing The Heap

17.3 Basic Strategies

Unfortunately, because the stream of allocation and free requests can be arbitrary (after all, they are determined by the programmer), any particular strategy can do quite badly given the wrong set of inputs.

Best Fit

The best chunk: the smallest of the candidates that are big enough for the request.

Cost: a full search of the free list on every allocation, a heavy performance overhead in naive implementations.

Worst Fit

The opposite of best fit: find the largest free chunk, return the requested amount, and keep the (large) remainder on the free list.

First Fit

Return the first block that is big enough for the request.

Problem: the beginning of the free list gets polluted with small objects. One approach: keep the list ordered by address, which makes coalescing easier and tends to reduce fragmentation.

Advantage: speed.
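
A sketch of the first-fit search over the embedded free list, with splitting (node_t as above; first_fit() is an illustrative name, and alignment is ignored):

void *first_fit(node_t **head, int size) {
    for (node_t **prev = head; *prev != NULL; prev = &(*prev)->next) {
        node_t *cur = *prev;
        if (cur->size < size)
            continue;                              /* too small; keep walking */
        if (cur->size >= size + (int)sizeof(node_t)) {
            /* split: keep the tail of this chunk on the free list */
            node_t *rest = (node_t *)((char *)(cur + 1) + size);
            rest->size = cur->size - size - (int)sizeof(node_t);
            rest->next = cur->next;
            *prev = rest;
            cur->size = size;                      /* cur now doubles as the header */
        } else {
            *prev = cur->next;                     /* too tight to split; hand out whole chunk */
        }
        return (void *)(cur + 1);                  /* usable memory starts after the node */
    }
    return NULL;                                   /* nothing big enough */
}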

Next Fit

Keep an extra pointer to the last-searched location and begin each search there; this spreads allocations more evenly through the list than first fit.

17.4 Other Approaches

Segregated Lists

Keep a separate list just to manage objects of some popular sizes; all other requests are forwarded to a more general memory allocator.
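
A sketch of the dispatch logic, assuming a few fixed popular size classes (seg_alloc() is illustrative; general_alloc() stands in for the general allocator, and node_t is reused from 17.2):

void *general_alloc(int size);                     /* assumed general-purpose allocator */

#define NCLASSES 3
static const int class_size[NCLASSES] = { 16, 32, 64 };
static node_t *seg_list[NCLASSES];                 /* one free list per popular size */

void *seg_alloc(int size) {
    for (int i = 0; i < NCLASSES; i++) {
        if (size != class_size[i] || seg_list[i] == NULL)
            continue;
        node_t *n = seg_list[i];                   /* exact fit: pop, no search, no split */
        seg_list[i] = n->next;
        return n;
    }
    return general_alloc(size);                    /* all other requests are forwarded */
}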

Buddy Allocation

The binary buddy allocator: free memory is first thought of as one big space of size $2^N$; a request recursively divides the free space in half until a block just big enough to fit it is found.

Suffers from internal fragmentation, since requests are rounded up to the next power-of-two block size.

Freeing: check whether the freed block's buddy is also free; if so, merge the pair and repeat the check one level up, recursively coalescing blocks upwards.

It is simple to determine the buddy of a particular block: the addresses of a buddy pair (at the same level) differ in exactly one bit, the bit determined by the block size (see the sketch below).
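
A sketch of the buddy computation, assuming block addresses are offsets from the start of the managed region and sizes are size-aligned powers of two:

#include <stdint.h>

/* The buddy of a block differs in exactly the bit selected by its size. */
uintptr_t buddy_of(uintptr_t off, uintptr_t size) {
    return off ^ size;
}
/* e.g., buddy_of(0, 4096) == 4096, and buddy_of(4096, 4096) == 0 */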

18 Paging: Introduction

18.1 A Simple Example And Overview

Page table: a per-process data structure to store address translations.
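
A sketch of translation through a linear page table, assuming a 32-bit virtual address with 4KB pages (the low 12 bits are the offset, the high 20 bits the virtual page number):

#include <stdint.h>

#define PAGE_SHIFT  12
#define OFFSET_MASK 0xFFFu

/* page_table[] stands in for the per-process linear page table; each
 * entry here is just a PFN (a real PTE also carries the bits in 18.3). */
uint32_t pt_translate(uint32_t vaddr, const uint32_t *page_table) {
    uint32_t vpn    = vaddr >> PAGE_SHIFT;     /* which virtual page */
    uint32_t offset = vaddr & OFFSET_MASK;     /* where within the page */
    uint32_t pfn    = page_table[vpn];         /* look up the translation */
    return (pfn << PAGE_SHIFT) | offset;       /* physical address */
}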

18.3 What’s Actually In The Page Table?

Some common bits (an x86 example follows the list):

  • valid bit (supports a sparse address space by marking unused regions invalid; x86 has no separate valid bit)
  • protection bit (whether the page can be read, written, or executed)
  • present bit (whether the page is in physical memory or has been swapped out)
  • dirty bit (whether the page has been modified)
  • reference bit (whether the page has been accessed)
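
As a concrete example, the low bits of an x86 page-table entry map onto these roles (bit positions per the x86 architecture):

#define PTE_P (1u << 0)   /* present */
#define PTE_W (1u << 1)   /* writable: a protection bit */
#define PTE_U (1u << 2)   /* user-accessible: a protection bit */
#define PTE_A (1u << 5)   /* accessed: the reference bit */
#define PTE_D (1u << 6)   /* dirty */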

Don’t worry: it definitely gets worse, because the mechanisms we are about to introduce only complicate this already complex machinery. Sorry!

19 Paging: Faster Translations (TLBs)

When we want to make things fast, the OS usually needs some help. And help often comes from the OS’s old friend: the hardware.

Translation-lookaside buffer (TLB): a hardware cache of popular virtual-to-physical address translations, part of the MMU.
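
A sketch of the per-reference control flow, assuming 4KB pages and a small fully associative TLB (the types are illustrative, and pt_walk() stands in for consulting the page table):

#include <stdint.h>

#define PAGE_SHIFT  12
#define OFFSET_MASK 0xFFFu
#define TLB_SIZE    64

typedef struct { uint32_t vpn, pfn; int valid; } tlb_entry_t;
static tlb_entry_t tlb[TLB_SIZE];

uint32_t pt_walk(uint32_t vpn);                    /* assumed: walks the page table */

uint32_t tlb_translate(uint32_t vaddr) {
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    for (int i = 0; i < TLB_SIZE; i++)             /* fully associative lookup */
        if (tlb[i].valid && tlb[i].vpn == vpn)     /* TLB hit: no page-table access */
            return (tlb[i].pfn << PAGE_SHIFT) | (vaddr & OFFSET_MASK);
    uint32_t pfn = pt_walk(vpn);                   /* TLB miss: consult the page table */
    tlb[vpn % TLB_SIZE] = (tlb_entry_t){ vpn, pfn, 1 };  /* install the translation */
    return (pfn << PAGE_SHIFT) | (vaddr & OFFSET_MASK);
}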

19.3 Who Handles The TLB Miss?

  • CISC (e.g., x86): hardware-managed TLBs; on a miss, the hardware walks the page table itself
  • RISC (e.g., MIPS): software-managed TLBs; a miss raises a trap, and an OS handler updates the TLB

The software approach: more flexible (the OS can use any page-table structure) and simpler hardware; the OS must avoid an infinite chain of TLB misses, e.g., by keeping the handler in unmapped physical memory or permanently wiring its translations into the TLB.

19.4 TLB Contents: What’s In There?

The TLB typically has 32, 64, or 128 entries and is fully associative (any given translation can be anywhere in the TLB).

19.5 TLB Issue: Context Switches

Some hardware systems provide an address space identifier (ASID) in the TLB to identify the associated process.

19.6 Issue: Replacement Policy

One common approach is to evict the least-recently-used or LRU entry.

Another typical approach is to use a random policy, which evicts a TLB mapping at random.

19.7 A Real TLB Entry

How to handle 'ASID overflow' (more running processes than available ASIDs)?

In MIPS, the OS can reserve some TLB slots for itself (via the wired register) for translations it needs during critical operations.

20 Paging: Smaller Tables

Here, the problem should be clear: simple linear (array-based) page tables are too big.
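
A worked example, assuming a 32-bit address space with 4KB pages and 4-byte page-table entries: $2^{32} / 2^{12} = 2^{20}$ entries, and $2^{20} \times 4$ bytes = 4MB of page table, for each process in the system.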

20.2 Hybrid Approach: Paging and Segments

Whenever you have two reasonable but different approaches to something in life, you should always examine the combination of the two to see if you can obtain the best of both worlds.