Architectural Support
Introduction
Hardware must provide certain features for an OS to run properly!
I/O
- I/O devices and CPU can execute concurrently
- Each device controller is in charge of a particular device type
- Each device has a local buffer
- CPU issues specific commands to I/O devices
- CPU moves data between main memory and local buffers
Problem: I/O is slow, and sometimes its completion time is unpredictable (e.g. keyboards).
The CPU is a precious resource; it should not sit idle waiting for time-consuming I/O!
Interrupts
Instead of polling (periodically checking whether the issued command has been completed or not), we use (hardware) interrupts.
e.g. We want to read a file from the disk.
- CPU asks disk controller to read a file. (Then it runs other processes until interrupt happens.)
- Disk controller reads file.
- Disk controller finishes reading the file and sends an interrupt signal to the interrupt controller.
- Interrupt controller sends interrupt signal to CPU.
- Interrupt controller sends interrupt information (e.g. who interrupted) to CPU.
- CPU finishes the current instruction and saves its state. (e.g. registers, program counter)
- CPU determines the interrupt's type. (polling or vectored interrupt system)
- CPU transfers control to the ISR (interrupt service routine) or interrupt handler.
- When the ISR or interrupt handler finishes, the saved state is restored and control returns to the next instruction of the interrupted program.
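The sequence above can be modeled as a toy simulation. This is only a sketch of vectored dispatch: the CPU class, the IRQ number 14, and disk_isr are hypothetical names made up for illustration, not real hardware or OS interfaces.

```python
# Toy model of vectored interrupt dispatch (all names are illustrative).
from dataclasses import dataclass, field

DISK_IRQ = 14  # hypothetical interrupt number for the disk controller


@dataclass
class CPU:
    pc: int = 0                                        # program counter
    regs: list = field(default_factory=lambda: [0] * 4)
    vector_table: dict = field(default_factory=dict)   # IRQ number -> ISR
    pending: list = field(default_factory=list)        # IRQs raised by devices
    log: list = field(default_factory=list)

    def raise_irq(self, irq):
        """Called by the (simulated) interrupt controller."""
        self.pending.append(irq)

    def step(self):
        """Finish one instruction, then service any pending interrupts."""
        self.regs[0] += 1          # stand-in for "execute the current instruction"
        self.pc += 1
        while self.pending:
            irq = self.pending.pop(0)
            saved = (self.pc, list(self.regs))   # preserve the CPU state
            self.vector_table[irq](self)         # transfer control to the ISR
            self.pc, self.regs = saved           # restore state, resume program


def disk_isr(cpu):
    cpu.regs[0] = -1               # the ISR may clobber registers freely
    cpu.log.append("disk read complete")


cpu = CPU(vector_table={DISK_IRQ: disk_isr})
cpu.step()                 # the program keeps running while the disk is busy
cpu.raise_irq(DISK_IRQ)    # the disk controller signals completion
cpu.step()                 # the interrupt is taken after this instruction
```

Because the state is saved and restored around the ISR, the interrupted program never notices that its registers were used in between.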
Some I/O devices are so fast that interrupt overhead dominates!
To avoid that overhead (e.g. context switches), some user programs poll the I/O hardware directly.
Data Transfer Modes
- PIO (Programmed I/O)
- CPU moves data between I/O devices and memory.
- Can use special I/O instructions or memory-mapped I/O.
- Problem: Every transfer goes through a CPU register, so the CPU moves at most one register width (e.g. 64 bits) per instruction.
- DMA (Direct Memory Access)
- Device controller transfers blocks of data from the local buffer directly to main memory without CPU intervention.
- Only one interrupt is generated per request (after DMA finished).
- We don't have to waste CPU time!
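To make the difference concrete, here is a rough sketch that counts how much work lands on the CPU in each mode. It assumes a 64-bit register width and a 4096-byte block; pio_copy and dma_copy are hypothetical helper names, and the "controller" is just a Python slice assignment.

```python
WORD_BYTES = 8  # assumption: 64-bit registers


def pio_copy(device_buffer: bytes, memory: bytearray) -> int:
    """CPU copies one register-sized word at a time; returns CPU copy steps."""
    cpu_steps = 0
    for off in range(0, len(device_buffer), WORD_BYTES):
        word = device_buffer[off:off + WORD_BYTES]   # load into a "register"
        memory[off:off + len(word)] = word           # store to main memory
        cpu_steps += 1
    return cpu_steps


def dma_copy(device_buffer: bytes, memory: bytearray) -> int:
    """Controller copies the whole block; returns the interrupts raised."""
    memory[:len(device_buffer)] = device_buffer      # done by the controller
    return 1                                         # one completion interrupt


block = bytes(range(256)) * 16        # a 4096-byte block in the device buffer
mem_pio = bytearray(len(block))
mem_dma = bytearray(len(block))
pio_steps = pio_copy(block, mem_pio)          # 4096 / 8 CPU copy steps
dma_interrupts = dma_copy(block, mem_dma)     # CPU is only notified once
```

Both calls leave the same data in memory; the difference is how many times the CPU has to get involved.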
Protection
How to prevent user applications from harming the system?
- Applications shouldn't access disk drives directly.
- Applications shouldn't execute privileged instructions. (e.g. the HLT instruction stops the processor)
Privileged instructions (protected instructions)
Only kernel mode can perform certain tasks.
e.g. direct I/O access, accessing system registers, memory state management (page table, TLB, etc)
How does the CPU know if a protected instruction can be executed?
The architecture supports several modes of operation.
- x86_64: Ring 0 (kernel) > Ring 1 > Ring 2 > Ring 3 (user)
Rings 1 and 2 were meant to be used by device drivers, but crossing rings with system calls was too slow.
Most device drivers only use Ring 0.
- ARM: EL3 (secure monitor) > EL2 (hypervisor) > EL1 (kernel) > EL0 (user)
- RISC-V: Machine (firmware, mainly at boot) > Supervisor (kernel) > User
Mode can be set by a status bit in a protected register.
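A minimal sketch of the check the hardware performs, assuming just two modes. The instruction names other than HLT, the execute function, and PrivilegeError are hypothetical; real CPUs do this check in the decode/execute logic and raise a hardware exception.

```python
class PrivilegeError(Exception):
    """Stands in for the exception the hardware would raise."""


KERNEL, USER = 0, 3   # numbering borrowed from the x86 rings

# Illustrative subset; apart from HLT these names are made up.
PRIVILEGED = {"HLT", "LOAD_PAGE_TABLE", "IO_WRITE"}


def execute(instr: str, mode: int) -> str:
    """Refuse privileged instructions unless the mode bits say kernel."""
    if instr in PRIVILEGED and mode != KERNEL:
        raise PrivilegeError(f"{instr} attempted in user mode")
    return f"executed {instr}"
```

For example, execute("ADD", USER) succeeds, while execute("HLT", USER) raises PrivilegeError, which the OS would then handle (typically by killing the offending process).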
Servicing Requests
Okay, OS prevents applications from performing certain tasks.
Then how do we request services from the OS?
e.g. How can an application read a file if it cannot access disk drives?
System Calls
The OS defines a set of system calls - a programming interface to the services provided by the OS.
The OS may reject an illegal request, impose a quota on certain resources, or consider fairness when sharing resources.
A system call is a protected procedure call!
On entry, the CPU switches to kernel mode.
On exit, the CPU switches back to user mode.
Exceptional Events
- Interrupts
- Generated by hardware devices
- Asynchronous
- Exceptions
- Generated by software executing instructions
- Can be unintentional; e.g. divide by zero
- Can be intentional; e.g. syscall instruction
- Synchronous
- Exception handling is the same as interrupt handling
The terminology for exceptions differs between CPU architectures.
e.g. In x86_64, exceptions are divided into three categories.
- Traps
- Intentional
- e.g. system call, breakpoint, special instructions, ...
- Return control to next instruction
- Faults
- Unintentional but possibly recoverable
- e.g. page faults, protection faults, ...
- Re-execute faulting instruction or abort
- Aborts
- Unintentional and unrecoverable
- e.g. parity error, machine check, ...
- Abort the current program or halt the system
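The practical difference between the three categories is where execution resumes afterwards. A small sketch of that rule, assuming the handler succeeded in the fault case; resume_pc is a hypothetical helper, not an x86 mechanism.

```python
from typing import Optional


def resume_pc(kind: str, pc: int, instr_len: int) -> Optional[int]:
    """Return the address to resume at, or None if execution cannot resume.

    pc is the address of the instruction that caused the exception.
    """
    if kind == "trap":
        return pc + instr_len    # continue with the next instruction
    if kind == "fault":
        return pc                # re-execute the faulting instruction
    if kind == "abort":
        return None              # unrecoverable: nothing to return to
    raise ValueError(f"unknown exception kind: {kind}")
```

This is why a page fault is invisible to the program: once the OS has brought the page in, the same load or store simply runs again.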
OS Trap
There must be a special trap instruction that:
- Causes an exception, which invokes a kernel handler (i.e. runs in kernel mode)
- Passes a parameter indicating which system call to invoke
- Saves the caller's state (e.g. registers, mode bits)
- Restores that state and returns to user mode when done
The OS must also verify the caller's parameters.
e.g. x86_64 has the SYSCALL instruction
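The requirements above can be sketched as a toy dispatcher. Everything here is an assumption made for illustration: the Machine class, the system call number 0 for read, and the file descriptor 3 do not correspond to any real kernel ABI.

```python
USER, KERNEL = "user", "kernel"


class Machine:
    def __init__(self):
        self.mode = USER
        self.files = {3: b"hello"}            # hypothetical open files: fd -> data
        self.syscalls = {0: self.sys_read}    # hypothetical syscall numbering
        self.mode_log = []                    # modes observed inside handlers

    def sys_read(self, fd, count):
        self.mode_log.append(self.mode)
        # The OS verifies the caller's parameters before acting on them.
        if fd not in self.files or count < 0:
            return -1
        return self.files[fd][:count]

    def trap(self, number, *args):
        """Model of the trap instruction: enter the kernel, dispatch, return."""
        saved_mode = self.mode                # save the caller's state
        self.mode = KERNEL                    # the handler runs in kernel mode
        try:
            handler = self.syscalls.get(number)
            if handler is None:
                return -1                     # unknown system call number
            return handler(*args)
        finally:
            self.mode = saved_mode            # restore state, back to user mode


m = Machine()
data = m.trap(0, 3, 2)        # read 2 bytes from fd 3
bad_fd = m.trap(0, 99, 2)     # rejected: fd 99 is not open
unknown = m.trap(42)          # rejected: no such system call
```

The application never touches the file data itself; it can only pass a number and parameters, and the kernel decides what to do with them.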
c.f. Monolithic kernel: The kernel provides all services: memory management, file systems, device drivers, etc.
Microkernel: The kernel only provides basics such as memory management and process scheduling. File systems and device drivers run at user level.
A microkernel is safer (if the file system fails, we can restart just the file system), but it is slower (requests must be passed as messages between the kernel and the file system instead of being served within a single system call).
Control
How does the OS take control of the CPU back from a running program?
We need to run other processes, system calls, etc.
- Each application periodically transfers the control of the CPU to OS by calling various system calls.
- A special system call can be used just to release the CPU. (e.g. yield())
Problem: What if an application never makes a system call?
What if a process ends up in an infinite loop?
Timers
A non-cooperative approach: Use a hardware timer that generates a periodic interrupt.
The OS is guaranteed to get the CPU back within a fixed time period!
The timer is privileged - Only the OS can load it!
e.g. Linux used a 10 ms tick in 2.4, 1 ms in 2.6, and 4 ms in 5.5
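A small simulation shows why the timer solves the infinite-loop problem. This is a sketch of round-robin preemption, assuming time is counted in abstract ticks; run_with_timer and the process names are hypothetical.

```python
from collections import deque


def run_with_timer(processes, quantum, total_ticks):
    """Simulate timer-driven preemption.

    processes maps a name to its remaining work in ticks (None = infinite loop).
    Returns a list of (name, ticks_run) slices in the order they were scheduled.
    """
    ready = deque(processes.items())
    schedule = []
    ticks = 0
    while ready and ticks < total_ticks:
        name, remaining = ready.popleft()
        run = quantum if remaining is None else min(quantum, remaining)
        run = min(run, total_ticks - ticks)     # stop at the simulation end
        schedule.append((name, run))
        ticks += run
        if remaining is None:
            ready.append((name, None))          # timer interrupt preempts the loop
        elif remaining - run > 0:
            ready.append((name, remaining - run))
    return schedule


# "spin" never yields and never finishes; "job" needs 3 ticks of CPU time.
schedule = run_with_timer({"spin": None, "job": 3}, quantum=2, total_ticks=8)
```

Even though "spin" never gives up the CPU voluntarily, "job" still gets its 3 ticks, because the timer interrupt hands control back to the OS after every quantum.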
Memory
Applications can't access hardware resources directly.
But applications can access (read/write) memory directly!!!
In theory, every memory access could go through a system call, but that would make processes far too slow!
Since memory accesses must be fast, protection has to be enforced by hardware rather than software.
But if the OS has all the information, how can hardware determine whether memory access is valid?
Simplest Memory Protection
Use base/limit registers!
Each process can read/write memory only if the address is within the bounds set by its base and limit registers; an out-of-bounds access raises an exception.
Can be useful in a simple embedded environment!
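A minimal sketch of the check, assuming the variant where program addresses are offsets that the hardware relocates by the base register (some designs instead compare physical addresses against the two registers). The names translate and ProtectionFault are hypothetical.

```python
class ProtectionFault(Exception):
    """Stands in for the exception raised on an out-of-bounds access."""


def translate(addr: int, base: int, limit: int) -> int:
    """Check addr against the limit register, then relocate it by base."""
    if not 0 <= addr < limit:
        raise ProtectionFault(f"address {addr:#x} is outside limit {limit:#x}")
    return base + addr


# A process loaded at 0x4000 with 0x1000 bytes of memory:
physical = translate(0x10, base=0x4000, limit=0x1000)
```

The OS only has to load the two registers on a context switch; the comparison itself happens in hardware on every access, which is what keeps it fast.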
Virtual Memory
Modern CPUs have dedicated memory management hardware! (MMU - Memory Management Unit)
MMU can provide more sophisticated memory protection mechanisms.
- Virtual memory
- Paging, page tables, page protections, TLBs
- Memory segmentation
Obviously, manipulating the MMU is a privileged operation.
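As a rough illustration of paging with page protections, the sketch below uses a single-level page table with 4096-byte pages and one "writable" bit per page. The table contents, mmu_translate, and PageFault are made up for the example; a real MMU walks multi-level tables and caches translations in the TLB.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


class PageFault(Exception):
    """Stands in for the fault the MMU raises on an invalid access."""


# Hypothetical page table: virtual page number -> (physical frame, writable)
page_table = {
    0: (7, False),   # e.g. a read-only code page
    1: (3, True),    # e.g. a writable data page
}


def mmu_translate(vaddr: int, write: bool = False) -> int:
    """Translate a virtual address, enforcing the page protection bit."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)
    if entry is None:
        raise PageFault(f"page {vpn} is not mapped")
    frame, writable = entry
    if write and not writable:
        raise PageFault(f"write to read-only page {vpn}")
    return frame * PAGE_SIZE + offset


read_addr = mmu_translate(5)                           # read from page 0
write_addr = mmu_translate(PAGE_SIZE + 5, write=True)  # write to page 1
```

The OS fills in the page table (a privileged operation), and the hardware applies it to every access without any further OS involvement.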
Synchronization
Problem: Interrupts can occur at any time and may interfere with the interrupted code.
How do we coordinate concurrent activities?
c.f. Heisenbug: A bug that disappears or alters its behavior when one attempts to study it.
This happens because thread timing is nondeterministic!
Atomic Instructions
CPU supports atomic instructions which help with writing multithreaded code!
e.g. RISC-V has AMO (Atomic Memory Operation) instructions, which can swap, add, or apply bitwise and/or/xor to a value directly in memory.
This makes implementing locks and mutexes easier!
But it is still possible to implement them without atomic instructions (e.g. Peterson's algorithm, assuming sequentially consistent memory)...
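To show how an atomic swap leads to a lock, here is a sketch of a spinlock. Python has no atomic swap instruction, so the AtomicWord class below only models one: its internal threading.Lock is a stand-in for the atomicity that the hardware would provide. AtomicWord, SpinLock, and worker are hypothetical names.

```python
import threading
import time


class AtomicWord:
    """Models a memory word with an atomic swap (in the spirit of an AMO swap).

    The internal lock is only a stand-in for the hardware's atomicity.
    """

    def __init__(self, value=0):
        self._value = value
        self._hw = threading.Lock()

    def swap(self, new):
        """Store new and return the old value as one indivisible step."""
        with self._hw:
            old, self._value = self._value, new
            return old


class SpinLock:
    def __init__(self):
        self._flag = AtomicWord(0)       # 0 = free, 1 = held

    def acquire(self):
        # Whoever swaps in 1 and gets 0 back has taken the lock.
        while self._flag.swap(1) == 1:
            time.sleep(0)                # let other threads run while waiting

    def release(self):
        self._flag.swap(0)


def worker(lock, shared, n):
    for _ in range(n):
        lock.acquire()
        shared["counter"] += 1           # critical section
        lock.release()


lock = SpinLock()
shared = {"counter": 0}
threads = [threading.Thread(target=worker, args=(lock, shared, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Only one thread at a time can see 0 come back from the swap, so the increments in the critical section never interleave.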