OS Notes: Processes & Threads
Processes and threads
Thread usage
- Only with threads we add a new element: the ability for the parallel entities to share an address space and all of its dataamong themselves.
- They are lighter weight than processes and easier (faster) to create and destroy than processes.
- When there is substantial computing and also substantial I/O, having threads allows these activities to overlap, thus speeding up the application.
- Useful on systems with multiple CPUs, where real parallelism is possible.
Processes are used to group resources together, threads are the entities scheduled for execution on the CPU.
per-process items | per-thread items |
---|---|
address space global variables open files child processes pending alarms signals and signal handlers accounting information | program counter registers stack state |
User space threads
Pros
- A user-level threads package can be implemented on an operating system that does not support threads.
- With instructions that store all the registers and load them all, the entire thread switch can be done in a handful of instructions.
- Invoking thread_yield is more efficient than making a kernel call. No trap is needed, no context switch is needed, the memory cache need not be flushed.
- They allow each process have its own customized scheduling system.
- Kernel threads invariably require table space and stack space in the kernel.
Cons
- How blocking system calls are implemented concerns since they will stop all the threads.
- Always using nonblocking system calls require changes to the operating system and will require changes to many user programs.
- Invoking select to tell in advance if a call will block requires rewriting parts of the system call library, and is inefficient and inelegant.
- If a thread causes a page fault, the kernel, unaware of even the existence of threads, blocks the process until the disk I/O is complete, even though other threads might be runnable.
- No other thread in the same process will ever run unless the first thread voluntarily gives up the CPU.
- Having the run-time system request a clock signal once a second to give it control is crude and messy.
- Some signals are logically thread specific. The kernel can hardly direct the signal to the right one.
Threads in the kernel
The kernel’s thread table hold each thread’s registers, state, and other information.
Due to the greater cost of creating and destroying threads in the kernel, some systems take an environmentally correct approach and recycle their threads.
Pros
- Kernel threads do not require any new, nonblocking system calls.
- Kernel can easily check to see if the process any other runnable threads if one thread causes a page fault.
Cons
- The cost of a system call is substantial.
- What happens when a multithreaded process forks concerns.
- If two or more threads register for the same signal what happens when a signal comes in?
Making single-threaded code multithreaded
Global variables, such as erroro maintained by UNIX, break consistency.
Assign each thread its own private global variables. (thread_local)
Many library procedures are not reentrant.
To rewrite the entire library is nontrivial with a real possibility of introducing subtle errors.
Provide each procudure with a jacket that sets a bit to mark the library as in use.
A process with multiple threads must also have multiple stacks. If the kernel is not aware of all the stacks, it cannot grow them automatically upon stack fault.
When to schedule
- When a new process is created.
- When a process exits.
- When a process blocks on I/O or some other reason.
- When an I/O interrupt occurs.
- At each clock interrupt or at every k-th clock interrupt.
Scheduling in batch systems
- First-come, first-served
- Shorted job first
- Shorted remaining time next
Scheduling in interactive systems
- Round-robin scheduling
- Priority scheduling
- Multiple queues
- Shortest process next
- Guaranteed scheduling
- Lottery scheduling
- Fair-share scheduling