Operating Systems
CSS 430 - Operating Systems
Neso Operating Systems Video
A modern general-purpose computer system consists of one or more CPUs and a number of device controllers connected through a common bus that provides access to shared memory
System Calls
System calls provide an interface to the services made available by an operating system
Two modes
- user mode
- privileged mode (kernel mode): direct access to resources
When a user program makes a system call, it is temporarily switched from user mode to kernel mode so it can use those resources (a mode switch)
The 6 types of System Calls Are:
- process control
- file management
- device management
- information maintenance
- communications
- protection
Interprocess Communication
Any process that shares data with another process is a cooperating process
Shared Memory
In the shared memory model, a region of memory that is shared by cooperating processes is established
Processes can then exchange information by reading and writing data to the shared region
- Typically, a shared memory region resides in the address space of the process creating the shared memory segment
- Other processes that wish to communicate using this shared-memory segment must attach it to their address space
- Normally, an operating system will try to prevent one process from accessing another process's memory
- Shared memory requires that two or more processes agree to remove this restriction
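As a concrete illustration, here is a minimal POSIX shared-memory sketch (the segment name `/demo_shm`, its size, and the string written are all hypothetical); it shows one process creating a segment with `shm_open()` and attaching it to its address space with `mmap()`, the create-and-attach steps described above.

```cpp
#include <fcntl.h>      // O_CREAT, O_RDWR
#include <sys/mman.h>   // shm_open, mmap, munmap, shm_unlink
#include <unistd.h>     // ftruncate, close
#include <cstring>
#include <cstdio>

int main() {
    const char* name = "/demo_shm";            // hypothetical segment name
    const size_t SIZE = 4096;

    // Creating process: create and size the shared-memory object
    int fd = shm_open(name, O_CREAT | O_RDWR, 0666);
    if (fd == -1) { perror("shm_open"); return 1; }
    ftruncate(fd, SIZE);

    // Attach (map) the segment into this process's address space
    void* ptr = mmap(nullptr, SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) { perror("mmap"); return 1; }

    // Any cooperating process that shm_open()s and mmap()s the same name
    // now reads and writes the same region of memory.
    std::strcpy(static_cast<char*>(ptr), "written via shared memory");

    munmap(ptr, SIZE);
    close(fd);
    shm_unlink(name);   // remove the object once it is no longer needed
    return 0;
}
```

A cooperating process would call `shm_open()` and `mmap()` with the same name to attach the same region; on some Linux systems the program must be linked with `-lrt`.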
Producer Consumer Problem
A producer process produces information that is consumed by a consumer process
We need to synchronize the producer and consumer so that the consumer doesn't try to consume what the producer has not produced yet
- To allow producer and consumer processes to run concurrently, we must have a buffer of items available that can be filled by the producer and emptied by the consumer.
- The buffer will reside in a region of memory that is shared by the producer and consumer processes
- A producer can produce one item while the consumer is consuming another item
- The producer and consumer must be synchronized so that the consumer does not try to consume an item that has not been produced yet
Two Kinds of Buffers
- Bounded: Assumes a fixed buffer size. In this case, the consumer must wait if the buffer is empty, and the producer must wait if the buffer is full.
- Unbounded: No practical limit on the size of the buffer. The consumer may have to wait for new items, but the producer can always produce new items.
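Below is a small sketch of the classic bounded (circular) buffer kept in shared memory. `BUFFER_SIZE`, the item type, and the busy-wait loops are illustrative only; real code would add proper synchronization (covered later with semaphores).

```cpp
#include <cstddef>

// Classic textbook bounded buffer: the producer advances 'in', the consumer
// advances 'out'; one slot is left empty to distinguish full from empty.
const std::size_t BUFFER_SIZE = 8;

struct SharedBuffer {
    int items[BUFFER_SIZE];
    std::size_t in  = 0;   // next free slot (written by the producer)
    std::size_t out = 0;   // next full slot (read by the consumer)
};

// Producer: busy-waits while the buffer is full
void produce(SharedBuffer& b, int item) {
    while ((b.in + 1) % BUFFER_SIZE == b.out)
        ;  // buffer full: wait
    b.items[b.in] = item;
    b.in = (b.in + 1) % BUFFER_SIZE;
}

// Consumer: busy-waits while the buffer is empty
int consume(SharedBuffer& b) {
    while (b.in == b.out)
        ;  // buffer empty: wait
    int item = b.items[b.out];
    b.out = (b.out + 1) % BUFFER_SIZE;
    return item;
}
```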
Message Passing
Communication takes place by means of messages exchanged by the cooperating processes
Message passing provides a mechanism to allow processes to communicate and synchronize their actions without sharing the same address space and is particularly useful in a distributed environment where the communicating processes may reside on different computers connected by a network
Provides at least 2 operations
- send(message)
- receive(message)
Messages sent can be either fixed or variable in size
- Fixed Size: The system level implementation is straightforward, however the task of programming is more difficult because you always have to keep the size of the message in mind
- Variable Size: Requires a more complex system level implementation, however programming becomes simpler
If processes P and Q want to communicate, they must send messages to and receive messages from each other; a communication link must exist between them.
There are several methods for logically implementing a link and the send()/receive() operations, such as:
- direct or indirect communication
- Synchronous or Asynchronous
- Automatic or explicit buffering
Direct or Indirect Communication
Processes can either use direct or indirect communication to communicate with each other
Direct Communication
each process must explicitly name the recipient or sender of the communication
- send(P, message): send a message to process P
- receive(Q, message): receive a message from process Q
A link is established between every pair of processes that want to communicate, the processes need to know only each other's identity to communicate.
A link is associated with exactly two processes
Between each pair there is exactly one link
A disadvantage of direct communication is the limited modularity of the resulting process definitions. Changing the identifier of a process may necessitate examining all other process definitions
Indirect communication
Messages are sent to and received from mailboxes or ports
- a mailbox can be viewed abstractly as an object into which messages can be placed by processes and from which messages can be removed
- each mailbox has a unique id
- two processes can communicate only if the processes have a shared mailbox
- send(A, message): send a message to mailbox A
- receive(A, message): receive a message from mailbox A
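POSIX message queues behave much like the mailboxes described here: any process that opens the same queue name can send to or receive from it. The sketch below is illustrative; the queue name `/demo_mailbox`, the sizes, and the message are made up, and on some Linux systems the program must be linked with `-lrt`.

```cpp
#include <fcntl.h>      // O_CREAT, O_RDWR
#include <mqueue.h>     // mq_open, mq_send, mq_receive, mq_close, mq_unlink
#include <cstring>
#include <cstdio>
#include <iostream>

int main() {
    mq_attr attr{};
    attr.mq_maxmsg  = 10;   // queue capacity (messages)
    attr.mq_msgsize = 64;   // maximum message size in bytes

    // Open (or create) the mailbox; other processes use the same name
    mqd_t mq = mq_open("/demo_mailbox", O_CREAT | O_RDWR, 0644, &attr);
    if (mq == (mqd_t)-1) { perror("mq_open"); return 1; }

    const char* msg = "hello";
    mq_send(mq, msg, strlen(msg) + 1, 0);          // send(A, message)

    char buf[64];
    unsigned prio;
    mq_receive(mq, buf, sizeof(buf), &prio);       // receive(A, message)
    std::cout << buf << std::endl;

    mq_close(mq);
    mq_unlink("/demo_mailbox");                    // remove the mailbox
    return 0;
}
```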
A communication link in this scheme has the following properties:
- a link is established between a pair of processes only if both members of the pair have a shared mailbox
- a link may be associated with more than two processes
- between each pair of communicating processes, there may be a number of different links, each with a corresponding mailbox
If more than two processes share a mailbox, which one receives a given message? The answer depends on which of the following methods is chosen:
- allow a link to be associated with at most two processes (one sender and one receiver)
- allow at most one process at a time to execute a receive operation
- allow the system to select arbitrarily which process will receive the message. The system may also define an algorithm for selecting the receiver (such as round robin), and may identify the receiver to the sender.
A mailbox may be owned by either a process or the operating system. If a process owns the mailbox, the mailbox disappears when that process terminates.
Message passing may be either blocking or non-blocking, also known as synchronous and asynchronous.
Blocking Send: the sending process is blocked until the message is received by the receiving process or by the mailbox.
Blocking Receive: the receiver blocks until a message is available
Nonblocking send: the sending process sends the message and then resumes operation
Non Blocking Receive: the receiver retrieves either a valid message or a NULL
Buffering
Whether communication is direct or indirect, messages exchanged by communicating processes reside in a temporary queue. Such queues can be implemented in three ways:
- Zero Capacity: The queue has a maximum length of 0, thus the link cannot have any messages waiting in it. In this case, the sender must block until the recipient receives the message. Only one message can be in flight from sender to receiver.
- Bounded Capacity: The queue has a finite length n, thus at most n messages can reside in it. If the queue is not full when a new message is sent, the message is placed in the queue and the sender can continue execution without waiting. The link's capacity is finite, however; if the link is full, the sender must block until space is available in the queue.
- Unbounded Capacity: The queue length is potentially infinite, thus the sender never blocks.
Sockets
A socket is an endpoint for communication
A pair of processes communicating over a network employ a pair of sockets - one for each process
A socket is identified by an IP address concatenated with a port number
The server waits for incoming client requests by listening to a specified port. Once a request is received, the server accepts a connection from the client socket to complete the connection
The packets traveling between hosts are delivered to the appropriate process based on the destination port number
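For illustration, here is a minimal TCP client sketch using the Berkeley sockets API; the address 127.0.0.1 and port 8080 are placeholders, and a matching server would create its own socket, bind() to the port, listen(), and accept().

```cpp
#include <arpa/inet.h>   // inet_pton, htons
#include <netinet/in.h>  // sockaddr_in
#include <sys/socket.h>  // socket, connect
#include <unistd.h>      // read, write, close
#include <cstdio>
#include <cstring>

int main() {
    // Create a TCP socket: this process's endpoint of the communication
    int sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0) { perror("socket"); return 1; }

    // The remote socket is identified by an IP address plus a port number
    sockaddr_in server{};
    server.sin_family = AF_INET;
    server.sin_port   = htons(8080);                    // hypothetical port
    inet_pton(AF_INET, "127.0.0.1", &server.sin_addr);  // hypothetical server

    if (connect(sock, (sockaddr*)&server, sizeof(server)) < 0) {
        perror("connect");
        return 1;
    }

    const char* request = "hello";
    write(sock, request, strlen(request));     // send bytes to the server

    char reply[128];
    ssize_t n = read(sock, reply, sizeof(reply) - 1);
    if (n > 0) { reply[n] = '\0'; printf("%s\n", reply); }

    close(sock);
    return 0;
}
```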
Threads
A thread is a basic unit of execution
A thread is comprised of:
- a thread ID
- a program counter
- a register set
- a stack
It shares its code, data, and file sections with its parent process and other threads belonging to that process
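A short C++ sketch of this: several threads share the process's data section (here a global counter) while each runs on its own stack; the thread count and worker body are arbitrary.

```cpp
#include <iostream>
#include <thread>
#include <vector>

int shared_counter = 0;   // data section shared by all threads of the process

void worker(int id) {
    // Each thread has its own stack, registers, and program counter,
    // but shares the process's code and data (e.g., shared_counter).
    ++shared_counter;     // unsynchronized here; a real program would lock
    std::cout << "thread " << id << " running\n";
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back(worker, i);
    for (auto& t : threads)
        t.join();
    std::cout << "counter = " << shared_counter << std::endl;
    return 0;
}
```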
Multithreading Benefits
Responsiveness
Multithreading an interactive application may allow a program to continue running even if part of it is blocked or is performing a lengthy operation, thereby increasing responsiveness to the user
Resource Sharing
By default, threads share the memory and the resources of the process to which they belong. The benefit of sharing code and data is that it allows an application to have several different threads of activity within the same address space
Economy
Allocating memory and resources for process creation is costly. Because threads share the resources of the process to which they belong, it is more economical to create and context switch threads
Utilization of multiprocessor architecture
The benefits of multithreading can be greatly increased in a multiprocessor architecture, where threads may run in parallel on different processors. A single-threaded process can run on only one CPU, no matter how many are available. Multithreading on a multi-CPU machine increases concurrency.
Multi threading models
In general, there are two types of threads:
- User threads: supported above the kernel and are managed without kernel support
- Kernel Threads: supported and managed directly by the operating system
For user threads to work with kernel threads, there must exist a relationship between them.
There are three common ways of establishing this relationship
Many-to-one Model
Advantages
- maps many user-level threads to one kernel thread
- thread management is done by the thread library in user space
Disadvantages
- the entire process will block if a thread makes a blocking system call
- Because only one thread can access the kernel at a time, multiple threads are unable to run in parallel on multiprocessors
One-to-one Model
Advantages
- maps each user thread to a kernel thread.
- Provides more concurrency than the many-to-one model by allowing another thread to run when a thread makes a blocking system call
- also allows multiple threads to run in parallel on multiprocessors
Disadvantages
- Creating a user thread requires creating the corresponding kernel thread
- Because the overhead of creating kernel threads can burden performance, most implementations of this model restrict the number of threads supported by the system
Many-to-many model
Implemented in most systems
Advantages
- multiplexes many user-level threads to a smaller or equal number of kernel threads
- the number of kernel threads may be specific to either a particular application or a particular machine
- Developers can create as many user threads as necessary and the corresponding kernel threads can run in parallel on a multiprocessor
- When a thread performs a blocking call, the kernel can schedule another thread for execution
Scheduling
In a single-processor system, only one process can run at a time. All others must wait until the CPU is free and can be rescheduled.
The objective of multiprogramming is to have some process running at all times to maximize utilization
A process is executed until it must wait, typically for the completion of some I/O request
Several processes are kept in memory at one time
When one process has to wait, the operating system takes the CPU away from that process and gives the CPU to another process and this pattern continues
CPU and I/O Burst Cycles
Process execution consists of a cycle of CPU execution and I/O wait. Processes alternate between these two states.
When a process has begun execution, it is in a CPU burst; otherwise it is waiting for an I/O event
CPU Burst: the time when a process is under CPU execution
I/O burst: when the process is waiting on an I/O event
Eventually, the final CPU burst ends with a system request to terminate execution
Preemptive and NonPreemptive Scheduling
Whenever the CPU becomes idle, the operating system must select one of the processes in the ready queue to be executed. The selection process is carried out by the short-term scheduler(or CPU scheduler). The scheduler selects a process from the processes in memory that are ready to execute and allocates the CPU to that process
Dispatcher: The dispatcher is the module that gives control of the CPU to the process selected by the short-term scheduler. The time it takes for the dispatcher to stop one process and start another running is known as the dispatch latency
Preemptive Scheduling
CPU- Scheduling may take place under the following four circumstances
- When a process switches from running to waiting
- When a process switches from running to ready (for example, when an interrupt occurs)
- when a process switches from waiting to ready
- when a process terminates
For the first and last circumstances, there is no choice in terms of scheduling; a new process must be selected. We call this nonpreemptive scheduling: once a process is running, it is allowed to keep the CPU until its execution is complete or it switches to the waiting state. Scheduling under the other two circumstances is preemptive.
Scheduling Criteria
CPU Utilization
We want to keep the CPU as busy as possible. Conceptually, CPU utilization can range from 0 to 100 percent. In a real system, it should range from 40 percent to 90 percent
Throughput
The number of processes completed in a certain time unit. If the CPU is busy, the work is being done.
Turnaround Time
The time it takes from a process entering the ready queue to its completion
Turn Around Time = Completion Time - Arrival Time
Waiting Time
The total sum of time a process spends waiting in the ready queue
Waiting time = Turn Around Time - Burst Time
Response time
The time it takes from submission until the first response is produced
FCFS (First Come First Serve)
- Simplest algorithm
- The process that requests the CPU first is allocated the CPU first
- easily implemented with a FIFO queue
- Once a process is allocated to the CPU, it is removed from the head of the queue
- the average waiting time is often fairly long
- In many algorithms, the process with the smaller id breaks a tie
Example (from a Gantt chart not reproduced here):
P5 turnaround time = 13 - 5 = 8
P5 waiting time = 8 - 4 = 4
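A small sketch of how these FCFS metrics are computed; the arrival and burst times below are made-up values, not the example above.

```cpp
#include <iostream>
#include <vector>

// Turnaround time = completion time - arrival time
// Waiting time    = turnaround time - burst time
struct Proc { int arrival, burst; };

int main() {
    // Hypothetical processes, already ordered by arrival (FCFS order)
    std::vector<Proc> procs = { {0, 5}, {1, 3}, {2, 8}, {3, 6} };

    int clock = 0;
    double totalTurnaround = 0, totalWaiting = 0;
    for (const Proc& p : procs) {
        if (clock < p.arrival) clock = p.arrival;  // CPU idle until arrival
        clock += p.burst;                          // run to completion
        int turnaround = clock - p.arrival;
        int waiting    = turnaround - p.burst;
        totalTurnaround += turnaround;
        totalWaiting    += waiting;
    }
    std::cout << "avg turnaround = " << totalTurnaround / procs.size()
              << ", avg waiting = " << totalWaiting / procs.size() << std::endl;
    return 0;
}
```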
SJF (Shortest Job First)
- When the CPU is available, it is assigned to the process that has the next smallest remaining CPU burst
- if the next CPU bursts are the same, FCFS scheduling is used to break tie
- Can be preemptive or non preemptive
Waiting time = (time the process was last dispatched) - arrival time - (number of milliseconds the process executed before that final dispatch)
Preemptive SJF
The difficulty with SJF is knowing the length of the next CPU request
We try to approximate the next CPU burst.
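One standard way to approximate it (not spelled out above, but the usual textbook technique) is an exponential average of the measured burst lengths: τ(n+1) = α·t(n) + (1 − α)·τ(n), where t(n) is the length of the nth CPU burst, τ(n) is the previous prediction, and 0 ≤ α ≤ 1 controls how heavily recent history is weighted relative to past history.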
Priority Scheduling
- A priority is associated with each process, and the CPU is allocated to the process with the highest priority.
- Equal-priority processes are scheduled in FCFS order
- An SJF algorithm is simply a priority algorithm where the priority is the inverse of the predicted next CPU burst, the larger the burst, the lower the priority
- can be preemptive or non-preemptive
- a non preemptive just puts the higher priority job at the front of the queue
Starvation is a problem with priority scheduling. In this instance, a lower-priority process is indefinitely blocked. Some systems gradually increase the priority of processes that have been waiting a long time (aging)
Round Robin Scheduling
- Similar to FCFS but preemption is added to switch between processes
- A small unit of time, a time quantum (or slice) is defined
- the ready queue is treated as a circular queue
- The scheduler goes around the ready queue allocating the CPU to each process for a time period of up to 1 time quantum
- the ready queue is a FIFO queue of processes
- new processes are added to the tail of the queue
- the cpu scheduler picks the first process from the ready queue, sets a timer to interrupt after 1 time quantum, and dispatches the process
- when the timer interrupts, if the process has not completed, the scheduler preempts it, moves it to the tail of the queue, and dispatches the next process
Turnaround time and waiting time for Round Robin
One method for calculating
Turn around time = completion time - arrival time
waiting time = Turnaround time - burst time
Or we can use
Waiting time = final start time - arrival time - (number of preemptions x time quantum), where the number of preemptions is how many times the process was preempted (and therefore ran for a full quantum) before its final dispatch
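A minimal round-robin sketch for processes that all arrive at time 0 (the burst times and the quantum of 4 are hypothetical), showing how each process runs for at most one quantum and is otherwise moved to the tail of the ready queue:

```cpp
#include <algorithm>
#include <iostream>
#include <queue>
#include <vector>

int main() {
    std::vector<int> burst = {10, 5, 8};           // remaining burst per process
    std::vector<int> completion(burst.size(), 0);
    const int quantum = 4;

    std::queue<int> ready;                         // FIFO ready queue
    for (std::size_t i = 0; i < burst.size(); ++i) ready.push(i);

    int clock = 0;
    while (!ready.empty()) {
        int p = ready.front(); ready.pop();
        int slice = std::min(quantum, burst[p]);
        clock += slice;                            // run for up to one quantum
        burst[p] -= slice;
        if (burst[p] > 0)
            ready.push(p);                         // preempted: back of the queue
        else
            completion[p] = clock;                 // finished
    }
    for (std::size_t i = 0; i < completion.size(); ++i)
        std::cout << "P" << i << " completion = " << completion[i] << "\n";
    return 0;
}
```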
Multilevel Queue
A class of scheduling algorithms has been created for situations in which processes are easily classified into different groups
Example: Foreground processes and Background processes.
- these have different response time requirements and different scheduling needs
- Additionally, foreground processes may have priority over background processes
The Multilevel queue partitions the ready queue into several separate queues
Each queue may have a different scheduling algorithm such as Round Robin for foreground and First come First serve for background
Multilevel Feedback queue
In multilevel feedback queue, processes may move from one queue to another. You separate them according to the characteristics of their CPU bursts. If a process uses too much CPU time, it will be moved to a lower priority queue
This leaves the I/O bound and interactive processes in the higher-priority queues
- a process that waits too long in a lower priority queue may be moved to a higher priority queue
In general, A MFQ is defined by:
- number of queues
- the scheduling algorithm for each queue
- the method used to determine when to upgrade a process to a higher priority queue
- the method used to determine when to demote a process to a lower priority queue
- the method used to determine which queue a process will enter when that process needs service
Process Synchronization
A cooperating process is one that can affect or be affected by other processes executing in the system
Cooperating processes can either
- directly share a logical address space (that is, both code and data)
- or be allowed to share data only through files or messages
This leads to a problem where concurrent access to shared data may result in data inconsistency
Shared-memory systems are a good example of the shared-data problem
To understand this, we need to have an understanding of how this happens at the machine level (assembly)
When several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the access takes place, it is called a race condition
Since we don't want race conditions, we need process synchronization.
To deal with this we can use a few methods
Critical Section Problem
Consider a system with n processes (P1, P2, ..., Pn)
Each process has a segment of code called a critical section in which the process may be changing common variables, updating a table, writing a file, and so on
When one process is executing in its critical section, no other process is allowed to execute in its critical section.
That is, no two processes are executing in their critical sections at the same time
The critical-section problem is to design a protocol that the processes can use to cooperate
- Each process must request permission to enter its critical section
- the section of code implementing this request is the entry section
- the critical section may be followed by an exit section
- the remaining code is the remainder section
A solution to the critical-section problem must satisfy the following three requirements
Requirements
- Mutual Exclusion: If a process P is executing in its critical section, then no other processes can be executing in their critical sections.
- Progress: If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in the decision on which will enter its critical section next, and this selection cannot be postponed indefinitely.
- Bounded Waiting: There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted (otherwise it may lead to starvation)
Peterson's Solution
Peterson's solution is a 'humble' algorithm
- a classic software-based solution to the critical section problem
- may not work correctly on modern architectures
- provides good description of solving the critical-section problem
The solution is restricted to two processes that alternate execution between their critical sections and remainder sections; we will call them Pi and Pj
Peterson's solution also requires two data items to be shared between the two processes:
- int turn: indicates whose turn it is to enter its critical section (a shared variable)
- boolean flag[2]: used to indicate if a process is ready to enter its critical section
When a process is ready to enter its critical section, it sets its flag to true.
When one process wants to enter its critical section, it offers the turn to the other process first
Structure of process Pi in Peterson's solution
do {
flag[i] = true;
turn = j;
while (flag[j] && turn == j); // the trailing semicolon keeps this loop spinning while the condition is true; only when it becomes false does Pi enter its critical section
//critical section
flag[i] = false;
//remainder section
} while (TRUE);
Structure of process Pj in Peterson's solution
do {
flag[j] = true;
turn = i;
while (flag[i] && turn == i); // spin while Pi both wants to enter and holds the turn
//critical section
flag[j] = false;
//remainder section
} while (TRUE);
Test and Set Lock
- a hardware solution to the synchronization problem
- there is a shared lock variable which can take either of the two values, 0 or 1 (0 = unlocked, 1 = locked)
- before entering into a critical section, a process inquires about the lock
- If it is locked, it keeps waiting until it becomes free
- if it is not locked, it takes the lock and executes the critical section
Test and Set definition
boolean TestAndSet(boolean *target){
boolean rv= *target;
*target = TRUE;
return rv;
}
This is an atomic operation
Example
lock is initialized to FALSE (0), and we pass lock by reference
process p1
do {
while (TestAndSet(&lock)); // spin: if TestAndSet returns FALSE, the lock was free and we now hold it; the lock value was atomically set to TRUE so no other process can take it
//critical section
lock = FALSE; // release the lock
//remainder section
} while (TRUE);
Advantages
- Satisfies mutual-exclusion
Disadvantages
- It does not satisfy bounded waiting
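A modern C++ analog of this idea is a spinlock built on std::atomic_flag, whose test_and_set() is an atomic test-and-set; this is a sketch of the same technique, not the course's exact code.

```cpp
#include <atomic>

// Spinlock built on atomic test-and-set.
class SpinLock {
    std::atomic_flag locked = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Keep "testing and setting" until the previous value was false (unlocked)
        while (locked.test_and_set(std::memory_order_acquire))
            ;  // busy-wait
    }
    void unlock() {
        locked.clear(std::memory_order_release);   // lock = FALSE
    }
};
```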
Semaphores
- Semaphores, proposed by Edsger Dijkstra, are a technique to manage concurrent processes by using a simple integer value, known as a semaphore
- A semaphore is simply a non-negative variable shared between threads. This variable is used to solve the critical section problem and to achieve process synchronization in a multiprocessing environment
- A semaphore S is a shared integer variable that, apart from initialization, is accessed only through two standard atomic operations: wait() and signal()
- wait() is denoted by P, which means "to test"
- signal() is denoted by V, which means "to increment"
- All modifications to the integer value of the semaphore in the wait() and signal() operations must be executed indivisibly; that is, when one process modifies the semaphore value, no other process can simultaneously modify that same semaphore value (atomic instruction)
wait()
P(Semaphore S){
while (S <= 0);
//no operation...forced to wait until S > 0
S--;
}
It will only decrement S if it is greater than 0
Signal
V(Semaphore S){
S++;
}
increments the value of S
Binary Semaphore
The value of a binary semaphore can only range between 0 and 1. On some systems, binary semaphores are known as mutex locks, as they are locks that provide mutual exclusion
Counting Semaphores
The value of a counting semaphore can range over an unrestricted domain. It is used to control access to a resource that has multiple instances.
Disadvantages of Semaphores
The main disadvantage of a semaphore is that it requires "busy waiting".
- While a process is in its critical section, any other process that tries to enter its critical section must loop continuously in the entry code
- Busy waiting wastes CPU cycles that some other process might be able to use productively
- This type of semaphore is also called a spinlock, because the process spins (loops while trying to decrement) while waiting for the lock
To overcome the need for busy waiting, we can modify the definitions of wait() and signal().
When a process executes the wait() operation and finds that the semaphore value is not positive, it must wait
However, rather than engaging in busy waiting, the process can block itself
- the block operation places a process into a waiting queue associated with the semaphore, and the state of the process is switched to the waiting state
- the control is transferred to the CPU scheduler, which selects another process to execute
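A sketch of such a blocking semaphore in C++, using a mutex and a condition variable as the waiting queue so that wait() blocks instead of spinning; the class design and names are illustrative.

```cpp
#include <condition_variable>
#include <mutex>

// Blocking (non-busy-waiting) counting semaphore: wait() blocks the caller
// on a queue instead of spinning, and signal() wakes one waiter.
class Semaphore {
    int value;
    std::mutex m;
    std::condition_variable cv;   // the semaphore's waiting queue
public:
    explicit Semaphore(int initial) : value(initial) {}

    void wait() {                                    // P(S)
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return value > 0; });   // block, don't spin
        --value;
    }
    void signal() {                                  // V(S)
        std::lock_guard<std::mutex> lk(m);
        ++value;
        cv.notify_one();                             // wake one blocked process
    }
};
```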
This still may lead to deadlocks and starvation.
- The implementation of a semaphore with a waiting queue may result in a situation where two or more processes are waiting indefinitely for an event that can be caused only by one of the waiting processes
- When two processes are waiting for each other during the signal operation, the processes are said to be deadlocked
Bounded Buffer Problem
There is a buffer of n slots and each slot is capable of storing one unit of data.
There are two processes running, namely a producer and consumer which are operating on the buffer
- The producer tries to insert data into an empty slot in the buffer
- The consumer tries to remove data from a filled slot in the buffer
- The producer must not insert data when the buffer is full
- The consumer must not remove data when the buffer is empty
- the producer and consumer should not insert and remove data simultaneously
We can make use of three semaphores to solve this problem.
- m(mutex): a binary semaphore which is used to acquire and release the lock
- empty: a counting semaphore whose initial value is the number of slots in the buffer, since, initially all slots are empty
- full: a counting semaphore whose initial value is 0 (how many slots are occupied)
Producer
do {
wait(empty); //wait until empty > 0, then decrement empty
wait(mutex); //acquire lock
//add data to buffer
signal(mutex); //release lock
signal(full); //increment full
} while (TRUE);
Consumer
do {
wait(full); //wait until full > 0, then decrement full
wait(mutex); //acquire lock
//remove data from buffer
signal(mutex); //release lock
signal(empty); //increment empty
} while (TRUE);
Monitors
Monitors are a high-level abstraction that provides an effective mechanism for process synchronization
- A monitor type presents a set of programmer defined operations that provide mutual exclusion within the monitor
- the monitor type also contains the declaration of variables whose values define the state of an instance of that type, along with the bodies of procedures or functions that operate on those variables
Monitor Syntax
monitor monitor_name{
//shared variable declarations shared between processes
procedure P1(...){
}
procedure P2(...){
}
procedure PN(...){
}
initialization code(...){
}
}
- A procedure within a monitor can access only those variables declared locally within the monitor and its formal parameters
- the local variables of a monitor can be accessed only by the local procedures
- the monitor construct ensures that only one process at a time can be active within the monitor
Monitors also need a condition construct; condition variables are declared in the style condition x, y;
- The only operations that can be invoked on a condition variable are wait() and signal()
- The operation x.wait() means that the process invoking this operation is suspended until another process invokes x.signal()
- The x.signal() operation resumes exactly one suspended process
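As a rough C++ analog (illustrative only), a class whose methods all lock one mutex behaves like a monitor, and std::condition_variable plays the role of a condition x with x.wait() and x.signal():

```cpp
#include <condition_variable>
#include <mutex>

// Monitor-style class: the mutex gives the "one process active in the
// monitor at a time" property; x.wait() suspends the caller and
// notify_one() plays the role of x.signal(). Names are illustrative.
class BoundedCounter {
    std::mutex monitorLock;          // mutual exclusion for the monitor
    std::condition_variable x;       // condition x
    int count = 0;

public:
    void increment() {
        std::lock_guard<std::mutex> lk(monitorLock);
        ++count;
        x.notify_one();              // x.signal(): resume one waiter
    }
    void waitUntilPositive() {
        std::unique_lock<std::mutex> lk(monitorLock);
        x.wait(lk, [this] { return count > 0; });  // x.wait(): suspend
    }
};
```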
Schematic View of a Monitor
Monitor with condition variables
Dining-Philosophers Solution Using Monitors
The monitor solution imposes the restriction that a philosopher may pick up their chopsticks only if both are available
- To code the solution, we need to distinguish among three states in which we might find a philosopher: enum {THINKING, HUNGRY, EATING} state[5]; the array holds the state of the 5 philosophers, each of whom can be in any of the above states
- Philosopher i can set state[i] = EATING only if his two neighbors are not eating: (state[(i+4)%5] != EATING) && (state[(i+1)%5] != EATING)
- We also need to declare condition self[5], where philosopher i can delay himself when he is hungry but is unable to obtain the chopsticks he needs
monitor dp{
enum{THINKING,HUNGRY,EATING} state[5];
condition self[5];
void pickup(int i){
state[i] = HUNGRY;
test(i);
if(state[i]!=EATING)
self[i].wait();
}
void putdown(int i){
state[i] = THINKING;
test((i+4)%5);
test((i+1)%5);
}
void test(int i){
if((state[(i+4)%5] != EATING)&& (state[(i+1)%5] != EATING) && state[i] == HUNGRY ){
state[i]= EATING;
self[i].signal();
}
}
initialization(){
for(int i=0; i<5; i++){
state[i]= THINKING;
}
}
}
Write processes.cpp where:
A parent process writes a message with:
cout << "okay?" << endl;
but stdout is redirected to the write end of pipe fd[0]: dup2(fd[0][1], 1)
A child process reads the above message with:
cin >> message;
but stdin is redirected to the read end of pipe fd[0]: dup2(fd[0][0], 0)
A child process then writes a response with:
cout << "yes!" << endl;
but stdout is redirected to the write end of pipe fd[1]: dup2(fd[1][1], 1)
A parent process reads the above response with:
cin >> response;
but stdin is redirected to the read end of pipe fd[1]: dup2(fd[1][0], 0)
Then, the parent writes this response to stderr:
cerr << response;
#include <sys/types.h> // for fork, wait
#include <sys/wait.h> // for wait
#include <unistd.h> // for fork, pipe, dup, close
#include <stdio.h> // for NULL, perror
#include <stdlib.h> // for exit
#include <iostream> // for cout
using namespace std;
int main( int argc, char** argv ) {
int fd[2][2]; // fd[0][] parent-->child; fd[1][] child-->parent
int pid; // process ID
char response[10], message[10];
if ( pipe( fd[0] ) < 0 ) { // parent wants to create pipe fd[0]
perror( "pipe fd[0] error" );
exit( -1 );
}
if ( pipe( fd[1] ) < 0 ) { // parent also wants to create pipe fd[1]
perror( "pipe fd[1] error" );
exit( -1 );
}
if ( ( pid = fork( ) ) < 0 ) {
perror( "fork a child: error" );
exit( -1 );
}
if ( pid > 0 ) { // parent
close( fd[0][0] ); // close the parent-->child read end; the parent only writes on this pipe
dup2( fd[0][1], 1 ); // redirect stdout to the write end of pipe fd[0]
close( fd[1][1] ); // close the child-->parent write end; the parent only reads on this pipe
dup2( fd[1][0], 0 ); // redirect stdin to the read end of pipe fd[1]
cout << "okay?" << endl;
cin >> response;
cerr << response;
wait( NULL ); // reap the child; <sys/wait.h> was included for this
}
else { // child
close( fd[0][1] ); //close parent-->child write pipe
dup2( fd[0][0], 0 ); //duplicate parent-->child read pipe to stdin
close( fd[1][0] ); //close child--> parent read pipe
dup2( fd[1][1], 1 ); // redirect stdout to the write end of the child-->parent pipe
cin >> message;
cout << "yes!" << endl;
}
}
Deadlocks
In a multiprogramming environment, several processes may compete for a finite number of resources.
A process requests resources, and if the resources are not available at that time, the process enters a waiting state.
Sometimes, a waiting process is never again able to change state because the resources it has requested are held by other waiting processes. This situation is known as a Deadlock
System model
A system consists of a finite number of resources to be distributed among a number of competing processes.
The resources are partitioned into several types, each consisting of some number of identical instances.
Under a normal mode of operation, a process may utilize a resource only in the following sequence.
- Request: If the request cannot be granted immediately(for example, if the resource is being used by another process) then the requesting process must wait until it can acquire the resource.
- use: The process can operate on the resource(for example, if the resource is a printer, the process can print on the printer)
- Release: the process releases the resource.
Deadlock Characterization
In a deadlock, processes never finish executing and system resources are tied up preventing other jobs from starting
Necessary conditions for deadlocks:
A deadlock situation can arise if the following four conditions hold simultaneously in a system
- Mutual Exclusion: At least one resource must be held in a non-sharable mode, or, only one process at a time can use the resource. Any process that requests that resource must be delayed until the resource has been released.
- Hold and wait: A process must be holding at least one resource and waiting to acquire additional resources that are currently being held by other processes.
- No preemption: Resources cannot be preempted. A resource can be released only voluntarily by the process holding it, after that process has completed its task.
- Circular Wait: A set {P0, P1, ..., Pn} of waiting processes must exist such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn-1 is waiting for a resource held by Pn, and Pn is waiting for a resource held by P0.
Resource Allocation Graph
Deadlocks can be described more precisely in terms of a directed graph called a system resource-allocation graph.
This graph consists of a set of vertices V and a set of edges E; the vertices are partitioned into the set of all active threads and the set of all resource types in the system.
A directed edge from a thread to a resource type is a request edge (the thread has requested an instance of that resource and is waiting for it); a directed edge from a resource type to a thread is an assignment edge (an instance of that resource has been allocated to the thread).
Generally, we represent each thread as a circle and each resource type as a rectangle.
Since a resource type may have more than one instance, each instance is represented as a dot within the rectangle.
If the graph contains no cycles, then no process in the system is deadlocked. HOWEVER, if a graph contains a cycle, a deadlock may exist.
Methods for handling Deadlocks
In general, there are three ways we can handle deadlocks:
- Prevention: We can use a protocol to prevent or avoid deadlocks, ensuring that the system will never enter a deadlock state
- Recovery: We can allow the system to enter a deadlock state, detect it, and recover.
- Ignore it: We can ignore the problem altogether, and pretend deadlocks never occur in the system. (This is used in most operating systems)
To ensure deadlocks never occur, the system can either use a deadlock prevention, or a deadlock avoidance scheme
Deadlock Prevention
Provides a set of methods to ensure at least one of the criteria for deadlock cannot hold
Deadlock Avoidance
Requires that the operating system be given, in advance, additional information concerning which resources a process will request during its lifetime.
With this knowledge, it can decide for each request whether or not the process should wait.
Recovery
The system can provide an algorithm that examines the state of the system to determine whether or not a deadlock has occurred and an algorithm to recover from the deadlock.
Ignorance
If a system neither ensures that a deadlock will never occur nor provides a mechanism for deadlock detection and recovery then we may arrive at a situation where the system is in a deadlocked state yet has no way of recognizing what has happened.
In this case, the undetected deadlock will result in the deterioration of the system's performance, because resources are being held by processes that cannot run, and because more and more processes, as they make requests for resources, will enter a deadlocked state.
Eventually, the system will stop functioning and will need to be restarted manually.
Deadlock Prevention
For a deadlock to occur, all four of the following must be present:
- Mutual Exclusion
- Hold and Wait
- No Preemption
- Circular Wait
By ensuring any one of the above cannot happen, you can prevent Deadlocks.
Mutual Exclusion
The mutual-exclusion condition must hold for non-sharable resources
eg: A printer cannot be simultaneously shared by several processes.
Sharable resources, in contrast, do not require mutually exclusive access and thus cannot be involved in a deadlock.
For example, Read-only files. If several processes attempt to open a read-only file, there will be no problem.
In general we cannot prevent deadlocks by denying the mutual exclusion condition because some resources are inherently non-sharable
Hold and Wait
To ensure the hold-and-wait condition never occurs, we must guarantee that whenever a process requests a resource, it does not hold any other resources.
One protocol that can be used requires each process to request and be allocated all its resources before it begins execution.
An alternative protocol allows a process to request resources only when it has none. A process may request some resources and use them. Before it can request any additional resources, however, it must release all the resources that it's currently allocated.
Both of these protocols have two main disadvantages
- Resource Utilization may be low
- Starvation is possible
No Preemption
To ensure this condition does not hold, we can use the following protocol:
If a process is holding some resources and requests another resource that cannot be immediately allocated to it, then all resources currently being held are preempted
Alternatively, if a process requests some resources, we first check whether they are available. If they are, we allocate them. If they are not, we check whether they are allocated to some other process that is waiting for additional resources. If so, we preempt the desired resources from the waiting process and allocate them to the requesting process.
This protocol is often applied to resources whose state can be easily saved and restored later such as registers and memory space.
Circular Wait
A way to ensure that this condition never holds is to impose a total ordering of all resource types and to require that each process requests resources in an increasing order of enumeration.
Example: Suppose some process, P1, is allocated resource R5. Now, if P1 requests for resources R3 and R4 (which are less than R5), the requests will NOT be granted. Only requests for resources that are greater than R5 will be granted.
Developing a resource hierarchy does not itself prevent deadlocks. It is up to developers to write programs that follow that ordering.
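A small C++ sketch of this ordering discipline: if every thread acquires the two mutexes in the same global order, no circular wait can form. The two resources and the thread bodies are hypothetical.

```cpp
#include <mutex>

// Circular-wait prevention by resource ordering: every thread acquires
// mutexes in the same global order (A before B), so no cycle of waiting
// threads can form.
std::mutex resourceA;   // lower-numbered resource
std::mutex resourceB;   // higher-numbered resource

void threadOne() {
    std::lock_guard<std::mutex> a(resourceA);  // always A first
    std::lock_guard<std::mutex> b(resourceB);
    // ... use both resources ...
}

void threadTwo() {
    std::lock_guard<std::mutex> a(resourceA);  // same order, even though
    std::lock_guard<std::mutex> b(resourceB);  // this thread "wants" B first
    // ... use both resources ...
}
```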
Deadlock Avoidance
An alternative method for avoiding deadlocks is to require additional information about how resources are to be requested.
For example: In a system with one tape drive and one printer, the system might need to know that process P will request the tape drive and then the printer before releasing both resources, whereas process Q will request first the printer and then the tape drive.
With this knowledge of the complete sequence of requests and releases for each process, the system can decide for each request whether or not the process should wait in order to avoid a possible future deadlock
Each request requires that in making this decision the system consider the resources currently available, the resources currently allocated to each process, and the future requests and releases of each process.
Safe State: A state is safe if the system can allocate resources to each process(up to its maximum) in some order and still avoid a deadlock.
A system is in a safe state only if there exists a safe sequence. A sequence of processes <P1, P2, ..., Pn> is a safe sequence for the current allocation state if, for each Pi, the resource requests that Pi can still make can be satisfied by the currently available resources plus the resources held by all Pj with j < i.
A safe state is not a deadlocked state, however not all unsafe states are deadlocked.
For example, let's say you have 12 resources and three threads:

| | Maximum Needs | Current Needs |
|---|---|---|
| T0 | 10 | 5 |
| T1 | 4 | 2 |
| T2 | 9 | 2 |

At time t0 the system is in a safe state: 3 resources are free, and the sequence <T1, T0, T2> satisfies the safety condition (T1 can finish with the free resources, then T0, then T2).
A system can go from safe to unsafe pretty easily. For instance, if at time t1 thread T2 requests and is allocated one more resource, only 2 resources remain free; only T1 can then be satisfied, and once it finishes and releases its resources only 4 are available, which is not enough for T0 (which may need 5 more) or T2 (which may need 7 more), so the threads may deadlock.
Additionally, if thread T2 had instead been made to wait until one of the other threads had finished and released its resources, the deadlock could have been avoided.
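A sketch of the safety check for a single resource type, applied to the 12-resource example above; the function name and structure are illustrative, not a full Banker's algorithm.

```cpp
#include <iostream>
#include <vector>

// Safety check for one resource type: repeatedly pick a thread whose
// remaining need fits in what is free, pretend it finishes, and release
// its allocation. The state is safe iff every thread can finish this way.
bool isSafe(int total, std::vector<int> maxNeed, std::vector<int> alloc) {
    int n = static_cast<int>(maxNeed.size());
    int available = total;
    for (int a : alloc) available -= a;

    std::vector<bool> finished(n, false);
    for (int done = 0; done < n; ) {
        bool progress = false;
        for (int i = 0; i < n; ++i) {
            if (!finished[i] && maxNeed[i] - alloc[i] <= available) {
                available += alloc[i];   // thread i can finish and release
                finished[i] = true;
                ++done;
                progress = true;
            }
        }
        if (!progress) return false;     // no thread can proceed: unsafe
    }
    return true;
}

int main() {
    // T0, T1, T2 from the table above
    std::cout << std::boolalpha
              << isSafe(12, {10, 4, 9}, {5, 2, 2}) << "\n"   // safe
              << isSafe(12, {10, 4, 9}, {5, 2, 3}) << "\n";  // unsafe: T2 got one more
    return 0;
}
```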
Resource Allocation Graph Algorithm
If we have a resource-allocation system with only one instance of each resource type, we can use a variant of the resource-allocation graph for deadlock avoidance. In addition to the request and assignment edges already described, we introduce a new type of edge, called a claim edge. A claim edge from a thread to a resource indicates that the thread may request that resource at some time in the future; claim edges are drawn as dashed lines.
The resources must be claimed a priori in the system; that is, before a thread starts executing, all of its claim edges must already appear in the graph.
Suppose a thread requests a resource. The request can be granted only if converting the request edge to an assignment edge does not result in the formation of a cycle in the resource-allocation graph (checked with a cycle-detection algorithm).
If no cycle exists, the allocation of resources will leave the system in a safe state. If a cycle is found, the allocation would put the system in an unsafe state; in that case, the requesting thread will have to wait.
Example:
Suppose a thread requests a resource that is currently free; if assigning it would complete a cycle with another thread's claim edge, the request is still denied, because granting it would put the system in an unsafe state.
An unsafe state in the resource-allocation graph
This scheme applies only when there is a single instance of each resource type
Memory Management
The main purpose of a computer system is to execute programs
During executions, programs, together with the data they access, must be in main memory (at least partially).
Main Memory
In CPU scheduling, we see how the CPU can be shared by a set of processes. As a result of this, we see that CPU utilization, as well as speed of the computers response to its users can usually be improved
To realize this improvement, however, we must keep several processes in memory; memory must be shared.
Basic Hardware
Memory consists of a large array of words or bytes, each with its own address
A CPU fetches instructions from memory according to the value of the program counter
These instructions may cause additional loading from and storing to specific memory addresses
A typical instruction-execution cycle first fetches an instruction from memory, the instruction is then decoded and may cause operands to be fetched from memory. After the instruction has been executed on the operands, results may be stored back in memory
A CPU can only access from:
- registers
- main memory
There are machine instructions that take memory addresses as arguments but none that take a disk address
Therefore, any instructions in execution and any data used by the instructions must be in either registers or main memory. If the data are not in memory, they must be moved there before the CPU can operate on them.
Cycle time for accessing memories
- Registers: Accessible within one cycle of the CPU clock. Most CPUs can decode instructions and perform simple operations on register contents at the rate of one or more operations per clock tick
- Main Memory: Access may take many cycles of the CPU clock to complete. Normally, the CPU must stall, since it does not yet have the data required to complete the instruction it is executing
Generally, another fast memory is placed between main memory and the registers. This is called the cache
Protection of OS from unauthorized access
the operating system must be protected from access by user processes. Additionally, user processes must be protected from one another.
This protection must be provided by the hardware.
To do this, we ensure that each process has a separate memory space. To do this, we need the ability to determine the range of legal addresses that the process may access and to ensure that the process can easily access only these legal addresses
This is done with two registers. The base register and the limit register.
- Base Register: Points to the bottom of the address space (Smallest Legal Physical Memory Address)
- Limit Register: Holds the size of the address space(Specifies the size of that range)
The base and limit registers can only be loaded by the operating system which uses a special privileged instruction
Since privileged instructions can only be executed in kernel mode, and since only the operating system operates in kernel mode, only the operating system can load the base and limit registers.
This scheme allows the operating system to change the value of the registers but prevents user programs from changing the registers contents.
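Conceptually, the hardware performs a check like the following on every user-mode address; this is only a sketch, with 32-bit addresses assumed.

```cpp
#include <cstdint>

// Base/limit check: every address generated in user mode must satisfy
//     base <= address < base + limit
// or the hardware traps to the operating system (addressing error).
bool addressIsLegal(uint32_t address, uint32_t base, uint32_t limit) {
    return address >= base && address < base + limit;
}
```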
Address binding
Usually a program resides on a disk as a binary executable file
To be executed, the program must be brought into memory and placed within a process
Depending on the memory management in use, the process may be moved between the disk and memory during its execution
The processes on the disk that are waiting to be brought into memory for execution form the input queue
Flow goes like this:
input queue -> select a process -> loads it into memory. -> executes and accesses instructions and data from memory -> process terminates ->memory space is declared available
Most systems allow a user process to reside in any part of the physical memory
Though the address space of a computer starts at 00000, the first address of the user space does not need to start there
In most cases, a user program will go through several steps (compile time, load time, and execution time) before being executed, and addresses may be bound at any of them.
- Source program: addresses are generally symbolic
- Compiler: typically binds these symbolic addresses to relocatable addresses (such as "14 bytes from the beginning of this module")
- Linkage editor or loader: binds the relocatable addresses to absolute addresses
Logical Vs Physical Address Space
- Logical Address: An address generated by the CPU
- Physical Address: An address seen by the memory unit, the one loaded into the memory address register of the memory.
We often refer to logical addresses as Virtual Addresses
- Logical Address Space: The set of all logical addresses generated by a program.
- Physical Address Space: The set of all physical addresses corresponding to these logical addresses
How do we map these?
The runtime mapping from virtual to physical addresses is done by a hardware device called the Memory-Management Unit(MMU)
Dynamic Relocation using a Relocation Register
Dynamic Loading
The entire program and all data must be in physical memory for a process to execute. This means that the size of a process is limited by the size of physical memory
To obtain better memory-space utilization we can use dynamic loading
With dynamic loading a routine is not loaded until it's called.
All routines are kept on disk in a relocatable load format
The main program is loaded into main memory and executed
When a routine needs to call another routine, the calling routine first checks to see whether the other routine has been loaded
If it has not, the relocatable linking loader is called to load the desired routine into memory and update the program's address tables to reflect this change.
Then control is passed to the newly loaded routine.
Dynamic loading does not require special support from the operating system. It is the responsibility of the users to design their programs to take advantage of such a method
Dynamic Linking and Shared Libraries
When a program runs, it needs to use certain system libraries
- Static Linking: system language libraries are treated like any other object module and are combined by the loader into the binary program image. This is very wasteful of disk space and main memory
- Dynamic Linking: The system checks to see if the needed routine is already in memory. If not, the routine is loaded into memory. It is similar to dynamic loading, but linking is postponed until execution time.
Windows uses DLL files, or dynamic linked libraries.
With dynamic linking, a stub is included in the image for each library routine reference.
- stub: a small piece of code that indicates how to locate the appropriate memory-resident library routine or how to load the library if the routine is not already present.
When a stub is executed, it checks to see whether the needed routine is already in memory; if not, the program loads the routine into memory.
Either way, the stub replaces itself with the address of the routine and executes the routine
This means, the next time that particular code segment is reached, the library routine is executed directly, incurring no cost for dynamic linking.
Swapping
A process must be in memory to be executed
However, a process can be swapped temporarily out of memory to a backing store and then brought back into memory for continued execution.
A backing store is a computer storage device, usually a disk, that provides additional storage space for information so that it can be accessed and referred to when required and may be copied into the processor if needed.
Example:
In a multiprogramming environment with a round-robin CPU scheduling algorithm:
When a quantum expires, the memory manager will start to swap out the process that just finished and to swap another process into the memory space that has been freed.
The processes removed from scheduling may be placed in the backing store
Memory Space when Swapping Happens
Normally, a process that is swapped out will be swapped back into the same memory space it previously occupied
However, this heavily relies on the address binding method
If the binding is done at assembly or load time, the process can not be easily moved to a different location
If binding is done at execution time, the process can be swapped into a different memory space because physical addresses are computed during execution time.
Backing Store
The backing store is commonly a fast disk drive
- It must be large enough to accommodate copies of all memory images for all users
- It must provide direct access to these memory images
- The system maintains a ready queue consisting of all processes whose memory images are on the backing store or in memory and are ready to run
- Whenever the CPU scheduler decides to execute a process, it calls the dispatcher
- The dispatcher then checks to see whether the next process in the queue is in memory
- If not, the dispatcher swaps out a process currently in memory and swaps in the desired process.
- It then reloads registers and transfers control to the selected process.
Memory Allocation
We divide memory into several fixed sized partitions, each partition contains one process.
When a partition is free, a process is selected from the input queue and is loaded into the free partition.
When the process terminates, the partition becomes available for another process
Dynamic Storage Allocation Problem
The Dynamic storage allocation problem is as follows:
How do you satisfy a request of size n from a list of free holes?
Solutions
First Fit
- We allocate the first hole that is large enough to accommodate the process
- Searching can start either at the beginning of the set of holes or where the previous first fit search ended.
- We can stop searching as soon as we find a free hole that is large enough to accommodate our process
Best Fit
- We allocate the smallest partition that is large enough to fit our process
- We must search the entire list, unless that list is ordered by size
- This leaves us with the smallest leftover hole
Worst Fit
This is the opposite of best fit.
- We allocate the largest partition that can hold our process
- We must still search the entire list unless sorted by size
- This leaves us with the largest leftover hole
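A brief sketch contrasting first fit and best fit over a list of free hole sizes; representing the holes as a vector of sizes is a simplification.

```cpp
#include <cstddef>
#include <vector>

// Each function returns the index of the chosen hole, or -1 if none fits.
int firstFit(const std::vector<std::size_t>& holes, std::size_t request) {
    for (std::size_t i = 0; i < holes.size(); ++i)
        if (holes[i] >= request)
            return static_cast<int>(i);        // stop at the first hole that fits
    return -1;
}

int bestFit(const std::vector<std::size_t>& holes, std::size_t request) {
    int best = -1;
    for (std::size_t i = 0; i < holes.size(); ++i)
        if (holes[i] >= request &&
            (best == -1 || holes[i] < holes[static_cast<std::size_t>(best)]))
            best = static_cast<int>(i);        // smallest hole that still fits
    return best;                               // must scan the whole list
}
```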
Fragmentation
As processes are loaded into and removed from memory, the free memory space is broken into little pieces, which results in fragmentation.
These are the free memory partitions that are left in between spaces
There are two types of fragmentations:
- External Fragmentation
- Internal Fragmentation
External Fragmentation
It exists when there is enough total memory space to satisfy a request, but the available spaces are not contiguous.
Storage is fragmented into a large number of small holes
This fragmentation problem can be severe. In the worst case, we could have a block of free(or wasted) memory between every two processes
If all these small pieces of memory were in one big free block instead, we might be able to run several more processes.
We cannot combine these pieces, as they are not contiguous
Internal Fragmentation
It occurs when a memory block assigned to a process is larger than the process actually needs.
Some portion of the block is left unused, and it cannot be used by any other process.
Internal fragmentation can be reduced by using the "best fit" approach to memory allocation.
Solution to external fragmentation
Compaction is the solution to external memory fragmentation
Shuffle the memory contents so as to place all free memory together in one large block.
Paging
Paging is a memory management scheme that permits the physical address space of a process to be non-contiguous
Paging avoids the considerable problem of fitting memory chunks of various sizes into the backing store.
Most memory-management schemes suffer from this problem.
Basic Method of Paging
- Break physical memory into fixed sized blocks called frames
- Break logical memory into blocks of the same size called pages
- When a process is to be executed, its pages are loaded into any available memory frames from the backing store.
- The backing store is divided into fixed-size blocks that are the same size as memory frames.
Paging allows users to store processes in non-contiguous segments
Page Tables
Every address generated by the CPU is divided into two parts:
- a page number
- a page offset
The page table contains the base address of each page in physical memory
This base address is combined with the page offset to define the physical memory address that is sent to the memory unit
Translation of Logical Address To Physical Address using a Page Table
Every address generated by the CPU is divided into two parts:
- Page Number(p): Used as an index into a page table
- Page Offset(o): the displacement within the page and frame
The translation from a logical address to a physical address is: use the page number p to index the page table and obtain the frame number f; the physical address is then f x (page size) + d.
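A sketch of this translation in code, assuming 4 KB pages and a simple single-level page table stored as a vector of frame numbers (all values hypothetical):

```cpp
#include <cstdint>
#include <vector>

const uint32_t PAGE_SIZE = 4096;               // 4 KB pages => 12 offset bits

uint32_t translate(uint32_t logical, const std::vector<uint32_t>& pageTable) {
    uint32_t pageNumber = logical / PAGE_SIZE;  // p: index into the page table
    uint32_t offset     = logical % PAGE_SIZE;  // d: displacement within the page
    uint32_t frameNumber = pageTable[pageNumber];
    return frameNumber * PAGE_SIZE + offset;    // frame base address + offset
}
```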
Hardware Implementation of a Page Table
- Case 1: Implement page table as a set of dedicated registers (only possible when page table is small)
- Case 2: Keep Page Table in main memory and a page-table base register (PTBR) points to the page table. (This increases the time required to access a user memory location as there are two memory accesses per byte. One to access the page table, and one to access the actual data)
- Case 3: Use a TLB (Translation Lookaside Buffer)
TLB
The TLB is an associative, high-speed memory
It consists of two parts:
- A Key
- A Value
When the associative memory is presented with an item, the item is compared with all keys simultaneously
If the item is found, the corresponding value field is returned
This is a very fast search
TLB with Page Table
- The TLB contains a small portion of the page-table entries
- When a logical address is generated by the CPU, its page number is presented to the TLB
- If the page number is found, its frame number is immediately available and used to access that location in memory (This is a TLB Hit)
- If the page number is Not in the TLB, a memory reference to the page table must be made (TLB MISS)
- When the frame number is obtained, we can use it to access memory
- We also add the page number and frame to the TLB so they will be found more quickly on the next memory search
If an entry in the TLB is not found, once it goes out to main memory, it may replace a prior entry in the TLB (This is sometimes done with LRU or Least Recently Used)
Page Table Entries
- The present/absent (valid/invalid) bit indicates whether the page is actually present in main memory; if the page is not present, the bit is set to 0 and a reference to the page causes a **page fault**
- The protection bit is also known as the read/write bit and sets the permission for reading or writing: the bit is set to 0 if only reads are allowed, and to 1 if both reads and writes are allowed
- The referenced bit records whether the page has been referenced within the last clock cycle (set to 1 if so)
- The caching bit is used for enabling or disabling caching of the page; it is set to 1 when caching is disabled
- The dirty bit, also known as the modified bit, records whether the page has been modified; if the page has been modified, it is set to 1. This helps avoid unnecessary writes to secondary memory when a page is being replaced by another page

Shared Pages
An advantage of paging is the possibility of sharing common code.
Consider a system that supports 40 users, each of whom executes a text editor.
If the text editor consists of 150 KB of code and 50 KB of data space, then with non-shared memory we would need 200 KB x 40 = 8000 KB of space.
However, if the editor code is shared, we would only need 150 KB + 50 KB x 40 = 2150 KB.
With shared pages, each process has its own copy of registers and data storage to hold the data for the process's execution
The data for two different processes will obviously be different.
Hierarchical Paging
Most modern computer systems support a very large logical address space (2^32 to 2^64 addresses)
In such an environment, the page table itself becomes excessively large
For example, imagine a system with:
a 32-bit logical address space
If the page size is 4 KB (2^12 bytes),
then the page table may consist of up to 2^32 / 2^12 = 2^20 entries (about 1 million).
If each entry consists of 4 bytes, each process may need up to 4 MB (4 bytes x 1 million entries) of physical address space for the page table alone
If we try to allocate a page table of this size contiguously in main memory, we will run into problems
The solution
Divide the page table into smaller sizes.
This is known as multi-level paging
Use a two-level paging algorithm, in which the page table itself is also paged
A two level page table scheme
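For a 32-bit address with 4 KB pages, a common two-level layout splits the address into a 10-bit outer index, a 10-bit inner index, and a 12-bit offset; the sketch below shows only the bit manipulation (the 10/10/12 split is one typical choice, not the only one).

```cpp
#include <cstdint>

// Split a 32-bit logical address for two-level paging with 4 KB pages.
void splitAddress(uint32_t logical) {
    uint32_t offset = logical & 0xFFF;          // low 12 bits: offset in the page
    uint32_t p2     = (logical >> 12) & 0x3FF;  // next 10 bits: inner-table index
    uint32_t p1     = (logical >> 22) & 0x3FF;  // top 10 bits: outer-table index
    (void)offset; (void)p2; (void)p1;           // these would be used to walk the tables
}
```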
Hashed Page Tables
Hashed page tables are used for handling address spaces larger than 32 bits (or large address spaces)
- The hash value is the virtual page number (the page number in the logical address)
- Each entry in the hash table contains a linked list of elements that hash to the same location
Each node consists of three fields:
- the virtual page number
- the value of the mapped page frame
- a pointer to the next element in the linked list
The Algorithm
- The virtual page number in the virtual address is hashed into the hash table
- The virtual page number is compared with field 1 in the first element in the linked list.
- If there is a match, the corresponding page frame(field 2) is used to form the desired physical address.
- If there is no match, the linked list is walked, searching for a matching virtual page number
Clustered Page Tables
This is a variation of the hashed page table that is most applicable to 64-bit address spaces
- It's similar to the hashed page tables except that each entry in the hash table refers to several pages rather than a single page
- Therefore, a single page table entry can store the mappings for multiple physical-page frames
- Clustered page tables are particularly useful for sparse address spaces, where memory references are non-contiguous and scattered throughout the address space
Inverted Page Tables
In most operating systems, a separate page table is maintained for each process. For 'n' number of processes, there will be 'n' number of page tables.
For large processes, there would be many pages and for maintaining information about these pages, there would be too many entries in their page tables which itself would occupy a lot of memory.
This means memory utilization is not efficient, as a lot of memory is wasted just maintaining the page tables themselves
Inverted page tables solve this problem
Implementation
An inverted page table has one entry for each real page(or frame) of memory
Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page.
This means, only one page table exists in the system, and it only has one entry for each page of physical memory.
With this, each virtual address in the system consists of a triple:
- process-id
- page-number
- offset
Each inverted page table entry is a pair
- process-id
- page number
When a memory reference occurs, part of the virtual address, consisting of (process-id, page number) is presented to the memory subsystem
The inverted page table is then searched for a match
If a match is found, say at entry 'i', then the physical address (i, offset) is generated
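A minimal sketch of that search, assuming one entry per physical frame holding a (process-id, page-number) pair. The names and table size are illustrative; it also makes the disadvantage below visible, since the whole table may be scanned.

```c
#define NUM_FRAMES 4096

struct ipt_entry {
    int  pid;           /* process-id that owns the frame  */
    long page_number;   /* virtual page stored in the frame */
};

static struct ipt_entry ipt[NUM_FRAMES];

/* Returns the matching frame number i, or -1 (a page fault) if no entry matches.
 * The physical address is then formed from (i, offset). */
long ipt_lookup(int pid, long page_number) {
    for (long i = 0; i < NUM_FRAMES; i++) {
        if (ipt[i].pid == pid && ipt[i].page_number == page_number)
            return i;
    }
    return -1;
}
```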
Advantages and Disadvantage
Advantage
- Reduces Memory Usage by having only one page table
Disadvantage
- Increased search time as inverted page table is sorted by physical address, but lookups occur on virtual addresses. The whole table may need to be searched to find a match
Segmentation
Segmentation is another non-contiguous memory allocation technique similar to paging
Unlike paging however, in segmentation, the processes are not divided into fixed pages
Instead, the processes are divided into several modules called segments, which better matches the user's view of memory.
Here, both secondary memory and main memory are divided into partitions of unequal sizes.
- A logical address space is a collection of segments
- Each segment has a name and a length
The addresses specify both the segment name and the offset within the segment
The user specifies the segment name and an offset to access that item in memory
In paging, the user only needs to specify a single address, which is partitioned by the hardware into a page number and an offset, all invisible to the programmer
For simplicity, segments are numbered and referred to by a segment number rather than a segment name.
Therefore, a logical address consists of a two-tuple:
- Segment-number
- Offset
To use this, we need a segment table
A segment table maps the two-dimensional user-defined addresses into one-dimensional physical addresses.
Each entry in a segment table has a segment base and a segment limit
The segment base contains the starting physical address where the segment resides in memory and the limit specifies the size of the segment
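A small sketch of that base/limit translation for a (segment-number, offset) logical address. The table contents are made-up example values, and the trap is modeled as a simple error exit.

```c
#include <stdio.h>
#include <stdlib.h>

struct segment { unsigned long base, limit; };

/* Illustrative segment table: base = starting physical address, limit = length */
static struct segment seg_table[] = {
    {1400, 1000},   /* segment 0 */
    {6300,  400},   /* segment 1 */
    {4300, 1100},   /* segment 2 */
};

unsigned long translate(unsigned s, unsigned long offset) {
    if (offset >= seg_table[s].limit) {       /* offset must be within the limit */
        fprintf(stderr, "trap: offset out of bounds\n");
        exit(1);
    }
    return seg_table[s].base + offset;        /* physical address = base + offset */
}

int main(void) {
    printf("%lu\n", translate(2, 53));        /* 4300 + 53 = 4353 */
    return 0;
}
```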
Virtual Memory
The purpose of memory management is to keep many processes in memory simultaneously to allow multiprogramming. This can be difficult though as that tends to require that an entire process be in memory before it can begin executing. Therefore, the processes that can be loaded into main memory are often limited by the size of the main memory.
For example, if the size of main memory is 2gb, and a process requires 3gb...it becomes difficult to accommodate such processes in main memory. We can use methods such as dynamic loading but that generally requires extra precautions by the programmer.
What is Virtual Memory?
Virtual memory is a storage scheme where secondary memory can be treated as though it is a part of main memory and large processes can be stored in it. Only the part of the process that is actually needed for execution will be loaded to the actual main memory.
In most real programs, there is often code to handle unusual error conditions. Since these errors seldom, if ever, occur in practice, this code is almost never executed
Another examples, is arrays, lists, and tables. These are often allocated more memory than they actually need. An array may be declared 100 by 100 elements, even though it is seldom larger than 10 by 10 elements
Finally, an assembler symbol table may have room for 3000 symbols, although the average program has fewer than 200 symbols.
Benefits of being able to execute a program that is only partially in memory
- A program is no longer constrained by the amount of physical memory that is available. Users are able to write programs for an extremely large virtual address space, simplifying the programming task
- Because each user program can take less physical memory, more programs can run at the same time. This gives us an increase in CPU utilization and throughput (it does not increase turnaround time or response time)
- Less I/O is needed to load or swap each program into memory, so each program runs faster.
Using virtual memory benefits both the system AND the user
Virtual memory involves the separation of logical memory as perceived by users from physical memory, this separation allows an extremely large virtual memory to be provided for programmers when only a smaller physical memory is available.
Demand Paging
How can an executable be loaded from disk to memory?
Possible Approaches
- Load the entire program into physical memory at program execution time.
- The problem with this solution is we may not initially need the entire program in memory. Think of a program that prompts the user for available options. Loading the entire program results in loading code for all options instead of what the user needs which wastes memory space.
- Load pages only as they are needed. Pages are loaded only when they are demanded during program execution; pages that are never accessed are never loaded into memory. This is known as demand paging
A demand paging system is similar to a paging system with swapping where processes reside in secondary memory(usually a disk)
When we want to execute a process, we swap it into memory.
Rather than swapping the entire process into memory, however, we use a lazy swapper
A lazy swapper never swaps a page into memory unless that page will be needed.
Since we are now viewing a process as a sequence of pages, rather than as one large contiguous space, the use of the term swapper is technically incorrect.
In connection with demand paging, the more appropriate term is pager
A swapper manipulates entire processes, whereas a pager is concerned with the individual pages of a process.
Hardware implementation of Demand Paging
In order to implement demand paging with hardware, we need to be able to distinguish between pages that are in memory and the pages on the disk
To do this, we can make use of the Present/Absent (or Valid/Invalid Bit).
- If the bit is set to valid the page is both legal AND in main memory. By legal, it is a page that actually exists.
- If the bit is set to invalid, the page is either not in the logical address space of the process, or it is valid but is in the backing store
Page Table Entries
Page table entries for pages that are brought into main memory will be set as usual
Page table entries for pages that are not currently in memory will either be marked as invalid or will contain the address of the page on disk in secondary memory
Page Faults
Page Fault: when a process tries to access a page that is not in main memory(or marked as invalid)
Access to a page marked as invalid causes a page-fault trap. The paging hardware will notice that the invalid bit is set, causing a trap to the operating system.
A page fault may occur at any memory reference.
- If the page fault occurs on the instruction fetch, we can restart by fetching the instruction again.
- If the page fault occurs while we are fetching an operand, we must fetch and decode the instruction again and then fetch the operand
Procedure for handling page faults
- We check an internal table, (usually kept within the Process Control Block) for this process to determine whether the reference was a valid or invalid memory access
- If the reference was invalid, we terminate the process
- If it was a valid memory access(legal, not in valid/invalid bit sense), but we have not yet brought in that page, we now page it in.
- We find a free frame(by taking one from the free-frame list)
- We schedule a disk operation to read the desired page into the newly allocated frame
- When the disk read is complete, we modify the internal table kept within the process and the page table to indicate that the page is now in memory(set valid/invalid bit to 1)
- We restart the instruction that was interrupted by the trap
The process can now access the page as though it had always been in memory
Demand Paging Performance
A computer systems performance can be significantly affected by Demand Paging
Memory Access time (ma) for most systems usually ranges from 10 to 200 nanoseconds
If there are no page faults, the Effective Access Time will be equal to the Memory Access Time
However, if a page fault occurs, we need to calculate the Effective Access Time
Let p be the probability of a page fault (0 ≤ p ≤ 1) and ma be the memory access time.
Then: effective access time = (1 − p) × ma + p × page-fault service time
There are three major components of the page-fault service time:
- Service the page-fault interrupt
- Read in the page
- restart the process
For example: if you have a memory access time of 200 nanoseconds, a page-fault service time of 8 milliseconds (8,000,000 nanoseconds), and a page-fault probability of 25%, then:
effective access time = 0.75 × 200 + 0.25 × 8,000,000 = 2,000,150 nanoseconds (about 2 ms)
In general, the formula is: effective access time = (1 − p) × ma + p × page-fault service time
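A throwaway sketch that checks the arithmetic above (times in nanoseconds; the function name is just illustrative).

```c
#include <stdio.h>

/* effective access time = (1 - p) * ma + p * fault_time */
double effective_access_time(double p, double ma, double fault_time) {
    return (1.0 - p) * ma + p * fault_time;
}

int main(void) {
    /* 200 ns memory access, 8 ms (8,000,000 ns) fault service, p = 0.25 */
    printf("EAT = %.0f ns\n", effective_access_time(0.25, 200, 8000000));
    /* prints: EAT = 2000150 ns */
    return 0;
}
```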
Copy On Write
Copy on write (CoW) is a technique that is used for sharing virtual memory or pages. It's mostly used in conjunction with the fork()
system call that is used for creating child processes
fork()
will create a copy of the parent's address space for the child, duplicating the pages belonging to the parent.
Instead of duplicating the pages, we can make the parent and child processes share the common pages, and only create a copy when one of the processes (either child or parent) wants to write(modify) a page.
Before Process 1 modifies page C
After Process 1 modifies Page C
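A small sketch of where copy-on-write comes into play. After fork(), parent and child logically have separate copies of the buffer, but on a CoW system the kernel keeps sharing the underlying pages until one side writes. The buffer contents are arbitrary example data.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    char *buf = malloc(4096);
    strcpy(buf, "original");

    pid_t pid = fork();
    if (pid == 0) {                    /* child */
        strcpy(buf, "modified");       /* first write: kernel copies the page */
        printf("child sees:  %s\n", buf);
        exit(0);
    }
    waitpid(pid, NULL, 0);             /* parent */
    printf("parent sees: %s\n", buf);  /* still "original" */
    free(buf);
    return 0;
}
```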
Problems with Demand Paging
With the help of demand paging we are increasing our degree of multi-programming by loading only the required pages into memory, hence facilitating the possibility of having more processes loaded into memory.
However, this could lead to an issue
Overallocation of memory
- suppose we have 40 frames in memory
- We have 6 processes each of which has 10 pages but uses only 5 at the moment.(5 pages of each will not be loaded)
- We can load these (6 x 5)=30 pages into memory
- So all the process are executing simultaneously
- We still have 10 free frames
Now, say all 6 processes want to use all of their pages (the full 10 each)
Now we need to load the additional 30 pages, however we only have 10 free frames
So what can the OS do?
- Terminate the process: Not ideal as it destroys the purpose of Demand Paging
- The operating system can instead swap out an entire process, freeing all its frames and reducing the level of multiprogramming....good in certain circumstances
- We can make use of a page replacement technique(most common solution)
Page Replacement
Page Replacement is the technique of swapping out pages from physical memory when there are no more free frames available, in order to make room for other pages which have to be loaded to the main memory.
If a process wants to use/access a page which is not present in physical memory, it will cause a page fault. Now that page has to be loaded into memory
However, if there are no free frames available, how do we handle it?
- We can look for an occupied frame that is not being used currently
- We then free that frame by writing its contents to the swap space (secondary memory)
- We update the page tables to indicate the page we removed is now no longer in memory
- We then load the page of the process that caused the page fault to occur in the freed frame
Steps of page replacement
- Find the location of the page to be loaded on the disk (in the backing store)
- If there is a free frame, use that frame by loading the page into the frame
- If there are no free frames, use a page-replacement algorithm to select a victim frame.
- Write the victim frame to the disk, and change the page and frame tables accordingly
- Read the desired page into the newly freed frame and change the page and frame tables.
- Restart the process
Victim Frame: The frame we are choosing to remove from physical memory so that we have a space for the page we are trying to swap in
- Page n is not currently in memory and has been accessed by the current instruction. A page fault occurs, and there are no free frames. Frame f is selected as the victim.
- Frame f is currently holding page m. The OS writes the page to swap space on secondary storage, and frees the frame by marking m as invalid.
- The OS locates page n on the backing store and schedules an I/O transfer to DMA the contents into frame f.
- The transfer finishes. The OS changes the page table for n to point to frame f and that the frame is valid. The instruction is restarted and can now access the data it needs.
Page Fault Service Time When No Frames Are Free
- if no frames are free, two page transfers (one out and one in) are required
- This doubles the page-fault service time and increases the effective access time accordingly
We can reduce this time by making use of the modify(dirty) bit
The modify bit is set when a page has been modified
When page replacement has to be done on a page which is present in memory:
- if a modify bit is set - that means the page has been modified and we need to write the changes to the disk.
- If the modify bit is not set - that means that the page has not been modified and it is the same as the copy which is already present in the disk. In that case, we don't have to overwrite this page on the disk, reducing the overhead and the effective access time accordingly
FIFO Page Replacement
FIFO is the simplest page replacement algorithm
When page replacement has to be done, the oldest page in memory is chosen to be swapped out.
How do we maintain a record of the time that pages were brought into memory in order to figure out which page is the oldest?
- We don't have to record the times in a strict manner; instead, we can use a FIFO queue that holds all the pages that are in memory.
- When replacement has to be done, we choose the page at the head of the queue to be swapped out of memory
- When a new page is loaded into memory, we add it at the tail of the queue
FIFO Page-Replacement Algorithm
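A minimal sketch of FIFO replacement: frames are reused in arrival order via a circular index, and page faults are counted. The reference string and the 3-frame size are illustrative, not taken from the figure.

```c
#include <stdio.h>
#include <stdbool.h>

int main(void) {
    int ref[] = {7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2};   /* example reference string */
    int n = sizeof(ref) / sizeof(ref[0]);
    int frames[3] = {-1, -1, -1};
    int next = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        bool hit = false;
        for (int f = 0; f < 3; f++)
            if (frames[f] == ref[i]) { hit = true; break; }
        if (!hit) {
            frames[next] = ref[i];        /* victim = oldest page (head of queue) */
            next = (next + 1) % 3;        /* new page goes to the tail            */
            faults++;
        }
    }
    printf("page faults: %d\n", faults);
    return 0;
}
```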
Belady's Anomaly
In general, we would think that if there are more frames, there will be fewer page faults
However, we see that as the number of frames increases, the number of page faults may also increase. This is known as Belady's anomaly
Optimal Page Replacement
An optimal page-replacement algorithm has the lowest page fault rate of all the algorithms
- it will never suffer from Belady's anomaly
- It guarantees the lowest possible page fault rate for a fixed number of frames
To select a victim frame, we choose the frame that will not be used for the longest period of time.
The optimal page-replacement algorithm is difficult to implement because it requires future knowledge of the reference string.
Least Recently Used Page Replacement
It associates with each page the time of that page's last use
When a page must be replaced, the LRU chooses the page that has not been used for the longest period of time.
It can be thought of as an optimal-replacement algorithm but looking backwards in time
It's less optimal than the OPT algorithm, but more optimal than FIFO
It's generally considered a good algorithm and is used frequently
Implementation of Least Recently Used Page Replacement
An LRU page-replacement algorithm may require substantial hardware assistance.
How do we keep track of the pages that were least recently used?
- By using Counters
- By using a Stack
Using Counters
With Each Page Table Entry, we associate a time-of-use field and add a logical clock or counter to the CPU.
- The clock is incremented for every memory reference
- Whenever a reference to a page is made, the contents of the clock register are copied to the time-of-use field in the page-table entry for that page
- This gives us the "time" of the last reference to each page
- We then replace the page with the smallest time value

Using a Stack
- We maintain a stack of page numbers
- Whenever a page is referenced, it is removed from the stack and put on top
- That way, the most recently used page is always at the top of the stack and the least recently used will always be at the bottom.
- Since entries must be removed from the middle of the stack, it's best to implement this approach using a doubly linked list with a head and a tail pointer.
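A small sketch of the stack approach. An array stands in for the stack here for brevity (a doubly linked list avoids the shifting cost mentioned above); the page numbers are arbitrary example values.

```c
#include <stdio.h>

#define NPAGES 5

static int stack[NPAGES] = {3, 0, 7, 2, 1};       /* top is index 0, bottom is LRU */

void reference(int page) {
    int i = 0;
    while (i < NPAGES && stack[i] != page) i++;   /* find the page                  */
    if (i == NPAGES) i = NPAGES - 1;              /* not resident: displace the LRU */
    for (; i > 0; i--) stack[i] = stack[i - 1];   /* pull entries above it down     */
    stack[0] = page;                              /* move it to the top             */
}

int lru_victim(void) {
    return stack[NPAGES - 1];                     /* bottom of stack = LRU page */
}

int main(void) {
    reference(7);
    printf("victim would be page %d\n", lru_victim());   /* page 1 */
    return 0;
}
```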
Additional-Reference Bits
LRU-Approximation Page Replacement
- Only a few computer systems provide the necessary hardware support for true LRU page replacement
- Some systems provide no support at all, so FIFO must be used
- However, many systems provide partial support for LRU in the form of a reference bit
- The reference bit (which is associated with each entry in the page table) for a page is set by the hardware whenever that page is referenced
Reference Bit Steps:
- Initially the operating system sets all reference bits for all pages to '0'.
- Whenever a page is referenced, the hardware sets the reference bit to '1'
- By checking these reference bits, we can tell which pages have been used and which haven't been used
We cannot, however, determine the order of use.
Therefore we need to use the "Additional Reference Bits Algorithm"
The additional reference bits algorithm is used to get an idea of the ordering information by recording the reference bits at regular intervals
How do we use it?
- We keep an 8-bit byte for each page in a table in memory
- at regular intervals(say, every 100 ms) a timer interrupt transfers control to the operating system
- The operating system shifts the reference bit for each page into the high-order bit of its 8-bit byte, shifting the other bits right by 1 bit and discarding the low-order bit
- These 8-bit shift registers contain the history of page use for the last 8 time periods
Example
Page No. | Shift Register Content | What It Means |
---|---|---|
P1 | 00000000 | The page has not been used for eight time periods |
P2 | 11111111 | The Page has been used at least once in each period |
P3 | 11000100 | The page has been used more recently than P4 |
P4 | 01110111 | The page has been used less recently than P3 |
When the page is accessed, it sets the MSB (left most) to one.
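A small sketch of that aging step: on each timer interrupt the current reference bit is shifted into the high-order bit of an 8-bit history byte, so a larger unsigned value means more recent use. The array names and page count are illustrative.

```c
#include <stdint.h>
#include <stdio.h>

#define NPAGES 4

static uint8_t history[NPAGES];   /* one 8-bit shift register per page     */
static uint8_t ref_bit[NPAGES];   /* set to 1 by the "hardware" on access  */

void timer_tick(void) {
    for (int p = 0; p < NPAGES; p++) {
        /* reference bit becomes the MSB; old history shifts right by one */
        history[p] = (uint8_t)((ref_bit[p] << 7) | (history[p] >> 1));
        ref_bit[p] = 0;           /* clear for the next interval */
    }
}

int main(void) {
    ref_bit[2] = 1;               /* page 2 referenced during this interval */
    timer_tick();
    printf("history of page 2 = 0x%02x\n", history[2]);   /* 0x80 */
    return 0;
}
```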
Second Chance Algorithm
Another LRU-Approximation Page Replacement
- This is basically a FIFO replacement algorithm but with an additional feature
- When a page has been selected, we check its reference bit (1 if used, 0 if not)
- If the reference bit is 0, we replace the page
- If the reference bit is 1, we give that page a second chance and move on to the next page.
- When a page gets a SECOND CHANCE, its reference bit is reset to '0' and its arrival time is reset to the current time
- If a page is given a second chance, it is not replaced until all other pages are either replaced or given their own second chances
- If a page is frequently used, its reference bit will mostly be set to 1 and hence it will never be replaced
The Second Chance Algorithm (also known as the clock algorithm) is usually implemented using a circular queue
If all reference bits are set to 1, this algorithm will work like a FIFO algorithm
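A minimal sketch of the clock over a circular queue of frames: a page with reference bit 1 gets a second chance (bit cleared, hand advances); the first page found with bit 0 is the victim. If every bit is 1, the hand sweeps the whole circle and the behaviour degenerates to FIFO, as noted above. The frame contents are example values.

```c
#include <stdio.h>

#define NFRAMES 4

static int page[NFRAMES]    = {5, 9, 3, 7};   /* pages currently in the frames */
static int ref_bit[NFRAMES] = {1, 0, 1, 1};
static int hand = 0;                          /* the clock hand */

int choose_victim(void) {
    for (;;) {
        if (ref_bit[hand] == 0) {             /* no second chance: this is the victim */
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        ref_bit[hand] = 0;                    /* give the page a second chance */
        hand = (hand + 1) % NFRAMES;
    }
}

int main(void) {
    int v = choose_victim();
    printf("victim frame %d holding page %d\n", v, page[v]);   /* frame 1, page 9 */
    return 0;
}
```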
Enhanced Second Chance Algorithm
Another LRU-Approximation Page Replacement
In the enhanced second chance algorithm, we consider the reference bit and the modify bit as an ordered pair
There are four possible cases for the Enhanced Second Chance Algorithm:
Reference Bit | Modify Bit | What It Implies | Remarks |
---|---|---|---|
0 | 0 | The page was neither used recently nor modified | Best Page To Replace |
0 | 1 | The page was not used recently, but has been modified | The page will need to be written out to the disk before it can be replaced |
1 | 0 | The page was used recently but not modified | There are chances that this page will be used again soon |
1 | 1 | The page was used and modified recently | There are chances that this page will be used again soon and we will have to write this page to the disk if we replace it (worst case) |
Counting-Based Page Replacement
Maintain a counter that keeps a count of the number of references made to each page.
Using the counter, we formulate the following schemes:
- Least Frequently Used (LFU) Page-Replacement Algorithm
- Most Frequently Used (MFU) Page-Replacement Algorithm
Least Frequently Used
Counter - keeps a count of the number of references made to each page
- Pages which are least used will have a lower count
- Pages which are frequently used will have a higher count
- Replace the page with the lowest count
What is the problem with this?
Pages that were heavily used during the initial phase of the process would have higher counts and would remain in memory even though they may not be needed further
Solution
Shift the counts right by 1 bit at regular intervals forming an exponentially decaying average usage count
Most Frequently Used
Counter - keeps a count of the number of references made to each page
- Pages which are least used will have a lower count
- Pages which are frequently used will have a higher count
The page with the smallest count is probably the most recent page brought in and is yet to be used, therefore we do not replace the page with the least count.
Both these algorithms are expensive to implement and do not approximate the optimum page replacement well. Therefore, they are not often used.
Page Buffering Algorithms
In this algorithm, we maintain a pool of free frames
When A Page Fault Occurs:
- A victim frame is chosen
- The desired page is loaded into a frame from the free frame pool before the victim page is swapped out
- Once the victim page is swapped out and written to the disk, the victim frame will be added to the free frames pool
This allows the process to restart as soon as possible without waiting for the victim page to be written to the backing store
This idea can be modified by maintaining a pool of modified pages.
Whenever the paging device is idle:
- A modified page is selected and written to the disk
- its modify bit is then reset.
This increases the probability that a page will be clean when it is selected for replacement and will not need to be written back to the disk again.
Allocation of Frames
How do we allocate the fixed amount of free memory (the available frames) among the various processes?
In a single user system:
- Memory Size = 128kb
- Page Size = 1kb
So number of frames will be 128
Suppose the operating system uses 30kb
Remaining frames for user processes = 98 (128-30)
These 98 frames would be available in the free-frame list
- If we're using demand paging, the first 98 page faults would get all free frames from the list
- When the free frames are exhausted, a page replacement algorithm would be used to replace a page from the occupied list of 98 with the 99th page.
Minimum number of frames
Constraints for allocation of frames:
- Unless there is page sharing, we cannot allocate more than the total number of available frames
- A minimum number of frames should be allocated to each process
- as the number of frames allocated to a process decreases, the page fault rate increases
- There should be enough frames to hold all the different pages that any single instruction can reference.
- Suppose in a one-level indirect addressing, a load instruction on page 15 refers to an address on page 1 which is an indirect reference to page 22, here, 3 frames are needed per process(the three pages are 15,1,22, hence 3 frames).
So how do we decide how many frames to allocate?
Allocation Algorithms
Frame allocation algorithms help us in determining how many frames should be allocated to different processes in a multiprocessing environment
The 3 Most common allocation algorithms:
- Equal Allocation
- Proportional Allocation
- Priority Allocation
Equal Allocation
The frames are equally distributed among the processes (available frames divided by the number of processes):
- if we have m frames
- and n processes
- we allocate m/n frames to each process

Say we have 98 frames and 5 processes:
- each process gets 98/5 = 19 frames
- The remaining 3 frames are kept as a free-frame buffer pool
Problems
The memory requirements of the processes are not the same; some may need more than others, so by allocating equally we are not using resources efficiently.
In a system with 64 free frames and a frame size of 1 KB, say we have two processes, P1 (10 KB) and P2 (127 KB).
With equal allocation, we allocate 64/2 = 32 frames to each process.
P1 only needs 10 frames, however, so we waste 22 frames (32 − 10 = 22).
Proportional Allocation
The frames are distributed among the processes in proportion to their sizes:
- let m be the number of available frames
- let s_i be the size of process p_i, and S be the sum of all s_i
- The number of frames allocated to p_i is a_i = (s_i / S) × m

In a system with 64 free frames and a frame size of 1 KB, say we have two processes, P1 (10 KB) and P2 (127 KB):
- The number of frames allocated to P1 = (10/137) × 64 ≈ 4
- The number of frames allocated to P2 = (127/137) × 64 ≈ 59
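A throwaway sketch of that computation using the same 64-frame example (integer truncation; the variable names are illustrative).

```c
#include <stdio.h>

int main(void) {
    long m   = 64;                    /* available frames        */
    long s[] = {10, 127};             /* process sizes, in pages */
    long S   = s[0] + s[1];           /* total size = 137        */

    for (int i = 0; i < 2; i++) {
        long a = s[i] * m / S;        /* a_i = (s_i / S) * m     */
        printf("P%d gets %ld frames\n", i + 1, a);
    }
    /* prints: P1 gets 4 frames, P2 gets 59 frames */
    return 0;
}
```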
Priority Allocation
The priority allocation algorithm is done using priorities or a combination of size and priorities rather than just the size of the processes
So we allocate more frames to processes with higher priorities so as to speed up their execution
Global vs Local Allocation
Page replacement algorithms can be classified into two broad categories:
- Global Replacement
- Local Replacement
This is important for deciding the way frames are allocated to different processes
Global Replacement
When a page fault occurs:
- Processes are allowed to select replacement frames from the **set of ALL frames**, even if those frames are currently allocated to other processes

>[!example]
>If a process P1 is of higher priority than processes P2 and P3, and all of the frames of P1 are full and it now needs one more frame due to a page fault, it can take a frame allocated to either P2 or P3

Problem with global allocation
Processes cannot control their own page fault rate
Local Replacement
When a page fault occurs:
- Processes are allowed to select replacement frames **only** from the set of frames that is allocated to them

Problems with local allocation
The memory utilization may not be optimal (eg an idle process that is not using all its frames)
Thrashing
Consider a process that doesn't have enough frames for its execution
Thrashing: when a process is spending more time in paging than actually executing the instructions.
Severe Performance Problems Due To Thrashing
The operating system monitors CPU utilization. If CPU utilization is too low, we increase the degree of multiprogramming by introducing a new process to the system.
- As processes are busy swapping pages in and out, they queue up for the paging device and the ready queue becomes empty.
- Since processes are now waiting for the paging device, the CPU utilization decreases
- The CPU scheduler sees that the CPU utilization has decreased and, in an attempt to increase CPU utilization, it increases the degree of multiprogramming by bringing in new processes
- These new processes try to take frames from the older processes which increases the number of page faults and increases the queue for the paging device
- The CPU utilization drops even further and the cycle continues
Thrashing causes system throughput to decrease massively
Thrashing can be reduced slightly by using local allocation, but this is not a great solution as it will still have many page faults. To reduce thrashing we need to use something like the Working-Set Model
Working-Set Model
How do we prevent thrashing?
We must provide a process with as many frames as it needs.
But how do we know how many frames it needs?
We make use of the working set strategy where it checks how many frames a process is actually using
The working set strategy defines the locality model of process execution
Locality model of process execution: As a process executes, it moves from locality to locality
Locality: A set of pages that are actively used together. A program is generally composed of several different localities which may overlap
Implementation of the working set model
To implement the working-set model, we use a parameter Δ, called the working-set window.
- The set of pages in the most recent Δ page references is the "working set"
- If a page is in active use, it will be in the working set
- If it is no longer being used, it will drop from the working set Δ time units after its last reference
- Thus, the working set is an approximation of the program's locality
Determining Δ
The accuracy of the working set depends on the selection of Δ:
- if Δ is too large: it may overlap several localities
- if Δ is too small: it will not encompass an entire locality
- if Δ is infinite: the working set is the set of pages touched during the entire process execution
The next important property of a working set is its size
The working-set size of a process p_i is denoted by WSS_i.
So, the total demand for frames in the system is D = Σ WSS_i.
If the total demand for frames (D) is greater than the total number of available frames (m), that is D > m, thrashing will occur because some processes will not have enough frames.
So what do we do if the demand is greater than the number of frames?
The operating system will select a process to suspend. The process's pages are swapped out and its frames are reallocated to other processes. The suspended process can be restarted later.
The working-set strategy prevents thrashing while keeping the degree of multiprogramming as high as possible, which optimizes CPU utilization.
File System Interface
Storage Management
Programs have to be loaded into main memory for execution; however, the size of main memory is limited and is usually too small to accommodate all data and programs permanently. Because of this, the computer system must provide secondary storage to back up main memory.
There are two main types of secondary storage:
- Internal: fixed inside the computer
- External: removable
Most computers use disks as the most common secondary storage device for both programs and data.
The File System provides a mechanism for:
- Storage of data and programs on the disk
- Access to data on the disk
Files are mapped by the operating system to physical devices.
Files are a collection of related information defined by its creator. These files are normally organized into directories for ease of use.
Different storage devices vary in many aspects
For example:
- Some devices transfer a character or a block of characters at a time
- some can be accessed only sequentially
- some can be accessed randomly
- some transfer data synchronously
- some transfer data asynchronously
- some are dedicated while some are shared
- and some can be read-only or read-write
Because of the variation, the operating system needs to provide a wide range of functionality to applications, to allow them to control all aspects of the devices
Concept of Files
The operating system abstracts from the physical properties of its storage devices to define a logical storage unit - the file
A file is a logical storage unit
Files are mapped by the operating system onto physical devices
These physical devices are usually non-volatile(they will keep data after power is switched off) unlike main memory which is usually volatile
File definition:
A file is a named collection of related information that is recorded on secondary storage
From a User's perspective:
- A file is the smallest allotment of logical secondary storage
- data cannot be written to secondary storage unless they are within a file
Files represent both data and programs - this data can be numeric, alphabetic, alphanumeric, or binary
- the files may be free form, such as text files, or may be formatted rigidly
In general, a file is a sequence of bits, bytes, lines, or records, the meaning of which is defined by the file's creator and user.
Many different types of information may be stored in a file
File Attributes
Naming a file
A file is named for the convenience of its human users, and is referred to by its name.
A name is usually a string of characters. Eg myfile.c
When a file is named, it becomes independent of the process, the user, and even the system that created it
The attributes of files can vary from one OS to another but typically consist of the following:
- Name: the symbolic file name is the only information kept in human readable form.
- Identifier: A unique tag, usually a number, that identifies the file within the file system. It's the non-human-readable name for the file.
- Type: this information is needed for systems that support different types of files.
- Location: this information is a pointer to a device and to the location of the file on that device.
- Size: the current size of the file(in bytes, words, or blocks)and possibly the maximum allowed size are included in this attribute.
- Protection/Permission: access-control information determines who can do reading, writing, executing and so on.
- Creation Date: the date the file was created
- Last Modified Date: the date on which the file was last modified
- Owner: the information about who the owner of the file is.
The attributes of a file are stored in the File Control Block, also known as the FCB.
File Operations
The operating system has to provide system calls to perform various file operations.
The common file operations are:
- Creating a file
- writing to a file
- reading from a file
- repositioning within a file
- deleting a file
- truncating a file
Creating A File
Steps for file creation:
- Check if space is available in the file system to create the new file, and find space on the disk for it.
- Make an entry for the new file in the directory.
Writing a File
Steps for writing a file:
- make a system call specifying both the name of the file and the information to be written to the file.
- With the given name of the file, the system searches for the file in the directory and locates it.
- A write pointer is maintained which points to the location in the file where the next write must take place.
- After the writing is done, the write pointer is updated
Reading a file
Steps for reading a file:
- Make a system call specifying both the name of the file and where(in memory) the next block of the file should be read.
- With the given name of the file, the system searches for the file in the directory and locates it
- A read pointer is maintained which points to the location in the file from where the next read must take place.
- After the reading is done, the read pointer is updated
- In general, most read and write operations use the same pointer.
Repositioning within a file(File seek)
Steps for repositioning within a file:
- The directory is searched for the appropriate file.
- The current-file-position pointer is repositioned to a given value.
Repositioning can be done without any I/O
Deleting a file
Steps for deleting a file:
- The directory is searched for the named file
- Once the directory is found, all the file space occupied by the file is released so that it can be used by other files.
Truncating a file
A user may want to erase the contents of a file but keep its attributes. Rather than forcing the user to delete the file and recreate it, this function allows all attributes to remain unchanged
Steps to truncate a file:
- Keep all the attributes of the file unchanged.
- The file length is reset to zero
- Release the file space.
Other file operations exist but are less common such as:
- appending new information to the end of an existing file
- renaming an existing file
- editing existing data on a file
- saving a file
- creating a copy of a file
- hiding a file
- sending a file
- printing a file
The open() System Call
Most of the common file operations involve searching the directory for the entry associated with the file name.
To avoid this constant searching, many systems require that an open()
system call be made before a file is first used actively.
- The operating system keeps a small table, called the open-file table which contains information about all open files.
When a file operation is requested, the file is specified via an index into this table so no searching is required.
When the file is no longer being actively used, it is closed by the process and the operating system removes its entry from the open-file table.
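A minimal sketch of this pattern in C: open() returns a file descriptor (an index into the per-process open-file table), so later reads use the descriptor rather than searching the directory again. The file name "data.txt" is just an example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[128];

    int fd = open("data.txt", O_RDONLY);      /* directory searched once, here */
    if (fd < 0) { perror("open"); return 1; }

    ssize_t n = read(fd, buf, sizeof(buf));   /* uses fd, not the file name    */
    if (n > 0)
        printf("read %zd bytes\n", n);

    close(fd);                                /* entry released when no longer in use */
    return 0;
}
```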
File Types
When a file system or an operating system is designed, the file types that the operating system should recognize and support must be considered
If an operating system recognizes a type of file, it can then operate on that file in standard ways
The most common technique for implementing file types is to include the type as part of the file name
This is specified by the file extension that appears after a period in the file name
Common file types
File Type | Usual Extension | Function |
---|---|---|
Executable | exe, com, bin, or none | ready-to run machine language program |
Object | obj, o | compiled machine language that has not been linked |
Source Code | c, cpp, pas, java, asm, py | Source code in various languages |
Batch | bat, sh | commands for the command interpreter |
Text | txt, doc | textual data, documents |
Word Processor | wp, doc, rtf | Various word processor formats |
Library | lib, a | libraries of routines for programmers |
Print or View | ps, pdf, jpg, png | ASCII or binary files in a format for printing or viewing |
Archive | arc, zip, tar | Related files grouped into one file sometimes compressed for archiving or storage |
Multimedia | mp3, mp4, avi | Binary files containing audio or A/v information |
Other ways of identifying file types:
In Mac OS X, each file has a type attribute associated with it that identifies its type
In Unix systems, it uses a crude magic number stored at the beginning of some files to indicate roughly the type of file.
Access Methods
The information in files can be accessed in several ways. Some systems provide just one access method for files while other systems may support many access methods.
The two most common access methods are:
- Sequential Access
- Direct Access
Sequential Access
- the simplest access method
- information in the file is processed in order, one record after the other.
- most common editors, compilers, etc access files in this method
Read Operation
read next: reads the next portion of the file and automatically advances the file pointer, which tracks the I/O location
Write Operation
write next: appends to the end of the file and advances to the end of the newly written material (the new end of the file)
Direct Access
Also known as relative access
A file is made up of fixed length logical records that allow programs to read and write records rapidly in no particular order.
In direct access, the file is viewed as a numbered sequence of blocks or records.
With direct access, there are no restrictions on the order of reading or writing for a direct-access file.
Direct-access files are great for immediate access with files of large amounts of information such as databases.
Simulation of sequential access on a direct-access file
Read Operation
read n: reads block number n of the file
Write Operation
write n: writes to block number n of the file
The block number provided by the user to the operating system is normally a relative block number which is an index to the relative beginning of the file.
Directory Structure
The file system of a computer can be extensive, as systems need to store huge numbers of files on disk. To manage this, the files must be organized properly.
Directories are used for organizing this data
Storage Structure
A disk or any large storage device can be used entirely for a single file system; however, it's often better to have multiple file systems on a disk, or to use parts of a disk for a file system and other parts for things like swap space or raw disk space.
These parts are called partitions, slices, or minidisks, though "partition" is the most common term. A file system can be created on each of these parts of the disk, and the parts can also be combined to form a structure called a volume, on which a file system can likewise be created.
Volumes
- each volume can be thought of as a virtual disk.
- Volumes can store multiple operating systems, allowing the system to boot and run more than one operating system
- each volume that contains a file system must also contain information about the files in the system
- This information is kept in entries in a device directory
- a directory records information such as name, location, size, and type for all files in that volume
Operations That Can Be performed on a directory
- Search for a file
- Create a file
- Delete a file
- List a directory
- rename a file
- traverse the file system
Notice, some of these are the same operations that can be performed on a file.
Single-Level Directory
A single-level directory is the simplest directory structure
all the files are contained in the same directory
Advantages
- implementation is easy as there is just a single directory
- File operations like creating, reading, deleting, repositioning, renaming, etc are easily done
- searching for files will be fast if there are not many files and if they are small in size.
Disadvantages
- problems start arising when the number of files increases or when the system has multiple users
- all the files must have unique names since the files are in the same directory
- As the number of files increases:
- users will find it difficult to remember the names of each individual file
- searching for a file becomes difficult
Two-Level Directory
Each user has his or her own directory called the User File Directory(UFD)
Information about each UFD is stored in a system directory called the Master File Directory (MFD)
The MFD is indexed by user name or account number, and each entry points to the UFD for that user
When a user refers to a particular file, only their own UFD is searched. Thus, different users may have files with the same file name as long as all the file names within each UFD are unique
Problems with Two-Level Directories
- Users are isolated from each other
- When two or more users want to cooperate on a task and access the same file, it becomes difficult as the system doesn't permit this
- If users are allowed to access other users' files, then the files must be named with their complete path which specifies who is the owner of the file
- Every system has its own syntax for naming files and directories
- There are system files that have to be used by several users, when these files need to be accessed by a user, it will be searched for in that user's UFD. However, it won't be available there.
Tree Structured Directories
The natural generalization of a two-level directory structure is to extend the directory structure to a tree of arbitrary height
This generalization allows users to create their own subdirectories and organize their files accordingly
A tree is the most common directory structure
The tree has a root directory and every file in the system has a unique path name
Current Directory
Each process will have a current directory which should contain most of the files that are of current interest to the process.
When reference is made to a file, the current directory is searched
If a file is needed that is not in the current directory, the user must either specify a path name or change the current directory to the directory holding the file
Path Names
There are two types of path names:
- Absolute: Begins at the root and follows a path down to the specified file, giving the directory names on the path.
- Relative: defines a path from the current directory.
Deleting a Directory
If the directory is empty, its entry in the directory that contains it can simply be deleted.
If the directory is not empty:
- some systems will not allow deleting a directory unless it's empty, so to delete a directory, all its files and subdirectories must be deleted first.
Acyclic-Graph Directories
Acyclic Graph Directories are good for sharing
An acyclic graph (a graph with no cycles) allows directories to share subdirectories and files.
When people are working as a team, all the files they want to share can be put into one directory
The UFD of each team member will contain this directory of shared files as a sub directory
Implementation of shared files and subdirectories
- Links: a link is effectively a pointer to another file or subdirectory.
- Duplicating: duplicate all information about shared files in both sharing directories. The entries are identical and equal. (A major concern here is maintaining consistency when the files are modified)
Problems with Acyclic Graph Structures
- Same file with many paths and names: A file may now have multiple absolute path names, and different file names may refer to the same file. This may give the system a wrong count when trying to count the total number of files
- Deleting of Shared Files: if the file is deleted whenever someone deletes it, dangling pointers to the now non-existent file are left behind. If shared files are implemented using symbolic links, then deletion of a file by one user will delete only the link and won't affect the original file; however, if the original file is deleted, the links are left dangling.
General Graph Directory
If we start with a two-level directory and allow users to create subdirectories, a tree-structured directory is usually the result
If we continue adding files and subdirectories in this way, the tree structure is preserved and continues
However, when we add links to an existing tree-structured directory, the tree structure is destroyed, resulting in a simple graph structure
Problems
- Searching for files: If cycles are allowed to exist in the directory, we want to avoid searching any component twice, for reasons of correctness as well as performance (think looping). A poorly defined algorithm may result in an infinite loop. A solution to this is to arbitrarily limit the number of directories that will be accessed during a search.
- Deleting files: in an acyclic-graph directory structure, if the reference count is 0 for any file, there are no references to that file or directory and it can safely be deleted. However, when cycles exist, the reference count may not be 0 even when it is no longer possible to refer to a directory or file, because of the possibility of self-referencing cycles
Garbage Collection
Garbage Collection: A scheme to determine when the last reference has been deleted and the disk space can be reallocated.
- Traverse the entire file system and mark everything that can be accessed
- Traverse the file system a second time and free everything that was not marked in the first pass
Garbage collection is extremely time consuming and thus rarely used.
Protection (Types of Access)
The information stored in a computer system must be kept safe from:
- Physical Damage (Reliability)
- Improper Access (Protection)
Physical Damage
File systems can be damaged by:
- hardware problems
- power surges/failures
- head crashes
- dirt
- extreme temperatures
- vandalism, etc
Reliability is generally provided by duplicate copies of files
Improper Access
- file systems must be allowed to be accessed only by authorized users
- in a single user system, the protection can be done easily
Types of Access
- Complete Protection: Access is prohibited to everyone except the owner
- Free Access: Access is provided to everyone
- Controlled Access: Can be provided by limiting the types of file access that can be made by different users
File Operations that may be controlled
- Read: Read from the file
- Write: write or rewrite the file
- Execute: loading the file into memory and executing it
- Append: write new information at the end of the file
- Delete: delete the file and free its space for possible reuse
- List: list the name and attributes of the file
Other operations such as renaming, copying, and editing may also be controlled
Protection (Access Control)
The most common approach to the protection problem is to make access dependent on the identity of the user
We can implement this using an Access Control List (ACL)
With each file and directory, an access-control list is associated specifying user names and the types of access allowed for each user.
Access Control List
When a user requests access to a particular file, the operating system checks the access list associated with that file
If that user is listed for the requested access, the access is allowed.
If not, a protection violation occurs and the user job is denied access to the file.
Problems with this approach:
- Constructing this list may be tedious especially if we do not know in advance the list of users in the system
- The directory entry, previously of fixed size, now needs to be of variable size resulting in more complicated space management
Solutions to this problem
Use a condensed version of the access list
Systems mostly recognize three classifications of users in connection with each file:
- Owner: The user who created the file is the owner
- Group: a set of users who are sharing the file and need similar access is a group or work group
- Other: all other users in the system
Permissions in a Unix System
On Linux/Unix, the permission classes go user | group | other. If the listing is prefixed with a d, the entry is a directory
Permissions in a Windows System
File System Implementation
Overview
The file system resides permanently on secondary storage, which is designed to hold a large amount of data in a permanent manner.
Different operating systems support different file systems
for example:
- Unix: Unix File System
- Windows: FAT, FAT32, NTFS
Several on-disk and in-memory structures are used to implement a file system
On-disk structures
On disk, the file system may contain information about:
- How to boot an operating system stored there
- The total number of blocks
- The number and location of free blocks
- The directory structure
- Individual files
Boot Control Block
This is an on-disk structure
There is a boot control block for each volume
- Typically, the boot control block is the first block of a volume
- it contains information needed by the system to boot an operating system from that volume
- If the disk does not contain an operating system, this block can be empty
In the Unix file system, it is called a boot block
In NTFS it's called the boot sector
Volume control block
Each volume contains a volume control block that contains information such as:
- The number of blocks in the partition
- Free block count(how many blocks are free)
- Free-block pointers
- Free FCB(File Control Block) count and FCB pointers
In Unix this is called a superblock
In NTFS this is stored in the master file table
Directory Structure Per File System
In the Unix file system it includes file names and associated inode numbers
In NTFS, it is stored in the master table
Per-File FCB
This contains details about the file such as:
- File permissions
- ownership
- size
- location of the data blocks
In the UNIX file system, it is known as an inode
In NTFS it's stored in the master file table
an INODE is just the per-file File Control Block
Typical File Control Block
In-Memory Structures
In memory information is used for both file-system management and performance improvement via caching
This is all done in kernel memory
Here, the data is loaded at mount time, and discarded when the drive is dismounted
These structures include:
- In-memory mount table: The in memory mount table contains information about each mounted volume
- In-memory directory-structure cache: Holds the directory information of recently accessed directories(For directories at which volumes are mounted it can contain a pointer to the volume table)
- System-wide open-file table: Contains a copy of the FCB of each open file, as well as other information.
- Per-process open-file table: contains a pointer to the appropriate entry in the system-wide open-file table as well as other information
- A process requests open of "file name".
- The directory structure is searched for the given file name.
- The entry is loaded into kernel memory if it's not already in memory.
- The directory structure includes the file control block (fcb) of the file, which is loaded into the system-wide open-file table.
- The per-process open-file table contains pointers to each open file (entry in the system-wide open-file table) that the process can access.
- A file descriptor (on UNIX systems) or file handle (on Windows) is an index into the per-process open-file table that is returned to the calling process to allow it to access the now-open file.
- A read of a data block of an open file uses the file descriptor (or file handle) as an index into the per-process open-file table, which points to the file control block in the system-wide open-file table
- The file control block has the information needed to find the data blocks, and the appropriate data from the data blocks is returned to the calling process
Virtual File Systems
Modern operating systems must concurrently support multiple types of file systems
Object Oriented Techniques
Most operating systems, including UNIX, use object-oriented techniques to simplify, organize, and modularize the implementation of multiple file systems
This allows:
- Very dissimilar file-system types to be implemented within the same structure, including network files systems such as NFS.
- Users to access files that are contained within multiple file systems on the local disk or even on file systems available across the network.
The file-system implementation consists of three major layers:
- Layer 1: The file system interface
- Layer 2: the virtual file system
- Layer 3: the layer implementing the file system type or the remote-file-system protocol
Directory Implementation
The kind of directory allocation and directory-management algorithms used in a system affects the efficiency, performance, and reliability of the file system
The two main methods for implementing a directory are:
- Linear List
- Hash Table
Linear List
This is the simplest method of directory implementation.
Here, a linear list of file names is used which points to the data blocks
This is simple to implement, but time consuming to execute
For Example
To create a new file:
- Search the directory to be sure that no existing file has the same name
- add a new entry at the end of the directory
To delete a file:
- Search the directory for the named file
- release the space allocated to it
The biggest disadvantage of a linear list is that a linear search is required to find a file. This slows down the whole process, as users frequently need to access directory information.
To overcome this, we can use a cache to store information about the files that are recently used, but this is still not very efficient
Hash Table
Here, a linear list stores the entries, but a hash data structure is also used
- The hash table takes a value computed from the file name and returns a pointer to the file name in the linear list
Problems with Hash Table Implementation
- Collisions: situations in which two file names hash to the same location
- Hash tables are generally a fixed size, and the hash functions depend on that size
Contiguous Disk Space Allocation
Contiguous allocation requires that each file occupy a set of contiguous blocks on the disk
Disk addresses define a linear ordering on the disk
Contiguous allocation of a file is defined by the disk address of the first block and the length of the file (in block units)
Example:
If a file is n blocks long and starts at location b, then it occupies blocks b, b+1, b+2, ..., b + n − 1.
The directory entry for each file indicates the address of the starting block and the length of the area allocated for this file.
Advantages of Contiguous Space Allocation
Accessing a file that has been allocated is easy
For Sequential Access
- The file system remembers the disk address of the last block referenced
- When necessary, it reads the next block.
For Direct Access
- For accessing block i of a file that starts at block b, we can immediately access block b + i
Disadvantages of Contiguous Space Allocation
- Finding space for a new file is difficult. First-fit and best-fit algorithms can be used for selecting free holes from a set of available holes (see the sketch after this list). However, both algorithms suffer from external fragmentation
- As files are allocated and deleted, the free disk space is broken into smaller pieces
- Determining how much space is needed for a file is a difficult task.
- If the space allocated for a file is too small, then either the user program must be terminated and more space allocated, or the system finds a larger hole, copies the contents to the new space, and releases the previous space
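The hole-selection step mentioned above can be sketched in a few lines of Python (hypothetical helper names); each hole is a (starting block, length) pair:

```python
# Hypothetical sketch of selecting a free hole for contiguous allocation.
# Both strategies can still leave external fragmentation behind.

def first_fit(holes, size):
    """Return the start of the first hole large enough for `size` blocks."""
    for start, length in holes:
        if length >= size:
            return start
    return None

def best_fit(holes, size):
    """Return the start of the smallest hole that is still large enough."""
    candidates = [(length, start) for start, length in holes if length >= size]
    return min(candidates)[1] if candidates else None

holes = [(0, 3), (10, 8), (25, 5)]
print(first_fit(holes, 4))   # -> 10 (first hole with at least 4 blocks)
print(best_fit(holes, 4))    # -> 25 (tightest fit, leaves a 1-block fragment)
```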
Linked Disk Space Allocation
The linked disk space allocation method solves all the problems of contiguous allocation.
- Each file is a linked list of disk blocks which may be scattered anywhere on the disk.
- The directory contains a pointer to the first and last blocks of the file.
- Each block contains a pointer to the next block
- For example, if each block is 512 bytes in size and the disk address (the pointer) requires 4 bytes, then the user sees 508 bytes as usable.
Advantages of Linked Disk Space Allocation
- There is no external fragmentation here (think of this like a page-table style implementation but for storage)
- The size of a file does not need to be declared when the file is created
- Creating, reading, and writing files can be done easily
Creating A File
- Simply create a new entry in the directory
- each directory has a pointer to the first disk block of the file
- the pointer is initialized to nil (the end-of-list pointer value) to signify an empty file
Writing to a file
- find a free block using the free-space management system
- this new block is written to and is linked to the end of the file.
Reading a file
- simply read blocks by following the pointers from block to block
Disadvantages of Linked Disk Space Allocation
- It can be used effectively only for sequential access of files. To find the ith block of a file, we must start at the beginning of that file and follow the pointers until we reach the ith block (a short sketch follows this list).
- Space is required for the pointers. If each block needs a pointer (say 4 bytes out of a 512-byte block), that space comes out of usable storage. This can be reduced by allocating clusters of blocks instead of single blocks, at the cost of internal fragmentation.
- Reliability: if a pointer used for linking is lost or damaged, the rest of the file's chain becomes unreachable
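A minimal Python sketch of linked allocation (hypothetical block numbers), showing why reaching the ith block costs i pointer traversals:

```python
# Hypothetical sketch of linked allocation: each block stores its data plus a
# pointer to the next block, so reaching block i means following i pointers.

disk = {                       # block number -> (data, next block or None)
    7:  ("part 0", 21),
    21: ("part 1", 3),
    3:  ("part 2", None),      # nil pointer marks the end of the file
}

def read_block(first_block, i):
    """Start at the head of the file and follow i pointers."""
    block = first_block
    for _ in range(i):
        block = disk[block][1]
        if block is None:
            raise IndexError("block index past end of file")
    return disk[block][0]

print(read_block(7, 2))        # -> "part 2", after touching blocks 7 and 21 first
```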
File Allocation Table (FAT)
The file allocation table is a variation of the linked allocation method
- This method is used by MS-DOS and OS/2 operating systems
- A section of disk at the beginning of each volume is set aside to contain a file allocation table
- The table has one entry for each disk block and is indexed by block number
- The directory entry contains the block number of the first block of the file
- The table entry indexed by that block number contains the block number of the next block in the file
- This chain continues until the last block, which has a special end-of-file value as the table entry.
- Unused blocks are indicated by a 0 table value
Allocating a new block
- Find the first 0 valued table entry
- Replace the previous end-of-file value with the address of the new block
- replace 0 with the end-of-file value
Advantage of FAT
- random-access time is improved because the disk head can find the location of any block by reading the information in the file allocation table
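A minimal Python sketch of the FAT idea (hypothetical block numbers and markers): the chain lives entirely in the table, so it can be followed without touching the data blocks:

```python
# Hypothetical sketch of a FAT: one table entry per disk block, indexed by
# block number. FREE (0) marks an unused block, EOF marks a file's last block.

EOF, FREE = -1, 0
fat = [FREE] * 16
fat[0] = fat[1] = EOF                    # pretend blocks 0-1 are reserved
fat[4], fat[9], fat[12] = 9, 12, EOF     # a file occupying blocks 4 -> 9 -> 12

def file_blocks(first_block):
    """Follow the chain in the table alone; no data blocks are read."""
    blocks, block = [], first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks

def allocate_block(last_block):
    """Extend a file: find the first FREE entry and relink the chain."""
    new_block = next(b for b, entry in enumerate(fat) if entry == FREE)
    fat[last_block] = new_block          # old end-of-file entry now points here
    fat[new_block] = EOF                 # the new block becomes the last block
    return new_block

print(file_blocks(4))                    # -> [4, 9, 12]
allocate_block(12)
print(file_blocks(4))                    # -> [4, 9, 12, 2]
```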
Indexed Disk Space Allocation
The main disadvantage of linked allocation is that it cannot support efficient direct access. Indexed allocation solves this problem by bringing all the pointers together into one location called the index block. This is just a standard disk block that contains all the pointers
- Each file has its own index block which is an array of disk-block addresses
- The ith entry in the index block points to the ith block of the file
- The directory contains the address of the index block
- To find and read the ith block, we use the pointer in the ith index-block entry
The -1 entries in the index block denote pointers that are not in use; they indicate the end of the file.
When Creating A file
- All pointers in the index block are set to nil (-1)
- When the ith block is first written, a block is obtained from the free-space manager and its address is put in the ith index-block entry
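A minimal Python sketch of an index block (hypothetical sizes and helper names), where -1 marks unused entries:

```python
# Hypothetical sketch of indexed allocation: each file gets one index block,
# an array of disk-block addresses; -1 means "not in use yet".

POINTERS_PER_INDEX_BLOCK = 8
index_block = [-1] * POINTERS_PER_INDEX_BLOCK   # all pointers nil at creation
disk = {}                                        # block number -> data
free_blocks = [30, 31, 32, 33]                   # from the free-space manager

def write_block(i, data):
    """First write to block i grabs a free block and records it in entry i."""
    if index_block[i] == -1:
        index_block[i] = free_blocks.pop(0)
    disk[index_block[i]] = data

def read_block(i):
    """Direct access: the i-th entry points straight at the i-th file block."""
    return disk[index_block[i]]

write_block(0, "first block")
write_block(3, "fourth block")        # any free block can satisfy the request
print(read_block(3), index_block)     # -> fourth block [30, -1, -1, 31, -1, -1, -1, -1]
```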
Advantages of Indexed Allocation
- Supports direct access, without suffering from external fragmentation
- Any free block on the disk can satisfy a request for more space
Disadvantages of Indexed Allocation
- It suffers from wasted space. The pointer overhead of the index block is generally greater than the pointer overhead of linked allocation
How large should index blocks be?
- If it is too small, it will not be able to hold enough pointers for a large file.
- If it is too large, space is wasted
We can deal with this in a couple of ways:
- A linked Scheme: You can have multiple index blocks that are linked to each other
- A multilevel index: A variant of linked representation uses a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks. To access a block, the operating system uses the first-level index to find a second-level index block and then uses that block to find the desired data block. This approach could be continued to a third or fourth level, depending on the desired maximum file size. With 4,096-byte blocks, we could store 1,024 four-byte pointers in an index block. Two levels of indexes allow 1,048,576 data blocks and a file size of up to 4 GB (this arithmetic is worked out in the sketch after this list).
- A combined scheme: An index block is created and the first few entries point directly to the disk blocks, then the last few entries will point to index blocks
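The two-level arithmetic from the multilevel-index bullet, worked out in a few lines of Python (assuming 4,096-byte blocks and 4-byte pointers, as above):

```python
# Hypothetical sketch of the two-level index arithmetic quoted above.

BLOCK_SIZE = 4096
POINTER_SIZE = 4
pointers_per_block = BLOCK_SIZE // POINTER_SIZE        # pointers per index block

data_blocks = pointers_per_block ** 2                  # two index levels
max_file_size = data_blocks * BLOCK_SIZE

print(pointers_per_block)              # 1024
print(data_blocks)                     # 1048576 addressable data blocks
print(max_file_size // 2**30, "GiB")   # 4 GiB maximum file size
```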
The Unix Inode
This is a modified indexed allocation method
An inode is a data structure in UNIX operating systems that contains important information pertaining to files within a file system
- When a file system is created in UNIX, a fixed number of inodes is also created
- Usually, about 1% of the total file system disk space is allocated to the inode table
- Metadata: contains information like mode, owners, timestamps, size, and block count
- Direct Blocks: These contain pointers (addresses) directly to the data blocks - usually around 10 to 12 of them
- Single Indirect Blocks: There is just one pointer or address that points to an index block which will contain the pointers to the disk blocks
- Double Indirect Blocks: This points to the address of an index block which contains pointers(addresses) of other index blocks which point to the data
- Triple Indirect Blocks: Contains a three level index block scheme.
This is good for maintaining large files and small files alike.
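A rough Python sketch of how a logical block number maps onto the inode's pointer levels, assuming 12 direct pointers and 1,024 pointers per index block (hypothetical but typical values):

```python
# Hypothetical sketch of the inode block scheme: given a logical block number,
# decide whether it is reached through a direct pointer or through single,
# double, or triple indirection.

DIRECT = 12        # direct pointers in the inode
PTRS = 1024        # pointers per index block (4 KiB blocks, 4-byte pointers)

def lookup_level(logical_block):
    if logical_block < DIRECT:
        return "direct"
    logical_block -= DIRECT
    if logical_block < PTRS:
        return "single indirect"
    logical_block -= PTRS
    if logical_block < PTRS ** 2:
        return "double indirect"
    return "triple indirect"

print(lookup_level(5))          # direct          (small files stay cheap)
print(lookup_level(100))        # single indirect
print(lookup_level(500_000))    # double indirect (large files still reachable)
```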
Performance of Disk Space Allocation
We have seen 3 main types of allocation methods:
- Contiguous Allocation
- Linked Allocation
- Indexed Allocation
These methods vary in storage efficiency and data-block access times. These are important criteria in selecting a method for an operating system to implement
Contiguous Allocation
Requires only one access to get a disk block for any type of access (sequential or direct). Only the initial block address has to be kept in memory.
This is good for:
- Sequential Access
- Direct Access
Linked Allocation
The address of the next block can be kept in memory and read directly. This makes it well suited to sequential access, but an access to the ith block may require i disk reads, starting from the beginning of the file.
This is good for:
- Sequential Access
Some systems make use of both contiguous and linked allocation methods. Files that require direct access use contiguous allocation, and files that are accessed sequentially use linked allocation.
The type of access must be declared when the file is created.
Indexed Allocation
This is good for:
- Sequential Access
- Direct Access
Performance in this system depends on:
- Index Structure
- Size of the file
- Position of the desired block to be read
Some systems combine contiguous allocation with indexed allocation. For small files they use contiguous allocation. If the file grows large, it is switched to an indexed allocation method.
Free-Space Management
We know that disk space is limited, so we need to reuse the space from deleted files for new files if possible.
To keep track of free disk space, the system maintains a free-space list which records all free disk blocks, those not allocated to some file or directory
To Create A File
- We search the free-space list for the required amount of space
- We then allocate that space to the new file
- Then this space is removed from the free space list
When A File Is Deleted
- Its disk space is added to the free-space list
The Free Space List
There are a few ways we can implement a free space list:
- Bit Vector
- Linked List
- Grouping
- Counting
Bit Vector
The free-space list is implemented as a bit map or bit vector
- Each block is represented by 1 bit
- If the block is free, the bit is set to 1; if the block is allocated, the bit is set to 0
Consider a disk where blocks 2,3,4,5,8,9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the rest of the blocks are allocated. The free-space bitmap would be:
001111001111110001100000011100000 ...
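A few lines of Python reproducing the bitmap above and showing how cheap it is to find the first free block:

```python
# Bit i is 1 when block i is free and 0 when it is allocated.

TOTAL_BLOCKS = 33
free_blocks = {2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27}

bitmap = "".join("1" if b in free_blocks else "0" for b in range(TOTAL_BLOCKS))
print(bitmap)                    # 001111001111110001100000011100000

def first_free(bitmap):
    """Finding the first free block is just a scan for the first 1 bit."""
    return bitmap.index("1")

print(first_free(bitmap))        # -> 2
```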
Advantage
It is relatively simple and efficient to find the first free block, or n consecutive free blocks, on the disk
Disadvantage
It's inefficient unless the entire vector is kept in main memory
Linked List Method
All the free disk blocks are linked together
- A pointer pointing to the first free block is kept in a special location on the disk.
- It is cached in memory
- This first block contains a pointer to the next free disk block and so on
Grouping Method
Here we store the addresses of n free blocks in the first free block
- The first n - 1 of these blocks are actually free
- The last block contains the addresses of another n free blocks, and so on
- The addresses of a large number of free blocks can now be found quickly, unlike with the standard linked-list approach
Counting Method
The counting method takes advantage of the fact that, generally, several contiguous blocks may be allocated or freed simultaneously, particularly when space is allocated with the contiguous allocation algorithm or through clustering.
- We keep the address of the first free block and the number n of free contiguous blocks that follow the first block
- Each entry in the free-space list then consists of a disk address and a count
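A minimal Python sketch of the counting representation (hypothetical helper name), collapsing the free-block list from the bit-vector example into (address, count) entries:

```python
# Hypothetical sketch of the counting method: the free-space list stores
# (first free block, number of contiguous free blocks) pairs instead of
# listing every free block individually.

def to_runs(free_blocks):
    """Collapse a sorted list of free block numbers into (address, count) entries."""
    runs = []
    for b in sorted(free_blocks):
        if runs and b == runs[-1][0] + runs[-1][1]:   # extends the current run
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)
        else:
            runs.append((b, 1))                       # start a new run
    return runs

free = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]
print(to_runs(free))   # -> [(2, 4), (8, 6), (17, 2), (25, 3)]
```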