Operating Systems

CSS 430 - Operating Systems
Neso Operating Systems Video

Note

A modern general-purpose computer system consists of one or more CPUs and a number of device controllers connected through a common bus that provides access to shared memory

Pasted image 20230201134737.png

System Calls

System calls provide an interface to the services made available by an operating system

Two modes

When a user program makes a system call, it is briefly switched from user mode to kernel mode so it can use those protected resources (a mode switch)

The 6 types of System Calls Are:

Interprocess Communication

Any process that shares data with another process is a cooperating process
Pasted image 20230201151446.png

Shared Memory

In the shared memory model, a region of memory that is shared by cooperating processes is established
Processes can then exchange information by reading and writing data to the shared region

Producer Consumer Problem

A producer process produces information that is consumed by a consumer process

We need to synchronize the producer and consumer so that the consumer doesn't try to consume what the producer has not produced yet
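A minimal sketch of the shared circular buffer commonly used for this problem (BUFFER_SIZE, in, and out are illustrative names, not from the course code):

const int BUFFER_SIZE = 10;

struct item { int data; };

item buffer[BUFFER_SIZE];   // region of memory shared by producer and consumer
int in = 0;                 // next free slot (written by the producer)
int out = 0;                // next full slot (read by the consumer)

// Producer: busy-waits while the buffer is full
void produce(item next) {
    while (((in + 1) % BUFFER_SIZE) == out)
        ;   // buffer is full, do nothing
    buffer[in] = next;
    in = (in + 1) % BUFFER_SIZE;
}

// Consumer: busy-waits while the buffer is empty
item consume() {
    while (in == out)
        ;   // buffer is empty, do nothing
    item next = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    return next;
}

With this scheme the buffer can hold at most BUFFER_SIZE - 1 items, since in == out is reserved to mean "empty".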

Two Kinds of Buffers

Message Passing

Communication takes place by means of messages exchanged by the cooperating processes

Message passing provides a mechanism to allow processes to communicate and synchronize their actions without sharing the same address space and is particularly useful in a distributed environment where the communicating processes may reside on different computers connected by a network

Provides at least 2 operations

Messages sent can be either fixed or variable in size

If processes P and Q want to communicate, they must send messages to and receive messages from each other.
Pasted image 20230202131757.png
A communication link must exist between them

There are several methods for logically implementing a link, and the send()/receive() operations, like

Direct or Indirect Communication

Processes can either use direct or indirect communication to communicate with each other

Direct Communication

Each process must explicitly name the sender or recipient

A link is associated with exactly two processes
Between each pair there is exactly one link

A disadvantage of direct communication is the limited modularity of the resulting process definitions. Changing the identifier of a process may necessitate examining all other process definitions

Indirect communication

Messages are sent to and received from mailboxes or ports

A communication link in this scheme has the following properties:

A mailbox may be owned by either a process or the operating system; if it is owned by a process, the mailbox disappears when that process terminates.

Blocking Send: the sending process is blocked until the message is received by the receiving process or by the mailbox.
Blocking Receive: the receiver blocks until a message is available

Nonblocking send: the sending process sends the message and then resumes operation
Non Blocking Receive: the receiver retrieves either a valid message or a NULL
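As a rough sketch of how a blocking receive and a nonblocking send might look inside a single program (the Mailbox class and its unbounded queue are assumptions for illustration, not an OS API):

#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

class Mailbox {
    std::queue<std::string> messages;
    std::mutex m;
    std::condition_variable notEmpty;
public:
    // Nonblocking send: deposit the message and resume immediately
    void send(const std::string& msg) {
        std::lock_guard<std::mutex> lock(m);
        messages.push(msg);
        notEmpty.notify_one();
    }
    // Blocking receive: the caller waits until a message is available
    std::string receive() {
        std::unique_lock<std::mutex> lock(m);
        notEmpty.wait(lock, [this] { return !messages.empty(); });
        std::string msg = messages.front();
        messages.pop();
        return msg;
    }
};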

Buffering

Whether communication is direct or indirect, messages exchanged by communicating processes reside in a temporary queue. Such queues can be implemented in three ways:

Sockets

A socket is an endpoint for communication

A pair of processes communicating over a network employ a pair of sockets - one for each process

A socket is identified by an IP address concatenated with a port number

The server waits for incoming client requests by listening to a specified port. Once a request is received, the server accepts a connection from the client socket to complete the connection

The packets traveling between hosts are delivered to the appropriate process based on the destination port number
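A bare-bones sketch of the server side using POSIX sockets (the port number 5000 and the missing error handling are purely illustrative):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    // The server socket is identified by its IP address plus a port number
    int server = socket(AF_INET, SOCK_STREAM, 0);

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;   // any local interface
    addr.sin_port = htons(5000);         // the port the server listens on

    bind(server, (sockaddr*)&addr, sizeof(addr));
    listen(server, 5);                   // wait for incoming client requests

    // accept() completes the connection and returns a new socket for this client
    int client = accept(server, nullptr, nullptr);

    char buf[64];
    read(client, buf, sizeof(buf));      // data is delivered here based on the destination port

    close(client);
    close(server);
}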

Threads

A thread is a basic unit of execution
A thread is comprised of:

It shares its code, data, and file sections with its parent process and with the other threads belonging to that process
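For example, a quick std::thread sketch showing two threads sharing the process's data section (the counter variable and loop counts are made up):

#include <iostream>
#include <thread>

int counter = 0;   // lives in the process's data section, visible to every thread

void work() {
    for (int i = 0; i < 1000; ++i)
        ++counter;   // both threads update the same variable (unsynchronized here)
}

int main() {
    std::thread t1(work);
    std::thread t2(work);   // t1 and t2 share code, data, and open files with main
    t1.join();
    t2.join();
    std::cout << counter << std::endl;   // may print less than 2000 because of the race
}

Compile with -pthread on Linux; the unsynchronized updates foreshadow the race conditions covered later under process synchronization.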

Multithreading Benefits

Responsiveness

Multithreading an interactive application may allow a program to continue running even if part of it is blocked or is performing a lengthy operation, thereby increasing responsiveness to the user

Resource Sharing

By default, threads share the memory and the resources of the process to which they belong. The benefit of sharing code and data is that it allows an application to have several different threads of activity within the same address space

Economy

Allocating memory and resources for process creation is costly. Because threads share the resources of the process to which they belong, it is more economical to create and context switch threads

Utilization of multiprocessor architecture

The benefits of multithreading can be greatly increased in a multiprocessor architecture, where threads may be running in parallel on different processors. A single-threaded process can only run on one CPU no matter how many are available. Multithreading on a multi-CPU machine increases concurrency

Multi threading models

In general, there are two types of threads:

For user threads to work with kernel threads, there must exist a relationship between them.

There are three common ways of establishing this relationship

Many-to-one Model

Pasted image 20230203160855.png

Advantages

Disadvantages

One-to-one Model

Pasted image 20230203161127.png

Advantages

Disadvantages

Many-to-many model

Pasted image 20230203161434.png

Implemented in most systems

Advantages

Scheduling

In a single-processor system, only one process can run at a time. All others must wait until the CPU is free and can be rescheduled.

The objective of multiprogramming is to have some process running at all times to maximize utilization

A process is executed until it must wait, typically for the completion of some I/O request

Several processes are kept in memory at one time

When one process has to wait, the operating system takes the CPU away from that process and gives the CPU to another process and this pattern continues

CPU and I/O Burst Cycles

Process execution consists of a cycle of CPU execution and I/O wait. Processes alternate between these two states.

When a process is executing it is in a CPU burst; otherwise it is waiting for an I/O event

CPU Burst: the time when a process is under CPU execution

I/O burst: when the process is waiting on an I/O event

Pasted image 20230203164525.png

Eventually, the final CPU burst ends with a system request to terminate execution

Preemptive and NonPreemptive Scheduling

Dispatcher: The dispatcher is the module that gives control of the CPU to the process selected by the short-term scheduler. The time it takes for the dispatcher to stop one process and start another running is known as the dispatch latency

Preemptive Scheduling

CPU- Scheduling may take place under the following four circumstances

For the first and last circumstances, there is no choice in terms of scheduling; we call this nonpreemptive: once a process is running, it is allowed to use the CPU until it gives it up, either because its execution completes or because it switches to the waiting state

Scheduling Criteria

CPU Utilization

We want to keep the CPU as busy as possible. Conceptually, CPU utilization can range from 0 to 100 percent. In a real system, it should range from 40 percent to 90 percent

Throughput

The number of processes completed per unit of time. If the CPU is busy executing processes, work is being done.

Turnaround Time

The time it takes from a process entering the ready queue to its completion

Turn Around Time = Completion Time - Arrival Time

Waiting Time

The total sum of time a process spends waiting in the ready queue

Waiting time = Turn Around Time - Burst Time

Response time

The time it takes from submission until the first response is produced

FCFS (First Come First Serve)
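FCFS runs processes strictly in the order they arrive. A small sketch that applies the turnaround and waiting formulas above to made-up arrival and burst times:

#include <iostream>
using namespace std;

int main() {
    // Hypothetical processes, already ordered by arrival time
    int arrival[] = {0, 1, 2};
    int burst[]   = {24, 3, 3};
    int n = 3;

    int time = 0;
    double totalTAT = 0, totalWT = 0;
    for (int i = 0; i < n; ++i) {
        if (time < arrival[i]) time = arrival[i];   // CPU idles until the process arrives
        time += burst[i];                           // FCFS: run the entire burst
        int turnaround = time - arrival[i];         // completion time - arrival time
        int waiting = turnaround - burst[i];        // turnaround time - burst time
        totalTAT += turnaround;
        totalWT  += waiting;
    }
    cout << "avg turnaround = " << totalTAT / n
         << ", avg waiting = " << totalWT / n << endl;   // 26 and 16 for these values
}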

SJF (Shortest Job First)

Waiting time = Total Waiting Time - No. of milliseconds Process Executed previously - arrival time

Preemptive SJF
Pasted image 20230204165320.png
Pasted image 20230204165612.png
Pasted image 20230204165951.png

The difficulty with SJF is knowing the length of the next CPU request
We try to approximate the next CPU burst.

Priority Scheduling

Starvation is a problem with priority scheduling: a low-priority process may be blocked indefinitely. Some systems gradually increase the priority of processes that have been waiting for a long time (aging)

Round Robin Scheduling

Pasted image 20230204174623.png

Turnaround time and waiting time for Round Robin

Pasted image 20230204180619.png

Pasted image 20230204180753.png

One method for calculating
Turn around time = completion time - arrival time
waiting time = Turnaround time - burst time

Or we can use
Waiting time = final start time - arrival time - (preemption x time quantum) where preemption = num times the process was preempted
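As a made-up illustration of the second formula: a process that arrives at time 0, is preempted twice with a time quantum of 4, and finally gets the CPU for the last time at time 20 has a waiting time of 20 − 0 − (2 × 4) = 12.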

P1 was preempted
Pasted image 20230204181957.png

Multilevel Queue

A class of scheduling algorithms has been created for situations in which processes are easily classified into different groups

Example: Foreground processes and Background processes.

The Multilevel queue partitions the ready queue into several separate queues

Each queue may have a different scheduling algorithm such as Round Robin for foreground and First come First serve for background

Multilevel Feedback queue

In multilevel feedback queue, processes may move from one queue to another. You separate them according to the characteristics of their CPU bursts. If a process uses too much CPU time, it will be moved to a lower priority queue

This leaves the I/O bound and interactive processes in the higher-priority queues

In general, an MFQ is defined by:

Process Synchronization

A cooperating process is one that can affect or be affected by other processes executing in the system

Cooperating processes can either

This leads to a problem where concurrent access to shared data may result in data inconsistency

Shared memory systems are a good example of the shared data problem

To understand this, we need to have an understanding of how this happens at the machine level (assembly)

When several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the access takes place, it is called a race condition

Since we don't want race conditions, we need process synchronization.

To deal with this we can use a few methods

Critical Section Problem

Consider a system with n processes (P1, P2, ..., Pn)

Each process has a segment of code called a critical section in which the process may be changing common variables, updating a table, writing a file, and so on

When one process is executing in its critical section, no other process is allowed to execute in its critical section.

That is, no two processes are executing in their critical sections at the same time

The critical-section problem is to design a protocol that the processes can use to cooperate

Pasted image 20230205145343.png

A solution to the critical-section problem must satisfy the following three requirements

Requirements

  1. Mutual Exclusion: If a process P is executing in its critical section, then no other processes can be executing in their critical sections.
  2. Progress: If no process is executing in its critical section and some processes wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in the decision on which will enter its critical section next, and this selection cannot be postponed indefinitely.
  3. Bounded Waiting: There exists a bound, or limit on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is accepted (otherwise it may lead to starvation)

Peterson's Solution

Peterson's solution is a 'humble' algorithm

The solution is restricted to two processes, which we will call Pi and Pj, that alternate execution between their critical sections and remainder sections
Peterson's solution also requires two data items to be shared between the two processes:

When a process is ready to enter its critical section, it will set its flag to true.
When process Pi wants to enter its critical section, it will set its turn variable to j

When one process wants to enter its critical section, it offers its turn to the other process

Structure of Pi in Peterson's solution

do {
    flag[i] = true;       // Pi announces it wants to enter its critical section
    turn = j;             // and offers the turn to Pj
    while (flag[j] && turn == j)
        ;                 // busy-wait: only when this condition is false does Pi proceed
    // critical section
    flag[i] = false;      // Pi is done with its critical section
    // remainder section
} while (TRUE);

Structure of Pj in Peterson's solution

do {
    flag[j] = true;       // Pj announces it wants to enter its critical section
    turn = i;             // and offers the turn to Pi
    while (flag[i] && turn == i)
        ;                 // busy-wait: only when this condition is false does Pj proceed
    // critical section
    flag[j] = false;      // Pj is done with its critical section
    // remainder section
} while (TRUE);

Test and Set Lock

Test and Set definition

boolean TestAndSet(boolean *target) {
    boolean rv = *target;   // remember the old value of the lock
    *target = TRUE;         // and set the lock in the same atomic step
    return rv;
}

This is an atomic operation

Example

lock is always 0 initially, and we are passing lock by reference

process p1

do {
    while (TestAndSet(&lock))
        ;   // spin: if lock was FALSE we fall through, and we have set it TRUE so other processes must wait
    // critical section
    lock = FALSE;   // release the lock
    // remainder section
} while (TRUE);

Advantages

Disadvantages

Semaphores

Wait()

P(Semaphore S) {
    while (S <= 0)
        ;   // no operation... forced to wait until S > 0
    S--;
}

It will only decrement S if it is greater than 0

Signal

V(Semaphore S){
S++;
}

increments the value of S

Binary Semaphore

The value of a binary semaphore can only range between 0 and 1. On some systems, binary semaphores are known as mutex locks, as they are locks that provide mutual exclusion

Counting Semaphores

The value of a counting semaphore can range over an unrestricted domain. It is used to control access to a resource that has multiple instances.

Disadvantages of Semaphores

The main disadvantage of a semaphore is that it requires "busy waiting".

To overcome the need for busy waiting, we can modify the definitions of wait() and signal().

When a process executes the wait() operation and finds that the semaphore value is not positive, it must wait
However, rather than engaging in busy waiting, the process can block itself
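A sketch of the modified, blocking definitions in the textbook style (block() and wakeup() stand for the kernel operations that suspend a process and move it back to the ready queue):

typedef struct {
    int value;
    struct process *list;   // queue of processes blocked on this semaphore
} semaphore;

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        // add this process to S->list
        block();            // suspend the calling process instead of busy waiting
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        // remove a process P from S->list
        wakeup(P);          // move P from the waiting queue to the ready queue
    }
}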

This still may lead to deadlocks and starvation.

Bounded Buffer Problem

There is a buffer of n slots and each slot is capable of storing one unit of data.

There are two processes running, namely a producer and consumer which are operating on the buffer

Pasted image 20230205171847.png

We can make use of three semaphores to solve this problem.

  1. m(mutex): a binary semaphore which is used to acquire and release the lock
  2. empty: a counting semaphore whose initial value is the number of slots in the buffer, since, initially all slots are empty
  3. full: a counting semaphore whose initial value is 0 (how many slots are occupied)

Producer

do {
    wait(empty);    // wait until empty > 0, then decrement empty
    wait(mutex);    // acquire the lock
    // add data to buffer
    signal(mutex);  // release the lock
    signal(full);   // increment full
} while (TRUE);

Consumer

do {
    wait(full);     // wait until full > 0, then decrement full
    wait(mutex);    // acquire the lock
    // remove data from buffer
    signal(mutex);  // release the lock
    signal(empty);  // increment empty
} while (TRUE);

Monitors

Monitors are a high-level abstraction that provides an effective mechanism for process synchronization

Monitor Syntax

monitor monitor_name {
    // shared variable declarations, shared between processes

    procedure P1(...) {
        ...
    }
    procedure P2(...) {
        ...
    }
    procedure PN(...) {
        ...
    }

    initialization code(...) {
        ...
    }
}

Monitors also need a condition construct which are declared in the style condition x,y;

Schematic View of a Monitor

Pasted image 20230315135836.png

Monitor with condition variables

Pasted image 20230315140200.png

Dining-Philosophers Solution Using Monitors

The monitor solution imposes the restriction that a philosopher may pick up their chopsticks only if both are available

monitor dp {
    enum {THINKING, HUNGRY, EATING} state[5];
    condition self[5];

    void pickup(int i) {
        state[i] = HUNGRY;
        test(i);                    // try to acquire both chopsticks
        if (state[i] != EATING)
            self[i].wait();         // block until a neighbor signals us
    }

    void putdown(int i) {
        state[i] = THINKING;
        test((i + 4) % 5);          // let the left neighbor eat if it now can
        test((i + 1) % 5);          // let the right neighbor eat if it now can
    }

    void test(int i) {
        if ((state[(i + 4) % 5] != EATING) &&
            (state[(i + 1) % 5] != EATING) &&
            (state[i] == HUNGRY)) {
            state[i] = EATING;
            self[i].signal();
        }
    }

    initialization() {
        for (int i = 0; i < 5; i++)
            state[i] = THINKING;
    }
}
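Each philosopher i then brackets eating with the monitor operations, roughly like this:

while (TRUE) {
    // think
    dp.pickup(i);    // blocks on self[i].wait() until both chopsticks are free
    // eat
    dp.putdown(i);   // may signal a hungry neighbor via test()
}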

Write processes.cpp where:

A parent process writes a message to:
cout << "okay?" << endl;
But stdout is redirected to pipe 1(0). dup2(fd[0][1],1)

A child process reads the above message from
cin >> message;
But stdin is redirected to pipe 1(0). //std in = dup2(x,0)
dup2(fd[0][0],0)

A child process then writes a response to:
cout << "yes!" << endl;
But stdout is redirected to pipe 2(1). //std out = dup2(x,1)
dup2(fd[1][1],1)

A parent process reads the above response from
cin >> response;
But stdin is redirected to pipe 2.
dup2(fd[1][0],0)

Then, the parent writes this response to stderr like:
cerr << response;

#include <sys/types.h>   // for fork, wait                                                                                       
#include <sys/wait.h>    // for wait                                                                                             
#include <unistd.h>      // for fork, pipe, dup, close                                                                           
#include <stdio.h>       // for NULL, perror                                                                                     
#include <stdlib.h>      // for exit                                                                                             

#include <iostream>      // for cout                                                                                             

using namespace std;

int main( int argc, char** argv ) {
  int fd[2][2];   // fd[0][] parent-->child; fd[1][] child-->parent                                                              
  int pid;        // process ID                                                   
  char response[10], message[10];

  if ( pipe( fd[0] ) < 0 ) { // parent wants to create pipe fd[0]                                                                
    perror( "pipe fd[0] error" );
    exit( -1 );
  }
  if ( pipe( fd[1] ) < 0 ) { // parent also wants to create pipe fd[1]                                                           
    perror( "pipe fd[1] error" );
    exit( -1 );
  }
  if ( ( pid = fork( ) ) < 0 ) {
    perror( "fork a child: error" );
    exit( -1 );
  }
  if ( pid > 0 ) { // parent                                                                                                     
    close( fd[0][0] ); //close parent-->child read pipe as it's not used
    dup2( fd[0][1], 1 ); //dup2 redirect pipe write 0 to stdout

    close( fd[1][1] ); //close child--> parent write pipe as it's not used
    dup2( fd[1][0], 0 );//redirect read of pipe 1 to std in

    cout << "okay?" << endl;
    cin >> response;
    cerr << response;
  }
  else { // child                                                                                                                
    close( fd[0][1] ); //close parent-->child write pipe
    dup2( fd[0][0], 0 ); //duplicate parent-->child read pipe to stdin

    close( fd[1][0] ); //close child--> parent read pipe
    dup2( fd[1][1], 1 );//redirect write pipe of child-->parent to stdout

    cin >> message;
    cout << "yes!" << endl;
  }
}
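To try it, something like g++ processes.cpp -o processes && ./processes should print yes!, the child's reply, read by the parent from pipe 2 and echoed to stderr.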

Deadlocks

In a multiprogramming environment, several processes may compete for a finite number of resources.

A process requests resources, and if the resources are not available at that time, the process enters a waiting state.
Sometimes, a waiting process is never again able to change state because the resources it has requested are held by other waiting processes. This situation is known as a Deadlock

System model

A system consists of a finite number of resources to be distributed among a number of competing processes.
The resources are partitioned into several types, each consisting of some number of identical instances.

Under a normal mode of operation, a process may utilize a resource only in the following sequence.

  1. Request: If the request cannot be granted immediately(for example, if the resource is being used by another process) then the requesting process must wait until it can acquire the resource.
  2. use: The process can operate on the resource(for example, if the resource is a printer, the process can print on the printer)
  3. Release: the process releases the resource.

Deadlock Characterization

In a deadlock, processes never finish executing and system resources are tied up preventing other jobs from starting

Necessary conditions for deadlocks:

A deadlock situation can arise if the following four conditions hold simultaneously in a system

  1. Mutual Exclusion: At least one resource must be held in a non-sharable mode; that is, only one process at a time can use the resource. Any process that requests that resource must be delayed until the resource has been released.
  2. Hold and wait: A process must be holding at least one resource and waiting to acquire additional resources that are currently being held by other processes.
  3. No preemption: Resources cannot be preempted; a resource can be released only voluntarily by the process holding it, after that process has completed its task.
  4. Circular Wait: A set {P0, P1, ..., Pn} of waiting processes must exist such that P0 is waiting for a resource held by P1, P1 is waiting for a resource held by P2, ..., Pn−1 is waiting for a resource held by Pn, and Pn is waiting for a resource held by P0.

Resource Allocation Graph

Deadlocks can be described more precisely in terms of a directed graph called a system resource-allocation graph.

This graph consists of a set of vertices V and a set of edges E. The set of vertices V is partitioned into two different types of nodes: P = {P1, P2, ..., Pn}, the set consisting of all active threads in the system, and R = {R1, R2, ..., Rm}, the set of all resource types in the system.

A directed edge from thread Pi to resource type Rj is denoted by Pi → Rj; it signifies that thread Pi has requested an instance of resource type Rj and is currently waiting for that resource. A directed edge from resource type Rj to thread Pi is denoted by Rj → Pi; it signifies that an instance of resource type Rj has been allocated to thread Pi. A directed edge Pi → Rj is called a request edge, and a directed edge Rj → Pi is called an assignment edge.

Generally we represent each thread Ti as a circle and each resource type Rj as a rectangle

Since resource Rj may have more than one instance, we represent each instance as a dot within the rectangle. A request edge only points to the rectangle, however, an assignment edge points from one of the dots(resources) in the rectangle.
Pasted image 20230213192632.png

If the graph contains no cycles, then no process in the system is deadlocked. HOWEVER, if a graph contains a cycle, a deadlock may exist.

Methods for handling Deadlocks

In general, there are three ways we can handle deadlocks:

  1. Prevention: We can use a protocol to prevent or avoid deadlocks, ensuring that the system will never enter a deadlock state
  2. Recovery: We can allow the system to enter a deadlock state, detect it, and recover.
  3. Ignore it: We can ignore the problem altogether, and pretend deadlocks never occur in the system. (This is used in most operating systems)

To ensure deadlocks never occur, the system can either use a deadlock prevention, or a deadlock avoidance scheme

Deadlock Prevention

Provides a set of methods to ensure at least one of the criteria for deadlock cannot hold

Deadlock Avoidance

Requires the operating system be given in advance, additional information concerning which resources a process will request during its lifetime.
With this knowledge, it can decide for each request whether or not the process should wait.

Recovery

The system can provide an algorithm that examines the state of the system to determine whether or not a deadlock has occurred and an algorithm to recover from the deadlock.

Ignorance

If a system neither ensures that a deadlock will never occur nor provides a mechanism for deadlock detection and recovery then we may arrive at a situation where the system is in a deadlocked state yet has no way of recognizing what has happened.

In this case, the undetected deadlock will result in the deterioration of the system's performance, because resources are being held by processes that cannot run and because more and more processes, as they make requests for resources, will enter a deadlocked state.

Eventually, the system will stop functioning and will need to be restarted manually.

Deadlock Prevention

For a deadlock to occur, all four of the following must be present:

Mutual Exclusion

The mutual-exclusion condition must hold for non-sharable resources.
e.g., a printer cannot be simultaneously shared by several processes.

Sharable resources, in contrast, do not require mutually exclusive access and thus cannot be involved in a deadlock.

For example, Read-only files. If several processes attempt to open a read-only file, there will be no problem.

In general we cannot prevent deadlocks by denying the mutual exclusion condition because some resources are inherently non-sharable

Hold and Wait

To ensure the hold-and-wait condition never occurs, we must guarantee that whenever a process requests a resource, it does not hold any other resources.

One protocol that can be used requires each process to request and be allocated all its resources before it begins execution.

An alternative protocol allows a process to request resources only when it has none. A process may request some resources and use them. Before it can request any additional resources, however, it must release all the resources that it's currently allocated.

Both of these protocols have two main disadvantages

  1. Resource Utilization may be low
  2. Starvation is possible

No Preemption

To ensure this condition does not hold, we can use the following protocol:

If a process is holding some resources and requests another resource that cannot be immediately allocated to it, then all resources currently being held are preempted

Alternatively, if a process requests some resources, we first check whether they are available. If they are, we allocate them. If they are not, we check whether they are allocated to some other process that is waiting for additional resources. If so, we preempt the desired resources from the waiting process and allocate them to the requesting process.
This protocol is often applied to resources whose state can be easily saved and restored later such as registers and memory space.

Circular Wait

A way to ensure that this condition never holds is to impose a total ordering of all resource types and to require that each process requests resources in an increasing order of enumeration.

Example: Suppose some process, P1, is allocated resource R5. Now, if P1 requests for resources R3 and R4 (which are less than R5), the requests will NOT be granted. Only requests for resources that are greater than R5 will be granted.

Developing a resource hierarchy does not itself prevent deadlocks. It is up to developers to write programs that follow that ordering.
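A rough sketch of resource ordering in code (the two std::mutex objects stand in for resources; because both threads always lock m1 before m2, a circular wait cannot form):

#include <mutex>

std::mutex m1, m2;   // global ordering: m1 must always be acquired before m2

void threadA() {
    std::lock_guard<std::mutex> first(m1);
    std::lock_guard<std::mutex> second(m2);
    // ... use both resources ...
}

void threadB() {
    // also m1 first, then m2, never the reverse order
    std::lock_guard<std::mutex> first(m1);
    std::lock_guard<std::mutex> second(m2);
    // ... use both resources ...
}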

Deadlock Avoidance

An alternative method for avoiding deadlocks is to require additional information about how resources are to be requested.

For example: In a system with one tape drive and one printer, the system might need to know that process P will request the tape drive and then the printer before releasing both resources, whereas process Q will request first the printer and then the tape drive.
With this knowledge of the complete sequence of requests and releases for each process, the system can decide for each request whether or not the process should wait in order to avoid a possible future deadlock

Each request requires that in making this decision the system consider the resources currently available, the resources currently allocated to each process, and the future requests and releases of each process.

Safe State: A state is safe if the system can allocate resources to each process(up to its maximum) in some order and still avoid a deadlock.

A system is in a safe state only if there exists a safe sequence. A sequence of processes <P1, P2, ..., Pn> is a safe sequence for the current allocation state if, for each Pi, the resource requests that Pi can still make can be satisfied by the currently available resources plus the resources held by all Pj with j < i.

A safe state is not a deadlocked state; however, not every unsafe state is deadlocked. An unsafe state may merely lead to a deadlock.

For example, let's say you have 12 resources and three threads: T0, T1, and T2. Thread T0 requires a maximum of 10 resources, T1 a maximum of 4, and T2 a maximum of 9. Suppose that at time t0, thread T0 is holding 5 resources, T1 is holding 2, and T2 is holding 2, leaving us with 3 free resources (5 + 2 + 2 = 9 allocated out of 12).

| Thread | Maximum Needs | Current Needs |
| ------ | ------------- | ------------- |
| T0     | 10            | 5             |
| T1     | 4             | 2             |
| T2     | 9             | 2             |

At time t0 the system is in a safe state. The sequence <T1, T0, T2> satisfies the safety condition. Thread T1 can be allocated its 2 remaining resources and complete its task, at which point there will be 5 available resources. Thread T0 can then get its necessary 5 resources, complete its task, and return them, which gives us 10 available resources. Finally, T2 can acquire its needed resources, finish, and return them, giving the system all 12 resources back.

A system can go from a safe state to an unsafe state quite easily. For instance, if at time t1 thread T2 requests and is allocated one more resource, the state is no longer safe: only T1 can then be allocated all of its resources, and once it finishes there will not be enough free resources for T0 or T2 to complete, which would result in a deadlock. The mistake was in granting the additional resource to T2; instead, we should have made T2 wait until enough resources were free.
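A tiny sketch of the safety check for this single-resource example (the arrays mirror the table above; this is only an illustration, not the full banker's algorithm):

#include <iostream>
#include <vector>
using namespace std;

int main() {
    // One resource type with 12 total instances
    vector<int> maxNeed = {10, 4, 9};   // T0, T1, T2
    vector<int> alloc   = {5, 2, 2};
    int available = 12 - (alloc[0] + alloc[1] + alloc[2]);   // 3 free

    vector<bool> finished(3, false);
    vector<int> safeSeq;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < 3; ++i) {
            // A thread can finish if its remaining need fits in what is available
            if (!finished[i] && maxNeed[i] - alloc[i] <= available) {
                available += alloc[i];   // it completes and releases everything it held
                finished[i] = true;
                safeSeq.push_back(i);
                progress = true;
            }
        }
    }
    if (safeSeq.size() == 3) {
        cout << "safe sequence:";
        for (int i : safeSeq) cout << " T" << i;
        cout << endl;                    // prints: safe sequence: T1 T0 T2
    } else {
        cout << "unsafe state" << endl;
    }
}

Changing T2's allocation from 2 to 3 (the extra grant described above) makes the check print "unsafe state" instead.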

Resource Allocation Graph Algorithm

If we have a resource-allocation system with only one instance of each resource type, we can use a variant of the resource-allocation graph for deadlock avoidance. In addition to the request and assignment edges already described, we can introduce a new type of edge, called a claim edge. A claim edge Ti → Rj indicates that thread Ti may request resource Rj at some time in the future. This edge resembles a request edge in direction but is represented in the graph by a dashed line. When thread Ti requests resource Rj, the claim edge Ti → Rj is converted to a request edge. Similarly, when resource Rj is released by Ti, the assignment edge Rj → Ti is reconverted to a claim edge Ti → Rj.

The resources must be claimed a priori in the system; that is, before thread Ti starts executing, all its claim edges must already appear in the resource-allocation graph. We can relax this condition by allowing a claim edge Ti → Rj to be added to the graph only if all the edges associated with thread Ti are claim edges.

Suppose that thread Ti requests Rj. The request can be granted only if converting the request edge Ti → Rj to an assignment edge Rj → Ti does not result in the formation of a cycle in the resource-allocation graph. We check for safety by using a cycle-detection algorithm. An algorithm for detecting a cycle requires on the order of n² operations, where n is the number of threads in the system.

If no cycle exists, the allocation of resources will leave the system in a safe state. If a cycle is found, the allocation will put the system in an unsafe state. in that case, thread Ti will have to wait for its request to be satisfied

Example:

Pasted image 20230214152919.png
Suppose that T2 requests R2. Although R2 is currently free, we cannot allocate it to T2 since this action will create a cycle in the graph. A cycle indicates that the system is in an unsafe state. If T1 requests R2 and T2 requests R1 then a deadlock will occur

An unsafe state in the resource-allocation graph

Pasted image 20230214153408.png

This only happens when there is one instance of a resource

Memory Management

The main purpose of a computer system is to execute programs

During executions, programs, together with the data they access, must be in main memory (at least partially).

Main Memory

In CPU scheduling, we see how the CPU can be shared by a set of processes. As a result of this, we see that CPU utilization, as well as the speed of the computer's response to its users, can usually be improved

To realize this improvement, however, we must keep several processes in memory; memory must be shared.

Basic Hardware

Memory consists of a large array of words or bytes, each with its own address
A CPU fetches instructions from memory according to the value of the program counter
These instructions may cause additional loading from and storing to specific memory addresses

A typical instruction-execution cycle first fetches an instruction from memory, the instruction is then decoded and may cause operands to be fetched from memory. After the instruction has been executed on the operands, results may be stored back in memory

A CPU can only access from:

Cycle time for accessing memories

Protection of OS from unauthorized access

the operating system must be protected from access by user processes. Additionally, user processes must be protected from one another.
This protection must be provided by the hardware.

To do this, we ensure that each process has a separate memory space. We need the ability to determine the range of legal addresses that the process may access and to ensure that the process can access only these legal addresses

This is done with two registers. The base register and the limit register.

The base and limit registers can only be loaded by the operating system which uses a special privileged instruction

Since privileged instructions can only be executed in kernel mode, and since only the operating system operates in kernel mode, only the operating system can load the base and limit registers.

This scheme allows the operating system to change the value of the registers but prevents user programs from changing the registers contents.
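Conceptually, the check the hardware performs on every memory access made in user mode looks like this (pseudocode):

// done in hardware for every user-mode memory reference
if (address < base || address >= base + limit) {
    // trap to the operating system: addressing error
} else {
    // the access is legal; proceed to memory
}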

Address binding

Usually a program resides on a disk as a binary executable file

To be executed, the program must be brought into memory and placed within a process

Depending on the memory management in use, the process may be moved between the disk and memory during its execution

The processes on the disk that are waiting to be brought into memory for execution form the input queue

Flow goes like this:
input queue -> select a process -> load it into memory -> execute, accessing instructions and data from memory -> process terminates -> memory space is declared available

Most systems allow a user process to reside in any part of the physical memory

Though the address space of a computer starts at 00000, the first address of the user space does not need to start there

In most cases, a user program will go through several steps during compile time, load time, execution time before being executed.

Logical Vs Physical Address Space

We often refer to logical addresses as Virtual Addresses

How do we map these?
The runtime mapping from virtual to physical addresses is done by a hardware device called the Memory-Management Unit(MMU)

Dynamic Relocation using a Relocation Register

Pasted image 20230215214211.png

Dynamic Loading

The entire program and all data must be in physical memory for a process to execute. This means that the size of a process is limited by the size of physical memory
To obtain better memory-space utilization we can use dynamic loading

With dynamic loading a routine is not loaded until it's called.

All routines are kept on disk in a relocatable load format

The main program is loaded into main memory and executed

When a routine needs to call another routine, the calling routine first checks to see whether the other routine has been loaded

If it has not, the relocatable linking loader is called to load the desired routine into memory and update the program's address tables to reflect this change.

Then control is passed to the newly loaded routine.

Dynamic loading does not require special support from the operating system. It is the responsibility of the users to design their programs to take advantage of such a method

Dynamic Linking and Shared Libraries

When a program runs, it needs to use certain system libraries

Windows uses DLL files, or dynamic linked libraries.

With dynamic linking, a stub is included in the image for each library routine reference.

When a stub is executed, it checks to see if the needed routine is already in memory; if not, the program loads the routine into memory.

Either way, the stub replaces itself with the address of the routine and executes the routine

This means, the next time that particular code segment is reached, the library routine is executed directly, incurring no cost for dynamic linking.

Swapping

A process must be in memory to be executed

However, a process can be swapped temporarily out of memory to a backing store and then brought back into memory for continued execution.

Note

A backing store is a computer storage device, usually a disk, that provides additional storage space for information so that it can be accessed and referred to when required and may be copied into the processor if needed.

Example:

In a multiprogramming environment with a round-robin CPU scheduling algorithm:
When a quantum expires, the memory manager will start to swap out the process that just finished and to swap another process into the memory space that has been freed.
The processes removed from scheduling may be placed in the backing store
Pasted image 20230220182511.png

Memory Space when Swapping Happens

Normally, a process that is swapped out will be swapped back into the same memory space it previously occupied
However, this heavily relies on the address binding method

If the binding is done at assembly or load time, the process can not be easily moved to a different location
If binding is done at execution time, the process can be swapped into a different memory space because physical addresses are computed during execution time.

Backing Store

The backing store is commonly a fast disk drive

Memory Allocation

We divide memory into several fixed sized partitions, each partition contains one process.
When a partition is free, a process is selected from the input queue and is loaded into the free partition.

When the process terminates, the partition becomes available for another process

Dynamic Storage Allocation Problem

The Dynamic storage allocation problem is as follows:
How do you satisfy a request of size N from a list of free partitions/holes in main memory?

Solutions

First Fit

Allocate the first hole that is big enough.

Best Fit

Allocate the smallest hole that is big enough; this produces the smallest leftover hole.

Worst Fit

This is the opposite of best fit: allocate the largest hole.

Fragmentation

As processes are loaded and removed from memory, the free memory space is broken into little pieces, which results in fragmentation.

These are the free memory partitions that are left in between spaces

There are two types of fragmentations:

  1. External Fragmentation
  2. Internal Fragmentation

External Fragmentation

It exists when there is enough total memory space to satisfy a request, but the available spaces are not contiguous.

Storage is fragmented into a large number of small holes
This fragmentation problem can be severe. In the worst case, we could have a block of free(or wasted) memory between every two processes
If all these small pieces of memory were in one big free block instead, we might be able to run several more processes.

We cannot combine these free segments because they are not contiguous

Internal Fragmentation

It occurs when memory blocks assigned to processes are larger than the process actually needs.
Some portion of memory is left unused as it cannot be used by a process

Internal fragmentation can be reduced by using the "best fit" approach to memory allocation.

Solution to external fragmentation

Compaction is the solution to external memory fragmentation

Compaction

Shuffle the memory contents so as to place all free memory together in one large block.

Paging

Paging is a memory management scheme that permits the physical address space of a process to be non-contiguous

Paging avoids the considerable problem of fitting memory chunks of various sizes into the backing store.

Most memory-management schemes suffer from this problem.

Basic Method of Paging

Paging allows users to store processes in non-contiguous segments

Page Tables

Every address generated by the CPU is divided into two parts:

The page table contains the base address of each page in physical memory
This base address is combined with the page offset to define the physical memory address that is sent to the memory unit

Pasted image 20230227110147.png

Translation of Logical Address To Physical Address using a Page Table

Every address generated by the CPU is divided into two parts:

Pasted image 20230227111742.png

The translation from a logical address to a physical address is: physical address = (frame number × page size) + offset
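A hypothetical worked example: with a page size of 1 KB (1024 bytes), logical address 2500 has page number 2500 / 1024 = 2 and offset 2500 mod 1024 = 452; if the page table maps page 2 to frame 7, the physical address is (7 × 1024) + 452 = 7620.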

Hardware Implementation of a Page Table

TLB

The TLB is an associative, high-speed memory

It consists of two parts:

When the associative memory is presented with an item, the item is compared with all keys simultaneously
If the item is found, the corresponding value field is returned
This is a very fast search

TLB with Page Table

If an entry in the TLB is not found, once it goes out to main memory, it may replace a prior entry in the TLB (This is sometimes done with LRU or Least Recently Used)
Pasted image 20230227114052.png

Page Table Entries

- The valid/invalid (present/absent) bit indicates whether the page is currently in main memory. If the page is not present, the bit is set to 0, and an access to it causes a **page fault**
- The protection bit is also known as the Read/Write bit and sets the permission for reading or writing: the bit is 0 if only reads are allowed, and 1 if both reads and writes are allowed
- The referenced bit records whether the page has been referenced within the last clock cycle (set to 1 if it has)
- The caching bit enables or disables caching of the page (set to 1 when caching is disabled)
- The dirty bit, also known as the modified bit, records whether the page has been modified; if the page has been modified it is set to 1. This helps avoid unnecessary writes to secondary memory when a page is being replaced by another page

Shared Pages

An advantage of paging is the possibility of sharing common code.

Consider a system that supports 40 users, each of whom executes a text editor.
If the text editor consists of 150kb of code and 50kb of data space,
With non-shared memory, we would need 200*40 or 8000kb of space.

However, if the text editor was in shared memory, we would only need 150kb + 50*40 or...2150kb

With shared pages, each process has its own copy of registers and data storage to hold the data for the process's execution
The data for two different processes will obviously be different.

Pasted image 20230227121617.png

Hierarchical Paging

Most modern computer systems support a very large logical address space (2^32 to 2^64)

In such an environment, the page table itself becomes excessively large

For example, imagine a system with:
A 32-bit logical address space
If the page size is 4 KB (2^12),
then the page table may consist of up to 2^32 / 2^12 = 2^20 (about 1 million) entries

If each entry consists of 4 bytes, each process may need up to 4 MB (4 bytes × 1 million entries) of physical address space for its page table

If we try to allocate a page table of this size contiguously in main memory, we will run into problems

The solution

Divide the page table into smaller sizes.
This is known as multi-level paging
Use a two-level paging algorithm, in which the page table itself is also paged

A two level page table scheme

Pasted image 20230301150420.png
Pasted image 20230301150430.png

Hashed Page Tables

Hashed page tables are used for handling address spaces larger than 32 bits (or large address spaces)

The Algorithm

Clustered Page Tables

This is a variation of the hashed page table that is most applicable to 64-bit address spaces

Inverted Page Tables

In most operating systems, a separate page table is maintained for each process. For 'n' number of processes, there will be 'n' number of page tables.
For large processes, there would be many pages and for maintaining information about these pages, there would be too many entries in their page tables which itself would occupy a lot of memory.
This means memory utilization is not efficient, as a lot of memory is wasted in maintaining the page tables themselves

Inverted page tables solve this problem

Implementation

Pasted image 20230301152513.png
An inverted page table has one entry for each real page(or frame) of memory
Each entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page.

This means, only one page table exists in the system, and it only has one entry for each page of physical memory.

With this, each virtual address in the system consists of a triple: <process-id, page-number, offset>

Each inverted page table entry is a pair: <process-id, page-number>

When a memory reference occurs, part of the virtual address, consisting of (process-id, page number) is presented to the memory subsystem

The inverted page table is then searched for a match
If a match is found, say at entry 'i', then the physical address (i, offset) is generated

Advantages and Disadvantage

Advantage

Disadvantage

Segmentation

Segmentation is another non-contiguous memory allocation technique similar to paging
Unlike paging however, in segmentation, the processes are not divided into fixed pages
Instead, the processes are divided into several modules called segments which improves the visualization for the users.

Here, both secondary memory and main memory are divided into partitions of unequal sizes.

The user specifies the segment name and an offset to access that item in memory

In paging, the user only needs to specify a single address, which is partitioned by the hardware into a page number and an offset, all invisible to the programmer

For simplicity, segments are numbered and referred to by a segment number rather than a segment name.
Therefore, a logical address consists of a two-tuple: <segment-number, offset>

Each entry in a segment table has a segment base and a segment limit
The segment base contains the starting physical address where the segment resides in memory and the limit specifies the size of the segment

Virtual Memory

The purpose of memory management is to keep many processes in memory simultaneously to allow multiprogramming. This can be difficult though as that tends to require that an entire process be in memory before it can begin executing. Therefore, the processes that can be loaded into main memory are often limited by the size of the main memory.

For example, if the size of main memory is 2 GB and a process requires 3 GB, it becomes difficult to accommodate such a process in main memory. We can use methods such as dynamic loading, but that generally requires extra precautions by the programmer.

What is Virtual Memory?

Virtual memory is a storage scheme where secondary memory can be treated as though it is a part of main memory and large processes can be stored in it. Only the part of the process that is actually needed for execution will be loaded to the actual main memory.

In most real programs, there is often code to handle unusual error conditions. Since these errors seldom, if ever, occur in practice, this code is almost never executed

Another example is arrays, lists, and tables. These are often allocated more memory than they actually need; an array may be declared 100 by 100 elements even though it is seldom larger than 10 by 10 elements

Finally, an assembler symbol table may have room for 3000 symbols, although the average program has fewer than 200 symbols.

Pasted image 20230301172419.png

Benefits of being able to execute a program that is only partially in memory

Virtual memory involves the separation of logical memory as perceived by users from physical memory, this separation allows an extremely large virtual memory to be provided for programmers when only a smaller physical memory is available.

Demand Paging

How can an executable be loaded from disk to memory?

Possible Approaches

  1. Load the entire program into physical memory at program execution time.
  2. Load pages only as they are needed: pages are loaded when they are demanded during program execution, and pages that are never accessed are never loaded into memory. This is known as demand paging.

A demand paging system is similar to a paging system with swapping where processes reside in secondary memory(usually a disk)

When we want to execute a process, we swap it into memory.
Rather than swapping the entire process into memory, however, we use a lazy swapper

A lazy swapper never swaps a page into memory unless that page will be needed.

Since we are now viewing a process as a sequence of pages, rather than as one large contiguous space, the use of the term swapper is technically incorrect.

In connection with demand paging, the more appropriate term is pager
A swapper manipulates entire processes, whereas a pager is concerned with the individual pages of a process.

Hardware implementation of Demand Paging

In order to implement demand paging with hardware, we need to be able to distinguish between pages that are in memory and the pages on the disk

To do this, we can make use of the Present/Absent (or Valid/Invalid Bit).

Page Table Entries

Page table entries for pages that are brought into main memory will be set as usual
Page table entries for pages that are not currently in memory will either be marked as invalid or contains the address of the page on disk in secondary memory

Pasted image 20230302120012.png

Page Faults

Page Fault: when a process tries to access a page that is not in main memory(or marked as invalid)

Access to a page marked as invalid causes a page-fault trap. The paging hardware will notice that the invalid bit is set and trap to the operating system.

A page fault may occur at any memory reference.

Procedure for handling page faults

  1. We check an internal table, (usually kept within the Process Control Block) for this process to determine whether the reference was a valid or invalid memory access
  2. If the reference was invalid, we terminate the process
  3. If it was a valid memory access(legal, not in valid/invalid bit sense), but we have not yet brought in that page, we now page it in.
  4. We find a free frame(by taking one from the free-frame list)
  5. We schedule a disk operation to read the desired page into the newly allocated frame
  6. When the disk read is complete, we modify the internal table kept within the process and the page table to indicate that the page is now in memory(set valid/invalid bit to 1)
  7. We restart the instruction that was interrupted by the trap
    The process can now access the page as though it had always been in memory
    Pasted image 20230302130131.png
    Pasted image 20230302130155.png
    Pasted image 20230302130208.png
    Pasted image 20230302130219.png
    Pasted image 20230302130227.png

Demand Paging Performance

A computer systems performance can be significantly affected by Demand Paging

Memory Access time (ma) for most systems usually ranges from 10 to 200 nanoseconds

If there are no page faults, the Effective Access Time will be equal to the Memory Access Time

However, if a page fault occurs, we need to calculate the Effective Access Time
Let p be the probability of a page fault (0 ≤ p ≤ 1)

Then, effective access time =

(1 − p) × ma + p × page fault time

(1 − p) × ma is the time spent on normal memory accesses, where (1 − p) is the probability that a page fault won't occur
p × page fault time is the time spent handling page faults

There are three major components of the page-fault service time:

  1. Service the page-fault interrupt
  2. Read in the page
  3. restart the process

For example: if the memory access time is 200 nanoseconds, the page fault time is 8 milliseconds (8,000,000 nanoseconds), and the probability of a page fault is 25%, then the effective access time is 0.75 × 200 + 0.25 × 8,000,000 = 2,000,150 nanoseconds.

You can also rearrange the formula as: effective access time = ma − (ma × p) + (page fault time × p)

Copy On Write

Copy on write (CoW) is a technique that is used for sharing virtual memory or pages. It's mostly used in conjunction with the fork() system call that is used for creating child processes

fork() will create a copy of the parent's address space for the child, duplicating the pages belonging to the parent.
Instead of duplicating the pages, we can make the parent and child processes share the common pages, and only create a copy when one of the processes (either child or parent) wants to write(modify) a page.

Before Process 1 modifies page C

Pasted image 20230302133714.png

After Process 1 modifies Page C

Pasted image 20230302133734.png

Problems with Demand Paging

With the help of demand paging we are increasing our degree of multi-programming by loading only the required pages into memory, hence facilitating the possibility of having more processes loaded into memory.

However, this could lead to an issue

Overallocation of memory

So what can the OS do?

  1. Terminate the process: Not ideal as it destroys the purpose of Demand Paging
  2. The operating system can instead swap out an entire process, freeing all its frames and reducing the level of multiprogramming (good in certain circumstances)
  3. We can make use of a page replacement technique(most common solution)

Page Replacement

Page Replacement is the technique of swapping out pages from physical memory when there are no more free frames available, in order to make room for other pages which have to be loaded to the main memory.

If a process wants to use/access a page which is not present in physical memory, it will cause a page fault. Now that page has to be loaded into memory

However, if there are no free frames available, how do we handle it?

Steps of page replacement

  1. Find the location of the page to be loaded on the disk (in the backing store)
  2. If there is a free frame, use that frame by loading the page into the frame
  3. If there are no free frames, use a page-replacement algorithm to select a victim frame.
  4. Write the victim frame to the disk, and change the page and frame tables accordingly
  5. Read the desired page into the newly freed frame and change the page and frame tables.
  6. Restart the process
Note

Victim Frame: The frame we are choosing to remove from physical memory so that we have a space for the page we are trying to swap in
Pasted image 20230303144025.png

  1. Page n is not currently in memory and has been accessed by the current instruction. A page fault occurs, and there are no free frames. Frame f is selected as the victim.
  2. Frame f is currently holding page m. The OS writes the page to swap space on secondary storage, and frees the frame by marking m as invalid.
  3. The OS locates page n on the backing store and schedules an I/O transfer to DMA the contents into frame f.
  4. The transfer finishes. The OS changes the page table for n to point to frame f and that the frame is valid. The instruction is restarted and can now access the data it needs.

Page Fault Service Time When No Frames Are Free

  1. if no frames are free, two page transfers (one out and one in) are required
  2. This doubles the page-fault service time and increases the effective access time accordingly

We can reduce this time by making use of the modify(dirty) bit
The modify bit is set when a page has been modified

When page replacement has to be done on a page which is present in memory:

FIFO Page Replacement

FIFO is the simplest page replacement algorithm

When page replacement has to be done, the oldest page in memory is chosen to be swapped out.

How do we maintain a record of the time that pages were brought into memory in order to figure out which page is the oldest?

FIFO Page-Replacement Algorithm

Pasted image 20230303145544.png
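A small sketch that counts FIFO page faults for a reference string (the reference string and the three frames are made-up values):

#include <iostream>
#include <queue>
#include <unordered_set>
#include <vector>
using namespace std;

int main() {
    vector<int> refs = {7, 0, 1, 2, 0, 3, 0, 4};   // hypothetical reference string
    int numFrames = 3;

    queue<int> fifo;                 // order in which pages were brought into memory
    unordered_set<int> inMemory;     // pages currently resident
    int faults = 0;

    for (int page : refs) {
        if (inMemory.count(page)) continue;   // hit: nothing to do
        ++faults;                             // miss: page fault
        if ((int)inMemory.size() == numFrames) {
            int victim = fifo.front();        // the oldest page is the victim
            fifo.pop();
            inMemory.erase(victim);
        }
        fifo.push(page);
        inMemory.insert(page);
    }
    cout << "page faults = " << faults << endl;   // 7 for this string
}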

Belady's Anomaly

In general, we would think that if there are more frames, there will be fewer page faults

Pasted image 20230303152823.png

However, we see that as the number of frames increases, the number of page faults may also increase. This is known as Belady's anomaly

Optimal Page Replacement

Pasted image 20230303155251.png

An optimal page-replacement algorithm has the lowest page fault rate of all the algorithms

To select a victim, we choose the page that will not be used for the longest period of time.

The optimal page-replacement algorithm is difficult to implement because it requires future knowledge of the reference string.
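Because OPT needs the full future reference string, it is only usable offline, typically as a benchmark for other algorithms. A minimal sketch, assuming the reference string is a plain Python list:

```python
def opt_faults(reference_string, num_frames):
    """Optimal (OPT) replacement: evict the page whose next use is farthest in the future."""
    frames = []
    faults = 0
    for i, page in enumerate(reference_string):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)
        else:
            # For each resident page, find how far away its next use is (infinite if never used again).
            def next_use(p):
                future = reference_string[i + 1:]
                return future.index(p) if p in future else float("inf")
            victim = max(frames, key=next_use)
            frames[frames.index(victim)] = page
    return faults
```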

Least Recently Used Page Replacement

It associates with each page the time of that page's last use
When a page must be replaced, the LRU chooses the page that has not been used for the longest period of time.

It can be thought of as an optimal-replacement algorithm but looking backwards in time
Pasted image 20230303161528.png

It causes more page faults than OPT, but fewer than FIFO

It's generally considered a good algorithm and is used frequently

Implementation of Least Recently Used Page Replacement

An LRU page-replacement algorithm may require substantial hardware assistance.

How do we keep track of the pages that were least recently used?

  1. By using Counters
  2. By using a Stack

Using Counters

With Each Page Table Entry, we associate a time-of-use field and add a logical clock or counter to the CPU.

- The clock is incremented on every memory reference
- Whenever a page is referenced, the contents of the clock register are copied into the time-of-use field of that page's page-table entry
- This gives us the "time" of the last reference to each page
- We then replace the page with the smallest time value

Using a stack

Keep a stack of page numbers: whenever a page is referenced, it is removed from the stack and placed on top. The most recently used page is always at the top and the least recently used page is at the bottom, so the victim can be taken from the bottom without any search.

Pasted image 20230303164332.png
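A minimal sketch of the stack approach just described (using a Python list as the "stack", with the top at the end of the list):

```python
def lru_faults(reference_string, num_frames):
    """LRU via the stack approach: the referenced page is moved to the top."""
    stack = []        # stack[-1] = most recently used, stack[0] = least recently used
    faults = 0
    for page in reference_string:
        if page in stack:
            stack.remove(page)          # move the referenced page to the top
        else:
            faults += 1
            if len(stack) == num_frames:
                stack.pop(0)            # the bottom of the stack is the LRU victim
        stack.append(page)
    return faults
```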

Additional-Reference Bits

LRU-Approximation Page Replacement

Reference Bit Steps:

  1. Initially the operating system sets all reference bits for all pages to '0'.
  2. Whenever a page is referenced, the hardware sets the reference bit to '1'
  3. By checking these reference bits, we can tell which pages have been used and which haven't been used
    We cannot, however, determine the order of use.

Therefore we need to use the "Additional Reference Bits Algorithm"
The additional reference bits algorithm is used to get an idea of the ordering information by recording the reference bits at regular intervals

How do we use it?

Example
| Page No. | Shift Register Content | What It Means |
| --- | --- | --- |
| P1 | 00000000 | The page has not been used for eight time periods |
| P2 | 11111111 | The page has been used at least once in each period |
| P3 | 11000100 | The page has been used more recently than P4 |
| P4 | 01110111 | The page has been used less recently than P3 |

At regular timer intervals, the operating system shifts each page's register right by one bit and copies that page's reference bit into the most significant (leftmost) bit, then clears the reference bit. The page with the lowest register value (read as an unsigned number) is the approximate LRU page.
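A minimal sketch of this scheme, assuming each page is represented by a small dict with an 8-bit history register and a hardware-maintained reference bit (both names are illustrative):

```python
def tick(pages):
    """Run at each timer interval: shift the history register right and record
    the reference bit in the most significant bit, then clear the reference bit."""
    for p in pages:
        p["history"] = (p["history"] >> 1) | (p["ref_bit"] << 7)   # 8-bit register
        p["ref_bit"] = 0

def approx_lru_victim(pages):
    """The page with the lowest history value has been used least recently."""
    return min(pages, key=lambda p: p["history"])

pages = [{"id": i, "history": 0, "ref_bit": 0} for i in range(4)]
pages[2]["ref_bit"] = 1    # page 2 was referenced during this interval
tick(pages)                # its register becomes 10000000 (decimal 128)
```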

Second Chance Algorithm

Another LRU-Approximation Page Replacement

The Second Chance Algorithm (also known as the clock algorithm) is usually implemented with a circular queue

Pasted image 20230303171419.png

If all reference bits are set to 1, this algorithm will work like a FIFO algorithm
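A minimal sketch of the clock algorithm, assuming parallel lists of frames and reference bits and a `hand` index for the clock pointer (all illustrative names):

```python
def clock_replace(frames, ref_bits, hand, new_page):
    """Second-chance (clock) replacement over a circular queue of frames.
    Returns the new position of the clock hand."""
    while True:
        if ref_bits[hand] == 0:                   # no second chance: this frame is the victim
            frames[hand] = new_page
            ref_bits[hand] = 1                    # the faulting reference counts as a use
            return (hand + 1) % len(frames)       # advance the hand past the new page
        ref_bits[hand] = 0                        # clear the bit: give the page a second chance
        hand = (hand + 1) % len(frames)
```

If every reference bit is 1, the loop clears them all and comes back to the starting position, replacing that frame, which is exactly the FIFO behavior noted above.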

Enhanced Second Chance Algorithm

Another LRU-Approximation Page Replacement

In the enhanced second chance algorithm, we consider the reference bit and the modify bit as an ordered pair

There are four possible cases for the Enhanced Second Chance Algorithm:

| Reference Bit | Modify Bit | What It Implies | Remarks |
| --- | --- | --- | --- |
| 0 | 0 | The page was neither used recently nor modified | Best page to replace |
| 0 | 1 | The page was not used recently, but has been modified | Not quite as good: the page must be written out before it can be replaced |
| 1 | 0 | The page was used recently but not modified | There is a good chance this page will be used again soon |
| 1 | 1 | The page was used recently and modified | It will probably be used again soon, and it must be written to disk before replacement (worst case) |
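A simplified sketch of victim selection by (reference, modify) class; each frame here is a dict with illustrative `ref` and `mod` fields. The full algorithm may circle the queue several times and clears reference bits as it scans, which this sketch omits:

```python
def enhanced_second_chance_victim(frames):
    """Pick a victim by (reference bit, modify bit) class, best class first."""
    for target in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        for frame in frames:                       # scan in clock order
            if (frame["ref"], frame["mod"]) == target:
                return frame                       # (0, 0) = clean and unreferenced = cheapest
```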

Counting-Based Page Replacement

Maintain a counter that keeps a count of the number of references made to each page.

Using the counter, we formulate the following schemes:

Least Frequently Used

A counter keeps a count of the number of references made to each page; the page with the smallest count is replaced, on the assumption that an actively used page will have accumulated a large count.

What is the problem with this?

Pages that were heavily used during the initial phase of the process would have higher counts and would remain in memory even though they may not be needed further

Solution

Shift the counts right by 1 bit at regular intervals, forming an exponentially decaying average usage count
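A minimal sketch of LFU with this aging trick, assuming a simple dict from page number to counter (illustrative structure):

```python
counts = {}   # page -> usage counter

def reference(page):
    """Increment a page's counter on every reference."""
    counts[page] = counts.get(page, 0) + 1

def age_counters():
    """Run at regular intervals: shift every counter right by one bit so that
    old references decay exponentially."""
    for page in counts:
        counts[page] >>= 1

def lfu_victim():
    """LFU victim: the resident page with the smallest (decayed) count."""
    return min(counts, key=counts.get)
```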

Most Frequently Used

A counter keeps a count of the number of references made to each page; here the page with the largest count is replaced, based on the argument that the page with the smallest count was probably just brought in and has yet to be used.

Both these algorithms are expensive to implement and do not approximate the optimum page replacement well. Therefore, they are not often used.

Page Buffering Algorithms

In this algorithm, we maintain a pool of free frames

When A Page Fault Occurs:
- A victim frame is chosen as before, but the desired page is read into a free frame from the pool before the victim is written out
- This lets the faulting process restart as soon as possible
- When the victim is eventually written out, its frame is added back to the free-frame pool

This idea can be modified by maintaining a pool of modified pages.
Whenever the paging device is idle:
- A modified page is selected, written to disk, and its modify bit is reset
- This increases the chance that a page will be clean when it is later selected for replacement, avoiding the extra write

Allocation of Frames

How do we divide the fixed amount of free memory (a fixed number of frames) among the various processes?

Example

In a single user system:

  • Memory Size = 128kb
  • Page Size = 1kb
    So number of frames will be 128
    Suppose the operating system uses 30kb
    Remaining frames for user processes = 98 (128-30)
    These 98 frames would be available in the free frames list
  • If we're using demand paging, the first 98 page faults would get all free frames from the list
  • When the free frames are exhausted, a page replacement algorithm would be used to replace a page from the occupied list of 98 with the 99th page.

Minimum number of frames

Constraints for allocation of frames:
- We cannot allocate more than the total number of available frames (unless pages are shared)
- We must allocate at least a minimum number of frames, defined by the computer architecture: enough frames to hold every page a single instruction can reference, otherwise that instruction could never complete

Allocation Algorithms

Frame allocation algorithms help us in determining how many frames should be allocated to different processes in a multiprocessing environment

The 3 Most common allocation algorithms:

  1. Equal Allocation
  2. Proportional Allocation
  3. Priority Allocation

Equal Allocation

The available frames are divided equally among the processes (free frames ÷ number of processes)

Example

Say we have 98 frames and 5 processes
each process gets 98/5 = 19 frames (using integer division)
The remaining 3 frames are kept as a free frame buffer pool

Problems

The memory requirements of the processes are not the same; some may need more frames than others, so allocating equally does not use resources efficiently.

Example

In a system with 64 free frames and a frame size of 1 KB, say we have two processes, P1 (10 KB) and P2 (127 KB).
With equal allocation we give 64/2 = 32 frames to each process.
P1 only needs 10 frames, however, so 22 frames (32 - 10) are wasted.

Proportional Allocation

The frames are distributed among the processes according to their sizes

Example

In a system with 64 free frames and a frame size of 1 KB, say we have two processes, P1 (10 KB) and P2 (127 KB).
$m = 64,\; S_1 = 10,\; S_2 = 127,\; S = S_1 + S_2 = 137$
The number of frames allocated to P1 = $\frac{S_1}{S} \times m = \frac{10}{137} \times 64 \approx 4.67$
The number of frames allocated to P2 = $\frac{S_2}{S} \times m = \frac{127}{137} \times 64 \approx 59.33$
Rounding to whole frames, P1 gets about 5 frames and P2 about 59.
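A minimal sketch of this calculation, assuming process sizes are given in pages:

```python
def proportional_allocation(process_sizes, total_frames):
    """Allocate frames in proportion to each process's size."""
    S = sum(process_sizes)
    # Rounding may need a small adjustment so the totals add up exactly to total_frames.
    return [round(size / S * total_frames) for size in process_sizes]

print(proportional_allocation([10, 127], 64))   # [5, 59]
```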

Priority Allocation

The priority allocation algorithm is done using priorities or a combination of size and priorities rather than just the size of the processes

So we allocate more frames to processes with higher priorities so as to speed up their execution

Global vs Local Allocation

Page replacement algorithms can be classified into two broad categories:

This is important for deciding the way frames are allocated to different processes

Global Replacement

When a page fault occurs:
- Processes are allowed to select replacement frames from the **set of ALL frames**, even if those frames are currently allocated to other processes

Example

If a process P1 is of higher priority than processes P2 and P3, and all of P1's frames are full and it now needs one more frame due to a page fault, it can take a frame allocated to either P2 or P3

Problem with global allocation

Processes cannot control their own page fault rate

Local Replacement

When a page fault occurs:
- Processes are allowed to select replacement frames **only** from the set of frames that is allocated to them

Problems with local allocation

The memory utilization may not be optimal (eg an idle process that is not using all its frames)

Thrashing

Consider a process that doesn't have enough frames for its execution

Thrashing: when a process is spending more time in paging than actually executing the instructions.

Severe Performance Problems Due To Thrashing

The operating system monitors CPU utilization. If utilization is too low, it increases the degree of multiprogramming by introducing a new process. But if the existing processes already do not have enough frames, the new process takes frames away from them, so they fault more often, spend their time waiting on the paging device, and CPU utilization drops even further; the OS responds by adding yet more processes, and thrashing sets in.

Thrashing causes system throughput to decrease massively
Pasted image 20230303195621.png

Thrashing can be reduced slightly by using local allocation, but this is not a great solution as it will still have many page faults. To reduce thrashing we need to use something like the Working-Set Model

Working-Set Model

Question

How do we prevent thrashing?
We must provide a process with as many frames as it needs.
But how do we know how many frames it needs?
We make use of the working-set strategy, which looks at how many frames each process is actually using

The working-set strategy is based on the locality model of process execution

Locality model of process execution: As a process executes, it moves from locality to locality

Locality: A set of pages that are actively used together. A program is generally composed of several different localities which may overlap

Implementation of the working set model

To implement the working set model, we use a parameter Δ(Delta) which defines the working set window.

Pasted image 20230303202317.png
In the above image Δ=10

Determining Δ

The accuracy of our working set depends on the selection of Δ: if Δ is too small, it will not encompass an entire locality; if Δ is too large, it may overlap several localities.

The next important property of a working set is its size
The working set size of a process is denoted by WSSi where process 'i' needs WSSi number of frames
WSSi is simply the number of unique pages in a process's working-set window. From the image above, the WSS for WS(t1) is 5 and the WSS for WS(t2) is 2.

So, the total demand for frames in the system is:

$D = \sum_i WSS_i$

If the total demand for frames ($D$) is greater than the total number of available frames ($m$), i.e. $D > m$, then thrashing will occur.

So what do we do if the demand is greater than the number of frames?

The operating system will select a process to suspend. The process's pages are swapped out and its frames are reallocated to other processes. The suspended process can be restarted later.

The working-set strategy prevents thrashing while keeping the degree of multiprogramming as high as possible, which optimizes CPU utilization.
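A minimal sketch of the working-set bookkeeping, assuming each process's recent page references are available as a list (the reference strings, Δ, and m below are made-up values for illustration):

```python
def working_set_sizes(reference_strings, delta):
    """Compute WSS_i for each process from the last `delta` references
    (the working-set window) and the total demand D = sum of all WSS_i."""
    wss = [len(set(refs[-delta:])) for refs in reference_strings]
    return wss, sum(wss)

wss, D = working_set_sizes([[1, 2, 1, 3, 4, 1, 2, 5, 1, 2],
                            [7, 7, 7, 7, 8, 7, 7, 8, 7, 7]], delta=10)
m = 6                            # frames available in the system (illustrative)
print(wss, D)                    # [5, 2] 7
if D > m:
    print("Demand exceeds available frames: thrashing; suspend a process")
```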

File System Interface

Storage Management

Programs have to be loaded into main memory for execution; however, main memory is limited and usually too small to accommodate all data and programs permanently. Because of this, the computer system must provide secondary storage to back up main memory.

There are two main types of secondary storage:
- Hard disk drives (HDDs)
- Non-volatile memory (NVM) devices such as solid-state drives (SSDs)

Different storage devices vary in many aspects
For example:

Concept of Files

The operating system abstracts from the physical properties of its storage devices to define a logical storage unit - the file

A file is a logical storage unit
Files are mapped by the operating system onto physical devices
These physical devices are usually non-volatile(they will keep data after power is switched off) unlike main memory which is usually volatile

File definition:

A file is a named collection of related information that is recorded on secondary storage
From a User's perspective:

Many different types of information may be stored in a file

File Attributes

Naming a file

A file is named for the convenience of its human users, and is referred to by its name.
A name is usually a string of characters. Eg myfile.c

When a file is named, it becomes independent of the process, the user, and even the system that created it

The attributes of files can vary from one OS to another but typically consist of the following:

  1. Name: the symbolic file name is the only information kept in human readable form.
  2. Identifier: A unique tag, usually a number, that identifies the file within the file system. It's the non-human-readable name for the file.
  3. Type: this information is needed for systems that support different types of files.
  4. Location: this information is a pointer to a device and to the location of the file on that device.
  5. Size: the current size of the file (in bytes, words, or blocks) and possibly the maximum allowed size are included in this attribute.
  6. Protection/Permission: access-control information determines who can do reading, writing, executing and so on.
  7. Creation Date: the date the file was created
  8. Last Modified Date: the date on which the file was last modified
  9. Owner: the information about who the owner of the file is.

The attributes of a file are stored in the File Control Block, also known as the FCB.
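A minimal sketch of what a per-file FCB might hold, using the attributes listed above; the field names and defaults are illustrative, not any particular operating system's layout:

```python
from dataclasses import dataclass, field
import time

@dataclass
class FileControlBlock:
    """Illustrative per-file FCB holding the common file attributes."""
    name: str                       # human-readable symbolic name
    identifier: int                 # unique tag within the file system
    file_type: str = "regular"
    location: int = 0               # e.g. starting block / pointer to the data on the device
    size: int = 0                   # current size in bytes
    permissions: str = "rw-r--r--"  # access-control information
    created: float = field(default_factory=time.time)
    modified: float = field(default_factory=time.time)
    owner: str = "user"
```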

File Operations

The operating system has to provide system calls to perform various file operations.

The common file operations are:

Creating A File

Steps for file creation:

  1. Check if space is available in the file system to create the new file, and find space on the disk for it.
  2. Make an entry for the new file in the directory.

Writing a File

Steps for writing a file:

  1. make a system call specifying both the name of the file and the information to be written to the file.
  2. With the given name of the file, the system searches for the file in the directory and locates it.
  3. A write pointer is maintained which points to the location in the file where the next write must take place.
  4. After the writing is done, the write pointer is updated

Reading a file

Steps for reading a file:

  1. Make a system call specifying both the name of the file and where(in memory) the next block of the file should be read.
  2. With the given name of the file, the system searches for the file in the directory and locates it
  3. A read pointer is maintained which points to the location in the file from where the next read must take place.
  4. After the reading is done, the read pointer is updated
  5. In general, most read and write operations use the same pointer.

Repositioning within a file(File seek)

Steps for repositioning within a file:

  1. The directory is searched for the appropriate file.
  2. The current-file-position pointer is repositioned to a given value.

Repositioning can be done without any I/O

Deleting a file

Steps for deleting a file:

  1. The directory is searched for the named file
  2. Once the entry is found, all the space occupied by the file is released so that it can be reused by other files, and the directory entry is erased.

Truncating a file

A user may want to erase the contents of a file but keep its attributes. Rather than forcing the user to delete the file and recreate it, this function allows all attributes to remain unchanged
Steps to truncate a file:

  1. Keep all the attributes of the file unchanged.
  2. The file length is reset to zero
  3. Release the file space.

Other, less common file operations include appending new information to the end of an existing file, renaming an existing file, and copying a file.

The open() System Call

Most of the common file operations involve searching the directory for the entry associated with the file name.

To avoid this constant searching, many systems require that an open() system call be made before a file is first used actively. The operating system keeps an open-file table containing information about all open files; open() searches the directory once, copies the file's entry into this table, and returns an index into it.

When a file operation is requested, the file is specified via this index, so no searching is required.
When the file is no longer being actively used, it is closed by the process and the operating system removes its entry from the open-file table.
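A toy sketch of this idea; the `directory`, `open_file_table`, and function names are illustrative, not a real OS API:

```python
directory = {"notes.txt": {"identifier": 7, "size": 120, "location": 42}}  # hypothetical FCBs
open_file_table = []                     # indexed by the "file descriptor" returned by open_file()

def open_file(name):
    fcb = directory[name]                # the directory is searched once, at open() time
    open_file_table.append({"fcb": fcb, "offset": 0})
    return len(open_file_table) - 1      # later operations pass this index, not the name

def read(fd, nbytes):
    entry = open_file_table[fd]          # no directory search needed here
    entry["offset"] += nbytes            # advance the current-file-position pointer

fd = open_file("notes.txt")
read(fd, 50)
```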

File Types

When a file system or an operating system is designed, the file types that the operating system should recognize and support must be considered
If an operating system recognizes a type of file, it can then operate on that file in standard ways

The most common technique for implementing file types is to include the type as part of the file name

This is specified by the file extension that appears after a period in the file name

Common file types

| File Type | Usual Extension | Function |
| --- | --- | --- |
| Executable | exe, com, bin, or none | Ready-to-run machine-language program |
| Object | obj, o | Compiled machine language that has not been linked |
| Source Code | c, cpp, pas, java, asm, py | Source code in various languages |
| Batch | bat, sh | Commands for the command interpreter |
| Text | txt, doc | Textual data, documents |
| Word Processor | wp, doc, rtf | Various word-processor formats |
| Library | lib, a | Libraries of routines for programmers |
| Print or View | ps, pdf, jpg, png | ASCII or binary files in a format for printing or viewing |
| Archive | arc, zip, tar | Related files grouped into one file, sometimes compressed, for archiving or storage |
| Multimedia | mp3, mp4, avi | Binary files containing audio or A/V information |

Other ways of identifying file types:
- In Mac OS X, each file stores its type as an attribute kept with the file
- UNIX systems use a crude magic number stored at the beginning of some files to roughly indicate the file type

Access Methods

The information in files can be accessed in several ways. Some systems provide just one access method for files while other systems may support many access methods.

The two most common access methods are:

Sequential Access

Pasted image 20230304155840.png

Read Operation

read next: reads the next portion of the file and automatically advances a file pointer, which tracks the I/O location

Write Operation

write next: appends to the end of the file and advances to the end of the newly written material (the new end of the file)

Direct Access

Also known as relative access

A file is made up of fixed-length logical records that allow programs to read and write records rapidly in no particular order.
In direct access, the file is viewed as a numbered sequence of blocks or records.
With direct access, there are no restrictions on the order of reading or writing.
Direct-access files are great for immediate access to large amounts of information, such as databases.

Simulation of sequential access on a direct-access file

Pasted image 20230304160354.png

Read Operation

read n: reads block number n of the file

Write Operation

write n: writes to block number n of the file

The block number provided by the user to the operating system is normally a relative block number which is an index to the relative beginning of the file.
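A small sketch contrasting the two access methods using Python file I/O; it assumes a file named "data.bin" exists and that records are a fixed 1 KB (both assumptions for illustration):

```python
BLOCK = 1024   # fixed-length logical block size in bytes (assumed)

with open("data.bin", "rb") as f:
    # Sequential access: each read starts where the previous one left off,
    # and the file pointer advances automatically ("read next").
    first_block = f.read(BLOCK)
    second_block = f.read(BLOCK)

    # Direct (relative) access: "read n" jumps straight to relative block n,
    # in no particular order.
    n = 7
    f.seek(n * BLOCK)
    block_n = f.read(BLOCK)
```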

Directory Structure

The file system of a computer can be extensive, as systems need to store huge numbers of files on disk. To manage this, the files must be organized properly.

Directories are used for organizing this data

Storage Structure

A disk or any large storage device can be used entirely for a single file system; however, it is often better to have multiple file systems on a disk, or to use parts of a disk for a file system and other parts for things like swap space or raw disk space.

These parts are called partitions, slices, or minidisks; "partition" is the most common term. A file system can be created on each of these parts of the disk, and the parts can also be combined to form a structure called a volume, on which a file system can likewise be created.

Volumes

Operations That Can Be performed on a directory

Notice, some of these are the same operations that can be performed on a file.

Single-Level Directory

A single-level directory is the simplest directory structure
all the files are contained in the same directory

Pasted image 20230304162443.png

Advantages
- Simple to implement and easy to understand and support

Disadvantages
- All files must have unique names, which becomes difficult as the number of files or users grows (the naming problem)
- There is no way to group related files together (the grouping problem)

Two-Level Directory

Each user has his or her own directory called the User File Directory(UFD)

Information about each UFD is stored in a system directory called the Master File Directory (MFD)
The MFD is indexed by user name or account number, and each entry points to the UFD for that user
Pasted image 20230304163433.png

When a user refers to a particular file, only their own UFD is searched. Thus, different users may have files with the same file name as long as all the file names within each UFD are unique

Problems with Two-Level Directories
- It isolates users from one another, which is a disadvantage when users want to cooperate on a task and share files
- It still gives a user no way of grouping their own files

Tree Structured Directories

The natural generalization of a two-level directory structure is to extend the directory structure to a tree of arbitrary height

This generalization allows users to create their own subdirectories and organize their files accordingly
A tree is the most common directory structure
The tree has a root directory and every file in the system has a unique path name

Pasted image 20230304170028.png

Current Directory

Each process will have a current directory which should contain most of the files that are of current interest to the process.
When reference is made to a file, the current directory is searched
If a file is needed that is not in the current directory, the user must either specify a path name or change the current directory to the directory holding the file

Path Names

There are two types of path names:

  1. Absolute: Begins at the root and follows a path down to the specified file, giving the directory names on the path.
  2. Relative: defines a path from the current directory.

Deleting a Directory

If the directory is empty, its entry in the directory that contains it can simply be deleted.
If the directory is not empty:
- Some systems refuse to delete it until it is empty, forcing the user to delete all the files and subdirectories inside it first
- Others (such as the UNIX rm command with the -r option) delete the directory along with everything it contains

Acyclic-Graph Directories

Acyclic Graph Directories are good for sharing

An acyclic graph (a graph with no cycles) allows directories to share subdirectories and files.
Pasted image 20230304171931.png

When people are working as a team, all the files they want to share can be put into one directory
The UFD of each team member will contain this directory of shared files as a subdirectory

Implementation of shared files and subdirectories

  1. Links: a link is effectively a pointer to another file or subdirectory.
  2. Duplicating: duplicate all information about the shared files in both sharing directories. The two entries are identical. (A major concern here is maintaining consistency when a file is modified.)

Problems with Acyclic Graph Structures

  1. Same file with many paths and names: a file may now have multiple absolute path names, and distinct file names may refer to the same file. This can give the system a wrong count when trying to total the number of files.
  2. Deletion of shared files: if the file is deleted whenever anyone deletes it, dangling pointers to the now non-existent file are left behind. If sharing is implemented with symbolic links, deleting a link removes only the link and doesn't affect the original file; however, if the original file is deleted, the links are left dangling.

General Graph Directory

Pasted image 20230304180053.png

If we start with a two-level directory and allow users to create subdirectories, a tree-structured directory is usually the result
If we continue adding files and subdirectories in this way, the tree structure is preserved and continues
However, when we add links to an existing tree-structured directory, the tree structure is destroyed, resulting in a simple graph structure

Problems
- Cycles may now exist, so a naive traversal of the directory structure (for example, during a search) could loop forever
- With cycles, a file's reference count may never reach zero even when the file can no longer be reached, so simple reference counting cannot tell us when its space can be freed

Garbage Collection

Garbage Collection: A scheme to determine when the last reference has been deleted and the disk space can be reallocated.

  1. Traverse the entire file system and mark everything that can be accessed
  2. Traverse it a second time and free all the space that was not marked in the first pass
    Garbage collection is extremely time consuming and is thus rarely done for file systems.

Protection (Types of Access)

The information stored in a computer system must be kept safe from:

Physical Damage

File systems can be damaged by:
- Hardware problems (such as disk head crashes), power failures, dirt, temperature extremes, and vandalism
- Accidental deletion, and bugs in the file-system software itself

Improper Access

Types of Access

File Operations that may be controlled:
- Read
- Write
- Execute
- Append
- Delete
- List

Protection (Access Control)

The most common approach to the protection problem is to make access dependent on the identity of the user

We can implement this using an Access Control List (ACL)
With each file and directory, an access-control list is associated specifying user names and the types of access allowed for each user.

Access Control List

When a user requests access to a particular file, the operating system checks the access list associated with that file
If that user is listed for the requested access, the access is allowed.
If not, a protection violation occurs and the user job is denied access to the file.

Problems with this approach:
- The access list can become very long; if we want to allow everyone to read a file, we must list every user with read access
- Constructing such lists is tedious, and the directory entry must now be of variable size, which complicates space management

Solutions to this problem

Use a condensed version of the access list
Systems mostly recognize three classifications of users in connection with each file:
- Owner: the user who created the file
- Group: a set of users who are sharing the file and need similar access
- Universe (other): all other users in the system

Permissions in a Unix System

Pasted image 20230304201958.png

On Linux/Unix, it goes user | group | other. If it's prepended by a d, it means "directory"
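A small sketch of inspecting these permission bits with Python's standard `os` and `stat` modules; "myfile.c" is assumed to be an existing file:

```python
import os
import stat

mode = os.stat("myfile.c").st_mode

print(stat.filemode(mode))          # e.g. '-rw-r--r--' (a leading 'd' would mean a directory)
print(bool(mode & stat.S_IRUSR))    # may the owner (user) read?
print(bool(mode & stat.S_IWGRP))    # may the group write?
print(bool(mode & stat.S_IXOTH))    # may others execute?
```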

Permissions in a Windows System

Pasted image 20230304202441.png

File System Implementation

Overview

The file system resides permanently on secondary storage, which is designed to hold a large amount of data in a permanent manner.

Different operating systems support different file systems.
For example: Linux supports ext3/ext4 (among many others), Windows supports NTFS and FAT32, and macOS uses APFS (and formerly HFS+).

Several on-disk and in-memory structures are used to implement a file system

On-disk structures

On disk, the file system may contain information about:

Boot Control Block

This is an on-disk structure
There is a boot control block for each volume (typically the first block of the volume); it contains the information needed to boot an operating system from that volume, and can be empty if the volume does not contain an OS.

In the Unix file system, it is called a boot block
In NTFS it's called the boot sector

Volume control block

Each volume contains a volume control block with information such as the number of blocks in the volume, the block size, a free-block count and free-block pointers, and a free-FCB count and FCB pointers.

In Unix this is called a superblock
In NTFS this is stored in the master file table

Directory Structure Per File System

This on-disk structure is used to organize the files.
In the UNIX file system it includes file names and their associated inode numbers
In NTFS, it is stored in the master file table

Per-File FCB

This contains details about the file, such as ownership, permissions, size, and the location of the file's data blocks.

In the UNIX file system, it is known as an inode
In NTFS it's stored in the master file table

Note

An inode is simply UNIX's name for the per-file File Control Block.

Typical File Control Block

Pasted image 20230305153334.png

In-Memory Structures

In memory information is used for both file-system management and performance improvement via caching

This is all done in kernel memory

Here, the data is loaded at mount time, and discarded when the drive is dismounted

These structures include:
- An in-memory mount table containing information about each mounted volume
- An in-memory directory-structure cache holding recently accessed directories
- The system-wide open-file table, containing a copy of the FCB of each open file
- The per-process open-file table, with pointers to the appropriate entries in the system-wide table
- Buffers holding file-system blocks while they are being read from or written to disk

Virtual File Systems

Modern operating systems must concurrently support multiple types of file systems

Object Oriented Techniques

Most operating systems, including UNIX, use object-oriented techniques to simplify, organize, and modularize the implementation of multiple file systems
This allows:

The file-system implementation consists of three major layers:
  1. The file-system interface (open(), read(), write(), close(), and file descriptors)
  2. The Virtual File System (VFS) layer, which separates generic file-system operations from their implementation so files can be represented uniformly
  3. The individual file-system implementations (local file-system types, network file systems, and so on)

Directory Implementation

The kind of directory allocation and directory-management algorithms used in a system affects the efficiency, performance, and reliability of the file system

The two main methods for implementing a directory are:

  1. Linear List
  2. Hash Table

Linear List

This is the simplest method of directory implementation.

Here, a linear list of file names with pointers to the data blocks is used
This is simple to implement, but time consuming to execute

For Example
To create a new file:
- Search the directory to make sure that no existing file has the same name
- Add a new entry at the end of the directory

To delete a file:
- Search the directory for the named file
- Release the space allocated to it and mark its directory entry unused (or reuse it)

The biggest disadvantage of a linear list is that a linear search is required every time a file must be found. This slows down the entire process, because directory information is accessed frequently.

To overcome this, we can use a cache to store information about the files that are recently used, but this is still not very efficient

Hash Table

Here, a linear list still stores the directory entries, but a hash data structure is used as well: the hash table takes a value computed from the file name and returns a pointer to the file's entry in the linear list. This greatly decreases the directory search time.
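A toy sketch of this arrangement, using a Python list for the linear directory entries and a dict as the hash table (all names illustrative):

```python
entries = []      # linear list of directory entries
name_index = {}   # hash table: file name -> position of its entry in `entries`

def create(name, fcb):
    if name in name_index:
        raise FileExistsError(name)
    name_index[name] = len(entries)
    entries.append({"name": name, "fcb": fcb})

def lookup(name):
    # O(1) expected time instead of scanning the whole linear list.
    return entries[name_index[name]]
```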

Problems with Hash Table Implementation
- The hash table generally has a fixed size, and the hash function depends on that size; growing the directory may require rehashing
- Collisions (two file names hashing to the same location) must be handled, for example by chaining

Contiguous Disk Space Allocation

Contiguous allocation requires that each file occupy a set of contiguous blocks on the disk
Disk addresses define a linear ordering on the disk
Contiguous allocation of a file is defined by the disk address of its first block and its length (in block units)
Example:
If a file is n blocks long and starts at location b, then it occupies blocks b, b+1, b+2, ..., b+n-1
The directory entry for each file indicates the address of the starting block and the length of the area allocated for this file.
Pasted image 20230305165123.png

Advantages of Contiguous Space Allocation

Accessing a file that has been allocated contiguously is easy

For Sequential Access
- The file system remembers the disk address of the last block referenced and, when needed, simply reads the next block

For Direct Access
- To access block i of a file that starts at block b, we can directly access block b + i (see the sketch below)
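A minimal sketch of that address mapping:

```python
def physical_block(start, length, logical_block):
    """Contiguous allocation: a file starting at block `start` with `length` blocks
    stores logical block i in physical block start + i."""
    if not 0 <= logical_block < length:
        raise IndexError("block outside the file")
    return start + logical_block

print(physical_block(19, 6, 0))   # a file allocated at blocks 19..24: logical block 0 -> 19
print(physical_block(19, 6, 5))   # logical block 5 -> 24
```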

Disadvantages of Contiguous Space Allocation
- Finding space for a new file requires dynamic storage allocation (first fit, best fit), which suffers from external fragmentation
- It is hard to know how much space a file will need when it is created, and files are difficult to grow once the blocks after them are in use

Linked Disk Space Allocation

The linked disk space allocation method solves the problems of contiguous allocation: each file is a linked list of disk blocks, which may be scattered anywhere on the disk. The directory contains a pointer to the first (and optionally last) block of the file, and each block contains a pointer to the next block.

Pasted image 20230305171253.png

Advantages of Linked Disk Space Allocation
- There is no external fragmentation: any free block can be used to satisfy a request
- A file can keep growing as long as free blocks are available, and its final size never needs to be declared in advance

Creating A File
- Simply create a new directory entry with a null pointer (the file has size zero); no disk blocks are allocated yet

Writing to a file
- Find a free block via the free-space management system, write to it, and link it onto the end of the file

Reading a file
- Start at the first block and follow the pointers from block to block

Disadvantages of Linked Disk Space Allocation
- It works well only for sequential-access files: reaching the ith block requires following i pointers, i.e. i disk reads
- Each block spends space on a pointer, and reliability suffers, since a damaged or lost pointer loses the rest of the file

File Allocation Table (FAT)

The file allocation table is a variation of the linked allocation method

Allocating a new block
- Find the first 0-valued (free) table entry, replace the previous end-of-file value with the address of the new block, and store the end-of-file marker in the new entry (as shown in the sketch below)

Advantage of FAT
- Direct (random) access is improved: the location of any block can be found just by reading the FAT, which is small enough to cache in memory, rather than chasing pointers across the disk
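A toy sketch of a FAT, with a made-up 16-block disk and a file whose directory entry says it starts at block 9:

```python
FREE, EOF = 0, -1

fat = [FREE] * 16                # one FAT entry per disk block (toy disk of 16 blocks)
fat[0] = fat[1] = EOF            # pretend blocks 0-1 are reserved (boot area, the FAT itself)
fat[9], fat[12], fat[3] = 12, 3, EOF   # the file's chain: 9 -> 12 -> 3 -> end of file

def file_blocks(start):
    """Follow the FAT chain from the file's first block to end-of-file."""
    block = start
    while block != EOF:
        yield block
        block = fat[block]

def allocate_block(last_block):
    """Grow the file: find the first free FAT entry and link it onto the chain."""
    new = fat.index(FREE)        # first 0-valued (free) table entry
    fat[last_block] = new
    fat[new] = EOF
    return new

print(list(file_blocks(9)))      # [9, 12, 3]
```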

Indexed Disk Space Allocation

The main disadvantage of linked allocation is that it cannot support efficient direct access. Indexed allocation solves this problem by bringing all the pointers together into one location called the index block, which is just a standard disk block containing the pointers to the file's data blocks.

When Creating A file
- All the pointers in its index block are set to null; when block i is first written, a block is obtained from the free-space manager and its address is placed in the ith index-block entry

Advantages of Indexed Allocation
- Supports direct access efficiently, without external fragmentation

Disadvantages of Indexed Allocation
- The index block itself is overhead: even a tiny file needs a whole index block, so pointer space is generally wasted compared with linked allocation

How large should index blocks be?

We can deal with this in a couple of ways:

  1. A linked Scheme: You can have multiple index blocks that are linked to each other
  2. A multilevel index: A variant of linked representation uses a first-level index block to point to a set of second-level index blocks, which in turn point to the file blocks. To access a block, the operating system uses the first-level index to find a second-level index block and then uses that block to find the desired data block. This approach could be continued to a third or fourth level, depending on the desired maximum file size. With 4,096-byte blocks, we could store 1,024 four-byte pointers in an index block. Two levels of indexes allow 1,048,576 data blocks and a file size of up to 4 GB.
  3. A combined scheme: An index block is created and the first few entries point directly to the disk blocks, then the last few entries will point to index blocks
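A minimal sketch of the two-level lookup described in option 2 above, assuming 4,096-byte blocks and 4-byte pointers (the in-memory list-of-lists stands in for index blocks that would really be read from disk):

```python
BLOCK_SIZE = 4096
PTRS_PER_BLOCK = BLOCK_SIZE // 4        # 1,024 four-byte pointers per index block

def locate(first_level_index, logical_block):
    """Two-level index: the first-level block points to second-level index blocks,
    which in turn point to data blocks (up to 1,024 * 1,024 blocks = 4 GB)."""
    outer = logical_block // PTRS_PER_BLOCK      # which second-level index block
    inner = logical_block % PTRS_PER_BLOCK       # which entry within it
    second_level_index = first_level_index[outer]   # a disk read in a real system
    return second_level_index[inner]                # physical block number of the data
```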

The Unix Inode

This is a modified indexed allocation method

An inode is a data structure in UNIX operating systems that contains important information pertaining to a file within the file system

Pasted image 20230305181629.png

This is good for maintaining large files and small files alike.

Performance of Disk Space Allocation

We have seen 3 main types of allocation methods:

  1. Contiguous Allocation
  2. Linked Allocation
  3. Indexed Allocation

These methods vary in storage efficiency and data-block access times, which are important criteria when selecting a method for an operating system to implement

Contiguous Allocation

Requires only one disk access to get a block for any type of access (sequential or direct), since only the initial block address has to be kept in memory.
This is good for:
- Both sequential and direct access

Linked Allocation

The address of the next block is kept in the current block (and can be cached in memory), so sequential access is cheap, but accessing the ith disk block requires i disk reads.
This is good for:
- Sequential-access files

Some systems make use of both contiguous and linked allocation methods. For files that require direct-access they use contiguous allocation, and for files that make use of sequential access they use linked allocation
This must be declared when the file is created.

Indexed Allocation

This is good for:
- Direct access without external fragmentation

Performance in this system depends on:
- The structure of the index (single level, multilevel, or combined)
- The size of the file and the position of the block desired

Some systems combine contiguous allocation with indexed allocation. For small files they use contiguous allocation. If the file grows large, it is switched to an indexed allocation method.

Free-Space Management

We know that disk space is limited, so we need to reuse the space from deleted files for new files if possible.
To keep track of free disk space, the system maintains a free-space list which records all free disk blocks, those not allocated to some file or directory

To Create A File
- We search the free-space list for the required amount of space, allocate it to the new file, and remove that space from the free-space list

When A File Is Deleted
- Its disk space is added back to the free-space list

The Free Space List

There are a few ways we can implement a free-space list:

  1. Bit Vector
  2. Linked List
  3. Grouping
  4. Counting

Bit Vector

The free-space list is implemented as a bit map or bit vector: each block is represented by one bit. If the block is free, the bit is 1; if the block is allocated, the bit is 0.

Example

Consider a disk where blocks 2,3,4,5,8,9, 10, 11, 12, 13, 17, 18, 25, 26, and 27 are free and the rest of the blocks are allocated. The free-space bitmap would be:
001111001111110001100000011100000 ...

Advantage

It is relatively simple in finding the first free block or n consecutive free blocks on the disk

Disadvantage

It's inefficient unless the entire vector is kept in main memory
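A small sketch of scanning the bit vector from the example above for the first free block (a real implementation scans a machine word at a time, skipping all-zero words):

```python
# 1 = free, 0 = allocated; the index is the block number, matching the example above.
bitmap = [int(b) for b in "001111001111110001100000011100000"]

def first_free_block(bitmap):
    for block, bit in enumerate(bitmap):
        if bit == 1:
            return block
    return None            # no free block

def allocate(bitmap, block):
    bitmap[block] = 0      # mark the block as no longer free

print(first_free_block(bitmap))   # 2
```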

Linked List Method

All the free disk blocks are linked together: each free block contains a pointer to the next free block, and a pointer to the first free block is kept in a special location on disk and cached in memory.

Grouping Method

Here we store the addresses of n free blocks in the first free block. The first n - 1 of these blocks are actually free; the last one contains the addresses of another n free blocks, and so on. This makes it possible to find the addresses of a large number of free blocks quickly.

Counting Method

The counting method takes advantage of the fact that, generally, several contiguous blocks are allocated or freed simultaneously, particularly when space is allocated with the contiguous-allocation algorithm or through clustering. Rather than keeping a list of n free block addresses, we keep the address of the first free block and the count of contiguous free blocks that follow it, so the free-space list becomes a list of (address, count) pairs.
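A tiny sketch of the counting representation, using the same free blocks as the bit-vector example above:

```python
# (first_free_block, count_of_contiguous_free_blocks) pairs
free_list = [(2, 4), (8, 6), (17, 2), (25, 3)]

def total_free(free_list):
    return sum(count for _, count in free_list)

print(total_free(free_list))   # 15 free blocks tracked with only 4 entries
```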