Allocating Kernel Memory
* Previous discussions have centered on process memory, which can be conveniently split up into page-sized chunks, where the only fragmentation that occurs is the average half-page lost to internal fragmentation for each process (segment).
* There is also additional memory allocated to the kernel, however, which cannot be so easily paged. Some of it is used for I/O buffering and direct access by devices, for example, and must therefore be contiguous and not affected by paging. Other memory is used for internal kernel data structures of various sizes, and since kernel memory is often locked (restricted from ever being swapped out), management of this resource must be done carefully to avoid internal fragmentation or other waste. (I.e. you would like the kernel to consume as little memory as possible, leaving as much as possible for user processes.) Accordingly there are several classic algorithms in place for allocating kernel memory structures.
Buddy System
* The Buddy System allocates memory using a power-of-two allocator.
* Under this scheme, memory is always allocated in blocks whose size is a power of 2 (4K, 8K, 16K, etc.), rounding requests up to the next power of two if necessary.
* If a block of the correct size is not currently available, then one is formed by splitting the next larger block in two, forming two matching buddies. (And if that larger size is not available, then the next largest available size is split, and so on.)
* One nice feature of the buddy system is that if the address of a block is exclusive-ORed (XORed) with the size of the block, the resulting address is the address of the buddy of the same size, which allows for fast and easy combining of free blocks back into larger blocks (see the sketch below).
* Free lists are maintained for every size of block.
* If the necessary block size is not available upon request, a free block from the next largest size is split into two buddies of the desired size. (Recursively splitting larger blocks if necessary.)
* When a block is freed, its buddy's address is calculated, and the free list for that size of block is checked to see if the buddy is also free. If it is, then the two buddies are combined into one larger free block, and the process is repeated with successively larger free lists.
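As a quick illustration of the XOR property, here is a minimal user-space sketch. The offsets and block sizes are invented for the example; a real allocator would work with offsets relative to the start of its managed region.

```c
#include <stdio.h>
#include <stdint.h>

/* Given a block's offset (from the start of the managed region) and its
   size (a power of two), the buddy's offset is found by XORing the two. */
static uintptr_t buddy_of(uintptr_t offset, size_t size) {
    return offset ^ size;
}

int main(void) {
    /* A 16K block at offset 0 splits into two 8K buddies at offsets
       0 and 8192; each is the other's buddy. */
    printf("buddy of 0     (8K) = %lu\n", (unsigned long)buddy_of(0, 8192));
    printf("buddy of 8192  (8K) = %lu\n", (unsigned long)buddy_of(8192, 8192));
    /* Splitting the 8K block at 8192 gives 4K buddies at 8192 and 12288. */
    printf("buddy of 12288 (4K) = %lu\n", (unsigned long)buddy_of(12288, 4096));
    return 0;
}
```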
Slab Allocation
* Slab Allocation allocates memory to the kernel in chunks called slabs, each consisting of one or more contiguous pages. The kernel then creates a separate cache for each type of data structure it might need, built from one or more slabs. Initially the caches are marked empty, and are marked full as they are used.
* New requests for space in a cache are first granted from empty or partially empty slabs, and if all slabs are full, then additional slabs are allocated.
* This essentially amounts to allocating space for arrays of structures, in large chunks suited to the size of the structure being stored. For example, if a particular structure were 512 bytes long, space for such structures would be allocated in groups of 8 using 4K pages. If the structure were 3K, then space for 4 of them could be allocated at one time, in a slab of 12K using three 4K pages.
* Benefits of slab allocation include a lack of internal fragmentation and fast allocation of space for individual structures.
* Solaris uses slab allocation for the kernel and also for certain user-mode memory allocations. Linux used the buddy system prior to version 2.2 and has used slab allocation since then.
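As a rough sketch of how kernel code might use the Linux slab interface: the structure, cache name, and init/exit functions below are hypothetical, and error handling is abbreviated, but the kmem_cache_* calls are the real Linux API.

```c
#include <linux/slab.h>
#include <linux/errno.h>

/* Hypothetical fixed-size kernel structure, managed by its own cache. */
struct my_struct {
    int  id;
    char name[32];
};

static struct kmem_cache *my_cache;

static int example_init(void)
{
    struct my_struct *obj;

    /* Create a cache whose slabs hold my_struct objects. */
    my_cache = kmem_cache_create("my_struct_cache",
                                 sizeof(struct my_struct), 0, 0, NULL);
    if (!my_cache)
        return -ENOMEM;

    /* Allocate one object from the cache (served from a partial
       or empty slab; a new slab is allocated if all are full)... */
    obj = kmem_cache_alloc(my_cache, GFP_KERNEL);
    if (obj) {
        obj->id = 1;
        /* ...and return it to its slab when done. */
        kmem_cache_free(my_cache, obj);
    }
    return 0;
}

static void example_exit(void)
{
    kmem_cache_destroy(my_cache);
}
```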
Other Considerations
Prepaging
* The basic idea behind prepaging is to predict the pages that will be required in the near future, and page them in before they are actually requested.
* If a process was swapped out and we know what its working set was at the time, then when we swap it back in we can go ahead and page the entire working set back in, before the page faults actually occur.
* With small (data) files we can simply prepage all of their pages at one time.
* Prepaging can be of benefit if the prediction is good and the pages are needed eventually, but it slows the system down if the prediction is wrong. The trade-off can be quantified: if s pages are prepaged and a fraction α of them is actually used, then the savings of s × α avoided page faults must be weighed against the cost of bringing in s × (1 − α) unnecessary pages.
Page Size
* There are quite a few trade-offs between small and large page sizes:
* Small pages waste less memory due to internal fragmentation.
* Large pages require smaller page tables.
* For disk access, the latency and seek times greatly outweigh the actual data transfer times. This makes it much faster to transfer one large page of data than two or more smaller pages containing the same amount of data.
* Smaller pages match locality better, because we are not bringing in data that is not really required.
* Small pages generate more page faults, with attendant overhead.
* The physical hardware may also play a part in deciding page size.
* It is hard to define an "optimal" page size for any given system. Current norms range from 4K to 4M, and tend toward larger page sizes as time passes.
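* As a worked illustration (assuming, hypothetically, a 32-bit, 4 GB virtual address space): with 4K pages a full page table needs 2^20 = 1,048,576 entries, while with 4M pages it needs only 2^10 = 1,024 entries. On the other hand, the average internal fragmentation (half of the last page) grows from about 2K to about 2M per process.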
TLB Reach
* TLB Reach is defined as the amount of memory that can be reached by the pages listed in the TLB.
* Ideally the working set would fit within the reach of the TLB.
* Increasing the size of the TLB is an obvious way of increasing TLB reach, but TLB memory is very expensive and also draws lots of power.
* Increasing page sizes increases TLB reach but also leads to increased fragmentation loss.
* Some systems provide multiple page sizes to increase TLB reach while keeping fragmentation low.
* Supporting multiple page sizes requires that the TLB be managed by software, not hardware.
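* As a worked example: TLB reach = (number of TLB entries) × (page size). A hypothetical 64-entry TLB with 4K pages reaches 64 × 4K = 256K of memory; the same TLB with 4M pages reaches 256M.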
Inverted Page Tables
* Inverted page tables keep one entry for each frame instead of one entry for each virtual page. This reduces the memory requirement for the page table, but loses the information needed to implement virtual memory paging.
* A solution is to keep a separate, traditional page table for each process, for virtual memory management purposes. These are kept on disk, and only paged in when a page fault occurs. (I.e. they are not referenced with every memory access the way a traditional page table would be.)
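A minimal user-space model of the idea is sketched below. The table size, field types, and the linear search are all simplifications invented for illustration; real implementations hash on the (pid, virtual page number) pair to avoid scanning.

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_FRAMES 1024   /* one entry per physical frame */

struct ipt_entry {
    int  pid;     /* owning process */
    long vpn;     /* virtual page number stored in this frame */
    bool valid;
};

static struct ipt_entry ipt[NUM_FRAMES];

/* Translate (pid, vpn) to a frame number by searching the table.
   Returns -1 on a miss, which would trigger a page fault. */
static int translate(int pid, long vpn) {
    for (int f = 0; f < NUM_FRAMES; f++)
        if (ipt[f].valid && ipt[f].pid == pid && ipt[f].vpn == vpn)
            return f;
    return -1;
}

int main(void) {
    ipt[7] = (struct ipt_entry){ .pid = 42, .vpn = 5, .valid = true };
    printf("frame = %d\n", translate(42, 5));   /* prints 7  */
    printf("frame = %d\n", translate(42, 6));   /* prints -1 */
    return 0;
}
```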
Program Structure
* Consider a pair of nested loops to access every element in a 1024 x 1024 two-dimensional array of 32-bit ints.
* Arrays in C are stored in row-major order, which means that each row of the array (1024 ints × 4 bytes = 4K) would occupy exactly one page of memory.
* If the loops are nested so that the outer loop increments the row and the inner loop increments the column, then an entire row can be processed before the next page fault, giving 1024 page faults total.
* On the other hand, if the loops are nested the other way, so that the program worked down the columns instead of across the rows, then every access would be to a different page, giving a new page fault for each access, or over a million page faults all together (see the sketch below).
* Be aware that different languages store their arrays differently. FORTRAN, for example, stores arrays in column-major format instead of row-major. This means that converting code from one language to another may turn a fast program into a very slow one, purely because of the extra page faults.
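The two orderings can be written out directly in C. In this minimal sketch, the page-fault counts in the comments assume 4K pages, demand paging, and no prefetching.

```c
#include <stdio.h>

#define N 1024

/* C stores this row-major: data[i][0..N-1] is contiguous,
   so each row occupies exactly one 4K page. */
static int data[N][N];

int main(void) {
    long sum = 0;

    /* Page-friendly order: the inner loop walks along a row, so each
       page (one row) is touched completely before moving on; roughly
       1024 page faults in total. */
    for (int row = 0; row < N; row++)
        for (int col = 0; col < N; col++)
            sum += data[row][col];

    /* Page-hostile order: the inner loop walks down a column, touching
       a different row (page) on every access; up to N * N faults. */
    for (int col = 0; col < N; col++)
        for (int row = 0; row < N; row++)
            sum += data[row][col];

    printf("%ld\n", sum);
    return 0;
}
```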
I/O Interlock and Page Locking
There are several occasions when it may be desirable to lock pages in memory, and not let them get paged out:
* Certain kernel operations cannot tolerate having their pages swapped out.
* If an I/O controller is doing direct memory access, it would be a disaster if the pages were paged out in the middle of the I/O operation.
* In a priority-based scheduling system, low-priority jobs may need to wait quite a while before getting their turn on the CPU, and there is a danger of their pages being paged out before they get a chance to use them even once after paging them in. In this situation pages may be locked when they are paged in, until the process that requested them gets at least one turn on the CPU.
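Page locking inside the kernel is not directly visible to applications, but POSIX exposes an analogous facility through mlock(). A minimal user-space sketch, with error handling abbreviated:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 4096;
    char *buf = malloc(len);
    if (!buf)
        return 1;

    /* Pin the buffer's pages into physical memory so they cannot be
       paged out, e.g. while a device is doing DMA into them. */
    if (mlock(buf, len) != 0) {
        perror("mlock");
        free(buf);
        return 1;
    }

    buf[0] = 'x';   /* safe: this page is guaranteed to be resident */

    /* Release the lock once the I/O (or other critical work) is done. */
    munlock(buf, len);
    free(buf);
    return 0;
}
```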
Operating-System Examples ( Optional )
Windows
* Windows uses demand paging with clustering, meaning it pages in multiple pages whenever a page fault occurs.
* The working set minimum and maximum are normally set at 50 and 345 pages respectively. (Maximums can be exceeded in rare circumstances.)
* Free pages are kept on a free list, with a minimum threshold indicating when there are enough free frames available.
* If a page fault occurs and the process is below its maximum, then additional pages are allocated. Otherwise some pages from this process must be replaced, using a local page replacement algorithm.
* If the amount of free frames falls below the allowable threshold, then working set trimming occurs, taking frames away from any processes which are above their minimum, until all are at their minimums. Then additional frames can be allocated to processes that need them.
* The algorithm for choosing victim frames depends on the type of processor:
* On single-processor 80x86 systems, a variation of the clock (second chance) algorithm is used.
* On Alpha and multiprocessor systems, clearing the reference bits may require invalidating entries in the TLB on other processors, which is an expensive operation. In this case Windows uses a variation of FIFO.
Solaris
* Solaris maintains a list of free pages, and allocates one to a faulting thread whenever a fault occurs. It is therefore imperative that a minimum amount of free memory be kept on hand at all times.
* Solaris has a parameter, lotsfree, usually set at 1/64 of total physical memory. Solaris checks 4 times per second to see if the free memory falls below this threshold, and if it does, then the pageout process is started.
* Pageout uses a variation of the clock (second chance) algorithm, with two hands rotating around through the frame table. The first hand clears the reference bits, and the second hand comes by afterwards and checks them. Any frame whose reference bit has not been set again before the second hand gets there gets paged out. (A sketch of this scan appears at the end of this section.)
* Pageout is adjustable via the distance between the two hands (the handspan) and the speed at which the hands move. For example, if the hands each check 100 frames per second, and the handspan is 1000 frames, then there would be a 10-second gap between the time when the leading hand clears the reference bits and the time when the trailing hand checks them.
* The speed of the hands is usually adjusted according to the amount of free memory. Slowscan is usually set at 100 pages per second, and fastscan is usually set at the smaller of 1/2 of the total physical pages per second and 8192 pages per second.
* Solaris also keeps a cache of pages that have been reclaimed but which have not yet been overwritten, as opposed to the free list which only holds pages whose current contents are invalid. If one of the pages from the cache is needed before it gets moved to the free list, then it can be quickly recovered.
* Normally pageout runs 4 times per second to check whether memory has fallen below lotsfree. However if it falls below desfree, then pageout will run at 100 times per second in an attempt to keep at least desfree pages free. If it is unable to do this for a 30-second average, then Solaris begins swapping out processes, starting preferably with processes that have been idle for a long time.
* If free memory falls below minfree, then pageout runs with every page fault.
* Recent releases of Solaris have enhanced the virtual memory management system, including recognizing pages from shared libraries and protecting them from being paged out.
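To make the two-handed clock concrete, here is a minimal user-space model. The frame count, handspan, and page_out() stub are all invented for illustration; the real pageout daemon runs against the kernel's frame table.

```c
#include <stdio.h>
#include <stdbool.h>

#define NUM_FRAMES 16
#define HANDSPAN    4   /* frames between the two hands */

static bool referenced[NUM_FRAMES];

static void page_out(int frame) {
    printf("paging out frame %d\n", frame);
}

/* One step of the scan: the front hand clears a frame's reference bit;
   the back hand, HANDSPAN frames behind, evicts any frame whose bit was
   not set again in the meantime. */
static void scan_step(int front) {
    int back = (front - HANDSPAN + NUM_FRAMES) % NUM_FRAMES;

    referenced[front] = false;        /* front hand clears */
    if (!referenced[back])            /* back hand checks  */
        page_out(back);
}

int main(void) {
    /* Simulate a few revolutions; frame 3 keeps getting re-referenced
       and therefore survives, while idle frames are reclaimed. */
    for (int tick = 0; tick < 2 * NUM_FRAMES; tick++) {
        referenced[3] = true;         /* hypothetical active page */
        scan_step(tick % NUM_FRAMES);
    }
    return 0;
}
```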