Storage Management

In this article we will talk about storage management and its various aspects.

Storage Areas

There are 3 major kinds of storage areas, called segments :
1. Static Storage

2. Stack Storage

3. Heap Storage

Static Storage

In this segment of storage block, the executable code of your program is stored. In addition to this, all constant data that you may utilize in your program such as constants and global variables ( both initialised and uninitialised ) reside in this segment.

Stack Storage

Stack follows the LIFO rule : Last In First Out. Local variables and function call information utilize this segment. The significance of LIFO here is that whenever a procedure / method / function / separate block is defined, several variables are declared which are local to that block. These variables are stored onto the stack along with the information of function calls and return statements. As soon as the variables reach the end of their scope, they are no longer required and hence, they are popped out from the stack and are no longer available for use,thereby freeing the memory. Same is true for function calls, and their return statements. Recursively defined functions are also stored in this stack implementation. The base case of return statements is on the top of the stack, which when pops out, returns some value to its previous call, which now becomes the top of the stack. This process continues till the top is the last return statement of all recursive calls.

Heap Storage

Heap is a collection of memory locations from which we can dynamically allocate and deallocate memory using functions like malloc, calloc, free etc. This segment is the most flexible among all as it is dynamic. Memory can be allocated at runtime and we need not know the amount of memory required before hand.

Storage Reclamation

Memory which remains unused for a long time or which is no longer required should be returned back to the free storage so that we can utilize it for other programs. After the program has used the memory, it does not require it anymore. We need to reclaim this memory to make it available for other programs. This can be done in two ways :

1. Reference Counting

2. Garbage Collection

Reference Counting

A block of data, such as an array, a variable or anything is reachable (accessible) only when there exists some reference to it. If there is no reference, then the block is not of any use, and hence can be considered as garbage. A counter is set for each block that uses the memory. Each time this block is referenced, the counter is incremented by 1. Now, if the pointer that references this block now points to another block, then the counter of our previous block is decremented or reduced by 1. When the counter becomes 0, it means that there are no references to this block, and hence the memory occupied by this block can be freed. Note that reference counting is done explicitly, and extra memory in the form of counter is required.

A disadvantage of reference counting can be observed when using circular data structures. In scircular data structures or any data structure that has some form of cycle in it, there is a link that points back to a block. Suppose we maintain the start or root pointer to keep track of our data structure. This pointer points to some node, say the first node of our structure. let some k'th element point to the first node. Example : Consider the circular linked list shown below :

Circular linked list

As we can see, each node will have a reference count of 1. Now, if our start pointer points to A, reference count of A becomes 2, while for others it is 1. What would happen if the start pointer refrences some other block and stops references A? The reference count of A becomes 1. But now, we have no way to access the list and hence it is of no use to us. Therefore we require to free the memory occupied by the list. But none of the nodes has a reference count of 0, so we cannot free the memory. This is a major reason to not use reference counting.

Garbage Collection

The process of recycling memory which is no longer in use is called Garbage Collection. There are several different ways to perform garbage collection. Some of them are :

1. Mark and Sweep

This method, as the name suggests, marks all reachable blocks and sweeps away the unreachable blocks. It achieves this by scanning the entire memory and marking all the blocks which are pointed to by some reference pointer, or some variable etc., and are therefore useful to some program. It sets a flag for such blocks. Then, it sweeps (scans) the entire memory again and recycles all blocks which are not marked. The disadvantage is that it has to scan the entire memory in an operation.

2. Copying

In this method, we divide the entire heap into two regions say, old and new. The new region is initially completely empty and the old region is where the used blocks reside. Scan the old region and copy the blocks into the new region .The new region now consists of all the required blocks, and the old region consists of all the required blocks plus the unused memory. We can now update the new region to be considered as the main region (or old region) and recycle the old region. Its disadvantages include space overheads in maintaining two copies of same data simultaneously, and the process of copying blocks is a time consuming process.

Compaction

It is quite often the case that the memory is divided into several separate blocks (not sequential). In this situation, it would be very difficult to obtain a huge continuous block. Compaction is a technique which prevents such situations. It does so by collecting several blocks and placing them together so that unused blocks are continuous in the memory.

2 Comments

c, programming, storage management

Friday, 19 December 2014

The Daily Programmer