Tuesday, September 21, 2010

Automatic Memory Management - Garbage Collection


Overview:

Memory management is essentially a topic that requires objects to be clean/cleared when their usage is complete. And obviously freed memory should not be referred.

But this seemingly simple task has been the major source of programming errors. Programmers either forget to free memory when it is no longer needed OR use memory that has already been freed up. Numerous tools have been designed to help programmers deal with these kinds of issues. All of this though helpful but still require additional efforts to solve these issues rather than focusing on the real problem. This is where Garbage Collection comes into play: It totally abstracts the programmers from worrying about freeing memory.

Application programmers create object/types and use them in their programs. And when the object/type goes out of scope, that is, it is not reachable by the application; it will automatically be collected and recycled by Garbage Collector, at the appropriate time.

So why is there no Garbage Collector for C++?

Because in C++ object pointer can be type caste to any other object, hence making it impossible to determine what is the object that the object pointer is pointing to. If the object that the object pointer is pointing to cannot be determined, how can garbage collection be done in C++ environment!

Working of Garbage Collector:

CLR (Common Language Runtime) mandates that resources be allocated from a heap, called managed heap. (Primitive types like int, string, etc are allocated on stack and not heap). The managed heap maintains a distinct pointer called NextPtr, from where the next object is allocated. And then the Nextptr is advanced by the number of bytes of the newly allocated object. This mechanism ensures that consecutive objects are contiguous in memory, which gives performance gains due to locality of reference.

In C, to allocate a resource/object, a linked list would needed to be traversed to find an appropriate memory chunk and allocate it and make the necessary changes in the linked list. Not only this required additional time but there was no guarantee that consecutive objects would be continuous in memory.

Thus managed heap is superior to C-runtime heap, in ways described above.

But, managed heap would need to reclaim memory as it cannot just keep on allocating memory infinitely. In order to reclaim memory, garbage collector needs to keep tracks of all resources/objects on heap that are not reachable i.e. It is no longer being used by the application and then reclaim the memory for these objects. Keeping track of the reach ability of the objects is done via a mechanism called roots, where the JIT compiler maintains besides each method offset the list of objects associated with it. Thus with the help of these method roots as well as global variables, GC can walk through the stack trace and find out the list of all reachable objects.

For the Garbage Collector (GC) to start there needs to be some kind of a threshold to start it. Let’s say at a particular point, the allocated memory on the managed heap reaches a threshold and starts the GC. The GC will look at the objects in the Heap (not all objects but based upon generations, which we will talk later) and identifies the objects that are not being used by the application. (it determines this by looking at the roots for all the methods on the stack trace to find out which objects are being used and which are not). Then the objects that are not being used are garbage collected i.e. their memory is reclaimed and if there is a need to compress the Heap, a decision that GC takes based on fragmentation of memory, it compresses the heap by removing unreachable objects and shifting the addresses of the other objects to make the heap continuous, as shown below. If the objects are moved, the GC obviously corrects all the references to the object, to point correctly.

Generations

If Garbage collector has to scan the entire heap i.e. all the allocated objects in heap, it would be very time consuming and inefficient too. Hence the garbage collector works on the assumption that newly allocated objects have shorter lifetime compared to older objects on heap. Numerous studies have been done to validate this assumption.

When an application starts, it is allocated a managed heap. Objects are allocated memory on this heap by the application tread(s). Up till now all the objects on the heap are termed as Generation 0. If the memory threshold defined for Generation 0 triggers in (which may when the Generation 0 reaches the size of 512KB), the GC will start scanning the objects in the managed heap to check if they are reachable by the application. All the objects that are reachable will be left as they are, and those that are not reachable will be takenoff/ cleaned up from managed heap by GC. If any compression needs to be done, the GC takes the call and moves the object in the heap so that they are continuous and the NextPtr points after all the object in heap. GC takes care to reset the object references wherever required, if the objects have been moved in the Heap.

Now the objects which were not garbage collected become Generation 1. And henceforth all the objects that would be allocated memory on heap, form a part of Generation 0, which will be blank currently. Again if the threshold on Generation 0 is reached, the GC will kick in to do the cleaning activity for the managed Heap.

What about objects in Generation 1? Well, there is a threshold for Generation 1 too, i.e. let us say that when Generation 1 reaches the size of 1 MB which may the defined Threshold limit for it, the GC will kick off looking into Generation 1 to find non-reachable objects. And the objects which survive Generation 1, will be promoted up to Generation 2 and likewise surviving objects from Generation 0, will form part of Generation1.

That is the generation theory of GC, where objects from Generation 0, i.e. newer objects are always scanned first for garbage collection, and then generation 1 and at last generation 2 is scanned. There is no further generation than generation 2.

Generations and Thread Hijacking and Suspension

Before GC can run for collecting memory, it will need to suspend all the existing threads running managed code till the GC completes its activities and resets any object pointers if required due to the movement of objects. After GC completes its activities, the threads resume their activities from the point onwards where they were suspended. But GC does not just suspend a thread just like that. It checks whether the Inst Ptr of the thread is at the offset address of the method specified in the method table produced when the IL code is JIT compiled. If yes, the thread can be easily suspended as it is at a safe point.

If not, the thread is hijacked and the return address points to a function implemented inside the CLR. The thread is then resumed with the hope that when the currently executing method returns, it will execute the special function inside CLR and thus suspend the thread. But the thread may not return for quite some time. The CLR waits for 250 milliseconds and if the method does not finish (and enter into the special method implemented inside CLR), the thread is suspended again and hijacked i.e. the stack is modified so that the current method’s return address is into the special method inside CLR for suspending the thread. The thread again resumes till it enters into the CLR method for suspension OR 250 milliseconds expire. The process continues till all threads are suspended and then the GC kicks in.


Multiprocessor

In a multiprocessor environment, the managed heap is partitioned into multiple memory arenas, one for each thread so that exclusive access to single managed heap is not required by various threads. The server version of the execution engine (MSCorSvr.dll), in the multiprocessor environment, initiates the garbage collection per thread per CPU thus allowing parallel garbage collection for each memory arena.

Concurrent collections

In a multiprocessor environment, a concurrent garbage collector thread can be initiated, of normal priority, which works in the background while the application thread runs, and it builds a graph of unreachable objects. Thus when the garbage collection actually takes place, since the graph of unreachable objects is already build, the GC process takes less time. For more interactive GUI applications, concurrent collection may be a good option.

Concurrent option needs to be turned ON in the configuration file via the attribute

Large Objects

Objects larger than 85,000 bytes are allocated in special part of the managed heap and are treated as Generation 2, simply because these objects are heavy and taking them off and compacting 85,000 bytes from the heap each time these objects are garbage collected will waste too much time. So in case these heavy objects are short lived and used frequently, these may cause generation 2 to be collected more frequently and this hurt the performance.

In case your application does have large objects that are collected frequently, try and see if they can be broken or composed of smaller objects such that only few of them need to be collected frequently OR try and circumvent the situation of having to collect these large objects frequently.

Finalize and Dispose

Most of the object types manipulate bytes in memory and hence their memory is reclaimed when they are garbage collected by the garbage collector. So why do we need a finalize method on object types and what does it do?

Finalize method is called when the object is actually garbage collected by the garbage collector. This is required when the memory of the type cannot be reclaimed by the garbage collector itself i.e. when the object/type uses an unmanaged resource like Filehandle, Mutex Kernel object, etc. for such types, the Finalize method is required to execute code to free the unmanaged resources which cannot be claimed by the GC.

The usual and the good practice is to encapsulate these unmanaged resources within a managed object type, for e.g.: the FileStream type encapsulates the windows FileHandle and exposes methods to do various operations on the unmanaged resource like Create, Read, Write, etc. When object of FileStream type becomes unreachable by the application, it is garbage collected and the finalize method on it is automatically called by the GC. The code in the finalize method then does the needful to release the unmanaged resource.

When exactly is the finalization method called is not deterministic, as it based upon when GC kicks in. So if we need to release the unmanaged resource deterministically, we can do so by implementing the Dispose pattern/interface. The dispose method is a public method. It first suppresses the calling of Finalize method by GC and then it calls a Boolean Dispose method, which is protected and virtual. This method synchronizes thread access by placing all code inside the code construct Lock() {…} and it releases/closes the unmanaged resource.

The Boolean Dispose method is called from the Dispose method as well as from Finalize method with the Boolean value indicating whether it is called from Dispose method OR Finalization method. Why? Because if the Boolean Dispose method is called from Finalize(), it is not advisable to access any other managed object. (there is no order in which the Finalize methods are called by GC i.e. an inner/contained object’s Finalize method may be called earlier than the containing object).

When the Boolean Dispose method is called from the Dispose method, it is free to call any managed object, and this indicated by passing true. Code below will clarify more.

----------------------------------------------------------------------

Public Void Dispose()

{

….

GC.SupressFinalize(this);

Dispose(true);

}

----------------------------------------------------------------------

Protected Virtual Void Dispose (Boolean Disposing)

{

Lock (this)

{

If (Disposing)

{…. Can Access other Objects….

}

If (Handle Valid)

{ Close(Handle)

Handle = Invalid

}

}

}

----------------------------------------------------------------------

Finalize TypeName()

{

Dispose(true);

}

----------------------------------------------------------------------

If a new type derives from this type which implements the dispose pattern, all it would need to do is override the Boolean Dispose method and provide it’s own implementation of clean up activity and ultimately call the base class’s Boolean Dispose method. And if this new derived type does not need to do any clean up, it need not override the Boolean Dispose method.

Finalization method and GC

For a type that implements Finalization, as soon as it is allocated in heap, GC marks that this type has a finalization method and stores a reference to such objects. When this object becomes unreachable, GC takes this object and places a reference to it in the FreeReachable Queue. At this point there is a reference to this object and technically it cannot be garbage collected. CLR spawns separate threads, other than the application threads, to run the finalize methods of these objects in FreeReachable Queue. And as the Finalize method is RUN, the entry for that object is taken off from the queue making it completely unreachable and can be garbage collected by the GC in its next run. (Please mark that the order in which the Finalize methods of the objects would run is not deterministic and hence any use of managed objects within the Finalize methods could led the application to in-deterministic state).

Thus an object having a Finalize method requires two GC turns for it to be garbage collected. Hence unless the type is using an unmanaged resource, it is strongly recommended not to use Finalize method.

Also a dispose method allows the clean up to happen before the Finalize method is called. This means that the type will not be used anymore down the line; something which is impossible to enforce. Hence it is advisable not to implement the Dispose interface, as when the type is unreachable GC will automatically call the Finalize method and the desired clean up would be done. But if the Dispose interface needs to be implemented for some reason, then before using the TYPE it should be checked whether the object of the TYPE is still Alive or has it been Cleaned up.

Weak-References

In cases when large objects are allocated but used sparingly, then instead of holding a strong reference to the object, a weak reference to the object can he held. A weak reference allows the object to be garbage collected, if there is a need for memory. Else the weak references are kept intact and the code, if required, can use the object in weak reference.

The code below demonstrates how to hold a weak reference. But please bear in mind that for weak reference to work, there should be no strong reference to that object anywhere in the stack trace. Else the weak reference would not work

WeakReference wkRef;

Protected Void WeakReference()

{

StrongRefObject StrngObj = new StrongRefObject();

wkRef = new WeakReference(StrngObj);

StrngObj = null; // Remove strong reference to the object

}

Protected Void RestoreStrongReference()

{

StrongRefObject StrngObj = wkref.Target;

If (StrngObj == null)

/// the object has been garbage collected

Else

/// the object has not been garbage collected and can be used

}

Resurrection and Object Pooling

Resurrection means bring back to life. When an object’s Finalize method is run, it means that the object is dead, there are no references to it and its memory will be reclaimed in the next cycle of GC. But what if the object in it’s Finalize method hooks itself to some global static variable. In that case, the object cannot be garbage collected as it is being referenced by the application. That means it is back to life, resurrection. And also you could register with GC to run the Finalize method for this object, if it is garbage collected again with the help of following code: GC.ReRegisterForFinalize(this);

When would you want to implement Resurrection?

Object Pooling. If there is a class of object that you would like to POOL, you could use Resurrection. First, create a pool of that object type. Then when an object from the pool is allocated, it’s reference from the pool structure holding the object collection is removed. When the object’s Finalize method runs, it cleans the object and reattaches itself to the pool structure and also re-register’s, its Finalize method with GC. Thus the object is cleaned and available for

Programmatic Control of Garbage Collector

You can force garbage collection by calling out any of the 2 static methods; though it is best to the let the GC run on its own accord as it also fine-tunes the generation threshold value/trigger based on the application behavior.

GC.Collect();

GC.Collect(Int32 GenenerationNumbertoCollectFrom); // Generation 2 will cause Gen 1 and Gen 0 to be collected too.

GC.WaitForPendingFinalizers(); //This method suspends the calling thread, till the FreeReachable queue is empty.

Int32 GetGeneration(Object obj);// Gives you which generation the object is currently in.

Monitor Garbage Collection

.Net Framework installs a number of performance counters that gives real time statistics about the CLR’s operations. These statistics can be viewed via the PerfMon.exe. Various GC related counters can be viewed and monitored.

Conclusion

The above should provide some insight into memory management and garbage collection in .Net. It also gives pointers on using Finalize and Dispose methods and their impact on performance; also where are Large objects allocated in heap and their impact on performance; and finally multiprocessor and their use by GC and how can concurrent collection be done in multiprocessor environment.

References

1. Applied Microsoft .Net Framework Programming – Jeffrey Ritcher; this is by far the best resource that I have come across on this subject.

2. CLR via C#, 2nd Edition – Jeffrey Ritcher. (I have not read this book but still including it because i have every reason to believe it must be a good read and add a lot of value on this topic in more detail).

No comments:

Post a Comment