IDisposable for Dummies #1 – Why? What?

(It took me more than a year to finish writing this article. It is a “hot topic”, and I wanted it to be clear, simple and right. Also, I was busy and did not want to publish anything “unfinished”. As usual, your feedback and comments are more than welcome. Thank you in advance.)


Recently (February 2011), I had to review some .NET code and came across some incorrect implementations of IDisposable.

After discussing them with the developers, I found that the causes ranged from:



  • Not knowing the difference between a “CLR memory resource”, a “managed resource” and an “unmanaged resource”;

  • Not understanding “how and when” resources are “released”;

  • Not knowing when to override the “Finalize()” method and what should be released by a finalizer.


Even though recent publications are getting better, the documentation from Microsoft is not fully clear on this subject, and there are still some ambiguous areas about when you should implement a finalizer. I must admit that the MSDN documentation has improved a lot since I started .NET programming back in 2002 (with .NET 1.1), but it still took me a lot of time to understand this topic, thanks to books like the ones from Jeffrey Richter [REF-01], Don Box [REF-02] and Bill Wagner [REF-03].

I have split this matter into two posts.

If you already know about the IDisposable pattern and the types of resources managed by the .NET CLR (Common Language Runtime, Microsoft's implementation of the managed runtime environment on the .NET platform), please go directly to the second post. If you do not, or if you want to learn more about it, please keep reading.


What are memory resources on .NET CLR?
In managed environments (like the .NET CLR or the Java JVM), memory management on the heap is taken care of by the runtime engine, i.e. the CLR on the .NET platform. The rest of the discussion focuses only on the .NET platform.

Like a supervisor, the runtime engine knows everything about heap memory allocation in its world, aka the managed environment. However, this managed environment lives and is hosted in a less “managed” environment, e.g. the Windows operating system. Your application might need some resources that are not managed by the CLR and that may even live outside your process. They could be allocated locally on the same machine or remotely on another machine. The important point is that you can use them from your .NET code.



From this picture we can see 3 types of memory resources for a .NET application:

  • CLR memory resources, such as _myNetObj or the string “Hello world!”. Their memory space and fields (int and string) are all allocated on the CLR's heap memory.

  • Managed resources, such as _myObjWindow, _myObjFile, _timer1, _file1, _connection1. They are .NET classes which have unmanaged resources and/or other managed resources as fields. Their own memory space is managed by the CLR.

  • Unmanaged resources, such as window, timer, file, DB. They live outside the CLR's heap memory. They can also be referred to as “native resources”.


The CLR only controls and manages its own memory. The CLR has its own algorithms to allocate, move and release its memory (heap, large object heap, …). The CLR's heap memory management is based on “generational garbage collection”. I will not dig into the details of the CLR's garbage collector implementation, because that would require many posts :) , but the aim of the garbage collector is to reclaim unreachable managed memory and to compact the heap in order to limit memory fragmentation as much as possible. For further details, please see [REF-01], [REF-07] and [REF-08].
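To make the word “generational” a little more concrete, here is a minimal sketch that asks the runtime which generation an object currently belongs to. GC.GetGeneration, GC.MaxGeneration, GC.Collect and GC.KeepAlive are real framework members; the GenerationProbe class and its method names are hypothetical, made up for this illustration. The exact numbers depend on the runtime and on GC timing, so I deliberately do not show an expected output.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class GenerationProbe
{
    // Generation the GC currently assigns to a freshly allocated object.
    public static int GenerationOfNewObject()
    {
        var fresh = new object();
        return GC.GetGeneration(fresh);
    }

    // Reports the generation of one object before and after a forced collection.
    public static string Describe()
    {
        var survivor = new object();
        int before = GC.GetGeneration(survivor);

        // Forcing a collection is for demonstration only;
        // production code should normally leave this decision to the CLR.
        GC.Collect();

        int after = GC.GetGeneration(survivor);
        GC.KeepAlive(survivor); // keep the object reachable until this point

        return $"max generation: {GC.MaxGeneration}, before: {before}, after: {after}";
    }
}
```

Calling GenerationProbe.Describe() typically shows an object that survives a collection moving to an older generation, but the CLR does not guarantee it.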

The first statement above (the CLR only controls and manages its own memory) requires a shift of mindset compared with “native memory management”, as practiced in languages such as C and C++.

In native memory management, you, the developer, are in charge of memory allocation: you call malloc() (or new) to allocate memory, use your object, and when you no longer need it you call free() (or delete) to release the memory. You have “great powers”, but also great responsibilities for the management of your memory. This style of memory management is also referred to as “deterministic memory management”. By contrast, the CLR's style is referred to as “non-deterministic memory management”.

In the “deterministic memory management” style, there are a few drawbacks:

  • First, you end up managing handles/pointers to memory areas. You have to be careful about the size of the memory block that was actually allocated when writing to and reading from it.

    • Buffer overruns, stack corruption and memory corruption are the results of misusing those handles/pointers.

  • For some applications, repeated “allocation/de-allocation” cycles on a native heap will fragment the memory, and the process will at some point end up with an “out-of-memory” error even though it still has a lot of free space (the free space is fragmented and not contiguous).

  • From a security point of view, the system cannot verify the stack and its memory pointers and apply security checks/rules.


To address some of those issues, managed environments (which have existed since the 70s, “hello Smalltalk!”) have been revisited and refreshed with the latest features. For the CLR, these include (the list is not exhaustive) “generational garbage collection” and “Code Access Security (CAS) for .NET”.

Now, in this managed world, developers no longer need to be concerned about the allocation and de-allocation of CLR memory resources. However, as your application does not (yet!) live in a fully managed operating system, you must pay attention to managed resources and unmanaged resources.

The other key point to retain is that once you have finished using an object allocated on the CLR's heap, it might be de-allocated at some point, but nothing is certain: there is no guarantee about “when”, or even whether, it will be de-allocated.

In fact, in the “non-deterministic memory management” style:

  • You cannot be sure that the CLR will re-use the memory space used by your object (if there is no need to reclaim memory, why bother?).

  • You do not know when this space will be re-used.


That leaves the .NET developer with 2 problems:

  1. How can he “force/trigger” the release of resources?

  2. How can he make sure that managed resources and unmanaged resources will still be released (at some point)?


To help you release resources other than CLR memory resources, the .NET team came up with the IDisposable interface [Ref-04] and the Dispose pattern [Ref-05]. The other .NET feature that will help you is the Finalize() method (a protected method defined at the System.Object level).


What about IDisposable?
.NET objects live in a managed world (aka the CLR). We know that the CLR memory management style is non-deterministic, i.e. developers do not have to worry about releasing the CLR heap memory allocated by CLR memory resources. However, for the other types, managed resources and unmanaged resources, we do not want those objects to stay in memory for long after we have finished using them. Some of those objects have a Close() method, some have a Free() method, some have a Release() method, and some even have a mix of them.

So, how can I (as a developer) know which one to call to free up the managed and native resources? (like DB connections, files, …)

The .NET team came up with a single interface named IDisposable with a single method called Dispose() to standardize the way resources are disposed:

  • Because it is an interface, it does not prevent your class from inheriting its implementation from whatever class it needs to (remember, you are only allowed one base class, but you can implement many interfaces);

  • As you decorate your class with that interface, it adds semantics (interfaces are like contracts) and intention to your class, so developers using your class know what to expect.

  • The Dispose() method is simple (no parameters) and its name is clear: dispose of any resources you hold.

  • Developers only need to know that a class implements IDisposable in order to call a single method, Dispose(), which releases all resources held by an instance of that class.
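As a minimal sketch of the points above (not a full implementation of the Dispose pattern), here is a class that already has its own Close() method and exposes it through IDisposable. The Session and Cleanup names are hypothetical and exist only for this illustration; IDisposable and Dispose() are the real framework contract.

```csharp
using System;

// Hypothetical resource holder: it has its own Close() method,
// but callers only need to know about Dispose().
public sealed class Session : IDisposable
{
    public bool IsOpen { get; private set; } = true;

    // Class-specific release method (could equally be Free() or Release()).
    public void Close()
    {
        IsOpen = false;
    }

    // The standard entry point: safe to call more than once.
    public void Dispose()
    {
        Close();
    }
}

// Hypothetical helper showing that calling code does not need to know
// the concrete type in order to release its resources.
public static class Cleanup
{
    public static void Release(object candidate)
    {
        if (candidate is IDisposable disposable)
        {
            disposable.Dispose();
        }
    }
}
```

Cleanup.Release works for any object: it disposes the ones that implement IDisposable and silently ignores the others.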


The C# team went a step further by providing a language construct called “using” that lets you declare and use an object, and have its Dispose() method called automatically in a try/finally block.
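The two methods below are a small sketch of that equivalence: the first relies on “using”, the second spells out roughly what the compiler generates for it. StringReader is a real framework class that implements IDisposable; the UsingDemo class and its method names are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical demo class, for illustration only.
public static class UsingDemo
{
    // Dispose() is called automatically when the block is left,
    // even if an exception is thrown inside it.
    public static string ReadWithUsing(string text)
    {
        using (var reader = new StringReader(text))
        {
            return reader.ReadToEnd();
        }
    }

    // Roughly what the compiler emits for the method above.
    public static string ReadWithTryFinally(string text)
    {
        StringReader reader = new StringReader(text);
        try
        {
            return reader.ReadToEnd();
        }
        finally
        {
            if (reader != null)
            {
                ((IDisposable)reader).Dispose();
            }
        }
    }
}
```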


Why do you need a Finalizer?
You only need to implement a finalizer (i.e. override the Finalize() method) in your class if your class directly creates a native resource (aka unmanaged resource) such as a memory block on the unmanaged heap, a GDI handle or a file handle.

Otherwise, if your class is using the .NET objects provided by the .NET framework (e.g. System.IO.Stream), those classes are managed resources: they already implement IDisposable and, where they own native resources, already take care of finalization. So, you only need to call their Dispose() method from your implementation. YOU DO NOT HAVE TO implement a finalizer in that case.

Jeffrey Richter [Ref-01] gives some other examples where you might want to implement a finalizer in your class even though it does not hold a (direct) native resource (e.g. if you want to monitor garbage collection). I think these are exceptions to the rule.
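Here is a minimal sketch of that “managed resources only” case, assuming a hypothetical ReportWriter class that owns a StreamWriter (a real framework class). It implements IDisposable, forwards the call to the object it owns, and has no finalizer.

```csharp
using System;
using System.IO;

// Hypothetical class that owns a managed resource only.
// It needs Dispose(), but no finalizer.
public sealed class ReportWriter : IDisposable
{
    private readonly StreamWriter _writer;
    private bool _disposed;

    // Assumption of this sketch: ReportWriter takes ownership of the stream.
    public ReportWriter(Stream destination)
    {
        _writer = new StreamWriter(destination);
    }

    public void WriteLine(string line)
    {
        if (_disposed)
        {
            throw new ObjectDisposedException(nameof(ReportWriter));
        }
        _writer.WriteLine(line);
    }

    // Forward the call to the managed resource we own.
    // By default, disposing the StreamWriter flushes it and closes the underlying stream.
    public void Dispose()
    {
        if (_disposed)
        {
            return;
        }
        _writer.Dispose();
        _disposed = true;
    }
}
```

Typical usage would be a “using” block around a new ReportWriter, for example one created on top of a MemoryStream or a FileStream.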

In the “non-deterministic memory management” style the finalization is as follows:

  • Exact time of finalization is unspecified. (When? => you do not know)

  • Order of finalization is unspecified. (Order? => you do not know)

  • Thread running the Finalize() method is unspecified. (Which Thread? => you do not know)


Objects that override the Finalize() method carry an extra cost when the CLR reclaims their memory [REF-01][REF-06][REF-07][REF-08]. It takes at least two garbage collections:

  • The first collection finds that your object is no longer used and puts it in the “f-reachable” list instead of reclaiming it;

  • The finalizer thread then runs your Finalize() code;

  • Your object is then removed from the “f-reachable” list, so that it becomes truly unreachable (i.e. no roots referencing it);

  • Its managed heap memory is reclaimed by a later garbage collection of the generation the object has been promoted to, possibly a full one (i.e. generation 2, the current maximum generation level in the CLR as I am writing these lines).


The C# team provided a destructor syntax (only the syntax, not the semantics!!): the code written in that method is placed by the C# compiler inside the Finalize() method, in a try/finally block with a call to the base class's Finalize() in the finally block. This is probably one of the most confusing pieces of syntax for C# developers: the Finalize() method is not a destructor like in C++.
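A short sketch may help here. The NativeHolder class below uses the destructor syntax, and the comment shows approximately what the compiler turns it into. The FinalizerInspector helper (a hypothetical name, like NativeHolder) uses reflection to check that the compiled type really declares a Finalize() method.

```csharp
using System;
using System.Reflection;

// Hypothetical class using the C# "destructor" syntax.
public class NativeHolder
{
    // The compiler emits approximately the following for this member:
    //
    //   protected override void Finalize()
    //   {
    //       try { /* body written below */ }
    //       finally { base.Finalize(); }
    //   }
    ~NativeHolder()
    {
        // Unmanaged resources would be released here.
    }
}

// Hypothetical helper, for illustration only.
public static class FinalizerInspector
{
    // True when the type itself (not a base class) declares Finalize().
    public static bool DeclaresFinalize(Type type)
    {
        return type.GetMethod(
            "Finalize",
            BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.DeclaredOnly) != null;
    }
}
```

FinalizerInspector.DeclaresFinalize(typeof(NativeHolder)) finds the generated method, whereas a class without the destructor syntax simply inherits Finalize() from System.Object.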

If your class has one (or more) unmanaged resource then you will need to:

  • Override the Finalize() method;

  • Call GC.SuppressFinalize() from your public Dispose() method;

  • Make sure your Dispose() method releases all managed and unmanaged resources;

  • Make sure your Dispose() method can be called multiple times without failing;

  • Treat a disposed object as unusable: once it has been disposed, it cannot be (re-)used anymore. It is garbage!!

  • Throw an ObjectDisposedException from all public instance methods once your object has been disposed;

  • Only release unmanaged resources in your Finalize() method.
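The checklist above can be sketched as follows. This is one possible shape under stated assumptions, not the definitive implementation: NativeBuffer is a hypothetical class, and the unmanaged resource is simply a block of native memory obtained with Marshal.AllocHGlobal (a real framework call, like Marshal.FreeHGlobal, Marshal.WriteByte and GC.SuppressFinalize).

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical class that directly owns an unmanaged resource.
public class NativeBuffer : IDisposable
{
    private IntPtr _memory;          // the unmanaged resource
    private readonly int _size;
    private bool _disposed;

    public NativeBuffer(int size)
    {
        _size = size;
        _memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed => _disposed;

    public void WriteByte(int offset, byte value)
    {
        // Public instance methods refuse to work on a disposed object.
        if (_disposed) throw new ObjectDisposedException(nameof(NativeBuffer));
        if (offset < 0 || offset >= _size) throw new ArgumentOutOfRangeException(nameof(offset));
        Marshal.WriteByte(_memory, offset, value);
    }

    // Deterministic release, called by the developer (or by "using").
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // the finalizer no longer needs to run
    }

    // Safety net, called by the CLR if Dispose() was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;       // calling Dispose() twice must not fail

        if (disposing)
        {
            // Managed resources (other IDisposable fields) would be disposed
            // here; this sketch has none.
        }

        // Unmanaged resources are released on both paths.
        if (_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_memory);
            _memory = IntPtr.Zero;
        }
        _disposed = true;
    }
}
```

The Dispose(bool) split is what lets the finalizer release only the unmanaged resource, while the public Dispose() releases everything.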


Please refer to the second post, which has a detailed matrix of what needs to be implemented in various scenarios, along with some sample code.


End of part 1
I hope that by now you have a better understanding of:

  • The different types of memory resources available in the CLR (CLR memory resource, managed resource and unmanaged resource).

  • The IDisposable interface.

  • The Finalize() method and when to implement it.


My second post has a detailed matrix to help you find out, depending on your context, what needs to be implemented, along with some code samples.

References:

Contributed by Joao Morais at http://blog.ilab8.com/2012/04/26/idisposable-for-nuts-1-why/

