Sunday, October 26, 2008

High Availability: Keep Your Code Running with the Reliability Features of the .NET Framework

SQL Server 2005 allows programmers to write stored procedures in C#, which means the CLR is hosted inside SQL Server 2005. However, if stored procedures are written carelessly enough to jeopardize the hosting environment, this feature could become a nightmare for the SQL Server product team. In this MSDN article, Stephen Toub discusses many features introduced in .NET Framework 2.0 to meet the high-availability requirements of a hosting environment.

What 'villains' would make our program unreliable? According to the author, they come in the form of OutOfMemoryException, StackOverflowException, and ThreadAbortException.

Many operations in .NET result in memory allocation. The obvious one is object construction, but there are less obvious ones. For example, boxing requires a heap allocation to store a value type. Invoking a method of an assembly for the first time can cause the assembly to be delay-loaded into memory. The first time a method is just-in-time compiled, memory must be allocated to store the generated code and the associated runtime data structures. If any of these operations fails and an OutOfMemoryException is thrown, in .NET Framework 1.x it is possible that the code in a catch block and/or finally block is not executed.

A StackOverflowException can be thrown in managed code or within the runtime. If it is thrown within the runtime, the process is torn down. If it is thrown in managed code, one can catch and handle the exception, but if one is not careful, another StackOverflowException can be thrown in the catch block, which causes the OS to kill the process.

A ThreadAbortException can occur when Thread.Abort or AppDomain.Unload is called. In .NET Framework 1.x, a resource leak is still possible whether or not catch/finally blocks are used.

The .NET Framework 2.0 introduces Constrained Execution Regions (CERs) to deal with the asynchronous exceptions mentioned above. The runtime delays thread aborts for code that is executing in a CER. The runtime also prepares CERs as early as possible to avoid out-of-memory conditions. That is, it allocates memory up front so that an OutOfMemoryException or StackOverflowException raised by the runtime itself is avoided inside the CER.

To mark code as a CER in .NET Framework 2.0, call the RuntimeHelpers.PrepareConstrainedRegions() method immediately before entering a try {...} finally {...} block. The methods called in the block must also adhere to the constraints required for execution within a CER. One declares that a method meets these constraints through ReliabilityContractAttribute.
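As a rough illustration of the pattern described above, here is a minimal sketch targeting .NET Framework 2.0 or later. The class CerSketch, its fields, and the Transfer method are hypothetical names invented for this example; only RuntimeHelpers.PrepareConstrainedRegions comes from the framework. Newer .NET versions mark the CER APIs as obsolete, so treat this as a sketch of the idea rather than a recommended implementation.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CerSketch
{
    // Hypothetical shared state that should never be left half-updated.
    private static int _balanceA = 100;
    private static int _balanceB = 0;

    public static int Total
    {
        get { return _balanceA + _balanceB; }
    }

    public static void Transfer(int amount)
    {
        // This call must be the statement immediately before the try block.
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            // Intentionally empty: the protected work lives in the finally block.
        }
        finally
        {
            // This block is the constrained region. Keep it small and free of
            // explicit allocations, boxing, and calls to unprepared methods.
            _balanceA -= amount;
            _balanceB += amount;
        }
    }
}
```

Calling CerSketch.Transfer(30) moves 30 from one field to the other. Putting both updates in the finally block is a common way to keep a thread abort from landing between them.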

A Reliability Contract expresses two concepts: what kind of state corruption could result from asynchronous exceptions being thrown during the method's execution, and, given valid input, what kind of completion guarantees the method can make if it is invoked in a CER and asynchronous exceptions are thrown. Because ThreadAbortException is delayed in a CER, one needs to consider the other two failures: OutOfMemoryException and StackOverflowException. Only the following three Reliability Contracts are valid for methods in a CER.

[ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
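To show where such a contract goes, here is a small sketch of a class whose methods carry two of the contracts listed above. The Counter class and its members are hypothetical. The contracts chosen are my own reading of what each method can promise, not something the article prescribes.

```csharp
using System;
using System.Runtime.ConstrainedExecution;

public sealed class Counter
{
    private int _value;

    public int Value
    {
        get { return _value; }
    }

    // A plain field write: assumed never to fail and never to corrupt state.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    public void Reset()
    {
        _value = 0;
    }

    // Checked arithmetic may throw OverflowException, but the field is only
    // written if the addition succeeds, so state stays consistent on failure.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
    public void Increment()
    {
        _value = checked(_value + 1);
    }
}
```

The attribute is a declaration by the developer. The runtime uses it when preparing a CER but does not verify that the method body actually lives up to it.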

If the code in a CER calls an interface method, a virtual method, a delegate, or a generic method, the CLR needs further information to allocate memory up front. Developers can help by calling methods defined in the RuntimeHelpers class, such as PrepareMethod and PrepareDelegate.
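A minimal sketch of that explicit preparation might look like the following. PrepareSketch, CleanupStep, and Cleanup are hypothetical names. RuntimeHelpers.PrepareDelegate and RuntimeHelpers.PrepareMethod are the framework calls the paragraph refers to.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class PrepareSketch
{
    // Hypothetical delegate type for a cleanup step invoked from a CER.
    private delegate void CleanupStep();

    private static int _cleanups;

    public static int Cleanups
    {
        get { return _cleanups; }
    }

    private static void Cleanup()
    {
        _cleanups++;
    }

    public static void Run()
    {
        CleanupStep step = new CleanupStep(Cleanup);

        // The runtime cannot see through a delegate when it prepares the CER,
        // so ask it to prepare the delegate's target ahead of time.
        RuntimeHelpers.PrepareDelegate(step);

        // PrepareMethod does the same for a method identified by its handle.
        MethodInfo method = typeof(PrepareSketch).GetMethod(
            "Cleanup", BindingFlags.NonPublic | BindingFlags.Static);
        RuntimeHelpers.PrepareMethod(method.MethodHandle);

        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
        }
        finally
        {
            step();
        }
    }
}
```

Preparing the same method twice is redundant here; both calls appear only to show each API once.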

When a StackOverflowException is thrown in the try block, the CLR doesn't guarantee that the back-out code is executed. For back-out code that absolutely must execute even when a StackOverflowException occurs, the RuntimeHelpers class provides the ExecuteCodeWithGuaranteedCleanup method.
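The call shape can be sketched as follows. GuaranteedCleanupSketch, DoWork, and BackOut are hypothetical names. The method takes a RuntimeHelpers.TryCode delegate, a RuntimeHelpers.CleanupCode delegate, and a user-data object that is passed to both.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GuaranteedCleanupSketch
{
    public static bool CleanupRan;
    public static bool SawException;

    // The work to attempt; userData is whatever the caller passed in.
    private static void DoWork(object userData)
    {
        int[] counter = (int[])userData;
        counter[0]++;
    }

    // The back-out code; exceptionThrown reports whether DoWork threw.
    private static void BackOut(object userData, bool exceptionThrown)
    {
        CleanupRan = true;
        SawException = exceptionThrown;
    }

    public static int Run()
    {
        int[] counter = new int[1];
        RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(
            new RuntimeHelpers.TryCode(DoWork),
            new RuntimeHelpers.CleanupCode(BackOut),
            counter);
        return counter[0];
    }
}
```

This sketch only demonstrates the normal path, where BackOut runs after DoWork completes. It does not try to provoke a stack overflow.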

The description of the Thread.Abort() method on MSDN reads as follows: "Raises a ThreadAbortException in the thread on which it is invoked, to begin the process of terminating the thread. Calling this method usually terminates the thread." What actions are involved in the termination process? When can a thread not be terminated by the Abort() method, and what can we do about it?

When a thread executing in a try-catch-finally block is aborted, the CLR executes the finally block before terminating the thread. If the code catches the ThreadAbortException (and cancels the abort with Thread.ResetAbort) or enters an infinite loop in the finally block, the thread can't be terminated. In this situation, a CLR host in .NET Framework 2.0 can abort the thread abruptly, which is called a rude thread abort. For 'graceful' thread aborts, the CLR by default delays the abort over CERs, finally blocks, catch blocks, static constructors, and unmanaged code. For rude thread aborts, the CLR delays the abort only over CERs and unmanaged code.

Since rude thread aborts skip finally and catch blocks, resource leaks may result. To address this problem, .NET Framework 2.0 introduces a new base class for objects with critical finalizers, CriticalFinalizerObject. If one is concerned that important resources might leak because of a rude thread abort, one can wrap the resource in a class derived from CriticalFinalizerObject. The resource is then guaranteed to be released. SafeHandle is one example of a class that makes use of CriticalFinalizerObject. It is simply a managed wrapper around an IntPtr with a finalizer that knows how to release the resource referenced by that IntPtr.
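As an illustration of the SafeHandle idea, here is a minimal sketch of a handle type that owns a block of unmanaged memory. HGlobalHandle is a hypothetical class written for this example. It uses Marshal.AllocHGlobal and Marshal.FreeHGlobal as a stand-in for whatever native resource a real application would wrap.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class HGlobalHandle : SafeHandle
{
    // IntPtr.Zero is treated as the invalid value; 'true' means this
    // instance owns the handle and must release it.
    public HGlobalHandle(int bytes) : base(IntPtr.Zero, true)
    {
        SetHandle(Marshal.AllocHGlobal(bytes));
    }

    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    // Called by Dispose or by the critical finalizer, and only when the
    // handle is valid.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}
```

A caller would normally wrap an instance in a using statement. The critical finalizer inherited from SafeHandle is the fallback for the cases where Dispose never gets a chance to run.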

The article also mentions several other APIs introduced in .NET Framework 2.0 to make our code more reliable, such as:

  1. Thread.BeginCriticalRegion/Thread.EndCriticalRegion
  2. Environment.FailFast
  3. System.Runtime.MemoryFailPoint
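Of the three, MemoryFailPoint is perhaps the easiest to show in a few lines. The sketch below uses a hypothetical FailPointSketch class. It asks the runtime up front whether a given number of megabytes is likely to be available and skips the work if not, rather than failing halfway through with an OutOfMemoryException.

```csharp
using System;
using System.Runtime;

public static class FailPointSketch
{
    // Returns true if the work ran, false if the memory gate refused it.
    public static bool TryRunWithGate(int megabytes)
    {
        try
        {
            // The constructor throws InsufficientMemoryException if the
            // requested amount is unlikely to be available.
            using (new MemoryFailPoint(megabytes))
            {
                byte[] buffer = new byte[megabytes * 1024 * 1024];
                buffer[0] = 1; // placeholder for the real memory-hungry work
                return true;
            }
        }
        catch (InsufficientMemoryException)
        {
            return false;
        }
    }
}
```

The gate is a prediction rather than a reservation, so the work inside can still run out of memory.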

It seems to me that however reliable you want your application to be, the .NET Framework provides the tools you need to make it that reliable. It's up to you to define how reliable your application must be.

1 comment:

Petr Omacka said...

Is this article legally provided? It comes from MSDN.com.

I would like to read it.