Monday, June 30, 2008

Avoid Writing Finalize Method in C#

Say a C# class, SomeClass, is defined as follows. You might think that the class defines a destructor, and that it's good practice to have a destructor in a class to clean up resources. Well, according to Jeffrey Richter's CLR via C#, you are wrong.

public class SomeClass
{
    ~SomeClass() { ... }
}

First, C# has no destructor in the C++ sense; the ~SomeClass() syntax produces a Finalize method. Its behavior differs from that of ~SomeClass() in C++. In C#, the method runs at some point after the CLR's garbage collector determines that the object is garbage, and you can't predict when. In C++, destruction is deterministic: when one deletes an instance of SomeClass, ~SomeClass() is executed. The goal in both languages is the same, namely releasing resources, but the execution timing is different. The syntax is the same; the semantics are not.

Second, for classes referencing managed resources, the book suggests one avoid writing a Finalize method for the reasons I quote below.

  • Finalizable objects take longer to allocate because pointers to them must be placed on the finalization list.
  • Finalizable objects get promoted to older generations, which increases memory pressure and prevents the object's memory from being collected at the time the garbage collector determines that the object is garbage. In addition, all objects referred to directly or indirectly by this object get promoted as well.
  • Finalizable objects cause your application to run slower since extra processing must occur for each object when collected.

Writing a Finalize method to release managed resources actually adds overhead for the CLR at object construction and collection. Plus, you can't control when the Finalize method will execute. If you stop writing Finalize methods in your classes, your code will be simpler and performance will be better.

For types referencing unmanaged resources, one should implement the IDisposable interface to dispose of resources deterministically. That is, in the Dispose method, one releases the native resources (for example, by calling the Win32 CloseHandle function) and then invokes the GC.SuppressFinalize method, as there is no longer any need for the object's Finalize method to execute.
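
The Dispose approach described above could look roughly like the sketch below. NativeBuffer and its members are hypothetical names made up for illustration, and the unmanaged resource here is a block of memory from Marshal.AllocHGlobal rather than a Win32 handle, so Marshal.FreeHGlobal stands in for CloseHandle. The finalizer is kept only as a safety net for callers who forget to call Dispose.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public sealed class NativeBuffer : IDisposable
{
    private IntPtr _buffer;

    public NativeBuffer(int size)
    {
        // Unmanaged allocation: the garbage collector will not free this.
        _buffer = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return _buffer == IntPtr.Zero; }
    }

    // Deterministic cleanup: the caller decides when the resource goes away.
    public void Dispose()
    {
        ReleaseBuffer();
        // The resource is already released, so the finalizer has nothing left to do.
        GC.SuppressFinalize(this);
    }

    // Safety net: runs only if Dispose was never called.
    ~NativeBuffer()
    {
        ReleaseBuffer();
    }

    private void ReleaseBuffer()
    {
        if (_buffer != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_buffer);
            _buffer = IntPtr.Zero;   // makes a second call harmless
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // The using statement calls Dispose when the block exits.
        using (NativeBuffer buffer = new NativeBuffer(64))
        {
            Console.WriteLine(buffer.IsDisposed);   // prints "False"
        }
    }
}
```

Setting the pointer to IntPtr.Zero after freeing it means Dispose can safely be called more than once.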


Friday, June 27, 2008

Covariance and Contravariance in C#2

I was reading Jon Skeet's C# in Depth the other day. Two terms, covariance and contravariance, were discussed in the book. I thought it would be helpful to summarize them here.

So what is covariance? Say classB is derived from classA. Then the following statement is valid in C#. The variable, array, can accept an array of a more specific type. That is, arrays of reference types in C# are covariant.

classA[] array = new classB[0];

However, there is a limit to this in C#2. Covariance in generics is not supported. Generics are invariant, and the following statement is not valid in C#2.

List<classA> list = new List<classB>(); //does not compile in C#2
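
One way to see why generics were kept invariant is to look at what array covariance costs. The sketch below uses object and string instead of classA and classB so that it is self-contained; StoreIsRejected is a hypothetical helper name. The assignment compiles, but the CLR has to check every store at run time, and it throws ArrayTypeMismatchException when the element doesn't match the real array type.

```csharp
using System;

public static class Program
{
    // Returns true if the CLR rejects storing a non-string into a string[]
    // that is being viewed through an object[] variable.
    public static bool StoreIsRejected()
    {
        object[] items = new string[1];   // allowed: arrays are covariant
        try
        {
            items[0] = 42;                // compiles, but a boxed int is not a string
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                  // the run-time type check caught it
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreIsRejected());   // prints "True"
    }
}
```

With an invariant List<T>, the same mistake is caught by the compiler rather than at run time.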

Then, what is contravariance? Say, a function, func, is defined in classB as follows.

public class classB : classA {
    public int func(classB arg) { ... }
}

If contravariance were supported in C#, the following statements should be valid. The function, func, can accept a more general parameter and the parameter of func is said to be contravariant.

classA a = new classA();
classB b = new classB();

b.func(a); //not supported in C#

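However, it's not supported this way in C#. C# does support contravariance for delegates: a method whose parameters are more general than the delegate's can be assigned to that delegate, so the following code snippet is valid.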

void ProcessEvent(object sender, System.EventArgs e) { ... }

this.textBox1.KeyDown += this.ProcessEvent; //valid in C#2
this.button1.MouseClick += this.ProcessEvent; //valid in C#2

One should note that the KeyDown event takes KeyEventArgs, which is derived from EventArgs, while the MouseClick event takes MouseEventArgs, which is also derived from EventArgs. So an event in C# can accept a delegate whose signature contains more general parameters.
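
The same idea can be tried without Windows Forms. In the sketch below, Widget, KeyArgs and MouseArgs are hypothetical stand-ins for the controls and their event argument types; only EventHandler<TEventArgs> and EventArgs come from the framework. A single handler that takes the general EventArgs is attached to two events that raise more specific argument types.

```csharp
using System;

// Hypothetical argument types, standing in for KeyEventArgs and MouseEventArgs.
public class KeyArgs : EventArgs { }
public class MouseArgs : EventArgs { }

// Hypothetical control exposing two events with different argument types.
public class Widget
{
    public event EventHandler<KeyArgs> KeyDown;
    public event EventHandler<MouseArgs> MouseClick;

    public void RaiseKeyDown()
    {
        if (KeyDown != null) KeyDown(this, new KeyArgs());
    }

    public void RaiseMouseClick()
    {
        if (MouseClick != null) MouseClick(this, new MouseArgs());
    }
}

public static class Program
{
    public static int Calls;

    // One handler with the most general parameter type.
    public static void ProcessEvent(object sender, EventArgs e)
    {
        Calls++;
        Console.WriteLine(e.GetType().Name);
    }

    public static void Main()
    {
        Widget widget = new Widget();

        // Both conversions rely on delegate parameter contravariance:
        // ProcessEvent accepts EventArgs, so it can handle KeyArgs and MouseArgs.
        widget.KeyDown += ProcessEvent;
        widget.MouseClick += ProcessEvent;

        widget.RaiseKeyDown();      // prints "KeyArgs"
        widget.RaiseMouseClick();   // prints "MouseArgs"
    }
}
```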


Thursday, June 26, 2008

Some Oracle Basics

I am currently reading Chapters 10 and 11 of Tom Kyte's Oracle Book to brush up on my database knowledge and would like to summarize the terms about tables and indexes for my future reference.

Row Migration: Suppose row X is 50 bytes in size and is stored in block A. After an update, the row grows to 100 bytes and no longer fits into block A. Oracle will move the updated row to another block where it can fit, say block D, and replace the original row X in block A with a pointer to block D. After the update completes, row X has migrated from block A to block D.

Index Organized Table (IOT): For an IOT, data are stored in an index structure. Data are automatically sorted according to the keys defined by the index structure. Querying an IOT usually takes one scan because data, not rowids, are stored in the leaf nodes.

Secondary Index: It's an index on an index. When one creates an index on an IOT, that index is a secondary index. Its leaf nodes contain logical rowids pointing to the data in the IOT. Since rows in an IOT can change places as the IOT changes shape and size, the logical rowids can become stale. When a query hits a stale rowid, it takes two scans to locate the data, one on the secondary index and the other on the IOT. That is slower than querying an index on a regular table, which takes one scan to locate the rowid and one read to retrieve the data.

Index Clustered Table (ICT): For an ICT, data are clustered together according to cluster keys. An ICT is different from an IOT in several ways. First, data in an ICT are not sorted by cluster keys as in an IOT. Data are stored in a heap. Second, data from different tables can be clustered together by the same columns in an ICT. Third, a cluster index takes a cluster key value and returns a block address, not a rowid, in the cluster.

Bitmap index: An entry in a bitmap index points to many rows in a table, while an entry in a B*Tree index points to a single row. A bitmap index is suitable for indexing data of low cardinality, like gender or blood type, and is especially suitable for aggregation queries, like counting the number of women with blood type A. However, since a bitmap entry points to many rows in the table, an update to the entry will cause Oracle to lock all the rows pointed to by the entry. So bitmap indexes are ill suited for a write-intensive environment.

Function-based indexes: Suppose I have a query like the following.

SELECT *
FROM USERS
WHERE UPPER(USERNAME) = :USERNAME

Without a suitable index, Oracle performs a full table scan to retrieve data from the USERS table, since a regular index on USERNAME cannot normally be used for a predicate on UPPER(USERNAME). By running the following statement, we have Oracle build a function-based index for us.

CREATE INDEX USER_UPPER_IDX ON USERS(UPPER(USERNAME));

For the above query, Oracle can now scan the index first and then read the rows pointed to by the index. Essentially, Oracle creates a case-insensitive index for us.

Further readings: Understanding Indexes and Clusters