Showing posts with label Software Architecture. Show all posts

Tuesday, October 21, 2008

Maximize Locality, Minimize Contention

In this article published in Dr. Dobb's Journal, Herb Sutter reminded us that spatial locality could inhibit software scalability if we don't code our programs carefully.

As depicted in the graph, memory is not accessed in bytes but in chunks. In the cache, data are accessed in cache lines by the hardware. In RAM, data are accessed in pages by the OS. On disk, data are accessed in clusters. So contention for any one of these chunks can hurt software performance.

For example, take a look at the following sample code.

// Thread 1
for(int i = 0; i < MAX; ++i ) {
++x;
}

// Thread 2
for(int i = 0; i < MAX; ++i ) {
++y;
}

If x and y are defined close enough together to fit in the same cache line, then, because of the cache coherency protocol, only one thread can update that cache line at a time, and the resulting behavior would be like the code below. One would expect the two-thread version to run about twice as fast as single-threaded code, only to find that the actual speedup falls well short of that, because the cache line containing variables x and y becomes a hot spot.

// Thread 1
for(int i = 0; i < MAX; ++i ) {
lightweightMutexForXandY.lock();
++x;
lightweightMutexForXandY.unlock();
}

// Thread 2
for(int i = 0; i < MAX; ++i ) {
lightweightMutexForXandY.lock();
++y;
lightweightMutexForXandY.unlock();
}

So to avoid the convoy phenomenon, the author suggested several guidelines to follow.

  1. Keep data that are not used together apart in memory to avoid the convoy phenomenon.
  2. Keep data that are frequently used together close together in memory to take advantage of locality.
  3. Keep "hot" (frequently accessed) and "cold" (infrequently accessed) data apart.
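
The first guideline can be sketched in C++ by giving x and y a cache line each. This is only a minimal illustration, not code from the article: the names PaddedCounter, Counters and runBothLoops are hypothetical, and the 64-byte line size is an assumption that holds on many x86-64 machines but is not guaranteed elsewhere.

```cpp
#include <cstddef>
#include <thread>

// Assumed cache line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLineSize = 64;

// alignas forces each counter to start on its own cache line and pads the
// struct to a full line, so x and y can no longer share one.
struct alignas(kCacheLineSize) PaddedCounter {
    long value = 0;
};

struct Counters {
    PaddedCounter x;
    PaddedCounter y;
};

// Runs the two loops from the example above on separate threads.
// Each thread writes only its own counter, so no locking is needed.
inline Counters runBothLoops(long max) {
    Counters c;
    std::thread t1([&c, max] { for (long i = 0; i < max; ++i) ++c.x.value; });
    std::thread t2([&c, max] { for (long i = 0; i < max; ++i) ++c.y.value; });
    t1.join();
    t2.join();
    return c;
}
```

The padding trades memory for less contention, so it is best reserved for data that really is written by different threads. Where the toolchain provides it, std::hardware_destructive_interference_size from the C++17 header new can replace the hard-coded 64.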

Further readings:

  1. Windows with C++: Exploring High-Performance Algorithms
  2. .NET Matters: False Sharing

Saturday, May 10, 2008

Book - Release It!: Design and Deploy Production-Ready Software

I came across the book from the announcement of the 2008 Jolt Productivity Award. This book talks about what could go wrong (antipatterns) once a system goes into production and what we could do about it (patterns). The focus of the book is not on what's in the spec but on what's not in the spec. There are four parts in the book: stability, capacity, general design issues and operations. Listed below are several points that impressed me most.

  • Timeout - "Any resource pool that blocks threads must have a timeout to ensure threads are eventually unblocked whether resources become available or not."
  • Conway's law - "Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations." It's communication structure that matters, not hierarchical structure.
  • Multiplier effect - For a webpage containing only 40 KB of useless data, if that page is hit 1000 times a day, it means 40 MB of bandwidth is wasted per day.
  • Multihomed Servers - Server sockets of an application should bind to the networks they listen to. For a server socket handling administrative requests, it should bind to the administration network, not backup network or production network.
  • Protocol Versioning - No matter what protocol systems use to interact, it will change. So making the version part of the message exchange will help ease the change of protocols.
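
The timeout point above could look roughly like the following C++ sketch. It is not taken from the book: ResourcePool, its integer handles, and the acquire/release names are hypothetical, and it assumes a C++17 compiler for std::optional.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical blocking pool of integer resource handles.
class ResourcePool {
public:
    explicit ResourcePool(std::vector<int> handles) : free_(std::move(handles)) {}

    // Waits at most `timeout` for a handle. Returns std::nullopt on timeout,
    // so the calling thread is unblocked whether or not a resource appears.
    std::optional<int> acquire(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!available_.wait_for(lock, timeout, [this] { return !free_.empty(); })) {
            return std::nullopt;
        }
        int handle = free_.back();
        free_.pop_back();
        return handle;
    }

    // Returns a handle to the pool and wakes one waiting thread, if any.
    void release(int handle) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(handle);
        }
        available_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    std::vector<int> free_;
};
```

A caller that gets std::nullopt back can fail fast or degrade gracefully instead of hanging, which is the property the quote asks for.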

Saturday, April 12, 2008

Comet - Pushing Data over HTTP

As described in this wiki article, Comet is a term, coined by Alex Russell, to describe the technologies for pushing data over HTTP. What is Comet? Comet is a web architecture allowing a web server to push data to a web browser using the HTTP protocol. It is realized by taking advantage of the HTTP persistent connection. There are many solutions available, both open and proprietary. The HTML 5 specification even attempts to standardize the Comet transport by adding a new HTML element, event-source, and a new data format, called the DOM event stream. However, several issues have to be considered when one tries to adopt the technology.

First, most web servers assign a thread to handle each connection request. Since Comet keeps its HTTP connection with a web server open, a web server can only serve a limited number of Comet connections and will run out of threads pretty soon. A dedicated Comet server should be deployed for pushing data.

Second, the traditional approach of scaling web applications by adding web servers might not be applicable to a Comet server. If a user connects to multiple event sources, it's difficult to distribute users among multiple Comet servers.

Third, proxy servers and firewalls could drop connections that have been open for too long. Though some Comet frameworks tear down and recreate the connection regularly, some proxies could send back buffered data and deceive clients into believing a connection is established.

Wednesday, March 12, 2008

An Introduction to Enterprise Service Bus

The first chapter of Open-Source ESBs in Action has a good introduction to Enterprise Service Bus. The following topics are discussed in the chapter.
  • EAI vs. ESB
    1. EAI products are based on the hub and spoke model. All data exchange is centralized in the hub.
    2. ESB products are based on the bus model. Data are distributed to the destinations through the bus. In the distribution process, data/messages can be transformed or enhanced.
    3. The data exchange in ESB products is based on open standards, such as JCA, XML, JMS, and web services standards.
  • Reasons to start thinking of an ESB
    1. Necessity to integrate applications
    2. Heterogeneous environment
    3. Reduction of total cost of ownership
  • Core functionalities of an ESB
    1. Location transparency
    2. Transport protocol conversion
    3. Message transformation
    4. Message routing
    5. Message enhancement
    6. Security
    7. Monitoring and management
  • Current open source ESB projects
    1. Mule
    2. Apache ServiceMix
    3. Open ESB
    4. Apache Synapse
    5. JBoss ESB
    6. Apache Tuscany
    7. Fuse ESB
    8. WSO2 ESB
    9. PEtALS
    10. OpenAdapter
The authors also mentioned that Service Component Architecture (SCA) seems to be the next big thing in the ESB market. SCA is a specification based on the principles of service-oriented architecture. Vendors are investigating the possibility of transforming their ESB products to conform with the specification.
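
A few of the core functionalities listed above (message routing, message transformation and location transparency) can be illustrated with a toy in-process bus in C++. This is only a sketch under loud assumptions: Message, MiniBus, addRoute and publish are hypothetical names, none of them comes from the book or from any of the ESB projects listed, and a real ESB adds transports, security and monitoring on top.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type; real ESBs carry richer envelopes.
struct Message {
    std::string type;     // used for content-based routing
    std::string payload;
};

// Minimal in-process "bus": senders publish by message type and never
// name the receiving endpoint, which mimics location transparency.
class MiniBus {
public:
    using Transform = std::function<Message(const Message&)>;
    using Endpoint = std::function<void(const Message&)>;

    // Registers an endpoint for a message type, together with the
    // transformation applied to each message before delivery.
    void addRoute(const std::string& type, Transform transform, Endpoint endpoint) {
        routes_[type].push_back({std::move(transform), std::move(endpoint)});
    }

    // Routes the message to every endpoint registered for its type and
    // returns the number of deliveries (0 if no route matches).
    int publish(const Message& message) const {
        auto it = routes_.find(message.type);
        if (it == routes_.end()) return 0;
        for (const auto& route : it->second) {
            route.endpoint(route.transform(message));
        }
        return static_cast<int>(it->second.size());
    }

private:
    struct Route {
        Transform transform;
        Endpoint endpoint;
    };
    std::map<std::string, std::vector<Route>> routes_;
};
```

Unlike a hub-and-spoke design where the hub owns all exchange logic, each route here carries its own transformation, which is roughly the bus-model idea of transforming or enhancing messages on the way to their destinations.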