Wednesday, October 29, 2008

NLS_LANG in Oracle

Everyone working with Oracle in a non-English (non-American) environment should definitely take a look at the NLS_LANG FAQ. It covers many fundamental concepts one should grasp to work effectively with Oracle.

So what is NLS_LANG? According to the FAQ, "It sets the language and territory used by the client application and the database server. It also indicates the client's character set, which corresponds to the character set for data to be entered or displayed by a client program." The language component "specifies conventions such as the language used for Oracle messages, sorting, day names, and month names". The territory component "specifies conventions such as the default date, monetary, and numeric formats". The charset component "specifies the character set used by the client application".

The NLS_LANG setting has the format language_territory.charset. On the client it can be set in the Windows Registry (HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOMEx\ for Oracle Database versions 8, 8i and 9i) or as a System or User environment variable. The setting I use on my client machine is TRADITIONAL CHINESE_TAIWAN.ZHT16MSWIN950. One can also use the @.[%NLS_LANG%]. command to display the setting in SQL*Plus. If NLS_LANG is not set, Oracle assumes that the client's NLS_LANG is AMERICAN_AMERICA.US7ASCII and does locale-specific translation accordingly. So if you can't read text selected from the database, it's very likely that the character set at the client differs from that at the Oracle server, or that the Oracle Installer didn't populate NLS_LANG and the default US7ASCII is being used.
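To make the language_territory.charset format concrete, here is a minimal Python sketch that splits a full NLS_LANG value into its three components and falls back to the default mentioned above when the variable is unset. The names parse_nls_lang and effective_nls_lang are hypothetical helpers of mine, not Oracle APIs, and the sketch only handles the full three-part form.

```python
import os
from typing import NamedTuple

# Default Oracle assumes when NLS_LANG is not set (per the text above).
DEFAULT_NLS_LANG = "AMERICAN_AMERICA.US7ASCII"


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split a full language_territory.charset string into its parts."""
    # The charset follows the last dot; the language may contain spaces
    # (e.g. "TRADITIONAL CHINESE"), so split on the first underscore.
    locale_part, _, charset = value.rpartition(".")
    language, _, territory = locale_part.partition("_")
    if not (language and territory and charset):
        raise ValueError(f"expected language_territory.charset, got {value!r}")
    return NlsLang(language, territory, charset)


def effective_nls_lang(environ=os.environ) -> NlsLang:
    """Return the parsed NLS_LANG, or the default if it is unset or empty."""
    return parse_nls_lang(environ.get("NLS_LANG") or DEFAULT_NLS_LANG)


print(parse_nls_lang("TRADITIONAL CHINESE_TAIWAN.ZHT16MSWIN950"))
```

This only reads the environment variable; on Windows the value may live in the Registry instead, which the sketch does not check.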

On the server, the NLS settings can be set as session, instance, or database parameters. A level listed earlier overrides a later one if it is set, and inherits from it if not. To display the settings on the server, one can execute the following queries:

  1. SELECT * from NLS_SESSION_PARAMETERS;
  2. SELECT * from NLS_INSTANCE_PARAMETERS;
  3. SELECT * from NLS_DATABASE_PARAMETERS;

The settings on the server are more fine-grained than those on the client. To change session or instance parameters, use the ALTER SESSION or ALTER SYSTEM command. Database parameters are set via the init.ora file during database creation and can't be changed after that. There is no NLS_LANG in init.ora, only NLS_LANGUAGE and NLS_TERRITORY. Also, the database character set is defined by the "CREATE DATABASE" command and can't be changed afterwards.
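The session, instance, database override order can be pictured as a layered lookup. The following Python sketch only models that rule as I described it; it does not talk to Oracle, the function name effective_settings is hypothetical, and the parameter values are made up for illustration.

```python
from collections import ChainMap


def effective_settings(session, instance, database):
    """Merge three parameter levels; earlier levels win when a key is set."""
    # ChainMap searches its mappings left to right, so a session value
    # hides an instance value, which in turn hides a database value.
    return dict(ChainMap(session, instance, database))


# Hypothetical values, chosen only to show the override order.
database = {"NLS_LANGUAGE": "AMERICAN", "NLS_TERRITORY": "AMERICA"}
instance = {"NLS_LANGUAGE": "TRADITIONAL CHINESE"}
session = {"NLS_TERRITORY": "TAIWAN"}

print(effective_settings(session, instance, database))
```

Here the language comes from the instance level and the territory from the session level, because those are the first levels where each key is set.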

If the character set is the same at the client and the server, Oracle directly stores whatever is submitted by the client. No conversion is involved. If the character set defined at the client is different from that at the server, the conversion is usually done at the client. However, the conversion may fail. For example, a database created with NLS_LANG=TRADITIONAL CHINESE_TAIWAN.WE8MSWIN1252 can't store Chinese (Traditional) because WE8MSWIN1252 doesn't support Chinese, but a database with NLS_LANG=AMERICAN_AMERICA.UTF8 can store Chinese (Traditional) if the input text is encoded in ZHT16MSWIN950 or UTF8. So if the database character set can't represent the characters submitted by the client, the database has to be recreated.
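The same conversion behavior can be tried outside Oracle. This Python sketch uses Python's own cp950, cp1252 and utf-8 codecs as stand-ins for ZHT16MSWIN950, WE8MSWIN1252 and UTF8; that mapping is my assumption and the codecs are not guaranteed to match Oracle's tables byte for byte. The helper can_store is hypothetical.

```python
def can_store(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


text = "中文"  # sample Traditional Chinese text

# Client in cp950, database in UTF-8: decode with the client's character
# set, then re-encode in the database's character set.
client_bytes = text.encode("cp950")
stored = client_bytes.decode("cp950").encode("utf-8")

print(can_store(text, "cp1252"))  # a Western European set has no Chinese
print(can_store(text, "utf-8"))
print(stored.decode("utf-8") == text)
```

Python raises an error when a character has no mapping; Oracle, as far as I know, usually stores a replacement character instead, which is why the damage often goes unnoticed until the data is read back.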

To troubleshoot a character set conversion problem, there are several places to check.

  1. Database character set
  2. NLS_LANG setting at the server machine
  3. NLS_LANG setting at the client machine

To see the encoding used by Oracle to store text, use the DUMP function. The following is the result from the Oracle instance I tested.

SQL> SELECT DUMP('abc', 1016) FROM DUAL;

DUMP('ABC',1016)
------------------------------------------------------------------

Typ=96 Len=3 CharacterSet=ZHT16BIG5: 61,62,63

SQL>
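For comparison, the byte values that DUMP prints in hexadecimal can be reproduced on the client side. This is a small Python sketch, assuming Python's big5 codec is close enough to ZHT16BIG5 for plain ASCII text; dump_hex is a hypothetical helper, not an Oracle function.

```python
def dump_hex(text: str, codec: str) -> str:
    """Return the encoded bytes as comma-separated hex, like DUMP above."""
    return ",".join(f"{byte:x}" for byte in text.encode(codec))


print(dump_hex("abc", "big5"))   # 61,62,63
print(dump_hex("中", "utf-8"))   # e4,b8,ad
```

Comparing this output with the DUMP result for the same text is a quick way to tell whether the bytes stored in the database are in the character set you expect.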

2 comments:

Bamboo Man said...

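Sorry, I would like to ask you about characters such as "綉 堃 睸 椀 梦 碁 嫺 葳 綉 恒 瑠 璌 荆".

After these characters are stored in an Oracle database (charset: ZHT16MSWIN950)

and then fetched by a program and displayed on a web page (utf-8 encoding), they turn into boxes (taking "綉" as an example, the box shows 8e,a7, the same as what the dump function prints...)

I have no idea how to make them display correctly on the web page, and I don't know where to start.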

Anonymous said...

Can anyone recommend the best RMM system for a small IT service company like mine? Does anyone use Kaseya.com or GFI.com? How do they compare to these guys I found recently: N-able N-central remote management? What is your best take in cost vs performance among those three? I need a good advice please... Thanks in advance!