Versions available for this page: CUBRID 9.0.0

Charset and Collations of String Literals

The charset and collation of a string literal are determined in the following order of priority.

  1. The CHARSET introducer or the COLLATE modifier of the string literal
  2. The charset and collation set by the most recently executed SET NAMES statement
  3. The default charset and collation set by the CUBRID_LANG environment variable

SET NAMES Statement

The SET NAMES statement changes the default charset and collation of the client. All statements subsequently executed by the client that issued it use the specified charset and collation. The syntax is as follows.

SET NAMES [ charset_name ] [{COLLATION | COLLATE} collation_name]

  • charset_name: Valid charset names are iso88591, utf8, and euckr.
  • collation_name: Optional. Any available collation can be specified, but it must be compatible with the charset; otherwise, an error occurs. To find the available collation names, query the db_collation catalog view (see Collation and Charset of Column).
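
The syntax above can be assembled programmatically before the statement is sent to the server. The following is a minimal sketch only: SetNamesBuilder and build are hypothetical names, not part of CUBRID or its JDBC driver. It checks the charset name against the three values listed above and leaves the charset/collation compatibility check to the server. Rejecting a call that supplies neither part is a choice made for this sketch.

```java
import java.util.Set;

// Hypothetical helper for illustration; not part of CUBRID or its JDBC driver.
final class SetNamesBuilder {

    // Charset names accepted by SET NAMES according to the list above.
    private static final Set<String> CHARSETS = Set.of("iso88591", "utf8", "euckr");

    private SetNamesBuilder() {}

    // Returns the statement text. Pass null to omit the charset or the collation.
    static String build(String charset, String collation) {
        if (charset == null && collation == null) {
            // Sketch-level choice: require at least one of the two optional parts.
            throw new IllegalArgumentException("charset or collation is required");
        }
        if (charset != null && !CHARSETS.contains(charset)) {
            throw new IllegalArgumentException("unknown charset: " + charset);
        }
        StringBuilder sql = new StringBuilder("SET NAMES");
        if (charset != null) {
            sql.append(' ').append(charset);
        }
        if (collation != null) {
            // Compatibility with the charset is not checked here; the server reports it.
            sql.append(" COLLATE ").append(collation);
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("iso88591", null));
        System.out.println(build("utf8", "utf8_en_cs"));
    }
}
```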

CHARSET Introducer

A CHARSET introducer can be placed before a constant string, and a COLLATE modifier can be placed after it. The CHARSET introducer is a charset name prefixed with an underscore (_) that comes immediately before the constant string. The syntax for specifying the CHARSET introducer and the COLLATE modifier for a string is as follows.

[charset_introducer]'constant-string' [ {COLLATE|COLLATION} collation_name]

  • charset_introducer: Optional. A charset name prefixed with an underscore (_); one of _utf8, _iso88591, or _euckr.
  • constant-string: A constant string value.
  • collation_name: Optional. The name of a collation available in the system.

The default charset and collation of a constant string are determined by the current database connection (the most recently executed SET NAMES statement, or the default values). When the CHARSET introducer is specified and the COLLATE modifier is omitted, the default collation (binary collation) of the corresponding charset is used. When the CHARSET introducer is omitted and the COLLATE modifier is specified, the charset is determined from the collation.
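
The priority order and the defaulting rules described above can be modeled as a small resolver. This is an illustrative sketch, not CUBRID source code: LiteralCollationResolver, Setting, and resolve are hypothetical names. It rests on two assumptions that this page does not state: that the binary collation of a charset is named "<charset>_bin", and that a collation name begins with its charset name followed by an underscore (as utf8_en_cs does). It does not check that an introducer and a COLLATE modifier given together are compatible.

```java
// Hypothetical model of the rules described above; not CUBRID source code.
final class LiteralCollationResolver {

    /** A charset/collation pair, for example ("utf8", "utf8_en_cs"). */
    static final class Setting {
        final String charset;
        final String collation;

        Setting(String charset, String collation) {
            this.charset = charset;
            this.collation = collation;
        }

        @Override
        public String toString() {
            return charset + "/" + collation;
        }
    }

    private LiteralCollationResolver() {}

    // Assumption: the binary collation of a charset is named "<charset>_bin".
    static String binaryCollationOf(String charset) {
        return charset + "_bin";
    }

    // Assumption: a collation name begins with its charset name and an underscore.
    static String charsetOf(String collation) {
        int sep = collation.indexOf('_');
        if (sep <= 0) {
            throw new IllegalArgumentException("cannot infer charset from: " + collation);
        }
        return collation.substring(0, sep);
    }

    // introducer (e.g. "_utf8"), collate, and setNames may be null when absent.
    static Setting resolve(String introducer, String collate, Setting setNames, Setting dbDefault) {
        // Priority 1: the introducer and/or COLLATE modifier on the literal itself.
        if (introducer != null) {
            if (!introducer.startsWith("_") || introducer.length() < 2) {
                throw new IllegalArgumentException("introducer must look like _utf8: " + introducer);
            }
            String charset = introducer.substring(1);
            // Without COLLATE, fall back to the charset's binary collation.
            return new Setting(charset, collate != null ? collate : binaryCollationOf(charset));
        }
        if (collate != null) {
            // Without an introducer, the charset follows from the collation.
            return new Setting(charsetOf(collate), collate);
        }
        // Priority 2: the last SET NAMES statement. Priority 3: the default values.
        return setNames != null ? setNames : dbDefault;
    }

    public static void main(String[] args) {
        Setting dbDefault = new Setting("iso88591", "iso88591_bin"); // illustrative default
        Setting setNames = new Setting("utf8", "utf8_en_cs");
        System.out.println(resolve(null, null, null, dbDefault));
        System.out.println(resolve(null, null, setNames, dbDefault));
        System.out.println(resolve("_utf8", null, setNames, dbDefault));
        System.out.println(resolve(null, "utf8_en_cs", null, dbDefault));
    }
}
```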

Example

The following are examples of the SET NAMES statement.

SET NAMES iso88591;

SET NAMES utf8 COLLATE utf8_en_cs;

The following example shows how to specify the CHARSET introducer and the COLLATE modifier.

SELECT 'cubrid';

SELECT _utf8'cubrid';

SELECT _utf8'cubrid' COLLATE utf8_en_cs;

Remark

The charset names used in the SET NAMES statement differ slightly from those used by JDBC, as follows.

SET NAMES Statement Charset    JDBC Charset
iso88591                       ISO-8859-1
utf8                           UTF-8
euckr                          EUC_KR

This is an example of the connection URL string used in JDBC.

url = "jdbc:cubrid:127.0.0.1:33000:demodb:dba::?charset=UTF-8";
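
Because the two notations differ, application code that starts from a SET NAMES charset name has to translate it before building the connection URL. The sketch below shows one way to do that. CubridJdbcCharset, toJdbc, and url are hypothetical names, and the URL layout is simply copied from the example above rather than taken from a driver specification.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of the CUBRID JDBC driver.
final class CubridJdbcCharset {

    // SET NAMES charset name -> JDBC charset name, as listed in the table above.
    private static final Map<String, String> SET_NAMES_TO_JDBC = Map.of(
            "iso88591", "ISO-8859-1",
            "utf8", "UTF-8",
            "euckr", "EUC_KR");

    private CubridJdbcCharset() {}

    static String toJdbc(String setNamesCharset) {
        String jdbcName = SET_NAMES_TO_JDBC.get(setNamesCharset);
        if (jdbcName == null) {
            throw new IllegalArgumentException("unknown charset: " + setNamesCharset);
        }
        return jdbcName;
    }

    // Builds a URL shaped like the example above:
    // jdbc:cubrid:<host>:<port>:<db>:<user>:<password>:?charset=<JDBC charset>
    static String url(String host, int port, String db, String user, String password,
                      String setNamesCharset) {
        return "jdbc:cubrid:" + host + ":" + port + ":" + db + ":" + user + ":" + password
                + ":?charset=" + toJdbc(setNamesCharset);
    }

    public static void main(String[] args) {
        // Same host, port, database, and user as the example above; empty password.
        System.out.println(url("127.0.0.1", 33000, "demodb", "dba", "", "utf8"));
    }
}
```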