Definition and Characteristics

Definition

CUBRID supports the following four types of character strings:

  • Fixed-length character string: CHAR(n)
  • Variable-length character string: VARCHAR(n)
  • Fixed-length national character string: NCHAR(n)
  • Variable-length national character string: NCHAR VARYING(n)

The following rules apply when using the character string types.

  • In general, character strings are enclosed in single quotes. Double quotes may also be used, depending on the value of ansi_quotes, a parameter related to SQL statements. If ansi_quotes is set to no, a character string enclosed in double quotes is handled as a character string, not as an identifier. The default value is yes. For more information, see Statement/Type-Related Parameters.
  • If only whitespace characters (e.g. spaces, tabs, or line breaks) separate two character strings, the two strings are treated as one, in accordance with the ANSI standard. For example, a line break separates the two character strings below:
      'abc'
      'def'
    These two strings are treated as identical to the single string below:
      'abcdef'
  • To include a single quote as part of a character string, enter two single quotes in a row. For example, the character string on the left is stored as the one on the right.
      '''abcde''fghij'            'abcde'fghij
  • The maximum size of a single character string token is 16 KB.
  • National character strings are used to store non-English character strings in a multilingual environment. Note that the string, enclosed in single quotes, must be preceded by an uppercase N.
      N'Härder'
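The quoting rules above can be sketched in a few lines of Python. This is only an illustrative model of the rules as described here, not CUBRID's actual lexer; the function name `parse_string_literals` is hypothetical, and the sketch ignores the N prefix, double quotes, and the 16 KB token limit.

```python
def parse_string_literals(source: str) -> str:
    """Parse one or more adjacent single-quoted literals into one value.

    Hypothetical helper that models the rules described in the text:
    - literals separated only by whitespace are joined into one string
    - two single quotes in a row inside a literal stand for one quote
    """
    result = []
    i = 0
    n = len(source)
    seen_literal = False
    while i < n:
        ch = source[i]
        if ch.isspace():
            # Whitespace between literals is skipped, so they are joined.
            i += 1
            continue
        if ch != "'":
            raise ValueError(f"expected a single quote at position {i}")
        i += 1  # skip the opening quote
        while True:
            if i >= n:
                raise ValueError("unterminated string literal")
            if source[i] == "'":
                if i + 1 < n and source[i + 1] == "'":
                    result.append("'")  # doubled quote becomes one quote
                    i += 2
                    continue
                i += 1  # closing quote
                break
            result.append(source[i])
            i += 1
        seen_literal = True
    if not seen_literal:
        raise ValueError("no string literal found")
    return "".join(result)


print(parse_string_literals("'abc'\n'def'"))      # abcdef
print(parse_string_literals("'''abcde''fghij'"))  # 'abcde'fghij
```

The first call joins two literals separated by a line break; the second shows doubled quotes collapsing into single quotes inside the stored value.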

Characteristics

Length

For a CHAR or VARCHAR type, specify the length of the character string in bytes; for an NCHAR or NCHAR VARYING type, specify the length as a number of characters.

When the entered character string is longer than the specified length, the excess characters are truncated if they are space characters (ASCII 32); if they are non-space characters, an error occurs. Note that non-space data is never truncated to fit the specified length.

For a fixed-length character string type such as CHAR or NCHAR, the length is fixed at the declared length. Therefore, when a shorter string is stored, it is padded on the right with space characters up to the declared length. For a variable-length character string type such as VARCHAR or NCHAR VARYING, only the entered character string is stored, and no padding is added.
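The length rules above (excess spaces dropped, excess non-space characters rejected, fixed-length values padded) can be modeled in Python as follows. This is a sketch of the rules as stated in the text, not CUBRID code; the helper names are hypothetical, and it assumes single-byte characters so that Python's string length matches the byte length used by CHAR and VARCHAR.

```python
def fit_to_length(value: str, n: int) -> str:
    """Apply the declared-length rule (assumes one byte per character).

    Excess characters are dropped only if they are all spaces (ASCII 32);
    any other excess character is an error.
    """
    excess = value[n:]
    if excess.strip(" ") != "":
        raise ValueError(f"value exceeds declared length {n}")
    return value[:n]


def store_char(value: str, n: int) -> str:
    """Model a fixed-length CHAR(n): right-padded with spaces to n."""
    return fit_to_length(value, n).ljust(n, " ")


def store_varchar(value: str, n: int) -> str:
    """Model a variable-length VARCHAR(n): stored as entered, no padding."""
    return fit_to_length(value, n)


print(repr(store_char("abc", 5)))       # 'abc  '
print(repr(store_varchar("abc", 5)))    # 'abc'
print(repr(store_char("abcde   ", 5)))  # 'abcde'
```

Calling `store_char("abcdef", 5)` raises `ValueError` in this model, mirroring the error described for non-space characters beyond the declared length.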

The maximum length that can be specified for a CHAR or VARCHAR type is 1,073,741,823; for an NCHAR or NCHAR VARYING type, it is 536,870,911. The maximum length that can be input or output in a CSQL statement is 8,192 KB.

Character Set, charset

A character set (charset) is a set of rules that defines which codes are used to encode particular characters (symbols) when they are stored in a computer.

CUBRID supports the following character sets, which you select through the CUBRID_LANG environment variable. You can store data in other character sets (e.g. utf-8), but string functions and LIKE searches are not supported for such data.

Character Set               CUBRID_LANG
8-bit ISO 8859-1 Latin      en_US
KSC 5601-1992 (EUC_KR)      ko_KR.euckr

Any character from the above character sets can be included in a character string (the NULL character is represented as '\0').
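Because CHAR and VARCHAR lengths are given in bytes, the same text can need a different declared length depending on the character set. The sketch below uses Python's built-in `latin-1`, `euc_kr`, and `utf-8` codecs as stand-ins for the character sets mentioned here; treating them as equivalent to CUBRID's own encodings is an assumption, and `byte_length` is a hypothetical helper.

```python
def byte_length(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


# ISO 8859-1 stores every supported character in one byte.
print(byte_length("Härder", "latin-1"))  # 6
# The same text needs an extra byte in UTF-8 because of the umlaut.
print(byte_length("Härder", "utf-8"))    # 7
# EUC-KR stores each Hangul syllable in two bytes.
print(byte_length("한글", "euc_kr"))      # 4
print(byte_length("한글", "utf-8"))       # 6
```

In this model, a column holding the two-syllable Korean word would need a declared byte length of at least 4 under EUC-KR.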

Collating Character Sets

A collation is a set of rules for comparing characters when searching or sorting values stored in the database under a given character set. Such rules therefore apply only to character string data types such as CHAR() or VARCHAR(). For a national character string type such as NCHAR() or NCHAR VARYING(), the sorting rules are determined by the encoding algorithm of the specified character set.
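One way to picture sorting that is "determined by the encoding" is to order strings by their encoded byte values. The Python sketch below does exactly that with the `latin-1` codec; it assumes this binary ordering is what the text means, which the source does not spell out, and `sort_by_encoding` is a hypothetical name.

```python
def sort_by_encoding(values, encoding="latin-1"):
    """Sort strings by the byte values of their encoded form."""
    return sorted(values, key=lambda s: s.encode(encoding))


words = ["zebra", "éclair", "Zoo", "eclair"]
print(sort_by_encoding(words))
# ['Zoo', 'eclair', 'zebra', 'éclair']
```

Under this ordering, uppercase letters sort before lowercase ones and accented letters sort after "z", because their code values are higher. A language-aware collation would typically place "éclair" next to "eclair" instead.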

Character String Coercion

Automatic coercion takes place between fixed-length and variable-length character strings when two strings are compared; it applies only to strings that belong to the same character set. For example, when you extract a column value of the CHAR(5) data type and insert it into a column of the CHAR(10) data type, the value is automatically coerced to CHAR(10). To coerce a character string explicitly, use the CAST operator (see CAST Operator).