Definition and Characteristics

Definition

CUBRID supports the following four types of character strings:

  • Fixed-length character string: CHAR(n)
  • Variable-length character string: VARCHAR(n)
  • Fixed-length national character string: NCHAR(n)
  • Variable-length national character string: NCHAR VARYING(n)

All types of character strings are enclosed in single quotes. If two character strings are separated only by blank characters (e.g. spaces, tabs, or line breaks), they are treated as a single character string, in accordance with the ANSI standard. For example, the following two character strings are identical.

'abcdef'

'abc'
'def'
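
The joining rule above can be modeled outside the database. The following Python sketch is illustrative only and is not CUBRID's parser; the function name merge_adjacent_literals is hypothetical. It scans a run of single-quoted literals separated by blanks and returns the combined literal body, leaving any doubled quotes inside a literal untouched.

```python
import re

# One single-quoted literal. A doubled quote ('') inside it counts as part of
# the literal, so 'it''s' is one literal rather than two adjacent ones.
_LITERAL = re.compile(r"'((?:[^']|'')*)'")


def merge_adjacent_literals(source: str) -> str:
    """Join literals separated only by blanks into one literal body.

    Hypothetical helper: it models the rule described in the text and makes
    no claim about how CUBRID tokenizes statements internally.
    """
    parts = []
    pos = 0
    while True:
        # Skip blanks (spaces, tabs, line breaks) before the next literal.
        while pos < len(source) and source[pos].isspace():
            pos += 1
        if pos >= len(source):
            break
        match = _LITERAL.match(source, pos)
        if match is None:
            raise ValueError(f"expected a quoted literal at offset {pos}")
        parts.append(match.group(1))
        pos = match.end()
    return "".join(parts)


print(merge_adjacent_literals("'abc'\n'def'"))  # abcdef
```

Under this model, 'abc' followed by a line break and 'def' yields the same body as 'abcdef'.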

If you want to include a single quote as part of a character string, enter two single quotes in a row. For example, the character string on the left is stored as the one on the right.

 '''abcde''fghij'        'abcde'fghij
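
The doubling rule is easy to get wrong by one quote, so a small model may help. The Python sketch below is illustrative only; quote_literal and unquote_literal are hypothetical helper names, not CUBRID functions. One builds a literal from a value by doubling each embedded single quote, and the other recovers the stored value from a literal.

```python
def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling each embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def unquote_literal(literal: str) -> str:
    """Return the value a single-quoted literal stands for."""
    if len(literal) < 2 or literal[0] != "'" or literal[-1] != "'":
        raise ValueError("not a single-quoted literal")
    # Drop the enclosing quotes, then collapse each doubled quote to one.
    return literal[1:-1].replace("''", "'")


print(quote_literal("abcde'fghij"))       # 'abcde''fghij'
print(unquote_literal("'abcde''fghij'"))  # abcde'fghij
```

A value that itself begins with a quote therefore starts with three quotes in literal form: the opening delimiter followed by one doubled quote.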

The maximum token size for any character string is 16KB. National character strings are used to store characters that are not part of the English alphabet. A national character string differs from a non-national one in that it is prefixed with the character N, which must be uppercase. For example:

 N'Härder'

Characteristics

Length

The length of a character string is the number of characters it contains. For both fixed-length and variable-length types, the length is specified when the attribute is defined.
When a character string exceeds the defined maximum length, the excess characters are truncated if they are all space characters (ASCII 32); otherwise, an error is raised. When a string shorter than the defined length is stored in a fixed-length character string, the remainder is filled with space characters. A variable-length character string, however, stores only the entered characters, with no trailing spaces added.
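
The storage rules for fixed-length and variable-length strings can be summarized in a few lines of Python. This is a minimal sketch of the behavior described here, not CUBRID's implementation; store_char, store_varchar, and fit_to_length are hypothetical names, and lengths are counted in characters.

```python
def fit_to_length(value: str, n: int) -> str:
    """Drop overflow made only of spaces; reject any other overflow."""
    if len(value) > n:
        excess = value[n:]
        if excess.strip(" ") != "":
            raise ValueError(f"value exceeds the defined length {n}")
        value = value[:n]
    return value


def store_char(value: str, n: int) -> str:
    """Model of CHAR(n): pad short values with spaces up to n characters."""
    return fit_to_length(value, n).ljust(n, " ")


def store_varchar(value: str, n: int) -> str:
    """Model of VARCHAR(n): keep the value as entered, with no padding."""
    return fit_to_length(value, n)


print(repr(store_char("abc", 5)))        # 'abc  '
print(repr(store_varchar("abc", 5)))     # 'abc'
print(repr(store_char("abcde   ", 5)))   # 'abcde'
```

In this model, storing 'abcdef' in a length-5 attribute raises an error because the overflow character is not a space.
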
The maximum length of a character string is 1,073,741,823 bytes (approximately 1GB). A string longer than this is truncated. The maximum length of a string that can be entered or processed by a single CSQL statement is between 16KB and 8,192KB (8,388,608 bytes). The maximum length of a national character string is 536,870,911 characters, about half of the non-national character string limit, because a character in a national character string may need more than one byte. For the same reason, the maximum length of a national character string that can be input or output in a CSQL statement is also 536,870,911 characters.
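
The limits quoted above can be cross-checked with simple arithmetic. The Python snippet below only verifies the numbers; the constant names are made up for illustration, and expressing the limits as powers of two is an observation about the figures rather than something the manual states.

```python
# Constant names are hypothetical; only the values come from the text.
MAX_STRING_LENGTH = 2**30 - 1                 # 1,073,741,823
MAX_NATIONAL_LENGTH = MAX_STRING_LENGTH // 2  # 536,870,911 (integer half)
CSQL_MAX_BYTES = 8192 * 1024                  # 8,192KB expressed in bytes

print(f"{MAX_STRING_LENGTH:,}")    # 1,073,741,823
print(f"{MAX_NATIONAL_LENGTH:,}")  # 536,870,911
print(f"{CSQL_MAX_BYTES:,}")       # 8,388,608
```

Because 1,073,741,823 is odd, the national limit is the integer half of the non-national one, which is also 2**29 - 1.
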

Character Code Set

CUBRID supports the following character code sets:

  • 8-bit ASCII
  • 8-bit ISO 8859-1 Latin
  • KSC 5601-1992 (Korean character standard)

Any characters from the above character sets can be included in a character string (the NULL character is represented as '\0').
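
To see why the code set matters for storage size, it helps to compare character counts with byte counts. The Python sketch below uses Python's own codecs as stand-ins: "latin-1" for ISO 8859-1 and "euc_kr" for the KSC 5601 repertoire. That mapping is an assumption made for illustration; CUBRID's internal encoding routines are not shown here, and encoded_size is a hypothetical helper.

```python
def encoded_size(text: str, codec: str) -> int:
    """Number of bytes the text occupies in the given Python codec."""
    return len(text.encode(codec))


# ISO 8859-1 uses one byte per character, including accented letters.
print(len("Härder"), encoded_size("Härder", "latin-1"))  # 6 6

# Hangul syllables take two bytes each in EUC-KR.
print(len("한글"), encoded_size("한글", "euc_kr"))        # 2 4

# The NULL character counts as one ordinary character in a string.
print(len("a\0b"))                                        # 3
```

The Korean sample shows a case where the byte length is twice the character length.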

Collating Character Code Sets

Character codes are sorted according to rules defined for the character code set. These rules are called a collation. They determine whether character codes are compared from left to right or from right to left, and whether trailing spaces are taken into account in the comparison. Each character code set includes a pre-defined basic collation.
For a national character set, the collation is determined by its encoding algorithm.
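
The two choices a collation makes, comparison direction and treatment of trailing spaces, can be illustrated with a toy comparison function. The Python sketch below is hypothetical and does not reproduce any collation that CUBRID ships; it orders characters by code point and exposes the two choices as flags.

```python
def collation_key(text: str, ignore_trailing_spaces: bool = True,
                  right_to_left: bool = False) -> list:
    """Build a sort key from code points under the chosen toy rules."""
    if ignore_trailing_spaces:
        text = text.rstrip(" ")
    codes = [ord(ch) for ch in text]
    if right_to_left:
        codes.reverse()
    return codes


def compare(a: str, b: str, **rules) -> int:
    """Return -1, 0, or 1, like a classic three-way string comparison."""
    key_a = collation_key(a, **rules)
    key_b = collation_key(b, **rules)
    return (key_a > key_b) - (key_a < key_b)


print(compare("abc  ", "abc"))                               # 0
print(compare("abc ", "abc", ignore_trailing_spaces=False))  # 1
print(compare("ba", "ab", right_to_left=True))               # -1
```

The same pair of strings can compare as equal or unequal depending on the trailing-space rule.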

Character String Coercion

Automatic coercion takes place between fixed-length and variable-length character strings when two strings are compared, and it applies only to strings that belong to the same character code set. For example, when you extract a value from a CHAR(5) column and insert it into a CHAR(10) column, the value is automatically coerced to CHAR(10). To coerce a character string explicitly, use the CAST operator (see "CAST Operator").
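
The same-code-set condition can be made concrete with a small model. The Python sketch below is an assumption-laden illustration, not CUBRID's coercion logic: CharValue and coerce_char are hypothetical names, the code set is tracked as a plain label, and coercion to a longer CHAR is modeled as re-padding with spaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharValue:
    """A fixed-length string value together with its declared attributes."""
    text: str
    length: int
    charset: str


def coerce_char(value: CharValue, target_length: int,
                target_charset: str) -> CharValue:
    """Coerce a CHAR value to CHAR(target_length) in the same code set."""
    if value.charset != target_charset:
        raise TypeError("coercion requires the same character code set")
    body = value.text.rstrip(" ")
    if len(body) > target_length:
        raise ValueError("value does not fit the target length")
    # Re-pad with spaces so the result has exactly target_length characters.
    return CharValue(body.ljust(target_length, " "), target_length,
                     target_charset)


source = CharValue("abc  ", 5, "iso8859-1")
print(repr(coerce_char(source, 10, "iso8859-1").text))  # 'abc       '
```

Calling coerce_char with a different target code set label raises an error in this model, mirroring the restriction stated above.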