Open Source RDBMS - Seamless, Scalable, Stable and Free

한국어 | Login |Register

Working with CUBRID BLOB / CLOB Data Types


The scope of this tutorial is to present the new data types, BLOB and CLOB, introduced in CUBRID 8.3.1.

In the release 8.3.1, CUBRID comes up with two new data types, to improve the support for Large Objects (LOB), such as texts and binary files:

  • CLOB (Character Large Object)
  • BLOB (Binary Large Object)

Note:

Before CUBRID 3.1 release, Large Objects were supported in CUBRID through the GLO data type (Generalized Large Object). Starting from 3.1, the GLO data type is no longer supported. Thus, if you plan on upgrading a database that uses the GLO data type, then you will need to modify the structure of the database to use the new LOB data types.

Before we dive in, let's review some of the important terms relevant to the topic of Large Objects.

Terms

Term Definition
LOB (Large Object) Large-sized objects such as binaries or text.
FBO (File Based Object) An object that stores data of the database in an external file.
External LOB An object which stores LOB data in a file, outside the database.
Internal LOB Internal LOB is an object that stores LOB data inside the database.
External Storage External (file) storage to store LOB data (example: POSIX file system).
LOB Locator The path name of a file stored in external storage, which store LOB data.
LOB Data Content of a file in a specific location specified by a LOB Locator.

CLOB Data type

CLOB is a data type used to store large character string data.

Most importantly, LOB data (both BLOB and CLOB) is stored "outside" of the database in CUBRID, in external file(s).

CUBRID implementation specifics:

  • The maximum length of CLOB data is the maximum file size which can be created on the external storage, which depends on the OS specifics.
  • In SQL statements, the CLOB data type is compatible with CHAR(n), VARCHAR(n), NCHAR(n), and NCHAR VARYING(n) data types. However, only an explicit type change is allowed, and if data lengths are different from one another, the maximum length is truncated to fit to the smaller one.
  • When converting the CLOB type value to a character string, the length of the converted data cannot exceed 1 GB due to the constraints of character data types.
  • Likewise, when converting a character string to the CLOB type, the size of the converted data cannot exceed the maximum file size provided by the CLOB storage.

Note

In MySQL the equivalent data type for CUBRID CLOB is TEXT. See CUBRID vs. MySQL vs. Oracle Data Type Mapping.

BLOB Data type

BLOB is a data type used to store large binary data such as images, videos, music, etc.

CUBRID implementation specifics:

  • The maximum length of BLOB data is the maximum file size which can be created on the external storage, which depends on the OS specifics.
  • In SQL statements, the BLOB data type is compatible with the BIT(n) and BIT VARYING(n) types, and only an explicit type change is allowed. If data lengths differ from one type to another, the maximum length is truncated to fit the smaller one.
  • When converting the BLOB type value to a binary value, the length of the converted data cannot exceed 1GB.
  • When converting binary data to the BLOB type, the size of the converted data cannot exceed the maximum file size provided by the BLOB storage.

Note

The MySQL equivalent for this CUBRID data type is MySQL BLOB.

LOB Physical Storage

When LOB-type data is created and inserted, it will be stored in a file on the external storage, and the location information of the relevant file (LOB Locator) will be stored in the CUBRID database column.

Thus, LOB data is stored in the local file system of the CUBRID server. It is stored:

  • in the path specified in the -lob-base-path option value of cubrid createdb utility or, if this value is omitted,
  • the data will be stored in the db-vol path/lob path.

The following example shows how to create a database called testdb and specify "/home/data1" as a location for storing the LOB data files using the createdb utility provided by CUBRID. This command is executed in the command line terminal.

cubrid createdb --lob-base-path "file:/home1/data1" testdb

If the LOB Locator (the data in the database column) is deleted from the database, the file corresponding to the external storage will be deleted as well.

However, if the actual LOB data is accidentally (or intentionally) deleted from the physical storage, despite a LOB data file path still being registered in the database location file (databases.txt), the CUBRID services will not function properly. You should take appropriate actions to avoid the LOB files being deleted by mistake as consistency will not be guaranteed.

When storing LOB data in external storage, the following file naming convention will be used:

<table_name>_<unique_name>

Where:

  • table_name is a name of the table which will hold the LOB data information.
  • unique_name is a random name generated by CUBRID.

Using LOB Data types in SQL

Functions and Operators

By using the CAST operator, you can execute an explicit data type converstion between BLOB/CLOB and other compatible binary type/string data types.

Syntax:

CAST (<bit_type_column_or_value> AS CLOB)
CAST (<bit_type_column_or_value> AS BLOB)
CAST (<char_type_column_or_value> AS BLOB)
CAST (<char_type_column_or_value> AS CLOB)

The next table shows the SQL functions which can be used with BLOB/CLOB data types:

Function/Operator Description
CLOB_TO_CHAR(<CLOB_type_column>) Changes number data type, date/time data type, and CLOB data type to VARCHAR data type.
BLOB_TO_BIT(<BLOB_type_column>) Changes BLOB data type to VARYING BIT data type.
CHAR_TO_CLOB(<char_type_column_or_value>) Changes text string data type (CHAR, VARCHAR, NCHAR, and NVARCHAR) to CLOB data type.
BIT_TO_BLOB(<BLOB_type_column_or_value>) Changes bit array data type (BIT, VARYING BIT) to BLOB data type.
CHAR_TO_BLOB(<char_type_colulmn_or_value>) Changes text string data type (CHAR, VARCHAR, NCHAR, and NVARCHAR) to BLOB data type.
CLOB_FROM_FILE(<file_pathname>) Reads file data from the provide file path and changes to CLOB data type data.
BLOB_FROM_FILE(<file_pathname>) Reads file data from the provided file path and changes to BLOB data type data. The file path specified in is interpreted using the same method as the CLOB_FROM_FILE( ) function.
CLOB_LENGTH(<CLOB_column>) Returns the length of the LOB data stored in a CLOB file, in bytes.
BLOB_LENGTH(<BLOB_column>) Returns the length of the LOB data stored in a BLOB file, in bytes.
<BLOB_or_CLOB_column> IS NULL Use an IS NULL expression to compare whether it is a LOB column value (Locator) or a NULL; returns TRUE if NULL.

Tables structure definition

BLOB/CLOB type columns can be created/added/deleted by using a CREATE TABLE statement or an ALTER TABLE statement:

Example:

//creating a table and CLOB column
CREATE TABLE doc_t (
	doc_id VARCHAR(64) PRIMARY KEY,
	content CLOB
);

Create Table

Usage rules:

  • You cannot create an index for a LOB type column.
  • You cannot define the PRIMARY KEY, FOREIGN KEY, UNIQUE, and NOT NULL constraints for a LOB type column.
  • When creating or editing a table, you cannot define the SHARED property for them, and the DEFAULT property can only be defined by the NULL value.
  • A LOB type column/data cannot be the element of collection type data.
  • If you are deleting a record containing a LOB type column, all files located inside a LOB column value (LOB Locator) and the external (files) storage will be deleted.
  • When a record containing a LOB type column is deleted in a basic key table, and a record of a foreign key table that refers to the foregoing details is deleted at the same time, all LOB files located in a LOB column value (LOB Locator) and the external file(s) storage will be deleted. However, if the relevant table is deleted by using a DROP TABLE statement or a LOB column is deleted by using an ALTER DROP statement, only a LOB column value (LOB Locator) is deleted, and the LOB files inside the external storage to which a LOB column refers to will not be deleted.

Examples:

//an error occurs when UNIQUE constraint IS defined ON CLOB COLUMN
ALTER TABLE doc_t ADD CONSTRAINT content_unique UNIQUE(content);

UNIQUE cannot be set to LOB data type

//an error occurs when creating an INDEX ON CLOB COLUMN
CREATE INDEX ON doc_t (content);

Cannot create Index on LOB data type

//creating a TABLE AND BLOB COLUMN
CREATE TABLE image_t (
	image_id VARCHAR(36) PRIMARY KEY,
	doc_id VARCHAR(64) NOT NULL,
	image BLOB
);

//an error occurs when adding a BLOB COLUMN WITH NOT NULL constraint
ALTER TABLE image_t ADD COLUMN thumbnail BLOB NOT NULL;

Cannot set NOT NULL to LOB data type

//an error occurs when adding a BLOB COLUMN WITH DEFAULT attribute
ALTER TABLE image_t ADD COLUMN thumbnail2 BLOB DEFAULT BIT_TO_BLOB(X'010101');

Cannot set DEFAULT or SHARED on LOB data type

Querying data in SQL

When you retrieve/query a LOB type column value, the data stored in a file to which the column refers will be displayed. You can execute an explicit type change by using CAST operator, CLOB_TO_CHAR( ) function, and BLOB_TO_BIT( ) function.

Usage rules:

  • If the query is executed in CSQL, a column value (LOB Locator) will be displayed, instead of the data stored in a file. To display the “string” data in CSQL to which a BLOB/CLOB column refers, you must use the CLOB_TO_CHAR( ) function.
  • To use the string process function, the strings need to be converted by using the CLOB_TO_CHAR( ) function.
  • You cannot specify a LOB column in a GROUP BY clause and in a ORDER BY clause.
  • Comparison operators and relational operators (=, <>, IN, NOT IN, etc.) cannot be used to compare LOB columns. However, IS NULL expression can be used to compare whether it is a valid LOB column value (LOB Locator) or NULL. This means that TRUE will be returned when a column value is NULL, and if a column value is NULL, it means that there is no file storing the LOB data.
  • To perform comparison on LOB data (=, <>, etc.), the data should be converted to a string or bit array.
  • When a LOB column is created, and the corresponding file is deleted after data input, a LOB column value (LOB Locator) will be referring to an invalid file; this means that using CLOB_TO_CHAR( ), BLOB_TO_BIT( ), CLOB_LENGTH( ), and BLOB_LENGTH( ) functions on the columns that have mismatching LOB Locator and a LOB data file will return NULL.

Examples in CSQL in the command line terminal:

//displaying locator value when selecting CLOB and BLOB column in CSQL interpreter
csql> SELECT doc_t.doc_id, content, image FROM doc_t, image_t WHERE doc_t.doc_id = image_t.doc_id;
csql> ;xr

=== <Result of SELECT Command in Line 1> ===

  doc_id                content               image
==================================================================
  'doc-1'               file:/home1/data1/ces_658/doc_t.00001282208855807171_7329  file:/ home1/data1/ces_318/image_t.00001282208855809474_7474
  'doc-2'               file:/home1/data1/ces_180/doc_t.00001282208854194135_5598  file:/
home1/data1/ces_519/image_t.00001282208854205773_1215

2 rows selected.

//using string functions after coercing its type by CLOB_TO_CHAR( )
csql> SELECT CLOB_TO_CHAR(content), SUBSTRING(CLOB_TO_CHAR(content), 10) FROM doc_t;
csql> ;xr

=== <Result of SELECT Command in Line 1> ===

   clob_to_char(content)  substring( clob_to_char(content) from 10)
============================================
  'This is a Dog'       ' Dog'
  'This is a Cat'       ' Cat'

2 rows selected.

csql> SELECT CLOB_TO_CHAR(content) FROM doc_t WHERE CLOB_TO_CHAR(content) LIKE '
%Dog%';
csql> ;xr

=== <Result of SELECT Command in Line 1> ===

   clob_to_char(content)
======================
  'This is a Dog'

csql> SELECT CLOB_TO_CHAR(content) FROM doc_t ORDER BY CLOB_TO_CHAR(content)
csql> ;xr

=== <Result of SELECT Command in Line 1> ===

   clob_to_char(content)
======================
  'This is a Cat'
  'This is a Dog'

2 rows selected.

//an error occurs when LOB column specified in WHERE/ORDER BY/GROUP BY clauses
SELECT * FROM doc_t WHERE content LIKE 'This%';

SELECT * FROM doc_t ORDER BY content;

Storing and Updating Table Columns

Usage rules:

  • If a value is “entered” in a LOB column by using an INSERT SQL statement, a corresponding file will be created in an external storage and the relevant data is stored in the file; the file path (LOB Locator) is then stored in the actual column value.
  • If a record containing a LOB column is referred by a DELETE statement, the file to which the relevant LOB column refers will be deleted. If a LOB column value is changed using an UPDATE statement, the column value will be changed or not, according to whether the new value is NULL or not.
    • If a LOB type column value is changed to a value that is not NULL, if the LOB Locator that refers to an external file is already available in a LOB column, the relevant file will be deleted and a new file is created afterwards. After storing a value that is not NULL, a LOB Locator for a new file will be stored in the LOB column value.
    • If changing a LOB type column value to NULL, if a LOB Locator that refers to an external file is already available in a LOB column, the relevant file will be deleted. And afterwards a NULL is stored in the LOB column value.

Examples:

//Inserting DATA after explicit type conversion INTO CLOB type COLUMN
INSERT INTO doc_t (doc_id, content) VALUES ('doc-1', CHAR_TO_CLOB('This is a Dog'));
INSERT INTO doc_t (doc_id, content) VALUES ('doc-2', CHAR_TO_CLOB('This is a Cat'));

//Inserting DATA after explicit type conversion INTO BLOB type COLUMN
INSERT INTO image_t VALUES ('image-0', 'doc-0', BIT_TO_BLOB(X'000001'));
INSERT INTO image_t VALUES ('image-1', 'doc-1', BIT_TO_BLOB(X'000010'));
INSERT INTO image_t VALUES ('image-2', 'doc-2', BIT_TO_BLOB(X'000100'));

//Inserting DATA FROM a sub-query result
INSERT INTO image_t SELECT 'image-1010', 'doc-1010', image FROM image_t WHERE image_id = 'image-0';

//Updating CLOB COLUMN value TO NULL
UPDATE doc_t SET content = NULL WHERE doc_id = 'doc-1';

//Updating CLOB COLUMN value
UPDATE doc_t SET content = CHAR_TO_CLOB('This is a Dog') WHERE doc_id = 'doc-1';

//Updating BLOB COLUMN value
UPDATE image_t SET image = (SELECT image FROM image_t WHERE image_id = 'image-0') WHERE image_id = 'image-1';

//Deleting BLOB COLUMN value AND its referencing files
DELETE FROM image_t WHERE image_id = 'image-1010';

Transactions support

Commit/rollback for LOB data changes is supported as with other data types. This means that CUBRID ensures the validation of the mapping between a LOB Locator and the actual LOB data within transactions, and it supports recovery if needed. This means that an error will be displayed in case of mapping errors between a LOB Locator and the LOB data.

Usage rules:

  • When selecting LOB data in an application through a driver such as JDBC, the driver can get ResultSet from DB server and fetch the record while changing the cursor location on ResultSet. That is, only the LOB Locator is stored at the time ResultSet is imported, and the LOB data which is referred by a LOB File Locator will be fetched from the file Locator at the time a record is fetched.
  • Since backup/recovery is supported only for LOB type column value (LOB Locator), an error is likely to occur, if the mapping of the LOB Locator and the LOB data files is invalid.
  • If the database is operated on different machines - S1 and S2, and you want to copy LOB data from S1 to S2, you must read the LOB data from S1 and INSERT LOB on S2. The LOB column value (Locator) for S1 is valid only in the relevant/original system – S1.

Database Backup/Restore

It's important to note that LOB type storage is not restored when you backup/restore a database. CUBRID does not include the LOB type data in the database backup/restore procedure, because they do not exist in a database volume, but stored outside of the database in a separate storage. Likewise, since the LOB type storage is not backed up together during database backup, LOB type storage is not restored during restoration. LOB type storage must be maintained separately.


Using BLOB/CLOB from CUBRID APIs

JDBC

The interfaces that process LOB data in JDBC is implemented based on JDBC 4.0 specification. The current constraints of the CUBRID JDBC API are:

  • It supports sequential writes only when creating BLOB or CLOB. Writing to arbitrary locations is not supported.
  • You cannot change the data of a BLOB or CLOB by calling methods of BLOB or CLOB object which are retrieved from the ResultSet.
  • It does not support Blob.truncate, Clob.truncate, Blob.position, and Clob.position.
  • You cannot bind LOB data by calling PreparedStatement.setAsciiStream, PreparedStatement.setBinaryStream, and PreparedStatement.setCharacterStream methods of BLOB/CLOB type columns.

Note: To use BLOB/CLOB data types in an environment where JDBC 4.0 specification is not supported, such as JDBC version 1.5 or earlier, you must convert a Connection object to CUBRIDConnection, explicitly. See the example below:

// JDK 1.6 or later
import java.sql.*;

Connection conn = DriverManager.getConnection(url, id, passwd);
Blob blob = conn.createBlob();
…

// JDK 1.5 or earlier
import java.sql.*;
import cubrid.jdbc.driver.*;

Connection conn = DriverManager.getConnection(url, id, passwd);
Blob blob = ((CUBRIDConnection)conn).createBlob();
…

The recommended way to bind LOB data types is:

  • Create the java.sql.Blob or java.sql.Clob object and save the file contents in the object. Use setBlob() or setClob() methods of PreparedStatement.
  • Perform query and get a java.sql.Blob or a java.sql.Clob object from the ResultSet object. Then bind the object using PreparedStatement.

Example:

Class.forName("cubrid.jdbc.driver.CUBRIDDriver");

Connection conn = DriverManager.getConnection ("jdbc:cubrid:localhost:33000:image_db:::", "", "");
conn.setAutoCommit(false);

PreparedStatement pstmt1 = conn.prepareStatement("SELECT image FROM doc WHERE image_id = ? ");
pstmt1.setString(1, "image-21");

ResultSet rs = pstmt1.executeQuery();

while (rs.next())
{
	Blob bImage = rs.getBlob(1);

	PreparedStatement pstmt2 = conn.prepareStatement("INSERT INTO doc(image_id, doc_id, image) VALUES (?,?,?)");

	pstmt2.setString(1, "image-22")
	pstmt2.setString(2, "doc-22")
	pstmt2.setBlob(3, bImage);
	pstmt2.executeUpdate();
	pstmt2.close();
}

pstmt1.close();

conn.commit();
conn.setAutoCommit(true);
conn.close();

The recommended way to retrieve LOB data types is:

  • Get data directly from the ResultSet by using getBytes() or getString().
  • Get the java.sql.Clob object from the ResultSet by calling getBlob() or getClob() method and then get data by using getBytes() or getSubString().

Example:

Connection conn = DriverManager.getConnection ("jdbc:cubrid:localhost:33000:image_db:::", "", "");

//Getting Blob data from ResultSet and getting data from the Blob object
PrepareStatement pstmt2 = conn.prepareStatement(“SELECT image FROM image_t WHERE image_id = ?”);
pstmt2.setString(1,”image-20”);

ResultSet rs = pstmt2.executeQuery();

while (rs.next())
{
  Blob bImage = rs.getBlob(1);
  Bytes[] bArray = bImange.getBytes(1, (int)bImage.length());
}

PHP

The CUBRID 8.3.1 and newer PHP Drivers have been updated to support these new data types. The deprecated GLO-related functions were removed from the PHP Driver.

The CUBRID PHP Driver and the documentation are available at http://www.cubrid.org/wiki_apis/entry/cubrid-php-driver.

The following functions were added to support LOB data type’s manipulation:

Function Description
cubrid_lob_get Retrieve LOB data
cubrid_lob_close Close handle/Free memory
cubrid_lob_size Retrieve LOB data size
cubrid_lob_export Export data from a LOB field
cubrid_lob_send Save LOB data

The following GLO-related functions were removed:

  • cubrid_new_glo
  • cubrid_save_to_glo
  • cubrid_load_from_glo
  • cubrid_send_glo

More information about the CUBRID PHP API 8.3.1 release can be found in PHP API Release Notes.

We will present in detail how to use these functions from PHP in a separate tutorial – stay tuned!

You may also want to refer to CUBRID vs. MySQL PHP functions comparison.

CCI

It is not in the scope of this tutorial to introduce in detail the available CCI support for the new LOB data types. We will only mention that you can create LOB data and bind the data by using the following CCI functions:

  • To create LOB data file: cci_blob_new( ), cci_blob_write( ))
  • To bind LOB data: cci_bind_param( )
  • To free memory of LOB structure: cci_blob_free( )

For more information, please visit: http://www.cubrid.org/webmanual/3.1/api/api_cci_intro_lob.htm.

See also

This concludes the tutorial about the CUBRID 3.1 new data types. Please let us know your feedback and remember to periodically check the CUBRID web site for more tutorials and resources. For any issues, we always monitor @cubrid Twitter account. Feel free to leave your message there.

comments powered by Disqus
Page info
viewed 17036 times
translations en
Author
posted 2 years ago by
CUBRID
Contributors
updated 2 years ago by
View revisions
Share this article