Open Source RDBMS - Seamless, Scalable, Stable and Free

한국어 | Login |Register

Important Facts to Know about CUBRID


All database systems are different one way or another. But utilizing best practices and recommendations are equally important on all systems. To improve your experience with CUBRID Database, we would like to provide you with a list of important facts about CUBRID. Learning them will help you better understand CUBRID, better tune the server configurations, better optimize your SQL queries and therefore improve the performance of your applications.

The below list is not exhaustive. Gradually we will be adding more tips. If you have additional recommendations based on your experience which you think would be helpful to other CUBRID users, please write them in the comments below. We will add your recommendations to this list.

Naming conventions

The name of this DBMS is CUBRID, written in capital letters, and not Cubrid. We would appreciate much if you followed this naming conventions. It should be fairly simple to remember, isn't it!?

License

CUBRID is distributed under the terms of GPL v2+. However its APIs are available under New BSD. Reason? Give users complete freedom over how they use the APIs and their source code.

We want to ensure that the new features, fixes and other improvements made by third-party developers to the Core Engine are open so that all CUBRID community users could benefit. By distributing the client libraries under the terms of BSD we want to provide OEMs, VARs, and ISVs full control of their products. Developers embedding CUBRID APIs in their applications are required neither to open their source code nor to purchase the commercial license (actually we do not have one). This is one of the most essential benefits developers can experience with CUBRID.

SQL in CUBRID

SQL syntax in CUBRID is very similar to MySQL and even Oracle because CUBRID complies with SQL-92 Standard. In fact, CUBRID provides over 90% SQL compatibility with MySQL. You can execute most of your existing SQL statements in CUBRID without any modifications. With every new release we add more functions and data types to increase MySQL and Oracle compatibility.

Despite almost full SQL compatibility, there are some syntax elements which are different in CUBRID:

  • CREATE TABLE IF NOT EXISTS syntax is not supported yet.
  • Though CUBRID fully supports ALTER TABLE ... RENAME TO as does MySQL, you can use a much cleaner and easier syntax RENAME TABLE ....

An interesting note is that CUBRID supports Hierarchical Queries compatible with Oracle syntax.

CUBRID is not a fork

CUBRID is not a fork of MySQL (or any other DBMS, for that matters), therefore it does not have anything similar like Storage Engines, e.g. ENGINE=InnoDB part in CREATE TABLE statement. So, if your SQL includes it, just remove it.

Reserved Words

Take a quick look at the reserved words in CUBRID. If your project uses any of them as table or column names, you need to always quote them using backticks (``), double quotes ("") or square brackets ([]), whenever they are referenced.

Besides these, there are some unescapable reserved words. They are the data type name such as INT, VARCHAR etc. You cannot use these words as identifiers, i.e. as names of your tables. The reason lies behind CUBRID's architecture. As you know CUBRID is a relational database but it also supports objects. This allows users to create tables where columns' data type can be another table. See the following example:

create table t1 (a int);
create table t2 (b t1);

Here t1 table can be a column type of column b in table t2. So considering this feature, look at this example below:

create table "int" (a int);
create table "t2 (b int);

If we allow a table to have a name like "int", then we have a problem: what is the type of column b in t2? Is it the table "int", or just the basic type INT?

This creates an ambiguity. This is why in CUBRID data type names are not allowed to be table names.

Database in CUBRID

  1. Every database in CUBRID must be explicitly started before working with them. By default they are stopped. See Start or Autostart CUBRID Database for more info.
  2. As of version 8.4.3, database names in CUBRID should not exceed 17 characters.
  3. CUBRID does not allow to create or drop a database from within the SQL query. Command line cubrid createdb, cubrid deletedb utilities or CUBRID GUI Tools should be used to accomplish this task.

Data Types

As of version 9.0.0 CUBRID does not support UNSIGNED, BOOL data types, if you want to use them. ENUM is supported since 9.0.0. The rest will be implemented in the future versions. See CUBRID vs. MySQL vs. Oracle Data Type Mapping for more information on recommended data type mapping.

In 8.4.1. if you set SYS_TIMESTAMP as a DEFAULT value of a TIMESTAMP column when creating a table, the TIMESTAMP value at the point of CREATE TABLE will be set as default, NOT the time at which the data is INSERTed. Therefore, in order to set the current time and not the time when the table was created, you must specify the SYS_TIMESTAMP value for the VALUES of the INSERT statement when entering data.

In 9.0.0 SYS_TIMESTAMPSYSDATE set as DEFAULT values will be based on the time/date of the query execution.

User Defined Data Types

In CUBRID users can create columns with custom data type. One table can be a data type for a column in another table. See an example given in the Reserved Words section above.

Data Type Boundaries

Also pay attention to permitted maximum and minimum values for data types.

Large Objects

  1. In CUBRID Large Objects (BLOB/CLOB) are stored outside the database on the external storage while the references to those files are stored in database columns.
  2. Since they are stored on the external disk, LOB data cannot be indexed. See the manual for more info.
  3. Likewise, UNIQUE index cannot be created.
  4. Columns with LOB data type cannot be NOT NULL and cannot have DEFAULT value.

Indexing in CUBRID

CUBRID uses B+tree indexes to improve search performance

  1. Works great for equality operator (=).
  2. Since indexes are represented as trees, Key range scan allows to significantly improve the search performance if WHERE clause contains range conditions (<, >, <=, >=, =). If Range conditions are not defined, the Optimizer will attempt to perform sequential table scan.
  3. Try to avoid using irregular conditions such as <>, !=, or NULL, since the Optimizer will not be able to take the full advantage of the index tree, and instead will perform a sequential scan.
  4. Covering Index is magic. Use it to improve the search performance.
  5. If possible, utilize LIMIT clauses to take advantage of Key limit optimizations.
  6. Learn about In-place Sorting feature in CUBRID to improve the performance of ORDER BY statements.
  7. Use COUNT(*) instead of COUNT(col_name), unless you really know what you are doing.
  8. Index names in CUBRID are unique per table. For better readability, you can indicate in an index name the name of the table the index belongs to like pk_tableName_col1Name_col2Name_etc.
  9. LOB data cannot be indexed in CUBRID.
  10. Avoid creating indexes on columns where values can exceed 16KB. For more information refer to Page Storage Structure section of How SQL UPDATE is performed in CUBRID RDBMS.
  11. Try to avoid low-selectivity edges (eg. indexes on columns where there are very few distinct values like gender). Since in CUBRID indexes are stored in B+tree, in low-selectivity case, same value records (eg. records where gender = 'M') will be stored in the same index node. If there are too many records which cannot fit into one page, CUBRID will need to maintain Overflow OID page for them. If the number of separate overflow pages increase, it will affect the performance as there will be a burden to navigate between these additional links. For more, refer to Overflow OID section in the same How SQL UPDATE is performed in CUBRID RDBMS blog.
  12. If you have created an index and would like to UPDATE/DELETE/INSERT a new data, try to use random inserts to avoid key locking. Refer to Key Lock section of How SQL UPDATE is performed in CUBRID RDBMS blog.

Click Counter

Take advantage of CUBRID’s unique Click Counter feature. For example, instead of executing two separate SQL statements like:

SELECT article FROM article_table WHERE article_id = 130,987;
UPDATE article_table SET read_count = read_count + 1 WHERE article_id = 130,987;

… in CUBRID you can execute only one:

SELECT article, INCR(read_count) FROM article_table WHERE article_id = 130,987

This will help you avoid expensive lock of the working record generated by UPDATE operation.

AUTO_INCREMENT

AUTO_INCREMENT in CUBRID is implemented as SERIALS which are not affected by transaction rollback. This is different from how it is implemented in MySQL. In practice, developers do not care about the implementation details. But with SERIALS implementation you can, for example, control the step of the increment or set the initial value. It is good to know what you can do in CUBRID. We have blogged about this in details in Things to Understand When Moving from MySQL to CUBRID. Please refer to this document.

Also, since AUTO_INCREMENT is implemented as SERIALS, when you manually provide a value for AUTO_INCREMENT column, instead of letting CUBRID to set auto incremented value, the SERIAL does not get incremented. For example, assume there is an empty table with an id AUTO_INCREMENT column. Then a user executes the following SQL:

INSERT INTO tbl_posts (id, title) VALUES (100, 'sometime');

Next time if a user tries to INSERT without providing the AUTO_INCREMENT value, then the AUTO_INCREMENT serial will start from 1 and continue. This may cause a problem one day when AUTO_INCREMENT serial reaches 100. It will generate an error saying a UNIQUE key violation, because a record with such value already exists.

So, the recommendation is to always use built-in serial or always set AUTO_INCREMENT column value manually.

Foreign Keys

Foreign keys in CUBRID must reference PRIMARY KEY columns of referenced tables. In other words, you cannot reference non-primary key columns even if they are indexed. Also, if the parent table has a compound primary key, i.e. PK that consists of multiple columns, all these columns must be referenced in the FOREIGN KEY and in the same order. Referencing only some of them is not allowed. How can a child record have multiple referenced parent records after all?

When querying in SQL you can define relations without using JOIN syntax but directly using OID (Object Identifiers, physical addressess of records on the disk). We have written a great article which explains OIDs. Refer to How data is stored in CUBRID RDBMS? Intro to Objects, Classes, OID and Inheritance for more information.

User Management

  • In CUBRID, the smallest grant unit of authorization is a table. This means that CUBRID does not support authorizations at the column level. Refer to Introduction to CUBRID Security for more information.
  • Also, in CUBRID there is a concept of users and groups. Any user can represent a group which will mean that any group member will inherit all the privileges of this group. The parent user which became a group is still a user, i.e. you can use it to connect. Thus, grouping in CUBRID allows to delegate all the privileges of one user to other users.
  • To learn more, we recommend to look at user management related questions and answers.

Native High-Availability feature

Did you know that CUBRID has its own built-in High-Availability feature? Oh, yes, it does. In fact, CUBRID HA is one of the main reasons why Enterprise solution providers choose CUBRID over other open source RDBMS. It is based on transaction log replication, and provides very accurate and predictable fail-over and fail-back features based on CUBRID Hearbeat. You can configure HA environment in three copy modes for transaction log replication: synchronous, semi-synchronous, and asynchronous modes.

With CUBRID HA, it is possible to build multi Master:Slave services. There is also Master:Slave:Replica configuration. This configuration will provide much lower networking and disk I/O loads for heavy read loaded Web services, than the single master and multiple slaves configuration does.

For small size Web services whose primary goal is to lower the cost, developers can configure multiple slaves in a single server. This configuration will allow to gather all (or some) slave databases on a single host. Thus, it will be possible to allocate a single physical machine for multiple slave databases, each of which are separately replicated from their own master databases. Refer to OSCON conference presentation material to see how you can develop your service architecture.

To learn more about CUBRID HA, read Overview of New High-Availability Features in CUBRID 8.4.0 as well as How CUBRID HA Works? Also you are highly encouraged to view our OSCON and RIT++ conference presentation materials. You will find very interesting information about CUBRID there.

If you want to try CUBRID HA, see how to Configure CUBRID HA with Vagrant and Chef Cookbook under 4 minutes.

Native Database Sharding feature

This is something no open source RDBMS can offer you for free. We have developed a native CUBRID SHARD feature which will be released with CUBRID 8.4.3 in October, 2012. All open source! All completely free!

Refer to Database Sharding with CUBRID to get a sneak peek of how will CUBRID SHARD work. We have also posted a blog about the difference between CUBRID SHARD and Spock Proxy for MySQL. We will publish the manual how to setup CUBRID SHARD along with the official announcement. Stay tuned!

IPv6 Support

As of version 9.1 CUBRID does not support IPv6. Only IPv4 is supported. We have not receivied any request to add support for IPv6 so far. If you are interested, please create a feature request at http://jira.cubrid.org/browse/ENGINE.

TravisCI, a CI platform for open source projects, uses only IPv6 on their VMs. So, to install and use CUBRID on TravisCI machines, there is a workaround. All that is necessary is to point the default hostname of the machine to 127.0.0.1. This will enable CUBRID to listen and receive requests properly. There is a working example we have submitted to Propel2 PHP ORM project on Github.

comments powered by Disqus
Page info
viewed 39459 times
translations en
Author
posted 2 years ago by
CUBRID
Contributors
updated last year by
View revisions
Share this article