What's New in CUBRID 10: 'String Compression'
2020/07/17, in Server


Author: Henry Kwon 


CUBRID 10 added various new features. One of the most significant is string compression. Let's take a look at it!


String compression is shown in the table below.

(Screenshot: table summarizing CUBRID string compression behavior)

CUBRID compresses only strings of 255 bytes or longer, and skips compression when it would not actually save space. To see how good the compression ratio is, we inserted 100,000 rows of test data into CUBRID 9.3 (which has no string compression) and 10.1, and compared the table sizes.

  • “Case #1” inserted non-repeating string data and checked the compression rate.
  • “Case #2” inserted highly repetitive data and checked the compression rate.
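The skip-if-not-efficient policy described above can be sketched in a few lines of Python, using zlib as a stand-in for CUBRID's internal compressor (the 255-byte threshold comes from the article; the function name and everything else here are illustrative):

```python
import zlib

THRESHOLD = 255  # per the article: strings shorter than this are never compressed

def maybe_compress(data: bytes) -> bytes:
    """Compress only when it is worth it, mirroring CUBRID's policy."""
    if len(data) < THRESHOLD:
        return data                      # too short: store as-is
    compressed = zlib.compress(data)
    if len(compressed) >= len(data):
        return data                      # compression not efficient: store as-is
    return compressed

# Repetitive data compresses well; short data is stored untouched.
print(len(maybe_compress(b"a" * 501)))   # far smaller than 501
print(len(maybe_compress(b"x" * 100)))   # 100, stored as-is
```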



Case #1

  • Insert 100,000 rows of 501-byte data (‘ajhdghfjasodfajsdfaksfapsidpf…’)
  • Check the table capacity in CUBRID 9.3 and 10.1

Case #2

  • Insert 100,000 rows of 501-byte data (‘aaaaaaaaaaaaaaaaaaaaaaaaa…’)
  • Check the table capacity in CUBRID 9.3 and 10.1


The table below shows how the 100,000 rows for each case were generated. First, a single row of test data was inserted; the remaining rows were then generated with a Cartesian product over catalog tables in an “insert ... select ...” statement.


SQL Statement

Case #1

create table str_comp( col1 string );

insert into str_comp values('aoisioeopsvnasdfpoasdfoasjd');

insert into str_comp select a.col1  from str_comp a, db_attribute b, db_attribute c, db_attribute d limit 99999;

Case #2

create table str_comp2( col1 string );

insert into str_comp2 values('aaaaaaaaaaaaaaaaaaaaaaaaaaa');

insert into str_comp2 select a.col1  from str_comp2 a, db_attribute b, db_attribute c, db_attribute d limit 99999;
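The row-count arithmetic behind these statements can be checked quickly. Starting from one seeded row, the Cartesian product of str_comp with three copies of the db_attribute catalog table yields N³ candidate rows (N being the catalog's row count, assumed here), and LIMIT 99999 caps the insert so the table ends up with exactly 100,000 rows:

```python
seed_rows = 1        # one row inserted manually first
n = 60               # assumed row count of the db_attribute catalog table
candidates = seed_rows * n ** 3          # Cartesian product of the four-way join
inserted = min(candidates, 99_999)       # LIMIT 99999 caps the insert
total = seed_rows + inserted
print(total)  # 100000
```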


We ran the SQL statements in the table above to insert the 100,000 rows, then checked the table size by running “show heap capacity of <table name>;” and reading the Num_pages value. The results are shown in the table below.



              No. of input    Num_pages (9.3)    Num_pages (10.1)    Compression rate

Case #1       100,000         (equal in both)    (equal in both)     0 %

Case #2       100,000         3,704              357                 938 %


For "Case #1", where the inserted strings did not repeat, the Num_pages value was the same in 9.3 and 10.1. Compression was skipped because it would not have reduced the size, so the original data was stored as-is and the compression rate is 0 %. For "Case #2", where highly repetitive data was inserted, the compression rate came out to 938 %.
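The 938 % figure is consistent with the page counts given in the summary at the end of the article (3,704 pages in 9.3 versus 357 in 10.1), taking the rate as the space saved relative to the compressed size:

```python
pages_93, pages_101 = 3704, 357          # Num_pages from the two versions
rate = (pages_93 - pages_101) / pages_101 * 100   # saved space vs. compressed size
print(round(rate))  # 938
```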


The CUBRID 10 documentation states that a compressed string is decompressed when it is read from the database. This raises a question: how much extra time does decompression add when extracting data?


To answer that, we wrote and ran the following script. It takes the SQL statement to execute as a parameter, runs it through csql, records the result in result_sql.txt, and prints the elapsed time (end time minus start time).



Script Content


RUN_SQL="$1"    # SQL statement passed as the first parameter

StartTime=$(date +%s.%N)

#echo $RUN_SQL
csql -u dba -c "$RUN_SQL" -o result_sql.txt test_db

EndTime=$(date +%s.%N)

run_time=`echo "$EndTime - $StartTime" | bc`
echo $run_time


The script was executed as follows, passing the SQL statement as the first argument:

sh <script name> "select * from str_comp limit 100000;"
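If a shell is not convenient, the same measurement can be sketched in Python with time.perf_counter; the csql invocation mirrors the script above (time_command is a hypothetical helper, and csql must be on PATH for the commented line to work):

```python
import subprocess
import time

def time_command(argv: list[str]) -> float:
    """Run a command and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start

# Equivalent of the shell script, assuming csql is installed:
# elapsed = time_command(["csql", "-u", "dba", "-c",
#                         "select * from str_comp limit 100000;",
#                         "-o", "result_sql.txt", "test_db"])
```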


The time required to extract the 100,000 rows on our in-house test server was measured as follows. The figures below are not absolute values and will vary with hardware, OS settings, storage I/O performance, and so on.





               Page Count    Time (sec)

CUBRID 9.3     3,704         1.04317

CUBRID 10.1    357           0.848578

Compression rate: 938 %    Time difference: -0.194592 sec



Extracting the 100,000 rows took 1.04317 seconds in CUBRID 9.3 and 0.848578 seconds in 10.1, a difference of -0.194592 seconds. In other words, reading 100,000 compressed 501-byte strings, decompression included, was about 0.19 seconds faster.



In summary, disk storage was 57 MB (3,704 pages * 16 KB page size) in 9.3 versus 5 MB (357 pages * 16 KB) in 10.1, and the query ran 0.19 seconds faster. The string compression feature added in CUBRID 10 both reduces disk usage and improves processing performance.
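Those capacity figures can be reproduced from the page counts, assuming the 16 KB page size stated above:

```python
PAGE_SIZE_KB = 16
mb_93 = 3704 * PAGE_SIZE_KB / 1024       # ~57.9 MB in CUBRID 9.3
mb_101 = 357 * PAGE_SIZE_KB / 1024       # ~5.6 MB in CUBRID 10.1
saved_sec = 1.04317 - 0.848578           # time saved on the 100,000-row scan
print(mb_93, mb_101, saved_sec)
```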

