Open Source RDBMS - Seamless, Scalable, Stable and Free

English | Login |Register

CUBRID vs. MySQL Benchmark Test Results for SNS Data and Workload

As we have recently rolled out the new 8.4.0 version of the CUBRID Database, one of our CUBRID users has approached us with a proposal to test the new CUBRID on one of their live services which currently uses MySQL as a back-end database. It is a SNS (Social Networking Service) which operates a base of over 6 million active users. Their major queries mostly comprised of IN and UNION operators, where UNION prevailed. So they wanted to see if CUBRID's IN and UNION operators perform faster or not, and which one is actually more efficient.

This article explains the entire flow of the test we have conducted to determine the performance benefit of the new CUBRID 8.4.0 over MySQL 5.1 deployed at present on their live servers. The test environment, scenarios as well as results are discussed below.

Test Database Information

The tables used in the test

  • friends
  • posts

The indexes used in the test

  • friends table
    • unique(user_id,friend_id)
  • posts table
    • primary key(id)
    • index(reg_date)
    • index(author_id,reg_date)

Scenario

  1. The following is a list of scenarios where users are separated based on the number of friends they have:
    • T1: users with 50 or less friends
    • T2: users with 51~2000 friends
    • T3: users with 2001 or more friends
    • T4: this scenario contains 40% of T1, 50% of T2, 10% of T3 users
    • T5: this scenario contains 10% of T1, 50% of T2, 40% of T3 users
  2. When the scenarios are executed one by one, we retrieve the list of all friends and their posts in one transaction.
  3. Then all transactions used in the scenarios are divided into those which use IN query and those which use UNION query.

Test User Groups

  1. In this test users are divided into three groups based on the number of friends they have. Based on the real proportion of these groups retrieved from the live SNS service, each group is stored in a separate test database table as explained below.
    • user_group_1 table
      • stores 1,442,329 user IDs who have 50 or less friends
    • user_group_2 table
      • stores 85,568 user IDs who have 51~2000 friends
    • user_group_3 table
      • stores 836 user IDs who have 2001 or more friends
  2. After each user ID is mapped with a sequentially generated ID in each of the tables, we randomly choose one of the sequentially generated ID and retrieve the mapped user ID.

The following is the SQL statements used to create and populate these user group tables.

mysql> create table user_group_1 (id int auto_increment primary key, user_id int, num_friends int);
Query OK, 0 rows affected (0.03 sec)

mysql> create table user_group_2 (id int auto_increment primary key, user_id int, num_friends int);
Query OK, 0 rows affected (0.01 sec)

mysql> create table user_group_3 (id int auto_increment primary key, user_id int, num_friends int);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into user_group_1(user_id, num_friends) select user_id, count(friend_id) from friends group by user_id having count(friend_id) < 51;
Query OK, 1442329 rows affected (38.28 sec)
Records: 1442329  Duplicates: 0  Warnings: 0

mysql> insert into user_group_2(user_id, num_friends) select user_id, count(friend_id) from friends group by user_id having count(friend_id) > 50 and count(friend_id) < 2001;
Query OK, 85568 rows affected (28.95 sec)
Records: 85568  Duplicates: 0  Warnings: 0

mysql> insert into user_group_3(user_id, num_friends) select user_id, count(friend_id) from friends group by user_id having count(friend_id) > 2000;                          
Query OK, 836 rows affected (27.89 sec)
Records: 836  Duplicates: 0  Warnings: 0

UNION Transaction

  1. Retrieve an ID of a random user from user_group_#
  2. SELECT user_id FROM user_group_1 WHERE id = ?
    
  3. Retrieve 100 friends of the user from friends
  4. SELECT friend_id FROM friends WHERE user_id = ? LIMIT 100
    
  5. Pick up some 20 post IDs by these 100 friends from posts
  6. (SELECT id, author_id, reg_date FROM posts
    WHERE author_id = ?
    ORDER BY reg_date DESC LIMIT 20)
    UNION (SELECT …)
    ORDER BY reg_date DESC LIMIT 20
    
  7. Retrieve these 20 posts of friends from posts
  8. SELECT * FROM posts 
    WHERE id = ? OR id = ? OR … OR id = ?
    ORDER BY reg_date DESC LIMIT 20
    

IN Transaction

  1. Retrieve an ID of a random user from user_group_#
  2. SELECT user_id FROM user_group_1 WHERE id = ?
    
  3. Retrieve 100 friends of the user from friends
  4. SELECT friend_id FROM friends WHERE user_id = ? LIMIT 100
    
  5. Retrieve 20 posts of friends from posts
  6. (SELECT * FROM posts
    WHERE author_id IN (?, ?, …, ?)
    ORDER BY reg_date DESC LIMIT 20)
    

Test Environment

The following are the characteristics of the test machine used in this setting.

Characteristics
OS CentOS 5.2 x86_64
CPU Xeon 2.5GHzQuad*2
HDD RAID 0+1 SAS 300G*6
Memory 8 GB
Buffer 2 GB
Execution time 10 min

Both CUBRID and MySQL databases store 3 user group tables with 1,520,000 rows in total (table 1: 1,442,329 rows, table 2: 85,568 rows, and table 3: 836 rows). Each of the databases were undergone to 40 thread loads for 10 minutes.

Test Results

The following table illustrates the test results we have obtained from this experiment.

TPS MySQL 5.1 CUBRID 8.4.0
UNION IN UNION IN
T1 223 254 55 277
T2 54 9 3 128
T3 45 1463 20 1192
T4 64 15 5 118
T5 74 15 4 176

The following is a graphical illustration of the above results, excluding T3 which is separately illustrated lower due to its scale.

SELECT … UNION/IN Tests on SNS Workloads

The graph above illustrates the test results comparison of the T1, T2, T4, T5 scenarios.

  1. From these figures we can conclude that MySQL 5.1 UNION (illustrated in light blue) in average performs 6 times faster than CUBRID 8.4.0 UNION (illustrated in light green).
  2. On the other hand CUBRID 8.4.0 IN (illustrated in green) in average performs 3 times faster than MySQL 5.1 IN (illustrated in navy blue).
  3. On the entire scale CUBRID 8.4.0 IN in average performs 2 times faster than both MySQL 5.1 UNION and IN operations.
From the graph below which illustrates the test results comparison of the T3 scenario, we can conclude that:
  1. MySQL 5.1 IN operator performs slightly better than CUBRID 8.4.0 IN operator.

SELECT … UNION/IN Tests on SNS Workloads

Statistics Comparison

The major goal of this experiment was to obtain the performance difference of IN and UNION operators in the new CUBRID 8.4.0 and MySQL 5.1. As we expected, the new CUBRID 8.4.0 can leverage its improved database engine along with its new index structure and algorithm very well.

Besides this experiment, we have conducted a comparison test of the resource usage pattern between the new CUBRID 8.4.0 and previous CUBRID 8.3.1. With the same configurations we have executed the following query.

(SELECT * FROM posts
WHERE author_id IN (?, ?, …, ?)
ORDER BY reg_date DESC LIMIT 20)

The workload was generated with 40 threads. We have been monitoring the resource usage patterns for 10 minutes. As a result the following figures were obtained (the less, the better).

CUBRID 8.4.0 CUBRID 8.4.0 without in-place sorting CUBRID 8.3.1
Num_data_page_fetches 378 2836 229316
Num_data_page_dirties 16 2279 206182
Num_data_page_ioreads 135 2103 121851
  1. In the above table we can see that the new CUBRID 8.4.0 uses much less resources than the previous version.
    1. CUBRID 8.4.0 fetches over 600 times less data pages than 8.3.1.
    2. As a result CUBRID 8.4.0 leaves 1290 times less dirty data pages than 8.3.1.
    3. Moreover, CUBRID 8.4.0 performs for over 900 times less IO operations than 8.3.1.
  2. More than that we can evidently notice that the new in-place sorting feature we have implemented in 8.4.0 has a significant impact on resource usage. It is compared in the second and third column figures obtain when in-place sorting was enabled, then disabled. In fact, it is not possible for a user to enable or disable it, as it is an internal feature in CUBRID, which is always ON. We have tested this ourselves to see the impact of the new in-place sorting feature.

Results Analysis

  1. In case of the T4 test, which is the most realistic scenario, CUBRID’s IN query has performed twice faster than MySQL UNION query and 8 times faster than MySQL IN query.
  2. In case of the T3 scenario, the performance was much higher when compared to other scenario cases because the buffer hit ratio was very close to 100%.
  3. CUBRID 2008 R4.0 performance analysis:
    1. The effect of Key limit has decreased the IO operations for about 50 times
    2. When Key limit is applied the in-place sorting is resulted in approximately 10 times decrease of the IO operations.
      1. If in-place sorting is not used, the T4 scenario results in 6 TPS which will be similar to the test results of the UNION query.
    3. In case of the UNION query, when sorting, the partial results were repeatedly created in a temp file which had an impact on the performance which also resulted in high number of dirty pages.

See also

CUBRID 8.4.0 Key Features

The new CUBRID 8.4.0 features many significant improvements which includes Performance Improvements, Developer Productivity Improvements and HA Reliab...

Increasing Database Performance by Query Tuning

It is common that applications use various kinds of SQL queries when communicating with the database server. If queries are not well structured, they wi...

NBD Benchmark Result

The NBD (NHN Internet Bulletin Board Application Database) Benchmark is a benchmark to measure the performance of a DBMS used for a Bullet...

CUBRID 8.4.0 vs. 8.3.1 Volume Space Reusability Comparison

CUBRID 2008 R4.0 has brought great achievements in overall engine performance and usability of tools. In the first of these blog posts we have a...




You are either using a very old browser or a browser that is not supported.
In order to browse cubrid.org you need to have one of the following browsers:



Internet Explorer: Mozilla Firefox: Google Chrome: