Open Source RDBMS - Seamless, Scalable, Stable and Free

한국어 | Login |Register

CUBRID Sharding with node-cubrid 2.0


In this guide we will go through the process of setting up a CUBRID SHARD environment and using the CUBRID Node.js 2.0 driver to insert and retrieve data from the shard DB nodes.

If you are not familiar with the CUBRID Node.js driver yet, check out our CUBRID Node.js Tutorials. Same goes for the CUBRID SHARD basic concepts – see CUBRID SHARD.

Setting up the CUBRID SHARD environment

The first thing we need to do is set up the CUBRID SHARD and the shard databases architecture. For simplicity we will use two shard DB nodes. 

So let’s get right to it. First, in each shard DB node we will create a database and a user account, and start the database:

  • Shard DB name: sharddb
  • Shard DB user account: shard
  • Shard DB user password: shard123

To do this, we will use the CUBRID command line tool “csql”. In each shard DB node, type the following commands:

# Creating CUBRID SHARD DB
cubrid createdb sharddb

# Creating CUBRID SHARD user account
csql -S -u dba sharddb -c "create user shard password 'shard123'"

# Starting CUBRID SHARD DB
cubrid server start sharddb

Next, we have to change the shard configurations. In the node that we choose to be the shard server we have to edit three shard configuration files located in the \CUBRID\conf directory:

  • shard.conf
  • shard_connection.txt
  • shard_key.txt

The first file, shard.conf, contains the shard broker parameters. The only thing to do here is to make sure that the following parameters are set:

SHARD_DB_NAME        = sharddb
SHARD_DB_USER        = shard
SHARD_DB_PASSWORD    = shard123

In the second file - shard_connection.txt, we will configure the network addresses of the shard DB nodes. In our setup, one node is on the same machine as the shard server, and the other one is located somewhere else in the local area network. 

# shard-id  	real-db-name  		connection-info
0           		sharddb       		localhost
1           		sharddb      		10.0.0.186

Make sure that you set the correct IP address for the second shard DB node.

The last file that we need to configure is shard_key.txt. This file configures the shard DB mapping configuration, for the shard key hash value, as follows:

  • [%shard_key]: Sets the shard key section
  • min: The minimum value range of the shard key hash results.
  • max: The maximum range of the shard key hash results.
  • shard_id: The shard identifier
[%shard_key]
#min		max		shard_id
0		127      		0
128		255     		1

This configuration means that an INSERT statement will execute on shard 0 when the shard key hash result created by default hash function is between 0 and 127, and will execute on shard 1 when the shard key hash result created by the default hash function is between 128 and 255.

We are now ready to start the CUBRID SHARD service:

cubrid shard start
@ cubrid shard start
++ cubrid shard start: success

We have one more step to complete before we can use Node.js with CUBRID SHARD – prepare our test data. 

Let’s create a table in each shard DB:
csql -C -u shard -p 'shard123' sharddb@localhost -c "create table student (s_no int, s_name varchar, s_age int, primary key(s_no))"

Using the CUBRID Node.js 2.0 driver with CUBRID SHARD

The code examples we will show in this guide use the CUBRID Node.js 2.0 driver to perform the followings operations:

  • Connect to the shard broker
  • Insert data into the shard DB nodes
  • Query the data we inserted

Inserting data:

Let’s see how to insert data into the shard DB nodes:

var CUBRID = require('node-cubrid'),
  Helpers = CUBRID.Helpers,
  ActionQueue = CUBRID.ActionQueue;

var client = CUBRID.createCUBRIDConnection('localhost', 45011, 'shard', 'shard123', 'sharddb');
ActionQueue.enqueue(
  
    function (cb) {
      client.connect(cb);
    },
    function (cb) {
      Helpers.logInfo('Connected.');
      var i = 0;
      ActionQueue.while(
        function () {
          return i < 255;
        },
        function (cb) {
          Helpers.logInfo('Executing insert number ' + i + '.');
          client.executeWithTypedParams('INSERT INTO student VALUES (/*+ shard_key */ ?,?,?)', [i, 'name_' + i, 20 + i % 8, // i % 8 to set the age of the student between 20 and 27
            'int', 'varchar', 'int', function () {
              client.commit(function () {
                i++;
                cb.call();
              });
            });
        },
        function () {
          cb.call();
        }
      );
    },
    function (cb) {
      client.close(cb);
    }
  ],
  function () {
    Helpers.logInfo('Connection closed.');
  }
);

The output is:

Connected.
Executing insert number 0.
Executing insert number 1.
Executing insert number 2.
Executing insert number 3.
(…)
Executing insert number 251.
Executing insert number 252.
Executing insert number 253.
Executing insert number 254.
Connection closed.

As you can see, there are a few differences in the above code when compared to connecting and inserting in a “normal” CUBRID database setup. The most important one is the connection port. By default, the CUBRID SHARD broker port is 45011, but you can change it in the shard.conf file.

Another key difference is in the INSERT statement that we execute. Here we have to specify the shard_key column using a shard SQL hint. It will tell the shard broker which shard DB to choose according to the setup we made in the shard_key.txt file.

Tip: For more information about shard SQL hints see Selecting a Shard DB through the Shard SQL Hint.

So, let’s see what CUBRID SHARD did when it executed these statements, by checking the shard broker log file located in \CUBRID\log\broker\proxy_log:

Select shard. (shard_id:0, key_column:shard_key_column, shard_key_id:0).
Select shard. (prev_shard_id:-1, curr_shard_id:0). context(cid:1, uid:12, is_busy:Y, is_in_tran:N, is_prepare_for_execute:N, free_on_end_tran:N, free_on_client_io_write:N, free_context:N, is_client_in_tran:Y, is_cas_in_tran:N, is_bypass_msg:N, waiting_event:((nil), -), func_code:-1, stmt_h_id:-1, stmt_hint_type:-1, wait_timeout:30, client_id:1, shard_id:-1, cas_id:-1, error_ind:0, error_code:0, error_msg:- ).
As expected, the CUBRID SHARD broker selected our first shard node and ran the statement on it. It is the same for the first 127 queries. So what about query number 128 and next?
Select shard. (shard_id:1, key_column:shard_key_column, shard_key_id:4).
Select shard. (prev_shard_id:-1, curr_shard_id:1). context(cid:1, uid:12, is_busy:Y, is_in_tran:N, is_prepare_for_execute:N, free_on_end_tran:N, free_on_client_io_write:N, free_context:N, is_client_in_tran:Y, is_cas_in_tran:N, is_bypass_msg:N, waiting_event:((nil), -), func_code:-1, stmt_h_id:-1, stmt_hint_type:-1, wait_timeout:30, client_id:1, shard_id:-1, cas_id:-1, error_ind:0, error_code:0, error_msg:- ).
For the 128th INSERT statement (“INSERT INTO student VALUES (/*+ shard_key */ 128,’name_128’,20)”) the broker selected the other shard node since according to our shard_key.txt configuration, INSERT statements with the key column value (here s_no = 128) bigger than 127 will execute on the second shard DB node.

Querying the CUBRID SHARD DB nodes

In the next section we will learn how to retrieve the data from our shard DB nodes. The only thing that is different here is that CUBRID SHARD uses the (old) 8.4.x TCP protocol for query execution, so before we can run a query we have to tell Node.js driver to use this protocol by calling the setEnforceOldQueryProtocol() method. Now we can execute our queries, again using the shard SQL hints:

var CUBRID = require('node-cubrid'),
  Helpers = CUBRID.Helpers,
  ActionQueue = CUBRID.ActionQueue,
  Result2Array = CUBRID.Result2Array;

var client = CUBRID.createCUBRIDConnection('localhost', 45011, 'shard', 'shard123', 'sharddb');
ActionQueue.enqueue(
  
    function (cb) {
      client.connect(cb);
    },
    function (cb) {
      client.setEnforceOldQueryProtocol(true);
      Helpers.logInfo('Querying: select * from student /*+ shard_id(0) */');
      client.query('select * from student /*+ shard_id(0) */', cb);
    },
    function (result, queryHandle, cb) {
      Helpers.logInfo('Resutls from the first shard (shard_id = 0):');
      Helpers.logInfo('Query result rows count: ' + Result2Array.TotalRowsCount(result));
      var arr = Result2Array.RowsArray(result);
      for (var j = 0; j < arr.length; j++) {
        Helpers.logInfo(arr[j.toString());
      }
      client.closeQuery(queryHandle, cb);
    },
    function (queryHandle, cb) {
      client.close(cb);
    }
  ],
  function () {
  }
);

Here we used the /*+ shard_id(0) */ SQL hint to specify which shard DB we are targeting. As expected, we have retrieved the data from the first shard DB node:

Querying: select * from student /*+ shard_id(0) */
Results from the first shard (shard_id = 0):
Query result rows count: 128
0,name_0,20
1,name_1,21
2,name_2,22
3,name_3,23
(…)
124,name_124,24
125,name_125,25
126,name_126,26
127,name_127,27

To get the data from the other shard DB the only thing we have to change is the parameter of the SQL hint to 1: /*+ shard_id(1) */ and we will get the desired results:

Querying: select * from student /*+ shard_id(1) */
Results from the first shard (shard_id = 1):
Query result rows count: 127
128,name_128,20
129,name_129,21
130,name_130,22
131,name_131,23
(…)
251,name_251,23
252,name_252,26
253,name_253,25
254,name_254,26

What if we want to get data from an invalid shard id, for example if we run “select * from student /*+ shard_id(2) */”? The shard broker will obviously return an error. So let’s look at the log file after we ran this query:

Create new sql statement. (index:0). statement(index:0, num_alloc:1, stmt_h_id:0, status:1, context id:0, context uid:1, num pinned:0, lru_next:(nil), lru_prev:(nil), sql_stmt:select * from student /*+ shard_id(2) */). context(cid:0, uid:1, is_busy:Y, is_in_tran:N, is_prepare_for_execute:N, free_on_end_tran:N, free_on_client_io_write:N, free_context:N, is_client_in_tran:Y, is_cas_in_tran:N, is_bypass_msg:N, waiting_event:((nil), -), func_code:-1, stmt_h_id:-1, stmt_hint_type:-1, wait_timeout:30, client_id:0, shard_id:-1, cas_id:-1, error_ind:0, error_code:0, error_msg:- ).
Invalid shard/CAS id is requested. (shard_id:2, cas_id:-1).
send error(msg) to the client. (error_ind:-1, error_code:-1001, errro_msg:-)

References and links

We hope you enjoyed using the CUBRID Node.js 2.0 driver in a sharding environment!

Please let us know your feedback and suggestions, so we can improve! And do not forget, tryout the CUBRID Node.js 2.0 driver!

comments powered by Disqus
Page info
viewed 2701 times
translations en
Author
posted last year by
CUBRID
Contributors
updated 11 months ago by
View revisions
Share this article