Replication must be rebuilt in CUBRID HA when data in the CUBRID HA group becomes inconsistent, either because of multiple failures in a multiple-slave node structure or because of a generic error. Replication is rebuilt in CUBRID HA through the ha_make_slavedb.sh script. With the cubrid applyinfo utility, you can check the replication progress; however, it does not detect replication inconsistency. To determine reliably whether replication is inconsistent, you must examine the data of the master and slave nodes yourself.
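One rough way to examine the data yourself is to compare per-table row counts collected on each node. The sketch below assumes the counts have already been exported to two text files with one "table_name row_count" pair per line; the helper name, the file layout, and the export step are all assumptions, not part of CUBRID. Matching counts do not prove that the row contents are identical.

```shell
#!/bin/sh
# Hypothetical helper: report tables whose row counts differ between two nodes.
# Each input file holds one "table_name row_count" pair per line.
compare_counts() {
    master_file=$1
    slave_file=$2
    awk 'NR == FNR { master[$1] = $2; next }      # first file: remember master counts
         { seen[$1] = 1 }                         # second file: note every slave table
         !($1 in master)  { print $1 ": missing on master"; next }
         master[$1] != $2 { print $1 ": master=" master[$1] " slave=" $2 }
         END { for (t in master) if (!(t in seen)) print t ": missing on slave" }' \
        "$master_file" "$slave_file"
}
```

For example, `compare_counts master_counts.txt slave_counts.txt` prints nothing when every table has the same count in both files.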
For rebuilding replications, the following environment must be the same in the slave, master, and replica nodes.
To rebuild replications, use the ha_make_slavedb.sh script. This script is located in $CUBRID/share/scripts/ha. Before rebuilding replications, the following items must be configured for the environment of the user. This script has been supported since version 2008 R2.2 Patch 9, and its configuration differs from that of 2008 R4.1 Patch 2 or earlier. This document describes the script in CUBRID 2008 R4.1 Patch 2 or later.
The following are optional items:
Once the script has been configured, execute the ha_make_slavedb.sh script on the slave node where replication will be rebuilt. The script rebuilds replication in a number of steps. To move to the next step, the user must enter an appropriate value. The following are the descriptions of available values.
To replicate, you must copy the physical image of the database volume in the target node to the database of the node to be replicated. However, cubrid unloaddb backs up only logical images so replication using cubrid unloaddb and cubrid loaddb is unavailable. Because cubrid backupdb backs up physical images, replication is possible by using this utility. The ha_make_slavedb.sh script performs replication by using cubrid backupdb.
The following example uses the original node as the master node and rebuilds a slave node from it.

Rebuilding replications can be performed while the master node is running; however, it is recommended to do this when there are only a few transactions per hour, in order to minimize replication delay.
Before starting to rebuild replications by executing the ha_make_slavedb.sh script, stop the HA service of the slave node and configure the ha_make_slavedb.sh script as shown below. Set target_host to the host name of the master node to replicate from (nodeA), and set repl_log_home to the home directory of the replication log (default value: $CUBRID_DATABASES).
[nodeB]$ cubrid heartbeat stop
[nodeB]$ cd $CUBRID/share/scripts/ha
[nodeB]$ vi ha_make_slavedb.sh
target_host=nodeA
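If you prefer not to edit the script in vi, the same setting can be applied non-interactively. The helper below is a sketch, not part of CUBRID; it assumes each setting sits on a line that starts with "NAME=", as the target_host line above does.

```shell
#!/bin/sh
# Hypothetical helper: set a NAME=value line in a script without an editor.
# Assumes the variable is assigned on a line that starts with "NAME=".
set_script_var() {
    file=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    # Rewrite only lines beginning with NAME=; copy everything else unchanged.
    # Writing back with cat keeps the permissions of the original file.
    awk -v n="$name" -v v="$value" \
        'index($0, n "=") == 1 { print n "=" v; next } { print }' \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

For example, `set_script_var ha_make_slavedb.sh target_host nodeA` has the same effect as the manual edit shown above.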
After configuration, execute the ha_make_slavedb.sh script on the slave node.
[nodeB]$ cd $CUBRID/share/scripts/ha
[nodeB]$ ./ha_make_slavedb.sh
If an error occurs during step-by-step execution, or if you restart the script after stopping it by entering n, you can enter s to skip the steps that have already succeeded and proceed to the next step.
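The y/n/s behavior can be pictured with the small sketch below. It is not taken from ha_make_slavedb.sh; the function names are hypothetical and only illustrate how answers map to actions when a run is resumed.

```shell
#!/bin/sh
# Hypothetical sketch of the step prompt; not the real script's code.
step_action() {
    case $1 in
        y|Y|yes)  echo run  ;;       # execute this step
        n|N|no)   echo stop ;;       # stop the script here
        s|S|skip) echo skip ;;       # step already succeeded; go to the next one
        *)        echo ask-again ;;  # anything else is not a valid answer
    esac
}

# Walk through numbered steps, one answer per step.
run_steps() {
    step=1
    for answer in "$@"; do
        case $(step_action "$answer") in
            run)  echo "step $step: executed" ;;
            skip) echo "step $step: skipped" ;;
            stop) echo "stopped before step $step"; return 0 ;;
            *)    echo "step $step: unrecognized answer '$answer'"; return 1 ;;
        esac
        step=$((step + 1))
    done
}
```

Calling `run_steps s s y n` models a restarted run in which steps 1 and 2 are skipped, step 3 is executed, and the user stops at step 4.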
##### step 1 ###################################################################
#
# get HA/replica user password and DBA password
#
# * warning !!!
# - Because ha_make_slavedb.sh use expect (ssh, scp) to control HA/replica node,
# the script has to know these passwords.
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
Enter the password of a Linux account of the HA node and the password of DBA, the CUBRID database account. If you have not changed the password of DBA after installing CUBRID, press the <Enter> key without entering the password of DBA.
HA/replica cubrid_usr's password :
HA/replica cubrid_usr's password :
testdb's DBA password :
Retype testdb's DBA password :
##### step 2 ###################################################################
#
# ha_make_slavedb.sh is the script for making slave database more easily
#
# * environment
# - db_name : testdb
#
# - master_host : nodeA
# - slave_host : nodeB
# - replica_hosts :
#
# - current_host : nodeB
# - current_state : slave
#
# - target_host : nodeA
# - target_state : master
#
# - repl_log_home : /home/cubrid_usr/CUBRID/databases
# - backup_dest_path : /home/cubrid_usr/.ha/backup
# - backup_option :
# - restore_option :
#
# * warning !!!
# - environment on slave must be same as master
# - database and replication log on slave will be deleted
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
##### step 3 ###################################################################
#
# copy scripts to master node
#
# * details
# - scp scripts to '~/.ha' on nodeA(master).
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
[nodeB]$ tar -zcf ha.tgz ha
[nodeA]$ rm -rf /home/cubrid_usr/.ha
cubrid_usr@nodeA's password:
Connection to nodeA closed.
[nodeB]$ scp -l 131072 -r CUBRID/share/scripts/ha/../ha.tgz nodeA:/home1/brightest
cubrid_usr@nodeA's password:
ha.tgz
10KB 10.2KB/s 00:00
[nodeA]$ tar -zxf ha.tgz
cubrid_usr@nodeA's password:
Connection to nodeA closed.
[nodeA]$ mv ha /home/cubrid_usr/.ha
cubrid_usr@nodeA's password:
Connection to nodeA closed.
[nodeA]$ mkdir /home/cubrid_usr/.ha/backup
cubrid_usr@nodeA's password:
Connection to nodeA closed.
To skip the password entry while executing the scp command, keep the SSH private key on the slave node and register the matching public key on the master node. For more details, see How to Use ssh-keygen for Linux.
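A minimal sketch of that key setup, run on the slave node, could look like this. It assumes OpenSSH (ssh-keygen and ssh-copy-id) is installed; the function name, the default key path, and the empty passphrase are illustrative choices, not requirements of CUBRID.

```shell
#!/bin/sh
# Sketch of key-based SSH setup, run on the slave node (nodeB).
# Assumptions: OpenSSH is installed and the key path below is free to use.
setup_ssh_key() {
    remote=$1                      # e.g. cubrid_usr@nodeA
    key=${2:-$HOME/.ssh/id_rsa}    # private key stays on the slave node

    # Create a key pair only if one does not exist yet (empty passphrase).
    [ -f "$key" ] || ssh-keygen -t rsa -N "" -f "$key"

    # Append the public key to ~/.ssh/authorized_keys on the master node.
    # This asks for the remote password one last time.
    ssh-copy-id -i "$key.pub" "$remote"
}
```

Usage on nodeB would be `setup_ssh_key cubrid_usr@nodeA`; afterwards `ssh cubrid_usr@nodeA true` should return without a password prompt.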
##### step 4 #####################################
#
# copy scripts to replication node
#
# * details
# - scp scripts to '~/.ha' on replication node.
#
##################################################
continue ? ([y]es / [n]o / [s]kip) : y
There is no replication server to copy scripts.
##### step 5 ###################################################################
#
# check environment of all ha node
#
# * details
# - test $CUBRID == /home1/ha_qaf/CSUS-7524_Apricot
# - test $CUBRID_DATABASES == /home1/ha_qaf/DB
# - test -d /home1/ha_qaf/DB/demodb
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
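The three tests listed in step 5 can be reproduced by hand on any node before running the script. The function below is a hypothetical re-creation of those checks, not the script's own code; its name and arguments are illustrative.

```shell
#!/bin/sh
# Hypothetical re-creation of the three checks listed in step 5.
# Arguments are the values expected on every node.
check_ha_env() {
    expected_cubrid=$1
    expected_databases=$2
    db_name=$3
    status=0

    [ "$CUBRID" = "$expected_cubrid" ] \
        || { echo "CUBRID differs: $CUBRID"; status=1; }
    [ "$CUBRID_DATABASES" = "$expected_databases" ] \
        || { echo "CUBRID_DATABASES differs: $CUBRID_DATABASES"; status=1; }
    [ -d "$expected_databases/$db_name" ] \
        || { echo "missing database directory: $expected_databases/$db_name"; status=1; }

    return $status
}
```

Running it with the master's values on the slave, for example `check_ha_env "$CUBRID" "$CUBRID_DATABASES" testdb` with the master's paths substituted, reports each mismatch and returns a non-zero status if any check fails.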
##### step 6 ###################################################################
#
# suspend copylogdb/applylogdb on master if running
#
# * details
# - deregister copylogdb/applylogdb on nodeA(master).
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
[nodeA]$ sh /home/cubrid_usr/.ha/functions/ha_repl_suspend.sh -l /home/cubrid_usr/CUBRID/databases -d testdb -h nodeB -o /home/cubrid_usr/.ha/repl_utils.output
cubrid_usr@nodeA's password:
[nodeA]$ cubrid heartbeat deregister 9408
suspend: (9408) cub_admin copylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeB -m sync testdb@nodeB
[nodeA]$ cubrid heartbeat deregister 9410
suspend: (9410) cub_admin applylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeB --max-mem-size=300 testdb@localhost
3. heartbeat status on nodeA(master).
[nodeA]$ cubrid heartbeat list
@ cubrid heartbeat list
HA-Node Info (current nodeA, state master)
Node nodeB (priority 2, state unknown)
Node nodeA (priority 1, state master)
HA-Process Info (master 8362, state master)
Copylogdb testdb@nodeB:/home/cubrid_usr/CUBRID/databases/testdb_nodeB (pid 9408, state deregistered)
Server testdb (pid 9196, state registered_and_active)
Connection to nodeA closed.
Wait for 60s to deregister coppylogdb/applylogdb.
............................................................
##### step 7 ###################################################################
#
# remove old copy log of slave and init db_ha_apply_info on master
#
# * details
# - remove old copy log of slave
# - init db_ha_apply_info on master
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
- 1. remove old copy log.
[nodeA]$ rm -rf /home/cubrid_usr/CUBRID/databases/testdb_nodeB/*
cubrid_usr@nodeA's password:
Connection to nodeA closed.
- 2. init db_ha_apply_info.
[nodeA]$ csql -C -u dba --sysadm testdb@localhost -c "delete from db_ha_apply_info where db_name='testdb'"
cubrid_usr@nodeA's password:
Connection to nodeA closed.
[nodeA]$ csql -C -u dba --sysadm testdb@localhost -c "select * from db_ha_apply_info where db_name='testdb'"
cubrid_usr@nodeA's password:
=== <Result of SELECT Command in Line 1> ===
There are no results.
Connection to nodeA closed.
##### step 8 ###################################################################
#
# remove old copy log of slave and init db_ha_apply_info on replications
#
# * details
# - remove old copy log of replica
# - init db_ha_apply_info on master
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
There is no replication server to init ha_info
##### step 9 ###################################################################
#
# online backup database on master
#
# * details
# - run 'cubrid backupdb -C -D ... -o ... testdb@localhost' on master
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
[nodeA]$ cubrid backupdb -C -D /home/cubrid_usr/.ha/backup -o /home/cubrid_usr/.ha/backup/testdb.bkup.output testdb@localhost
cubrid_usr@nodeA's password:
Backup Volume Label: Level: 0, Unit: 0, Database testdb, Backup Time: Thu Apr 19 18:52:03 2012
Connection to nodeA closed.
[cubrid_usr@nodeA]$ cat /home/cubrid_usr/.ha/backup/testdb.bkup.output
cubrid_usr@nodeA's password:
[ Database(testdb) Full Backup start ]
- num-threads: 2
- compression method: NONE
- backup start time: Thu Apr 19 18:52:03 2012
- number of permanent volumes: 1
- HA apply info: testdb 1334739766 715 8680
- backup progress status
-----------------------------------------------------------------------------
volume name | # of pages | backup progress status | done
-----------------------------------------------------------------------------
testdb_vinf | 1 | ######################### | done
testdb | 6400 | ######################### | done
testdb_lgar000 | 6400 | ######################### | done
testdb_lgar001 | 6400 | ######################### | done
testdb_lginf | 1 | ######################### | done
testdb_lgat | 6400 | ######################### | done
-----------------------------------------------------------------------------
# backup end time: Thu Apr 19 18:52:06 2012
[ Database(testdb) Full Backup end ]
Connection to nodeA closed.
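The "HA apply info" line in the backup output is what step 12 later reads. As a sketch of how the values could be pulled out, the helper below prints the four fields; the function name is hypothetical, and the mapping of the four values to db_name, db_creation, pageid, and offset is inferred from the field order shown in step 12.

```shell
#!/bin/sh
# Hypothetical parser for the "HA apply info" line of a backupdb output file.
# Assumed field order (from step 12): db_name, db_creation, pageid, offset.
parse_ha_apply_info() {
    awk '/HA apply info:/ {
             # The four values are the last four fields on the line.
             print "db_name=" $(NF-3)
             print "db_creation=" $(NF-2)
             print "pageid=" $(NF-1)
             print "offset=" $NF
             exit
         }' "$1"
}
```

For example, `parse_ha_apply_info /home/cubrid_usr/.ha/backup/testdb.bkup.output` reads the file shown above.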
##### step 10 ###################################################################
#
# copy testdb databases backup to current host
#
# * details
# - scp databases.txt from target host if there's no testdb info on current host
# - remove old database and replication log if exist
# - make new database volume and replication path
# - scp database backup to current host
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
- 1. check if the databases information is already registered.
- thres's already testdb information in /home/cubrid_usr/CUBRID/databases/databases.txt
[nodeB]$ grep testdb /home/cubrid_usr/CUBRID/databases/databases.txt
testdb /home/cubrid_usr/CUBRID/databases/testdb nodeA:nodeB /home/cubrid_usr/CUBRID/databases/testdb/log file:/home/cubrid_usr/CUBRID/databases/testdb/lob
- 2. get db_vol_path and db_log_path from databases.txt.
- 3. remove old database and replication log.
[nodeB]$ rm -rf /home/cubrid_usr/CUBRID/databases/testdb/log
[nodeB]$ rm -rf /home/cubrid_usr/CUBRID/databases/testdb
[nodeB]$ rm -rf /home/cubrid_usr/CUBRID/databases/testdb_*
- 4. make new database volume and replication log directory.
[nodeB]$ mkdir -p /home/cubrid_usr/CUBRID/databases/testdb
[nodeB]$ mkdir -p /home/cubrid_usr/CUBRID/databases/testdb/log
[nodeB]$ mkdir -p /home/cubrid_usr/.ha
[nodeB]$ rm -rf /home/cubrid_usr/.ha/backup
[nodeB]$ mkdir -p /home/cubrid_usr/.ha/backup
- 5. copy backup volume and log from target host
cubrid_usr@nodeA's password:
testdb_bkvinf 100% 49 0.1KB/s 00:00
cubrid_usr@nodeA's password:
testdb_bk0v000 100% 1540MB 7.8MB/s 03:18
testdb.bkup.output 100% 1023 1.0KB/s 00:00
##### step 11 ###################################################################
#
# restore database testdb on current host
#
# * details
# - cubrid restoredb -B ... testdb current host
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
[nodeB]$ cubrid restoredb -B /home/cubrid_usr/.ha/backup testdb
##### step 12 ###################################################################
#
# set db_ha_apply_info on slave
#
# * details
# - insert db_ha_apply_info on slave
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
1. get db_ha_apply_info from backup output(/home/cubrid_usr/.ha/backup/testdb.bkup.output).
- dn_name : testdb
- db_creation : 1334841057
- pageid : 78
- offset : 7912
- log_path : /home/cubrid_usr/CUBRID/databases/testdb_nodeA
2. select old db_ha_apply_info.
[nodeB]$ csql -u dba -S testdb -l -c "SELECT db_name, db_creation_time, copied_log_path, page_id, offset, required_page_id FROM db_ha_apply_info WHERE db_name='testdb'"
=== <Result of SELECT Command in Line 1> ===
There are no results.
3. insert new db_ha_apply_info on slave.
[nodeB]$ csql --sysadm -u dba -S testdb -c "DELETE FROM db_ha_apply_info WHERE db_name='testdb'"
[nodeB]$ csql --sysadm -u dba -S testdb -c "INSERT INTO db_ha_apply_info VALUES ( 'testdb', datetime '04/19/2012 22:10:57', '/home/cubrid_usr/CUBRID/databases/testdb_nodeA', -1, -1, NULL, NULL, 0, 0, 0, 0, 0, 0, 0, 78, NULL )"
[nodeB]$ csql -u dba -S testdb -l -c "SELECT db_name, db_creation_time, copied_log_path, page_id, offset, required_page_id FROM db_ha_apply_info WHERE db_name='testdb'"
=== <Result of SELECT Command in Line 1> ===
<00001> db_name : 'testdb'
db_creation_time: 10:10:57.000 PM 04/19/2012
copied_log_path : '/home/cubrid_usr/CUBRID/databases/testdb_nodeA'
page_id : -1
offset : -1
required_page_id: 78
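The db_creation value read in step 12 is a Unix timestamp, while the INSERT statement uses a datetime literal. The sketch below shows the conversion with GNU date (the -d @SECONDS form is GNU-specific); the function name is hypothetical. The transcript's 22:10:57 matches a UTC+9 local time zone, which is an inference from the numbers rather than something the script output states.

```shell
#!/bin/sh
# Convert an epoch value to the MM/DD/YYYY HH:MM:SS form used in the INSERT.
# Requires GNU date. $1: seconds since the epoch; $2: POSIX TZ string (default UTC0).
epoch_to_datetime() {
    TZ=${2:-UTC0} date -d "@$1" '+%m/%d/%Y %H:%M:%S'
}

epoch_to_datetime 1334841057          # 04/19/2012 13:10:57 (UTC)
epoch_to_datetime 1334841057 KST-9    # 04/19/2012 22:10:57 (UTC+9)
```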
##### step 13 ###################################################################
#
# make initial replication active log on master, and copy archive logs from
# master
#
# * details
# - remove old replication log on master if exist
# - start copylogdb to make replication active log
# - copy archive logs from master
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
- 1. remove old replicaton log.
[nodeB]$ rm -rf /home/cubrid_usr/CUBRID/databases/testdb_nodeA
[nodeB]$ mkdir -p /home/cubrid_usr/CUBRID/databases/testdb_nodeA
- 2. start copylogdb to initiate active log.
- cubrid service stop
[nodeB]$ cubrid service stop >/dev/null 2>&1
- start cub_master
[nodeB]$ cub_master >/dev/null 2>&1
- start copylogdb and wait until replication active log header to be initialized
[nodeB]$ cub_admin copylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeA -m 3 testdb@nodeA >/dev/null 2>&1 &
...
- cubrid service stop
[nodeB]$ cubrid service stop >/dev/null 2>&1
- check copied active log header
[nodeB]$ cubrid applyinfo -L /home/cubrid_usr/CUBRID/databases/testdb_nodeA testdb | grep -wqs "DB name"
- 3. copy archive log from target.
cubrid_usr@nodeA's password:
testdb_lgar000 100% 512MB 3.9MB/s 02:11
##### step 14 ###################################################################
#
# restart copylogdb/applylogdb on master
#
# * details
# - restart copylogdb/applylogdb
#
################################################################################
continue ? ([y]es / [n]o / [s]kip) : y
[nodeA]$ sh /home/cubrid_usr/.ha/functions/ha_repl_resume.sh -i /home/cubrid_usr/.ha/repl_utils.output
cubrid_usr@nodeA's password:
[nodeA]$ cub_admin copylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeB -m sync testdb@nodeB >/dev/null 2>&1 &
resume: cub_admin copylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeB -m sync testdb@nodeB
[nodeA]$ cub_admin applylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeB --max-mem-size=300 testdb@localhost >/dev/null 2>&1 &
resume: cub_admin applylogdb -L /home/cubrid_usr/CUBRID/databases/testdb_nodeB --max-mem-size=300 testdb@localhost
- check heartbeat list on (master).
[nodeA]$ cubrid heartbeat list
@ cubrid heartbeat list
HA-Node Info (current nodeA, state master)
Node nodeB (priority 2, state unknown)
Node nodeA (priority 1, state master)
HA-Process Info (master 11847, state master)
Server testdb (pid 11853, state registered_and_active)
Connection to nodeA closed.
##### step 15 ##################################################################
#
# completed
#
################################################################################
After the ha_make_slavedb.sh script has finished, check the HA status on the slave node and then start the HA service.
[nodeB]$ cubrid heartbeat status
@ cubrid heartbeat list
++ cubrid master is not running.
[nodeB]$ cubrid heartbeat start
@ cubrid heartbeat start
@ cubrid master start
++ cubrid master start: success
@ HA processes start
@ cubrid server start: testdb
This may take a long time depending on the amount of recovery works to do.
CUBRID 9.0
++ cubrid server start: success
@ copylogdb start
++ copylogdb start: success
@ applylogdb start
++ applylogdb start: success
++ HA processes start: success
++ cubrid heartbeat start: success
[nodeB ha]$ cubrid heartbeat status
@ cubrid heartbeat list
HA-Node Info (current nodeB, state slave)
Node nodeB (priority 2, state slave)
Node nodeA (priority 1, state master)
HA-Process Info (master 26611, state slave)
Applylogdb testdb@localhost:/home/cubrid_usr/CUBRID/databases/testdb_nodeA (pid 26831, state registered)
Copylogdb testdb@nodeA:/home/cubrid_usr/CUBRID/databases/testdb_nodeA (pid 26829, state registered)
Server testdb (pid 26617, state registered)
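To confirm the result without reading the listing by eye, the node state can be extracted from the status output. The helper below is a sketch with a hypothetical name; it relies only on the "HA-Node Info (current ..., state ...)" line format shown above.

```shell
#!/bin/sh
# Hypothetical check: read `cubrid heartbeat status` output on stdin and
# succeed only if the current node reports the expected state.
node_state_is() {
    expected=$1
    # The relevant line has the form:
    #   HA-Node Info (current <host>, state <state>)
    state=$(sed -n 's/.*HA-Node Info (current [^,]*, state \([^)]*\)).*/\1/p' | head -n 1)
    [ "$state" = "$expected" ]
}
```

On nodeB, `cubrid heartbeat status | node_state_is slave && echo "nodeB is running as slave"` prints the message only when the rebuilt node has joined the HA group as a slave.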