Open Source RDBMS - Seamless, Scalable, Stable and Free

한국어 | Login |Register

Current Events
Join our developers event to win one of the valuable prizes!
posted 4 years ago
viewed 6500 times
Share this article

Apache DdlUtils & Scriptella: Migrating to CUBRID + Video

Today we will talk about Database Migration: how to move your data from other databases to CUBRID. There are two very powerful and at the same time easy-to-use tools you can have to get this work done. One is Apache DdlUtils, a small, easy-to-use component for working with Database Definition (DDL) files, the other is Scriptella ETL (Extract-Transform-Load) and script execution software tool written in Java. In this tutorial we will show you how to move the data from other databases to CUBRID using both of these tools. The idea is to combine these tools together and get the optimal results in terms of:

  • Extracting the source database schema and importing the schema into the target CUBRID database
  • Transferring the data between the databases (and applying data transformations, if needed)
We have recorded a video tutorial for this article. Before you watch the video make sure you read this article thoroughly so that you better understand the details of how to migrate your data to CUBRID.

Contents

  1. About DdlUtils and Scriptella
  2. Our today's scenario
  3. Prerequisites
  4. Preparing scripts
  5. Executing scripts
  6. Post-migration operations
  7. Video Tutorial
  8. Questions?
  9. License and Disclaimer

About DdlUtils & Scriptella

DdlUtils is a small, easy-to-use open source component for working with Database Definition (DDL) files. These are XML files that contain the definition of a database schema, e.g. tables and columns. These files can be fed into DdlUtils via its Ant task or programmatically in order to create the corresponding database or alter it so that it corresponds to the DDL. Likewise, DdlUtils can generate a DDL file for an existing database.

DdlUtils uses the Turbine XML format. This format expresses the database schema in a database-independent way by using JDBC datatypes instead of raw SQL datatypes which are inherently database specific. You can read more about DdlUtils on DdlUtils Home Page.

We will use DdlUtils to export/import the database schema.

Scriptella is an open source ETL (Extract-Transform-Load) and script execution tool, which can be used to export/import the data from the databases. In addition, Scriptella provides excellent support for data transformation, which is proven to be very useful when migrating the databases.

In our scenario, we will use Scriptella to transfer the data between the databases.

Our today's scenario

We have a database (“source database”) which we need to migrate to CUBRID (“target database”). The source database has support for JDBC and is one of the supported databases for both Scriptella and Apache DdlUtils.

In order to perform the migration, we need to:

  • Step 1: Extract the schema for the objects in the source DB.
  • Step 2: Create the objects in the target DB, using the exported definitions from Step 1.
  • Step 3: Transfer the data between the databases. In our scenario, the transfer will be done directly, without using intermediate files.
  • Step 4: Test the target DB. It is an important step to perform in order to make sure that the process completed successfully. We will cover it in this tutorial.
While theoretically it is possible to do all these steps in just one script, this is probably not a real life scenario, because of the specific differences between the databases. After you create the schema in the target databasebefore you will transfer the data, you will probably want to perform some specific schema operations:

  • Adjust some column data types to better match CUBRID data types.
  • Perform specific CUBRID optimizations.
Finally, we will have to prepare 2 Ant scripts:

  • One script to:

    • Extract the source database schema;
    • Create the target CUBRID schema;
  • Another script to:

    • Transfer the data to the CUBRID target database

Prerequisites

To begin with, you need to install the required software:

  • Download and install the latest Java JDK.
  • As both DdlUtils and Scriptella provide support for Ant, we will use Ant scripts for our needs. Download and install the latest Apache Ant. Set the ANT_HOME environment variable and add the Ant “bin” folder to your PATH.
  • Download and install the latest official release of Apache DdlUtils and apply CUBRID patch and compile it. If you do not want to deal with compilation, we have already prepared the DdlUtils compiled jar for you.
  • Download and install the latest CUBRID release.
  • Create CUBRID target database. This step can’t be done automatically from the tools, therefore, it must be executed manually as part of the environment preparation.
  • Download the latest CUBRID JDBC driver for Windows or for Linux.
In addition to this setup, depending on your source database, you will need to:

  • Have the specific JDBC driver available
  • Have access to the source database – make sure you have the appropriate user permissions.
The next step is to create a working directory, where you will create the scripts, generate the files, etc.

Preparing scripts

The first Ant script we need to prepare will perform the following operations:

  • Extract the source database schema; we will use DdlUtils.
  • Create the target CUBRID schema; we will use DdlUtils.
Let’s name this Ant script file “Step1.xml”. The script will contain 2 targets – ExportDDLImportDDL.

Remember: The code listed below is only a generic example of the relevant DdlUtils part of the Ant script file. You will need to customize it to match your setup.


<target name="ExportDDL">

<taskdef name="databaseToDdl" classname="org.apache.ddlutils.task.DatabaseToDdlTask">

<classpath refid="runtime-classpath"/>

</taskdef>

<delete file="${outputfile-schema}" verbose="true"/>

<databaseToDdl verbosity="debug"

modelname="${modelname}">

<database url=<provide your source db specific info>

driverClassName=<provide your source db specific info>

username=<provide your source db specific info>

password=<provide your source db specific info>/>

<writeSchemaToFile outputFile="${outputfile-schema}"

failonerror="false"/>

</databaseToDdl>

</target>


<target name="ImportDDL">

<taskdef classname="org.apache.ddlutils.task.DdlToDatabaseTask"

name="ddlToDatabase"

classpathref="runtime-classpath"/>

<ddlToDatabase schemaFile="db-schema.xml"

usedelimitedsqlidentifiers="true">

<database url=<provide your target db specific info>

driverClassName="cubrid.jdbc.driver.CUBRIDDriver"

username=<provide your target db specific info>

password=<provide your target db specific info>/>

<writeSchemaToDatabase dodrops="false"

alterdatabase="false"

failonerror="false"/>

</ddlToDatabase>

</target>


The second Ant script file will be named “Step2.xml”. The script will contain just one (Scriptella) target – TransferData. The script itself will contain a reference to an external Scriptella script file – “etl.xml”.

<target name="TransferData">

<taskdef resource="antscriptella.properties">

<classpath>

<pathelement location="../lib/scriptella.jar"/>

</classpath>

</taskdef>

<etl debug="true">

<fileset file="etl.xml"/>

</etl>

</target>


Database-Migration-Schema.gifThe “etl.xml” script will contain the required calls to transfer (and transform, if needed) the data between the databases. For example, to transfer the Product table in the image above, the script will contain these instructions:

<query connection-id="in">

SELECT * FROM Product WHERE category='software';

<script connection-id="out">

INSERT INTO Product(id, category, product_name) values (?id, ?{category}, ?name);

</script>

</query>

</etl>


Again, you will have to customize the scripts to match your specific environment. We do not intend here to provide the complete listings of these scripts, nor enhance documentation for using DdlUtils and Scriptella. To find the details you need to visit the online documentation available for both Apache DdlUtils and Scriptella.

Executing scripts

Now it’s time to run the scripts we have prepared before. Make sure the JDBC drivers and DdlUtils and Scriptella libraries are in CLASSPATH (and the location specified in the script files). Execute, the following command in command line:

ant -f "Step1.xml" ExportDDL ImportDDL

Verify that the script execution completed without errors. The next thing to do is to login to Cubrid (use Cubrid Manager, for example) and:

  • Verify that the objects were created
  • Perform any adjustments required to the schema
  • Perform any optimizations regarding the objects definition
Then execute the second script:

ant -f "Step2.xml" TransferData

Verify that the script execution completed without errors. Depending on the size of the data to be imported, it might take some time before it completes. After all this is completed, you will have migrated to CUBRID your source objects and data. Remember, you will need to perform some testing, to verify the migration process.

Post-migration operations

The tools cannot help with the migration of some specific objects, like stored procedures, for example. This means that depending on your specific case, you might need to perform some extra steps to ensure the complete database migration. Make sure you take into account all the objects you have in your source database and re-create them in the CUBRID database as well, in order to have the complete migration.

Video Tutorial

Questions?

If you have any questions related to Database Migration or would like to provide your feedbacks, post your message on CUBRID Forum or send us an email to support@cubrid.org. Let us know how we can improve this document!

License and Disclaimer

Copyright (C) 2010 Cubrid. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.



comments powered by Disqus