<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
    <channel>
        <title>CUBRID 8.4.0 Key Features</title>
        <link>http://www.cubrid.org/?mid=cubrid_840_key_features</link>
        <description>CUBRID 8.4.0 Key Features</description>
        <language>en</language>
        <pubDate>Mon, 11 Jul 2011 08:10:55 -0800</pubDate>
        <lastBuildDate>Wed, 31 Oct 2012 00:25:56 -0800</lastBuildDate>
        <generator>XpressEngine 1.4.4.1</generator>
        <item>
            <title>CUBRID 8.4.0 Key...</title>
            <dc:creator>CUBRID</dc:creator>
            <link>http://www.cubrid.org/cubrid_840_key_features</link>
            <guid isPermaLink="true">http://www.cubrid.org/cubrid_840_key_features</guid>
                                    <description><![CDATA[<h1>CUBRID 8.4.0 Key Features</h1>

<div class="contents-table"></div>

<p>The new CUBRID 8.4.0 brings significant improvements in three areas: performance, developer productivity, and HA reliability. Below is an overview of each improvement and how it affects CUBRID.</p>

<h2>Performance Improvement</h2>

<h3>Database volume reduction</h3>

<p>In CUBRID 8.4.0 the database volume size has been reduced significantly, a 218% improvement in volume usage over the previous version. In this release we changed how indexes are stored in order to reduce the index volume size. The more compact storage structure also lets CUBRID users benefit from increased performance.</p><p>The following figure compares the database volume size of the previous version, 8.3.1, with that of the new CUBRID 8.4.0. In this example, both databases contain 64,000,000 records with a PRIMARY KEY column defined. The numbers represent gigabytes of data.</p>

<p style="text-align:center"><img editor_component="image_link" src="http://blog.cubrid.org/wp-content/uploads/2011/5-/db-volume-usage-comparison.png" alt="Database volume usage comparison." width="409" height="255"/></p>

<h3>Improved concurrency in Windows builds</h3>

<p>CUBRID 8.4.0 provides improved concurrency on Windows through an enhanced mutex. The following graph shows a basic performance test comparison between the previous version and the new release.</p>

<p style="text-align:center"><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-windows-concurrency.png" alt="Improved concurrency in Windows builds" width="506" height="273" editor_component="image_link"/></p>

<h3>Index optimizations</h3>

<p>CUBRID 8.4.0 features a database engine that is twice as fast as that of the previous 8.3.1 release. It provides several significant built-in index optimizations, such as:</p>

<ul><li>Covering Index</li><li>LIMIT clause processing optimizations<ul><li>Key Limit</li><li>Multi Range</li></ul></li><li>GROUP BY clause processing optimizations</li><li>Descending Index Scan</li><li>Index Scan support in the LIKE clause</li></ul>

<p>Let's see how the index structure is organized in CUBRID 8.4.0. In CUBRID, indexes are implemented as a <a href="/cubrid_covering_index#why-do-we-need-covering-index" target="_self">B+ tree</a>, in which index values are stored in the leaf nodes.</p>

<p>For a practical example, let's consider the following table structure:</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
CREATE TABLE tbl (a INT, b STRING, c BIGINT);
</div>

<p>Next, we create a multi-column index on columns <b>a</b> and <b>b</b>.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">CREATE INDEX idx ON tbl (a, b);</div>

<p>The following picture shows index nodes pointing to data stored in the heap file on disk.</p><ol><li>The index values (both <b>a</b> and <b>b</b>) are sorted in ascending order (by default).</li><li>Each index node has a pointer to the corresponding data (table row) in the heap file, illustrated by an arrow.</li><li>The data in the heap file is stored in no particular order.</li></ol>

<p style="text-align: center;"><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-index-structure.png" alt="CUBRID Index Structure" width="382" height="500" editor_component="image_link"/><br /></p>

<h4>Index Scan</h4>

<p>Let's see how index scanning is usually performed. On the table defined above, we will execute the following SELECT query.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SELECT * FROM tbl
WHERE a &gt; 1 AND a &lt; 5
AND b &lt; 'K'
AND c &gt; 10000
ORDER BY b;</div>

<ol><li>CUBRID will first find all index nodes where&nbsp;<b><span style="color: rgb(255, 0, 0); ">a</span><span style="color: rgb(255, 0, 0); ">&nbsp;&gt;1</span><span style="color: rgb(255, 0, 0); "> and a &lt; 5</span></b>.</li><li>Then among these nodes, it will find all nodes where&nbsp;<b><span style="color: rgb(0, 117, 200); ">b &lt; 'K'</span></b>.</li><li>Since column <b>c</b> is not indexed, it is necessary to look up the heap file to obtain its value.</li><li>Each index node has an OID (Object Identifier), which tells CUBRID exactly where on the disk a particular row is located.</li><li>Based on these OIDs, the server will look up the heap file to retrieve the values of column <b>c</b>.</li><li>Then CUBRID will find all those records where <b><span style="color: rgb(0, 158, 37); ">c &gt; 10000</span></b>.</li><li>It will take all these records and sort them by column <b>b</b>.</li><li>Finally, it will return the result to the client application.</li></ol>

<p style="text-align: center;"><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-index-scan.png" alt="CUBRID Index Scan" width="686" height="500" editor_component="image_link"/>
<br /></p><h4>Covering Index</h4><p>Now let's see how Covering Index improves CUBRID's performance. It allows the requested data to be returned immediately, skipping the heap file look-up. This reduces the number of I/O operations, which are the most expensive part of the process in terms of time spent.</p><p>However, Covering Index can be applied only <b>when all columns that appear in the query are in the same compound index</b>, in other words, when their values are stored in the same node of the index tree. For example, see the following query.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SELECT a, b FROM tbl
WHERE a &gt; 1 AND a &lt; 5
AND b &lt; 'K'
ORDER BY b;</div>

<ul><li>This SQL statement selects only columns that are in the same multi-column index.</li><li>The WHERE clause references only columns that are in the same multi-column index.</li><li>The ORDER BY clause references only columns that are in the same multi-column index.</li></ul><p>So, if we execute the above query:</p><ol><li>As a part of the normal index scanning process, CUBRID will first find all index nodes where&nbsp;<b><span style="color: rgb(255, 0, 0); ">a</span><span style="color: rgb(255, 0, 0); ">&nbsp;&gt;1</span><span style="color: rgb(255, 0, 0); ">&nbsp;and a &lt; 5</span></b>.</li><li>Then among these nodes, it will find all nodes where&nbsp;<b><span style="color: rgb(0, 117, 200); ">b &lt; 'K'</span></b>.</li><li>Since the server already has both the <b>a</b> and <b>b</b> column values in its index nodes, there is no need to look into the heap file to obtain these values. So after the second step, it sorts the result records by column <b>b</b>.</li><li>Finally, it returns the data to the client application.</li></ol>

<p style="text-align: center;"><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-covering-index.png" alt="CUBRID Covering Index" width="544" height="500" editor_component="image_link"/><br /></p><p>Now let's see how much Covering Index can increase your database performance. For the same example defined above, we will assume the database table contains a large number of records.</p><p><b>Q1.</b> Here is a query which references columns <b>a</b> and <b>b</b>, both of which are covered by the same index.</p><div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">SELECT a, b FROM tbl WHERE a BETWEEN ? AND ?</div><p><b>Q2.</b> And this query uses the indexed column <b>a</b> and column <b><span style="color: rgb(255, 0, 0); ">c</span></b>, which is not indexed.</p><div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">SELECT a, c FROM tbl WHERE a BETWEEN ? AND ?</div>

<p>The following graph shows how fast queries can be executed if they use Covering Index.</p><p style="text-align: center;"><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-covering-index-graph.png" alt="CUBRID Covering Index Performance" width="459" height="250" editor_component="image_link"/><br /></p>
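<p>As a rough sketch of the rule above, the query from the Index Scan section, which needed a heap look-up because of column <b>c</b>, could in principle be covered by an index that also contains <b>c</b>. The index name <b>idx_abc</b> is hypothetical, and whether the optimizer actually chooses such an index depends on the data and on CUBRID's cost estimates.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
-- Hypothetical index that contains every column the query touches.
CREATE INDEX idx_abc ON tbl (a, b, c);

-- a, b and c are all present in idx_abc, so the heap file should not need to be read.
SELECT a, b, c FROM tbl
WHERE a &gt; 1 AND a &lt; 5
AND b &lt; 'K'
AND c &gt; 10000
ORDER BY b;</div>

<p>The trade-off is a larger index, so this is usually worth considering only for queries that run often.</p>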

<h3 id="limit-optimizations">LIMIT clause processing optimizations</h3><h4>Key Limit</h4><p>CUBRID 8.4.0 has a much smarter LIMIT clause analyzer. It has been optimized so that only the requested number of records or index entries is analyzed, and the data is returned as soon as the LIMIT is reached.</p><p>For instance, let's run the following query.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SELECT * FROM tbl
WHERE a = 2
AND b &lt; 'K'
ORDER BY b
LIMIT 3;</div>
<ol><li>CUBRID finds the index nodes where <b>a = 2</b>.</li><li>It stops traversing as soon as it has collected the requested number of nodes, in this case 3, which avoids scanning the rest of the range.</li><li>Since the values of column <b>b</b> are already sorted in the index nodes, no additional sorting is needed, and it goes straight to looking up the heap file.</li><li>Finally, it returns the requested number of records to the client application.</li></ol><p><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-key-limit.png" alt="CUBRID Key Limit" width="552" height="500" editor_component="image_link"/><br /></p>

<h4>Multi Range</h4>

<p>Multi Range optimization is another great feature implemented in CUBRID 8.4.0. When users request data that lies within a single fixed range such as <i>a &gt; 0 AND a &lt; 5</i>, it is an easy task for a database system. Multiple ranges such as <i>(a &gt; 0 AND a &lt; 5) OR a = 7 OR (a &gt; 10 AND a &lt; 15)</i> make things more complicated. For this case CUBRID has an optimization called <b>In-place sorting</b>, which allows it to do both of the following:</p><ol><li>Apply Key Limit.</li><li>Sort records in place.</li></ol>

<p>Consider the following Multi Range example.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SELECT * FROM tbl
WHERE a IN (2, 4, 5)
AND b &lt; 'K'
ORDER BY b
LIMIT 3;</div>

<ol><li>CUBRID will locate the first index nodes where <b>a IN (2, 4, 5) AND b &lt; 'K'</b>.</li><li>As only 3 records sorted by column <b>b</b> are required, it will perform in-place sorting, starting from the first nodes in the (2, 4, 5) range.<ol><li>It will find an index node (2, AAA), which makes 1.</li><li>Then it will find an index node (2, ABC), which makes 2.</li><li>Then it will find an index node (2, CCC), which makes 3.</li><li>As it has reached the LIMIT of 3, it will search the other ranges for values of <b>b</b> smaller than those already found.<ol><li>It will find an index node (4, DAA), which is bigger, so it will skip it.</li><li>It will find an index node (5, AAA), which is smaller than ABC and CCC, so it will add it to the result, removing CCC from the final list.</li><li>It will find an index node (5, BBB), which is bigger than ABC, so it will skip it and stop here.</li></ol></li></ol></li><li>Finally, CUBRID will look up the heap file to retrieve all the values.</li></ol><p>This allows CUBRID to perform much faster on large amounts of data.</p>
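<p>To try the walk-through above, one could load rows that match the index nodes it mentions. This is only a sketch: the values of column <b>c</b> are invented for illustration, and a real table would of course be much larger.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
-- Rows matching the (a, b) index nodes from the walk-through; the c values are invented.
INSERT INTO tbl (a, b, c) VALUES (2, 'AAA', 100);
INSERT INTO tbl (a, b, c) VALUES (2, 'ABC', 200);
INSERT INTO tbl (a, b, c) VALUES (2, 'CCC', 300);
INSERT INTO tbl (a, b, c) VALUES (4, 'DAA', 400);
INSERT INTO tbl (a, b, c) VALUES (5, 'AAA', 500);
INSERT INTO tbl (a, b, c) VALUES (5, 'BBB', 600);

-- Per the steps above, the three rows kept are (2, AAA), (5, AAA) and (2, ABC).
SELECT * FROM tbl
WHERE a IN (2, 4, 5)
AND b &lt; 'K'
ORDER BY b
LIMIT 3;</div>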

<p><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-multi-range-limit.png" alt="CUBRID Multi Range LIMIT" width="610" height="500" editor_component="image_link"/><br /></p><h4>Test results</h4><p>The following are actual performance test comparison results, obtained by analyzing data from an active social networking service called Me2Day (a Korean analogue of Twitter).</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SELECT * FROM posts 
WHERE author_id IN (?, ?, ..., ?) AND registered &lt; :from ORDER BY reg_date DESC
LIMIT 20;</div>

<p>The posts table has a multi-column index.</p><div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">INDEX (author_id, registered DESC)</div><ul><li><b>Query rate:</b><ul><li>Users with 1~50 friends make up 50% of the table data.</li><li>Users with 51~2000 friends make up 40% of the table data.</li><li>Users with 2001+ friends make up 10% of the table data.</li></ul></li><li>The test randomly selects 20 posts of a user's friends.</li><li>The test was run for 10 minutes.</li></ul><p>The following graph compares the performance of MySQL UNION (which is faster than MySQL IN) with that of CUBRID IN.</p><p><img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-limit-test.png" alt="CUBRID IN operator test results" width="418" height="250" editor_component="image_link"/><br /></p>
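<p>The index definition in the test description is abbreviated. Written out as a complete statement, it might look like the following sketch. The index name <b>idx_posts_author_registered</b> is hypothetical; the actual Me2Day schema is not shown in this article.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
-- Hypothetical index name; the column list is taken from the abbreviated definition.
CREATE INDEX idx_posts_author_registered ON posts (author_id, registered DESC);</div>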

<h3>GROUP BY processing optimizations</h3>

<p>CUBRID 8.4.0 has improved execution of queries containing ORDER BY and GROUP BY clauses. When CUBRID uses multi-column indexes in GROUP BY or ORDER BY, the values are already sorted in the index tree so there is no need to sort again. This&nbsp;new optimization allows CUBRID to perform much faster.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SELECT COUNT(*) FROM tbl
WHERE a &gt; 1 AND a &lt; 5
AND b &lt; 'K' AND c &gt; 10000
GROUP BY a;
</div>

<ol><li>As a part of the normal index scanning process, CUBRID will first find all index nodes where&nbsp;<b><span style="color: rgb(255, 0, 0); ">a</span><span style="color: rgb(255, 0, 0); ">&nbsp;&gt;1</span><span style="color: rgb(255, 0, 0); ">&nbsp;and a &lt; 5</span></b>.</li><li>Then among these nodes, it will find all nodes where&nbsp;<b><span style="color: rgb(0, 117, 200); ">b &lt; 'K'</span></b>.</li><li>Since column <b>c</b> is not indexed, it is necessary to look up the heap file to obtain its value.</li><li>It finds all those records where&nbsp;<b><span style="color: rgb(0, 158, 37); ">c &gt; 10000</span></b>.</li><li>Since all the index values are already sorted, CUBRID performs GROUP BY without sorting, which significantly increases overall performance.</li><li>Finally, it returns the grouped data to the client application.</li></ol><p>

<img src="http://www.cubrid.org/files/attach/images/49/753/202/cubrid-group-by.png" alt="cubrid-group-by.png" width="668" height="500" editor_component="image_link"/>&nbsp;</p>
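<p>The same reasoning applies to ORDER BY. As a minimal sketch, assuming the optimizer chooses the index <b>idx</b> for the query below, the rows come out of the index already ordered by <b>a</b> and then by <b>b</b>, so a separate sort step should not be needed.</p>

<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
-- idx is defined on (a, b), so its entries are already in (a, b) order.
SELECT a, b FROM tbl
WHERE a &gt; 1 AND a &lt; 5
ORDER BY a, b;</div>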

<h2>Developer Productivity Improvement</h2>

<p>CUBRID 8.4.0 supports over 90% of MySQL SQL syntax. In addition, implicit type conversion has been much improved, letting developers focus on the functionality of their applications while CUBRID makes sure their queries run as expected. Here is just some of the new syntax developers can use in CUBRID 8.4.0.</p>

<ul><li>Implicit type conversion<br />
<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
CREATE TABLE x (a INT);
INSERT INTO x VALUES ('1');
</div>
</li><li>SHOW queries<br />
<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
SHOW TABLES; SHOW COLUMNS; SHOW INDEX; …
</div></li><li>ALTER TABLE ... CHANGE/MODIFY COLUMN ...<br />
<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
CREATE TABLE t1 (a INTEGER);
ALTER TABLE t1 CHANGE a b DOUBLE;
CREATE TABLE t2 (col1 INT);
ALTER TABLE t2 MODIFY col1 BIGINT DEFAULT 1;
</div></li><li>UPDATE ... ORDER BY<br />
<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
UPDATE t
SET i = i + 1
WHERE 1 = 1
ORDER BY i
LIMIT 10;
</div></li><li>DROP TABLE IF EXISTS ...<br />
<div editor_component="code_highlighter" code_type="Sql" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
DROP TABLE IF EXISTS history;
</div></li></ul><p>In addition, 23 DATE/TIME-related, 5 TEXT-related, and 5 aggregate functions have been added. For the full list of new SQL syntax, refer to <a href="http://blog.cubrid.org/cubrid-life/roadmap-what-to-expect-in-cubrid-3-2/" target="_self">Roadmap: What to expect in CUBRID 8.4.0</a>.</p>

<h2>HA Reliability Improvement</h2>

<h3>Next Key Locking Improvement</h3><p>In CUBRID 8.4.0 the locking mechanism has been greatly improved to minimize the occurrence of deadlocks. For example, deadlocks will not occur between transactions that insert data into the same table at the same time.</p>]]></description>
                        <pubDate>Mon, 11 Jul 2011 07:11:22 -0800</pubDate>
                        <category>performance</category>
                        <category>test</category>
                        <category>mysql</category>
                        <category>covering index</category>
                        <category>limit</category>
                        <category>index</category>
                                </item>
            </channel>
</rss>
