<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
    <channel>
        <title>CUBRID Blog</title>
        <link>http://blog.cubrid.org</link>
        <description>Latest stories in the CUBRID Blog</description>
        <language>en</language>
        <pubDate>Wed, 07 Sep 2011 03:12:42 +0900</pubDate>
        <lastBuildDate>Tue, 11 Jun 2013 18:11:23 +0900</lastBuildDate>
        <generator>XpressEngine 1.4.4.1</generator>
                        										        <item>
            <title>Test Automation using Hudson and Selenium at NHN Services</title>
            <dc:creator>Hyehwan Ahn</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/test-automation-using-hudson-and-selenium-at-nhn-services/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/test-automation-using-hudson-and-selenium-at-nhn-services/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/test-automation-using-hudson-and-selenium-at-nhn-services/#comment</comments>
                                    <description><![CDATA[<p><a href="/blog/tags/NHN">NHN</a>, the company behind the development of the <a href="/wiki_tutorials/entry/important-facts-to-know-about-cubrid">CUBRID</a> open source database, automates tests for diverse browsers (Chrome, Firefox, Internet Explorer, Opera, iOS Safari and Android) running on a variety of operating systems (Linux, Microsoft Windows, Mac OS, iOS and Android) with a single set of test code, using Hudson and Selenium WebDriver.&nbsp;In this article, I briefly summarize the test automation process we use at NHN's NAVER portal and the benefits of Hudson and Selenium WebDriver for automation.</p>
<h2>How to Automate Tests</h2>
<p>I registered the test code with <a href="http://hudson-ci.org/">Hudson</a>, a Continuous Integration (CI) tool, and configured it to run with Selenium WebDriver and JUnit whenever source is committed to SVN. Testing one browser takes about 2 seconds. When a problem occurs, it is reported within 10 minutes.</p>
<p>Hudson is used in most development departments at NHN. By clicking <strong>Build Now</strong>, users can view the progress of the tests. The reporting system provides a good environment for "Fast Fail, Fast Feedback".</p>
<p style="text-align: center;"><img height="423" width="570" alt="multi_browser_test_with_one_set_of_test_code.png" src="/files/attach/images/220547/619/675/multi_browser_test_with_one_set_of_test_code.png" /></p>
<p style="text-align: center;"><strong>Figure 1: Multi-browser Test with One Set of Test Code.</strong></p>
<h3>Automating Firefox Test in Linux</h3>
<p>Install the <b>xserver</b> package on the Linux server that runs Hudson. Change the <b>run level</b> to <b>5</b>, then add the following setting to the account under which CI runs so that a browser can be launched from a text console.</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">Xvfb&nbsp;:1&nbsp;-screen&nbsp;0&nbsp;1024x768x24&nbsp;&gt;&nbsp;/home1/irteam/log/xvfb.log&nbsp;&amp;</div>
<p>Install the latest Firefox version under the <strong>/usr/lib64</strong> directory and replace the symbolic link. Now you can execute automation tests by running Firefox in Hudson.</p>
<ul>
<li><a href="http://www.youtube.com/watch?v=jxq4p8GXmK0&amp;feature=g-upl">Firefox Test Automation Demo Video</a>&nbsp;(YouTube)</li>
</ul>
<h3>Automating iPhone and iPad Tests in Mac OS X and Simulator</h3>
<p>For these tests, install Hudson on Mac OS X. The tests require the iPhone or iPad Simulator, so install Xcode in advance.</p>
<ul>
<li><a href="http://youtu.be/pyk58vRZuaU">iPhone Test Automation Demo Video Using the Simulator</a>&nbsp;(YouTube)</li>
</ul>
<h3>Automating Internet Explorer and Opera Tests in Microsoft Windows</h3>
<p>For more details on how to automate tests in Microsoft Windows, see <a href="http://seleniumhq.org/docs/03_webdriver.html">http://seleniumhq.org/docs/03_webdriver.html</a>.</p>
<h2>Why Is Browser Test Automation with Hudson Required?</h2>
<p>If tests for several browsers run automatically whenever the source code changes, developers can clearly reduce bugs and keep the source tree in a state that can be released at any time. QA time can also be significantly reduced. Beyond quality and cost, browser tests with Hudson bring several other benefits, described below.</p>
<h3>Keeping a Living Document and Building up Domain Knowledge</h3>
<p>Each time a test runs in Hudson, a document with illustrations and descriptions can be generated automatically from the Javadoc <i>package.html</i> files in the test code. This "<b>living document</b>" is continuously updated and is used by operators, developers and QA engineers for purposes such as handing off domain knowledge and preserving revision history.</p>
<p style="text-align: center;"><strong><img src="/files/attach/images/220547/619/675/javadoc_document_created_from_tests.png" alt="javadoc_document_created_from_tests.png" width="305" height="304" /></strong></p>
<p style="text-align: center;"><strong>Figure 2: Javadoc Document Created from Tests.</strong></p>
<h3>Visualizing Service Quality by Browser and Domain Knowledge</h3>
<p>Anyone who clicks <strong>Build Now</strong> in Hudson can check the service quality by browser.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/619/675/verifying_mobile_news_service_in_firefox.png" alt="verifying_mobile_news_service_in_firefox.png" width="520" height="304" /></p>
<p style="text-align: center;"><strong>Figure 3: Verifying Mobile News Service in Firefox.</strong></p>
<p>In some cases, the news service must behave differently depending on the day. It is very difficult to hand off all such domain information to every staff member involved. For example, the KOSPI/KOSDAQ indexes are not shown in the news on weekends; they reappear only after the market opens on Monday morning. Anyone who clicks the <strong>Build Now</strong> button can see how the news service is expected to behave in each state.</p>
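<p>As a rough illustration, the weekend rule for the indexes could be written down as a small piece of Java that a test can call. This is only a sketch: the class name <code>IndexDisplayRule</code>, the 09:00 opening time and the behaviour on Tuesday to Friday are assumptions made for the example, not details of the actual NAVER test code.</p>
<pre>
```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Hypothetical helper: encodes the weekend rule for the stock index widget.
class IndexDisplayRule {
    // Assumed market opening time; the article does not state the real one.
    static final LocalTime ASSUMED_MARKET_OPEN = LocalTime.of(9, 0);

    // Returns true when the news page is expected to show the KOSPI/KOSDAQ indexes.
    static boolean indexesExpected(LocalDateTime now) {
        DayOfWeek day = now.getDayOfWeek();
        if (day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY) {
            return false; // hidden for the whole weekend
        }
        if (day == DayOfWeek.MONDAY) {
            // still hidden on Monday until the (assumed) opening time
            return !now.toLocalTime().isBefore(ASSUMED_MARKET_OPEN);
        }
        return true; // Tuesday to Friday: assumed to be shown all day
    }

    public static void main(String[] args) {
        // 2013-06-08 was a Saturday, 2013-06-10 a Monday.
        System.out.println(indexesExpected(LocalDateTime.of(2013, 6, 8, 12, 0)));
        System.out.println(indexesExpected(LocalDateTime.of(2013, 6, 10, 8, 59)));
        System.out.println(indexesExpected(LocalDateTime.of(2013, 6, 10, 9, 0)));
    }
}
```
</pre>
<p>A browser test could then assert that the index area is present or absent according to <code>indexesExpected</code>, so the rule is documented in the same place where it is verified.</p>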
<p>Tests with Hudson therefore help teams hand off and accumulate domain knowledge, in addition to improving quality and cutting costs.</p>
<h3>Best Practices and Examples</h3>
<p>We once needed to update the version of Spring Framework used for the mobile news service. We changed the version in the Maven pom.xml file and ran the unit tests and integration tests for the web server source; both passed. Developers then started the web server, opened a number of web pages and confirmed that all of them looked fine.</p>
<p>However, there were more than 80 runtime errors. Fortunately, we had the multi-browser tests in Hudson, so we could find and fix all of the bugs before releasing the service.</p>
<h2>Selenium WebDriver</h2>
<h3>Why Selenium WebDriver?</h3>
<p>We use Selenium WebDriver because it has become reliable over a long series of releases and because it is easy to use. Tests run in a real browser, so it is easy to test the UI in each browser.</p>
<p>A virtual browser such as <a href="http://htmlunit.sourceforge.net/">HtmlUnit</a> cannot cover a wide range of browser versions (it is limited to Firefox 3.6 and lower and Internet Explorer 8 and lower), and it does not support mobile browsers. Therefore, I recommend Selenium WebDriver over HtmlUnit, even though Selenium WebDriver requires more effort.</p>
<p>Using the Java API provided by Selenium WebDriver in JUnit test code, you can easily write browser tests. <b>Code 1</b> below is currently used to test the mobile news service.</p>
<p style="text-align: center;"><strong>Code 1. News Service Test Code.</strong></p>
<div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">//&nbsp;The&nbsp;Korean&nbsp;method&nbsp;name&nbsp;reads:&nbsp;"the&nbsp;news&nbsp;home&nbsp;shows&nbsp;the&nbsp;last&nbsp;edit&nbsp;time&nbsp;for&nbsp;the&nbsp;most&nbsp;recent&nbsp;3&nbsp;days".<br /> @Test(timeout&nbsp;=&nbsp;1000&nbsp;*&nbsp;60)<br /> public&nbsp;void&nbsp;뉴스홈에_최근3일_최종편집시간이_표시된다()&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;Driver.get(driver,&nbsp;"http://m.news.naver.com/home.nhn");<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;"최종편집"&nbsp;is&nbsp;the&nbsp;on-page&nbsp;label&nbsp;meaning&nbsp;"last&nbsp;edited".<br /> &nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;expectedResult&nbsp;=&nbsp;"최종편집";<br /> &nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;result&nbsp;=&nbsp;Driver.getTextByClass(driver,&nbsp;"last_update");<br /> &nbsp;&nbsp;&nbsp;&nbsp;assertThat(result.startsWith(expectedResult),&nbsp;is(true));<br /> }</div>
<h3>Benefits of Selenium WebDriver</h3>
<p>Selenium WebDriver offers a number of benefits: developers can write test code in a variety of languages (Java, C#, Python, PHP, Ruby, Perl), and its implicit wait function provides fast feedback. Further benefits are described below.</p>
<h4>Test by Specifying ID/Class/XPath</h4>
<p>From an HTML maintenance standpoint, ID is the best way to locate an HTML/CSS element, Class is the second best, and XPath is the last resort. Selenium WebDriver supports all three.</p>
<h4>Multi-browser Tests with One Set of Test Code</h4>
<p>You can test several browsers with one set of test code because switching the browser under test only requires swapping the WebDriver implementation. As of 2012, WebDriver supports Firefox, Internet Explorer, Chrome, Opera, iPhone Safari, iPad Safari and Android browsers.&nbsp;<b>Code 2</b> below is the part of the code that creates the WebDriver.</p>
<p style="text-align: center;"><strong>Code 2. Example of Creating WebDriver.</strong></p>
<div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">public&nbsp;WebDriverFactory()&nbsp;throws&nbsp;Exception&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;The&nbsp;browser&nbsp;name&nbsp;comes&nbsp;from&nbsp;the&nbsp;test&nbsp;configuration.<br /> &nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;browsetype&nbsp;=&nbsp;TestConfigParam.getBrowseType();<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;("firefox".equals(browsetype))&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;FirefoxDriver();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;if&nbsp;("chrome".equals(browsetype))&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;ChromeDriver();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;if&nbsp;("ie".equals(browsetype))&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;InternetExplorerDriver();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;if&nbsp;("iphone".equals(browsetype))&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;IPhoneDriver();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;if&nbsp;("android".equals(browsetype))&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;AndroidDriver();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;if&nbsp;("htmlunit".equals(browsetype))&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;HtmlUnitDriver(false);<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;Unknown&nbsp;or&nbsp;missing&nbsp;value:&nbsp;fall&nbsp;back&nbsp;to&nbsp;Firefox.<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;this.driver&nbsp;=&nbsp;new&nbsp;FirefoxDriver();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> <br /> 
&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;Apply&nbsp;the&nbsp;implicit&nbsp;wait&nbsp;to&nbsp;whichever&nbsp;driver&nbsp;was&nbsp;created.<br /> &nbsp;&nbsp;&nbsp;&nbsp;driver.manage().timeouts().implicitlyWait(Driver.TIMEOUT,&nbsp;TimeUnit.SECONDS);<br /> }</div>
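<p>Code 2 configures an implicit wait on the driver. The idea behind it, and why it gives fast feedback, can be sketched without Selenium at all: keep polling for the element and return as soon as it appears, instead of always sleeping for a fixed time. The names <code>ElementProbe</code> and <code>ImplicitWaitSketch</code> below are hypothetical stand-ins written for this illustration and are not part of the WebDriver API. As a simplification, the sketch returns <code>null</code> on timeout, whereas WebDriver reports a missing element with an exception.</p>
<pre>
```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical stand-in for "look up an element"; returns null while it is absent.
interface ElementProbe {
    String find();
}

class ImplicitWaitSketch {
    // Polls the probe until it returns a value or the timeout elapses.
    // Returns the value as soon as it appears, or null on timeout.
    static String waitFor(ElementProbe probe, Duration timeout, Duration pollInterval) {
        Instant deadline = Instant.now().plus(timeout);
        while (true) {
            String found = probe.find();
            if (found != null) {
                return found; // fast feedback: no need to sit out the full timeout
            }
            if (!Instant.now().isBefore(deadline)) {
                return null; // timed out
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null; // give up if the waiting thread is interrupted
            }
        }
    }

    public static void main(String[] args) {
        // Fake probe: the "element" shows up on the third lookup.
        ElementProbe probe = new ElementProbe() {
            private int calls = 0;

            public String find() {
                calls++;
                return calls == 3 ? "last_update" : null;
            }
        };
        System.out.println(waitFor(probe, Duration.ofSeconds(2), Duration.ofMillis(10)));
    }
}
```
</pre>
<p>With a two-second timeout and a ten-millisecond polling interval, the fake probe above succeeds after three lookups, far sooner than the timeout.</p>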
<h2>Conclusion</h2>
<p>So far, we have reviewed the test automation applied to NAVER services. NHN has already automated tests for many services by using Hudson and Selenium WebDriver. I expect other companies have automated their tests in similar ways. I therefore hope this article serves as an opportunity to exchange ideas with others and arrive at best practices, rather than as a mere showcase of our test methods.</p>
<p><i>By Hyehwan Ahn, Software Engineer at News Service Development Team, NHN Corporation.</i></p>
<blockquote>
<p><i>I am a developer interested in automating the routine tasks of the Alpha/Beta/QA/Distribution release phases to improve efficiency.</i></p>
</blockquote>]]></description>
                        <pubDate>Tue, 11 Jun 2013 14:25:57 +0900</pubDate>
                        <category>test automation</category>
                        <category>browser</category>
                        <category>Hudson</category>
                        <category>Selenium</category>
                        <category>Selenium WebDriver</category>
                        <category>NHN</category>
                        <category>NAVER</category>
                                </item>
        										        <item>
            <title>Embrace SQL with CUBRID and jOOQ</title>
            <dc:creator>Lukas Eder</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-appstools/embrace-sql-with-cubrid-and-jooq/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-appstools/embrace-sql-with-cubrid-and-jooq/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-appstools/embrace-sql-with-cubrid-and-jooq/#comment</comments>
                                    <description><![CDATA[<p><em>This is a guest post by Lukas Eder, the creator of jOOQ open source Java API for typesafe SQL modeling.&nbsp;</em><em><b>If you develop or use an open source application</b>&nbsp;and would like to tell the world about it, CUBRID open source database project is&nbsp;<a href="/blog/cubrid-life/tell-about-your-open-source-project-on-cubrid-blog/" title="Tell about your open source project on CUBRID Blog. We'll pay for Facebook Ads.">accepting guest posts</a>.</em></p>
<h2>Big Data, the Web and SQL</h2>
<p>In recent years, software companies have started to raise millions up to billions of dollars getting acquired by a big player, such as Google, Facebook, Yahoo! or Microsoft. Very often, the assumed value of such deals lay in the fact that <em>Big Data</em> could be purchased along with such acquisitions. "Social" <em>Big Data</em> was generated by millions of users over the web. It seemed too big to fit in classic relational databases, which is why the purchases also included buying the proprietary, rather short-lived technologies used to maintain <em>Big Data</em>. Most of the new companies thus experimented with <em>NoSQL</em> in one form or another.</p>
<p>SQL, on the other hand, has come a long way. SQL is a very expressive and powerful language used to model queries against any type of data, albeit mostly relational. At the same time, SQL is standardised and quite open. CUBRID is a good example of an object-relational database, which combines the expressiveness of SQL with high availability, sharding, and many other features needed to manage <em>Big Data</em>! In other words, CUBRID is the proof that SQL can be an adequate technology for the modern web.</p>
<h2>Querying CUBRID with jOOQ</h2>
<p><a href="http://www.jooq.org">jOOQ</a> is a Java API modelling SQL as an internal domain-specific language directly in Java. It features a built-in code generator to generate Java classes from your database model. These generated classes can then be used to create typesafe SQL queries directly in Java. A simple example of how this works with CUBRID can be seen in this <a href="/wiki_apps/entry/jooq-cubrid-tutorial">jOOQ CUBRID tutorial</a>.</p>
<p>The idea of creating fluent APIs in Java is not new. Martin Fowler usually gets most of the credit for his <a href="http://martinfowler.com/bliki/FluentInterface.html">elaborations on the subject</a>. Since then, many approaches to building internal domain-specific languages have surfaced, mostly in unit testing environments (e.g. <a href="http://jmock.org">JMock</a> and <a href="https://code.google.com/p/mockito">Mockito</a>). Apart from <a href="http://www.jooq.org">jOOQ</a>, there are also a couple of fluent APIs that model SQL as a language in Java. These include:</p>
<ul>
<li><a href="http://www.h2database.com/html/jaqu.html">JaQu</a></li>
<li><a href="http://onewebsql.com">OneWebSQL</a></li>
<li><a href="http://quaere.codehaus.org">Quaere</a></li>
<li><a href="http://www.querydsl.com">QueryDSL</a></li>
<li><a href="http://java.net/projects/squill">Squill</a></li>
</ul>
<p>Among the above, QueryDSL is the only other API with a comparable traction to <a href="http://www.jooq.org">jOOQ</a>'s. While QueryDSL hides the full SQL expressiveness behind a <a href="http://msdn.microsoft.com/en-us/library/vstudio/bb397926.aspx">LINQesque API</a>, jOOQ strongly focuses on SQL only. Unlike any of the above SQL abstraction APIs, jOOQ combines these features:</p>
<h3>A BNF defines jOOQ's fluent API</h3>
<p>jOOQ uses next generation techniques to implement its fluent API. These techniques involve a formal <a href="http://blog.jooq.org/2012/01/19/jooqs-fluent-api-in-bnf-notation">BNF notation</a> specifying API type and method hierarchies:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" height="441" width="700" alt="jooq-select-02.png" src="/files/attach/images/220547/747/674/jooq-select-02.png" /></p>
<p>With a formal BNF, jOOQ's fluent API is much more robust and typesafe, as it will dictate syntax correctness in a more formal way than ordinary builder APIs.</p>
<h3>jOOQ embraces usage of stored procedures</h3>
<p>When closely coupling with your favourite relational database, you will likely want to make use of stored procedures and functions, directly in your SQL. jOOQ embraces this fact and allows for <a href="http://www.jooq.org/doc/3.0/manual/sql-execution/stored-procedures">typesafe embedding of stored functions</a>.</p>
<h3>jOOQ embraces usage of row value expressions</h3>
<p><a href="http://www.jooq.org/doc/3.0/manual/sql-building/column-expressions/row-value-expressions">Row value expressions</a> (also called tuples, records) are at the heart of SQL. Few libraries outside of the SQL world will be able to model the fact that the following predicates are type-safe:</p>
<div editor_component="code_highlighter" code_type="Sql" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">SELECT&nbsp;*&nbsp;FROM&nbsp;t1&nbsp;WHERE&nbsp;t1.a&nbsp;=&nbsp;(SELECT&nbsp;t2.a&nbsp;FROM&nbsp;t2)<br /> --&nbsp;Types&nbsp;must&nbsp;match:&nbsp;&nbsp;&nbsp;^^^^&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;^^^^<br /> <br /> SELECT&nbsp;*&nbsp;FROM&nbsp;t1&nbsp;WHERE&nbsp;(t1.a,&nbsp;t1.b)&nbsp;IN&nbsp;(SELECT&nbsp;t2.a,&nbsp;t2.b&nbsp;FROM&nbsp;t2)<br /> --&nbsp;Types&nbsp;must&nbsp;match:&nbsp;&nbsp;&nbsp;^^^^^^^^^^^^&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;^^^^^^^^^^<br /> <br /> SELECT&nbsp;t1.a,&nbsp;t1.b&nbsp;FROM&nbsp;t1&nbsp;UNION&nbsp;SELECT&nbsp;t2.a,&nbsp;t2.b&nbsp;FROM&nbsp;t2<br /> --&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;^^^^^^^^^^&nbsp;Types&nbsp;must&nbsp;match&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;^^^^^^^^^^</div>
<p>jOOQ will leverage the Java compiler to help you check the above:</p>
<div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">select().from(t1).where(t1.a.eq(select(t2.a).from(t2)));<br /> //&nbsp;Type-check&nbsp;here:&nbsp;-----------------&gt;&nbsp;^^^^<br /> <br /> select().from(t1).where(row(t1.a,&nbsp;t1.b).in(select(t2.a,&nbsp;t2.b).from(t2)));<br /> //&nbsp;Type-check&nbsp;here:&nbsp;----------------------------&gt;&nbsp;^^^^^^^^^^<br /> <br /> select(t1.a,&nbsp;t1.b).from(t1).union(select(t2.a,&nbsp;t2.b).from(t2));<br /> //&nbsp;Type-check&nbsp;here:&nbsp;-------------------&gt;&nbsp;^^^^^^^^^^</div>
<h3>jOOQ emulates built-in functions and SQL clauses</h3>
<p>Providing support for simple SQL clauses is easy: <code>SELECT</code>, <code>DISTINCT</code>, <code>FROM</code>, <code>JOIN</code>, <code>GROUP BY</code>, etc. Implementing "real" SQL is much harder, though. Take the above row value expressions, for instance. They are currently not supported in CUBRID, but you can use them nonetheless with jOOQ. jOOQ emulates missing functions and SQL clauses for you as can be seen <a href="http://java.dzone.com/articles/sql-query-transformation-fun">in this syndicated blog post</a>.</p>
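<p>To give a feel for what such an emulation involves, here is a minimal sketch of one well-known rewrite: a row value equality <code>(a, b) = (c, d)</code> expressed as a column-by-column conjunction. The class <code>RowValueEmulation</code> is a hypothetical helper written for this illustration. It is not part of jOOQ, and the SQL jOOQ actually renders for CUBRID may differ.</p>
<pre>
```java
// Hypothetical helper, not part of jOOQ: expands a row value equality
// (a, b) = (c, d) into the equivalent column-by-column predicate.
class RowValueEmulation {
    static String expandRowEquals(String[] left, String[] right) {
        if (left.length == 0 || left.length != right.length) {
            throw new IllegalArgumentException("rows must be non-empty and of the same degree");
        }
        StringBuilder sql = new StringBuilder("(");
        for (int i = 0; i != left.length; i++) {
            if (i != 0) {
                sql.append(" AND ");
            }
            sql.append(left[i]).append(" = ").append(right[i]);
        }
        return sql.append(")").toString();
    }

    public static void main(String[] args) {
        String[] left = {"t1.a", "t1.b"};
        String[] right = {"t2.a", "t2.b"};
        // prints (t1.a = t2.a AND t1.b = t2.b)
        System.out.println(expandRowEquals(left, right));
    }
}
```
</pre>
<p>A database without row value expressions can execute the expanded predicate, while the query author keeps writing the more compact tuple form.</p>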
<h3>jOOQ renders specialised SQL for 14 major RDBMS vendors</h3>
<p>Instead of generalising and abstracting away advanced standard and vendor-specific SQL features, as JPA and tools built upon JPA do, jOOQ sees good things in each vendor-specific syntax element. You know your database well, so you want to leverage it, not abstract it.</p>
<h3>jOOQ is a platform</h3>
<p>jOOQ is much more than just a SQL library. For example, it features the very useful <a href="http://www.jooq.org/doc/3.0/manual/tools/jooq-console">jOOQ Console</a>, which helps you debug and profile your jOOQ-generated SQL statements in any environment, without the need for expensive third-party tools:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" height="525" width="700" alt="jooq-console-01.png" src="/files/attach/images/220547/747/674/jooq-console-01.png" /></p>
<p>The jOOQ Console also includes on-the-fly SQL editing tools as well as breakpoint capability for advanced debugging.</p>
<h3>More feature comparisons</h3>
<p>More feature comparisons can be found <a href="http://blog.jooq.org/2012/05/17/onewebsql-another-competitor-in-the-sql-schema-generation-business">in this blog post</a>.</p>
<h2>Getting productive with jOOQ</h2>
<p>jOOQ is a vision where SQL matters again to the Java developer. While some have called ORM the <a href="http://www.codinghorror.com/blog/2006/06/object-relational-mapping-is-the-vietnam-of-computer-science.html">Vietnam of Computer Science</a>, jOOQ is the <a href="http://www.jooq.org">Peace Treaty Between SQL and Java</a>. Using the above and many more features, you can be productive again when writing high-performing, specialised SQL against your favourite database directly in Java, typesafely compiled by your Java compiler.</p>
<p><em>By Lukas Eder, the creator of jOOQ.&nbsp;</em><em>Follow him on Twitter&nbsp;<a href="https://twitter.com/JavaOOQ">@JavaOOQ</a>.</em></p>
<blockquote>
<p><i>I'm a Java and SQL enthusiast developer currently contracting for Adobe Systems in Basel, Switzerland. Originating from the E-Banking field, I have a strong Oracle SQL background. I'm the creator of jOOQ, a comprehensive SQL library for Java.</i></p>
</blockquote>]]></description>
                        <pubDate>Mon, 10 Jun 2013 16:55:25 +0900</pubDate>
                        <category>jOOQ</category>
                        <category>Java</category>
                        <category>API</category>
                        <category>ORM</category>
                        <category>ActiveRecord</category>
                        <category>SQL</category>
                        <category>CUBRID Affiliates</category>
                                </item>
        										        <item>
            <title>ApiAxle - open source API management and analytics proxy</title>
            <dc:creator>Phil Jackson</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-appstools/apiaxle-open-source-api-management-analytics-proxy/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-appstools/apiaxle-open-source-api-management-analytics-proxy/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-appstools/apiaxle-open-source-api-management-analytics-proxy/#comment</comments>
                                    <description><![CDATA[<p><img src="/files/attach/images/220547/524/674/apiaxle-fp-diagram.png" alt="apiaxle-fp-diagram.png" width="438" height="323" style="display: block; margin-left: auto; margin-right: auto;" /></p>
<p><em>This is a guest post by Phil Jackson, the creator of ApiAxle open source API management and analytics solution.&nbsp;</em><em><b>If you develop or use an open source application</b> and would like to tell the world about it, CUBRID open source database project is <a href="/blog/cubrid-life/tell-about-your-open-source-project-on-cubrid-blog/" title="Tell about your open source project on CUBRID Blog. We'll pay for Facebook Ads.">accepting guest posts</a>.</em></p>
<p><a href="http://apiaxle.com/">ApiAxle</a> is an API management solution which is open source and free. The basic premise is that you build your API, put ApiAxle in front of it and it will handle user authentication, rate limiting, statistics, etc. I represent a business which generates revenue through support and consultancy, but first and foremost I am a developer who loves APIs and the idea of companies exposing their data to enable people to build some brilliant things. As a company we are enjoying building a great product and watching it gain traction amongst fellow hackers.</p>
<h2 id="theproblem">The API problem</h2>
<p>We noticed a space in the market for an open source, on-premise proxy which did not cost the earth and did not involve sending your data over multiple, high-latency hops out into the cloud.</p>
<p>Where a developer these days can type <code>apt-get install nginx</code> to get a solid webserver, there was not really an equivalent for an API management system. Building out the features we provide can be a time-consuming, monotonous and error-prone process that we really hope people do not keep having to perform.</p>
<h2 id="thesolution">The solution</h2>
<p>That is where we come in. We want to bury our heads in security documentation and RFCs so that you can concentrate on making a great API. With a few <a href="http://apiaxle.com/products.html">simple commands</a> you can be up and running within 20 minutes. Within 30 you can be on-boarding customers, authenticating them and getting detailed statistics about their usage. You will also get <a href="http://apiaxle.com/docs/caching/">caching</a>, rate limiting, HTTPS support and a highly configurable logging system.</p>
<h2 id="howitworks">How it works</h2>
<p>There are three components in ApiAxle:</p>
<h3 id="therepl">The REPL</h3>
<p>You probably want this first. The REPL allows you to configure ApiAxle from the command line. Setting up an API and API keys ready for the proxy to work with is easy. It fires up an instance of ApiAxle&rsquo;s own HTTP API in the background and uses that to modify aspects of the system. Anything you can do in the REPL you can do programmatically with the API too.</p>
<p>Install:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;sudo&nbsp;npm&nbsp;install&nbsp;apiaxle-repl</div>
<p>Start:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;apiaxle</div>
<p>Configure an API and a new key to use the API with:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">axle&gt;&nbsp;key&nbsp;"05050c14643dc"&nbsp;create<br /> axle&gt;&nbsp;api&nbsp;"acme"&nbsp;create&nbsp;endPoint="localhost:81"<br /> axle&gt;&nbsp;api&nbsp;"acme"&nbsp;linkkey&nbsp;"05050c14643dc"</div>
<h3 id="theproxy">The Proxy</h3>
<p>The kernel of the system. This goes between the Internet and your API and does the authentication, throttling, caching and statistics collection. It&rsquo;s fast, secure and easy to set up. You will need either the REPL or the API to configure it first.</p>
<p>It does not matter what your API actually outputs - ApiAxle never modifies the body of a response. With regards to errors (e.g. user over quota) you can tell ApiAxle what format they should be in. We support XML or JSON. If you have Node.js installed, installation is as simple as:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;sudo&nbsp;npm&nbsp;install&nbsp;apiaxle-proxy</div>
<p>Then start the proxy with:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;apiaxle-proxy</div>
<h3 id="theapi">The API</h3>
<p>Bear with me, this gets a bit meta. This is ApiAxle&rsquo;s own HTTP API which gives you full control over your APIs and the API keys and keyrings used to access them. You can view statistics about individual APIs and API keys at granularities ranging from a week right down to&nbsp;near-real-time hits at single-second granularity.</p>
<p>Install:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;sudo&nbsp;npm&nbsp;install&nbsp;apiaxle-api</div>
<p>Start:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;apiaxle-api</div>
<p>Find out which APIs you have configured:</p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;"localhost:3000/v1/apis"</div>
<h2 id="wherewereheaded">Where we are headed</h2>
<p>We have lots planned. To summarise:</p>
<ul>
<li>Client drivers for the API. Demand is high for PHP and Ruby so we will get them done ASAP.</li>
<li>Now that OAuth2 has been ratified we will be working on getting that in as an authentication method.</li>
<li>We are working on a dashboard which will give you a way to manage your APIs, keys, keyrings and give you a way to view real-time and historical statistics - this will be a paid-for service.</li>
<li>We will be pushing out a user registration system soon so that you can on-board and bill customers without any manual intervention.</li>
</ul>
<p>So the future is exciting. We are really looking forward to meeting more developers that are interested in APIs and the huge ecosystem that is formed around them. Please feel free to <a href="mailto:phil@apiaxle.com">get in touch</a> with any questions or just to say hi!</p>
<p><em>By Phil Jackson, the creator of ApiAxle. Follow him on Twitter <a href="https://twitter.com/philjackson">@philjackson</a>.</em></p>
<blockquote>
<p><em>After receiving his computing degree from Teesside University, Phil became a software engineer. As he moved through industry, he found himself becoming increasingly fascinated with APIs and, after helping the BBC write their iPlayer API, founded Qwerly, a company that aggregated social profile information and offered it to companies for insight/marketing purposes. After selling Qwerly to a competitor and taking some time off, Phil wrote the code which would eventually make up ApiAxle. Now he's running ApiAxle full-time and hopes it will become the ubiquitous, open source tool for managing APIs.</em></p>
</blockquote>]]></description>
                        <pubDate>Mon, 10 Jun 2013 13:43:54 +0900</pubDate>
                        <category>ApiAxle</category>
                        <category>API</category>
                        <category>analytics</category>
                        <category>proxy</category>
                        <category>CUBRID Affiliates</category>
                                </item>
        										        <item>
            <title>My impressions after giving talks at RIT++ and Percona conferences</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-life/my-impressions-after-giving-talks-at-rit-and-percona-conferences/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-life/my-impressions-after-giving-talks-at-rit-and-percona-conferences/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-life/my-impressions-after-giving-talks-at-rit-and-percona-conferences/#comment</comments>
                                    <description><![CDATA[<p><img style="display: block; margin-left: auto; margin-right: auto;" height="120" width="366" alt="rit_percona_conference_logo.png" src="/files/attach/images/220547/820/650/rit_percona_conference_logo.png" /></p>
<p>Three weeks ago,&nbsp;on behalf of&nbsp;the CUBRID team,&nbsp;a few of my colleagues and I attended and gave talks at two international conferences.&nbsp;Today I would like to share my impressions of these events. I will write a separate post about the various sharding solutions introduced at these conferences. So, stay tuned!</p>
<h2>RIT++</h2>
<p>The first presentation, at RIT++ (Russian Internet Technologies), was held on Monday, April 22nd, 2013, in Moscow, Russia. The second one, at Percona MySQL Conference &amp; Expo, was held on Wednesday of the same week, April 24th, 2013, in Santa Clara, CA,&nbsp;US. At both conferences the topic was the same: "Easy MySQL Database Sharding with CUBRID SHARD". At RIT++, though, the presentation was given in Russian. Very exciting! The following is a list of resources related to the talks.</p>
<ul>
<li><a href="/blog/cubrid-life/cubrid-shard-talk-at-2013-percona-mysql-conference-dont-miss/">The presentation abstract in English</a></li>
<li><a href="http://ritconf.ru/2013/abstracts/560.html">The presentation abstract in Russian</a></li>
<li><a href="http://www.slideshare.net/cubrid/easy-mysql-database-sharding-with-cubrid-shard-2013-percona">Slideshare in English</a></li>
<li><a href="http://www.slideshare.net/cubrid/mysql-cubrid-shard-2013-rit">Slideshare in Russian</a></li>
<li><a href="http://www.youtube.com/watch?v=GDwg66iPR7c">Full recording at Youtube</a></li>
</ul>
<p>This was the third time we, the CUBRID team, attended a conference organized by&nbsp;the Russian&nbsp;company Ontico. Previously we attended the&nbsp;<a href="/blog/cubrid-life/meet-cubrid-developers-at-russian-internet-technologies-2012-conference/">RIT++ 2012</a>&nbsp;and&nbsp;<a href="/blog/cubrid-life/database-sharding-the-right-way-easy-reliable-open-source/">HighLoad++ 2012</a>&nbsp;conferences.&nbsp;This year at RIT++ 2013 there were over 800 attendees and 13 categories of talks, ranging from client-side and server-side development to database scalability, project management, analytics, and so on. Every year after the conference is over, the organizers conduct a post-event survey to&nbsp;assess the experience. I think that, because of attendees' past feedback, this year the RIT++ organizers accepted more talks related to client-side development than usual.</p>
<p>Besides us from Korea, there were presenters from the States, representing Facebook, and from Brazil, representing PUC-Rio University. My personal impression was that&nbsp;this year there were fewer&nbsp;foreign speakers than last year at <a href="/blog/cubrid-life/meet-cubrid-developers-at-russian-internet-technologies-2012-conference/">RIT++</a> or <a href="/blog/cubrid-life/database-sharding-the-right-way-easy-reliable-open-source/">HighLoad++</a>.</p>
<p>At my session about MySQL database sharding with CUBRID SHARD there were, I guess, over 100 and close to 150 attendees. The audience welcomed my speaking in Russian very warmly. Next time I should talk in Russian again. They like it! When the presentation was over, there was a slew of questions. I think CUBRID SHARD, as an easy sharding middleware for MySQL, was received very well.</p>
<p>To my surprise, there were many questions unrelated to CUBRID SHARD. The audience asked a lot about CUBRID itself and its HA feature. Later I learned that many&nbsp;attendees had listened to my talks about the&nbsp;<a href="http://www.cubrid.org">CUBRID</a> open source relational database system&nbsp;last year. One member of the audience said that he had been looking into CUBRID for a while already and was considering using it in production. His favorite features in CUBRID were its built-in support for HA and its very clever 3-tier architecture. Overall, the unofficial Q&amp;A session lasted for over an hour and a half.</p>
<p>It was a great experience for me to present CUBRID SHARD at RIT++ this year and a great opportunity for our CUBRID team. The conference lasted two days, but I could not attend the second day as I had to head to Santa Clara, CA, to give a talk at Percona MySQL Conference &amp; Expo.</p>
<h2>Percona</h2>
<p>It was the first time I had spoken at Percona. Previously we spoke at <a href="/blog/news/cubrid-will-present-at-oscon-2011/">OSCON 2011</a>&nbsp;about <a href="/cubrid_ha_oscon">CUBRID HA</a>, and at the <a href="/blog/cubrid-life/mysql-conference-outcome/">2010 MySQL Conference &amp; Expo</a>&nbsp;about CUBRID Database. Compared to OSCON, the Percona MySQL conference was a lot more specific (obviously about MySQL). There were more quality talks about scalability and performance tuning. If I were to choose where to go next year, I would definitely select Percona. That is how interesting it was!</p>
<p>Unlike at RIT++, our session at the Percona conference attracted only about 20 attendees. The presentation went well, but I must admit that the number of listeners plays a big role. There were fewer questions and less enthusiasm. On the other hand, the Facebook, two Percona, Continuent, and&nbsp;Tokutek presentations, which were held at the same time at 3:30 PM, attracted hundreds of listeners each. After realizing this, I came to the conclusion that brand recognition plays a significant role in attracting listeners. Even though NHN is very popular in Korea and Asia in general, it is almost unknown in Western countries. In fact, when I asked the audience at Percona if they had ever heard of NHN, the answer was no. A great pity. I think NHN has to seriously reconsider its&nbsp;strategy for increasing its&nbsp;worldwide brand recognition.&nbsp;Nevertheless, I am very glad we had this chance to present our open source sharding middleware at a well-known conference like Percona.</p>
<p>As I mentioned at the beginning of this post, I will write another post covering the various sharding solutions presented at the Percona conference. It was very interesting to learn about the different techniques used by large-scale service providers who have developed their own sharding solutions.</p>
<p>After my presentation was over and I had answered all the questions, I headed to one of the lounge rooms where I had made an appointment to meet with Ryan Walsh, a Corporate Account Executive at Percona. We discussed various opportunities for cooperation between Percona and <a href="/blog/tags/NHN">NHN</a>, the company behind CUBRID development.</p>
<p><a href="http://www.percona.com/">Percona</a> is a widely-known and reputable MySQL support and consulting company. It is known to be the oldest and largest independent company which provides not only MySQL support, consulting and training but also&nbsp;<a href="http://www.percona.com/development/mysql">develops a custom MySQL server</a>, i.e. provides patches, "backport changes to older MySQL versions to obtain a key patch without a full version upgrade".</p>
<p>During our conversation Ryan introduced his company and told me about the large-scale cases it has worked on so far. One that I would like to mention today is that some of the services at Amazon Web Services are said to have been developed by Percona; Amazon RDS in particular was said to have been developed by the Percona team. Percona database tools <a href="https://forums.aws.amazon.com/message.jspa?messageID=318349">seem to work with RDS</a> natively.</p>
<p>Percona is also <a href="https://www.percona.com/live/mysql-conference-2013/sessions/using-percona-server-database-service-openstack">cooperating</a> with HP to build RedDwarf DaaS&nbsp;as part of the OpenStack open source cloud project. At the Percona conference HP engineers presented how to use the RedDwarf&nbsp;APIs to use and administer the features of Percona Server.&nbsp;Percona's vast knowledge and experience in building cloud database services may be quite beneficial to NHN in developing and providing its own cloud computing service.</p>
<p>Overall, both presentations went well. I talked to many attendees and answered quite a lot of their questions about CUBRID SHARD and CUBRID Database. One thing which requires more attention from NHN is its global brand recognition. The more developers recognize NHN and its services, the more of them will be eager to listen to and learn from NHN engineers.</p>
<p>If you have any feedback or suggestions, feel free to comment below. Also you should follow us on twitter <a href="http://twitter.com/cubrid">here</a>.</p>]]></description>
                        <pubDate>Mon, 13 May 2013 15:58:31 +0900</pubDate>
                        <category>conference</category>
                        <category>RIT++</category>
                        <category>Percona</category>
                        <category>Sharding</category>
                        <category>NHN</category>
                                </item>
        										        <item>
            <title>Understanding Encryption and Security through Java Cryptography Architecture</title>
            <dc:creator>Jaehee Ahn</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/understanding-encryption-security-through-java-cryptography-architecture/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/understanding-encryption-security-through-java-cryptography-architecture/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/understanding-encryption-security-through-java-cryptography-architecture/#comment</comments>
                                    <description><![CDATA[<p>For a long time, Java has provided security-related functions. Among the security-related functions, <a href="http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html">Java Cryptography Architecture</a> (JCA) is the core one. JCA uses a provider structure with a variety of APIs related to security. These functions are essential for modern IT communication encryption technology, including Digital Signature, Message Digest (hashes), Certificate, Certificate Validation, creation and management of Keys, and creation of Secure Random Numbers.</p>
<p>With JCA, even developers who do not have specialized knowledge of encryption can successfully implement security-related functions. You do not need to implement the algorithms that you racked your brain over for so long in computer science and cryptology classes. JCA lets you use these algorithms with a few lines of code. Of course, using the APIs well is highly valuable for business. However, this does not mean that you do not need to understand how JCA works. Understanding how JCA works internally is important for using its functions more efficiently.</p>
<p>To be a better software developer and architect, you may need to trace how the result, JCA, was created from the cryptology and security-related algorithms. This article is a summary of JCA architecture that I learned while producing the nClavis (Symmetric-key cryptography) at <a href="/blog/tags/NHN">NHN</a>. Of course, I do not understand all of JCA yet. However, I was so happy to understand JCA at this level that I decided to write this article to share my experience with you.</p>
<h2>Design Principles</h2>
<p>As I mentioned, JCA is a Java security platform, based on the provider structure, having implementation independence, implementation interoperability, and algorithm extensibility.</p>
<p>An application can utilize the information protection encryption technology just by requesting security services on the Java platform, without implementing security algorithms. JCA-provided security services are implemented by the provider mounted on the Java security platform. An application can introduce a variety of security functions by using several independent providers. The list of providers is described in the <strong>jre/lib/security/java.security</strong> file. The Java platform includes many providers and installs them by default when JRE is installed.</p>
<div></div>
<p style="text-align: center;"><strong>Code 1: java.security file.</strong></p>
<div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#<br /> # List of providers and their preference orders (see above):<br /> #<br /> security.provider.1=sun.security.provider.Sun<br /> security.provider.2=sun.security.rsa.SunRsaSign<br /> security.provider.3=com.sun.net.ssl.internal.ssl.Provider<br /> security.provider.4=com.sun.crypto.provider.SunJCE<br /> security.provider.5=sun.security.jgss.SunProvider<br /> security.provider.6=com.sun.security.sasl.Provider<br /> security.provider.7=org.jcp.xml.dsig.internal.dom.XMLDSigRI<br /> security.provider.8=sun.security.smartcardio.SunPCSC<br /> security.provider.9=sun.security.mscapi.SunMSCAPI</div>
<div></div>
<p>The providers mounted on the Java security platform by default are compatible with all Java applications and are so widely used that they are regarded as trusted. Of course, JCA also supports mounting custom providers for applications which want to introduce the latest security technology that has not been implemented yet.</p>
<h2>Architecture</h2>
<h3>Cryptographic Service Providers</h3>
<p>Every provider is an implementation of java.security.Provider. This provider implementation includes the list of security algorithm implementations. When an instance of a specific algorithm is needed, the JCA framework searches the provider repository for the proper implementation class of that algorithm and creates a class instance. The providers defined in the java.security file are included in the repository by default. In this way, a provider can be included statically. In addition, a provider can be added dynamically at runtime. When several providers are defined, they may implement an identical encryption algorithm in different ways. In this case, an application can specify the provider explicitly or rely on the preference order in the repository.</p>
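<p>As a minimal sketch of this lookup, the following lists the providers registered in the running JVM and shows which one serves a request when no provider is named. The class name <code>ProviderLookup</code> and the method <code>defaultProviderFor</code> are made up for illustration; the output depends on the JRE in use.</p>

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;

// Hypothetical helper class, not part of JCA.
public class ProviderLookup {

    /** Returns the name of the provider that JCA selects for the given digest algorithm. */
    public static String defaultProviderFor(String digestAlgorithm) {
        try {
            // No provider is named, so the registered providers are searched in preference order.
            return MessageDigest.getInstance(digestAlgorithm).getProvider().getName();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("No installed provider implements " + digestAlgorithm, e);
        }
    }

    public static void main(String[] args) {
        // Providers registered statically (java.security) or dynamically (Security.addProvider).
        for (Provider provider : Security.getProviders()) {
            System.out.println(provider.getName() + ": " + provider.getInfo());
        }
        System.out.println("SHA-256 is served by: " + defaultProviderFor("SHA-256"));
    }
}
```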
<p>To use JCA, an application simply requests a specific object type (such as MessageDigest) and an algorithm or service (e.g., MD5). Then, the application obtains an implementation from one of the installed providers. Of course, it can explicitly request an object of a specific provider.</p>
<div></div>
<p style="text-align: center;"><b style="text-align: center;">Code 2: Requesting an Object of a Provider.</b></p>
<div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">md = MessageDigest.getInstance("MD5");<br /> md = MessageDigest.getInstance("MD5", "ProviderC");</div>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/provider_framework_of_jca_java.png" alt="provider_framework_of_jca_java.png" width="490" height="413" /></p>
<p style="text-align: center;"><b>Figure 1: Provider Framework of JCA</b> (Source: <a href="http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html">http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html</a>).</p>
<p>Oracle JRE (Sun JDK) includes a variety of providers (Sun, SunJSSE, SunJCE, and SunRsaSign) by default. The providers are divided according to how they were produced rather than by function; the functions and algorithms of each provider are not very different from each other. JREs other than Oracle's are not required to include these providers. Therefore, implementing an application in a provider-dependent way is not recommended. All encryption technology implementations required by an application are provided by default and implemented at a fully reliable level, so developers usually do not need to pay attention to the provider itself.</p>
<h3>Key Management</h3>
<p>Two of the most important things in JCA are <b>provider</b> and <b>key management</b>. Java uses a kind of key database, called "<b>keystore</b>", to manage the key/certificate repository. A KeyStore is useful in any application that needs information for authentication, encryption, or signature.</p>
<div></div>
<p>An application can access the KeyStore by using an implementation of the <code>java.security.KeyStore</code> class. The default KeyStore implementation is provided by Sun Microsystems (the package name still starts with com.sun even though Oracle acquired it a long time ago :D). By default, the KeyStore is created as a file of the &ldquo;jks&rdquo; type. It can also be converted to the &ldquo;<b>jceks</b>&rdquo; or &ldquo;<b>pkcs12</b>&rdquo; type in order to support applications which use another KeyStore implementation format.</p>
<p>The "<b>jceks</b>" format protects the KeyStore with password-based encryption (PBE) using Triple DES, which is stronger than the protection used by the "<b>jks</b>" type.</p>
<p>The "<b>pkcs12</b>" format is a standard syntax, defined by RSA Laboratories, for exchanging personal information. On machines, applications, browsers, and Internet kiosks that support this standard, users can export, import, or activate personally identifying information (certificates for identification, i.e. <b>pkcs12</b> format certificates). Safari, Chrome, and IE follow this standard, so once a <b>pkcs12</b> certificate file is installed, it applies to all of these browsers. Firefox, however, maintains its own certificate store, so the certificate has to be imported into Firefox separately.</p>
<p>A KeyStore stores and manages SecretKeys (symmetric keys), public/private key pairs (asymmetric keys), self-signed certificates, and certificates signed by a trusted CA (Certificate Authority) or a private CA in a file. Here, experienced developers may recall OpenSSL, which is often used to create certificate files. Both kinds of certificate files serve the same purpose but have different file formats. However, the formats can be converted into each other.</p>
<p>Java even provides the '<b>keytool</b>' command-line utility in the JDK_HOME/bin directory, which is similar to OpenSSL's utility. Therefore, you can handle certificates with keytool on Windows as well, whereas OpenSSL is most commonly used on Linux. Of course, this tool runs with the KeyStore implementation provided by Oracle JRE. If the JDK is installed on the system, a certificate can be created with keytool. Note, however, that keytool provides for Java roughly the same level of functions that OpenSSL provides. A KeyStore file created with keytool is compatible with earlier Java versions, so you do not need to worry about that.</p>
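<p>To make the KeyStore API more concrete, here is a small sketch that stores an AES SecretKey in a "jceks" KeyStore and reads it back. It works in memory; a FileOutputStream would produce the usual KeyStore file. The class and method names are hypothetical, and the sketch assumes a JRE whose providers support the JCEKS type (as Oracle JRE does).</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper class for illustration only.
public class KeyStoreSketch {

    /** Generates a fresh 128-bit AES key to put into the KeyStore. */
    public static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES key generation failed", e);
        }
    }

    /** Creates a JCEKS KeyStore holding one secret key and returns its serialized bytes. */
    public static byte[] createStore(String alias, SecretKey key, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JCEKS");
            store.load(null, password); // a null stream starts an empty KeyStore
            store.setEntry(alias, new KeyStore.SecretKeyEntry(key),
                    new KeyStore.PasswordProtection(password));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            store.store(out, password);
            return out.toByteArray();
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Could not create the KeyStore", e);
        }
    }

    /** Loads the serialized KeyStore again and reads the key stored under the alias. */
    public static SecretKey readKey(byte[] storeBytes, String alias, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JCEKS");
            store.load(new ByteArrayInputStream(storeBytes), password);
            return (SecretKey) store.getKey(alias, password);
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Could not read the KeyStore", e);
        }
    }
}
```

<p>The same password protects both the KeyStore and the entry here only to keep the sketch short; an application may choose different passwords for each.</p>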
<h4>In-depth 1: Certificate.</h4>
<p>Here, I need to address the correct meaning of <i>certificate</i>. In a narrow sense, KeyStore is a kind of certificate. A certificate is used for two purposes: as a "lock" required for encrypting information and as a tool for "identification" to identify the other party technically. The accredited certificate used for bank transactions is a security technology which utilizes both purposes of a certificate.</p>
<p>The cryptographic meaning of a certificate is an electronic document that uses a digital signature to sign the public key created by the RSA algorithm (asymmetric-key cryptography) with the private key of the certificate authority (CA).</p>
<p>It is widely accepted that a pair of keys created by using the RSA algorithm is solid: calculating one key from the other within a meaningful time is considered computationally infeasible. So, why is an electronic signature required? When "A" and "B" communicate with each other by encrypting their data, A publishes its public key and protects the private key paired with it. Then B encrypts the data to be sent to A by using the public key of A. In this case, there is a problem of how B knows it can trust the public key; is it really provided by A? If a malicious attacker "C" disguises its own public key as A's to deceive B, the public key cannot be relied on. To solve this problem, a trusted certificate authority "D" is necessary. CAs form a chain structure from the top root certificate authority down to the subordinate certificate authorities. In this structure, the upper layer certifies the lower layer by signing it. As this structure became a worldwide standard long ago, the signature chains of certificates around the world lead to a few common top root CAs (e.g. VeriSign, Thawte). Countries whose IT communication environment reaches a certain standard have their own national root CA (e.g., KISA in Korea). The top root CAs may have a circular structure in which they sign each other. In some cases, a CA signs its own public key with its private key.</p>
<p>For this reason, the signature of a CA should not be valid without limit; it should be renewed regularly, or irregularly when a security event requires it.</p>
<p>It is possible for an individual or a company to establish a private certificate authority if required. Certificates signed by the private certificate authority have a more complex certification process. You may have seen the following <b>Figure 2</b> on your browser:</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/browser_display_when_private_certificate_authority_is_used.png" alt="browser_display_when_private_certificate_authority_is_used.png" width="700" height="287" /></p>
<p style="text-align: center;"><b>Figure 2: Browser Display when Private Certificate Authority is used.</b></p>
<p>It is displayed when the website uses a certificate issued by a private certificate authority. In the browser, the user can skip this warning with a mouse click (even though this is not recommended). However, for a server-to-server connection, the peer's certificate issued by the private certificate authority should be imported into <code>JAVA_HOME/jre/lib/security/cacerts</code> or added to the <code>SSLContext</code> when the program code creates the connection.</p>
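<p>The second option, adding the certificate to an <code>SSLContext</code> in code, might look roughly like the sketch below. The class name <code>PrivateCaTrust</code>, the method name, and the alias prefix are assumptions made for this example; how the certificate objects are obtained (for example via <code>CertificateFactory</code>) is left out.</p>

```java
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper class for illustration only.
public class PrivateCaTrust {

    /** Builds an SSLContext that trusts only the given certificates (e.g. a private CA). */
    public static SSLContext contextTrusting(Certificate... trustedCertificates) {
        try {
            KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
            trustStore.load(null, null); // empty in-memory trust store
            int index = 0;
            for (Certificate certificate : trustedCertificates) {
                trustStore.setCertificateEntry("private-ca-" + index++, certificate);
            }
            // The trust managers created here consult only the trust store built above.
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init(trustStore);
            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, factory.getTrustManagers(), null);
            return context;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("Could not build the SSLContext", e);
        }
    }
}
```

<p>The resulting context can then be handed to the connection, for example with <code>HttpsURLConnection.setSSLSocketFactory(context.getSocketFactory())</code>. Note that a context built this way no longer trusts the official CAs in cacerts unless they are added as well.</p>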
<p>Certificates from official certificate authorities are widely recognized as trusted and are included in the OS or JRE by default. Therefore, the situation illustrated in <b>Figure 2</b> does not occur. If you can obtain the certificate of a private certificate authority in pkcs12 format, it can easily be installed in the system so that the private certificate authority is treated as a trusted CA.</p>
<p style="text-align: center;"><b>Code 2: Certificate Verification Program installed in the system.</b></p>
<div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">@Windows&gt;certmgr.msc<br /> @MAC OS X&gt;Keychain Access<br /> @Linux&gt; keychain</div>
<h4>In-depth 2:  HTTPS</h4>
<p>To understand encryption, you need to understand SSL/TLS, the certificate-based cryptographic protocols, as well as certificates. The purpose of a certificate is easily misunderstood in the context of encrypting an HTTPS communication channel. When a server and a client communicate via HTTPS, if only the communication channel is encrypted without identifying the client (setting the clientAuth attribute of the HTTPS Connector to false in the server.xml setting of Tomcat), only a simple certificate verification is executed. Symmetric-key cryptography is used for encryption of the data.</p>
<p>Here, I will describe how a certificate is used through HTTPS.</p>
<div></div>
<div><ol>
<li>A client connects to a server via the HTTPS protocol (at this time, the SSL connection-defined server port is used).</li>
<li>The server sends its certificate, which contains its public key, to the client (including meta information for validation and the ciphers supported by the server).</li>
<li>The client validates the server with the public key and meta information sent by the server (checking whether the public key is signed by a trusted official Root CA).<ol>
<li>When the certificate of the server is signed by an official CA, it passes (the official CAs are registered in the system as trusted authorities by default).</li>
<li>When the certificate of the server is signed by a private CA, the client checks the trust manager of the SSL socket (SSLContext) it created. If the certificate is registered there, it passes.</li>
</ol></li>
<li>When the server's certificate passes the validation check, the client creates a symmetric key, encrypts the symmetric key and the chosen cipher with the public key of the server, and then sends them to the server (the symmetric key is created for one of the cipher algorithms supported by the server).</li>
<li>The server decrypts the symmetric key and cipher, which were encrypted with its public key, by using its private key and thus acquires the symmetric key to be used for encrypted communication.</li>
<li>After that, data communication between the server and the client is encrypted with the symmetric key created by the client.</li>
</ol></div>
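<p>The key exchange in steps 4 to 6 can be sketched with plain JCA calls. This is a simplified illustration of the idea only, not how a real SSL/TLS implementation works: certificate validation is skipped, both sides run in one method, and the algorithm choices (2048-bit RSA with OAEP padding, 128-bit AES in GCM mode) as well as the class name <code>KeyExchangeSketch</code> are assumptions made for this example.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical class that plays both the client and the server role.
public class KeyExchangeSketch {

    /** Sends a message from "client" to "server" and returns what the server decrypted. */
    public static String roundTrip(String message) {
        try {
            // Step 2: the server owns an RSA key pair; its certificate carries the public key.
            KeyPairGenerator pairGenerator = KeyPairGenerator.getInstance("RSA");
            pairGenerator.initialize(2048);
            KeyPair serverKeys = pairGenerator.generateKeyPair();

            // Step 4: the client creates a symmetric key and encrypts it with the public key.
            KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
            keyGenerator.init(128);
            SecretKey clientSessionKey = keyGenerator.generateKey();
            Cipher rsa = Cipher.getInstance("RSA/ECB/OAEPWithSHA-256AndMGF1Padding");
            rsa.init(Cipher.WRAP_MODE, serverKeys.getPublic());
            byte[] wrappedKey = rsa.wrap(clientSessionKey);

            // Step 5: the server recovers the symmetric key with its private key.
            rsa.init(Cipher.UNWRAP_MODE, serverKeys.getPrivate());
            SecretKey serverSessionKey =
                    (SecretKey) rsa.unwrap(wrappedKey, "AES", Cipher.SECRET_KEY);

            // Step 6: the data itself is encrypted with the shared symmetric key.
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher clientCipher = Cipher.getInstance("AES/GCM/NoPadding");
            clientCipher.init(Cipher.ENCRYPT_MODE, clientSessionKey, new GCMParameterSpec(128, iv));
            byte[] encrypted = clientCipher.doFinal(message.getBytes(StandardCharsets.UTF_8));

            Cipher serverCipher = Cipher.getInstance("AES/GCM/NoPadding");
            serverCipher.init(Cipher.DECRYPT_MODE, serverSessionKey, new GCMParameterSpec(128, iv));
            return new String(serverCipher.doFinal(encrypted), StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Key exchange sketch failed", e);
        }
    }
}
```

<p>The asymmetric key pair is used only once, to transport the symmetric key; all application data is then protected with the much faster symmetric cipher.</p>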
<h2 class="xe_content">JCA Structure</h2>
<p class="xe_content">JCA structure can be described with Engine class and algorithm. In JCA, an Engine class provides interfaces for all encryption service types regardless of a specific encryption algorithm or provider. The engine class provides one of the following functions:</p>
<ul>
<li>Encryption operations (encryption, digital signatures, message digests, etc.) </li>
<li>Creating and converting the elements (keys and algorithm parameters) required for encryption </li>
<li>Objects (keystores or certificates) that encapsulate the cryptographic data and can be used at higher layers of abstraction </li>
</ul>
<p>Let's take a look at the Engine classes provided by JCA and talk about encryption in detail.</p>
<h3>SecureRandom</h3>
<p>The <code>SecureRandom</code> class is used to create pseudo random numbers.&nbsp;In Java, "random" actually refers to pseudo random, to be more accurate. So, are random and pseudo random different? Technically, they are. There are two random types: <b>True Random</b> and <b>Pseudo Random</b>.</p>
<p><b>True Random</b> is a random number which cannot be predicted. You might think that a pseudo random number cannot be predicted either. However, Pseudo Random is a random sequence determined by a seed and a mathematical algorithm. Its sequence eventually repeats, even if that takes a very long time or the probability is very low. In addition, if you know the seed and the random algorithm, you can predict the Pseudo Random sequence. True Random creates a random number based on atomic physical phenomena, not in the mathematical way used by Pseudo Random. Without hardware equipment to measure such physical phenomena, e.g., electromagnetic noise and radioactive decay, it is impossible to create True Random numbers.</p>
<p>The JCA SecureRandom class is an engine class that provides a powerful function to create random numbers. As I described, it is not easy to implement a True Random Number Generator (TRNG). Therefore, many implementations are Pseudo Random Number Generators (PRNG). As mentioned before, the randomness of pseudo random numbers is imperfect. The commonly used Random class (java.util.Random) does not satisfy the minimum level that is required cryptographically. Therefore, an implementation of SecureRandom should be verified to satisfy the requirements of a cryptographically secure generator (CSPRNG).</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/classification_of_encryption_type.png" alt="classification_of_encryption_type.png" width="530" height="305" /></p>
<p style="text-align: center;"><strong>Figure 3: Classification of Encryption Type</strong> (source: <a href="http://en.wikipedia.org/wiki/Cipher">http://en.wikipedia.org/wiki/Cipher</a>).</p>
<p>Now, you may ask why creating random numbers is considered so important for encryption.</p>
<p>The core of modern cryptology is the key used for encryption. Previously, cryptology was based on a conversion table, much like Base64 or UTF8 encoding. In modern cryptology, that kind of traditional method is no longer considered encryption. The key is a random sequence generated by a random sequence generator. When we think of cryptology, we naturally think of the encryption algorithm. However, the publicly known symmetric-key encryption algorithms can, in simplified terms, be reduced to an XOR operation (or a multiplication/division operation) on the input values and the key stream. As I said, the core is the key.</p>
<p>If the randomness of the random sequence generator is not ensured, the entire key may be revealed to an attacker once a part of a created key or some consecutive random values leak out. In modern cryptology, the key used for encryption is the core. So, the random sequence generator is very important.</p>
<p>JCA uses SecureRandom as the random sequence generator. Implementation of an algorithm to create random numbers is provided by a provider, like other encryption algorithms.</p>
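<p>As a minimal sketch, key material can be drawn from SecureRandom as follows. The class name <code>RandomKeyMaterial</code> is made up for this example, and which algorithm backs the default constructor depends on the installed providers.</p>

```java
import java.security.SecureRandom;

// Hypothetical helper class for illustration only.
public class RandomKeyMaterial {

    // The default constructor picks the preferred SecureRandom implementation of the providers.
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns the requested number of cryptographically strong random bytes. */
    public static byte[] randomBytes(int length) {
        byte[] bytes = new byte[length];
        RANDOM.nextBytes(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        byte[] keyMaterial = randomBytes(16); // e.g. material for a 128-bit key
        System.out.println("Generated " + keyMaterial.length
                + " bytes using " + RANDOM.getAlgorithm());
    }
}
```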
<h3>MessageDigest</h3>
<p>MessageDigest is used to calculate the message digest (hash) of input data.</p>
<p>The purpose of a message digest is an integrity check, i.e. checking whether the original file has been preserved as it is. A message digest algorithm processes a variable-length original message into a fixed-length hash output. It is a one-way hash function, so it is not possible to derive the original value from the hash value. When A and B are communicating with each other, A sends the original message, the hash value of the original message, and the name of the message digest algorithm to B. B calculates the hash value by using the algorithm and the original message sent from A. When the hash value calculated by B is identical to the hash value sent from A, it means that the original message sent from A was not changed or modified before B received it via the network.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/using_messagedigest_at_download_site.png" alt="using_messagedigest_at_download_site.png" width="477" height="215" /></p>
<p style="text-align: center;"><b>Figure 4: Example of Using MessageDigest at a Download Site.</b></p>
<p>You can frequently see checksums or digital fingerprints at download sites. These are alternative names for a message digest. MD5 and SHA1 are well-known message digest algorithms.</p>
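<p>The download-site check described above can be sketched roughly as follows. The class <code>DigestSketch</code> and its methods are hypothetical names for this illustration; SHA-1 is used only because the article mentions it, and the published value is assumed to be a lowercase hexadecimal string.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, used only for illustration.
class DigestSketch {

    // Computes the SHA-1 digest of the data and returns it as lowercase hex.
    static String sha1Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        byte[] hash = md.digest(data);
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Receiver side: recompute the digest and compare it with the published one.
    static boolean isIntact(byte[] received, String publishedHex)
            throws NoSuchAlgorithmException {
        return sha1Hex(received).equals(publishedHex);
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "abc".getBytes(StandardCharsets.UTF_8);
        String published = sha1Hex(original);
        System.out.println("digest: " + published);
        System.out.println("intact: " + isIntact(original, published));
    }
}
```

<p>Whatever the length of the input, the digest has a fixed length (20 bytes, or 40 hexadecimal characters, for SHA-1).</p>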
<h3>Signature</h3>
<p>Signature is used to sign data and to verify a digital signature, using a key received during initialization. Receiving a key means that key-based encryption is performed.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/flow_of_actions_of_signature_object.png" alt="flow_of_actions_of_signature_object.png" width="444" height="172" /></p>
<p style="text-align: center;"><b>Figure 5: Flow of Actions of Signature Object</b>&nbsp;(source: <a href="http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html">http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html</a>).</p>
<p>During initialization for signing, the Signature object receives the private key; the original data to be signed is then supplied through the update() method, which completes the preparation for signing. The sign() method of Signature signs the original data with the private key and returns the Signature Bytes. To validate the signed data, a Signature object for verification is initialized with the public key paired with the private key used for signing. This object then receives the original data and the Signature Bytes, and the verify() method checks whether the signature matches the data to determine the reliability of the original data. A signature can be made only by the person who holds the private key. Verification, however, uses the public key, so anyone who has acquired the public key can perform it.</p>
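<p>The sign and verify flow in Figure 5 might look like the following sketch. The class <code>SignatureSketch</code> and its methods are hypothetical names; the example assumes a freshly generated 2048-bit RSA key pair and the SHA256withRSA algorithm rather than keys loaded from a KeyStore.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical helper class, used only for illustration.
class SignatureSketch {
    static final String ALGORITHM = "SHA256withRSA"; // assumption for this sketch

    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Signing side: initSign(private key), update(data), sign().
    static byte[] sign(PrivateKey privateKey, byte[] data) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(ALGORITHM);
        signer.initSign(privateKey);
        signer.update(data);
        return signer.sign();
    }

    // Verifying side: initVerify(public key), update(data), verify(signature bytes).
    static boolean verify(PublicKey publicKey, byte[] data, byte[] signatureBytes)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance(ALGORITHM);
        verifier.initVerify(publicKey);
        verifier.update(data);
        return verifier.verify(signatureBytes);
    }

    public static void main(String[] args) throws Exception {
        KeyPair pair = newKeyPair();
        byte[] data = "original data".getBytes(StandardCharsets.UTF_8);
        byte[] signatureBytes = sign(pair.getPrivate(), data);
        System.out.println("valid: " + verify(pair.getPublic(), data, signatureBytes));
    }
}
```

<p>If either the data or the signature bytes are altered after signing, <code>verify()</code> is expected to return false.</p>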
<h2>Digital Signature vs Cryptography vs MessageDigest</h2>
<p>For cryptography, users can select either the symmetric key method or the asymmetric key method according to their requirements. A digital signature is also a kind of cryptography, but asymmetric key encryption is a prerequisite for it. More precisely, a digital signature is a combination of MessageDigest and asymmetric key encryption. Large, variable-length data is first reduced by the MessageDigest to a fixed-length value that is easy to manage, and that value is then signed with a private key to create fixed-length signature bytes. You can see this principle in the signature algorithm names passed to Signature.getInstance(), such as SHA1withRSA, MD5withRSA, and SHA1withDSA: each name combines a MessageDigest algorithm (SHA1 or MD5) with an asymmetric key algorithm (RSA or DSA).</p>
<div></div>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">Signature dsa = Signature.getInstance("SHA1withDSA");</div>
<div></div>
<h3>Signed Certificate</h3>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">Keystore Type: jks<br /> Keystore Provider: SUN<br /> <br /> Keystore includes the following two items:<br /> <br /> Alias: rootcaalias<br /> Written on: 2012. 9. 26<br /> Input Type: trustedCertEntry<br /> <br /> Holder: CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC<br /> Issuer: CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC<br /> Serial Number: <br /> Opened on: Fri Apr 06 10:17:08 KST 2012 Expired on: Sun Mar 13 10:17:08 KST 2112<br /> Certificate Fingerprint:<br /> MD5: 0C:FC:12:C5:68:E5:95:0B:95:7D:B0:2F:FA:4F:DB:B4<br /> SHA1: 90:37:1C:E6:F4:64:AD:E6:27:AA:4F:58:88:16:11:24:6D:A5:EB:2B<br /> <br /> <br /> *******************************************<br /> *******************************************<br /> <br /> Alias: nplatform<br /> Written on: 2012. 9. 26<br /> Input Type: keyEntry<br /> Length of Certificate Chain: 2<br /> Certificate[1]:<br /> Holder: O=NHN INC, OU=NHN NBP, CN= NPLAFORM, UID=1<br /> Issuer: CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC<br /> Serial Number: <br /> Opened on: Fri Sep 21 17:26:22 KST 2012 Expired on: Sun Aug 28 17:26:22 KST 2112<br /> Certificate Fingerprint:<br /> MD5: 48:8C:46:A3:E7:54:58:97:60:0D:5C:56:08:B0:D1:E7<br /> SHA1: 12:64:3C:DA:C1:2C:94:1A:2B:EB:E9:98:2B:DA:8F:06:78:6E:26:1E<br /> <br /> Certificate[2]:<br /> Holder: CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC<br /> Issuer: CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC<br /> Serial Number: <br /> Opened on: Fri Apr 06 10:17:08 KST 2012 Expired on: Sun Mar 13 10:17:08 KST 2112<br /> Certificate Fingerprint:<br /> MD5: 0C:FC:12:C5:68:E5:95:0B:95:7D:B0:2F:FA:4F:DB:B4<br /> SHA1: 90:37:1C:E6:F4:64:AD:E6:27:AA:4F:58:88:16:11:24:6D:A5:EB:2B</div>
<div></div>
<p>Let's review the certificate and signature concepts, which were described in the previous in-depth section, in terms of the JCA Signature object mechanism.</p>
<p>The above text box is the content of a KeyStore file created by using the Java keytool. From the Holder and Issuer of Certificate[1], you can see that &ldquo;CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC&rdquo; has signed the certificate of the Holder &ldquo;O=NHN INC, OU=NHN NBP, CN= NPLAFORM, UID=1&rdquo; by using its private key. The Certificate Fingerprint is a message digest of the certificate, so its length is decided by the MessageDigest algorithm (MD5 or SHA1). Following the Certificate Chain, you can see that the Holder and Issuer of Certificate[2] are identical. This means that the certificate of &ldquo;CN=NSYMKEY Root CA, OU=NHN NBP, O=NHN INC&rdquo; is self-signed with its own private key. Because Certificate[2] is self-signed, the Certificate Chain ends here.</p>
<div></div>
<h3>Cipher Class</h3>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/flow_of_actions_of_cipher_object.png" alt="flow_of_actions_of_cipher_object.png" width="451" height="136" /></p>
<p style="text-align: center;"><strong>Figure 6: Flow of Actions of Cipher Object</strong> (source: <a href="http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html">http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html</a>).</p>
<p>The Cipher class provides encryption and decryption functions. The encryption/decryption algorithms are classified as follows: symmetric bulk encryption (AES, DES, DESede, Blowfish, IDEA), stream encryption (RC4), asymmetric encryption (RSA), and password-based encryption (PBE).</p>
<p>I will not describe the classification of encryption into symmetric and asymmetric because it is so well known.</p>
<div></div>
<h3>Stream vs. Block Cipher</h3>
<div></div>
<p>Symmetric bulk encryption can be classified into Stream Cipher and Block Cipher. A Block Cipher encrypts data in fixed-length blocks. Data whose length is not a multiple of the block length is padded with dummy values, and the padded bytes are removed when the data is decrypted. The padding follows the padding type (e.g., PKCS5PADDING) that is specified when the Cipher is created. In contrast, a Stream Cipher processes input data one byte or one bit at a time. Therefore, it can process variable-length data without padding.</p>
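<p>The effect of padding on output length can be observed with a small sketch like the one below. The class <code>PaddingSketch</code> is a hypothetical name; AES with a 128-bit key is assumed, ECB appears only to keep the length demonstration short, and AES in CTR mode stands in for a stream-style cipher (the article itself names RC4).</p>

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper class, used only for illustration.
class PaddingSketch {

    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    // Block cipher with padding: output grows to the next multiple of 16 bytes.
    static int paddedLength(SecretKey key, int inputLength) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        return cipher.doFinal(new byte[inputLength]).length;
    }

    // Stream-style processing: output length equals input length, no padding.
    static int streamLength(SecretKey key, int inputLength) throws GeneralSecurityException {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        return cipher.doFinal(new byte[inputLength]).length;
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newAesKey();
        System.out.println("5 bytes, padded block cipher: " + paddedLength(key, 5));
        System.out.println("16 bytes, padded block cipher: " + paddedLength(key, 16));
        System.out.println("5 bytes, stream-style mode: " + streamLength(key, 5));
    }
}
```

<p>Note that an input that already fills a whole block still receives one full block of padding, so that the padding can be removed unambiguously during decryption.</p>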
<div></div>
<h3>Modes Of Operation</h3>
<div></div>
<p>The important Block Cipher concept you should know is Feedback Modes. Assume a very simple block cipher: if the input data is identical, the encrypted result is identical. From the repeated patterns this produces, attackers obtain a hint for decrypting the encrypted data.</p>
<p>To avoid this vulnerability and make the Cipher more complex, Feedback Mode was introduced. Feedback Mode combines (by XOR operation) the Nth input data block (or the Nth encrypted result block) with the (N-1)th input data block (or the (N-1)th encrypted result block) during the Nth encryption step. Therefore, even when input data blocks are identical, the results differ depending on the values carried over from the previous encryption step. Note one more thing: when N = 1, there is no previous step to take a value from. In this case, the Initialization Vector (IV) takes the place of the value from the previous step. To use Feedback Mode, the IV should be randomly created before encryption. The IV used for encryption should be stored because it is necessary for decryption as well.</p>
<p>The feedback modes provided by JCA are CBC, CFB, and OFB. For distinction, the mode in which no feedback is used is called ECB. A more detailed description of each mode will not be provided here.</p>
<p>Figure 7 shows the importance of feedback modes. If the original image data is encrypted without a feedback mode (ECB mode), identical input blocks produce identical encrypted blocks. Therefore, the outline of the original image remains visible in the encrypted result.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/image_encryption.png" alt="image_encryption.png" width="608" height="328" /></p>
<p style="text-align: center;"><strong>Figure 7: Image Encryption</strong> (source: <a href="http://en.wikipedia.org/wiki/Modes_of_operation">http://en.wikipedia.org/wiki/Modes_of_operation</a>).</p>
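<p>The pattern leak shown in Figure 7 can be reproduced on a very small scale. The sketch below encrypts two identical 16-byte blocks and reports whether the two encrypted blocks are also identical. The class <code>ModeSketch</code> and its method are hypothetical names, and AES with a 128-bit key is an assumption made for the example.</p>

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper class, used only for illustration.
class ModeSketch {

    // Encrypts two identical blocks and tells whether the encrypted blocks repeat.
    static boolean encryptedBlocksRepeat(String transformation, boolean needsIv)
            throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();

        Cipher cipher = Cipher.getInstance(transformation);
        if (needsIv) {
            // Feedback modes need a random IV for the first block.
            byte[] iv = new byte[16];
            new SecureRandom().nextBytes(iv);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        } else {
            cipher.init(Cipher.ENCRYPT_MODE, key);
        }

        byte[] input = new byte[32]; // two identical all-zero 16-byte blocks
        byte[] output = cipher.doFinal(input);
        byte[] firstBlock = Arrays.copyOfRange(output, 0, 16);
        byte[] secondBlock = Arrays.copyOfRange(output, 16, 32);
        return Arrays.equals(firstBlock, secondBlock);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("ECB repeats: " + encryptedBlocksRepeat("AES/ECB/NoPadding", false));
        System.out.println("CBC repeats: " + encryptedBlocksRepeat("AES/CBC/NoPadding", true));
    }
}
```

<p>In ECB mode the two encrypted blocks match, which is exactly the repetition that keeps the image outline visible; in CBC mode the second block depends on the first encrypted block, so the repetition disappears.</p>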
<h3>Creating Cipher Object</h3>
<p>The essential step in creating a Cipher instance is specifying the transformation. A transformation has the form algorithm/feedback mode/padding, as described above. You may specify only the algorithm name; in that case the default feedback mode and padding (ECB/PKCS5Padding) are applied internally.</p>
<div></div>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">Cipher c1 = Cipher.getInstance("DES/ECB/PKCS5Padding");<br /> <br /> or<br /> <br /> Cipher c1 = Cipher.getInstance("DES");</div>
<div></div>
<p>A Cipher instance is initialized in one of four modes (opmode): Encryption, Decryption, Wrap, or Unwrap. The last two are used for keys:</p>
<ul>
<li>WRAP_MODE: Wraps a java.security.Key into bytes so that the key can be transmitted securely </li>
<li>UNWRAP_MODE: Unwraps a wrapped key back into a java.security.Key object </li>
</ul>
<div>
<p>A Cipher instance is initialized by calling the <code>init()</code> method, which takes opmode, key (or certificate), params, and random as its parameters. Here, note the AlgorithmParameters-type params parameter. It holds the IV value of the feedback mode, or the salt and iteration count values of the PBE algorithm. These values are not required when initializing a Cipher in <code>ENCRYPT_MODE</code>: they can be created randomly by SecureRandom and used for the encryption process, and the created values are stored in the AlgorithmParameters field of the encryption Cipher object. On the other hand, the params value is required when initializing a Cipher in <code>DECRYPT_MODE</code>, because decryption needs the same params value that was used for encryption. When the <code>init()</code> method is called, all existing values are deleted from the Cipher object. Therefore, before initializing the Cipher instance again, call the <code>getParameters()</code> method and store the AlgorithmParameters object used for the encryption process.</p>
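<p>A rough sketch of this sequence follows: encrypt without passing params, save the generated parameters with <code>getParameters()</code>, then reuse them for decryption. The class <code>CipherParamsSketch</code> and its method are hypothetical names, and AES in CBC mode with a 128-bit key is an assumption made for the example.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.AlgorithmParameters;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper class, used only for illustration.
class CipherParamsSketch {

    // Encrypts and then decrypts the input, returning the recovered bytes.
    static byte[] roundTrip(byte[] plaintext) throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();

        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");

        // ENCRYPT_MODE without params: the provider creates a random IV itself.
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] encrypted = cipher.doFinal(plaintext);

        // Save the parameters (the IV) before the Cipher is initialized again.
        AlgorithmParameters params = cipher.getParameters();

        // DECRYPT_MODE needs the same parameters that were used for encryption.
        cipher.init(Cipher.DECRYPT_MODE, key, params);
        return cipher.doFinal(encrypted);
    }

    public static void main(String[] args) throws Exception {
        byte[] recovered = roundTrip("This is a secret".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

<p>In a real exchange the saved parameters would have to travel with the encrypted data, since the receiving side cannot decrypt without them.</p>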
<p>To make this simpler, the encryption result can be kept in a <code>SealedObject</code>. The <code>SealedObject</code> class receives the object to encrypt and a Cipher object as arguments (the sealing process). The <code>SealedObject</code> itself is the encrypted data, and it also manages the algorithm parameters used for encryption. If you pass it the key used in the encryption process, you can obtain the decrypted data (the unsealing process).</p>
</div>
<p style="text-align: center;"><strong>Code 4: Encryption using SealedObject.</strong></p>
<pre class="brush:java"><div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">// Requires: javax.crypto.Cipher, KeyGenerator, SealedObject, SecretKey
// Generate the symmetric key (the same key is needed again for decryption)
SecretKey sKey = KeyGenerator.getInstance("DES").generateKey();

// Create and initialize the Cipher object
Cipher c = Cipher.getInstance("DES");
c.init(Cipher.ENCRYPT_MODE, sKey);

// Create the SealedObject: it holds the encrypted data
SealedObject so = new SealedObject("This is a secret", c);</div></pre>
<p style="text-align: center;"><strong>Code 5: Decryption using SealedObject.</strong></p>
<pre class="brush:java"><div editor_component="code_highlighter" code_type="Java" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">// Note: sKey is the same key that was used for encryption.
// Note: so is the SealedObject that was previously created.

// Decryption using SealedObject #1: decrypt using a Cipher object
c.init(Cipher.DECRYPT_MODE, sKey);
try {
	String s = (String) so.getObject(c);
} catch (Exception e) {
	// handle the failure (for example, a wrong key or corrupted data)
}

// Decryption using SealedObject #2: decrypt using the encryption key
try {
	String s = (String) so.getObject(sKey);
} catch (Exception e) {
	// handle the failure (for example, a wrong key or corrupted data)
}</div></pre>
<h3>Message Authentication Codes(MAC)</h3>
<p>MAC is similar to MessageDigest in that it creates a hash value; however, it differs from MessageDigest in that it requires a SecretKey (symmetric key) for initialization. MessageDigest allows any receiving party to run an integrity check on the received message, whereas MAC allows only a party that holds the identical SecretKey to do so. MAC is therefore used among parties who share the SecretKey.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/824/629/flow_of_actions_of_mac.png" alt="flow_of_actions_of_mac.png" width="308" height="170" /></p>
<p style="text-align: center;"><strong>Figure 8. Flow of Actions of MAC</strong> (source: <a href="http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html">http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html</a>).</p>
<div>
<p>HMAC is a MAC based on a cryptographic hash function (a MessageDigest algorithm such as MD5 or SHA1). HMAC is a combination of a MessageDigest algorithm and a shared SecretKey.</p>
<p>Signature is different from HMAC because Signature uses an asymmetric key. HMAC allows the other party to be authenticated faster than a signature that uses the RSA algorithm. So, some services strategically use HMAC.</p>
</div>
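<p>As a minimal sketch of the MAC flow in Figure 8, the example below creates an HMAC tag with a shared key and checks it on the receiving side. The class <code>HmacSketch</code> and its methods are hypothetical names; HmacSHA256 and the raw byte-array key are assumptions made for the example.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class, used only for illustration.
class HmacSketch {
    static final String ALGORITHM = "HmacSHA256"; // assumption for this sketch

    // Sender side: compute the tag over the message with the shared key.
    static byte[] tag(byte[] sharedKey, byte[] message) throws GeneralSecurityException {
        Mac mac = Mac.getInstance(ALGORITHM);
        mac.init(new SecretKeySpec(sharedKey, ALGORITHM));
        return mac.doFinal(message);
    }

    // Receiver side: recompute the tag with the same key and compare.
    static boolean isAuthentic(byte[] sharedKey, byte[] message, byte[] receivedTag)
            throws GeneralSecurityException {
        return MessageDigest.isEqual(tag(sharedKey, message), receivedTag);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "shared-secret-key".getBytes(StandardCharsets.UTF_8);
        byte[] message = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] tag = tag(key, message);
        System.out.println("tag length: " + tag.length);
        System.out.println("authentic: " + isAuthentic(key, message, tag));
    }
}
```

<p>A party that does not hold the shared key cannot produce a tag that passes this check, which is the difference from a plain MessageDigest.</p>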
<h2>Conclusion</h2>
<div>
<p>So far, I have described about half of the JCA functions. The rest are also important even though they are not described here. This article does not cover the other core parts of JCA, such as Key, KeyPair, KeyFactory, KeyGenerator, KeyStore, CertificateFactory, and CertStore. I think these deserve a full and detailed description, but for lack of space I won't deal with them here. They will be described in the next article if possible.</p>
<p>It was very difficult to study JCA and prepare this article. The more I prepared, the more I felt there was to study and research, and I was left with even more questions while writing. I hope my article will help you to understand the "vague" concept of encryption more clearly.</p>
<p>By Jaehee Ahn, Software Engineer at Web Platform Development Lab, NHN Corporation.</p>
</div>
<h2>References</h2>
<ul>
<li><a href="http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html">http://docs.oracle.com/javase/6/docs/technotes/guides/security/crypto/CryptoSpec.html</a></li>
<li><a href="http://en.wikipedia.org/wiki/Modes_of_operation">http://en.wikipedia.org/wiki/Modes_of_operation</a></li>
<li><a href="http://en.wikipedia.org/wiki/Cipher">http://en.wikipedia.org/wiki/Cipher</a></li>
<li><a href="http://en.wikipedia.org/wiki/Stream_Cipher">http://en.wikipedia.org/wiki/Stream_Cipher</a></li>
<li><a href="http://luxsci.com/blog/how-does-secure-socket-layer-ssl-or-tls-work.html">http://luxsci.com/blog/how-does-secure-socket-layer-ssl-or-tls-work.html</a></li>
</ul>]]></description>
                        <pubDate>Tue, 09 Apr 2013 11:36:20 +0900</pubDate>
                        <category>Java</category>
                        <category>JCA</category>
                        <category>security</category>
                        <category>encryption</category>
                        <category>HTTPS</category>
                        <category>certificate</category>
                                </item>
        										        <item>
            <title>Common uses of CUBRID Node.js API with examples</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-appstools/common-uses-of-cubrid-nodejs-api-with-examples/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-appstools/common-uses-of-cubrid-nodejs-api-with-examples/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-appstools/common-uses-of-cubrid-nodejs-api-with-examples/#comment</comments>
                                    <description><![CDATA[<p>Recently <a href="http://nodejs.org/">Node.js</a> has become one of the most favorite tools developers choose to create new Web services or network applications. Some of the reasons are its <strong>event-driven</strong> and <strong>non-blocking I/O</strong> architecture which allow developers to create very lightweight, efficient and highly scalable real-time applications that run across distributed servers.</p>
<p>Node.js has been widely adopted by individual developers as well as large corporations such as LinkedIn, Yahoo!, Microsoft, and others. It has become so popular that developers have started writing and publishing so-called <strong>Node Packaged Modules</strong>, which further extend the functionality of the Node.js platform. In fact, there are over 17,000 registered modules at <a href="https://npmjs.org/">https://npmjs.org/</a>, which were downloaded over 12,000,000 times during the last month alone. <i>That</i> is how popular the Node.js platform is.</p>
<h2 id="node-cubrid">node-cubrid</h2>
<p>To allow Node.js developers to connect and work with <a href="http://www.cubrid.org">CUBRID Database Server</a>, we have developed the <a href="https://github.com/CUBRID/node-cubrid">node-cubrid</a> module and published it at NPM.</p>
<p><strong>node-cubrid</strong> provides a set of APIs to connect to and query CUBRID databases. Besides the database specific APIs, the module also supplies several <em>helper</em> APIs which are useful for sanitizing and validating user input values and for formatting and parameterizing SQL statements.</p>
<h3 id="compatibility">Compatibility</h3>
<p><strong>node-cubrid</strong> has been developed in pure JavaScript, so it has no dependency on any external library. This allows users to develop CUBRID Database based Node.js applications on any Node.js compatible platform such as Linux, Mac OS X, and Windows. For the same reason, <strong>node-cubrid</strong> is designed to work with any version of CUBRID RDBMS. However, for the time being it has been tested only with CUBRID 8.4.1.</p>
<p>This is different from other <a href="/wiki_apis">CUBRID drivers</a> such as PHP/PDO, Python, Perl, Ruby, OLEDB, and ODBC, which have a dynamic dependency on the CUBRID C Interface (CCI). Since CUBRID is available only on Linux and Windows OS, these drivers are also limited to these platforms as well as to specific CUBRID versions. However, CUBRID&rsquo;s Node.js and ADO.NET drivers do not have any such dependency, and can therefore be used on any platform where the corresponding run-time environment is able to run.</p>
<h3 id="installation">Installation</h3>
<p>Installing and using <strong>node-cubrid</strong> is easy. To install it, run the <code>npm install</code> command with the <code>node-cubrid</code> module name as an argument in the directory where your Node.js application is located.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">npm&nbsp;install&nbsp;node-cubrid</div>
<p>&nbsp;</p>
<p>This will install the latest version available at <a href="https://npmjs.org/package/node-cubrid">https://npmjs.org/</a>. Once installed, the module can be accessed by requiring the <code>node-cubrid</code> module:</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">var&nbsp;CUBRID&nbsp;=&nbsp;require('node-cubrid');</div>
<p>&nbsp;</p>
<p>The <strong>node-cubrid</strong> module exports the following properties and functions:</p>
<ul>
<li><code>Helpers</code>: an object which provides a set of <em>helper</em> functions.</li>
<li><code>Result2Array</code>: an object which provides functions to convert DB result sets into JS arrays.</li>
<li><code>createDefaultCUBRIDDemodbConnection()</code>: a function which returns a connection object to work with a local <a href="/wiki_tutorials/entry/getting-started-with-demodb-cubrid-demo-database">demodb</a> database.</li>
<li><code>createCUBRIDConnection()</code>: a function which returns a connection object to work with a user defined CUBRID host and database.</li>
</ul>
<h3 id="requestflowinnode-cubrid">Request flow in node-cubrid</h3>
<p>The request flow in <strong>node-cubrid</strong> module looks as illustrated below.</p>
<p>&nbsp;</p>
<p><figure> <img editor_component="image_link" src="/files/attach/images/194379/839/471/cubrid_nodejs_events_chain.png" /></figure></p>
<p>&nbsp;</p>
<p>Because <strong>node-cubrid</strong> is developed to take full advantage of JavaScript and Node.js programming, when executing a SQL statement in <strong>node-cubrid</strong>, developers need to either listen for the <code>EVENT_QUERY_DATA_AVAILABLE</code> and <code>EVENT_ERROR</code> events, or provide a callback function which will be called once there is a response from the server.</p>
<p>When the request is sent to the server, CUBRID executes it, and returns the response, which can be either a query result set, or the error code. It is by design that CUBRID does not return any identification about the request sender. In other words, in order to associate the response with a request, the driver has to have only one active request which can be the only owner of this response.</p>
<p>For this reason, if a developer wants to execute several queries, they must execute them one after another, i.e. sequentially, <strong>NOT</strong> in parallel. This is how the communication between the driver and the server is implemented in CUBRID and many other database systems.</p>
<p>If there is a vital need to run queries in parallel, developers can use connection pooling modules. We will explain this technique in the examples below.</p>
<h3 id="usingnode-cubrid">Using node-cubrid</h3>
<h4 id="establishingaconnection">Establishing a connection</h4>
<p>First, the user <strong>establishes a connection</strong> with a CUBRID server by providing a host name (default: &lsquo;localhost&rsquo;), the broker port (default: 33000), database username (default: &lsquo;public&rsquo;), password (default: empty string), and finally the database name (default: &lsquo;demodb&rsquo;).</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">conn.connect(function&nbsp;(err)&nbsp;{<br />&nbsp; &nbsp; if&nbsp;(err)&nbsp;{<br />&nbsp; &nbsp; &nbsp; &nbsp; throw&nbsp;err.message;<br />&nbsp; &nbsp; }<br />&nbsp; &nbsp; else{<br />&nbsp; &nbsp; &nbsp; &nbsp; console.log('connection&nbsp;is&nbsp;established');<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;	<br />&nbsp; &nbsp; &nbsp; &nbsp; conn.close(function&nbsp;()&nbsp;{<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; console.log('connection&nbsp;is&nbsp;closed');<br />&nbsp; &nbsp; &nbsp; &nbsp; });<br />&nbsp; &nbsp; }<br />});</div>
<p>&nbsp;</p>
<p>The above code illustrates the <em>callback style</em>, in which a function passed as an argument to the <code>connect()</code> API is called once the connection attempt has completed, with an error if it failed. Alternatively, developers can write applications in an <em>event-based coding style</em>. For example, the above code can be rewritten as:</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">conn.connect();<br /> <br /> conn.on(conn.EVENT_ERROR,&nbsp;function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;throw&nbsp;err.message;<br /> });<br /> <br /> conn.on(conn.EVENT_CONNECTED,&nbsp;function&nbsp;()&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;connection&nbsp;is&nbsp;established<br /> &nbsp;&nbsp;&nbsp;&nbsp;conn.close();<br /> });<br /> <br /> conn.on(conn.EVENT_CONNECTION_CLOSED,&nbsp;function&nbsp;()&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;connection&nbsp;is&nbsp;closed<br /> });</div>
<p>&nbsp;</p>
<p>If you prefer the event-based coding style, refer to the <a href="/wiki_apis/entry/cubrid-node-js-api-overview">Driver Event model</a> wiki page to learn more about other events <strong>node-cubrid</strong> emits for certain API calls.</p>
<h4 id="executingqueries">Executing queries</h4>
<p>Once connected, users can start executing SQL queries. There are several APIs you can use to execute queries in <strong>node-cubrid</strong>:</p>
<ol>
<li><code>query(sql, callback);</code></li>
<li><code>queryWithParams(sql, arrParamsValues, arrDelimiters, callback);</code></li>
<li><code>execute(sql, callback);</code></li>
<li><code>executeWithParams(sql, arrParamsValues, arrDelimiters, callback);</code></li>
<li><code>batchExecuteNoQuery(sqls, callback);</code></li>
</ol>
<p>Ultimately, all of the above APIs execute the given SQL queries. The difference is that <code>query*</code> APIs <strong>return</strong> data records while <code>*execute*</code> APIs <strong>do not return</strong> any record. So basically, you would use <code>query*</code> with <code>SELECT</code> queries and <code>*execute*</code> with <code>INSERT</code>/<code>UPDATE</code>/<code>DELETE</code> queries.</p>
<h4 id="executingquerieswithparameters">Executing queries with parameters</h4>
<p><code>queryWithParams()</code> and <code>executeWithParams()</code> APIs allow developers to <strong>bind values</strong> to parameterized SQL queries. Note that &ldquo;binding&rdquo; in <strong>node-cubrid</strong> does not involve any communication with the server; the module merely replaces all <code>?</code> placeholders with the given <code>arrParamsValues</code> values, each wrapped with the corresponding <code>arrDelimiters</code> delimiter. Thus, you can bind values as follows:</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">var&nbsp;code&nbsp;=&nbsp;15214,<br /> &nbsp;&nbsp;&nbsp;&nbsp;sql&nbsp;=&nbsp;'SELECT&nbsp;*&nbsp;FROM&nbsp;athlete&nbsp;WHERE&nbsp;code&nbsp;=&nbsp;?';<br /> <br /> conn.queryWithParams(sql,&nbsp;[code],&nbsp;[],&nbsp;function&nbsp;(err,&nbsp;result,&nbsp;queryHandle)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;check&nbsp;the&nbsp;error&nbsp;first&nbsp;then&nbsp;use&nbsp;the&nbsp;result<br /> });</div>
<p>&nbsp;</p>
<p>The same can be done with non-result SQL statements like:</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">var&nbsp;host_year&nbsp;=&nbsp;2008,<br /> &nbsp;&nbsp;&nbsp;&nbsp;host_nation&nbsp;=&nbsp;'China',<br /> &nbsp;&nbsp;&nbsp;&nbsp;host_city&nbsp;=&nbsp;'Beijing',<br /> &nbsp;&nbsp;&nbsp;&nbsp;opening_date&nbsp;=&nbsp;'08-08-2008',<br /> &nbsp;&nbsp;&nbsp;&nbsp;closing_date&nbsp;=&nbsp;'08-24-2008',<br /> &nbsp;&nbsp;&nbsp;&nbsp;sql&nbsp;=&nbsp;'INSERT&nbsp;INTO&nbsp;olympic&nbsp;(host_year,&nbsp;host_nation,&nbsp;host_city,&nbsp;opening_date,&nbsp;closing_date)&nbsp;VALUES&nbsp;(?,&nbsp;?,&nbsp;?,&nbsp;?,&nbsp;?)';<br /> <br /> conn.executeWithParams(sql,&nbsp;[host_year,&nbsp;host_nation,&nbsp;host_city,&nbsp;opening_date,&nbsp;closing_date],&nbsp;["",&nbsp;"'",&nbsp;"'",&nbsp;"'",&nbsp;"'"],&nbsp;function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;check&nbsp;the&nbsp;error&nbsp;first<br /> });</div>
<p>&nbsp;</p>
<p>If you need to insert multiple records at once in the form of <code>VALUES (...), (...), ...</code>, you can use <em>helper</em> functions to manually populate <code>?</code> placeholders with values as shown below.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">var&nbsp;sql&nbsp;=&nbsp;'INSERT&nbsp;INTO&nbsp;olympic&nbsp;(host_year,&nbsp;host_nation,&nbsp;host_city,&nbsp;opening_date,&nbsp;closing_date)&nbsp;VALUES&nbsp;',<br /> &nbsp;&nbsp;partialSQL&nbsp;=&nbsp;'(?,&nbsp;?,&nbsp;?,&nbsp;?,&nbsp;?)',<br /> &nbsp;&nbsp;data&nbsp;=&nbsp;[{...},&nbsp;{...},&nbsp;{...}],<br /> &nbsp;&nbsp;values&nbsp;=&nbsp;[];<br /> <br /> data.forEach(function&nbsp;(r)&nbsp;{<br /> &nbsp;&nbsp;var&nbsp;valuesSQL&nbsp;=&nbsp;CUBRID.Helpers._sqlFormat(<br /> &nbsp;&nbsp;&nbsp;&nbsp;partialSQL,<br /> &nbsp;&nbsp;&nbsp;&nbsp;[r.host_year,&nbsp;r.host_nation,&nbsp;r.host_city,&nbsp;r.opening_date,&nbsp;r.closing_date],<br /> &nbsp;&nbsp;&nbsp;&nbsp;["",&nbsp;"'",&nbsp;"'",&nbsp;"'",&nbsp;"'"]<br /> &nbsp;&nbsp;);<br /> &nbsp;&nbsp;<br /> &nbsp;&nbsp;values.push(valuesSQL);<br /> });<br /> <br /> sql&nbsp;+=&nbsp;values.join(',');<br /> <br /> conn.execute(sql,&nbsp;function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;check&nbsp;the&nbsp;error&nbsp;first<br /> });</div>
<p>&nbsp;</p>
<h4 id="fetchingmoredata">Fetching more data</h4>
<p>Sometimes, when querying a database, the result set is so large that it has to be retrieved in multiple steps. Below you can see how to keep fetching more data until all of it is retrieved.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">var&nbsp;sql&nbsp;=&nbsp;'SELECT&nbsp;*&nbsp;FROM&nbsp;participant';<br /> <br /> conn.query(sql,&nbsp;function&nbsp;(err,&nbsp;result,&nbsp;queryHandle)&nbsp;{<br /> &nbsp;&nbsp;//&nbsp;assuming&nbsp;no&nbsp;error&nbsp;is&nbsp;returned<br /> //&nbsp;the&nbsp;following&nbsp;outputs&nbsp;916<br /> &nbsp;&nbsp;&nbsp;&nbsp;console.log(CUBRID.Result2Array.TotalRowsCount(result));<br /> &nbsp;&nbsp;&nbsp;&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;outputResults&nbsp;(err,&nbsp;result,&nbsp;queryHandle)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(result)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;309&nbsp;records&nbsp;are&nbsp;in&nbsp;the&nbsp;first&nbsp;results&nbsp;set<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;315&nbsp;records&nbsp;are&nbsp;in&nbsp;the&nbsp;second&nbsp;results&nbsp;set<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;292&nbsp;records&nbsp;are&nbsp;in&nbsp;the&nbsp;third&nbsp;results&nbsp;set<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log(CUBRID.Result2Array.RowsArray(result).length);<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;try&nbsp;to&nbsp;fetch&nbsp;more&nbsp;data<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.fetch(queryHandle,&nbsp;outputResults);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;no&nbsp;more&nbsp;result,&nbsp;close&nbsp;this&nbsp;query&nbsp;handle<br /> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.closeQuery(queryHandle,&nbsp;function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.close(function&nbsp;()&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('connection&nbsp;closed');<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;});<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;});<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;outputResults(err,&nbsp;result,&nbsp;queryHandle);<br /> });</div>
<p>&nbsp;</p>
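<p>The same fetch loop can be wrapped in a small function that collects every batch into one array. The sketch below runs without a database: <code>createFakeConnection</code> and <code>fetchAll</code> are hypothetical names, the batch contents are made up, and the only thing taken from the example above is the callback shape <code>(err, result, queryHandle)</code> and the convention that an empty result means there is no more data.</p>

```javascript
// Stand-in for a connection: serves pre-built batches, then a null result.
// Everything here is hypothetical; only the callback shape follows the
// example above (err, result, queryHandle).
function createFakeConnection(batches) {
  var next = 0;
  function deliver(callback) {
    var batch = batches[next] || null; // null signals "no more data"
    next += 1;
    callback(null, batch, 1);
  }
  return {
    query: function (sql, callback) { deliver(callback); },
    fetch: function (queryHandle, callback) { deliver(callback); }
  };
}

// Collects every batch into one array, then reports it once via `done`.
function fetchAll(conn, sql, done) {
  var rows = [];
  function onBatch(err, result, queryHandle) {
    if (err) { return done(err); }
    if (!result) { return done(null, rows); }
    rows = rows.concat(result);
    conn.fetch(queryHandle, onBatch);
  }
  conn.query(sql, onBatch);
}

var fakeConn = createFakeConnection([[1, 2, 3], [4, 5], [6]]);
var collected = null;

fetchAll(fakeConn, 'SELECT * FROM participant', function (err, rows) {
  if (err) { throw err; }
  collected = rows;
  console.log('rows fetched: ' + rows.length);
});
```

<p>With the real driver you would convert each <code>result</code> with <code>CUBRID.Result2Array.RowsArray(result)</code> before concatenating, and close the query handle once the last batch has arrived, as the example above does.</p>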
<p>The above are the APIs developers will use most of the time.</p>
<h4 id="usingaconnectionpoolmanager">Using a connection pool manager</h4>
<p><strong>node-cubrid</strong> does not provide a connection pool manager. However, at some point developers may want to execute multiple queries at the same time. In such cases, you can use <a href="https://github.com/coopernurse/node-pool">generic-pool</a>, also known as <strong>node-pool</strong>, as a pool manager for CUBRID connections.</p>
<p>To install <strong>generic-pool</strong>, type the following in the terminal.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">npm&nbsp;install&nbsp;generic-pool</div>
<p>&nbsp;</p>
<p>The following example shows how to configure <strong>generic-pool</strong> to create and destroy CUBRID connections.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">var&nbsp;poolModule&nbsp;=&nbsp;require('generic-pool');<br /> var&nbsp;pool&nbsp;=&nbsp;poolModule.Pool({<br /> &nbsp;&nbsp;&nbsp;&nbsp;name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;'CUBRID',<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;you&nbsp;can&nbsp;limit&nbsp;this&nbsp;pool&nbsp;to&nbsp;create&nbsp;at&nbsp;most&nbsp;10&nbsp;connections<br /> &nbsp;&nbsp;&nbsp;&nbsp;max&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;:&nbsp;10,<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;destroy&nbsp;the&nbsp;connection&nbsp;if&nbsp;it's&nbsp;idle&nbsp;for&nbsp;30&nbsp;seconds<br /> &nbsp;&nbsp;&nbsp;&nbsp;idleTimeoutMillis&nbsp;:&nbsp;30000,<br /> &nbsp;&nbsp;&nbsp;&nbsp;log&nbsp;:&nbsp;true&nbsp;,<br /> &nbsp;&nbsp;&nbsp;&nbsp;create&nbsp;&nbsp;&nbsp;:&nbsp;function(callback)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;conn&nbsp;=&nbsp;CUBRID.createCUBRIDConnection('localhost',&nbsp;33000,&nbsp;'dba',&nbsp;'password',&nbsp;'demodb');<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.connect(function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;callback(err,&nbsp;conn);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;});<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> &nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;close&nbsp;the&nbsp;connection&nbsp;that&nbsp;the&nbsp;pool&nbsp;hands&nbsp;back<br /> &nbsp;&nbsp;&nbsp;&nbsp;destroy&nbsp;&nbsp;:&nbsp;function(conn)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.close();<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> });</div>
<p>&nbsp;</p>
<p>Then, the connection pool manager can be used in your application as follows.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">pool.acquire(function(err,&nbsp;conn)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;handle&nbsp;error&nbsp;-&nbsp;this&nbsp;is&nbsp;generally&nbsp;the&nbsp;err&nbsp;from&nbsp;your<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;factory.create&nbsp;function&nbsp;&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;else&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.query("select&nbsp;*&nbsp;from&nbsp;foo",&nbsp;function()&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;once&nbsp;done&nbsp;querying,&nbsp;return&nbsp;the&nbsp;object&nbsp;back&nbsp;to&nbsp;pool<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pool.release(conn);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;});<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> });</div>
<p>&nbsp;</p>
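<p>One common pitfall with a pool is forgetting to release a connection when a query fails. A small wrapper can take care of this. The sketch below is only an illustration: <code>withConnection</code> is a hypothetical helper, and <code>fakePool</code> is a stand-in that mimics the <code>acquire</code>/<code>release</code> calls used above so that the code runs without <strong>generic-pool</strong> or a database.</p>

```javascript
// Hypothetical wrapper: acquires a connection, runs `work`, and always
// releases the connection before reporting the outcome.
function withConnection(pool, work, done) {
  pool.acquire(function (err, conn) {
    if (err) { return done(err); }
    work(conn, function (workErr, value) {
      pool.release(conn); // released on success and on failure alike
      done(workErr, value);
    });
  });
}

// Tiny stand-in pool so the sketch runs without generic-pool or a database.
var fakePool = {
  released: 0,
  acquire: function (callback) {
    callback(null, {
      query: function (sql, cb) { cb(null, 'result of ' + sql); }
    });
  },
  release: function () { this.released += 1; }
};

var outcome = null;

withConnection(fakePool, function (conn, cb) {
  conn.query('select * from foo', cb);
}, function (err, value) {
  outcome = { err: err, value: value };
  console.log(value);
});
```

<p>In an application you would pass the real <code>pool</code> object created earlier instead of <code>fakePool</code>.</p>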
<h4 id="usingnode-cubridwithasyncmodule">Using node-cubrid with async module</h4>
<p>The <strong>node-cubrid</strong> module includes an <strong>ActionQueue</strong> <em>helper</em> module that offers the <strong>waterfall</strong> functionality of the <a href="https://github.com/caolan/async">async</a> module. You can use <strong>ActionQueue</strong> as follows:</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">CUBRID.ActionQueue.enqueue([<br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.connect(cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.getEngineVersion(cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(engineVersion,&nbsp;cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Engine&nbsp;version&nbsp;is:&nbsp;'&nbsp;+&nbsp;engineVersion);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.query('select&nbsp;*&nbsp;from&nbsp;code',&nbsp;cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(result,&nbsp;queryHandle,&nbsp;cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Query&nbsp;result&nbsp;rows&nbsp;count:&nbsp;'&nbsp;+&nbsp;Result2Array.TotalRowsCount(result));<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Query&nbsp;results:');<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;arr&nbsp;=&nbsp;Result2Array.RowsArray(result);<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for&nbsp;(var&nbsp;k&nbsp;=&nbsp;0;&nbsp;k&nbsp;&lt;&nbsp;arr.length;&nbsp;k++)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log(arr[k].toString());<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.closeQuery(queryHandle,&nbsp;cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Query&nbsp;closed.');<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(cb)&nbsp;{<br /> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.close(cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Connection&nbsp;closed.');<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;],<br /> <br /> &nbsp;&nbsp;function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(err&nbsp;==&nbsp;null)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Program&nbsp;closed.');<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;throw&nbsp;err.message;<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;}<br /> );</div>
<p>&nbsp;</p>
<p>The above is equivalent to <strong>async</strong>&rsquo;s <strong>waterfall</strong> function shown below.</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="JScript" editor_component="code_highlighter">async.waterfall([<br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.connect(cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.getEngineVersion(cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(engineVersion,&nbsp;cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Engine&nbsp;version&nbsp;is:&nbsp;'&nbsp;+&nbsp;engineVersion);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.query('select&nbsp;*&nbsp;from&nbsp;code',&nbsp;cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(result,&nbsp;queryHandle,&nbsp;cb)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Query&nbsp;result&nbsp;rows&nbsp;count:&nbsp;'&nbsp;+&nbsp;Result2Array.TotalRowsCount(result));<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Query&nbsp;results:');<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;var&nbsp;arr&nbsp;=&nbsp;Result2Array.RowsArray(result);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;for&nbsp;(var&nbsp;k&nbsp;=&nbsp;0;&nbsp;k&nbsp;&lt;&nbsp;arr.length;&nbsp;k++)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log(arr[k].toString());<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.closeQuery(queryHandle,&nbsp;cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Query&nbsp;closed.');<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> <br /> &nbsp;&nbsp;&nbsp;&nbsp;function&nbsp;(cb)&nbsp;{<br /> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;conn.close(cb);<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Connection&nbsp;closed.');<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;],<br /> <br /> &nbsp;&nbsp;function&nbsp;(err)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;if&nbsp;(err&nbsp;==&nbsp;null)&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;console.log('Program&nbsp;closed.');<br /> &nbsp;&nbsp;&nbsp;&nbsp;}&nbsp;else&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;throw&nbsp;err.message;<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;}<br /> );</div>
<p>&nbsp;</p>
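<p>If the waterfall pattern is new to you, the following simplified sketch shows the core idea: each task receives the results of the previous one plus a callback, and the first error skips straight to the final function. This is an illustration written for this article, not the code of <strong>ActionQueue</strong> or <strong>async</strong>. The task bodies are placeholders that do not touch a database.</p>

```javascript
// Minimal waterfall: runs tasks in order, passing each task's results
// (everything after the error argument) to the next task.
function waterfall(tasks, done) {
  var index = 0;
  function next(err) {
    if (err || index === tasks.length) {
      return done.apply(null, arguments);
    }
    var args = Array.prototype.slice.call(arguments, 1);
    var task = tasks[index];
    index += 1;
    task.apply(null, args.concat(next));
  }
  next(null);
}

var steps = [];

waterfall([
  function (cb) { steps.push('connect'); cb(null); },
  function (cb) { steps.push('version'); cb(null, '8.4.1'); },
  function (version, cb) {
    steps.push('query on ' + version);
    cb(null, ['row1', 'row2'], 7);
  },
  function (rows, handle, cb) {
    steps.push(rows.length + ' rows, handle ' + handle);
    cb(null);
  }
], function (err) {
  steps.push(err ? 'failed' : 'done');
  console.log(steps.join(', '));
});
```

<p>Note how the number of arguments each task receives depends on what the previous task passed to its callback, which is why the query step above takes <code>engineVersion</code> and the result step takes <code>result</code> and <code>queryHandle</code>.</p>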
<h3 id="roadmap">Roadmap</h3>
<p>At the time of writing this article, <strong>node-cubrid</strong> 1.0.1 stable was the latest release. In future versions we plan to improve <strong>node-cubrid</strong> considerably to make it more convenient for developers. For example, developers will be able to bind values with a single object parameter: its properties and their values will serve as column names and values in the SQL statement. This is convenient and also makes the code more readable.</p>
<p>We will also add new APIs to retrieve the values of server configuration parameters. As of version 1.0.1, <strong>node-cubrid</strong> does not return the number of affected rows after executing write queries; this will also be implemented. In addition, there will be APIs to obtain table schema information, which will be very beneficial for ORM developers.</p>
<p>Besides these, the upcoming version of <strong>node-cubrid</strong> will allow you to connect to a CUBRID Server using a connection URL, the same API we already provide in all other drivers. This will make it possible to pass a list of alternative hosts for broker-level failover, specify the query timeout duration, and so on.</p>
<p>We plan to add a lot of new functionality to <strong>node-cubrid</strong>. If you have a specific request, please create an issue in the <a href="http://jira.cubrid.org">CUBRID JIRA</a> issue tracker, or let us know via <a href="http://webchat.freenode.net/?channels=cubrid">IRC</a>, <a href="http://twitter.com/cubrid">Twitter</a>, or <a href="http://www.facebook.com/cubrid">Facebook</a>. We will be glad to review your request. If you have specific questions about CUBRID or the <strong>node-cubrid</strong> module, you can ask on our <a href="/questions">Q&amp;A site</a>.</p>]]></description>
                        <pubDate>Mon, 10 Dec 2012 15:25:56 +0900</pubDate>
                        <category>Node.js</category>
                        <category>JavaScript</category>
                        <category>Drivers</category>
                        <category>Web development</category>
                        <category>programming</category>
                        <category>APIs</category>
                                </item>
        										        <item>
            <title>Things to Understand When Moving from MySQL to CUBRID</title>
            <dc:creator>Donghyun Lee</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-comparison/things-to-understand-when-moving-from-mysql-to-cubrid/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-comparison/things-to-understand-when-moving-from-mysql-to-cubrid/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-comparison/things-to-understand-when-moving-from-mysql-to-cubrid/#comment</comments>
                                    <description><![CDATA[<p style="text-align: center;"><img editor_component="image_link" height="200" width="300" alt="cubrid-vs-mysql.png" src="/files/attach/images/220547/167/399/cubrid-vs-mysql.png" /></p>
<p>These days at <a target="_self" href="/blog/tags/NHN/">NHN</a>&nbsp;we use&nbsp;CUBRID for our services more often than MySQL.&nbsp;This article summarizes the differences between MySQL and CUBRID Database, the knowledge that several departments at NHN have obtained as they have&nbsp;changed their database from MySQL to CUBRID.&nbsp;This document is based on MySQL 5.5 and CUBRID 8.4.1.</p>
<p>The differences can be classified into three types:</p>
<ol>
<li>Column Types</li>
<li>SQL Syntax</li>
<li>Provided Functions</li>
</ol>
<h2>Differences in Column Types</h2>
<h3>Case-sensitiveness of Character Types</h3>
<p>By default, MySQL does not distinguish between uppercase and lowercase in character type values when a query is executed. To make a character type case-sensitive in MySQL, you have to add the <code>BINARY</code> keyword when creating the table or in the query statement. CUBRID, in contrast, is case-sensitive for character type values by default.</p>
<p>The following example shows how to indicate that a target column is&nbsp;<code>BINARY</code> when creating a table in MySQL.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;tbl&nbsp;(name&nbsp;CHAR(10)&nbsp;BINARY);<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES('Charles'),('blues');<br /> SELECT&nbsp;*&nbsp;FROM&nbsp;tbl&nbsp;WHERE&nbsp;name='CHARLES';<br /> &nbsp;<br /> Empty&nbsp;set&nbsp;(0.00&nbsp;sec)</div>
<p>This example shows how to indicate that a target column is <code>BINARY</code> when executing a <code>SELECT</code> statement in MySQL.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;*&nbsp;FROM&nbsp;tbl&nbsp;WHERE&nbsp;BINARY&nbsp;name='Charles';<br /> &nbsp;<br /> +---------+<br /> |&nbsp;name&nbsp;|<br /> +---------+<br /> |&nbsp;Charles&nbsp;|<br /> +---------+</div>
<p>To make CUBRID case-<b><i>insensitive</i></b>, just like MySQL, apply the <code>UPPER()</code> function or the <code>LOWER()</code> function to the target column as shown in the following example.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;*&nbsp;FROM&nbsp;tbl&nbsp;ORDER&nbsp;BY&nbsp;name;<br /> &nbsp;<br /> name<br /> ======================<br /> 'Charles &nbsp; '<br /> 'blues &nbsp; &nbsp; '<br /> &nbsp;<br /> &nbsp;<br /> SELECT&nbsp;*&nbsp;FROM&nbsp;tbl&nbsp;ORDER&nbsp;BY&nbsp;UPPER(name);<br /> &nbsp;<br /> name<br /> ======================<br /> 'blues &nbsp; &nbsp; '<br /> 'Charles &nbsp; '</div>
<p>As shown in the example above, when the <code>UPPER()</code> function is not applied, the rows are sorted case-sensitively, so the uppercase <code>'Charles'</code> comes before the lowercase <code>'blues'</code>. When we apply the <code>UPPER()</code> function to the <code>ORDER BY</code> column, we obtain case-insensitive ordering. In this case, however, even if an index is defined on the <code>name</code> column, <code>ORDER BY</code> cannot use it because of the <code>UPPER()</code> function, so no optimization can be applied. Without the <code>UPPER()</code> function, <code>ORDER BY</code> would fetch the data in the order of the defined index. To optimize <code>ORDER BY</code>, consider creating a separate column that is entirely upper-cased or lower-cased and defining an index on that column.</p>
<p>The following is an example of separately adding the sorting column <code>name2</code> which is not case-sensitive.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">ALTER&nbsp;TABLE&nbsp;tbl&nbsp;ADD&nbsp;COLUMN&nbsp;(name2&nbsp;CHAR(10));<br /> UPDATE&nbsp;tbl&nbsp;SET&nbsp;name2=UPPER(name);<br /> CREATE&nbsp;INDEX&nbsp;i_tbl_name2&nbsp;ON&nbsp;tbl(name2);<br /> SELECT&nbsp;*&nbsp;FROM&nbsp;tbl&nbsp;ORDER&nbsp;BY&nbsp;name2;<br /> &nbsp;<br /> name&nbsp;name2<br /> ============================================<br /> 'blues &nbsp; &nbsp; '&nbsp;'BLUES &nbsp; &nbsp; '<br /> 'Charles &nbsp; '&nbsp;'CHARLES &nbsp; '</div>
<p>This coming fall we will release a new version of CUBRID under the code name "Apricot" which will introduce <a target="_self" href="http://sourceforge.net/apps/trac/cubrid/wiki/IndexEnhancement/FunctionBasedIndex">function based indexes</a>. Then, you will no longer need to create a separate column and create an index on it. You will be able to do the following:</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;INDEX&nbsp;idx_tbl_name_upper&nbsp;ON&nbsp;tbl&nbsp;(UPPER(name));<br /> SELECT&nbsp;*&nbsp;FROM&nbsp;tbl&nbsp;WHERE&nbsp;UPPER(name)&nbsp;=&nbsp;'CHARLES';</div>
<h3>Automatic Type Conversion for Date Type</h3>
<p>MySQL is very flexible in converting the type. It accepts character string input in the numeric type and vice versa (number input in the character string type). It also accepts numeric input in the date type.</p>
<p>From version 8.4.0, CUBRID supports flexible type conversion, allowing for character string input in the numeric type and number input in the character string type. However, unlike MySQL, CUBRID does not accept number input in the date type.</p>
<p>The following is an example of inputting numbers in the date type <code>dt</code>&nbsp;column in MySQL.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;dt_tbl(dt&nbsp;DATE);<br /> mysql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;dt_tbl&nbsp;VALUES&nbsp;(20120515);<br /> mysql&gt;&nbsp;SELECT&nbsp;*&nbsp;FROM&nbsp;dt_tbl;<br /> &nbsp;<br /> +------------+<br /> |&nbsp;dt&nbsp;|<br /> +------------+<br /> |&nbsp;2012-05-15&nbsp;|<br /> +------------+<br /> 1&nbsp;row&nbsp;in&nbsp;set&nbsp;(0.00&nbsp;sec)</div>
<p>The following is an example of inputting numbers in the date type <code>dt</code> column in CUBRID. You can see that an error is returned as a result value when numbers are input in the date type.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;dt_tbl(dt&nbsp;DATE);<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;dt_tbl&nbsp;VALUES&nbsp;(20120515);<br /> &nbsp;<br /> ERROR:&nbsp;before&nbsp;'&nbsp;);&nbsp;'<br /> Cannot&nbsp;coerce&nbsp;20120515&nbsp;to&nbsp;type&nbsp;date.<br /> &nbsp;<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;dt_tbl&nbsp;VALUES&nbsp;('20120515');<br /> csql&gt;&nbsp;SELECT&nbsp;*&nbsp;FROM&nbsp;dt_tbl;<br /> &nbsp;<br /> dt<br /> ============<br /> 05/15/2012</div>
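<p>If your application code used to send dates to MySQL as numbers, one way to adapt it is to convert them to strings before they reach CUBRID. The helper below is a hypothetical sketch (the name <code>numericDateToString</code> is an assumption, not part of any CUBRID driver). It relies only on the behavior shown above, namely that CUBRID accepts <code>'20120515'</code> as a character string.</p>

```javascript
// Hypothetical helper for application code that used to send dates to MySQL
// as numbers such as 20120515. It returns the same digits as a string, so the
// value can be passed to CUBRID as a character string instead of a number.
function numericDateToString(value) {
  var text = String(value);
  if (!/^\d{8}$/.test(text)) {
    throw new Error('expected a date in YYYYMMDD form, got: ' + text);
  }
  var year = Number(text.slice(0, 4));
  var month = Number(text.slice(4, 6));
  var day = Number(text.slice(6, 8));
  // Date.UTC normalizes out-of-range parts, so a round trip detects them.
  var check = new Date(Date.UTC(year, month - 1, day));
  if (check.getUTCFullYear() !== year ||
      check.getUTCMonth() !== month - 1 ||
      check.getUTCDate() !== day) {
    throw new Error('not a valid calendar date: ' + text);
  }
  return text;
}

console.log(numericDateToString(20120515)); // 20120515
```

<p>Validating the value in the application gives a clearer error message than waiting for the database to reject it.</p>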
<p>When an error occurs as a result of executing the date function, MySQL returns <code>NULL</code> and CUBRID returns an error by default. To make CUBRID return <code>NULL</code> for such cases, set the value of <code>return_null_on_function_errors</code> system parameter to <code>yes</code>.</p>
<p>The following example shows that <code>NULL</code> is returned when an invalid parameter has been entered in the date function of MySQL.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;SELECT&nbsp;YEAR('12:34:56');<br /> &nbsp;<br /> +------------------+<br /> |&nbsp;YEAR('12:34:56')&nbsp;|<br /> +------------------+<br /> |&nbsp;NULL&nbsp;|<br /> +------------------+<br /> 1&nbsp;row&nbsp;in&nbsp;set,&nbsp;1&nbsp;warning&nbsp;(0.00&nbsp;sec)</div>
<p>The following example shows that CUBRID returns an error when an invalid parameter is passed to a date function while the system parameter <code>return_null_on_function_errors</code> is set to <code>no</code>, which is the default value.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;YEAR('12:34:56');<br /> &nbsp;<br /> ERROR:&nbsp;Conversion&nbsp;error&nbsp;in&nbsp;date&nbsp;format.</div>
<p>The following example shows that CUBRID returns <code>NULL</code> for the same invalid parameter once the system parameter <code>return_null_on_function_errors</code> has been changed to <code>yes</code>.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;YEAR('12:34:56');<br /> &nbsp;<br /> year('12:34:56')<br /> ======================<br /> NULL</div>
<h3>Result Value Type of Integer-by-Integer Division</h3>
<p>When integer-by-integer division is performed, MySQL prints the output value as <code>DECIMAL (m, n)</code>, but CUBRID prints it as a rounded <code>INTEGER</code>. This is because in CUBRID, when both operands are of the same type, the result is printed as that same type. To display the result as a real number in this case, apply the <code>CAST()</code> function to one or both operands of the division.</p>
<p>The following shows an example of executing integer-by-integer division in MySQL. The result value will be printed as a real number type.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;SELECT&nbsp;4/3;<br /> &nbsp;<br /> +--------+<br /> |&nbsp;4/3&nbsp;|<br /> +--------+<br /> |&nbsp;1.3333&nbsp;|<br /> +--------+<br /> &nbsp;<br /> mysql&gt;&nbsp;SELECT&nbsp;4/2;<br /> &nbsp;<br /> +--------+<br /> |&nbsp;4/2&nbsp;|<br /> +--------+<br /> |&nbsp;2.0000&nbsp;|<br /> +--------+</div>
<p>The following shows an example of executing integer-by-integer division in CUBRID. The result value will be printed as an <code>INTEGER</code> type.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;4/3;<br /> &nbsp;<br /> 4/3<br /> =============<br /> 1<br /> &nbsp;<br /> csql&gt;&nbsp;SELECT&nbsp;4/2;<br /> &nbsp;<br /> 4/2<br /> =============<br /> 2</div>
<p>The following shows an example of executing integer-by-integer division by using the <code>CAST()</code> function in CUBRID. The result value will be printed as a real number type.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;CAST(4&nbsp;AS&nbsp;DECIMAL(5,4))/CAST(3&nbsp;AS&nbsp;DECIMAL(5,4));<br /> &nbsp;<br /> cast(4&nbsp;as&nbsp;numeric(5,4))/&nbsp;cast(3&nbsp;as&nbsp;numeric(5,4))<br /> ======================<br /> 1.333333333</div>
<p>The following shows an example of executing integer-by-real number division in CUBRID. Since one of the input values is a real number type, the result value will be printed as a real number (<code>DOUBLE</code>) type.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;4/3.0;<br /> &nbsp;<br /> 4/3.0<br /> ======================<br /> 1.333333333</div>
<h3>Processing a SUM Result Larger than the Maximum Value of the Input Value Type</h3>
<p>How will the result be printed if the result of <code>SUM</code> is larger than the maximum value of the input value type?</p>
<p>MySQL converts the result of <code>SUM</code> to a pre-defined large <code>DECIMAL</code> number type. However, CUBRID processes the result as an <b>overflow error</b>. It means that&nbsp;<b>in CUBRID&nbsp;</b><b>the type of the input column decides the result type</b>. Therefore, to avoid overflow errors in CUBRID, you should convert (<code>CAST()</code>) the input column type to a type that can accept the <code>SUM</code> result value before executing operations.</p>
<p>Converting the type with the <code>CAST()</code> function incurs some additional cost, so I recommend choosing the column type with the result values of the functions you plan to use in mind.</p>
<p>First, create the same table in MySQL and CUBRID as follows.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;t&nbsp;(code&nbsp;SMALLINT);<br /> INSERT&nbsp;INTO&nbsp;t&nbsp;VALUES(32767);<br /> INSERT&nbsp;INTO&nbsp;t&nbsp;VALUES&nbsp;(32767);</div>
<p>MySQL successfully prints the value because the result of <code>SUM</code> fits within its result type. In CUBRID, however, the result type is decided by the input type, so an overflow error occurs because the result of <code>SUM</code> exceeds the range of that type.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;SELECT&nbsp;SUM(code)&nbsp;FROM&nbsp;t;<br /> &nbsp;<br /> +-----------+<br /> |&nbsp;sum(code)&nbsp;|<br /> +-----------+<br /> |&nbsp;65534&nbsp;|<br /> +-----------+<br /> &nbsp;<br /> mysql&gt;&nbsp;SHOW&nbsp;COLUMNS&nbsp;FROM&nbsp;ttt;<br /> &nbsp;<br /> +-----------+---------------+------+-----+---------+-------+<br /> |&nbsp;Field&nbsp;|&nbsp;Type&nbsp;|&nbsp;Null&nbsp;|&nbsp;Key&nbsp;|&nbsp;Default&nbsp;|&nbsp;Extra&nbsp;|<br /> +-----------+---------------+------+-----+---------+-------+<br /> |&nbsp;sum(code)&nbsp;|&nbsp;decimal(27,0)&nbsp;|&nbsp;YES&nbsp;|&nbsp;|&nbsp;NULL&nbsp;|&nbsp;|<br /> +-----------+---------------+------+-----+---------+-------+<br /> &nbsp;<br /> csql&gt;&nbsp;SELECT&nbsp;SUM(code)&nbsp;FROM&nbsp;t;<br /> &nbsp;<br /> ERROR:&nbsp;Overflow&nbsp;occurred&nbsp;in&nbsp;addition&nbsp;context.</div>
<p>The following is an example of converting the column type and then executing <code>SUM</code> in CUBRID. You can see that the value has been successfully printed.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;SUM(CAST&nbsp;(CODE&nbsp;AS&nbsp;INT))&nbsp;FROM&nbsp;t;<br /> &nbsp;<br /> sum(&nbsp;cast(code&nbsp;as&nbsp;integer))<br /> ======================<br /> 65534</div>
<p>The following example chooses a column type large enough for the <code>SUM</code> result when the table is created in CUBRID, and then executes <code>SUM</code>. You can see that the value has been successfully printed.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;t&nbsp;(code&nbsp;INT);<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;t&nbsp;VALUES(32767);<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;t&nbsp;VALUES&nbsp;(32767);<br /> csql&gt;&nbsp;SELECT&nbsp;SUM(code)&nbsp;FROM&nbsp;t;<br /> &nbsp;<br /> sum(code)<br /> ======================<br /> 65534</div>
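<p>To see why the result type matters, the overflow can be modeled outside either database. The following Python sketch is only an illustration and not CUBRID's implementation: the helper <code>checked_sum</code> and the two range constants are assumptions introduced here, based on the usual 16-bit and 32-bit signed ranges for <code>SMALLINT</code> and <code>INT</code>.</p>

```python
# Illustrative model only: accumulate a sum inside a fixed integer range,
# the way a result type derived from the input type would behave.
SMALLINT_RANGE = range(-32768, 32767 + 1)        # assumed 16-bit signed range
INT_RANGE = range(-2147483648, 2147483647 + 1)   # assumed 32-bit signed range


def checked_sum(values, allowed):
    """Sum values, failing as soon as the running total leaves `allowed`."""
    total = 0
    for value in values:
        total += value
        if total not in allowed:
            raise OverflowError("overflow occurred in addition")
    return total


codes = [32767, 32767]  # the two rows inserted into table t

try:
    checked_sum(codes, SMALLINT_RANGE)
    smallint_result = "ok"
except OverflowError:
    smallint_result = "overflow"

# Widening the type first, like CAST(code AS INT), leaves room for the total.
int_result = checked_sum(codes, INT_RANGE)
print(smallint_result, int_result)
```

<p>The same two values fail in the narrow range and succeed in the wide one, which mirrors the choice between casting at query time and declaring a wider column up front.</p>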
<h3>Result Value Type of VARCHAR Type</h3>
<p>Both MySQL and CUBRID allow numeric operations on a <code>VARCHAR</code> column when its values are strings consisting of digits. In this case, the result type of the operation is <code>DOUBLE</code> in both (storing values that need numeric operations as strings is not recommended, however; it is done here only for explanation).</p>
<p>The following example compares the query result type in MySQL to that in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;tbl(col&nbsp;VARCHAR(10));<br /> mysql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES('1'),('2'),('3'),('4'),('5');<br /> mysql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;ttbl&nbsp;AS&nbsp;SELECT&nbsp;SUM(col)&nbsp;FROM&nbsp;tbl;<br /> mysql&gt;&nbsp;SHOW&nbsp;COLUMNS&nbsp;FROM&nbsp;ttbl;<br /> &nbsp;<br /> +----------+--------+------+-----+---------+-------+<br /> |&nbsp;Field&nbsp;|&nbsp;Type&nbsp;|&nbsp;Null&nbsp;|&nbsp;Key&nbsp;|&nbsp;Default&nbsp;|&nbsp;Extra&nbsp;|<br /> +----------+--------+------+-----+---------+-------+<br /> |&nbsp;SUM(col)&nbsp;|&nbsp;double&nbsp;|&nbsp;YES&nbsp;|&nbsp;|&nbsp;NULL&nbsp;|&nbsp;|<br /> +----------+--------+------+-----+---------+-------+<br /> &nbsp;<br /> &nbsp;<br /> csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;tbl(col&nbsp;VARCHAR(10));<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES('1'),('2'),('3'),('4'),('5');<br /> csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;ttbl&nbsp;AS&nbsp;SELECT&nbsp;SUM(col)&nbsp;FROM&nbsp;tbl;<br /> csql&gt;&nbsp;;sc&nbsp;ttbl<br /> &nbsp;<br /> &lt;Class&nbsp;Name&gt;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;ttbl<br /> &lt;Attributes&gt;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;sum(col)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;DOUBLE</div>
<p>In MySQL, if a column value contains a non-numeric character, the value is treated as 0 in the operation. CUBRID, however, returns an error stating that it cannot convert the character string to the double type. See the following example.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES('a');<br /> mysql&gt;&nbsp;SELECT&nbsp;SUM(col)&nbsp;FROM&nbsp;tbl;<br /> &nbsp;<br /> +----------+<br /> |&nbsp;SUM(col)&nbsp;|<br /> +----------+<br /> |&nbsp;15&nbsp;|<br /> +----------+<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES('a');<br /> csql&gt;&nbsp;SELECT&nbsp;SUM(col)&nbsp;FROM&nbsp;tbl;<br /> &nbsp;<br /> ERROR:&nbsp;Cannot&nbsp;coerce&nbsp;value&nbsp;of&nbsp;domain&nbsp;"character&nbsp;varying"&nbsp;to&nbsp;domain&nbsp;"double".</div>
<h2>Differences in SQL Syntax</h2>
<h3>Supporting START WITH &hellip; CONNECT BY</h3>
<p>CUBRID supports the <code>START WITH &hellip; CONNECT BY</code> syntax, which expresses a hierarchy in a single query; MySQL does not support it. This syntax is part of the Oracle SQL compatibility syntax.</p>
<p>As an example, using the following data we will print "managers" and "juniors", sorting the rows within the same level by <code>id</code>. <code>id</code> is the employee ID of a staff member and <code>mgrid</code> is the employee ID of that person's manager.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;tree(id&nbsp;INT,&nbsp;mgrid&nbsp;INT,&nbsp;name&nbsp;VARCHAR(32),&nbsp;birthyear&nbsp;INT);<br /> &nbsp;<br /> &nbsp;<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(1,NULL,'Kim',&nbsp;1963);<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(2,NULL,'Moy',&nbsp;1958);<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(3,1,'Jonas',&nbsp;1976);<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(4,1,'Smith',&nbsp;1974);<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(5,2,'Verma',&nbsp;1973);<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(6,2,'Foster',&nbsp;1972);<br /> INSERT&nbsp;INTO&nbsp;tree&nbsp;VALUES&nbsp;(7,6,'Brown',&nbsp;1981);</div>
<p>MySQL does not support hierarchical queries. Therefore, to print the result described above, you should execute several query statements in the following order.</p>
<p>1)&nbsp;First, print the "level 1" employees whose <code>mgrid</code> is NULL.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;id,&nbsp;mgrid,&nbsp;name,&nbsp;1&nbsp;AS&nbsp;level&nbsp;FROM&nbsp;tree&nbsp;WHERE&nbsp;mgrid&nbsp;IS&nbsp;NULL;&nbsp;</div>
<p>2)&nbsp;Then print the "level 2" employees whose <code>mgrid</code> is 1.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;id,&nbsp;mgrid,&nbsp;name,&nbsp;2&nbsp;AS&nbsp;level&nbsp;FROM&nbsp;tree&nbsp;WHERE&nbsp;mgrid=1;</div>
<p>3)&nbsp;Then print the "level 2" employees whose <code>mgrid</code> is 2.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;id,&nbsp;mgrid,&nbsp;name,&nbsp;2&nbsp;AS&nbsp;level&nbsp;FROM&nbsp;tree&nbsp;WHERE&nbsp;mgrid=2;</div>
<p>4)&nbsp;Then print the "level 3" employee whose <code>mgrid</code> is 6.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;id,&nbsp;mgrid,&nbsp;name,&nbsp;3&nbsp;AS&nbsp;level&nbsp;FROM&nbsp;tree&nbsp;WHERE&nbsp;mgrid=6;</div>
<p>In contrast, because CUBRID supports hierarchical queries, the same result can be obtained with a single query statement as follows.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;id,&nbsp;mgrid,&nbsp;name,&nbsp;LEVEL<br /> FROM&nbsp;tree<br /> START&nbsp;WITH&nbsp;mgrid&nbsp;IS&nbsp;NULL<br /> CONNECT&nbsp;BY&nbsp;PRIOR&nbsp;id=mgrid<br /> ORDER&nbsp;SIBLINGS&nbsp;BY&nbsp;id;<br /> &nbsp;<br /> id&nbsp;mgrid&nbsp;name&nbsp;level<br /> ===============================================<br /> 1&nbsp;null&nbsp;Kim&nbsp;1<br /> 3&nbsp;1&nbsp;Jonas&nbsp;2<br /> 4&nbsp;1&nbsp;Smith&nbsp;2<br /> 2&nbsp;null&nbsp;Moy&nbsp;1<br /> 5&nbsp;2&nbsp;Verma&nbsp;2<br /> 6&nbsp;2&nbsp;Foster&nbsp;2<br /> 7&nbsp;6&nbsp;Brown&nbsp;3</div>
<p>The query above prints each parent node (manager) followed by its child nodes (junior staff), with the child nodes at the same level sorted by <code>id</code>.</p>
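<p>The traversal order that the hierarchical query produces can be sketched outside the database. The following Python model is only an illustration of the idea (a depth-first walk with siblings sorted by <code>id</code>), not a description of how CUBRID executes the query; the function name <code>walk</code> is an assumption introduced here.</p>

```python
# Illustrative model of START WITH mgrid IS NULL CONNECT BY PRIOR id = mgrid
# with ORDER SIBLINGS BY id, using the rows of the tree table above.
rows = [
    (1, None, "Kim"),
    (2, None, "Moy"),
    (3, 1, "Jonas"),
    (4, 1, "Smith"),
    (5, 2, "Verma"),
    (6, 2, "Foster"),
    (7, 6, "Brown"),
]


def walk(rows, parent_id=None, level=1):
    """Depth-first walk: emit a row, then its children, siblings sorted by id."""
    siblings = sorted((r for r in rows if r[1] == parent_id), key=lambda r: r[0])
    result = []
    for emp_id, mgrid, name in siblings:
        result.append((emp_id, mgrid, name, level))
        # Children of this employee sit one level deeper in the hierarchy.
        result.extend(walk(rows, emp_id, level + 1))
    return result


hierarchy = walk(rows)
for row in hierarchy:
    print(row)
```

<p>The rows come out in the same order and with the same level values as the CUBRID result shown above.</p>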
<h3>Including a Non-Aggregated Item in a SELECT List That Includes an Aggregate Function</h3>
<h4>ONLY FULL GROUP BY</h4>
<p>When executing the <code>GROUP BY</code> clause, both MySQL and CUBRID by default allow a non-aggregated ("disaggregate") column that is not included in the <code>GROUP BY</code> clause to appear in the <code>SELECT</code> list. However, such a column takes its value from whichever record of the group is fetched first, so the value may differ between MySQL and CUBRID depending on their fetch order. Because it is not clear which of the several values should be selected, enable the <code>ONLY FULL GROUP BY</code> function so that columns not included in the <code>GROUP BY</code> clause cannot appear in the <code>SELECT</code> list.</p>
<p>To enable the <code>ONLY FULL GROUP BY</code> function, for MySQL set the <code>sql_mode</code> value in the <code>my.cnf</code> configuration file to <code>ONLY_FULL_GROUP_BY</code>. For CUBRID, set the <code>only_full_group_by</code> value in the <code>cubrid.conf</code> configuration file to <code>yes</code>.</p>
<p>CUBRID supports the <code>only_full_group_by</code> system parameter from version 8.3.0. Before 8.3.0, the <code>ONLY FULL GROUP BY</code> function always ran as if it were enabled.</p>
<p>The following example shows the result of executing the <code>GROUP BY</code> clause in MySQL and CUBRID while the <code>ONLY FULL GROUP BY</code> function is enabled.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;sales_tbl<br /> (dept_no&nbsp;int,&nbsp;name&nbsp;VARCHAR(20)&nbsp;PRIMARY&nbsp;KEY,&nbsp;sales_month&nbsp;int,&nbsp;sales_amount&nbsp;int&nbsp;DEFAULT&nbsp;100);<br /> INSERT&nbsp;INTO&nbsp;sales_tbl&nbsp;VALUES<br /> (501,&nbsp;'Stephan',&nbsp;4,&nbsp;DEFAULT),<br /> (201,&nbsp;'George'&nbsp;,&nbsp;1,&nbsp;450),<br /> (201,&nbsp;'Laura'&nbsp;,&nbsp;2,&nbsp;500),<br /> (301,&nbsp;'Max'&nbsp;,&nbsp;4,&nbsp;300),<br /> (501,&nbsp;'Chang'&nbsp;,&nbsp;5,&nbsp;150),<br /> (501,&nbsp;'Sue'&nbsp;,&nbsp;6,&nbsp;150),<br /> (NULL,&nbsp;'Yoka'&nbsp;,4,&nbsp;NULL);<br /> &nbsp;<br /> SELECT&nbsp;dept_no,&nbsp;avg(sales_amount)&nbsp;FROM&nbsp;sales_tbl<br /> GROUP&nbsp;BY&nbsp;dept_no&nbsp;ORDER&nbsp;BY&nbsp;dept_no;<br /> &nbsp;<br /> dept_no&nbsp;avg(sales_amount)<br /> ================================<br /> NULL&nbsp;NULL<br /> 201&nbsp;475<br /> 301&nbsp;300<br /> 501&nbsp;133</div>
<h4>When There Is No GROUP BY Clause, Is It Possible to Query Even If a Non-Aggregated Item Exists in the SELECT List?</h4>
<p>When the <code>SELECT</code> list contains an item that uses an aggregate function, MySQL uses the first value fetched for any other item that has several values. In this case, however, CUBRID considers that it cannot decide the value and returns an error.</p>
<p>If a non-aggregated column appears together with an aggregate function in the <code>SELECT</code> list, an arbitrary one of the column's several values is selected. Therefore, it is recommended not to execute that kind of query.</p>
<p>Configure identical data in MySQL and CUBRID as shown below.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;tbl(a&nbsp;int,&nbsp;b&nbsp;date);<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES&nbsp;(1,'20000101');<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES&nbsp;(2,'20000102');<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES&nbsp;(3,'20000103');<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;VALUES&nbsp;(4,'20000104');</div>
<p>In this case, the following query can be executed in both MySQL and CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;COUNT(a),&nbsp;DATE_ADD(MAX(b),&nbsp;INTERVAL&nbsp;10&nbsp;DAY)&nbsp;FROM&nbsp;tbl;<br /> &nbsp;<br /> +----------+-----------------------------------+<br /> |&nbsp;COUNT(a)&nbsp;|&nbsp;DATE_ADD(MAX(b),&nbsp;INTERVAL&nbsp;10&nbsp;DAY)&nbsp;|<br /> +----------+-----------------------------------+<br /> |&nbsp;4&nbsp;|&nbsp;2000-01-14&nbsp;|<br /> +----------+-----------------------------------+</div>
<p>In MySQL, when there are several values for column <code>a</code>, the value from the record fetched first is used in the calculation. In this case, the value of column <code>a</code> is indeterminate, so this behavior is not appropriate.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;SELECT&nbsp;COUNT(a),&nbsp;DATE_ADD(MAX(b),&nbsp;INTERVAL&nbsp;a&nbsp;DAY)&nbsp;FROM&nbsp;tbl;<br /> &nbsp;<br /> +----------+----------------------------------+<br /> |&nbsp;COUNT(a)&nbsp;|&nbsp;DATE_ADD(MAX(b),&nbsp;INTERVAL&nbsp;a&nbsp;DAY)&nbsp;|<br /> +----------+----------------------------------+<br /> |&nbsp;4&nbsp;|&nbsp;2000-01-05&nbsp;|<br /> +----------+----------------------------------+</div>
<p>In CUBRID, when there are several values for column <code>a</code>, CUBRID considers that it cannot determine which record's column value should be used in the calculation and returns an error.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;COUNT(a),&nbsp;DATE_ADD(MAX(b),&nbsp;INTERVAL&nbsp;a&nbsp;DAY)&nbsp;FROM&nbsp;tbl;<br /> &nbsp;<br /> ERROR:&nbsp;tbl.a&nbsp;is&nbsp;not&nbsp;single&nbsp;valued.&nbsp;Attributes&nbsp;exposed&nbsp;in<br /> aggregate&nbsp;queries&nbsp;must&nbsp;also&nbsp;appear&nbsp;in&nbsp;the&nbsp;group&nbsp;by&nbsp;clause.</div>
<p>To make a query that returns this error executable, change <code>a</code> to <code>MAX(a)</code> or <code>MIN(a)</code> so that <code>INTERVAL a</code> receives only one value, as shown below. If the value of <code>a</code> is always the same, the result of the original query in MySQL is the same as the result of the changed query in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;COUNT(a),&nbsp;DATE_ADD(MAX(b),&nbsp;INTERVAL&nbsp;MAX(a)&nbsp;DAY)&nbsp;FROM&nbsp;tbl;<br /> &nbsp;<br /> count(a)&nbsp;date_add(max(b),&nbsp;INTERVAL&nbsp;max(a)&nbsp;DAY)<br /> =====================================================<br /> 4&nbsp;01/08/2000</div>
<h3>Using a Reserved Word as Column Name and Table Name</h3>
<p>Neither MySQL nor CUBRID allows reserved words to be used as they are for column names, table names, or aliases. To use a reserved word as such an identifier, enclose it in double quotes (<code>"</code>) or backticks (<code>`</code>). In CUBRID, square brackets (<code>[ ]</code>) are allowed as well.</p>
<p>Each DBMS has different reserved words. For example, MySQL does not treat <code>ROWNUM</code>, <code>TYPE</code>, <code>NAMES</code>, <code>FILE</code>, and <code>SIZE</code> as reserved words, whereas CUBRID reserves them all. For more details on CUBRID reserved words, see the online manual <a target="_self" href="/manual/841/en/Reserved%20Words">CUBRID SQL Guide &gt; Reserved Words</a>.</p>
<p>The following example uses reserved words as table and column names in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;`names`&nbsp;(`file`&nbsp;VARCHAR(255),&nbsp;`size`&nbsp;INT,&nbsp;`type`&nbsp;CHAR(10));<br /> CREATE&nbsp;TABLE&nbsp;"names"&nbsp;("file"&nbsp;VARCHAR(255),&nbsp;"size"&nbsp;INT,&nbsp;"type"&nbsp;CHAR(10));<br /> CREATE&nbsp;TABLE&nbsp;[names]&nbsp;([file]&nbsp;VARCHAR(255),&nbsp;[size]&nbsp;INT,&nbsp;[type]&nbsp;CHAR(10));<br /> SELECT&nbsp;[file],&nbsp;[size],&nbsp;[type]&nbsp;FROM&nbsp;[names];</div>
<h2>Functional Differences</h2>
<h3>Supporting Descending Index</h3>
<p>In MySQL, the syntax lets you specify a descending (<code>DESC</code>) index, but the descending index is <b><i>not actually</i></b> created. In CUBRID, the descending index is actually created.</p>
<p>The following example creates a descending index on the <code>ndate</code> column in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;CREATE&nbsp;INDEX&nbsp;ON&nbsp;test_tbl(no&nbsp;ASC,&nbsp;ndate&nbsp;DESC);</div>
<p>Note that in CUBRID you can also create an index by using the <code>REVERSE</code> keyword. The index is then created in the same order as when <code>DESC</code> is specified for the column.</p>
<p>The following example shows how to use the <code>REVERSE</code> keyword and the <code>DESC</code> keyword in order to create indexes for the <code>ndate</code> column in CUBRID in descending order.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;REVERSE&nbsp;INDEX&nbsp;ON&nbsp;test_tbl(ndate);&nbsp;</div>
<p>... which is the same as ...</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;INDEX&nbsp;ON&nbsp;test_tbl(ndate&nbsp;DESC);</div>
<p>Because MySQL does not actually create indexes in descending order, a workaround is to add a column that stores the values in reverse order and create an ascending index on that column. For example, convert the <code>DATE</code> value "2012-05-18" to the negative numeric value -20120518 and store it in the additional column.</p>
<p>The following example creates an ascending index on the <code>reverse_ndate</code> column in MySQL. MySQL requires an index name in <code>CREATE INDEX</code>; the names <code>idx_ndate</code> and <code>idx_reverse_ndate</code> are only illustrative.</p>
<p>1) Create the table with the additional column.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;test_tbl(no&nbsp;INT,&nbsp;ndate&nbsp;DATE,&nbsp;reverse_ndate&nbsp;INT);</div>
<p>2) Create an ascending index on the original column.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;INDEX&nbsp;idx_ndate&nbsp;ON&nbsp;test_tbl(ndate&nbsp;ASC);</div>
<p>3) Fill the additional column with the negated values.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">UPDATE&nbsp;test_tbl&nbsp;SET&nbsp;reverse_ndate&nbsp;=&nbsp;-ndate;</div>
<p>4) Create an ascending index on the additional column.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;INDEX&nbsp;idx_reverse_ndate&nbsp;ON&nbsp;test_tbl(reverse_ndate&nbsp;ASC);</div>
<p>However, whether you create a descending index on the <code>ndate</code> column or create the <code>reverse_ndate</code> column with reversed values and an ascending index on it, note the following. If one statement is an <code>UPDATE</code> or <code>DELETE</code>, another is a <code>SELECT</code>, <code>UPDATE</code>, or <code>DELETE</code>, and the two scan both indexes at the same time, the index scans run in opposite directions to each other. This increases the possibility of a deadlock, so keep it in mind when creating a descending index.</p>
<p>Both MySQL and CUBRID have bi-directional links between index nodes. Therefore, even if <code>ORDER BY DESC</code> is performed in the query, an ascending index can be used when the query planner determines that a reverse scan of the ascending index costs less than a sequential scan of the descending index. The reverse scan is advantageous only when the number of records to be scanned is relatively small.</p>
<p>The following example uses an ascending index when sorting in descending order in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;test_tbl(a&nbsp;INT,&nbsp;b&nbsp;char(1024000));<br /> csql&gt;&nbsp;CREATE&nbsp;INDEX&nbsp;i_test_tbl_a&nbsp;ON&nbsp;test_tbl(a);<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;test_tbl&nbsp;(a,&nbsp;b)&nbsp;VALUES&nbsp;(10,&nbsp;'a'),&nbsp;(20,&nbsp;'b'),&nbsp;(30,&nbsp;'c'),&nbsp;(40,&nbsp;'d'),&nbsp;(50,&nbsp;'e'),&nbsp;(60,&nbsp;'f'),&nbsp;(70,&nbsp;'g'),&nbsp;(80,&nbsp;'h'),&nbsp;(90,&nbsp;'i'),&nbsp;(100,&nbsp;'j');<br /> &nbsp;<br /> csql&gt;&nbsp;SELECT&nbsp;a&nbsp;FROM&nbsp;test_tbl&nbsp;WHERE&nbsp;a&nbsp;&gt;&nbsp;70&nbsp;ORDER&nbsp;BY&nbsp;a&nbsp;DESC;</div>
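<p>The idea of a reverse scan can be sketched with a sorted list standing in for the ascending index. This Python model is only an illustration under that assumption (a real B-tree walks linked leaf nodes, not a list), and the function name <code>reverse_scan_greater_than</code> is hypothetical.</p>

```python
import bisect

# Illustrative model: an ascending index is represented as a sorted list of keys.
ascending_index = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


def reverse_scan_greater_than(index, lower_bound):
    """Return keys greater than lower_bound in descending order
    by walking the ascending index backward from its last entry."""
    # Position of the first key that is strictly greater than lower_bound.
    start = bisect.bisect_right(index, lower_bound)
    return [index[i] for i in range(len(index) - 1, start - 1, -1)]


# Models: SELECT a FROM test_tbl WHERE a is greater than 70 ORDER BY a DESC
result = reverse_scan_greater_than(ascending_index, 70)
print(result)
```

<p>Only the three qualifying keys are visited, which matches the remark above that the reverse scan pays off when few records are scanned.</p>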
<h3>Supporting ROWNUM</h3>
<p><code>ROWNUM</code> numbers the result rows of a <code>SELECT</code> query in ascending order starting from 1 and can be used like a column of a table. With <code>ROWNUM</code>, you can add a serial number to the printed records and limit the number of records in the query result through conditions in the <code>WHERE</code> clause.</p>
<p>CUBRID supports <code>ROWNUM</code> while MySQL does not, so if you want to add a serial number to the records in MySQL, you should use a session variable. Both MySQL and CUBRID support <code>LIMIT &hellip; OFFSET</code> to limit the number of records in the query result; the following discussion, however, addresses <code>ROWNUM</code>.</p>
<p>The following two examples are used to print serial numbers for the records of the query result.</p>
<p>In MySQL, emulate <code>ROWNUM</code> by using a session variable.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">mysql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;test_tbl(col&nbsp;CHAR(1));<br /> mysql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;test_tbl&nbsp;VALUES&nbsp;('a'),&nbsp;('b'),('c'),('d');<br /> mysql&gt;&nbsp;SELECT&nbsp;@rownum&nbsp;:=&nbsp;@rownum&nbsp;+&nbsp;1&nbsp;as&nbsp;rownum,&nbsp;col&nbsp;FROM&nbsp;test_tbl&nbsp;WHERE&nbsp;(@rownum&nbsp;:=&nbsp;0)=0;<br /> +--------+------+<br /> |&nbsp;rownum&nbsp;|&nbsp;col&nbsp;|<br /> +--------+------+<br /> |&nbsp;1&nbsp;|&nbsp;a&nbsp;|<br /> |&nbsp;2&nbsp;|&nbsp;b&nbsp;|<br /> |&nbsp;3&nbsp;|&nbsp;c&nbsp;|<br /> |&nbsp;4&nbsp;|&nbsp;d&nbsp;|<br /> +--------+------+</div>
<p>In CUBRID, use <code>ROWNUM</code> directly.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;ROWNUM,&nbsp;col&nbsp;FROM&nbsp;test_tbl;<br /> &nbsp;<br /> rownum&nbsp;col<br /> ============================================<br /> 1&nbsp;'a'<br /> 2&nbsp;'b'<br /> 3&nbsp;'c'<br /> 4&nbsp;'d'</div>
<p>Oracle also supports <code>ROWNUM</code>. However, in a query that includes a <code>GROUP BY</code> or <code>ORDER BY</code> clause, Oracle assigns <code>ROWNUM</code> first and executes the <code>GROUP BY</code> and <code>ORDER BY</code> clauses afterward. Therefore, the printed <code>ROWNUM</code> values are not in sorted order. If you want the <code>GROUP BY</code> or <code>ORDER BY</code> clause to be executed first, put the query in a subquery of the <code>FROM</code> clause and apply <code>ROWNUM</code> to its result.</p>
<p>The following example shows printing <code>ROWNUM</code> in the order of <code>ORDER BY</code> in Oracle.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;ROWNUM,&nbsp;contents&nbsp;FROM&nbsp;(SELECT&nbsp;contents&nbsp;ORDER&nbsp;BY&nbsp;date)&nbsp;AS&nbsp;subtbl;</div>
<p>CUBRID provides the <code>GROUPBY_NUM()</code> and <code>ORDERBY_NUM()</code> functions, which number the rows in the order produced by <code>GROUP BY</code> and <code>ORDER BY</code>, respectively.</p>
<p>In the MySQL example, there is no <code>ROWNUM</code> and the session variable <code>@rownum</code> is used instead, so the numbers follow the final sorted result even when the query has a <code>GROUP BY</code> or <code>ORDER BY</code> clause. In CUBRID, however, <code>ROWNUM</code> is used, so you should use the <code>GROUPBY_NUM()</code> and <code>ORDERBY_NUM()</code> functions if you want the printed numbers to follow that order without using a subquery.</p>
<p>The following two examples show the difference between using the <code>GROUPBY_NUM()</code> function and not using it. The first example limits the number of result records by using <code>ROWNUM</code> and then applies <code>GROUP BY</code> to the remaining records.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;ROWNUM,&nbsp;host_year,&nbsp;MIN(score)&nbsp;FROM&nbsp;history<br /> WHERE&nbsp;ROWNUM&nbsp;BETWEEN&nbsp;1&nbsp;AND&nbsp;5&nbsp;GROUP&nbsp;BY&nbsp;host_year;<br /> &nbsp;<br /> rownum&nbsp;host_year&nbsp;min(score)<br /> =========================================================<br /> 6&nbsp;2000&nbsp;'03:41.0'<br /> 6&nbsp;2004&nbsp;'01:45.0'</div>
<p>The following example shows limiting the number of result records by using <code>GROUPBY_NUM()</code> for the record set ordered by using <code>GROUP BY</code>.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SELECT&nbsp;GROUPBY_NUM(),&nbsp;host_year,&nbsp;MIN(score)&nbsp;FROM&nbsp;history<br /> GROUP&nbsp;BY&nbsp;host_year&nbsp;HAVING&nbsp;GROUPBY_NUM()&nbsp;BETWEEN&nbsp;1&nbsp;AND&nbsp;5;<br /> &nbsp;<br /> groupby_num()&nbsp;host_year&nbsp;min(score)<br /> ==================================================<br /> 1&nbsp;1968&nbsp;'8.9'<br /> 2&nbsp;1980&nbsp;'01:53.0'<br /> 3&nbsp;1984&nbsp;'13:06.0'<br /> 4&nbsp;1988&nbsp;'01:58.0'<br /> 5&nbsp;1992&nbsp;'02:07.0'</div>
<p>The following two examples show the difference between using the <code>FOR ORDERBY_NUM()</code> function and not using it. The first example limits the number of result records by using <code>ROWNUM</code> and then applies <code>ORDER BY</code> to the remaining records.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;athlete,&nbsp;score&nbsp;FROM&nbsp;history<br /> WHERE&nbsp;ROWNUM&nbsp;BETWEEN&nbsp;3&nbsp;AND&nbsp;5&nbsp;ORDER&nbsp;BY&nbsp;score;<br /> &nbsp;<br /> athlete&nbsp;score<br /> ============================================<br /> 'Thorpe&nbsp;Ian'&nbsp;'01:45.0'<br /> 'Thorpe&nbsp;Ian'&nbsp;'03:41.0'<br /> 'Hackett&nbsp;Grant'&nbsp;'14:43.0'</div>
<p>The following example shows limiting the number of the result records by using <code>ORDERBY_NUM()</code> for the record set ordered by using <code>ORDER BY</code>.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;athlete,&nbsp;score&nbsp;FROM&nbsp;history<br /> ORDER&nbsp;BY&nbsp;score&nbsp;FOR&nbsp;ORDERBY_NUM()&nbsp;BETWEEN&nbsp;3&nbsp;AND&nbsp;5;<br /> athlete&nbsp;score<br /> ============================================<br /> 'Luo&nbsp;Xuejuan'&nbsp;'01:07.0'<br /> 'Rodal&nbsp;Vebjorn'&nbsp;'01:43.0'<br /> 'Thorpe&nbsp;Ian'&nbsp;'01:45.0'</div>
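<p>The difference between the two queries comes down to whether rows are numbered before or after sorting. The following Python sketch models both orders of operation. It is only an illustration: the rows are hypothetical values, not the contents of the <code>history</code> table, and the function names are assumptions introduced here.</p>

```python
# Hypothetical rows in fetch order (not the contents of the history table).
fetched = [("A", 50), ("B", 20), ("C", 40), ("D", 10), ("E", 30), ("F", 60)]


def rownum_then_order(rows, first, last):
    """Model of WHERE ROWNUM BETWEEN first AND last ... ORDER BY score:
    number rows in fetch order, filter, then sort the survivors."""
    wanted = range(first, last + 1)
    kept = [row for number, row in enumerate(rows, start=1) if number in wanted]
    return sorted(kept, key=lambda row: row[1])


def order_then_number(rows, first, last):
    """Model of ORDER BY score FOR ORDERBY_NUM() BETWEEN first AND last:
    sort first, number the sorted rows, then filter."""
    wanted = range(first, last + 1)
    ordered = sorted(rows, key=lambda row: row[1])
    return [row for number, row in enumerate(ordered, start=1) if number in wanted]


by_rownum = rownum_then_order(fetched, 3, 5)
by_orderby_num = order_then_number(fetched, 3, 5)
print(by_rownum)
print(by_orderby_num)
```

<p>With the same bounds, the first function returns the third to fifth fetched rows in sorted order, while the second returns the third to fifth rows of the fully sorted result.</p>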
<h3>Difference of AUTO_INCREMENT</h3>
<p>What should we do if we must know the primary key created after executing <code>INSERT</code>?</p>
<p>The <code>LAST_INSERT_ID()</code> function returns the value most recently inserted into an <code>AUTO_INCREMENT</code> column by the program's database connection. For example, you can use it when you <code>INSERT</code> a record into a table with an automatically generated primary key and then need to insert that key as a foreign key value into another table. Both MySQL and CUBRID support this function. However, it may return an incorrect value in some specific cases in CUBRID, so I recommend that you get the <code>AUTO_INCREMENT</code> value directly rather than using the <code>LAST_INSERT_ID()</code> function in CUBRID. These cases are explained below.</p>
<p>In MySQL and CUBRID, when a record is inserted into a table that includes an <code>AUTO_INCREMENT</code> column, the value of the column is automatically increased by <b>1</b>.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;tbl&nbsp;(col&nbsp;INT&nbsp;AUTO_INCREMENT&nbsp;PRIMARY&nbsp;KEY,&nbsp;col2&nbsp;INT);<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;(col2)&nbsp;VALUES(1);<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;(col2)&nbsp;VALUES(2);<br /> INSERT&nbsp;INTO&nbsp;tbl&nbsp;(col2)&nbsp;VALUES(3);<br /> SELECT&nbsp;LAST_INSERT_ID();<br /> &nbsp;<br /> +------------------+<br /> |&nbsp;last_insert_id()&nbsp;|<br /> +------------------+<br /> |&nbsp;3&nbsp;|<br /> +------------------+</div>
<p>After executing <code>INSERT</code> to <b>Table A</b>, if you are trying to <code>INSERT</code> a foreign key value to <b>Table B</b> (which refers to <i>Table A</i>) by using the <code>LAST_INSERT_ID()</code> function value and an application or a server is unexpectedly and abnormally terminated, what will happen? (<i>on the assumption that the procedure is one transaction, of course</i>)</p>
<p>In MySQL, the whole procedure, including the values increased by <code>AUTO_INCREMENT</code>, is rolled back. If the <code>AUTO_INCREMENT</code> value for the record being inserted was 3, it is 3 again after the transaction is rolled back. In CUBRID, however, the increase of the <code>AUTO_INCREMENT</code> value is not affected by transaction rollback. Therefore, if the <code>AUTO_INCREMENT</code> value for the record was 3, the next value is 4 after the transaction is rolled back.</p>
<p>In MySQL, you can get the value entered in the column specified as the <code>AUTO_INCREMENT</code> by using the <code>LAST_INSERT_ID()</code> function only. However, in CUBRID, you can directly get the <code>AUTO_INCREMENT</code> value <i>without</i> using the <code>LAST_INSERT_ID()</code> function. It is just like using the <a target="_self" href="/manual/841/en/CREATE%20SERIAL">SERIAL</a> object of CUBRID because <code>AUTO_INCREMENT</code> has been implemented as a SERIAL object in CUBRID. When an <code>AUTO_INCREMENT</code> column is created in CUBRID, the name of the SERIAL object of the column is specified as "<span style="font-family: monospace;"><b>&lt;table name&gt;_ai_&lt;column name&gt;</b></span>" internally.</p>
<p>The following example shows how to get the next value and the current value of an <code>AUTO_INCREMENT</code> column in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;&lt;table&nbsp;name&gt;_ai_&lt;column&nbsp;name&gt;.NEXT_VALUE;<br /> SELECT&nbsp;&lt;table&nbsp;name&gt;_ai_&lt;column&nbsp;name&gt;.CURRENT_VALUE;</div>
<p>In the current CUBRID version, the <code>LAST_INSERT_ID()</code> function may malfunction in the following specific cases. Therefore, I recommend that you get the <code>AUTO_INCREMENT</code> value directly rather than using the <code>LAST_INSERT_ID()</code> function. The <code>LAST_INSERT_ID()</code> value is a session variable that is managed per database connection. If the broker is switched in CUBRID, or if a failover occurs because the master node is switched in an HA environment, the function may not return the desired value. This will be fixed in a later version.</p>
<p>In the following example, a session variable is used as intermediate storage for the next <code>AUTO_INCREMENT</code> value. In an application, use a variable in the program as the intermediate storage instead. Note that a SERIAL in CUBRID has the same function as a SEQUENCE in Oracle.</p>
<p>The following example shows how to <code>INSERT</code> the primary key and the foreign key after getting the <code>NEXT_VALUE</code> of the SERIAL.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;tblPK&nbsp;(col&nbsp;INT&nbsp;AUTO_INCREMENT&nbsp;PRIMARY&nbsp;KEY,&nbsp;col2&nbsp;INT);<br /> &nbsp;<br /> csql&gt;&nbsp;CREATE&nbsp;TABLE&nbsp;tblFK&nbsp;(colfk&nbsp;INT&nbsp;AUTO_INCREMENT&nbsp;PRIMARY&nbsp;KEY,&nbsp;colfk2&nbsp;INT,&nbsp;CONSTRAINT&nbsp;fk_col&nbsp;FOREIGN&nbsp;KEY&nbsp;(colfk2)&nbsp;REFERENCES&nbsp;tblPK&nbsp;(col));</div>
<p>At this time, the name of SERIAL created by the <code>tblPK AUTO_INCREMENT</code> is <code>tblpk_ai_col</code>. You can check that by executing:</p>
<p>&nbsp;</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT&nbsp;name&nbsp;FROM&nbsp;db_serial&nbsp;WHERE&nbsp;class_name='tblpk';</div>
<p>Insert the <code>tblPK AUTO_INCREMENT</code> value directly into both the <code>tblPK</code> primary key and the <code>tblFK</code> foreign key. You don't need to save the <code>tblFK colfk</code> value in an additional variable since it is not referred to by any other table.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">csql&gt;&nbsp;SET&nbsp;@a=(SELECT&nbsp;tblpk_ai_col.NEXT_VALUE);<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;tblPK&nbsp;VALUES&nbsp;(@a,10);<br /> csql&gt;&nbsp;INSERT&nbsp;INTO&nbsp;tblFK(colfk2)&nbsp;VALUES&nbsp;(@a);</div>
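<p>In an application, the session variable above is replaced by a program variable. The following is a minimal JDBC sketch of the same steps for the <code>tblPK</code> and <code>tblFK</code> tables, assuming an already opened connection. The class and method names are hypothetical, and lowercasing the SERIAL name is an assumption based on the <code>tblpk_ai_col</code> name shown in this article.</p>

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

final class SerialInsertSketch {
    // Builds the internal SERIAL name "table_ai_column" described above.
    // Lowercasing is an assumption taken from the tblPK example (tblpk_ai_col).
    static String serialName(String table, String column) {
        return (table + "_ai_" + column).toLowerCase(Locale.ROOT);
    }

    // Reads NEXT_VALUE once, then uses it as the primary key in tblPK
    // and as the foreign key in tblFK, all in one transaction.
    static long insertParentAndChild(Connection conn, int col2) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try {
            long id;
            String query = "SELECT " + serialName("tblPK", "col") + ".NEXT_VALUE";
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(query)) {
                if (!rs.next()) {
                    throw new SQLException("SERIAL returned no value");
                }
                id = rs.getLong(1);
            }
            try (PreparedStatement pk = conn.prepareStatement("INSERT INTO tblPK (col, col2) VALUES (?, ?)");
                 PreparedStatement fk = conn.prepareStatement("INSERT INTO tblFK (colfk2) VALUES (?)")) {
                pk.setLong(1, id);
                pk.setInt(2, col2);
                pk.executeUpdate();
                fk.setLong(1, id);
                fk.executeUpdate();
            }
            conn.commit();
            return id;
        } catch (SQLException e) {
            // The SERIAL value is not reused after this rollback, as explained above.
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

<p>Because the key is held in a program variable, the sketch does not depend on <code>LAST_INSERT_ID()</code> and is therefore not affected by a broker switch or an HA failover between the two statements in the way a session variable would be.</p>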
<p>The following example shows how to write a statement in an iBatis sqlmap that returns the primary key value when a record is inserted into <code>tblPK</code>. For the <code>tblFK</code> foreign key, use the value returned from the "insert" object.</p>
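<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">&lt;!--&nbsp;The&nbsp;statement&nbsp;id&nbsp;and&nbsp;resultClass&nbsp;are&nbsp;placeholders&nbsp;--&gt;<br /> &lt;insert&nbsp;id="insertTblPK"&gt;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&lt;selectKey&nbsp;keyProperty="id"&nbsp;resultClass="int"&gt;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;SELECT&nbsp;tblpk_ai_col.NEXT_VALUE<br /> &nbsp;&nbsp;&nbsp;&nbsp;&lt;/selectKey&gt;<br /> &nbsp;&nbsp;&nbsp;&nbsp;INSERT&nbsp;INTO&nbsp;tblPK(col,&nbsp;col2)&nbsp;VALUES&nbsp;(#id#,&nbsp;#col2#)<br /> &lt;/insert&gt;</div>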
<h3>Length Difference of String Functions Such as CHAR_LENGTH</h3>
<p>For a string function, MySQL calculates the length <i>based on the number of characters</i>. However, CUBRID calculates it <b>based on the character byte length</b>. This difference depends on the database support of the character set. MySQL supports the character set and calculates the length based on the character set. However, CUBRID does not support the character set and considers all data as bytes, so it calculates the length based on the byte length.</p>
<p>In the UTF-8 character set, one Korean character takes 3 bytes. Therefore, for Korean text the return value of <a target="_self" href="/manual/841/en/CHAR_LENGTH%20Function">CHAR_LENGTH</a> in CUBRID is three times the number of characters. In CUBRID, the functions that take a length or position, such as <code>POSITION</code>, <code>RPAD</code> and <code>SUBSTR</code>, also work in bytes, not characters. Likewise, the length of a <code>CHAR</code> or <code>VARCHAR</code> column specified when creating a table is a byte length.</p>
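<p>The gap between the two ways of counting can be checked on the application side. The following is a small Java sketch, with hypothetical class and method names, that counts the characters and the UTF-8 bytes of a string made of "CUBRID" followed by two Korean characters.</p>

```java
import java.nio.charset.StandardCharsets;

final class LengthDemo {
    // Length counted the way CUBRID 8.4.x counts it: bytes of the UTF-8 encoding.
    static int byteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Length counted the way MySQL counts it: number of characters (code points).
    static int charLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "CUBRID" plus the two Korean characters U+D55C and U+AE00.
        String s = "CUBRID\uD55C\uAE00";
        System.out.println(charLength(s)); // 8
        System.out.println(byteLength(s)); // 12 = 6 * 1 byte + 2 * 3 bytes
    }
}
```

<p>A check like this is useful when sizing <code>VARCHAR</code> columns or computing <code>SUBSTR</code> offsets for data that is migrated from MySQL.</p>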
<blockquote class="q4">
<p>In the second half of this year&nbsp;we will release a new version of CUBRID under the code name "<b>Apricot</b>" with full Unicode support where the&nbsp;calculation will be made based on the character length, <i>not</i> the byte length.</p>
</blockquote>
<p>The following example shows the result of executing <code>CHAR_LENGTH</code> in CUBRID.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Bash" editor_component="code_highlighter">SELECT&nbsp;CHAR_LENGTH('CUBRIDa');<br /> &nbsp;<br /> char&nbsp;length('CUBRIDa')<br /> ==================<br /> 13</div>
<h3>Cursor Holdability After Commit During Record Set Fetch</h3>
<p>Cursor holdability means that an application keeps the record set of a query result open so that it can fetch the next record even after an explicit or automatic commit. For this purpose, the JDBC specification defines <code>ResultSet.HOLD_CURSORS_OVER_COMMIT</code> and <code>ResultSet.CLOSE_CURSORS_AT_COMMIT</code>. Both MySQL and CUBRID ignore the setting made with the <code>conn.setHoldability()</code> function. MySQL always runs as <code>HOLD_CURSORS_OVER_COMMIT</code>, and CUBRID always runs as <code>CLOSE_CURSORS_AT_COMMIT</code>.</p>
<blockquote class="q4">
<p>In MySQL, when <code>conn.getHoldability()</code> is called, the <code>CLOSE_CURSORS_AT_COMMIT</code> value is returned, which is opposite to the current operation. Refer to the source code of <span style="font-family: monospace;">mysql-connector-java-5.1.20.tar.gz</span> of <a target="_self" href="http://www.mysql.com/downloads/connector/j/">http://www.mysql.com/downloads/connector/j/</a>.</p>
</blockquote>
<p>In other words, MySQL holds the cursor even if a commit occurs during fetch, and CUBRID closes the cursor when a commit occurs during fetch. Therefore, to hold the cursor while fetching the record set of a <code>SELECT</code> in CUBRID 8.4.x or lower versions, set the auto commit mode to <code>FALSE</code> and do not commit the transaction until the fetch is finished. (In the new version of CUBRID "Apricot" to be released in the second half of this year, the cursor will be held regardless of commit unless the record set is closed.)</p>
<p>The following schema is used below to demonstrate how cursor holdability works.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">CREATE&nbsp;TABLE&nbsp;tbl1(a&nbsp;INT);<br /> INSERT&nbsp;INTO&nbsp;tbl1&nbsp;VALUES&nbsp;(1),(2),(3),(4);<br /> CREATE&nbsp;TABLE&nbsp;tbl2(a&nbsp;INT);</div>
<p>The following example shows a program that, with auto commit enabled (<code>autocommit=true</code>), executes <code>SELECT</code> on the four rows of <i>Table tbl1</i> and <code>INSERT</code>s each fetched row into <i>Table tbl2</i>. MySQL holds the cursor even when an auto commit is executed, so four rows are inserted into <i>Table tbl2</i>. CUBRID, on the contrary, closes the cursor because of the auto commit that follows the first <code>INSERT</code>. Therefore, no more rows are fetched and only one row is inserted into <i>Table tbl2</i>.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Java" editor_component="code_highlighter">import&nbsp;java.sql.Connection;<br /> import&nbsp;java.sql.ResultSet;<br /> import&nbsp;java.sql.SQLException;<br /> import&nbsp;java.sql.Statement;<br /> &nbsp;<br /> //&nbsp;Assumes&nbsp;conn&nbsp;is&nbsp;in&nbsp;auto&nbsp;commit&nbsp;mode,&nbsp;so&nbsp;every&nbsp;executeUpdate()&nbsp;commits&nbsp;implicitly.<br /> public&nbsp;static&nbsp;void&nbsp;executeTr(Connection&nbsp;conn)&nbsp;throws&nbsp;SQLException&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;Statement&nbsp;stmt&nbsp;=&nbsp;conn.createStatement();<br /> &nbsp;&nbsp;&nbsp;&nbsp;Statement&nbsp;stmt2&nbsp;=&nbsp;conn.createStatement();<br /> &nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;q1&nbsp;=&nbsp;"SELECT&nbsp;a&nbsp;FROM&nbsp;tbl1";<br /> &nbsp;&nbsp;&nbsp;&nbsp;ResultSet&nbsp;rs&nbsp;=&nbsp;stmt.executeQuery(q1);<br /> &nbsp;&nbsp;&nbsp;&nbsp;while&nbsp;(rs.next())<br /> &nbsp;&nbsp;&nbsp;&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;int&nbsp;a&nbsp;=&nbsp;rs.getInt("a");<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;String&nbsp;q2&nbsp;=&nbsp;"INSERT&nbsp;INTO&nbsp;tbl2&nbsp;VALUES&nbsp;("&nbsp;+&nbsp;a&nbsp;+&nbsp;")";<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;//&nbsp;MySQL&nbsp;keeps&nbsp;rs&nbsp;open&nbsp;after&nbsp;this&nbsp;implicit&nbsp;commit;&nbsp;CUBRID&nbsp;8.4.x&nbsp;closes&nbsp;it.<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;stmt2.executeUpdate(q2);<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;stmt.close();<br /> &nbsp;&nbsp;&nbsp;&nbsp;stmt2.close();<br /> }</div>
<h2>Conclusion</h2>
<p>In this article I have reviewed many (but not all) differences between MySQL and CUBRID, which I recommend users to know before they switch from MySQL to CUBRID. It is important to fully understand these differences to successfully apply CUBRID to services.</p>
<p>By Donghyeon Lee, NHN Business Platform DBMS Development Lab</p>
<blockquote class="q4">
<p><b>From the author:</b></p>
<p>"Is your baby a boy?" I am a father who is asked this question even though my little baby is a girl. I will do my best until the day CUBRID becomes a DBMS that is very familiar to users.</p>
</blockquote>
<h2>References</h2>
<ol>
<li>MySQL 5.5 Reference Manual: http://dev.mysql.com/doc/refman/5.5/en/create-index.html</li>
<li>MySQL Performance Blog: http://www.mysqlperformanceblog.com/2006/05/09/descending-indexing-and-loose-index-scan/</li>
<li>Interface ResultSet (Java Platform SE 7): http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html</li>
</ol>]]></description>
                        <pubDate>Mon, 30 Jul 2012 17:07:22 +0900</pubDate>
                        <category>CUBRID Internals</category>
                        <category>MySQL</category>
                        <category>CUBRID Comparison</category>
                        <category>SQL</category>
                        <category>Migration</category>
                                </item>
        										        <item>
            <title>What is SPDY? Deployment Recommendations</title>
            <dc:creator>Se Hoon Park</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/what-is-spdy-deployment-recommendations/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/what-is-spdy-deployment-recommendations/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/what-is-spdy-deployment-recommendations/#comment</comments>
                                    <description><![CDATA[<p>&nbsp;</p>
<p>In the middle of this year a <a target="_self" href="http://news.cnet.com/8301-1023_3-57472566-93/facebook-endorses-googles-spdy-networking-protocol/">news report</a> announced that Facebook is planning to support Google's SPDY protocol on a large scale, and that it is already implementing SPDY/v2. Here is the <a target="_self" href="http://lists.w3.org/Archives/Public/ietf-http-wg/2012JulSep/0251.html">official response</a> from Facebook that I have found on this topic. Among the various efforts devised and suggested by Google to make the Web faster, I think SPDY will be the one to become a new industry standard, and it will be included in HTTP/2.0.</p>
<p>SPDY, whose name is derived from <b>SPeeDy</b>, is a new protocol Google has suggested as a part of its efforts to "<i>make the Web faster</i>." It is intended to use the present and future Internet environment more efficiently by addressing the disadvantages of HTTP, which was devised in the early Internet environment.</p>
<p>In this article, I will provide a brief introduction to the features and merits of SPDY. I will also describe the current state of SPDY support, and what to do and what to consider when introducing it.</p>
<h2>When was the latest version of HTTP released?</h2>
<p>HTTP version 0.9 was first announced in 1991, and HTTP 1.0 and 1.1 were released in 1996 and 1999, respectively. HTTP has not changed in more than 10 years since then. These days, however, a webpage is 20 times larger and makes 20 times more HTTP requests than a webpage in the 1990s. <b>Table 1</b> below shows data quoted from Google I/O 2012.</p>
<table style="margin: 0 auto;">
<caption><b>Table 1: Comparison of Mean Webpage Size in 2010 and 2012.</b></caption> <thead> 
<tr>
<th></th> <th>Mean page size</th> <th>Mean number of requests per page</th> <th>Mean number of domains</th>
</tr>
</thead> 
<tbody>
<tr>
<td>2010 Nov. 15</td>
<td>702 KB</td>
<td>74</td>
<td>10</td>
</tr>
<tr>
<td>2012 May 5</td>
<td>1059 KB</td>
<td>84</td>
<td>12</td>
</tr>
</tbody>
</table>
<p>The size of the Yahoo! main page in 1996 was 34 KB. That is only about 1/30 of the mean webpage size in 2012. There is a significant gap even between 2010 and 2012, let alone between the 1990s and 2012. This is because the mean page size and the number of requests keep increasing as the user experience (UX) becomes more sophisticated along with the spread of high-speed Internet.</p>
<p>Compared with webpages in the past, today's webpages:</p>
<p>&nbsp;</p>
<ul>
<li>Consist of many more resources.</li>
<li>Use multiple domains.</li>
<li>Operate more dynamically.</li>
<li>Put more emphasis on security.</li>
</ul>
<p>&nbsp;</p>
<p>Considering how different today's web environment is from the past, Google has announced the SPDY protocol, which addresses the shortcomings of HTTP. SPDY focuses especially on resolving the problem of page load latency.</p>
<h2>Features of SPDY</h2>
<p><b>Figure 1</b>&nbsp;below shows the layers of SPDY compared to the traditional TCP/IP layer model.</p>
<p style="text-align: center;"><img editor_component="image_link" height="266" width="567" alt="http_vs_spdy.png" src="/files/attach/images/220547/186/504/http_vs_spdy.png" /></p>
<p style="text-align: center;"><b>Figure 1: HTTP vs. SPDY.</b></p>
<p>The features of SPDY can be summarized as follows:</p>
<p>&nbsp;</p>
<ul>
<li>Always operates on Transport Layer Security (TLS).       
<ul>
<li><i>Transport Layer Security (TLS) is the successor to Secure Sockets Layer (SSL). Because they are two versions of the same protocol, the names TLS and SSL are often used interchangeably, and this article does so as well.</i></li>
</ul>
</li>
<li>Therefore, SPDY applies only to websites served over HTTPS.</li>
</ul>
<p>&nbsp;</p>
<h3>HTTP Header compression</h3>
<p>Because HTTP headers carry a lot of redundant content in every request, you can improve performance significantly just by compressing them. According to a report by Google, compressing HTTP headers reduces their size by 10-35% even in the initial request, and by 80-97% when several requests are made over a long-lived connection. Header compression is even more useful when the upload bandwidth is relatively small, as on mobile devices. As HTTP headers are about 2 KB on average these days and are growing, the benefit of compressing them is expected to grow in the future.</p>
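<p>The reason later requests compress so much better is that the compressor keeps its history for the whole connection, so repeated header lines become short back-references. The following Java sketch illustrates this effect with the JDK's zlib <code>Deflater</code>. It is a simplification under stated assumptions: the header values are made up, the class and method names are hypothetical, and the predefined dictionary that SPDY's real header compression also uses is omitted.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class HeaderCompressionSketch {
    // Compresses one header block on a shared zlib stream and returns the number of bytes produced.
    static int compressedSize(Deflater deflater, String headerBlock) {
        byte[] input = headerBlock.getBytes(StandardCharsets.US_ASCII);
        deflater.setInput(input);
        // The buffer is far larger than this small sample needs, so one call drains the output.
        byte[] out = new byte[8192];
        // SYNC_FLUSH emits the pending data but keeps the compression history for later blocks.
        return deflater.deflate(out, 0, out.length, Deflater.SYNC_FLUSH);
    }

    // Made-up request headers; only the path differs between requests.
    static String sampleHeaders(String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: www.example.com\r\n"
             + "User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.11 Chrome/20.0\r\n"
             + "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\n"
             + "Accept-Encoding: gzip,deflate,sdch\r\n"
             + "Accept-Language: en-US,en;q=0.8\r\n"
             + "Cookie: session=0123456789abcdef0123456789abcdef\r\n\r\n";
    }

    public static void main(String[] args) {
        // One Deflater per connection: its history is shared by all requests on that connection.
        Deflater deflater = new Deflater();
        System.out.println("raw size:       " + sampleHeaders("/index.html").length());
        System.out.println("first request:  " + compressedSize(deflater, sampleHeaders("/index.html")));
        System.out.println("second request: " + compressedSize(deflater, sampleHeaders("/style.css")));
        deflater.end();
    }
}
```

<p>When run, the second request should come out far smaller than the first, because almost all of its bytes repeat what the compressor has already seen on the same connection.</p>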
<h3>Binary protocol</h3>
<p>Because SPDY uses binary framing rather than text-based framing, parsing is faster and less error-prone.</p>
<h3>Multiplexing</h3>
<p>SPDY handles multiple independent streams concurrently in a single connection. HTTP handles one request at a time per connection and answers requests strictly in the order they were made, whereas SPDY handles multiple requests and responses concurrently with a small number of connections. In addition, unlike HTTP pipelining, in which responses are returned through a FIFO queue so that one delayed response delays all the others, SPDY handles each request and response independently.</p>
<h3>Full-duplex interleaving and stream prioritization</h3>
<p>SPDY allows one stream to be interleaved with another within a connection and supports stream prioritization. As a result, higher-priority data can cut in while lower-priority data is being transferred and can be delivered earlier.</p>
<h3>Server push</h3>
<p>Unlike Comet and long polling, servers can push content without a client request. Unlike inlining, pushed resources can be cached, and server push uses the same bandwidth as inlining or less. Implementing server push, however, requires additional web server application logic.</p>
<h3>No need to re-write a website</h3>
<p>Except for some features that require additional implementation, such as server push, you don't need to change a website itself to apply SPDY. However, the browser and the server should support SPDY. SPDY can be applied completely and transparently to browser users. In other words, there is no protocol scheme like "spdy://." The browser also displays nothing with regard to the use of the SPDY protocol.</p>
<p>Based on such characteristics of SPDY, the following table shows the difference between HTTP and SPDY.</p>
<table style="margin: 0 auto;">
<caption><b>Table 2: HTTP/1.1 vs. SPDY.</b></caption> <thead> 
<tr>
<th></th> <th>HTTP</th> <th>SPDY</th>
</tr>
</thead> 
<tbody>
<tr>
<td>Secure</td>
<td>Not Default</td>
<td>Default</td>
</tr>
<tr>
<td>Header Compression</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>Multiplexing</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>Full-Duplex</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>Prioritization</td>
<td>No (browsers employ heuristics instead)</td>
<td>Yes</td>
</tr>
<tr>
<td>Server push</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>DNS Lookup</td>
<td>More</td>
<td>Less</td>
</tr>
<tr>
<td>Connections</td>
<td>More</td>
<td>Less</td>
</tr>
</tbody>
</table>
<p>Therefore, it can be said that SPDY is a protocol designed to use TCP connection more efficiently by improving the data transfer format and connection management of HTTP.&nbsp;As a result of such efforts, in a test with the top 25 websites, SPDY worked 39-55% faster compared to HTTP + SSL.</p>
<h2>Why does SPDY need TLS?</h2>
<p><b>Why does SPDY use TLS even though using TLS causes latency due to encryption/decryption?</b> Google's <a target="_self" href="http://www.chromium.org/spdy/spdy-whitepaper/">SPDY Whitepaper</a> states the following answer for this question:&nbsp;</p>
<blockquote class="q1">
<p><i>"In the long term, the importance of web security will be emphasized more and more, and thus we want to get a better security in the future by specifying TLS as a sub protocol of SPDY.&nbsp;We need TLS for compatibility with the current network infrastructure. In other words, we need it to prevent any compatibility issue with the communication going through the existing proxy."</i></p>
</blockquote>
<p>Apart from this stated reason, a look at actual SPDY implementations shows that SPDY depends heavily on TLS's Next Protocol Negotiation (NPN) extension. The NPN extension determines whether a connection arriving on port 443 uses SPDY and which version of SPDY it uses, so that both sides can decide whether to use SPDY for the subsequent communication. Without TLS NPN, an additional round trip (RTT) would be needed to negotiate SPDY.</p>
<h2>Efforts for Standardization&nbsp;</h2>
<p>SPDY is being developed as an open networking protocol and has been proposed to the IETF as a basis for HTTP/2.0. SPDY is a sub-project of the Google Chromium project, and thus the Chromium client implementation and the server tools are all being developed as open source.</p>
<h2>Future of SPDY</h2>
<p>Most recently SPDY Draft 3 was released, and SPDY Draft 4 is under development.&nbsp;The features likely to be added in Draft 4 are as follows:&nbsp;</p>
<p>&nbsp;</p>
<ul>
<li>Name resolution push</li>
<li>Certificate data push</li>
<li>Explicit proxy support</li>
</ul>
<p>&nbsp;</p>
<p>The final goal of SPDY is to provide a page within "a single connection setup time + bytes/bandwidth time".</p>
<h2>Browsers, Servers, Libraries and Web Services Supporting SPDY</h2>
<p>Currently a variety of browsers and servers support SPDY, and Google, who originally suggested SPDY, already provides almost all of its services with SPDY. The browsers, servers, libraries and services supporting SPDY are as follows.</p>
<h3>Browsers Supporting SPDY</h3>
<p>As of July 2012, the following is a list of browsers which support the SPDY protocol.</p>
<h4>Google Chrome/Chromium</h4>
<p>Chrome and Chromium have supported SPDY from their initial version. If you enter the following URI, you can inspect SPDY sessions in Chrome/Chromium: <a target="_self" href="chrome://net-internals/#events&amp;q=type:SPDY_SESSION%20is:active">chrome://net-internals/#events&amp;q=type:SPDY_SESSION%20is:active</a>. For example, if you visit <a target="_self" href="http://www.gmail.com">www.gmail.com</a>, in the&nbsp;<a target="_self" href="chrome://net-internals/#events&amp;q=type:SPDY_SESSION%20is:active">chrome://</a>&nbsp;tab&nbsp;you can see multiple SPDY sessions being created. Android mobile Chrome also supports SPDY.</p>
<h4>Firefox 11 and later versions</h4>
<p>SPDY was added in Firefox 11 but was not enabled by default until Firefox 13. If you enter <i>about:config</i>, the URI for the Firefox settings, and look up <code>network.http.spdy.enabled</code>, you can check whether support for SPDY is enabled. Firefox 14 for Android also supports SPDY.</p>
<h4>Amazon Silk</h4>
<p>The Silk browser installed on the Kindle Fire, an Android-based e-book reader from Amazon, also supports SPDY. It uses SPDY to communicate with the Amazon EC2 service.</p>
<h4>Default browser of Android 3.0 and higher</h4>
<p>The default browser of Android 3.0 (Honeycomb) and 4.0 (Ice Cream Sandwich) supports SPDY.</p>
<p>For more information about which browsers support SPDY, see&nbsp;<a target="_self" href="http://caniuse.com/spdy">http://caniuse.com/spdy</a>.</p>
<h3>Servers and Libraries Supporting SPDY</h3>
<p>SPDY support is being driven mainly by the major web servers and application servers, and a variety of libraries that implement SPDY are also being developed.</p>
<h4>Nginx</h4>
<p>Nginx released a beta version of its SPDY module on June 15, 2012, and has continuously provided <a target="_self" href="http://nginx.org/patches/spdy">patches</a>. See the <a target="_self" href="http://www.slideshare.net/profyclub_ru/spdy-146-14829935">SPDY: 146% faster</a>&nbsp;Slideshare presentation by the Nginx team to learn more about it.</p>
<h4>Jetty</h4>
<p>Jetty also provides the <a target="_self" href="http://wiki.eclipse.org/Jetty/Feature/SPDY">SPDY module</a>.</p>
<h4>Apache</h4>
<p>The <a target="_self" href="http://code.google.com/p/mod-spdy/">SPDY module for Apache 2.2</a> is also being developed.</p>
<h4>Libraries</h4>
<p>In addition, SPDY implementations for Python, Ruby and node.js servers have already been developed or are currently being developed. Several <a target="_self" href="http://libspdy.org/index.html">SPDY C libraries</a> exist, including <b>libspdy</b>, <b>spindly</b> and <b>spdylay</b>. A library for using <a target="_self" href="https://github.com/sorced-jim/SPDY-for-iPhone">SPDY on iOS</a> is also being developed.</p>
<h4>Netty</h4>
<p>Netty has provided a <a target="_self" href="https://netty.io/Blog/Netty+331+released+-+SPDY+Protocol+%21">SPDY package</a> since version 3.3.1, released in 2012.</p>
<h4>Tomcat</h4>
<p>SPDY support in&nbsp;Tomcat is currently under development and it should come in Tomcat version 8.</p>
<h3>Services Using SPDY</h3>
<p>As mentioned earlier, Google has already converted almost all of its services, including search, Gmail and Google plus, into HTTPS, and provides them through SPDY. In addition, when Google App Engine uses&nbsp;HTTPS, it also supports SPDY. Twitter also uses SPDY when providing service via&nbsp;HTTPS.</p>
<p>However, among numerous Web sites on the Internet, only a few websites use SPDY. According to a <a target="_self" href="http://news.netcraft.com/archives/2012/05/02/may-2012-web-server-survey.html">survey by Netcraft</a> conducted in May 2012, of a total of 660 million websites only 339 websites are currently using SPDY. In other words, except for Google and Twitter, there is hardly any major website using SPDY.</p>
<h2>When SPDY is Not Very Efficient</h2>
<p>SPDY is not always fast. Sometimes you may not get any performance improvement with SPDY. Such situations are as follows.</p>
<h3>When using only HTTP</h3>
<p>As SPDY always requires SSL, it needs additional time for the SSL handshake. Therefore, when you convert an HTTP site to HTTPS just to support SPDY, the handshake cost may cancel out any noticeable performance improvement.</p>
<h3>When there are too many domains</h3>
<p><b>SPDY operates by domain.</b> This means that it requires as many connections as there are domains, and that request multiplexing is available only within a single domain. Moreover, as it is difficult to make all domains support SPDY, you may not get the merits of SPDY when there are too many domains. In particular, when a CDN does not support SPDY, you cannot expect a performance improvement from SPDY.</p>
<h3>When HTTP is not the bottleneck</h3>
<p>For most pages, HTTP is not the bottleneck. For example, in cases when a resource can be downloaded only after another resource is downloaded, SPDY will not be that effective.</p>
<h3>When Round-Trip-Time (RTT) is low</h3>
<p>SPDY is more efficient when RTT is high. When RTT is very low, for example in communications between servers within an Internet data center (IDC), SPDY has few merits.</p>
<h3>When a page has very few resources</h3>
<p>For pages with six or fewer resources, SPDY has few merits because the value of reusing connection is not significant.</p>
<h2>Things to Do to Introduce SPDY&nbsp;</h2>
<p>When you introduce SPDY, you need to carry out the following tasks to apply it most efficiently.</p>
<h3>Application Level</h3>
<h4>Use only one connection</h4>
<p>For better SPDY performance and more efficient use of Internet resources, you need to use as few connections as possible. If you use a small number of connections when using SPDY, you can see the benefits such as putting data into packets in a better way, getting better header compression efficiency, reducing the frequency of checking the status of connection and reducing the frequency of handshake. Also, in terms of Internet resources, with a small number of connections, you can also have a more efficient TCP and reduce Bufferbloat.</p>
<blockquote class="q4">
<p><b>Bufferbloat</b> is a phenomenon in which excess buffering of packets at a router or a switch causes high latency and reduced throughput. With the idea of avoiding packet discards as much as possible and low memory prices, the buffer size of a router or a switch is continuously increasing. The packets that should have been discarded earlier could survive longer and as a result these packets hinder TCP congestion avoidance algorithm, deteriorating the overall network performance.</p>
</blockquote>
<h4>Avoid domain sharding</h4>
<p>Domain sharding is an expedient used to work around the limit on the number of concurrent downloads (<i><a target="_self" href="http://stackoverflow.com/a/985704/556678">in general</a>, six downloads per hostname in modern browsers</i>) in web applications. If you use SPDY and follow the "use a single connection" recommendation, you don't need domain sharding. Worse, domain sharding causes additional DNS queries and makes applications more complex.</p>
<h4>Use server push instead of inlining</h4>
<p>Inlining stylesheets or scripts is often used in web applications to reduce the number of HTTP requests, and thus the number of round trips. However, inlining makes web pages less cacheable and can increase their size when resources are inlined with base64 encoding. With the SPDY server push feature, the server can send these resources without waiting for a request, so you can avoid these problems.</p>
<h4>Use request prioritization</h4>
<p>You can enable the client to inform the server of the relative priority of resources by using the request prioritization feature of SPDY. The simple common heuristic prioritization could be <code>html &gt; js, css &gt; *</code>.</p>
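<p>The <code>html &gt; js, css &gt; *</code> heuristic can be sketched as a small lookup. This is only an illustration: the numeric levels, the function names and the rule that extension-less paths count as HTML are assumptions made for this sketch, not the SPDY wire encoding or any browser's actual policy.</p>

```python
import os
from urllib.parse import urlparse

# Hypothetical priority levels for this sketch: a lower number is more urgent.
PRIORITY_BY_EXTENSION = {
    ".html": 0,
    ".htm": 0,
    ".js": 1,
    ".css": 1,
}
DEFAULT_PRIORITY = 2  # everything else ("*")


def request_priority(url):
    """Return a priority for a URL following the html > js, css > * heuristic."""
    path = urlparse(url).path
    extension = os.path.splitext(path)[1].lower()
    if extension == "":
        # Assumption: extension-less paths such as "/" are HTML documents.
        return 0
    return PRIORITY_BY_EXTENSION.get(extension, DEFAULT_PRIORITY)


def order_requests(urls):
    """Sort URLs so that more urgent resources are requested first."""
    return sorted(urls, key=request_priority)
```

<p>Because the sort is stable, resources with the same priority keep the order in which the page referenced them.</p>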
<h4>Choose the proper size of a SPDY frame</h4>
<p>Although the SPDY spec allows large frames, a small frame is sometimes more desirable because it allows interleaving to work better.</p>
<h3>SSL Level</h3>
<h4>Use a smaller, full certificate chain</h4>
<p>The size of a certificate chain has a large effect on how quickly a connection is initialized. The more certificates a chain contains, the longer it takes to verify their validity, and the more of the initial congestion window (<b>initcwnd</b>) they occupy.</p>
<blockquote class="q4">
<p><b>initcwnd:</b> the initial size of the TCP congestion window, used by the TCP congestion control algorithm. The congestion window is a sender-side limit on the amount of data in flight, adjusted by the congestion control algorithm. It is different from the TCP window, which is a receiver-side limit.</p>
</blockquote>
<p>In addition, if a server does not provide a full certificate chain, the client will spend additional round trips fetching the intermediate certificates. Therefore, if a large, incomplete certificate chain is used, it will take longer before an application can use the connection.</p>
<h4>Use a wildcard certificate (e.g., *.naver.com) if possible</h4>
<p>If you use wildcard certificates, you can reduce the number of connections and use the connection sharing of SPDY. However, as a wildcard certificate is issued by a certificate authority, you may need to consult the authority and pay extra costs.</p>
<h4>Do not set the size of SSL write buffer too large</h4>
<p>If the SSL write buffer is too large, a TLS application record will span multiple packets. As an application can process a TLS application record only after the entire record has arrived, a record that spans multiple packets causes additional latency. Google servers use 2 KB buffers.</p>
<h3>TCP Level</h3>
<h4>Set the initcwnd of the server to at least 10</h4>
<p><b>Initcwnd</b> is the main bottleneck that affects the initial loading time of a page. If you use only HTTP, you can work around this problem by opening multiple connections concurrently, which gives an effective initial congestion window of <code>n &times; initcwnd</code>. As a single connection is advantageous in SPDY, however, it is better to set <b>initcwnd</b> to a large value from the beginning. In old Linux kernels this value is fixed to 2-3, and no method to adjust it is provided. As this value was chosen for the reliability and bandwidth of TCP networks at the time, it is not suitable for today's networks with higher stability and bandwidth. The method to adjust this value was added in Linux Kernel 3.0, and the latest Linux kernels already use a default value of 10 or higher.</p>
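<p>To get a feel for why the initial window matters, the following sketch counts round trips under an idealized Slow Start. It assumes no packet loss, no receiver-window limit, a congestion window that doubles every round trip, and a 1460-byte segment size. These are simplifying assumptions for illustration, not measurements, and the function name is made up for this example.</p>

```python
import math

MSS_BYTES = 1460  # assumed maximum segment size for a typical Ethernet path


def round_trips_to_send(payload_bytes, initcwnd, connections=1):
    """Count the round trips idealized slow start needs to deliver a payload.

    With several connections the payload is assumed to be split evenly,
    so the effective starting window is connections x initcwnd.
    """
    segments_left = math.ceil(payload_bytes / MSS_BYTES)
    window = initcwnd * connections
    trips = 0
    while segments_left > 0:
        segments_left -= window   # one window of segments per round trip
        window *= 2               # slow start doubles the window
        trips += 1
    return trips
```

<p>For a 60 KB response (61,440 bytes, 43 segments), this model gives 4 round trips on one connection with initcwnd 3, 3 round trips on one connection with initcwnd 10, and 2 round trips when six connections with initcwnd 3 share the load. That is the gap a single SPDY connection has to close with a larger initcwnd.</p>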
<h4>Disable tcp_slow_start_after_idle</h4>
<p>On Linux, <code>tcp_slow_start_after_idle</code> is set to <b>1</b> by default. This causes the congestion window to return to the size of <b>initcwnd</b> when the connection goes idle, and makes TCP Slow Start restart.</p>
<blockquote class="q4">
<p><b>TCP Slow Start</b> is an algorithm that starts by sending as many packets as a congestion window of the initcwnd size allows, and then increases the TCP congestion window up to the maximum value allowed by the network or up to the TCP window of the receiver side. If the initcwnd value is small, it takes more round-trips until the window reaches the maximum size allowed by the network, and as a result, the initial page loading time will increase.</p>
</blockquote>
<p>As restarting Slow Start after every idle period eliminates the advantage of a single SPDY connection, this setting should be disabled. You can change the setting by using the <b>sysctl</b> command. It is also advantageous to disable this setting when using HTTP keepalive.</p>
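<p>Before changing the setting it can help to check its current value. The sketch below reads the procfs file that backs the <code>net.ipv4.tcp_slow_start_after_idle</code> sysctl. The helper names are made up for this example, and the function returns <code>None</code> on systems where the file does not exist.</p>

```python
from pathlib import Path

# procfs location of the sysctl net.ipv4.tcp_slow_start_after_idle on Linux.
SETTING_PATH = Path("/proc/sys/net/ipv4/tcp_slow_start_after_idle")


def is_enabled(raw_value):
    """Interpret the text stored in the procfs file ("1" means enabled)."""
    return raw_value.strip() == "1"


def slow_start_after_idle_enabled(path=SETTING_PATH):
    """Return True or False for the current setting, or None if unreadable."""
    try:
        return is_enabled(path.read_text())
    except OSError:
        # Not Linux, or the file is not accessible.
        return None
```

<p>If this returns <code>True</code>, the setting can be turned off as root with <code>sysctl -w net.ipv4.tcp_slow_start_after_idle=0</code>.</p>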
<h2>If SPDY is Really Introduced &hellip;</h2>
<p>In conclusion, to actually introduce SPDY, you need to consider a variety of matters and modify applications and servers. You cannot ignore the costs of introducing the protocol, either. For a Web application written for Tomcat, which does not yet support SPDY, you should consider the cost required to change the Web application server, as well as the cost required to implement the server push functionality.</p>
<p>Apart from the costs required to introduce SPDY, <b>what should we take into account first when we introduce SPDY in a real service?</b> I chose three matters to consider.</p>
<h3>The service should already use HTTPS</h3>
<p>You cannot get many advantages when introducing SPDY for services that use only HTTP, because you would have to pay the costs of introducing SSL as well.</p>
<h3>You should be able to change the Linux kernel</h3>
<p>Even CentOS 6.3, the latest version released on July 9, 2012, still uses the kernel 2.6.32. Adjustment of <b>initcwnd</b> is supported only from the kernel 3.0, and you need to change the kernel if possible because the performance improvement you can get by adjusting <b>initcwnd</b> is very significant.</p>
<h3>Consider the ratio of users of SPDY-supported browsers</h3>
<p>In Korea, many users still use IE, which does not support SPDY. On mobile devices, iOS does not yet support SPDY, while Android 3.0 or higher supports SPDY. Therefore, until there are sufficient users of SPDY-supported browsers, you should carefully compare the advantages you can get from the performance improvement derived from SPDY with the costs required to introduce the protocol.</p>
<p>By Sehoon Park, Senior Engineer at Web Platform Development Lab, NHN Corporation.</p>]]></description>
                        <pubDate>Mon, 26 Nov 2012 09:22:07 +0900</pubDate>
                        <category>SPDY</category>
                        <category>HTTP</category>
                        <category>HTTPS</category>
                        <category>Protocol</category>
                        <category>Google</category>
                        <category>Facebook</category>
                        <category>performance</category>
                        <category>Web development</category>
                        <category>Server push</category>
                                </item>
        										        <item>
            <title>Tell about your open source project on CUBRID Blog. We&apos;ll pay for Facebook Ads.</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-life/tell-about-your-open-source-project-on-cubrid-blog/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-life/tell-about-your-open-source-project-on-cubrid-blog/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-life/tell-about-your-open-source-project-on-cubrid-blog/#comment</comments>
                                    <description><![CDATA[<p style="text-align: center;"><img height="269" width="727" alt="promote_your_open_source_project.png" src="/files/attach/images/220547/127/665/promote_your_open_source_project.png" /></p>
<p>At CUBRID we know how important, and at the same time how difficult, it is to find the right audience who would listen to what your open source project has to offer. By their nature, most open source projects, especially those driven by a community of like-minded enthusiasts, do not have the budget to reach out to target users. If you are a member of such a community and would like to tell the world about it, read on. We have good news for you!</p>
<p>Last year we announced the&nbsp;<a href="/affiliates">CUBRID Affiliates Program</a>&nbsp;through which we have already&nbsp;<a href="/blog/cubrid-life/first-batch-of-donations-to-opensource-projects-from-cubrid-affiliates-program/">donated</a>&nbsp;3000 USD to 14 open source projects. Some of these projects provide very useful features to <a href="/wiki_tutorials/entry/important-facts-to-know-about-cubrid">CUBRID Database</a> users.&nbsp;This time&nbsp;we want to periodically introduce to our CUBRID community members one or two handpicked open source projects that we think have potential and are worth talking about. Starting from today we will accept guest posts to our CUBRID Blog from open source communities where they can introduce their software to our readers. We will handle the promotion.</p>
<h2>Step 1: Pitch it!</h2>
<p>We are looking for great software that will make our and our readers' eyes sparkle. Before your post gets published, we would like to read an overview of your project. <a href="mailto:contact@cubrid.org">Send us a brief email</a> introducing your project: site links, introductory videos, presentations if available, what it does, what benefits it can provide to our readers. We will get back to you in no time.</p>
<h2>Step 2: Write it!</h2>
<p>If we accept your pitch, we will ask you to write a full post that covers your software. You can include images and even video. In your introduction you need to tell the readers what problem your software solves, where and how a reader can get started, and, very importantly, how it differs from other existing solutions (I am sure there is at least one product you can compare with). Together with your article, send us your bio and Gravatar-enabled email address.</p>
<h2>Step 3: Get published!</h2>
<p>We will create an account for you on our CUBRID community site and associate your email address with it. Then we will publish your post under your name so that you can receive notifications for comments as the post author. Once published, we will share your story on CUBRID <a href="https://www.facebook.com/cubrid">Facebook</a>, <a href="http://www.twitter.com/cubrid">Twitter</a>, <a href="https://plus.google.com/b/115924035309696499466/115924035309696499466">Google+</a>&nbsp;pages as well as other networking sites like the popular DZone. On top of that we will place a&nbsp;223x170 pixel banner on the top right of CUBRID Blog site, if you provide one, as well as&nbsp;pay at least $5 (<i>about 11K impressions</i>) to promote your post on Facebook timeline.</p>
<p>Like it? Then go on and <a href="mailto:contact@cubrid.org">contact us</a>!</p>]]></description>
                        <pubDate>Tue, 28 May 2013 21:20:12 +0900</pubDate>
                        <category>event</category>
                        <category>CUBRID Affiliates</category>
                                </item>
        										        <item>
            <title>Understanding Vert.x Architecture - Part II</title>
            <dc:creator>Jaehong Kim</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/understanding-vertx-architecture-part-2/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/understanding-vertx-architecture-part-2/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/understanding-vertx-architecture-part-2/#comment</comments>
                                    <description><![CDATA[<p>&nbsp;</p>
<p>Last month a colleague of mine <a href="/blog/dev-platform/inside-vertx-comparison-with-nodejs/" title="Inside Vert.x. Comparison with Node.js.">covered Vert.x</a>, a relatively new Java application framework which provides a noticeable performance advantage over competing technologies and supports multiple programming languages. The previous article explained the philosophy of Vert.x, its performance compared with Node.js, its internal structure, and more. Today, I would like to continue this conversation and talk more about Vert.x architecture.</p>
<h2>Considerations Used to Develop Vert.x</h2>
<p><b>Polyglot</b> is the feature that makes Vert.x stand out from other server frameworks. In the past, server frameworks could not support multiple languages. Supporting several languages does more than expand the range of users. More importantly, services written in different languages can easily communicate with each other in a distributed environment. Of course, supporting a variety of languages is not sufficient for supporting a distributed environment. More essential functions for a distributed environment include an address system and a message bus, and the Vert.x framework provides both. As Vert.x provides these functions as well as Polyglot, its benefits should be considered for a distributed environment.</p>
<p>As Vert.x aims to be a universal server framework, a variety of workloads should be considered, including unusual cases that differ from those of <a href="/blog/tags/Nginx">Nginx</a>, which is typically used as a Web server, or Node.js. The goal is to build a universal server application that processes a variety of protocols other than HTTP (i.e., not a Web server which executes simple operations, considering scalability in a 3-tier environment). In order to accomplish this, Vert.x provides an additional thread pool while using the <b>Run Loop</b> method.</p>
<p>We will discuss Vert.x architecture starting from the thread pool and a consideration for a distributed environment.</p>
<h3>Run Loop and Thread Pool</h3>
<p>Vert.x and other asynchronous server applications (or frameworks), including Nginx and Node.js, use the Run Loop method. Vert.x uses the term 'Event Loop' instead of 'Run Loop'. However, as Run Loop is the more popular term among some developers, I use that term here. Run Loop, as you will guess from the name, is a method of checking for new events in an infinite loop and calling an event handler when an event has been received. As such, 'asynchronous server application' and 'event-based server application' are different terms for the same thing, similar to 'enharmonic' notes in music. To use the Run Loop method, all I/Os should be managed as events.</p>
<p>For example, imagine a general Web server application that creates a query for a database to respond to an HTTP request from a Web browser. The CPU of the Web server is used when a thread analyzes the HTTP request, executes the proper business logic, and creates a query statement. However, the CPU is not used while the thread sends the query to the database and waits for a response. If one thread is created per HTTP request (Thread per Connection), another thread can use the Web server CPU while one thread is waiting for a response from the database, so the CPU is kept busy processing HTTP requests. As you know, the weakness of Thread per Connection is the cost of context switching at the kernel level, since many threads must be created. This cost is waste.</p>
<p>The asynchronous event handling method can overcome this weakness (figuratively speaking, '<i>asynchronous event handling</i>' is the '<b>purpose</b>' and &lsquo;<i>Run Loop</i>&rsquo; is the '<b>means</b>'). If &lsquo;HTTP request itself&rsquo; and &lsquo;receiving a response from the database&rsquo; are created as an event, and the Run Loop calls the corresponding event handler whenever an event is received, the execution performance of the application can be enhanced by avoiding unnecessary context switching. In this fashion, to utilize a CPU efficiently, the number of Run Loops required equals the number of cores (i.e., thread should be created equaling the number of cores and each thread should run the Run Loop).</p>
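<p>The idea of treating both the HTTP request and the database response as events can be sketched with a toy Run Loop. This is a minimal single-threaded illustration, not Vert.x code: the <code>RunLoop</code> class, the event names and the simulated database reply are all assumptions made for the example.</p>

```python
from collections import deque


class RunLoop:
    """A toy single-threaded run loop: take an event, call its handler."""

    def __init__(self):
        self._handlers = {}      # event name -> handler function
        self._events = deque()   # pending (name, payload) pairs

    def on(self, name, handler):
        self._handlers[name] = handler

    def emit(self, name, payload):
        self._events.append((name, payload))

    def run(self):
        # A real loop would block waiting for I/O; this one stops when idle.
        while self._events:
            name, payload = self._events.popleft()
            handler = self._handlers.get(name)
            if handler is not None:
                handler(payload)


def demo():
    loop = RunLoop()
    log = []

    def on_http_request(path):
        log.append("request " + path)
        # Do not wait for the database: its reply arrives as a later event.
        loop.emit("db_response", "rows for " + path)

    def on_db_response(rows):
        log.append("response " + rows)

    loop.on("http_request", on_http_request)
    loop.on("db_response", on_db_response)
    loop.emit("http_request", "/a")
    loop.emit("http_request", "/b")
    loop.run()
    return log
```

<p>Calling <code>demo()</code> shows that the second request is accepted before the first database reply is handled, all on one thread and without any context switching.</p>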
<p>However, creating only as many threads as there are cores, in order to prevent as much context switching as possible, causes another problem. If a handler that uses server resources takes a long time to handle an event, other events received while the handler is running are not handled in a timely manner. A popular example is searching for files on the server disk. In this case, it is better to create a separate thread for searching files.</p>
<p>Therefore, to build a universal server framework with asynchronous event handling, the framework should have a function for managing a thread pool. This is the aim of Vert.x. <b>Thread pool management</b> is the biggest difference between Vert.x and Node.js, except for polyglot. <b>Vert.x creates Run Loops (Event Loops) equaling the number of cores <i>and</i> provides thread pool-related function to handle tasks using server resources requiring long periods for event handling.</b></p>
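<p>The combination of a loop plus a background pool can be sketched as follows. This is an illustration of the general pattern using Python's standard library, not the Vert.x implementation: the function names, the event names and the pool size are assumptions. The loop thread only dispatches, and the slow work runs on pool threads that report back by posting a completion event.</p>

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def slow_file_search(keyword):
    # Stand-in for a blocking task such as searching files on disk.
    return "results for " + keyword


def handle_searches(keywords, pool_size=4):
    events = queue.Queue()   # thread-safe queue read by the loop thread
    results = []
    pending = len(keywords)

    with ThreadPoolExecutor(max_workers=pool_size) as background:
        for keyword in keywords:
            events.put(("search_requested", keyword))

        while pending:
            name, payload = events.get()   # the loop thread only dispatches
            if name == "search_requested":
                future = background.submit(slow_file_search, payload)
                # When a worker finishes, hand the result back as an event.
                future.add_done_callback(
                    lambda done: events.put(("search_done", done.result()))
                )
            elif name == "search_done":
                results.append(payload)
                pending -= 1
    return sorted(results)
```

<p>The results are sorted only to make the output deterministic, because background tasks may finish in any order.</p>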
<h3>Why is Hazelcast Used?</h3>
<p>Vert.x uses <a href="http://www.hazelcast.com/">Hazelcast</a>, an In-Memory Data Grid (<a href="/blog/tags/IMDG/">IMDG</a>). Hazelcast API is not directly revealed to users but is used in Vert.x. When Vert.x is started, Hazelcast is started as an embedded element.</p>
<p>Hazelcast is a type of distributed storage. When storage is embedded and used in a server framework, we can obtain expected effects from a distributed environment.</p>
<p>The most popular case is session data processing. Vert.x calls it Shared Data. It allows multiple Vert.x instances to share the same data. Of course, an additional RDBMS used instead of Hazelcast would be functionally equivalent, but embedded in-memory storage naturally returns results faster than a remote RDBMS. Therefore, users who need sessions for e-commerce or chat servers can build a system with a simple configuration by using only Vert.x.</p>
<p>Hazelcast also makes a message queue available without additional costs or investments (without server costs or monitoring of message queue instances). As mentioned before, Hazelcast is a distributed storage, and it can replicate its data for reliability. By using this distributed storage as a queue, a server application implemented with Vert.x becomes both a message processing server application and a distributed queue.</p>
<p>These benefits make Vert.x a strong framework in a distributed environment.</p>
<h2>Understanding Vert.x Components</h2>
<p style="text-align: center;"><img height="529" width="700" alt="vertx-architecture-diagram.png" src="/files/attach/images/220547/795/664/vertx-architecture-diagram.png" /></p>
<p style="text-align: center;"><b>Figure 1: Vert.x Architecture (Component) Diagram.</b></p>
<p><b>Figure 1</b> above shows a diagram of Vert.x components. As shown in the figure, in all Vert.x instances (these can be understood as a JVM), a Hazelcast is embedded and runs. The embedded Hazelcast is connected to Hazelcast in other Vert.x instances. Event Bus uses functions of Hazelcast. Hazelcast itself provides a certain level of reliability (because of WAL records and data duplication). So, events can be forwarded with a certain level of reliability.</p>
<h3>HTTP Server and Net Server</h3>
<p>HTTP Server and Net Server control network events and event handlers. A Net Server handles events and handlers for a private protocol, and an HTTP Server allows registering a handler for an HTTP event such as GET or POST. An HTTP Server is provided separately because HTTP is universal and because this eliminates the need to add event types. HTTP Server supports WebSocket as well as HTTP.</p>
<p style="text-align: center;"><img height="373" width="615" alt="vertx-event-and-handler-of-http-server.png" src="/files/attach/images/220547/795/664/vertx-event-and-handler-of-http-server.png" /></p>
<p style="text-align: center;"><b>Figure 2: Event and Handler of HTTP Server.</b></p>
<h3>Vert.x Thread Pool</h3>
<p>Vert.x has three types of thread pools:</p>
<ol>
<li><b>Acceptor</b>: A thread to accept a socket. One thread is created for one port.</li>
<li><b>Event Loops</b>: (the same as Run Loops) The number of Event Loops equals the number of cores. When an event occurs, an Event Loop executes the corresponding handler. When the handler finishes, the loop goes back to reading the next event.</li>
<li><b>Background</b>: Used when Event Loop executes a handler and an additional thread is required. Users can specify the number of threads in vertx.backgroundPoolSize, an environmental variable. The default is 20. Using too many threads causes an increase in context switching costs, so be cautious.</li>
</ol>
<p>In more detail, Event Loops use the <b>Netty NioWorker</b> as it is. All handlers specified by <a href="/blog/dev-platform/inside-vertx-comparison-with-nodejs/">verticles</a> run on Event Loops. Each verticle instance is assigned a specific NioWorker, which guarantees that a verticle instance is always executed on the same thread. Therefore, verticles can be written as single-threaded code and still be thread-safe.</p>
<h2>Conclusion</h2>
<p>So far, I have briefly described Vert.x architecture. Since Vert.x framework is not widely used, I believe it would be better to detail the concept of designing Vert.x than detail each Vert.x component. Even if you have no interest in network server frameworks, it is helpful to review new products and determine differences between new and existing products. Doing so helps in understanding the evolution and direction of software products that are flooding today's market.</p>
<div>By <a href="/?mid=community&amp;act=dispMemberInfo&amp;member_srl=664924&amp;tab=blogs">Jaehong Kim</a>, Senior Engineer at Web Platform Development Lab, NHN Corporation.</div>]]></description>
                        <pubDate>Mon, 27 May 2013 16:47:16 +0900</pubDate>
                        <category>Vert.x</category>
                        <category>Node.js</category>
                        <category>architecture</category>
                        <category>comparison</category>
                        <category>Java</category>
                        <category>IMDG</category>
                                </item>
        										        <item>
            <title>Inside Vert.x. Comparison with Node.js.</title>
            <dc:creator>Woo Seongmin</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/inside-vertx-comparison-with-nodejs/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/inside-vertx-comparison-with-nodejs/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/inside-vertx-comparison-with-nodejs/#comment</comments>
                                    <description><![CDATA[<p><a href="http://vertx.io">Vert.x</a> is a server framework which is rapidly arising. Each server framework claims its strong points are high performance with a variety of protocols supported. Vert.x takes a step forward from that. Vert.x considers the environment of establishing and operating the server network environment. In other words, Vert.x includes careful consideration in producing several 'server process DAEMONs' that run on the clustering environment, as well as producing one server process DAEMON.</p>
<p>Therefore, it is important to review Vert.x: which network environment it considers as well as how it delivers high performance. So, I think it will be valuable to pay sufficient time examining Vert.x structure.</p>
<h2>Philosophy of Vert.x</h2>
<p>Vert.x is a project influenced by <a href="http://nodejs.org">Node.js</a>. Like Node.js, Vert.x is a server framework providing an event-based programming model. Therefore, the Vert.x API is very similar to that of Node.js, because both provide an asynchronous API.</p>
<p>Node.js is based on JavaScript, whereas Vert.x is built with <b>Java</b>. However, it would be a mistake to regard Vert.x as merely a Java version of Node.js. Vert.x has been influenced by Node.js, but it has its own unique philosophy, different from that of Node.js.</p>
<p>The most typical design philosophy of Vert.x is summarized as follows:</p>
<ul>
<li><strong>Polyglot - supports several languages</strong><br />Vert.x itself is built in Java. However, Java is not required to use Vert.x. <br />As well as JVM-based languages such as Java or Groovy, Vert.x can be used with Ruby, Python, and even JavaScript. If you need to build a server application by using JavaScript, there is an alternative to Node.js. In addition, support for Scala and Clojure is planned.</li>
<li><strong>Super Simple Concurrency model</strong><br />When building a server application by using Vert.x, users can write code as a single-threaded application. That means that the effect of multi-threaded programming can be achieved without synchronization, locks, or volatile variables.<br />In Node.js, the JavaScript execution engine does not support multi-threading, so to utilize all CPU cores, several instances of the same JavaScript program have to be executed. <br />Vert.x, however, creates multiple threads based on the number of CPU cores while only one process is executed. It handles the multi-threading so users can focus on implementing business logic. </li>
<li><strong>Provides Event Bus<br /></strong> As described in the introduction, the goal of Vert.x is not only to produce a single server daemon process. Vert.x aims to make a variety of Vert.x-built server programs communicate well with each other. For this, Vert.x provides Event Bus, which offers MQ functions such as Point to Point or Pub/Sub (to provide the Event Bus function, Vert.x uses <a href="http://www.hazelcast.com">Hazelcast</a>, an In-Memory Data Grid).<br />With this Event Bus, server applications built with different languages can easily communicate with each other. </li>
<li><strong>Module System &amp; Public Module Repository</strong><br />Vert.x has a module system. This module system can be understood as a type of component.  That means the Vert.x-built server application project itself is modularized. It aims at reusability. This module can be registered to Public Module Repository. Through the Public Module Repository, the module can be shared.</li>
</ul>
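<p>The two messaging styles mentioned for the Event Bus in the list above, Point to Point and Pub/Sub, can be sketched with a toy in-process bus. This is not the Vert.x API or its Hazelcast-backed implementation. The class and method names are assumptions for illustration, and taking handlers in turn for point-to-point delivery is one simple policy chosen for the sketch.</p>

```python
from collections import defaultdict
from itertools import count


class EventBus:
    """In-process toy bus: handlers are registered at string addresses."""

    def __init__(self):
        self._handlers = defaultdict(list)   # address -> list of handlers
        self._turn = defaultdict(count)      # address -> round-robin counter

    def register_handler(self, address, handler):
        self._handlers[address].append(handler)

    def publish(self, address, message):
        """Pub/Sub: deliver the message to every handler at the address."""
        for handler in list(self._handlers[address]):
            handler(message)

    def send(self, address, message):
        """Point to Point: deliver to exactly one handler, taken in turn."""
        handlers = self._handlers[address]
        if not handlers:
            return False
        index = next(self._turn[address]) % len(handlers)
        handlers[index](message)
        return True
```

<p>With two handlers registered at the same address, <code>publish</code> reaches both, while consecutive <code>send</code> calls alternate between them.</p>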
<h3>What is the relationship and difference between Netty and Vert.x?</h3>
<p>Before discussing the Vert.x performance, we should summarize the relationship between Netty and Vert.x. Vert.x uses Netty. In other words, it processes all I/O by using Netty. Therefore, it is meaningless to measure differences in performance between Vert.x and Netty.</p>
<p>Vert.x is a server framework that provides API and functions different and independent from Netty, designed with different purpose from Netty.</p>
<p>Netty is a framework that can process the low-level IO and Vert.x can process the higher-level IO than Netty.</p>
<h2>Comparison of Performance with Node.js</h2>
<p>Even if functions provided by Vert.x are different from those of Node.js, comparing the performance between them is a significant matter.  <b>Figure 1</b> and <b>Figure 2</b>&nbsp;below show the performance of Vert.x (Java, Ruby, Groovy) and Node.js. (Source: <a href="http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-js-simple-http-benchmarks/">http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-js-simple-http-benchmarks/</a>).</p>
<p><b>Figure 1</b> shows a comparison of performance when an HTTP server is built and only a 200/OK response has been returned. <b>Figure 2</b> shows the comparison of performance when a 72-byte static HTML file is returned as a response.</p>
<p style="text-align: center;"><img height="273" width="572" alt="vertx_nodejs_performance_comparison_200_ok.png" src="/files/attach/images/220547/772/629/vertx_nodejs_performance_comparison_200_ok.png" /></p>
<p style="text-align: center;"><strong>Figure 1: Comparison of Performance When Only 200/OK Response Has Been Returned.</strong></p>
<p style="text-align: center;"><img height="336" width="579" alt="vertx_nodejs_performance_comparison_72_byte_response.png" src="/files/attach/images/220547/772/629/vertx_nodejs_performance_comparison_72_byte_response.png" /></p>
<p style="text-align: center;"><strong>Figure 2: Comparison of Performance When a 72-byte Static HTML File is Returned.</strong></p>
<p>These results were published by the Vert.x developers, and the test was not conducted in a strictly controlled environment. Just look at the relative differences in performance.</p>
<p>A notable point is that Vert.x running JavaScript performs better than Node.js. However, even if the performance results are reliable, it is difficult to say that Vert.x is better than Node.js, because Node.js provides great modules such as Socket.io and has many references.</p>
<h2>Vert.x Terminology</h2>
<p>Vert.x defines its own terms and redefines some general terms for its own use. Therefore, to understand Vert.x, it is necessary to understand the Vert.x-defined terms. The following are popular terms used in Vert.x:</p>
<h3>Verticle</h3>
<p>In Java terms, a Verticle is a class with a main method. A Verticle can also include other scripts referred to by the main method, as well as jar files or resources. An application may consist of one Verticle or several Verticles, which communicate through Event Bus. From a Java point of view, it can be understood as an independently executable class or jar file.</p>
<h3>Vert.x Instance</h3>
<p>A Verticle is executed within a Vert.x instance and the Vert.x instance is executed in its JVM instance. So there will be a lot of Verticles which are simultaneously executed in a single Vert.x instance. Each Verticle can have its own unique class loader. In this manner, direct interactions between Verticles, made through static members and global variables, can be prevented. A lot of Verticles can be simultaneously executed in several hosts on the network and the Vert.x instances can be clustered through Event Bus.</p>
<h3>Concurrency</h3>
<p>A Verticle instance is guaranteed to always be executed on the same thread. As all code can be written as if it were a single-threaded program, developing with Vert.x is easy. In addition, race conditions and deadlocks can be prevented.</p>
<h3>Event-based Programming Model</h3>
<p>Like the Node.js framework, Vert.x provides an event-based programming model. When programming a server by using Vert.x, most of the code you write consists of event handlers. For example, to receive data from a TCP socket, you create an event handler that is called when data is received. In addition, event handlers should be created to receive notifications 'when Event Bus receives a message,' 'when HTTP messages are received,' 'when a connection has been disconnected,' and 'when a timer expires.'</p>
<h3>Event Loops</h3>
<p>A Vert.x instance internally manages a thread pool. Vert.x matches the number of threads in the pool to the number of CPU cores as closely as possible.</p>
<p>Each thread executes an Event Loop. The Event Loop checks for events as it goes around the loop, for example, whether there is data to read on a socket or whether a timer has fired. If there is an event to process, Vert.x calls the corresponding handler (of course, additional work is necessary if the handler takes too long or performs blocking I/O).</p>
<h3>Message Passing</h3>
<p>Verticles use the Event Bus for communication. If a Verticle is regarded as an actor, message passing resembles the actor model made popular by the Erlang programming language. A Vert.x server instance hosts many Verticle instances and allows messages to be passed among them. Therefore, the system can scale across the available cores without the Verticle code itself having to be multi-threaded.</p>
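<p>As a rough illustration of message passing, the sketch below uses a toy in-process bus. It is plain JavaScript and not the Vert.x Event Bus. <code>createBus</code>, <code>register</code> and <code>publish</code> are hypothetical names chosen for this example. The point is that each receiver gets its own copy of the message, so the sender and the receivers never share a mutable object.</p>

```javascript
// Illustrative only: a toy in-process bus, not the Vert.x Event Bus.
function createBus() {
  const handlersByAddress = new Map();   // address -> list of handlers
  return {
    // Subscribe a handler to an address.
    register(address, handler) {
      if (!handlersByAddress.has(address)) handlersByAddress.set(address, []);
      handlersByAddress.get(address).push(handler);
    },
    // Deliver a copy of the message to every handler on the address and
    // return the number of handlers that received it.
    publish(address, message) {
      const handlers = handlersByAddress.get(address) || [];
      handlers.forEach((handler) => handler(JSON.parse(JSON.stringify(message))));
      return handlers.length;
    }
  };
}

const bus = createBus();
const received = [];
bus.register('news.sport', (msg) => { msg.seen = true; received.push(msg); });

const original = { headline: 'Vert.x released' };
const delivered = bus.publish('news.sport', original);
console.log(delivered, received[0].headline, original.seen);
```

<p>The receiver marks its copy as seen, while the sender's object is left untouched. Copying through JSON is only a convenient shortcut for this sketch and works for plain data only.</p>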
<h3>Shared data</h3>
<p>Message passing is very useful. However, it is not the best approach for every kind of application concurrency. A cache is one of the most common examples. If only one Verticle holds a certain cache, it is very inefficient: when other Verticles need the cached data, each of them has to maintain its own copy of the same data.</p>
<p>Therefore, Vert.x provides a method for global access: the <b>Shared Map</b>. Only immutable data can be shared between Verticles in this way.</p>
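<p>Why sharing is restricted to immutable data can be seen in a minimal sketch. It is plain JavaScript and not the Vert.x Shared Map. <code>createSharedMap</code>, <code>put</code> and <code>get</code> are hypothetical names used only here. The map stores a frozen copy of each object, so no reader can change what the others see.</p>

```javascript
// Illustrative only: a toy shared map that stores immutable values.
// createSharedMap, put and get are made-up names, not the Vert.x API.
function createSharedMap() {
  const entries = new Map();
  return {
    put(key, value) {
      const isObject = value !== null && typeof value === 'object';
      // Primitives are already immutable; objects are copied, then frozen
      // (a shallow copy and freeze, which is enough for this flat example).
      entries.set(key, isObject ? Object.freeze(Object.assign({}, value)) : value);
    },
    get(key) { return entries.get(key); }
  };
}

const cache = createSharedMap();
const settings = { ttlSeconds: 30 };
cache.put('settings', settings);
settings.ttlSeconds = 5;   // changing the caller's object afterwards...

const shared = cache.get('settings');
// ...does not change the shared copy, which is frozen.
console.log(shared.ttlSeconds, Object.isFrozen(shared));
```

<p>Every reader of <code>'settings'</code> sees the value as it was when it was stored, without any locking.</p>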
<h3>Vert.x Core</h3>
<p>As the name suggests, this is the core functionality of Vert.x. The core contains the functions that a Verticle can call directly. It can therefore be accessed through the API of each programming language that Vert.x supports.</p>
<h2>Vert.x Architecture</h2>
<p>The simple architecture of Vert.x is shown in the following <b>Figure 3</b>.</p>
<p style="text-align: center;"><img height="245" width="304" alt="vertx_architecture.png" src="/files/attach/images/220547/772/629/vertx_architecture.png" /></p>
<p style="text-align: center;"><strong>Figure 3: Vert.x Architecture (source: <a href="http://www.javacodegeeks.com/2012/07/osgi-case-study-modular-vertx.html">http://www.javacodegeeks.com/2012/07/osgi-case-study-modular-vertx.html</a>) </strong></p>
<p>The basic execution unit of Vert.x is the Verticle, and several Verticles can be executed simultaneously on one Vert.x instance. The Verticles are executed on event-loop threads. Several Vert.x instances can run on a single host or on several hosts across the network. In either case, the Verticles or modules communicate by using the Event Bus.</p>
<p>To sum up, a Vert.x application consists of a combination of Verticles or modules, which communicate with one another through the Event Bus.</p>
<h2>Vert.x Project Structure</h2>
<p>The following is the Vert.x project structure as seen in Eclipse after cloning the source code from the <a href="https://github.com/vert-x/vert.x">Vert.x GitHub page</a>.</p>
<p style="text-align: center;"><img src="http://helloworld.naver.com/./files/attach/images/61/784/163/./files/attach/images/61/784/163/49e1c3d8d3329e1bd3a1043166caf566.png" editor_component="image_link" /></p>
<p style="text-align: center;"><strong> Figure 4: Vert.x source tree.</strong></p>
<p>The overall configuration is as follows:</p>
<ul>
<li><b>vertx-core</b>&nbsp;is the core library.</li>
<li><b>vertx-platform</b>&nbsp;manages distribution and lifecycle.</li>
<li><b>vertx-lang</b>&nbsp;is used to expose the core Java API to other languages.</li>
</ul>
<p><a href="http://www.gradle.org">Gradle</a> is used as the project build system. It combines the advantages of Ant and Maven.</p>
<h2>Installing Vert.x and Executing Simple Examples</h2>
<p>To use Vert.x, JDK 7 is required because Vert.x uses <a href="http://oneminutedistraction.wordpress.com/2010/11/22/invokedynamic-bytecode-in-jdk-7/">invokedynamic, which was introduced in JDK 7</a>.</p>
<p><span style="font-size: 14px;">Vert.x can easily be installed.</span></p>
<ol>
<li><span>Download the compressed installation file from <a href="http://vertx.io/downloads.html">http://vertx.io/downloads.html</a>&nbsp;to a desired location.</span></li>
<li><span>Decompress the file and add the bin directory to the PATH environment variable.</span></li>
</ol>
<p><span>That is all there is to installing Vert.x. </span><span>In a command window, run <code>vertx version</code>. If the version information is printed, the installation is complete.</span></p>
<h3><span>Example 1</span></h3>
<p>Now, let's build and run a simple web server in JavaScript that&nbsp;<span style="font-size: 14px;">prints "Hello World!". Write the following code and save it in a file named <b>server.js</b>. It is almost identical to the equivalent Node.js code.</span></p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">load('vertx.js');<br /> <br /> vertx.createHttpServer().requestHandler(function(req) {<br /> req.response.end("Hello World!");<br /> }).listen(8080, 'localhost');</div>
<p>Execute the created <b>server.js</b> application by using the vertx command as follows:</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">$ vertx run server.js</div>
<p><span style="font-size: 14px;">Open a browser and connect to</span> <a href="http://localhost:8080">http://localhost:8080</a>. <span style="font-size: 14px;">If you can see the 'Hello World!'&nbsp;</span>message, you have succeeded.</p>
<h3>Example 2</h3>
<p>Let's look at another example, this time in other languages. The following code is written in Java. It shows a web server that reads a static file and returns it as an HTTP response.</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Java" editor_component="code_highlighter">
<pre class="brush:java" style="font-size: 13.63636302947998px;">Vertx vertx = Vertx.newVertx();
vertx.createHttpServer().requestHandler(new Handler&lt;HttpServerRequest&gt;() {
    public void handle(HttpServerRequest req) {
        // Serve index.html for the root path; otherwise serve the requested path.
        String file = req.path.equals("/") ? "index.html" : req.path;
        req.response.sendFile("webroot/" + file);
    }
}).listen(8080);</pre>
</div>
<p>The following code is written in Groovy and provides the same functionality:</p>
<div style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nogutter="false" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">def vertx = Vertx.newVertx()<br /> vertx.createHttpServer().requestHandler { req -&gt;<br /> def file = req.uri == "/" ? "index.html" : req.uri<br /> req.response.sendFile "webroot/$file"<br /> }.listen(8080)</div>
<h2>Future of Vert.x and NHN</h2>
<p>At <a href="/blog/tags/NHN">NHN</a>, we have been following Vert.x development since before its official release, and we think highly of it. We have been communicating with the main developer, Tim Fox, since June 2012 to discuss ways to improve Vert.x. One example is Socket.io on Vert.x. Socket.io was available only on Node.js, so we ported it to Java and sent a pull request (<a target="_blank" href="https://github.com/vert-x/vert.x/pull/320">https://github.com/vert-x/vert.x/pull/320</a>) to the Vert.x repository on GitHub. It has now been merged into the Vert.x-mod project.</p>
<p><span style="font-size: 14px;">Our socket.io Vert.x module will be used in RTCS 2.0 (Vert.x + Socket.io), which is under development at NHN.</span></p>
<p>Socket.io is one of the reasons Node.js has become so popular. If Vert.x can use Socket.io, Vert.x may gain many more use cases. Furthermore, if this socket.io Vert.x module is used as an embedded library, Socket.io becomes usable in Java-based applications as well.</p>
<h3>What is RTCS?</h3>
<p>RTCS (Real Time Communication System) is a Real Time Web Development Platform created by NHN. It helps to transfer messages between a Browser and a Server in real time. RTCS has been deployed for NHN Web services such as Baseball 9, Me2Day Chatting, BAND Chatting and so on.</p>
<h2>Wrap-up</h2>
<p>The first version of Vert.x was released in May 2012. Compared to Node.js, whose first version was released in 2009, Vert.x has a very short history, so it does not have many reference cases&nbsp;yet. However, Vert.x is supported by VMware and can run on CloudFoundry, so we expect that many reference cases will appear soon.</p>
<p>By&nbsp;<a href="/?mid=textyle&amp;act=dispMemberInfo&amp;member_srl=630183&amp;vid=blog&amp;tab=blogs">Seongmin Woo</a>, Software Engineer at Web Platform Development Lab, NHN Corporation.</p>
<h2>References</h2>
<ul>
<li>"Main Manual" <a href="http://vertx.io/manual.html">http://vertx.io/manual.html</a> </li>
<li>"Installation Guide" <a href="http://vertx.io/install.html">http://vertx.io/install.html</a> </li>
<li>"The C10K problem" <a href="http://www.kegel.com/c10k.html">http://www.kegel.com/c10k.html</a> </li>
<li>Gim Seongbak, Song Jihun, "Java I/O &amp; NIO Network Programming", Hanbit Media, 2004.</li>
</ul>]]></description>
                        <pubDate>Mon, 08 Apr 2013 20:30:00 +0900</pubDate>
                        <category>Vert.x</category>
                        <category>Node.js</category>
                        <category>performance</category>
                        <category>comparison</category>
                        <category>Java</category>
                        <category>JavaScript</category>
                        <category>Groovy</category>
                        <category>Gradle</category>
                        <category>IMDG</category>
                                </item>
        										        <item>
            <title>PostgreSQL at a glance</title>
            <dc:creator>Kim Sung Kyu</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/postgresql-at-a-glance/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/postgresql-at-a-glance/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/postgresql-at-a-glance/#comment</comments>
                                    <description><![CDATA[<p>PostgreSQL shows excellent functionalities and performance.&nbsp;Considering its high quality, it may seem strange that PostgreSQL is not more popular. However, PostgreSQL continues to make progress. This article will discuss this database.</p>
<h2>Why You Should Know about PostgreSQL</h2>
<p><a href="http://www.postgresql.org">PostgreSQL</a> is an RDBMS that is popular mainly in North America and Japan. It is not used much in Korea yet, but because it is an excellent RDBMS in terms of functionality and performance, it is worth learning what kind of database PostgreSQL is.</p>
<p>PostgreSQL (<i><a href="http://en.wikipedia.org/wiki/PostgreSQL#Product_name">pronounced</a> [Post-Gres-Q-L]</i>) is an open-source object-relational database system (ORDBMS) that provides enterprise-level DBMS functionalities as well as many others found only in advanced DBMSs. PostgreSQL is also known as the open-source DBMS that Oracle users can adapt to most easily, as many of its functionalities are similar to those of Oracle.</p>
<h2>History</h2>
<p>PostgreSQL has many ancestors, and of them, <a href="https://en.wikipedia.org/wiki/Ingres_(database)">Ingres</a>&nbsp;(<i>INteractive Graphics REtrieval System</i>) can be called its progenitor. Ingres was a project launched by <a href="https://en.wikipedia.org/wiki/Michael_Stonebraker">Michael Stonebraker</a>&nbsp;(<b>Picture 1</b>), a leading figure in the database field who is still active today.</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" height="213" width="320" alt="320px-Michael_Stonebraker_1.jpg" src="/files/attach/images/220547/497/656/320px-Michael_Stonebraker_1.jpg" /></p>
<p style="text-align: center;"><b>Picture 1: Michael Stonebraker started the Ingres project.</b></p>
<p>The Ingres project was launched at the University of California, Berkeley in 1977. After Ingres,&nbsp;Michael Stonebraker started&nbsp;another project called Postgres (<i>Post-Ingres</i>). After Postgres version 3 was released in 1991, its user base grew quite large. But as the burden of supporting users became too heavy, the project was terminated in 1993 (<i>Postgres is known to have had a huge influence on the current Informix product, even after the end of the project. Illustra, a commercial version of POSTGRES, was acquired by Informix in 1997, and Informix was in turn acquired by IBM in 2001.</i>).</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/497/656/postgresql_history.png" alt="postgresql_history.png" width="470" height="180" /></p>
<p style="text-align: center;"><b>Figure 1:&nbsp;Product History.</b></p>
<p>Even after the project ended, Postgres users and students continued its development and eventually created Postgres95, which supported SQL and, thanks to an improved structure, performed 40% better than Postgres.</p>
<p>When Postgres95 became an open-source system in 1996, it was given its current name, PostgreSQL, to reflect the fact that it succeeded Postgres and supports SQL (<i>Postgres supported a language called QUEL instead of SQL</i>). In 1997, the first release under the PostgreSQL name came out, numbered version 6.0.</p>
<p>Since then, PostgreSQL has been actively developed by an open-source community, and the latest release is 9.2, as of May 2013.&nbsp;In addition, thanks to its open license (<i>like the BSD or MIT license, the PostgreSQL license allows commercial use and modification, while stating that the original developers are not liable for any problem that may arise from its use</i>), there have been more than 20 forks, some of which have influenced PostgreSQL and some of which have disappeared.</p>
<p><img src="/files/attach/images/220547/497/656/postgresql_logo.png" alt="postgresql_logo.png" width="64" height="66" style="float: left; margin-right: 10px;" />PostgreSQL's logo is an elephant named '<em>Slonik</em>' (<i>'baby elephant' in Russian</i>). The real reason an elephant was chosen is not known, but it is said that, just after PostgreSQL became open source, one of its users was inspired by Agatha Christie's novel "Elephants Can Remember" and suggested it. Since then, the elephant logo has been visible at every official PostgreSQL event.</p>
<p>Because elephants are seen as large, strong and reliable animals with a good memory, Hadoop and Evernote also use an elephant as their official logo.</p>
<h2>Functionalities and Limitations</h2>
<p>PostgreSQL supports transactions and <a href="http://en.wikipedia.org/wiki/ACID">ACID</a> properties, which are the basic functionalities of a relational DBMS. In addition to basic reliability and stability, PostgreSQL also has many advanced and extended functionalities aimed at academic research.&nbsp;Even a general list of its functionalities is long:</p>
<ul>
<li>Nested transactions (savepoints)</li>
<li>Point in time recovery</li>
<li>Online/hot backups, Parallel restore</li>
<li>Rules system (query rewrite system)</li>
<li>B-tree, R-tree, hash, GiST method indexes</li>
<li>Multi-Version Concurrency Control (MVCC)</li>
<li>Tablespaces</li>
<li>Procedural Language</li>
<li>Information Schema</li>
<li>I18N, L10N</li>
<li>Database &amp; Column level collation</li>
<li>Array, XML, UUID type</li>
<li>Auto-increment (sequences)</li>
<li>Asynchronous replication</li>
<li>LIMIT/OFFSET</li>
<li>Full text search</li>
<li>SSL, IPv6</li>
<li>Key/Value storage</li>
<li>Table inheritance</li>
</ul>
<p>In addition to these, it offers a variety of other functionalities, including newer ones found in enterprise-level DBMSs.</p>
<p>In general, PostgreSQL has the following limits:</p>
<table border="0">
<caption>Table 1: Basic Limits of PostgreSQL.</caption> <thead> 
<tr>
<th>Limit</th> <th>Value</th>
</tr>
</thead> 
<tbody>
<tr>
<td>Max. Database Size</td>
<td>Unlimited</td>
</tr>
<tr>
<td>Max. Table Size</td>
<td>32 TB</td>
</tr>
<tr>
<td>Max. Row Size</td>
<td>1.6 TB</td>
</tr>
<tr>
<td>Max. Field Size</td>
<td>1 GB</td>
</tr>
<tr>
<td>Max. Rows per Table</td>
<td>Unlimited</td>
</tr>
<tr>
<td>Max. Columns per Table</td>
<td>250~1600</td>
</tr>
<tr>
<td>Max. Indexes per Table</td>
<td>Unlimited</td>
</tr>
</tbody>
</table>
<h2>Roadmap</h2>
<p>As of May 2013, the latest release is 9.2. <b>Figure 2</b> provides some brief information on the progress of PostgreSQL by year.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/497/656/progress_of_postgresql_by_year.png" alt="progress_of_postgresql_by_year.png" width="430" height="108" /></p>
<p style="text-align: center;"><b>Figure 2:&nbsp;Progress of PostgreSQL by Year.</b></p>
<p>The main functionalities of each version are as follows:</p>
<table border="0">
<caption>Table 2:&nbsp;Main Functionalities by Version.</caption><thead> 
<tr>
<th>Version</th><th>Release Year</th><th>Main Functionalities</th>
</tr>
</thead> 
<tbody>
<tr>
<td>0.01</td>
<td>1995</td>
<td>
<ul>
<li>Postgres95 release</li>
</ul>
</td>
</tr>
<tr>
<td>1.0</td>
<td>1995</td>
<td>
<ul>
<li>Copyright change, open source</li>
</ul>
</td>
</tr>
<tr>
<td>6.0~6.5</td>
<td>
<p>1997~1999</p>
</td>
<td>
<ul>
<li>Renamed PostgreSQL</li>
<li>Index, VIEWs and RULEs</li>
<li>Sequences, Triggers</li>
<li>Genetic Query Optimizer</li>
<li>Constraints, Subselect</li>
<li>MVCC, JDBC interface</li>
</ul>
</td>
</tr>
<tr>
<td>7.0~7.4</td>
<td>
<p>2000~2010</p>
</td>
<td>
<ul>
<li>Foreign keys</li>
<li>SQL92 syntax JOINs</li>
<li>Write-Ahead Log</li>
<li>Information Schema, Internationalization</li>
</ul>
</td>
</tr>
<tr>
<td>8.0~8.4</td>
<td>
<p>2005~2012</p>
</td>
<td>
<ul>
<li>Native Support for MS Windows</li>
<li>Savepoint, Point-in-time recovery</li>
<li>Two-phase commit</li>
<li>Table spaces, Partitioning</li>
<li>Full text search</li>
<li>Common table expressions (CTE)</li>
<li>SQL/XML, ENUM, UUID Type</li>
<li>Window functions</li>
<li>Per-database collation</li>
<li>Replication, Warm standby</li>
</ul>
</td>
</tr>
<tr>
<td>9.0</td>
<td>
<p>2010-09</p>
</td>
<td>
<ul>
<li>Streaming replication, Hot standby</li>
<li>Support for 64bit MS Windows</li>
<li>Per-column conditional trigger</li>
</ul>
</td>
</tr>
<tr>
<td>9.1</td>
<td>
<p>2011-09</p>
</td>
<td>
<ul>
<li>Functionality differentiation</li>
<li>Synchronous replication</li>
<li>Per-column collations</li>
<li>Unlogged tables</li>
<li>K-nearest-neighbor indexing</li>
<li>Serializable isolation level</li>
<li>Writeable CTE (WITH)</li>
<li>SQL/MED External Data</li>
<li>SE-Linux integration</li>
</ul>
</td>
</tr>
<tr>
<td>9.2</td>
<td>
<p>2012-09</p>
</td>
<td>
<ul>
<li>Performance optimization</li>
<li>Linear scalability to 64 cores</li>
<li>Reduction in CPU power consumption</li>
<li>Cascade streaming replication</li>
<li>JSON, Range Type</li>
<li>Improved lock management</li>
<li>Space-partitioned GiST index</li>
<li>Index-only scans (covering)</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>The next PostgreSQL release under development is PostgreSQL 9.3, which is due in the third quarter of 2013. <a href="http://wiki.postgresql.org/wiki/Todo">Many functionalities</a> are planned for this and later releases, including enhanced management functionality, parallel query, MERGE/UPSERT, multi-master replication, materialized views, and enhanced multi-language support.</p>
<h2>Internal Structure</h2>
<p>The following shows the process structure:</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/497/656/postgresql_process_structure.png" alt="postgresql_process_structure.png" width="430" height="282" /><br /><b>Figure 3:&nbsp;Process Structure.</b></p>
<p>When a client requests a connection to the server through an interface library&nbsp;(1)&nbsp;(<i>one of a variety of interfaces, including libpq, JDBC and ODBC</i>), the Postmaster process hands the connection over to a server process (2). The client then executes its queries over the connection to the allocated server process (<b>Figure 3</b>).</p>
<p>The following shows the process of query execution in the server:</p>
<p><img src="/files/attach/images/220547/497/656/postgresql_query_execution_procedure.png" alt="postgresql_query_execution_procedure.png" width="430" height="360" style="text-align: center; display: block; margin-left: auto; margin-right: auto;" /></p>
<p style="text-align: center;"><b>Figure 4:&nbsp;Query Execution Procedure.</b></p>
<p>On receiving a query from the client, the server builds a parse tree through syntax analysis (1), then starts a new transaction and builds a query tree through semantic checking (2).</p>
<p>Next, the query tree is rewritten according to the rules defined in the server (3), and the optimal plan tree is chosen from the many possible execution plans (4). The server executes this plan (5) and sends the result of the query to the client.</p>
<p>While the server executes a query, it frequently consults the system catalog in the database. In the system catalog, users can define their own functions and data types, as well as index access methods and rules. In PostgreSQL, the system catalog is therefore a key extension point for adding or expanding functionalities.</p>
<p>A file that stores data consists of multiple pages, and a single page has a scalable slotted page structure (Figures 5 and 6).</p>
<p><img src="/files/attach/images/220547/497/656/postgresql_data_page_structure.png" alt="postgresql_data_page_structure.png" width="430" height="288" style="display: block; margin-left: auto; margin-right: auto;" /></p>
<p style="text-align: center;"><b>Figure 5: Data Page Structure.</b></p>
<p><img src="/files/attach/images/220547/497/656/postgresql_index_page_structure.png" alt="postgresql_index_page_structure.png" width="700" height="488" style="display: block; margin-left: auto; margin-right: auto;" /></p>
<p style="text-align: center;"><b>Figure 6:&nbsp;Index Page Structure.</b></p>
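<p>The general idea of a slotted page can be sketched in a few lines. The code below is a simplified JavaScript illustration and not PostgreSQL's actual on-disk layout: it has no page header, the 4-byte slot size is an assumption, and <code>createPage</code>, <code>insert</code>, <code>read</code> and <code>freeSpace</code> are hypothetical names. Item pointers (slots) grow from the front of the page, tuple data grows from the back, and the free space is whatever remains in between.</p>

```javascript
// Illustrative only: a simplified slotted page, not PostgreSQL's page format.
function createPage(size) {
  const bytes = Buffer.alloc(size);   // the raw page
  const slots = [];                   // item pointers: { offset, length }
  const SLOT_BYTES = 4;               // assumed size of one item pointer
  let upper = size;                   // start of tuple data, moves toward the front

  return {
    // Store a tuple at the back of the page and add a slot at the front.
    // Returns the slot number, or -1 when the page has no room left.
    insert(text) {
      const data = Buffer.from(text, 'utf8');
      const lower = slots.length * SLOT_BYTES;   // end of the slot area
      if (data.length + SLOT_BYTES > upper - lower) return -1;
      upper -= data.length;
      data.copy(bytes, upper);
      slots.push({ offset: upper, length: data.length });
      return slots.length - 1;
    },
    // Look up a tuple through its slot number.
    read(slot) {
      const { offset, length } = slots[slot];
      return bytes.toString('utf8', offset, offset + length);
    },
    // Bytes left between the slot area and the tuple data.
    freeSpace() { return upper - slots.length * SLOT_BYTES; }
  };
}

const page = createPage(64);
const first = page.insert('alice');
const second = page.insert('bob');
console.log(first, second, page.read(second), page.freeSpace());
```

<p>Because a tuple is always reached through its slot number, the data inside the page can be rearranged without changing the identifier that the rest of the system uses for the tuple.</p>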
<h2>Development Process</h2>
<p>The development process model of PostgreSQL can be explained by the following sentence:</p>
<blockquote>
<p>&lsquo;A community-based open-source project led by a few.&rsquo;</p>
</blockquote>
<p>Like the Linux, Apache and Eclipse projects, the PostgreSQL project consists of a few administrators, a variety of developers and a large number of users. The small administrator group (Core Team) collects requests and feedback from this large user base (<i>the group sometimes holds a vote at <a href="http://postgresql.uservoice.com">http://postgresql.uservoice.com</a> to determine priorities</i>), sets the direction of the product, has final approval of the code and decides on releases.&nbsp;This model differs from the company-managed development process of projects such as MySQL and JBoss.</p>
<p>The developer group consists of code committers and code developers/contributors. They are spread across many regions, including the U.S., Japan and Europe.</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" height="300" width="645" alt="distribution_of_postgresql_developers_by_region.png" src="/files/attach/images/220547/497/656/distribution_of_postgresql_developers_by_region.png" /></p>
<p style="text-align: center;"><b>Figure 7: Distribution of PostgreSQL Developers by Region.</b></p>
<p>Code written by these developers goes through a series of review steps (<i>Submission Review, Usability Review, Feature Test, Performance Review, Coding Review, Architecture Review, Review Review</i>), and is merged into the product after approval by the Core Team. Communication mostly takes place on the community's long-standing mailing lists, and a variety of documents, including manuals, are well maintained on the <a href="http://www.postgresql.org">official website</a>.</p>
<h2>Products in Competition</h2>
<p>PostgreSQL aims to be compared with enterprise-level commercial DBMSs, but in practice it has been compared mainly with popular open-source DBMSs. The following are the catchphrases of these open-source DBMSs, each of which reflects its features:</p>
<ul>
<li><b>PostgreSQL</b>: The world's most advanced open source database</li>
<li><b>MySQL</b>: The world's most popular open source database</li>
<li><b>CUBRID</b>: Open Source Database Highly Optimized for Web Applications</li>
<li><b>Firebird</b>: The true open source database</li>
<li><b>SQLite</b>: self-contained library, serverless, zero-configuration, transactional SQL database engine</li>
</ul>
<p>It is not easy to compare these products using their catchphrases alone, but you can see that PostgreSQL seeks progressiveness and openness.</p>
<p>The following is a brief comparison of PostgreSQL and its competitors:</p>
<table border="0">
<caption>Table 3: Comparison of Products in Competition.</caption> 
<tbody>
<tr>
<td>Oracle</td>
<td>
<ul>
<li>An enormous amount of long-proven code and a variety of references.</li>
<li>High cost</li>
</ul>
</td>
</tr>
<tr>
<td>DB2,&nbsp;MS SQL</td>
<td>
<ul>
<li>Similar to Oracle</li>
</ul>
</td>
</tr>
<tr>
<td>MySQL</td>
<td>
<ul>
<li>A variety of applications and references.</li>
<li>Corporate development model</li>
<li>Licensing burden</li>
</ul>
</td>
</tr>
<tr>
<td>CUBRID</td>
<td>
<ul>
<li>An alternative to MySQL</li>
<li>Built-in HA and database sharding</li>
<li>Dual licensing</li>
</ul>
</td>
</tr>
<tr>
<td>Other commercial DBs</td>
<td>
<ul>
<li>Other commercial DBs show a downtrend due to open-source DBMSs</li>
</ul>
</td>
</tr>
<tr>
<td>Other open source DBs</td>
<td>
<ul>
<li>Struggle to attract developers</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>For a long time, the PostgreSQL community has made attempts to enter the enterprise DBMS market. In 2004, <a href="http://www.enterprisedb.com">EnterpriseDB</a>, a company whose products are based on PostgreSQL, was established, and it is striving to strengthen its position in the enterprise DBMS market. One of the company's main products is Postgres Plus Advanced Server. It was developed by adding Oracle-compatible functionalities (<i>PL/SQL, SQL statements, functions, DB Links, OCI library, etc.</i>) to the open-source PostgreSQL, and it features easy data and application migration and a cost reduction of 20% compared to Oracle (<b>Figure 8</b>).</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" height="362" width="426" alt="postgresql_cost_reduction_compared_to_oracle.png" src="/files/attach/images/220547/497/656/postgresql_cost_reduction_compared_to_oracle.png" /></p>
<p style="text-align: center;"><b>Figure 8:&nbsp;Cost Reduction Compared to Oracle.</b></p>
<p>In addition, Postgres Plus Advanced Server comes with differentiated services, including training, consulting, migration and technical support from PostgreSQL experts. With approximately 300 reference sites in a variety of areas, the product is promoted as a database for all industries, with a growing user base across the world.</p>
<h2>Present Status and Trend</h2>
<p>As you can see from most posts on PostgreSQL, most PostgreSQL users are developer-minded and very loyal to the product.</p>
<p>In fact, they have good reason for their loyalty. PostgreSQL provides sufficient functionality and dependable performance compared to other products, and it is welcoming enough to beginners to keep attracting new developers.</p>
<p>What makes it welcoming includes a well-written manual on the project page, related documents, over 300 reference publications, and over 10 seminars and conferences held in a variety of countries every year. More recently, a PostgreSQL magazine has even appeared. All of these are results of the active PostgreSQL community.</p>
<p>The representative features that PostgreSQL users identify as being important are as follows:</p>
<ul>
<li><b>Reliability</b>&nbsp;is the top priority of the product</li>
<li>ACID and transaction</li>
<li>A variety of indexing techniques</li>
<li>Flexible full-text search</li>
<li>MVCC for better concurrency performance</li>
<li>Diverse and flexible replication methods</li>
<li>A variety of procedural languages (<i>PL/pgSQL, Perl, Python, Ruby, TCL, etc.</i>) and interfaces (<i>JDBC, ODBC, C/C++, .Net, Perl, Python, etc.</i>)</li>
<li>Excellent community and commercial support</li>
<li>Well-made documents and a thorough manual</li>
</ul>
<p>The variety of extensions and the ease of developing them are also advantages of PostgreSQL. The following are some of the extensions that set PostgreSQL apart:</p>
<ul>
<li>GIS add-on (PostGIS)</li>
<li>Key-value store extension (HStore)</li>
<li>DBLink</li>
<li>Support for a variety of functions and types, including Crypto and UUID</li>
</ul>
<p>There are&nbsp;many&nbsp;other practical and experimental extensions as well.</p>
<p>Of these, let's take a brief look at GIS (<i>Geographic Information System</i>), which has recently become a hot topic. <a href="http://postgis.refractions.net">PostGIS</a>&nbsp;is a middleware extension that enables PostgreSQL to conform to the <a href="http://www.opengeospatial.org/standards/sfs">OpenGIS standard</a> and support geographic objects (<b>Figure 9</b>).</p>
<p style="text-align: center;"><img height="239" width="204" alt="postgis_structure.png" src="/files/attach/images/220547/497/656/postgis_structure.png" /></p>
<p style="text-align: center;">&nbsp;</p>
<p style="text-align: center;"><b>Figure 9:&nbsp;PostGIS Structure.</b></p>
<p>Development of PostGIS began in 2001, and after many functionality and performance improvements, it currently has the most users among open-source GIS products. There are commercial alternatives, such as Oracle Spatial, DB2 and MS SQL Server, but they have not been as well received in terms of price-performance ratio.</p>
<p>In addition, you can easily find benchmark data showing that the functionalities and performance of PostGIS/PostgreSQL are comparable to those of Oracle.</p>
<p>PostgreSQL has recently drawn attention in connection with the cloud as well as GIS. With the recent increase in the number of companies providing DBaaS (<i>Database as a Service</i>), demand has grown for PostgreSQL, which has advantages in terms of cost and licensing. Accordingly, EnterpriseDB has released Postgres Plus Cloud Database for the cloud market, with the following features:</p>
<ul>
<li>Simple setup &amp; web-based management</li>
<li>Automatic scaling, load balancing and failover</li>
<li>Automated online backup</li>
<li>Database Cloning</li>
</ul>
<p>It can be used on cloud platforms including Amazon EC2, Eucalyptus and the Red Hat OpenShift development platform. Other cloud service providers such as Heroku and dotCloud also provide services using PostgreSQL.</p>
<h2>Conclusion</h2>
<p>After Oracle acquired Sun, which had itself acquired MySQL, in 2009, MySQL began to be developed as a more closed corporate project, and many MySQL developers left the community around the same time. Wary of this change, MySQL users are paying attention not only to MySQL forks to which they can easily migrate (<i>MariaDB, Drizzle, Percona, etc.</i>), but also to migrating to PostgreSQL.</p>
<p>Looking at the trend of help-wanted ads related to PostgreSQL and MySQL on the popular job search portal <a href="http://www.indeed.com">http://www.indeed.com</a>&nbsp;(<b>Figure 10</b>), we can see that the growth in MySQL-related ads is slowing down, while PostgreSQL-related ads continue to increase.</p>
<p style="text-align: center;"><img height="206" width="388" alt="trend_of_help_wanted_ads.png" src="/files/attach/images/220547/497/656/trend_of_help_wanted_ads.png" /></p>
<p style="text-align: center;"><b>Figure 10:&nbsp;Trend of Help-wanted Ads.</b></p>
<p>According to the trend of search frequency in search sites (<b>Figure 10</b>), MySQL shows a continued downtrend, while PostgreSQL seems to have almost no change. In Korea, however, the search frequency for PostgreSQL has shown an upward trend since mid 2010.</p>
<p style="text-align: center;"><img height="208" width="430" alt="search_frequency_trend.png" src="/files/attach/images/220547/497/656/search_frequency_trend.png" /></p>
<p style="text-align: center;"><b>Figure 11: Search Frequency Trend</b> (<a href="http://trend.naver.com/trend.naver?where=trend&amp;mobile=0&amp;startDate=200701&amp;endDate=201210&amp;dtype=&amp;query1=postgresql&amp;query2=&amp;query3=&amp;query4=&amp;query5=">source</a>)<b>.</b></p>
<p>Of course, the popularity and usage of MySQL are still much higher than those of PostgreSQL. Although you cannot determine the true status or prospects of these products from the above data alone, you could infer that if the popularity of MySQL declines, the popularity of PostgreSQL will increase.</p>
<p>PostgreSQL is not yet powerful enough to surpass MySQL in popularity, but the PostgreSQL open source project community continues to make the following efforts:&nbsp;</p>
<ul>
<li>Improvement of the reliability of basic DBMS functionalities</li>
<li>Provision of progressive and differentiated functionality expansion</li>
<li>Continuous attraction of more open source developers</li>
</ul>
<p>In addition, EnterpriseDB, which has stronger business purposes, is also striving to achieve the following objectives:</p>
<ul>
<li>Expansion of its share in the enterprise market</li>
<li>Expansion of its share in the cloud market</li>
<li>Efforts to replace Oracle and MySQL</li>
</ul>
<p>By <a href="/?mid=community&amp;act=dispMemberInfo&amp;member_srl=656646&amp;tab=blogs">Kim Sung Kyu</a>, Senior Software Engineer at CUBRID DBMS Lab, NHN Corporation.</p>
<div></div>]]></description>
                        <pubDate>Tue, 14 May 2013 16:33:08 +0900</pubDate>
                        <category>PostgreSQL</category>
                        <category>ORDBMS</category>
                        <category>RDBMS</category>
                        <category>comparison</category>
                                </item>
        										        <item>
            <title>NoSQL Benchmarking</title>
            <dc:creator>Hye Jeong Lee</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/nosql-benchmarking/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/nosql-benchmarking/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/nosql-benchmarking/#comment</comments>
                                    <description><![CDATA[<p>NoSQL is the talk of the town, and we have already covered <a title="What is NoSQL for?" href="http://blog.cubrid.org/web-2-0/what-is-nosql-for/">what it is for</a> in one of our previous blog posts. Today I would like to share the results of the NoSQL benchmark tests we recently conducted. They will help you judge whether the system you are about to develop is a good fit for NoSQL, and which NoSQL product to select.</p>

<p>In this article we will reveal the characteristics of <a title="Cassandra - CUBRID Blog" href="http://blog.cubrid.org/tags/cassandra/">Cassandra</a>, <a title="HBase - CUBRID Blog" href="http://blog.cubrid.org/tags/hbase">HBase</a> and <a title="MongoDB - CUBRID Blog" href="http://blog.cubrid.org/tags/mongodb/">MongoDB</a> identified through multiple workloads.</p>

<h2>Why NoSQL?</h2>

<p>The interest in NoSQL continues to rise because the amount of data to process continues to increase. International internet companies, including <a title="Google - CUBRID Blog" href="http://blog.cubrid.org/tags/google/">Google</a> and <a title="Facebook - CUBRID Blog" href="http://blog.cubrid.org/tags/facebook/">Facebook</a>, have their own NoSQL solution designed to process the exploding amount of data.</p>

<blockquote>Why are they using NoSQL instead of RDBMS?</blockquote>

<p><a title="Twitter - CUBRID Blog" href="http://blog.cubrid.org/tags/twitter/">Twitter</a> is still <a title="Decomposing Twitter (Database Perspective)" href="http://blog.cubrid.org/web-2-0/decomposing-twitter-database-perspective/">using</a> MySQL. Although smaller than Facebook, the international SNS company MySpace uses MS SQL Server as their main storage.</p>

<p>RDBMSs are known to struggle when processing data at terabyte or petabyte scale. However, this can be resolved through sharding. For instance, <a title="NHN - CUBRID Blog" href="http://blog.cubrid.org/tags/nhn/">NHN</a>’s Café, Blog and News services use RDBMS with sharding.</p>

<p>There is no single correct answer in processing bulk data. Since every situation is different, the company must select the solution that best fits their situation and apply that for seamless service.</p>

<blockquote>Among RDBMSs, Oracle is an exception, since its performance and functions, such as mass data processing and data synchronization, are far superior to those of other RDBMSs. However, the high cost can be a problem. Depending on the size, it may be more economical to develop a NoSQL solution than to purchase an Oracle license.</blockquote>

<p>Even if you have enough expertise in processing bulk data with RDBMS, you will need to keep up your interest and training in NoSQL. NoSQL is gaining popularity because of its ability to process mass data, but it still has many technical (or functional) limitations compared to RDBMS. However, these will be resolved as time passes.</p>

<a name="more"></a>
<p>NoSQL provides a <b>non-relational</b> and, in the long run, <b>schema-free data model</b>, which allows the system to scale horizontally. In exchange, it is <b>less structured</b> than RDBMS and <b>does not guarantee ACID</b>. Therefore, a SELECT issued after an INSERT has completed may return a different value, and after a NoSQL daemon has failed and been restored, the stored value may differ from what it was before. Operations such as <b>transactions or joins cannot be performed</b>.</p>

<p>While ACID has been abandoned, the flexibility, scalability and usability of data storage have increased. Therefore, it is more suitable for explosive amounts of data processing.</p>

<p>Now let’s see whether NoSQL products really play the expected part for the internet services you are about to develop. We will look at the characteristics of the most widely used NoSQL products (Cassandra, HBase, MongoDB) by examining their architectures and the benchmark tests we conducted.</p>

<h2>Benchmarking Tests using YCSB</h2>

<p><a title="Yahoo Cloud Serving Benchmark" href="http://research.yahoo.com/Web_Information_Management/YCSB">YCSB</a> (Yahoo Cloud Serving Benchmark) is a test framework developed by Yahoo. It lets you run benchmark tests on storage systems by creating "<em>workloads</em>". Through this benchmark, you can select the storage most suitable for the service that is to be developed.</p>

<p>Basic operations are Insert, Update, Read, and Scan. There are basic workload sets that combine the basic operations, but new additional workloads can also be created.</p>

<p>YCSB currently supports Cassandra, HBase, MongoDB, Voldemort and JDBC. If you need to test another storage system, implement the YCSB interface for it and run the tests during the development process.</p>

<p>This article contains tests conducted on the following products and versions.</p>

<ul>
  <li><b>Cassandra-0.7.4<br /></b><em>Although Cassandra’s latest version is 0.8.0, we decided to use the previous version, which is known to be stable, because when we tested with version 0.8 the gossip protocol between nodes malfunctioned and the node up/down information was incorrect.</em></li>
  <li><b>HBase-0.90.2</b> (Hadoop-0.20-append)<br /><em>HBase-0.90.2 (Hadoop-0.20-append) was selected because, without the Hadoop-append version, durability in HDFS may be reduced.</em></li>
  <li><b>MongoDB-1.8.1</b></li>
</ul>

<p>The test workload is as follows.</p>

<ul>
  <li><b>Insert Only<br /></b>Enter 50 million 1K-sized records to the empty DB.</li>
  <li><b>Read Only<br /></b>For one hour, read keys chosen according to a <a title="Zipf Distribution" href="http://mathworld.wolfram.com/ZipfDistribution.html">Zipfian distribution</a> from a DB that contains 50 million 1K-sized records.</li>
  <li><b>Read &amp; Update<br /></b>Under the same conditions as <em>Read Only</em>, issue <em>read</em> and <em>update</em> operations in a one-to-one ratio instead of <em>read</em> operations only.</li>
</ul>
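<p>To get a feel for what a Zipfian key distribution means for the Read Only workload, here is a minimal Python sketch. It is not YCSB's own generator; the function name, the exponent of 1.0 and the key space of 1,000 keys are assumptions chosen only for illustration. It shows that a small set of "hot" keys receives a large share of the reads.</p>

```python
import bisect
import itertools
import random


def make_zipfian_chooser(num_keys, exponent=1.0, seed=42):
    """Return a function that picks key ranks 0..num_keys-1 with Zipf-like skew.

    Rank r gets weight 1 / (r + 1) ** exponent, so low ranks are "hot" keys.
    """
    weights = [1.0 / (rank + 1) ** exponent for rank in range(num_keys)]
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]
    rng = random.Random(seed)

    def choose():
        # Inverse-CDF sampling: find the first rank whose cumulative weight
        # reaches a uniformly drawn point.
        point = rng.random() * total
        return min(bisect.bisect_left(cumulative, point), num_keys - 1)

    return choose


choose = make_zipfian_chooser(num_keys=1000)
samples = [choose() for _ in range(10000)]
hot_share = sum(1 for rank in samples if rank in range(10)) / len(samples)
print("share of reads hitting the 10 hottest keys:", round(hot_share, 3))
```

<p>With this skew, roughly four reads in ten land on just 1% of the keys, which is why caching behavior matters so much in a read benchmark of this kind.</p>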

<p>The tests ran on three machines with identical specifications:</p>

<ul>
  <li>Nehalem 6 Core x 2 CPU, 16GB Memory.</li>
</ul>

<p>Each product was configured to replicate and distribute data with three copies. With MongoDB, however, the performance result was abnormally high when both replication and distribution were configured, so the test was conducted with the replica set only. The parameters for each product are shown in the following listings. All other items were left at their defaults.</p>

<ul><li>Cassandra</li></ul>

<div editor_component="code_highlighter" code_type="Plain" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
# Consistency level ONE: a request returns to the application as soon as
# one replica has responded.
Consistency Level, Read=ONE, Write=ONE

# Can be set to periodic (default) or batch.
# periodic: fsync the commit log to disk at regular intervals.
# batch: collect the writes that arrive within a short window (1 ms here)
# and fsync them together before acknowledging.
commitlog_sync: batch, commitlog_sync_batch_window_in_ms: 1

# Fraction of key locations to cache. 1.0 caches everything.
key_cached=1.0
</div>

<ul><li>HBase</li></ul>

<div editor_component="code_highlighter" code_type="Plain" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
HeapSize, HBase=8G, HDFS=2G
</div>

<ul><li>MongoDB</li></ul>

<div editor_component="code_highlighter" code_type="Plain" file_path="" description="" first_line="1" collapse="false" nogutter="false" nocontrols="false" style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(modules/editor/components/code_highlighter/code.png) no-repeat top right;">
# Size of Oplog. Oplog is the log the master accumulated for replication.
--oplogSize=100G
</div>

<p>The results are as shown in the following figure.</p>

<p><a href="http://blog.cubrid.org/wp-content/uploads/2011/09/db-test-results.png"><img title="Comparison of throughput per product" src="http://blog.cubrid.org/wp-content/uploads/2011/09/db-test-results.png" alt="Comparison of throughput per product" width="587" height="418" editor_component="image_link"/></a></p>

<ol>
  <li>All three products showed higher throughput in INSERT-only than in the other workloads.</li>
  <li>Cassandra showed outstanding throughput in INSERT-only with 20,000 ops.</li>
  <li>HBase also presented relatively good performance in INSERT-only.</li>
  <li>The products differed less from each other in READ-only than in INSERT-only.</li>
  <li>HBase’s performance was better than Cassandra’s in READ-only.</li>
  <li>Cassandra’s performance in READ-and-UPDATE was better than that of the other two products. Cassandra’s READ-and-UPDATE throughput might have been higher than its READ-only throughput due to Cassandra’s excellent insert throughput.</li>
  <li>MongoDB’s throughput in all three conditions was the lowest of the three products. MongoDB, which uses memory-mapped files (mmap), probably showed poor performance because the data size exceeded the physical memory size.</li>
</ol>
<p>Let’s take a look at why each product showed such a difference in performance.</p>

<h2>Cassandra and HBase Architecture</h2>

<p>Both Cassandra and HBase are influenced by Google’s BigTable. (<em>Although Cassandra was directly influenced by Amazon’s Dynamo.</em>) <b>BigTable</b> is a column-oriented DB that stores data in a multidimensional sorted map format and has the <b>&lt;row key, column family, column key&gt;</b> data structure. A <b><em>column family</em></b> is the basic unit of storage, grouping columns that are related to each other.</p>

<p>In BigTable, data is written basically with the append method. In other words, when modifying data, the updates are appended to a file, rather than an in-place update in the stored file. The figure below shows BigTable’s data read/insert path.</p>

<p><a href="http://blog.cubrid.org/wp-content/uploads/2011/09/bigtable-internal-structure.png"><img title="BigTable’s Internal Structure" src="http://blog.cubrid.org/wp-content/uploads/2011/09/bigtable-internal-structure.png" alt="BigTable’s Internal Structure" width="417" height="250" editor_component="image_link"/></a></p>

<p>When a write arrives, it is first placed in a memory space called the <em><b>memtable</b></em>. When the <em>memtable</em> is full, its entire contents are stored in a file called an <em><b>SSTable</b> (Sorted String Table)</em>. Data may be lost if the server crashes before the <em>memtable</em> data is moved to an <em>SSTable</em>, so to provide durability every write is recorded in a commit log before it is written to the <em>memtable</em>. A read first looks for the key in the <em>memtable</em>. If it is not there, the <em>SSTable</em> is searched, and you may have to search multiple <em>SSTables</em>.</p>
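<p>The write and read path described above can be sketched in a few lines of Python. This is a toy model, not code from BigTable, Cassandra or HBase: the class name <em>TinyLSMStore</em>, the in-memory lists standing in for files, and the two-entry memtable limit are all assumptions made for illustration.</p>

```python
class TinyLSMStore:
    """Toy model of the commit log / memtable / SSTable path."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.commit_log = []   # stands in for the on-disk commit log
        self.memtable = {}     # in-memory writes
        self.sstables = []     # flushed, sorted, immutable tables (oldest first)

    def write(self, key, value):
        # Log first for durability, then apply to the memtable.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) == self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as a sorted table and start a fresh one.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []

    def read(self, key):
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            lookup = dict(table)
            if key in lookup:
                return lookup[key]
        return None


store = TinyLSMStore(memtable_limit=2)
store.write("a", 1)
store.write("b", 2)   # the memtable is full, so it is flushed to an SSTable
store.write("a", 3)   # a newer value for "a" now lives in the memtable
print(store.read("a"), store.read("b"), store.read("missing"))
```

<p>Note that the update to "a" never touches the flushed table: the old value stays in the SSTable and is simply shadowed by the newer one, which is the append-only behavior described earlier.</p>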

<p>This architecture favors writes, because a write is only recorded in memory and moved to the actual disk after a certain amount has accumulated. <b>Therefore, concurrent I/O can be avoided.</b> A read, however, is relatively slow when it has to be served from an <em>SSTable</em> rather than from the <em>memtable</em>. Both Cassandra and HBase use a Bloom filter to quickly judge whether a key exists in an SSTable, and build an index for lookups. Even so, if there are many <em>SSTable</em>s, a read generates a lot of I/O. Compaction is therefore performed to improve read performance. <b>Compaction</b> merges and sorts two <em>SSTable</em>s into one, which decreases the number of <em>SSTable</em>s. Read and write performance improves as more compactions are done.</p>

<p><em>For these reasons READ and READ-and-UPDATE are much slower than INSERT operations in these NoSQL solutions.</em></p>
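<p>As a rough illustration of the two read-side techniques mentioned above, the following Python sketch shows a small Bloom filter and a compaction step that merges two SSTables. It is a simplified model under stated assumptions: the filter size, the SHA-256 based hashing and the function names are hypothetical, and real implementations merge sorted files in a streaming fashion rather than through dictionaries.</p>

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive several bit positions per key by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def compact(older, newer):
    """Merge two SSTables (sorted lists of (key, value)); newer values win."""
    merged = dict(older)
    merged.update(dict(newer))
    return sorted(merged.items())


older = [("a", 1), ("c", 3)]
newer = [("a", 9), ("b", 2)]
table = compact(older, newer)

bloom = BloomFilter()
for key, _ in table:
    bloom.add(key)
print(table, bloom.might_contain("b"))
```

<p>After compaction a read has one table to check instead of two, and the Bloom filter lets it skip a table entirely whenever the filter reports that the key is definitely absent.</p>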

<h2>Cassandra</h2>

<p>Additionally, <b>Cassandra uses consistent hashing</b>. HBase and Cassandra have their similarities, but there are differences as well. Cassandra prefers <b>AP</b> (Availability &amp; Partition tolerance) while HBase prefers <b>CP</b> (Consistency &amp; Partition tolerance). The <b>CAP theorem</b> is a theorem in distributed computing.</p>

<blockquote>The theorem claims that no system can provide all three of Consistency, Availability and Partition tolerance. For example, data must be replicated across multiple nodes to increase availability, and the replicas must be kept consistent with each other. However, if the system is to keep operating even during a network partition, it becomes difficult to guarantee consistency between the replicas. Therefore, only part of CAP can be provided.</blockquote>

<p>As mentioned before, Amazon’s Dynamo has a direct influence on Cassandra. Dynamo uses consistent hashing to <em>distribute data across the key-value store, and provides high availability</em>. Consistent hashing orders the hash values (slots) of the keys and arranges them in a ring. Each of the nodes that make up the cluster handles a certain range of the ring. Therefore, whenever a node leaves or joins the cluster, the adjacent node on the ring can take over or hand off the affected range without rehashing.</p>
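<p>The ring described above can be sketched as follows. This is a minimal illustration, not Dynamo's or Cassandra's implementation: the <em>HashRing</em> class, the MD5-based positions and the node names are assumptions, and virtual nodes and replication are left out. It shows the key property: when a node leaves, only the keys that node owned move.</p>

```python
import bisect
import hashlib


def ring_position(name):
    """Map a node name or key to a position on the ring (an MD5-based slot)."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes):
        self.entries = sorted((ring_position(node), node) for node in nodes)

    def owner(self, key):
        # The first node at or after the key's position owns it; wrap around.
        positions = [pos for pos, _ in self.entries]
        index = bisect.bisect_left(positions, ring_position(key))
        return self.entries[index % len(self.entries)][1]

    def remove(self, node):
        self.entries = [entry for entry in self.entries if entry[1] != node]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user{i}" for i in range(100)]
before = {key: ring.owner(key) for key in keys}

ring.remove("node-b")
after = {key: ring.owner(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
print(len(moved), "of", len(keys), "keys changed owner")
```

<p>With a plain <em>hash(key) % number_of_nodes</em> scheme, removing one node would remap most keys; on the ring, keys owned by the surviving nodes stay where they are.</p>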

<p>Cassandra also raises availability with the concept of a consistency level. This concept is related to replication: the number of replicas and the point at which replication is considered complete can be adjusted with system parameters. For example, if three replicas are maintained and a write arrives, the strictest setting considers the operation successful only when all three replicas have been written.</p>

<p>However, Cassandra also allows only N (fewer than 3) replicas to be checked before it returns immediately. A write can therefore succeed even when a replica node has failed, which raises availability. Failed writes are recorded on a different node, and the operation can be retried later (this is called “<b>hinted handoff</b>”). Since the success of every replicated write is not guaranteed, data consistency is checked at read time. Generally, when there are multiple replicas, only one of them is read. Cassandra, however, assumes that the replicas may not all match: it reads the data from all three replicas, checks whether they are identical, and restores the latest data where they are not (this is called “<b>read repair</b>”).</p>

<p><em>This is why Cassandra is slower in reading than writing.</em></p>
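<p>The interplay between a relaxed write consistency level and read repair can be sketched like this. The code is a simplified model and not Cassandra's API: the replica dictionaries, the function names and the explicit timestamps are assumptions for illustration, and hinted handoff is left out.</p>

```python
def write(replicas, key, value, timestamp, required_acks=1):
    """Write to every reachable replica; succeed once enough of them ack."""
    acks = 0
    for replica in replicas:
        if replica["up"]:
            replica["data"][key] = (timestamp, value)
            acks += 1
    return acks >= required_acks


def read_with_repair(replicas, key):
    """Read all reachable replicas, return the newest value, fix stale ones."""
    live = [replica for replica in replicas if replica["up"]]
    versions = [replica["data"][key] for replica in live if key in replica["data"]]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[0])
    for replica in live:
        if replica["data"].get(key) != newest:
            replica["data"][key] = newest  # read repair
    return newest[1]


replicas = [{"up": True, "data": {}} for _ in range(3)]
write(replicas, "k", "v1", timestamp=1)

replicas[2]["up"] = False                      # one replica goes down
ok = write(replicas, "k", "v2", timestamp=2)   # still succeeds with ONE ack
replicas[2]["up"] = True                       # it returns with stale data

value = read_with_repair(replicas, "k")
print(ok, value, replicas[2]["data"]["k"])
```

<p>The write stays available while a replica is down, and the price is paid at read time: the read has to contact every replica and compare versions, which is the extra work behind the slower reads noted above.</p>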

<h2>HBase</h2>

<p>Now let’s look into HBase. HBase has the same structure as BigTable. While BigTable operates on the <em><b>tablet</b></em> unit, HBase distributes and replicates data with the <em><b>region</b></em> as the unit. Keys are arranged by range, and the location of the region in which a key is stored is saved as metadata (this is called the <em><b>meta region</b></em>). Therefore, when a key is written, the client first finds the region and then caches this information. HBase sends writes to a single region without distributing them, and splits the region when it becomes too large. To spread writes from the start, adjustments such as creating regions in advance must be made.</p>
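<p>Finding the region for a key is a range lookup over sorted start keys. The following Python sketch illustrates the idea only: the <em>META</em> list, the start keys and the server names are hypothetical, and it does not use the HBase client API or model the client-side cache.</p>

```python
import bisect

# Hypothetical meta table: each region is identified by its start key and is
# served by one region server. A region covers the keys from its start key up
# to (but not including) the next region's start key.
META = [("", "server-1"), ("g", "server-2"), ("p", "server-3")]


def locate_region(row_key, meta=META):
    """Return (start_key, server) of the region whose range contains row_key."""
    start_keys = [start for start, _ in meta]
    index = bisect.bisect_right(start_keys, row_key) - 1
    return meta[index]


print(locate_region("apple"))
print(locate_region("g"))
print(locate_region("zebra"))
```

<p>Because keys are partitioned by range, sequential keys (for example, timestamps) all fall into the same region, which is why writes can concentrate on a single node unless regions are created in advance.</p>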

<p>HDFS is in charge of data storage and replication. When a file is written to Hadoop’s distributed file system, HDFS writes multiple replicas (synchronous replication), and only one of them is read on a read. This guarantees data integrity and lowers read overhead. However, since HBase and HDFS operate separately, the node that holds the <em>memtable</em> and the node that holds the <em>SSTable</em> may differ, which may cause additional network I/O.</p>

<p>What happens if a node fails? If a region server crashes, other nodes take over the regions it was serving. In other words, the commit log that pertains to each region is replayed and the <em>memtable</em> is rebuilt. If the failed node contains HDFS data, the chunks that were on that node are migrated to a different node.</p>

<p>Thus, we have looked at the similarities and differences of Cassandra and HBase.</p>

<blockquote>Then why did the two products show a difference in writing during the comparison test?</blockquote>

<ol>
  <li>First of all, there is a <b>difference in commit log</b>. Cassandra writes the commit log once to the local file system. HBase, however, writes the commit log to HDFS, and HDFS performs the replication. Data locality is not taken into consideration, so the file write request will probably be sent to a different physical node over the network.</li>
  <li>Second, Cassandra receives write requests on all three nodes, but HBase initially writes to a single region only, and so receives requests on only one node.</li>
</ol>
<blockquote>Then why is there a difference in reading performance?</blockquote>

<p>There are various differences, but for one READ request Cassandra reads the data three times while HBase reads the data only once. This affects the throughput.</p>

<h2>MongoDB Architecture</h2>

<p>MongoDB is unlike Cassandra or HBase explained earlier. So many different kinds of products are classified as NoSQL that it is difficult to compare MongoDB with Cassandra and HBase on the same level.</p>

<p><em>Cassandra and HBase are for processing full-scale large amounts of data, while MongoDB can be put to use quickly and schema-free for a moderate amount of data.</em></p>

<p>MongoDB adopts a <b>document-oriented format</b>, so it is more similar to RDBMS than a key-value or column-oriented format is.</p>

<p>MongoDB <b>operates on a memory base and places high performance above data scalability</b>. If reading and writing is conducted within the usable memory, then high-performance is possible. However, performance is not guaranteed if operations exceed the given memory. Therefore, <b>MongoDB should be used as a medium or small-sized storage system</b>.</p>

<p>Configure a test condition similar to the original test on a lower specification server and test 300 thousand records instead of 50 million, and MongoDB will record greater performance than Cassandra or HBase.</p>

<p><b>MongoDB uses <a title="BSON" href="http://bsonspec.org/">BSON</a> for data storage</b>. (It is also known as Binary JSON, but in its data encoding format and expressive data types it is closer to Hessian than to JSON.)</p>

<p>The logical unit that is expressed using BSON is called <em><b>document</b></em>. (When using NPC as an example, it is the same as when the NPC data is recorded on the disk and searched.) <em>Document</em> is a concept that corresponds to a <b>row</b> in RDBMS. The difference is that a <em>document</em> in MongoDB does not have a schema.</p>

<p>Accumulated documents<em> </em>are called <em><b>collections</b></em>. A collection corresponds to a <em><b>table</b></em> in RDBMS.</p>

<p>The basic query method is to search a collection for documents that meet the conditions. Each document has a field called <b>_id</b>, which acts as the OID (Object Identifier), and all query results are returned with this <em>_id</em>.</p>

<p>Indexes can also be created for collections. An index can be defined on fields in the document. This index is implemented as a <b>B-Tree</b>, similar to an RDBMS index. Each collection creates an index on <em>_id</em> by default. One of the strengths of MongoDB is that its functions are similar to RDBMS, in that it supports indexing and JOIN.</p>

<p>Next, let’s look at the structure of MongoDB.</p>

<h3>MongoDB Structure</h3>

<p>MongoDB can be configured for both replication and distribution. The system configuration recommended by MongoDB is three shard servers and three replica sets for each server (in other words, nine nodes). <b>Replication</b> is made possible by Master/Slave or a replica set. A replica set supports automatic failover. It has one master and multiple slaves, and writes can only be sent to the master. If the master is terminated, service resumes after one of the slaves is appointed as the master.</p>

<p>The distribution method provided by MongoDB is <b>sharding</b>. Each collection can be sharded, and a field in the collection can be designated as the shard key. Like HBase, sharding uses the <b>order-preserving partitioning method</b>. A chunk, identified by <b>&lt;collection, minkey, maxkey&gt;</b>, is a contiguous range of data within a shard and can be as large as 64 MB. A chunk that exceeds the maximum size is split into two chunks, and the resulting chunks can migrate to different shard servers. The following figure shows a MongoDB deployment composed of shards and replica sets.</p>
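<p>Chunk splitting can be illustrated with a short Python sketch. It is a model of the idea only and not MongoDB's algorithm: the <em>split_chunk</em> function and the dictionary layout are hypothetical, and it counts documents instead of measuring size in bytes to keep the example small.</p>

```python
def split_chunk(chunk, max_docs):
    """Split a chunk at its median shard key once it holds more than max_docs.

    A chunk is a dict with "min" (inclusive), "max" (exclusive) and a list of
    shard keys. Counting documents instead of bytes is an assumption made
    here for brevity.
    """
    keys = sorted(chunk["keys"])
    if max_docs >= len(keys):
        return [chunk]
    middle = keys[len(keys) // 2]
    left = {"min": chunk["min"], "max": middle,
            "keys": [key for key in keys if middle > key]}
    right = {"min": middle, "max": chunk["max"],
             "keys": [key for key in keys if key >= middle]}
    return [left, right]


chunk = {"min": 0, "max": 100, "keys": [5, 17, 42, 63, 88]}
parts = split_chunk(chunk, max_docs=4)
for part in parts:
    print(part)
```

<p>The two resulting chunks still cover the original key range without a gap, so each of them can be moved to another shard independently while the router keeps finding documents by range.</p>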

<p><a href="http://blog.cubrid.org/wp-content/uploads/2011/09/mongodb-composition.png"><img title="MongoDB structure" src="http://blog.cubrid.org/wp-content/uploads/2011/09/mongodb-composition.png" alt="MongoDB structure" width="570" height="376" editor_component="image_link"/></a></p>

<p>The client first contacts a process called <b>mongos</b>, which is in charge of <em>routing</em> and <em>coordination</em>. For coordination, mongos sends requests to multiple nodes, then collects the results and returns them. The config server retains meta information on shards and chunks.</p>

<p>Now let’s look into MongoDB's internal structure. MongoDB uses the <b>asynchronous method for replication</b>. When a write arrives, the master stores it in the <em>Oplog</em>. The <b>Oplog</b> is a fixed-size, FIFO collection called a capped collection. The slave regularly polls the logs accumulated in the Oplog, reads them, and applies them. If the Oplog fills up before the slave has replayed the log, the slave must suspend log-based replication and resynchronize from the master’s data. Another distinctive characteristic of the Oplog is that the saved operations are idempotent when the log is replayed. Operation logs are saved in the order in which they were executed, so the log can be replayed from a given point onward. For this purpose, ops such as <em>increment</em> are converted to set ops when they are logged.</p>
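<p>Why logging an <em>increment</em> as a set makes replay idempotent is easiest to see in code. The sketch below is a toy model, not MongoDB's oplog format: the <em>Master</em> class, the tuple layout of log entries and the <em>replay</em> function are assumptions for illustration, with a bounded deque standing in for the capped collection.</p>

```python
from collections import deque


class Master:
    def __init__(self, oplog_capacity=1000):
        self.data = {}
        # Capped, FIFO log: the oldest entries fall off when it is full.
        self.oplog = deque(maxlen=oplog_capacity)

    def set(self, key, value):
        self.data[key] = value
        self.oplog.append(("set", key, value))

    def increment(self, key, amount):
        # Apply the increment locally, but log the resulting value as a set
        # so that replaying the entry twice gives the same result.
        self.data[key] = self.data.get(key, 0) + amount
        self.oplog.append(("set", key, self.data[key]))


def replay(oplog, slave_data):
    """Apply logged operations in order to a slave's data."""
    for op, key, value in oplog:
        if op == "set":
            slave_data[key] = value
    return slave_data


master = Master()
master.set("hits", 10)
master.increment("hits", 5)

slave = replay(master.oplog, {})
slave = replay(master.oplog, slave)   # replaying again changes nothing
print(slave, master.data)
```

<p>If the log had stored the increment itself, the second replay would have produced 20 instead of 15; storing the resulting value lets a slave safely re-apply entries it may already have seen.</p>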

<p>MongoDB’s replication and failover method and its query power are similar to those of RDBMS. The most significant difference is that it stores schema-free data, which gives the application flexibility in data storage.</p>

<h2>RDBMS Architecture</h2>

<p>Unlike NoSQL, RDBMS is <b>"read" operation oriented</b>. The details differ for each product, but NoSQL products are generally implemented so that mass data can be ‘<b>written</b>’ quickly. RDBMS, on the other hand, is ‘read’ oriented: it not only enables quick reads, but also provides many functions and query features. A DBMS can look up values quickly using indexes, but at the same time indexes slow down ‘writing’. Since every index must be updated on every ‘writing’ operation, the more indexes there are, the more time each write takes.</p>

<p>The difference of RDBMS and NoSQL can be compared with ACID and BASE, but comparisons can also be made with reading and writing.</p>

<p>If the business you are to build requires more reading than writing, and needs complex reading methods, as well as basic aggregate functions such as SUM() or AVG(), then RDBMS might be a better choice than NoSQL.</p>

<h2>Summary</h2>

<p>We have looked at benchmark tests done on Cassandra, HBase and MongoDB, and at the differences between the products. Cassandra and HBase showed excellent write performance, but their read performance did not meet expectations. This is because, unlike existing RDBMSs, the two products are optimized for writing, which results in a lot of concurrent I/O when reading.</p>

<p>MongoDB has similar structures to that of RDBMS and shows great flexibility in data modeling especially for medium and small-sized businesses.</p>

<p>By Lee Hye Jeong, Storage System Development Team, NHN Corporation.</p>]]></description>
                        <pubDate>Thu, 08 Sep 2011 02:27:43 +0900</pubDate>
                        <category>benchmark</category>
                        <category>MS SQL</category>
                        <category>NHN</category>
                        <category>Cassandra</category>
                        <category>Twitter</category>
                        <category>NoSQL</category>
                        <category>Facebook</category>
                        <category>Google</category>
                        <category>BigTable</category>
                        <category>Dynamo</category>
                        <category>Amazon</category>
                        <category>MongoDB</category>
                        <category>Oracle</category>
                        <category>Dev Platform Blog</category>
                        <category>HBase</category>
                                    <slash:comments>15</slash:comments>
                    </item>
        										        <item>
            <title>Comprehensive Overview of Top 14 Content Management Systems</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/comprehensive-overview-of-top-14-content-management-systems/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/comprehensive-overview-of-top-14-content-management-systems/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/comprehensive-overview-of-top-14-content-management-systems/#comment</comments>
                                    <description><![CDATA[<p><strong>Update: </strong>added the <em>Distinguished clients</em> for DotNetNuke.</p>

<p>These days many websites (<em>in fact, millions</em>) are built with popular open source content management systems (CMS). I would say that for <em>everything</em> <em>you could imagine your site doing</em>, there is one CMS or another that can do it. Thus, which one you choose depends on the tasks you want to accomplish. For blogs, the WordPress blog management system is the common choice; for large-scale websites, Drupal CMS is perceived to be a better fit; and Joomla CMS is easier for newbies to learn. .NET developers prefer the alternatives DotNetNuke or umbraco CMS. Besides these, CMS Made Simple and Liferay are among the most popular content management systems by number of downloads. Elgg and MODx are among the rising stars.</p>

<p>I will not tell you what a CMS is. If you want to learn, see the <a title="Wikipedia: Content management system" href="http://en.wikipedia.org/wiki/Content_management_system" target="_blank">CMS Wikipedia page</a>. In this blog I will give a comprehensive overview of top 14 content management systems based on the number of weekly downloads, installations, and brand familiarity. The full statistical data can be found in the 2010 OPEN SOURCE CMS MARKET SHARE REPORT by water&amp;stone (2010). The report is distributed under the <a title="Creative Commons Attribution-Noncommercial License (3.0)" href="http://creativecommons.org/licenses/by-nc/3.0/" target="_blank">Creative Commons Attribution-Noncommercial License</a> (3.0).</p>

<p>Today we will cover <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#drupal">Drupal</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#joomla">Joomla</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#modx">MODx</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#wordpress">WordPress</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#dotnetnuke">DotNetNuke</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#umbraco">umbraco</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#liferay">Liferay</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#typo3">TYPO3</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#cmsmadeeasy">CMS Made Simple</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#movabletype">MOVABLE TYPE</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#plone">Plone</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#ezpublish">eZ Publish</a>, <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#concrete5">concrete5</a>, and <a href="http://blog.cubrid.org/web-2-0/comprehensive-overview-of-top-14-content-management-systems/#alfresco">Alfresco</a> content management systems. You might already be familiar with some of these, while some can be new to you. 
I bet you will be surprised to learn that some systems you have never heard of suit your requirements much better than those you have used so far (this was my personal experience). So, let's learn about each of them, and <em>do not forget to leave your comments below on what other CMS should have been included in this list and why</em>.</p>

<a name="more"></a>
<p></p>

<p></p><blockquote><strong>Note:</strong><em> the order of the systems does not reflect their quality, performance, or superiority over one another. They are numbered only to make the list easier to follow. The user feedback and opinions are taken from Stack Overflow and can therefore be subjective, as they are based on personal views and preferences.</em></blockquote><p></p>

<p></p><h2>#1: Drupal</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://drupal.org/" target="_blank"><img title="Drupal" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/drupal.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site:</strong> <a title="Drupal - Open Source CMS" href="http://drupal.org/" target="_blank">http://drupal.org/</a></p>

<p>Drupal started out in 2001. About 1% of all websites on the Internet are based on this platform: an estimated 7.2 million sites were powered by Drupal as of July 2010. In the one year from May 2007 to April 2008, Drupal was downloaded from its official site more than 1.4 million times, an increase of approximately 125% over the previous year.</p>

<p><strong>Weekly downloads:</strong> 33,671 (<em>ranked #3 after Joomla and before DotNetNuke</em>).</p>

<p><strong>Installations:</strong> 575 according to the survey (<em>#3 after WordPress and before DotNetNuke</em>), but 1.4% (<em>#3 after Joomla and before TYPO3</em>) of Alexa Top 1 million sites.</p>

<p><strong>Brand Familiarity:</strong> Drupal is known to be <em>the #3 most familiar content management system</em>.</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>You can manage multiple sites with Drupal in multiple languages. You can use it for a blog, a corporate site, a personal site, a gallery; in short, whatever you can imagine.</li>
	<li>You can easily manage your site users with standard registration, including OpenID support. You can set various access control rules to limit the activity of your site users.</li>
	<li>Provides a multi-level menu system, template customization, advanced search, and an RSS feed aggregator.</li>
	<li>Drupal officially supports several databases, including MySQL, PostgreSQL, MariaDB, and SQLite.</li>
	<li>You can use caching to increase performance. It also provides strong security, with notifications about new update releases.</li>
	<li>Provides Search Engine Friendly descriptive URLs.</li>
	<li>Powered by the jQuery JavaScript framework.</li>
</ul>
<strong>Extensions:</strong> over 7,000 free community-contributed add-ons, known as contrib modules.<p></p>

<p><strong>Distinguished Clients:</strong></p>

<p></p><ul>
	<li><a title="The White House of the United States of America" href="http://www.whitehouse.gov/" target="_blank">http://www.whitehouse.gov/</a></li>
	<li><a href="http://data.gov.uk/" target="_blank">http://data.gov.uk/</a></li>
	<li><a href="http://www.ubuntu.com/" target="_blank">http://www.ubuntu.com/</a></li>
	<li><a href="http://www.alrc.gov.au/" target="_blank">http://www.alrc.gov.au/</a></li>
	<li><a href="http://www.nysenate.gov/" target="_blank">http://www.nysenate.gov/</a></li>
	<li><a href="http://london.gov.uk/" target="_blank">http://london.gov.uk/</a></li>
	<li><a href="http://www.rutgers.edu/" target="_blank">http://www.rutgers.edu/</a></li>
	<li><a href="http://www.economist.com/" target="_blank">http://www.economist.com/</a></li>
	<li><a href="http://wfp.org/" target="_blank">http://wfp.org/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>More difficult to master, especially for newcomers; it is aimed more at advanced users. The new Drupal 7, however, claims to provide significantly improved usability (<em>maybe closer to the WordPress style</em>). To achieve this, the project hired web designers specifically to address the UX problems of previous versions.</li>
	<li>New major releases are not fully backward compatible; the focus is on new features and functionality. External module developers have to take care of compatibility themselves, except for data representation, which Drupal intends to keep the same.</li>
	<li>Drupal is often compared with Joomla and is perceived to be a bit slower, though this depends on the type of website you develop. By leveraging its caching and gzip compression, however, it is possible to achieve quite impressive results.</li>
	<li>jQuery, the default JavaScript framework in Drupal, can be used side by side with the alternative MooTools framework.</li>
</ul>
<h2>#2: Joomla</h2><p></p>

<p><a title="Joomla" href="http://www.joomla.org/" target="_blank"><img title="Joomla" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/joomla.png" width="701" height="325" editor_component="image_link"/></a></p>

<p><strong>Official site:</strong> <a href="http://www.joomla.org/" target="_blank">http://www.joomla.org/</a></p>

<p>Joomla is the result of a fork of the Mambo CMS on August 17, 2005. Within its first year of release, Joomla was downloaded 2.5 million times.</p>

<p><strong>Weekly downloads:</strong> 113,836 (<em>ranked #2 after WordPress and before Drupal</em>).</p>

<p><strong>Installations:</strong> 1,297 according to the survey (<em>#1 before WordPress</em>), but 2.5% (<em>#2 after WordPress and before Drupal</em>) of Alexa Top 1 million sites.</p>

<p><strong>Brand Familiarity:</strong> Joomla is known to be <em>the #1 most familiar content management system</em>.</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>You can manage multiple sites with Joomla in multiple languages natively (since Joomla 1.6). You can use it for a blog, a corporate site, a personal site, a gallery; in short, whatever you can imagine.</li>
	<li>You can easily manage your site users with standard registration, including Google OpenID support. Full support for Access Control Lists.</li>
	<li>Provides a multi-level menu and content category system, template customization, advanced search, and an RSS feed aggregator.</li>
	<li>Officially supports only MySQL.</li>
	<li>Page caching for increased performance.</li>
	<li>Provides moderately descriptive URLs (<em>still not as fully customizable as in WordPress</em>).</li>
	<li>Powered by the MooTools JavaScript framework.</li>
</ul>
<strong>Extensions:</strong> There are over 6,000 free and commercial plugins available from the official site.<p></p>

<p><strong>Distinguished Clients:</strong></p>

<p></p><ul>
	<li><a href="http://www.linux.com/" target="_blank">http://www.linux.com/</a></li>
	<li><a href="http://www.itwire.com/" target="_blank">http://www.itwire.com/</a></li>
	<li><a href="http://www.quizilla.com/" target="_blank">http://www.quizilla.com</a></li>
	<li><a href="http://www.ihop.com/" target="_blank">http://www.ihop.com</a></li>
	<li><a href="http://gsas.harvard.edu/" target="_blank">http://gsas.harvard.edu</a></li>
	<li>Citibank (financial institution intranet) - not publicly accessible</li>
	<li><a href="http://www.greenmaven.com/" target="_blank">http://www.greenmaven.com</a></li>
	<li><a href="http://www.outdoorphotographer.com/" target="_blank">http://www.outdoorphotographer.com</a></li>
	<li><a href="http://www.playshakespeare.com/" target="_blank">http://www.playshakespeare.com</a></li>
	<li><a href="http://www.sensointeriors.co.za/" target="_blank">http://www.sensointeriors.co.za</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>More intuitive and easier to use than Drupal, though still not as easy as WordPress.</li>
	<li>Powerful. Fully-fledged content management system, so you can create whatever site you want.</li>
	<li>Really strong security. When security problems are found, they are fixed immediately.</li>
	<li>The new Joomla 1.6 release is expected to be faster, more convenient, with more features.</li>
	<li>Its strong dependency on the MooTools JavaScript framework sometimes bothers users, as Joomla does not offer an easy way to disable it and use jQuery instead.</li>
	<li>Supporting only one database limits Joomla's potential user base considerably. However, this is a trade-off that allows heavy optimization for MySQL and thus better overall performance.</li>
	<li>Does not allow full customization of URLs, a must-have feature for a CMS.</li>
</ul>
<h2>#3: MODx</h2><p></p>

<p><a title="modx" href="http://www.modxcms.com/" target="_blank"><img title="modx" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/modx.png" width="701" height="325" editor_component="image_link"/></a></p>

<p><strong>Official site: </strong><a href="http://www.modxcms.com/" target="_blank">http://www.modxcms.com/</a></p>

<p><strong>MODx</strong> is not just an open source CMS but also a web application framework. Raymond Irving and Ryan Thrash began the MODx CMS project in 2004 as a fork of Etomite. In 2008 MODx users created a new logo and branding for the project. Today MODx allows for full separation of content (plain HTML), appearance and behavior (standards-compliant CSS and JavaScript), and logic (PHP, snippets).</p>

<p><strong>Weekly downloads:</strong> 4,500 (<em>ranked #11 after umbraco and before Tiki</em>).</p>

<p><strong>Installations:</strong> 58 according to the survey (<em>#12 after eZ Publish and before umbraco</em>), but less than 0.1% of the Alexa Top 1 million sites.</p>

<p><strong>Brand Familiarity:</strong> #14 (<em>after eZ Publish and before Liferay</em>).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Like Joomla, MODx officially supports only the MySQL database.</li>
	<li>Not just a CMS but also a PHP framework for the Web.</li>
	<li>Freedom to choose jQuery, MooTools, ExtJS, Prototype or any other JavaScript library.</li>
	<li>Supports PHP 4.3.11 and above.</li>
	<li>Complete control of all metadata and URL structure for SEO (Search Engine Optimization).</li>
	<li>Unlimited hierarchical page depth.</li>
	<li>Can create custom fields and widgets for templates.</li>
	<li>Role-based permissions for the Manager.</li>
	<li>Ability to customize the Manager on a per-deployment basis.</li>
	<li>E-commerce integration via FoxyCart.</li>
</ul>
<strong>Extensions:</strong> 622, also known as add-ons.<p></p>

<p><strong>Distinguished Clients:</strong></p>

<p></p><ul>
	<li><a href="http://www.pippatoledo.com/" target="_blank">http://www.pippatoledo.com/</a></li>
	<li><a href="http://www.aquevix.com/" target="_blank">http://www.aquevix.com/</a></li>
	<li><a href="http://www.not1bug.com/" target="_blank">http://www.not1bug.com/</a></li>
	<li><a href="http://everlight-uva.com/" target="_blank">http://everlight-uva.com/</a></li>
	<li><a href="http://www.tritopora.ru/" target="_blank">http://www.tritopora.ru/</a></li>
	<li><a href="http://www.strategische-webloesungen.de/" target="_blank">http://www.strategische-webloesungen.de/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>Good to have the freedom to choose your favorite JavaScript framework.</li>
	<li>Light CMS solution (but not necessarily the fastest).</li>
	<li>Supporting PHP 4 means that a lot of compromises had to be made in terms of OOP (Object-Oriented Programming).</li>
	<li>Nice to have the freedom to set custom URLs.</li>
</ul>
<h2>#4: WordPress</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://wordpress.org/" target="_blank"><img title="WordPress" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/wordpress.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site:</strong> <a href="http://wordpress.org/" target="_blank">http://wordpress.org/</a></p>

<p>WordPress was first released on May 27, 2003, by Matt Mullenweg as a fork of b2/cafelog. As of August 2010, version 3.0 had been downloaded over 12.5 million times. Nowadays it is known as the #1 CMS for blogging.</p>

<p><strong>Weekly downloads:</strong> 983,625 (<em>ranked #1 before Joomla</em>).</p>

<p><strong>Installations:</strong> 1,012 according to the survey (<em>#2 after Joomla and before Drupal</em>), but 12.9% of the Alexa Top 1 million sites (<em>#1 before Joomla</em>).</p>

<p><strong>Brand Familiarity:</strong> #2 (<em>after Joomla and before Drupal</em>).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Highly optimized for blogging.</li>
	<li>Custom, easy-to-switch themes.</li>
	<li>Users can re-arrange widgets without editing PHP or HTML code.</li>
	<li>Officially supports only MySQL.</li>
	<li>Custom URLs and a clean permalink structure, excellent for SEO.</li>
	<li>Nested, multiple categories for articles.</li>
	<li>Support for tagging. Advanced search by tags.</li>
	<li>Highly intuitive UI (User Interface).</li>
	<li>jQuery JavaScript framework.</li>
	<li>Supports the Trackback and Pingback standards for displaying links to other sites that have themselves linked to a post or article.</li>
	<li>Rich plugin architecture which allows users and developers to extend its functionality beyond the features that come as part of the base install.</li>
</ul>
Native applications exist for Android, iPhone/iPod Touch, and BlackBerry. They provide access to some of the features in the WordPress Admin panel and work with WordPress.com and many WordPress.org blogs.<p></p>

<p><strong>Extensions:</strong> 12,780 plugins and 1,315 themes.</p>

<p><strong>Distinguished Clients:</strong></p>

<p></p><ul>
	<li><a href="http://www.nytimes.com/interactive/blogs/directory.html" target="_blank">http://www.nytimes.com/interactive/blogs/directory.html</a></li>
	<li><a href="http://tmagazine.blogs.nytimes.com/" target="_blank">http://tmagazine.blogs.nytimes.com/</a></li>
	<li><a href="http://crowdfavorite.com/" target="_blank">http://crowdfavorite.com/</a></li>
	<li><a href="http://blog.broadband.gov/" target="_blank">http://blog.broadband.gov/</a></li>
	<li><a href="http://blogs.america.gov/" target="_blank">http://blogs.america.gov/</a></li>
	<li><a href="http://www.number10.gov.uk/" target="_blank">http://www.number10.gov.uk/</a></li>
	<li><a href="http://www.speaker.gov/" target="_blank">http://www.speaker.gov/</a></li>
	<li><a href="http://allthingsd.com/" target="_blank">http://allthingsd.com/</a></li>
	<li><a href="http://politicalticker.blogs.cnn.com/" target="_blank">http://politicalticker.blogs.cnn.com/</a></li>
	<li><a href="http://stylenews.peoplestylewatch.com/" target="_blank">http://stylenews.peoplestylewatch.com/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>Perhaps the most convenient, easy-to-use and intuitive CMS (<em>or BMS</em>) in the world. Perfect for blog sites. If you need to develop a dynamic site with various components, another CMS might fit better, though WordPress provides enough plugins to accomplish almost any task.</li>
	<li>I would avoid WordPress as a CMS in a professional environment. As stated earlier, it's a great blogging platform, but doesn't generally offer the robustness that most professional environments require.</li>
	<li>The availability of jQuery makes plugin development a lot easier for external developers and site owners.</li>
	<li>Endless themes - no need to worry about a new design for your site, unless you really need something special.</li>
</ul>
<h2>#5: DotNetNuke</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://www.dotnetnuke.com/" target="_blank"><img title="DotNetNuke" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/dotnetnuke.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site:</strong> <a href="http://www.dotnetnuke.com/" target="_blank">http://www.dotnetnuke.com/</a></p>

<p>DotNetNuke is an open source platform for building web sites based on Microsoft .NET technology. It is written in VB.NET and distributed under both a Community Edition BSD-style license and a commercial proprietary license. The Community Edition is a popular web content management (WCM) system and application development framework for ASP.NET, with over 6 million downloads and 600,000 production web sites as of October 2010. More than 8,000 DotNetNuke apps are available for purchase on Snowcovered.com. DotNetNuke.com has over 800,000 registered members as of October 2010.</p>

<p><strong>Weekly downloads:</strong> 13,000 (<em>ranked #4 after Drupal and before CMS Made Simple</em>).</p>

<p><strong>Installations:</strong> 402 according to the survey (<em>#4 after Drupal and before Liferay</em>), but 0.2% of the Alexa Top 1 million sites (<em>#5 after TYPO3 and before Movable Type</em>).</p>

<p><strong>Brand Familiarity:</strong> #4 (<em>after Drupal and before TYPO3</em>).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Distinguishes between community (common features) and enterprise (full set of features) editions.</li>
	<li>Various modules and data providers.</li>
	<li>Provides language packs for about 60 languages.</li>
	<li>Customizable through skins and templates.</li>
</ul>
<strong>Distinguished&nbsp;clients:</strong><p></p>

<p></p><ul>
	<li><a href="http://www.chamberlain.edu/" target="_blank">http://www.chamberlain.edu/</a></li>
	<li><a href="http://magenic.com/" target="_blank">http://magenic.com/</a></li>
	<li><a href="http://www.graphiksolutions.ca/" target="_blank">http://www.graphiksolutions.ca/</a></li>
	<li><a href="http://www.marly.com/" target="_blank">http://www.marly.com/</a></li>
	<li><a href="http://www.dotcomsoftwaresolutions.com/" target="_blank">http://www.dotcomsoftwaresolutions.com</a></li>
	<li><a href="http://www.cityplaceascent.com/" target="_blank">http://www.cityplaceascent.com/</a></li>
	<li><a href="http://sites.kiwanis.org/kiwanis/en/home.aspx" target="_blank">http://sites.kiwanis.org/kiwanis/en/home.aspx</a></li>
	<li><a href="http://www.zonediet.com/" target="_blank">http://www.zonediet.com/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>Creating modules is a little difficult.</li>
	<li>It is perceived as a somewhat bulky CMS.</li>
	<li>Does not provide extensive documentation and user guides.</li>
	<li>Like most CMSs, it does not provide full backward compatibility in its new major releases.</li>
	<li>Unlike its enterprise edition, the community edition is not tested and certified by the DotNetNuke Corporation.</li>
	<li>Even though it provides language packs, sites cannot be created to support multiple languages. A third-party module is needed to enable this feature.</li>
	<li>Auto-upgrade, Advanced Site Search, Page Caching, and many other must-have features are not included in the Community edition, only in the Professional or Enterprise editions.</li>
</ul>
<h2>#6: Umbraco</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://umbraco.org/" target="_blank"><img title="umbraco" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/umbraco.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://umbraco.org/" target="_blank">http://umbraco.org/</a></p>

<p>Umbraco is also an open source content management system. It was developed by Niels Hartvig in 2000 and released as open source software in 2004. It is written in C# and can be deployed on Microsoft-based infrastructure. In 2010, with about <strong>1,000 downloads a day</strong>, Umbraco was in the Top 5 most popular downloads via the <em>Microsoft Web Platform Installer</em>, two places below its main rival DotNetNuke.</p>

<p><strong>Weekly downloads:</strong> 5,420 (<em>ranked #10 after Alfresco and before MODx</em>).</p>

<p><strong>Installations:</strong> 57 according to the survey (<em>#13 after MODx and before e107</em>), but less than 0.1% of the Alexa Top 1 million sites.</p>

<p><strong>Brand Familiarity:</strong> #16 (after Liferay and before e107).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Can be deployed with several databases, including MySQL, SQL Server, and VistaDB.</li>
	<li>SEO-friendly URLs.</li>
</ul>
<strong>Extensions:</strong><span style="font-weight: normal;"> 310 add-on modules.</span><p></p>

<p><strong>What users say:</strong></p>

<p></p><ul>
	<li>Limited number of extensions.</li>
	<li>Official support for Windows OS only.</li>
	<li>Umbraco is oriented to small low-cost sites.</li>
	<li>Video training has to be purchased.</li>
	<li>Many must-have features have to be purchased.</li>
</ul>
<h2>#7: Liferay</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://www.liferay.com/" target="_blank"><img title="Liferay" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/liferay.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://www.liferay.com/" target="_blank">http://www.liferay.com/</a></p>

<p><strong>Liferay </strong>Portal is a free and open source enterprise portal written in Java and distributed under the GNU Lesser General Public License. It allows users to set up features common to websites. It is fundamentally constructed of functional units called portlets. Liferay is sometimes described as a content management framework or a web application framework. It comes with certain portlets preinstalled. These comprise the core functionality of the portal system.</p>

<p><strong>Weekly downloads:</strong> 9,435 (<em>ranked #6 after CMS Made Simple and before TYPO3</em>).</p>

<p><strong>Installations:</strong> 154 according to the survey (<em>#5 after DotNetNuke and before TYPO3</em>), but less than 0.1% of the Alexa Top 1 million sites.</p>

<p><strong>Brand Familiarity:</strong> #15 (after MODx and before Umbraco).</p>

<p><strong>Major Features/ Portlets:</strong></p>

<p></p><ul>
	<li>Can tag and categorize content.</li>
	<li>Document Library Manager, Recent Documents.</li>
	<li>Alfresco, Documentum, and other document library integration.</li>
	<li>User management based on various roles and groups (ACL).</li>
	<li>WebDAV Integration (Web-based Distributed Authoring and Versioning which allows users to collaboratively edit and manage files on remote web servers).</li>
	<li>Nested Portlets</li>
	<li>User Directory</li>
	<li>LDAP Integration</li>
	<li>Microsoft Office Integration</li>
	<li>Calendar/Chat/Mail/Message Boards/Polls</li>
	<li>Wiki (supports Creole as well as MediaWiki syntax)</li>
	<li>Alerts and Announcements</li>
	<li>Knowledge Base</li>
	<li>Social Equity</li>
	<li>Can create multi-language sites.</li>
	<li>Asset Publisher to publish many content items tagged with a specific term at once.</li>
</ul>
<strong>Extensions:</strong> 27 official plugins and 208 community developed plugins.<p></p>

<p><strong>Distinguished Clients:</strong> It is primarily used to power corporate business sites.</p>

<p></p><ul>
	<li><a href="http://developer.cisco.com/" target="_blank">http://developer.cisco.com</a></li>
	<li><a href="http://www.t-mobile.cz/" target="_blank">http://www.t-mobile.cz</a></li>
	<li><a href="http://www.betavine.net/" target="_blank">http://www.betavine.net</a></li>
	<li><a href="http://www.foxchannel.de/" target="_blank">http://www.foxchannel.de/</a></li>
	<li><a href="http://www.ixarm.com/" target="_blank">http://www.ixarm.com</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>It is mostly used by enterprise companies rather than for powering personal or community sites, though it has social and collaboration features.</li>
	<li>More a project driven by professional developers (<em>backed by Liferay Inc.</em>) than a community-driven one.</li>
	<li>A paid Enterprise edition is available.</li>
	<li>All features are available in the Community edition (<em>except for those related to support and customization</em>).</li>
</ul>
<h2>#8: TYPO3</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://typo3.org/" target="_blank"><img title="TYPO3" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/typo3.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site:</strong> <a href="http://typo3.org/" target="_blank">http://typo3.org/</a></p>

<p><strong>TYPO3</strong> is a free and open source CMS released under the GNU General Public License and oriented to small to mid-size enterprise-class users. TemplaVoila is an alternative template engine extension for TYPO3. It includes a graphical mapping tool for creating templates, an alternative page module, the ability to create flexible content elements, and an API for developers. New content element types can be created without programming. TemplaVoila offers more flexibility for maintaining web pages than TYPO3's standard templating, while making it possible to enforce a strict corporate design and allowing editors to work with content more intuitively.</p>

<p><strong>Weekly downloads:</strong> 7,461 (ranked #7 after Liferay and before eZ Publish).</p>

<p><strong>Installations:</strong> 122 according to the survey (#6 after Liferay and before Tiki), 0.6% of the Alexa Top 1 million sites (#4 after Drupal and before DotNetNuke).</p>

<p><strong>Brand Familiarity:</strong> #5 (after DotNetNuke and before OpenCMS).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Supports MySQL, Oracle, MS-SQL, PostgreSQL, ODBC, LDAP - virtually any external data source.</li>
	<li>You can undo any change you make on the site.</li>
	<li>Can create multiple sites with multiple domains for each.</li>
	<li>Can have multiple templates per site.</li>
	<li>User management.</li>
	<li>Able to switch from administrator user to general user to check permissions.</li>
	<li>Sandbox: administrators can set up a section within the system to test new features without disturbing the main site.</li>
	<li>Versioning of content pages.</li>
	<li>Advanced caching: template, navigation or page level.</li>
	<li>Link management.</li>
	<li>Multi-language content.</li>
	<li>Search Engine friendly URLs.</li>
</ul>
<strong>Extensions:</strong> more than 4,500 pluggable extensions are available for TYPO3.<p></p>

<p><strong>Distinguished clients:</strong></p>

<p></p><ul>
	<li><a href="http://www.volkswagen-ir.de/" target="_blank">http://www.volkswagen-ir.de</a></li>
	<li><a href="http://www.eadssecureuk.com/" target="_blank">http://www.eadssecureuk.com</a></li>
	<li><a href="http://www.geveriwise-eu.com/" target="_blank">http://www.geveriwise-eu.com</a></li>
	<li><a href="http://www.rewe-xxl.de/" target="_blank">http://www.rewe-xxl.de</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>The administrator's UI (User Interface) is not user-friendly, though intuitive and functional.</li>
	<li>It is extremely powerful and provides more enterprise-level features, but the learning curve is <em>incredibly</em> steep.</li>
	<li>Great ease of multi-lingual site management.</li>
	<li>Difficult to customize templates. Need to learn TypoScript and TemplaVoila, two TYPO3-specific systems.</li>
</ul>
<h2>#9: CMS Made Simple</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://www.cmsmadesimple.org/" target="_blank"><img title="CMS Made Simple" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/cms-made-simple.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://www.cmsmadesimple.org/" target="_blank">http://www.cmsmadesimple.org/</a></p>

<p>CMS Made Simple is an open source CMS built using PHP with support for MySQL and PostgreSQL. The template system is driven by the Smarty Template Engine.</p>

<p><strong>Weekly downloads:</strong> 9,948 (ranked #5 after DotNetNuke and before Liferay).</p>

<p><strong>Installations:</strong> 72 according to the survey (#8 after Tiki and before Alfresco), 0.1% of the Alexa Top 1 million sites (#8 after Xoops and before eZ Publish).</p>

<p><strong>Brand Familiarity:</strong> #12 (after Xoops and before eZ Publish).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Officially supports MySQL and PostgreSQL.</li>
	<li>Search Engine Friendly URLs.</li>
	<li>User management and a group-based permission system.</li>
	<li>Content tagging.</li>
	<li>Site localization is available in 20 languages.</li>
</ul>
<strong>Extensions:</strong> unknown, but many.<p></p>

<p><strong>What users say:</strong></p>

<p></p><ul>
	<li>It is oriented more toward professional users.</li>
	<li>It does not provide many templates, so coding knowledge is a must.</li>
</ul>
<h2><strong>#10: Movable Type</strong></h2><p></p>

<p></p><p style="text-align: center;"><strong><a href="http://movabletype.org/" target="_blank"><img title="MOVABLETYPE" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/movabletype.png" width="701" height="325" editor_component="image_link"/></a> </strong></p><p></p>

<p><strong>Official site: </strong><a href="http://movabletype.org/" target="_blank">http://movabletype.org/</a></p>

<p><strong>Movable Type</strong> is a weblog publishing system, similar to WordPress, developed by the Six Apart company in the Perl programming language. At various times, this company maintained three other publishing systems - TypePad, Vox, and LiveJournal. Movable Type was publicly announced on September 3, 2001. Version 1.0 was publicly released on October 8, 2001, which makes it an older blogging system than WordPress. On 12 December 2007, Movable Type was relicensed as free software under the GNU General Public License. Judging by its list of customers, Movable Type is quite a credible CMS.</p>

<p><strong>Weekly downloads:</strong> unavailable.</p>

<p><strong>Installations:</strong> 30 according to the survey (#17 after Silverstripe and before OpenCMS), 0.1% of the Alexa Top 1 million sites (#6 after DotNetNuke and before Xoops).</p>

<p><strong>Brand Familiarity:</strong> #8 (after Plone and before Alfresco).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Convenient blogging system with social community features.</li>
	<li>Since version 5 officially supports only MySQL. PostgreSQL and SQLite can be used via plugins. Databases such as Oracle can be integrated with Movable Type Enterprise edition.</li>
	<li>Manage user roles. OpenID support.</li>
	<li>Multiple site hosting.</li>
	<li>Easily customizable templates.</li>
	<li>Revision history.</li>
	<li>Can add custom fields.</li>
	<li>Content tags and categories.</li>
	<li>Feeds and trackback links.</li>
	<li>Localization and internationalization support.</li>
	<li>Can generate static pages (updated whenever the content of the site is changed).</li>
</ul>
<strong>Extensions:</strong> about 1,000 plugins.<p></p>

<p><strong>Distinguished clients:</strong></p>

<p></p><ul>
	<li><a href="http://www.britneyspears.com/" target="_blank">http://www.britneyspears.com/</a></li>
	<li><a href="http://www.barackobama.com/" target="_blank">http://www.barackobama.com/</a></li>
	<li><a href="http://boeingblogs.com/" target="_blank">http://boeingblogs.com/</a></li>
	<li><a href="http://blogs.oracle.com/" target="_blank">http://blogs.oracle.com/</a></li>
	<li><a href="http://gehealthcare.typepad.com/" target="_blank">http://gehealthcare.typepad.com/</a></li>
	<li><a href="http://www.spd.org/" target="_blank">http://www.spd.org/</a></li>
	<li><a href="http://blog.ted.com/" target="_blank">http://blog.ted.com/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>Users often deploy Movable Type for blogs, newspapers, and other kinds of publishing sites.</li>
	<li>A powerful alternative to WordPress.</li>
</ul>
<h2>#11: Plone</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://plone.org/" target="_blank"><img title="Plone" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/plone.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://plone.org/" target="_blank">http://plone.org/</a></p>

<p><strong>Plone</strong> is a free and open source CMS started in 1999 by Alexander Limi, Alan Runyan, and Vidar Andersen. It was built as a usability layer on top of the Zope content management framework, so Plone is written in Python. The first version was released in 2001. Plone 2.0 followed in 2004, bringing more customizable features and enhanced add-on functions. Plone 3, released in 2007, brought inline editing, an upgraded visual editor, and strengthened security, among many other enhancements. Plone 4 was released in 2010 with major performance improvements.</p>

<p><strong>Weekly downloads:</strong> unavailable.</p>

<p><strong>Installations:</strong> 34 according to the survey (#15 after e107 and before Silverstripe), 0.1% of the Alexa Top 1 million sites (#10 after eZ Publish).</p>

<p><strong>Brand Familiarity:</strong> #7 (after OpenCMS and before Movable Type).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Inline editing - no need to reload the page for editing.</li>
	<li>Localized into 40 languages.</li>
	<li>Plone can be integrated with Active Directory, Salesforce, LDAP, SQL, Web Services, and Oracle.</li>
	<li>Working Copy support - keep the old version of a content item published until you publish a new version.</li>
	<li>Cut/copy/paste operations on content.</li>
	<li>Link and reference integrity checking - no more broken links within your site.</li>
	<li>Can create workflows - useful for organizations.</li>
	<li>LiveSearch - instant site search powered by AJAX.</li>
	<li>Full-text indexing of Word and PDF documents.</li>
	<li>Wiki support</li>
	<li>Collaboration and sharing</li>
	<li>Automatic locking and unlocking</li>
	<li>Versioning, history and reverting content</li>
	<li>Authentication back-end</li>
	<li>Collections</li>
	<li>Multilingual content management</li>
	<li>Automatic previous/next navigation</li>
	<li>Human-readable URLs</li>
	<li>Caching proxy integration</li>
	<li>Drag and drop reordering of content</li>
	<li>Adjustable templates on content</li>
	<li>RSS feed support</li>
	<li>Automatic image scaling and thumbnail generation</li>
	<li>Comment capabilities on any content</li>
	<li>WebDAV and FTP support</li>
	<li>Hot-backup support</li>
</ul>
<strong>Extensions:</strong> 1,490 add-ons.<p></p>

<p><strong>Distinguished clients:</strong></p>

<p></p><ul>
	<li><a href="http://www.amnesty.ch/en" target="_blank">http://www.amnesty.ch/en</a></li>
	<li><a href="http://www.brasil.gov.br/" target="_blank">http://www.brasil.gov.br/</a></li>
	<li><a href="http://www.chicagohistory.org/" target="_blank">http://www.chicagohistory.org/</a></li>
	<li><a href="http://ccnmtl.columbia.edu/" target="_blank">http://ccnmtl.columbia.edu/</a></li>
	<li><a href="http://cnx.org/" target="_blank">http://cnx.org/</a></li>
	<li><a href="http://discovermagazine.com/" target="_blank">http://discovermagazine.com/</a></li>
	<li><a href="http://www.ece.rice.edu/" target="_blank">http://www.ece.rice.edu/</a></li>
	<li><a href="http://www.engagemedia.org/" target="_blank">http://www.engagemedia.org/</a></li>
	<li><a href="http://science.nasa.gov/" target="_blank">http://science.nasa.gov/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>It has a lot of features (<em>several times more than listed above</em>) built-in.</li>
	<li>Becoming a Plone expert takes a long time and is expensive. It has a steep learning curve.</li>
	<li>Plone is backed by the Zope framework, which is very powerful, with support for caching, rollback, etc. - everything your organization might need.</li>
	<li>Deep customization of Plone is really complex.</li>
	<li>Plone is known for its high security.</li>
	<li>Plone runs slow if you don't know how to optimize it.</li>
</ul>
<h2>#12: eZ Publish</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://ez.no/" target="_blank"><img title="eZ Publish" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/ez.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://ez.no/" target="_blank">http://ez.no/</a></p>

<p><strong>eZ Publish</strong> is an open source enterprise CMS written in PHP, developed by the Norwegian company eZ Systems in 1999. It is freely available under the GPL license, as well as under proprietary licenses that include commercial support. eZ Publish supports the development of customized web applications. Typical applications range from a personal homepage to a multilingual corporate website, including role-based multi-user access, e-commerce functions and online communities.</p>

<p><strong>Weekly downloads:</strong> 7,031 (ranked #8 after TYPO3 and before Alfresco).</p>

<p><strong>Installations:</strong> 60 according to the survey (#11 after Concrete5 and before MODx), 0.1% of the Alexa Top 1 million sites (#9 after CMS Made Simple and before Plone).</p>

<p><strong>Brand Familiarity:</strong> #13 (after CMS Made Simple and before MODx).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>The Advanced Search feature is available only in the Enterprise Edition.</li>
	<li>Online image editor.</li>
	<li>Can create building blocks for the site and later reuse them in several pages.</li>
</ul>
<strong>Distinguished clients:</strong><p></p>

<p></p><ul>
	<li><a href="http://www.schemexpert.com/" target="_blank">http://www.schemexpert.com/</a></li>
	<li><a href="http://www.ticotimes.net/" target="_blank">http://www.ticotimes.net/</a></li>
	<li><a href="http://jp.wsj.com/" target="_blank">http://jp.wsj.com/</a></li>
	<li><a href="http://www.laborange.fr/" target="_blank">http://www.laborange.fr/</a></li>
	<li><a href="http://canalstreet.canalplus.fr/" target="_blank">http://canalstreet.canalplus.fr/</a></li>
	<li><a href="http://www.hks.harvard.edu/" target="_blank">http://www.hks.harvard.edu/</a></li>
	<li><a href="http://www.vogue.com.au/" target="_blank">http://www.vogue.com.au/</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>The documentation is a bit limited; in particular, there is almost no documentation for module development.</li>
</ul>
<h2>#13: Concrete5</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://www.concrete5.org/" target="_blank"><img title="concrete5" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/concrete5.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://www.concrete5.org/" target="_blank">http://www.concrete5.org/</a></p>

<p><strong>Concrete5</strong> is an open source CMS that started in 2003 as a rapid-design approach to building the now-defunct LewisAndClark200.org, the official site for the Ad Council's National Council for the Lewis &amp; Clark Bicentennial. Concrete5 is developed in PHP and is distributed under the MIT software license.</p>

<p><strong>Weekly downloads:</strong> unavailable.</p>

<p><strong>Installations:</strong> 62 according to the survey (#10 after Alfresco and before eZ Publish), less than 0.1% of the Alexa Top 1 million sites.</p>

<p><strong>Brand Familiarity:</strong> #20 (after TextPattern).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Integrated server side caching.</li>
	<li>Support for only MySQL.</li>
	<li>Inline content editing.</li>
	<li>Image editing tool.</li>
	<li>Editable areas are defined in concrete5 templates which allow editors to insert 'blocks' of content. Additional blocks are available as add-ons.</li>
	<li>Automatic upgrade is available.</li>
	<li>Advanced Permissions to track content versions.</li>
</ul>
<strong>Distinguished clients:</strong><p></p>

<p></p><ul>
	<li><a href="http://www.genco.com/" target="_blank">http://www.genco.com</a></li>
	<li><a href="http://www.cottonfrombluetogreen.org/" target="_blank">http://www.cottonfrombluetogreen.org/</a></li>
	<li><a href="http://www.cs.uh.edu/" target="_blank">http://www.cs.uh.edu/</a></li>
	<li><a href="http://www.signals.ca/" target="_blank">http://www.signals.ca/</a></li>
	<li><a href="http://www.paulraymondgregory.com/" target="_blank">http://www.paulraymondgregory.com</a></li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>It's PHP based and quite new but it's quite a nice layout and it's really natural for new CMS users. You can go from a paper-based sitemap and PSD to a full site structure, ready for data entry, within a day, two at a push.</li>
	<li>Concrete5 is simple, suitable for creating sites quickly.</li>
	<li>Creating templates is very easy with Concrete5.</li>
</ul>
<h2>#14: Alfresco</h2><p></p>

<p></p><p style="text-align: center;"><a href="http://www.alfresco.com/" target="_blank"><img title="Alfresco" src="http://blog.cubrid.org/wp-content/uploads/2011/1-/alfresco.png" width="701" height="325" editor_component="image_link"/></a></p><p></p>

<p><strong>Official site: </strong><a href="http://www.alfresco.com/" target="_blank">http://www.alfresco.com/</a></p>

<p><strong>Alfresco</strong> is an open source enterprise content management system for Microsoft Windows and Unix-like operating systems. Alfresco includes a content repository, an out-of-the-box web portal framework for managing and using standard portal content, a CIFS interface that provides file system compatibility on Microsoft Windows and Unix-like operating systems, a web content management system capable of virtualizing web apps and static sites via Apache Tomcat, Lucene indexing, and jBPM workflow. The Alfresco system is developed using Java technology. John Newton (co-founder of Documentum) and John Powell (a former COO of Business Objects) founded Alfresco Software, Inc. in 2005.</p>

<p><strong>Weekly downloads:</strong> 7,000 (ranked #9 after eZ Publish and before Umbraco).</p>

<p><strong>Installations:</strong> 70 according to the survey (#9 after CMS Made Simple and before Concrete5); the Alexa ranking is not available.</p>

<p><strong>Brand Familiarity:</strong> #9 (after Movable Type and before Tiki).</p>

<p><strong>Major Features:</strong></p>

<p></p><ul>
	<li>Document Management.</li>
	<li>Web Content Management (including full webapp &amp; session virtualization).</li>
	<li>Repository-level versioning (similar to Subversion).</li>
	<li>Records Management, including DoD 5015.2 certification.</li>
	<li>Repository access via CIFS/SMB, FTP, WebDAV, NFS and CMIS.</li>
	<li>jBPM workflow.</li>
	<li>Advanced search with <a title="Lucene" href="http://en.wikipedia.org/wiki/Lucene" target="_blank">Lucene</a>.</li>
	<li>Multi-language support.</li>
	<li>Officially runs on Windows, Linux and Solaris.</li>
	<li>The user interface officially supports Internet Explorer and Firefox.</li>
	<li>Desktop integration with Microsoft Office and OpenOffice.org.</li>
	<li>Clustering support.</li>
	<li>Pluggable authentication: NTLM, LDAP, Kerberos, CAS.</li>
</ul>
<strong>Distinguished clients:</strong> no links are available but numerous case studies can be found on Alfresco home page.<p></p>

<p></p><ul>
	<li>French Air Force</li>
	<li>Harvard Business School Publishing</li>
	<li>Toyota</li>
	<li>Sony Pictures</li>
	<li>Fox</li>
	<li>National Academy of Sciences</li>
	<li>Cisco</li>
</ul>
<strong>What users say:</strong><p></p>

<p></p><ul>
	<li>Alfresco is mostly for enterprises rather than for personal sites.</li>
	<li>Simple to install and use, flexible and open-ended.</li>
	<li>Alfresco is a solution with the broadest range of technical capabilities and the best feedback from users. In addition to demonstrating a promising roadmap for collaboration tools, Alfresco was highly attractive from a cost perspective, compared to the proprietary products offered by other ECM vendors.</li>
	<li>All in one solution for enterprises.</li>
</ul>
<h2>See also</h2><p></p>

<p>Besides the official home pages, you can refer to Wikipedia for general information on these content management systems. <a title="Stack Overflow" href="http://stackoverflow.com/" target="_blank">Stack Overflow</a> also provides very informative user feedback on each of these CMSs.</p>

<p></p><ol>
	<li><a href="http://en.wikipedia.org/wiki/Drupal" target="_blank">http://en.wikipedia.org/wiki/Drupal</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Joomla" target="_blank">http://en.wikipedia.org/wiki/Joomla</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Modx" target="_blank">http://en.wikipedia.org/wiki/Modx</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Wordpress" target="_blank">http://en.wikipedia.org/wiki/Wordpress</a></li>
	<li><a href="http://en.wikipedia.org/wiki/DotNetNuke" target="_blank">http://en.wikipedia.org/wiki/DotNetNuke</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Umbraco" target="_blank">http://en.wikipedia.org/wiki/Umbraco</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Liferay" target="_blank">http://en.wikipedia.org/wiki/Liferay</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Typo3" target="_blank">http://en.wikipedia.org/wiki/Typo3</a></li>
	<li><a href="http://en.wikipedia.org/wiki/CMS_Made_Simple" target="_blank">http://en.wikipedia.org/wiki/CMS_Made_Simple</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Movable_Type" target="_blank">http://en.wikipedia.org/wiki/Movable_Type</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Plone_(software)" target="_blank">http://en.wikipedia.org/wiki/Plone_(software)</a></li>
	<li><a href="http://en.wikipedia.org/wiki/EZ_Publish" target="_blank">http://en.wikipedia.org/wiki/EZ_Publish</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Concrete5" target="_blank">http://en.wikipedia.org/wiki/Concrete5</a></li>
	<li><a href="http://en.wikipedia.org/wiki/Alfresco_(software)" target="_blank">http://en.wikipedia.org/wiki/Alfresco_(software)</a></li>
	<li><a title="OPEN SOURCE CMS MARKET SHARE REPORT" href="http://www.waterandstone.com/sites/default/files/2010%20OSCMS%20Report.pdf" target="_blank">http://www.waterandstone.com/sites/default/files/2010%20OSCMS%20Report.pdf</a></li>
	<li><a href="http://php.opensourcecms.com/" target="_blank">http://php.opensourcecms.com/</a></li>
	<li><a href="http://stackoverflow.com/" target="_blank">http://stackoverflow.com/</a></li>
</ol><p></p>]]></description>
                        <pubDate>Fri, 21 Jan 2011 11:22:43 +0900</pubDate>
                        <category>Wordpress</category>
                        <category>Joomla</category>
                        <category>CMS</category>
                        <category>content management system</category>
                        <category>Web 2.0</category>
                        <category>Drupal</category>
                        <category>Concrete5</category>
                        <category>MODx</category>
                        <category>DotNetNuke</category>
                        <category>Liferay</category>
                        <category>TYPO3</category>
                        <category>CMS Made Simple</category>
                        <category>MOVABLE TYPE</category>
                        <category>Plone</category>
                        <category>eZ Publish</category>
                        <category>Alfresco</category>
                        <category>Umbraco</category>
                                    <slash:comments>10</slash:comments>
                    </item>
        										        <item>
            <title>Our Experience of Creating Large Scale Log Search System Using ElasticSearch</title>
            <dc:creator>Lee Jae Ik</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/our-experience-creating-large-scale-log-search-system-using-elasticsearch/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/our-experience-creating-large-scale-log-search-system-using-elasticsearch/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/our-experience-creating-large-scale-log-search-system-using-elasticsearch/#comment</comments>
                                    <description><![CDATA[<p>At NHN we run a service called NELO (NHN Error Log System) that manages and searches logs pushed to it by various applications and other Web services. The search performance and functionality of NELO2, the second generation of the system, have been significantly improved by adopting ElasticSearch. Today I would like to share our experience at <a href="http://cubrid.org/blog/tags/NHN">NHN</a> of deploying ElasticSearch in a log search system.</p>
<p><a href="http://www.elasticsearch.org">ElasticSearch</a> is a distributed search engine developed by Shay Banon and based on Lucene. Shay and his team have recently released the long-awaited version 0.90. Here is a link to a <a href="http://info.elasticsearch.com/Recorded_0.90_Webinar.html">one-hour recorded webinar</a> where Clinton Gormley, one of the core ElasticSearch developers, explains what's new in ElasticSearch 0.90.</p>
<p>If you are developing a system that requires search functionality, I would recommend ElasticSearch because it is very easy to install and to scale out with additional servers. Since it is a distributed system, ElasticSearch can easily cope with an increase in the volume of data to be searched. At NHN, all logs coming into NELO2 are stored and indexed by ElasticSearch so that search results are available in near real time.</p>
<h2>Features of ElasticSearch</h2>
<p>Let's get started with familiarizing ourselves with the terms widely used in ElasticSearch.&nbsp;For those who are familiar with relational database systems, the following table compares the terms used in relational databases with the terms used in ElasticSearch.</p>
<table border="0">
<caption>Table 1: Comparison of the terms of RDBMS and ElasticSearch.</caption> <thead> 
<tr>
<th>Relational DB</th> <th>ElasticSearch</th>
</tr>
</thead> 
<tbody>
<tr>
<td>Database</td>
<td>Index</td>
</tr>
<tr>
<td>Table</td>
<td>Type</td>
</tr>
<tr>
<td>Row</td>
<td>Document</td>
</tr>
<tr>
<td>Column</td>
<td>Field</td>
</tr>
<tr>
<td>Schema</td>
<td>Mapping</td>
</tr>
<tr>
<td>Index</td>
<td>Everything is indexed</td>
</tr>
<tr>
<td>SQL</td>
<td>Query DSL</td>
</tr>
</tbody>
</table>
<h3>JSON-based Schemaless Storage</h3>
<p>ElasticSearch is a search engine but can be used like a <a href="/blog/tags/NoSQL/">NoSQL</a> store. Since the data model is represented in JSON, both requests and responses are exchanged as JSON documents, and the source documents are also stored as JSON. A schema does not have to be defined in advance: JSON documents are automatically indexed when they are submitted, and number and date types are mapped automatically.</p>
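<p>To give a feel for what automatic mapping means, here is a minimal Python sketch that guesses a field type from a JSON value. It is only an illustration and not ElasticSearch's actual detection logic; the function name <code>guess_field_type</code>, the <code>lineCount</code> field and the single date format it recognizes are assumptions made for this example.</p>

```python
import json
from datetime import datetime


def guess_field_type(value):
    """Guess a field type from a JSON value (illustrative only)."""
    # bool must be tested before int because bool is a subclass of int
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Timestamps such as "2012-12-27T02:02:02" are treated as dates
            datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, dict):
        return "object"
    return "unknown"


# "lineCount" is a hypothetical numeric field added for this example
document = json.loads(
    '{"projectName": "hadoop", "logTime": "2012-12-27T02:02:02", "lineCount": 42}'
)
guessed = {field: guess_field_type(value) for field, value in document.items()}
print(guessed)
```

<p>In this sketch the text field stays a string, the timestamp is recognized as a date and the integer becomes a long, which is the kind of decision ElasticSearch makes for you when no mapping is defined.</p>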
<h3>Multitenancy</h3>
<p>ElasticSearch supports <a href="http://en.wikipedia.org/wiki/Multitenancy">multitenancy</a>. Multiple indexes can be stored in a single ElasticSearch server, and the data of multiple indexes can be searched with a <i>single</i> query. NELO2 stores logs in separate indexes by date. When executing a search, NELO2 sends a single query that targets the indexes of all dates within the scope of the search.</p>
<p style="text-align: center;"><b>Code 1: Multitenancy Example Query.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Store&nbsp;logs&nbsp;in&nbsp;the&nbsp;log-2012-12-26&nbsp;index<br /> curl&nbsp;-XPUT&nbsp;http://localhost:9200/log-2012-12-26/hadoop/1&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-26T14:12:12",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host1.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.StateChange:&nbsp;DIR*&nbsp;NameSystem.completeFile"<br /> }'<br /> <br /> #&nbsp;Store&nbsp;logs&nbsp;in&nbsp;the&nbsp;log-2012-12-27&nbsp;index<br /> curl&nbsp;-XPUT&nbsp;http://localhost:9200/log-2012-12-27/hadoop/1&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> }'<br /> <br /> #&nbsp;Search&nbsp;the&nbsp;log-2012-12-26&nbsp;and&nbsp;log-2012-12-27&nbsp;indexes&nbsp;with&nbsp;a&nbsp;single&nbsp;request<br /> curl&nbsp;-XGET&nbsp;http://localhost:9200/log-2012-12-26,log-2012-12-27/_search</div>
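<p>When the search scope covers many days, the list of indexes is easier to generate than to type. The following Python sketch builds the comma-separated search URL for a date range, assuming the <code>log-YYYY-MM-DD</code> naming used when storing logs in Code 1. The helper names <code>daily_index_names</code> and <code>multi_index_search_url</code> are made up for this example and are not part of NELO2 or ElasticSearch.</p>

```python
from datetime import date, timedelta


def daily_index_names(start, end, prefix="log-"):
    """Return one index name per day from start to end, inclusive."""
    day_count = (end - start).days + 1
    return [
        prefix + (start + timedelta(days=offset)).isoformat()
        for offset in range(day_count)
    ]


def multi_index_search_url(index_names, base_url="http://localhost:9200"):
    """Join the index names with commas to search them in one request."""
    return base_url + "/" + ",".join(index_names) + "/_search"


names = daily_index_names(date(2012, 12, 26), date(2012, 12, 27))
print(multi_index_search_url(names))
# prints http://localhost:9200/log-2012-12-26,log-2012-12-27/_search
```

<p>The resulting URL can be passed to <code>curl -XGET</code> exactly like the hand-written one.</p>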
<h3>Scalability and Flexibility</h3>
<p>ElasticSearch provides excellent scalability and flexibility. Its functionality can be extended through plugins, and plugin support was further improved in the recent 0.90 release. For example, you can change the transport protocol by using the Thrift or Jetty plugin. If you install BigDesk or Head, which are all but essential plugins, you can monitor ElasticSearch. As shown in <b>Code 2</b> below, the number of shards and replicas is configured per index. The number of replicas can be adjusted dynamically, but the number of shards is fixed once an index has been created, so choose an appropriate number of shards up front, taking the number of nodes and future server expansion into account.</p>
<p style="text-align: center;"><b>Code 2: Dynamic Configuration Change Query.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;-XPUT&nbsp;http://localhost:9200/log-2012-12-27/&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"settings":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"number_of_shards":&nbsp;10,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"number_of_replicas":&nbsp;1<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }'</div>
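<p>Changing the number of replicas on an index that already exists goes through the index's <code>_settings</code> endpoint rather than the index URL itself. The Python sketch below only builds such a request and prints it, so it runs without a server. The function name <code>replica_update_request</code> is an assumption of this example, and you should check the endpoint and body against the documentation of the ElasticSearch version you run.</p>

```python
import json
import urllib.request


def replica_update_request(index, replicas, base_url="http://localhost:9200"):
    """Build (but do not send) a request that changes the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return urllib.request.Request(
        url=base_url + "/" + index + "/_settings",
        data=body.encode("utf-8"),
        method="PUT",
    )


request = replica_update_request("log-2012-12-27", 2)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it requires a running node: urllib.request.urlopen(request)
```

<p>There is deliberately no counterpart for <code>number_of_shards</code>, since that value cannot be changed after the index has been created.</p>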
<h3>Distributed Storage</h3>
<p>ElasticSearch is a distributed search engine. It distributes data across multiple shards according to document keys. Each shard is a self-contained Lucene index and has zero or more replicas. ElasticSearch also supports clustering: when a cluster runs, one of the nodes is elected as the master node to manage metadata, and if the master node fails, another node in the cluster automatically becomes the master. Adding nodes is also very easy. A node added to the same network automatically finds the cluster through multicast and joins it. If the nodes are not on the same network, the master node address must be specified through unicast (see a related video: <a href="http://youtu.be/l4ReamjCxHo">http://youtu.be/l4ReamjCxHo</a>).</p>
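<p>The idea of distributing documents over shards by key can be sketched in a few lines of Python: hash the key and take the remainder of dividing by the number of shards. This is a simplified model, not ElasticSearch's implementation. The hash function (<code>zlib.crc32</code>) is a stand-in chosen for this example, and the name <code>pick_shard</code> is hypothetical.</p>

```python
import zlib


def pick_shard(routing_key, number_of_shards):
    """Map a routing key to a shard number between 0 and number_of_shards - 1."""
    # zlib.crc32 is a stand-in hash chosen for this sketch
    key_hash = zlib.crc32(routing_key.encode("utf-8"))
    return key_hash % number_of_shards


document_ids = ["1", "2", "3", "4", "5", "6"]
with_5_shards = {doc_id: pick_shard(doc_id, 5) for doc_id in document_ids}
with_6_shards = {doc_id: pick_shard(doc_id, 6) for doc_id in document_ids}
print(with_5_shards)
print(with_6_shards)
```

<p>Because the shard number depends on the shard count, the same key can land on a different shard when the count changes. That is one way to see why the number of shards has to stay fixed for the lifetime of an index, while replicas, which are only copies of a shard, can be added or removed freely.</p>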
<h2>Installing</h2>
<h3>Quick Start</h3>
<p>ElasticSearch supports zero-configuration installation. As shown in the following code snippets, all you have to do to run it is download the archive from the official homepage and extract it.</p>
<ol>
<li>Download<br />
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">~$&nbsp;wget&nbsp;http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.20.1.tar.gz<br /> ~$&nbsp;tar&nbsp;xvzf&nbsp;elasticsearch-0.20.1.tar.gz</div>
</li>
<li>Start the server<br />
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">~$&nbsp;cd&nbsp;elasticsearch-0.20.1<br /> ~/elasticsearch-0.20.1$&nbsp;bin/elasticsearch&nbsp;-f</div>
</li>
</ol>
<h3>Installing Plugins</h3>
<p>You can easily extend the functionality of ElasticSearch through plugins. You can add management functionality, change the Lucene analyzer, and change the default transport module from Netty to Jetty. The following are the commands we use to install plugins for NELO2. Head and BigDesk, installed by the first and second lines, are the plugins needed for ElasticSearch monitoring. We strongly recommend installing them and exploring what they offer. After installing them, visit <a href="http://localhost:9200/plugin/head/">http://localhost:9200/plugin/head/</a> and <a href="http://localhost:9200/plugin/bigdesk/">http://localhost:9200/plugin/bigdesk/</a> to see the status of ElasticSearch in your Web browser.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">bin/plugin&nbsp;-install&nbsp;Aconex/elasticsearch-head<br /> bin/plugin&nbsp;-install&nbsp;lukas-vlcek/bigdesk<br /> bin/plugin&nbsp;-install&nbsp;elasticsearch/elasticsearch-transport-thrift/1.4.0<br /> bin/plugin&nbsp;-install&nbsp;sonian/elasticsearch-jetty/0.19.9</div>
<h3>Main Configurations</h3>
<p>You don't need to change any configuration for a simple functionality test. For a performance test or a production service, however, you should change some of the defaults. Go through the following snippet and identify the settings in the initial configuration file that you need to change.</p>
<p style="text-align: center;"><b>Code 5:&nbsp;Main Configurations (<i>config/elasticsearch.yml</i>).</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;As&nbsp;it&nbsp;is&nbsp;a&nbsp;name&nbsp;used&nbsp;to&nbsp;identify&nbsp;clusters,&nbsp;use&nbsp;a&nbsp;name&nbsp;with&nbsp;uniqueness&nbsp;and&nbsp;a&nbsp;meaning.<br /> cluster.name:&nbsp;ElasticSearch-nelo2<br /> <br /> #&nbsp;A&nbsp;node&nbsp;name&nbsp;is&nbsp;automatically&nbsp;created&nbsp;but&nbsp;it&nbsp;is&nbsp;recommended&nbsp;to&nbsp;use&nbsp;a&nbsp;name&nbsp;that&nbsp;is&nbsp;discernible&nbsp;in&nbsp;a&nbsp;cluster&nbsp;like&nbsp;a&nbsp;host&nbsp;name.<br /> node.name:&nbsp;"xElasticSearch01.nelo2"<br /> <br /> #&nbsp;The&nbsp;default&nbsp;value&nbsp;of&nbsp;the&nbsp;following&nbsp;two&nbsp;is&nbsp;all&nbsp;true.&nbsp;node.master&nbsp;sets&nbsp;whether&nbsp;the&nbsp;node&nbsp;can&nbsp;be&nbsp;the&nbsp;master,&nbsp;while&nbsp;node.data&nbsp;is&nbsp;a&nbsp;configuration&nbsp;for&nbsp;whether&nbsp;it&nbsp;is&nbsp;a&nbsp;node&nbsp;to&nbsp;store&nbsp;data.&nbsp;Usually&nbsp;you&nbsp;need&nbsp;to&nbsp;set&nbsp;the&nbsp;two&nbsp;values&nbsp;as&nbsp;true,&nbsp;and&nbsp;if&nbsp;the&nbsp;size&nbsp;of&nbsp;a&nbsp;cluster&nbsp;is&nbsp;big,&nbsp;you&nbsp;should&nbsp;adjust&nbsp;this&nbsp;value&nbsp;by&nbsp;node&nbsp;to&nbsp;configure&nbsp;three&nbsp;types&nbsp;of&nbsp;node.&nbsp;More&nbsp;details&nbsp;will&nbsp;be&nbsp;explained&nbsp;in&nbsp;the&nbsp;account&nbsp;of&nbsp;topologies&nbsp;configuration&nbsp;later.<br /> node.master:&nbsp;true<br /> node.data:&nbsp;true<br /> <br /> #&nbsp;You&nbsp;can&nbsp;change&nbsp;the&nbsp;number&nbsp;of&nbsp;shards&nbsp;and&nbsp;replicas.&nbsp;The&nbsp;following&nbsp;value&nbsp;is&nbsp;a&nbsp;default&nbsp;value:&nbsp;<br /> index.number_of_shards:&nbsp;5<br /> index.number_of_replicas:&nbsp;1<br /> <br /> 
#To&nbsp;prevent&nbsp;jvm&nbsp;swapping,&nbsp;you&nbsp;should&nbsp;set&nbsp;the&nbsp;following&nbsp;value&nbsp;as&nbsp;true:<br /> bootstrap.mlockall:&nbsp;true<br /> <br /> #&nbsp;It&nbsp;is&nbsp;a&nbsp;timeout&nbsp;value&nbsp;for&nbsp;checking&nbsp;the&nbsp;status&nbsp;of&nbsp;each&nbsp;node&nbsp;in&nbsp;a&nbsp;cluster.&nbsp;You&nbsp;should&nbsp;set&nbsp;an&nbsp;appropriate&nbsp;value;&nbsp;if&nbsp;the&nbsp;value&nbsp;is&nbsp;too&nbsp;small,&nbsp;nodes&nbsp;may&nbsp;frequently&nbsp;get&nbsp;out&nbsp;of&nbsp;a&nbsp;cluster.&nbsp;The&nbsp;default&nbsp;value&nbsp;is&nbsp;3&nbsp;seconds.<br /> discovery.zen.ping.timeout:&nbsp;10s<br /> <br /> #&nbsp;The&nbsp;default&nbsp;value&nbsp;is&nbsp;multicast,&nbsp;but&nbsp;in&nbsp;an&nbsp;actual&nbsp;environment,&nbsp;unicast&nbsp;should&nbsp;be&nbsp;employed&nbsp;due&nbsp;to&nbsp;the&nbsp;possibility&nbsp;of&nbsp;overlapping&nbsp;with&nbsp;other&nbsp;clusters.&nbsp;It&nbsp;is&nbsp;recommended&nbsp;to&nbsp;list&nbsp;servers&nbsp;that&nbsp;can&nbsp;be&nbsp;a&nbsp;master&nbsp;in&nbsp;the&nbsp;second&nbsp;setting.<br /> discovery.zen.ping.multicast.enabled:&nbsp;false<br /> discovery.zen.ping.unicast.hosts:&nbsp;["host1",&nbsp;"host2:port",&nbsp;"host3[portX-portY]"]</div>
<h2>Using REST API</h2>
<p>ElasticSearch provides a REST API, as shown below. Most of its functionality is available through this API, including creating and deleting indexes, managing mappings, searching, and changing settings. In addition to the REST API, it provides client APIs for Java, Python and Ruby.</p>
<p style="text-align: center;"><b>Code 6:&nbsp;REST API Format in ES.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">http://host:port/(index)/(type)/(action|id)</div>
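<p>The URL format above is regular enough to assemble programmatically. Here is a small Python sketch that builds such URLs and simply leaves out the parts that are not given. The function <code>rest_url</code> and its parameter names are assumptions of this example, not part of any ElasticSearch client.</p>

```python
def rest_url(index=None, doc_type=None, action_or_id=None,
             host="localhost", port=9200):
    """Build http://host:port/(index)/(type)/(action|id), skipping empty parts."""
    parts = [part for part in (index, doc_type, action_or_id) if part]
    return "http://{}:{}/{}".format(host, port, "/".join(parts))


print(rest_url("log-2012-12-27", "hadoop", "1"))
print(rest_url("log-2012-12-27", "hadoop", "_search"))
print(rest_url("log-2012-12-27", action_or_id="_search"))
print(rest_url(action_or_id="_search"))
```

<p>Dropping the type searches a whole index, and dropping the index as well searches everything on the node, which is how the narrower and wider search scopes are expressed.</p>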
<p>As mentioned earlier, NELO2 separates indexes (<i>databases</i> in RDBMS terms) by date and types (<i>tables</i>) by project. <b>Code 7</b> below shows how the REST API is used to manage, one <b>document</b> at a time, the logs that came into the <b>hadoop</b> project on <b>December 27, 2012</b>.</p>
<p style="text-align: center;"><b>Code 7:&nbsp;An Example of Using ElasticSearch REST API.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Creating,&nbsp;retrieving,&nbsp;and&nbsp;deleting&nbsp;a&nbsp;document<br /> curl&nbsp;-XPUT&nbsp;http://localhost:9200/log-2012-12-27/hadoop/1<br /> curl&nbsp;-XGET&nbsp;http://localhost:9200/log-2012-12-27/hadoop/1<br /> curl&nbsp;-XDELETE&nbsp;http://localhost:9200/log-2012-12-27/hadoop/1<br /> <br /> #&nbsp;Searching&nbsp;a&nbsp;type,&nbsp;an&nbsp;index,&nbsp;or&nbsp;all&nbsp;indexes<br /> curl&nbsp;-XGET&nbsp;http://localhost:9200/log-2012-12-27/hadoop/_search<br /> curl&nbsp;-XGET&nbsp;http://localhost:9200/log-2012-12-27/_search<br /> curl&nbsp;-XGET&nbsp;http://localhost:9200/_search<br /> <br /> #&nbsp;Checking&nbsp;the&nbsp;status&nbsp;of&nbsp;an&nbsp;index<br /> curl&nbsp;-XGET&nbsp;http://localhost:9200/log-2012-12-27/_status</div>
<h3>Creating Documents and Indexes</h3>
<p>As shown in <b>Code 8</b> below, when the request is sent, ElasticSearch automatically creates the <b>log-2012-12-27</b> index and the <b>hadoop</b> type even though neither was defined in advance. If you want to create them explicitly instead of relying on automatic creation, set <code>action.auto_create_index</code> and <code>index.mapper.dynamic</code> to <code>false</code> in the configuration file.</p>
<p style="text-align: center;"><b>Code 8:&nbsp;Creating Documents.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Request<br /> curl&nbsp;-XPUT&nbsp;http://localhost:9200/log-2012-12-27/hadoop/1&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> }'<br /> <br /> #&nbsp;Result<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"ok":&nbsp;true,<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_index":&nbsp;"log-2012-12-27",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_type":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_id":&nbsp;"1",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_version":&nbsp;1<br /> }</div>
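<p>For the explicit-creation route mentioned above, a minimal sketch might look like the following. It assumes the two settings go into <code>config/elasticsearch.yml</code> (the file location is an assumption about a default installation) and that the index is then created by hand with an empty <code>PUT</code> request before any document is sent.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;config/elasticsearch.yml&nbsp;(assumed&nbsp;default&nbsp;location)<br /> action.auto_create_index:&nbsp;false<br /> index.mapper.dynamic:&nbsp;false<br /> <br /> #&nbsp;After&nbsp;restarting&nbsp;the&nbsp;node,&nbsp;create&nbsp;the&nbsp;index&nbsp;explicitly<br /> curl&nbsp;-XPUT&nbsp;'http://localhost:9200/log-2012-12-27/'</div>
<p>With dynamic mapping disabled, the <b>hadoop</b> type would presumably also need a mapping registered through the Put Mapping API (see the Mapping section) before documents can be indexed.</p>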
<p>As shown in <b>Code 9</b> below, you can also wrap the document body in an object named after its type when making the request.</p>
<p style="text-align: center;"><b>Code 9:&nbsp;A Query Including Type.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">curl&nbsp;-XPUT&nbsp;http://localhost:9200/log-2012-12-27/hadoop/1&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"hadoop":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }'</div>
<p>If the&nbsp;<code>id</code>&nbsp;value is omitted, as in <b>Code 10</b>, an&nbsp;<code>id</code>&nbsp;is generated automatically when the document is created. Note that the&nbsp;<code>POST</code>&nbsp;method is used instead of&nbsp;<code>PUT</code>&nbsp;for this request.</p>
<p style="text-align: center;"><b>Code 10:&nbsp;A Query Creating a Document without an ID.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Request<br /> curl&nbsp;-XPOST&nbsp;http://localhost:9200/log-2012-12-27/hadoop/&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> }'<br /> <br /> #&nbsp;Result<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"ok":&nbsp;true,<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_index":&nbsp;"log-2012-12-27",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_type":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_id":&nbsp;"kgfrarduRk2bKhzrtR-zhQ",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_version":&nbsp;1<br /> }</div>
<h3>Deleting a Document</h3>
<p><b>Code 11</b>&nbsp;below shows how to delete a document (<i>a record</i>&nbsp;in RDBMS terms) from a type (<i>a table</i>). You can delete the <b>hadoop</b> type document with <code>id=1</code> from the <b>log-2012-12-27</b> index by using the <code>DELETE</code> method.</p>
<p style="text-align: center;"><b>Code 11:&nbsp;A Query to Delete a Document.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Request<br /> $&nbsp;curl&nbsp;-XDELETE&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/1'<br /> <br /> #&nbsp;Result<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"ok":&nbsp;true,<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_index":&nbsp;"log-2012-12-27",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_type":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_id":&nbsp;"1",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"found":&nbsp;true<br /> }</div>
<h3>Getting a Document</h3>
<p>You can get the <b>hadoop</b> type document with <code>id=1</code> from the <b>log-2012-12-27</b> index by using the <code>GET</code> method, as shown in <b>Code 12</b>.</p>
<p style="text-align: center;"><b>Code 12:&nbsp;A Query to Get a Document.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#Request<br /> curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/1'<br /> <br /> #&nbsp;Result<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_index":&nbsp;"log-2012-12-27",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_type":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_id":&nbsp;"1",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_source":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }</div>
<h3>Search</h3>
<p>When the Search API is called, ElasticSearch runs the query and returns the documents that match it. <b>Code 13</b> shows examples of using the Search API.</p>
<p style="text-align: center;"><b>Code 13:&nbsp;An Example Query of Using Search API.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;All&nbsp;types&nbsp;of&nbsp;a&nbsp;specific&nbsp;index<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/_search?q=host:host2.nelo2'<br /> <br /> #&nbsp;Specific&nbsp;types&nbsp;of&nbsp;a&nbsp;specific&nbsp;index<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/hadoop,apache/_search?q=host:host2.nelo2'<br /> <br /> #&nbsp;A&nbsp;specific&nbsp;type&nbsp;of&nbsp;all&nbsp;indexes<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/_all/hadoop/_search?q=host:host2.nelo2'<br /> <br /> #&nbsp;All&nbsp;indexes&nbsp;and&nbsp;types<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/_search?q=host:host2.nelo2'</div>
<h3>Search API by Using URI Request</h3>
<table border="0">
<caption>Table 2: Main Parameters.</caption>
<thead>
<tr>
<th>Name</th> <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>q</td>
<td>Query string.</td>
</tr>
<tr>
<td>default_operator</td>
<td>The operator used as a default (<code>AND</code> or <code>OR</code>). The default is <code>OR</code>.</td>
</tr>
<tr>
<td>fields</td>
<td>The fields to return in the result. The default is the "_source" field.</td>
</tr>
<tr>
<td>sort</td>
<td>Sort order, for example <code>fieldName:asc</code> or <code>fieldName:desc</code>.</td>
</tr>
<tr>
<td>timeout</td>
<td>Search timeout value. The default is no timeout.</td>
</tr>
<tr>
<td>size</td>
<td>The number of hits to return. The default is 10.</td>
</tr>
</tbody>
</table>
<p>With a URI request, you can search simply by combining the parameters in <b>Table 2</b> with a query string. Because a URI request does not expose every search option, it is best suited to quick tests.</p>
<p style="text-align: center;"><b>Code 14:&nbsp;Search Query by Using URI Request.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Request<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_search?q=host:host2.nelo2'<br /> <br /> #&nbsp;Result<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_shards":{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"total":&nbsp;5,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"successful":&nbsp;5,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"failed":&nbsp;0<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> &nbsp;&nbsp;&nbsp;&nbsp;"hits":{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"total":&nbsp;1,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"hits":&nbsp;[<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_index":&nbsp;"log-2012-12-27",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_type":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_id":&nbsp;"1",&nbsp;<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_source":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> 
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }</div>
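<p>The parameters in <b>Table 2</b> can be combined in a single URI. The sketch below is only illustrative: it reuses the index, type, and field names from the earlier examples, and the particular values (<code>size=5</code>, sorting on <code>logTime</code>, <code>AND</code> as the default operator) are arbitrary choices. The URL is quoted so that the shell does not interpret the <code>&amp;</code> characters.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Return&nbsp;the&nbsp;5&nbsp;most&nbsp;recent&nbsp;matching&nbsp;logs&nbsp;(values&nbsp;are&nbsp;illustrative)<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_search?q=host:host2.nelo2&amp;default_operator=AND&amp;sort=logTime:desc&amp;size=5'</div>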
<h3>Search API by Using Request Body</h3>
<p>When you send the search in the HTTP request body, you write it in the query DSL. The query DSL is extensive, so refer to the guide on the <a href="http://www.elasticsearch.org/guide/reference/query-dsl/">official website</a> for details.</p>
<p style="text-align: center;"><b>Code 15:&nbsp;Search by Using Query DSL.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Request<br /> $&nbsp;curl&nbsp;-XPOST&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_search'&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"query":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"term":&nbsp;{&nbsp;"host":&nbsp;"host2.nelo2"&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }'<br /> <br /> #&nbsp;Result<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"_shards":{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"total":&nbsp;5,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"successful":&nbsp;5,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"failed":&nbsp;0<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> &nbsp;&nbsp;&nbsp;&nbsp;"hits":{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"total":&nbsp;1,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"hits":&nbsp;[<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_index":&nbsp;"log-2012-12-27",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_type":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_id":&nbsp;"1",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"_source":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;"hadoop-log",<br 
/> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;"namenode",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logTime":"2012-12-27T02:02:02",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }</div>
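<p>The request body can also combine several conditions. The following is a hedged sketch rather than a query taken from NELO2: it assumes the same field names as above and uses a <code>bool</code> query to require both an exact <code>host</code> match and a <code>logTime</code> within one day. The time range, <code>size</code>, and sort order are made-up values for illustration.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Both&nbsp;conditions&nbsp;under&nbsp;"must"&nbsp;have&nbsp;to&nbsp;match&nbsp;(illustrative&nbsp;values)<br /> $&nbsp;curl&nbsp;-XPOST&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_search'&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"query":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"bool":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"must":&nbsp;[<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;"term":&nbsp;{&nbsp;"host":&nbsp;"host2.nelo2"&nbsp;}&nbsp;},<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;{&nbsp;"range":&nbsp;{&nbsp;"logTime":&nbsp;{&nbsp;"from":&nbsp;"2012-12-27T00:00:00",&nbsp;"to":&nbsp;"2012-12-27T23:59:59"&nbsp;}&nbsp;}&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;]<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;},<br /> &nbsp;&nbsp;&nbsp;&nbsp;"sort":&nbsp;[&nbsp;{&nbsp;"logTime":&nbsp;"desc"&nbsp;}&nbsp;],<br /> &nbsp;&nbsp;&nbsp;&nbsp;"size":&nbsp;5<br /> }'</div>
<p>A <code>term</code> query matches the exact indexed term, so this form works most predictably when the field is mapped as <code>not_analyzed</code>.</p>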
<h2>Mapping</h2>
<h3>Put Mapping API</h3>
<p>To add a mapping to a specific type, you can define a mapping in the form shown in <b>Code 16</b>.</p>
<p style="text-align: center;"><b>Code 16:&nbsp;Query to Register a Mapping.</b></p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;-XPUT&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_mapping'&nbsp;-d&nbsp;'<br /> {<br /> &nbsp;&nbsp;&nbsp;&nbsp;"hadoop":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"properties":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;{"type":&nbsp;"string",&nbsp;"index":&nbsp;"not_analyzed"},<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logType":&nbsp;{"type":&nbsp;"string",&nbsp;"index":&nbsp;"not_analyzed"},<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logSource":&nbsp;{"type":&nbsp;"string",&nbsp;"index":&nbsp;"not_analyzed"},<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"logTime":&nbsp;{"type":&nbsp;"date"},<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;{"type":&nbsp;"string",&nbsp;"index":&nbsp;"not_analyzed"},<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;{"type":&nbsp;"string"}<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }'</div>
<h3>Get Mapping API</h3>
<p>To get defined mapping information, you can use a query in the form shown in <b>Code 17</b>.</p>
<p style="text-align: center;"><b>Code 17:&nbsp;Query to Get a Mapping.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_mapping'</div>
<h3>Delete Mapping API</h3>
<p><b>Code 18</b> shows an example of deleting a defined mapping. Note that deleting a mapping also deletes the documents of that type.</p>
<p style="text-align: center;"><b>Code 18:&nbsp;Query to Delete a Mapping.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;-XDELETE&nbsp;'http://localhost:9200/log-2012-12-27/hadoop'</div>
<h2>How to Optimize Performance</h2>
<h3>Memory and the Number of Open Files</h3>
<p>As the amount of data to search grows, you need more memory, and many of the problems you encounter when running ElasticSearch are memory related. The ElasticSearch community recommends that, when a server runs ElasticSearch exclusively, you allocate only half of the memory to ElasticSearch and leave the other half to the OS for the system cache. You can set the memory size through the <code>ES_HEAP_SIZE</code> environment variable or through the <code>-Xms</code> and <code>-Xmx</code> options of the JVM.</p>
<p style="text-align: center;"><b>Code 19:&nbsp;Execution by Specifying Heap Size.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">bin/elasticsearch&nbsp;-Xmx2g&nbsp;-Xms2g</div>
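<p>The environment-variable route can be sketched as follows. The value <code>2g</code> is an assumption that corresponds to half the memory of a hypothetical 4 GB server; adjust it to your own hardware.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Assumed&nbsp;value:&nbsp;half&nbsp;of&nbsp;a&nbsp;4&nbsp;GB&nbsp;machine<br /> export&nbsp;ES_HEAP_SIZE=2g<br /> bin/elasticsearch</div>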
<p><code>OutOfMemory</code> errors are common when running ElasticSearch. They occur when the field cache exceeds the maximum heap size. If you change the <code>index.cache.field.type</code> setting from <b>resident</b> (the default) to <b>soft</b>, the cache is held through <b>soft</b> references, so the cache area is reclaimed preferentially by <a href="/blog/tags/Garbage%20Collection/">GC</a> and the problem can be resolved.</p>
<p style="text-align: center;"><b>Code 20:&nbsp;Configuring Field Cache Type.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">index.cache.field.type:&nbsp;soft</div>
<p>If the amount of data increases, the number of index files also increases. This is because Lucene, which is used by ElasticSearch, manages indexes in units of <i>segments</i>. Sometimes the number of files will even exceed the maximum number of open files allowed. For this reason, you need to raise the open file limit by using the <code>ulimit</code> command. The recommended value is <b>32000-64000</b>, but you may need a larger value depending on the size of the system or the data.</p>
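<p>As a rough sketch, the limit can be checked and raised in the shell that starts ElasticSearch. The value <code>64000</code> is simply the upper end of the range recommended above. Raising the limit beyond the hard limit normally requires root privileges, and on many Linux distributions a permanent change is made in <code>/etc/security/limits.conf</code> instead.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Show&nbsp;the&nbsp;current&nbsp;open&nbsp;file&nbsp;limit&nbsp;for&nbsp;this&nbsp;shell<br /> ulimit&nbsp;-n<br /> <br /> #&nbsp;Raise&nbsp;the&nbsp;limit&nbsp;for&nbsp;this&nbsp;shell&nbsp;session,&nbsp;then&nbsp;start&nbsp;ElasticSearch&nbsp;from&nbsp;it<br /> ulimit&nbsp;-n&nbsp;64000<br /> bin/elasticsearch</div>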
<h3>Index Optimization</h3>
<p>NELO2 manages indexes by date. With date-based indexes, you can easily and quickly delete old logs that no longer need to be kept, as shown in <b>Code 21</b>. This imposes less overhead on the system than deleting logs by specifying a TTL value for each document.</p>
<p style="text-align: center;"><b>Code 21:&nbsp;Deleting an Index.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;-XDELETE&nbsp;'http://localhost:9200/log-2012-10-01/'</div>
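<p>Because the index name contains the date, this kind of cleanup is easy to script. The following is a minimal sketch, not the actual NELO2 job: it assumes a 30-day retention period, the <code>log-YYYY-MM-DD</code> naming used in this article, and GNU <code>date</code> (for the <code>-d</code> option). It could be run once a day, for example from cron.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Assumption:&nbsp;logs&nbsp;older&nbsp;than&nbsp;30&nbsp;days&nbsp;are&nbsp;no&nbsp;longer&nbsp;needed&nbsp;(requires&nbsp;GNU&nbsp;date)<br /> OLD_DATE=$(date&nbsp;-d&nbsp;'30&nbsp;days&nbsp;ago'&nbsp;+%Y-%m-%d)<br /> <br /> #&nbsp;Delete&nbsp;the&nbsp;index&nbsp;that&nbsp;holds&nbsp;the&nbsp;logs&nbsp;of&nbsp;that&nbsp;day<br /> curl&nbsp;-XDELETE&nbsp;"http://localhost:9200/log-${OLD_DATE}/"</div>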
<p>When index optimization is performed, <i>segments</i> are merged, which can improve search performance. Because index optimization can put a heavy load on the system, it is better to run it when the system is less busy.</p>
<p style="text-align: center;"><b>Code 22:&nbsp;Index Optimization.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">$&nbsp;curl&nbsp;-XPOST&nbsp;'http://localhost:9200/log-2012-10-01/_optimize'</div>
<h3>Shards and Replicas</h3>
<p>You can't change the number of shards after setting it. For this reason, you need to decide this value carefully, taking into account both the current number of nodes in the system and the number of nodes you expect to add in the future. For example, if there are 5 nodes and the number is expected to reach 10, it is recommended to set the number of shards to 10 from the beginning. If you set it to 5 in the beginning, you can add 5 more nodes later, but you won't be able to make use of them. Of course, if you set the number of replicas to 1, you can use the 5 added nodes exclusively for replication.</p>
<p>A larger number of shards is more advantageous for processing a large amount of data, because queries are distributed across all of the shards. But you need to set this value appropriately, because performance can deteriorate due to increased traffic if the value is too high.</p>
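<p>Since the shard count is fixed once an index exists, it has to be given when the index is created. A minimal sketch, using the 10 shards and 1 replica from the example above and the index name from earlier sections, might look like this:</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Create&nbsp;the&nbsp;index&nbsp;with&nbsp;explicit&nbsp;shard&nbsp;and&nbsp;replica&nbsp;counts&nbsp;(example&nbsp;values)<br /> $&nbsp;curl&nbsp;-XPUT&nbsp;'http://localhost:9200/log-2012-12-27/'&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"settings":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"index":&nbsp;{<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"number_of_shards":&nbsp;10,<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;"number_of_replicas":&nbsp;1<br /> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;}<br /> &nbsp;&nbsp;&nbsp;&nbsp;}<br /> }'</div>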
<h2>Configuring Cluster Topologies</h2>
<p><b>Code 23</b>&nbsp;below shows the part of the ElasticSearch configuration file that relates to node roles. There are three types of nodes:</p>
<ol>
<li><b>data node</b><br />This node does not act as the master and only stores data. When it receives a request from a client, it searches its shards for data or creates an index.</li>
<li><b>master node</b><br />This node maintains the cluster and sends indexing and search requests to data nodes.</li>
<li><b>search balancer node</b><br />When this node receives a search request, it requests data from the data nodes, gathers the results, and delivers them.</li>
</ol>
<p>A single node can act as both a master and a data node. But if you run the three types of node separately, you can reduce the burden on the data nodes. In addition, configuring the master node separately improves the stability of the cluster. You can also reduce operating costs by using low-spec servers for the master and search nodes.</p>
<p style="text-align: center;"><b>Code 23: Settings Related to Topology.</b></p>
<div editor_component="code_highlighter" code_type="Plain" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;You&nbsp;can&nbsp;exploit&nbsp;these&nbsp;settings&nbsp;to&nbsp;design&nbsp;advanced&nbsp;cluster&nbsp;topologies.<br /> #<br /> #&nbsp;1.&nbsp;You&nbsp;want&nbsp;this&nbsp;node&nbsp;to&nbsp;never&nbsp;become&nbsp;a&nbsp;master&nbsp;node,&nbsp;only&nbsp;to&nbsp;hold&nbsp;data.<br /> #&nbsp;&nbsp;&nbsp;&nbsp;This&nbsp;will&nbsp;be&nbsp;the&nbsp;"workhorse"&nbsp;of&nbsp;your&nbsp;cluster.<br /> #<br /> #&nbsp;node.master:&nbsp;false<br /> #&nbsp;node.data:&nbsp;true<br /> #<br /> #&nbsp;2.&nbsp;You&nbsp;want&nbsp;this&nbsp;node&nbsp;to&nbsp;only&nbsp;serve&nbsp;as&nbsp;a&nbsp;master,&nbsp;to&nbsp;not&nbsp;store&nbsp;any&nbsp;data&nbsp;and<br /> #&nbsp;&nbsp;&nbsp;&nbsp;to&nbsp;have&nbsp;free&nbsp;resources.&nbsp;This&nbsp;will&nbsp;be&nbsp;the&nbsp;"coordinator"&nbsp;of&nbsp;your&nbsp;cluster.<br /> #<br /> #&nbsp;node.master:&nbsp;true<br /> #&nbsp;node.data:&nbsp;false<br /> #<br /> #&nbsp;3.&nbsp;You&nbsp;want&nbsp;this&nbsp;node&nbsp;to&nbsp;be&nbsp;neither&nbsp;a&nbsp;master&nbsp;nor&nbsp;a&nbsp;data&nbsp;node,&nbsp;but<br /> #&nbsp;&nbsp;&nbsp;&nbsp;to&nbsp;act&nbsp;as&nbsp;a&nbsp;"search&nbsp;load&nbsp;balancer"&nbsp;(fetching&nbsp;data&nbsp;from&nbsp;nodes,<br /> #&nbsp;&nbsp;&nbsp;&nbsp;aggregating&nbsp;results,&nbsp;etc.)<br /> #<br /> #&nbsp;node.master:&nbsp;false<br /> #&nbsp;node.data:&nbsp;false</div>
<p><b>Figure 1</b> below shows the topology of NELO2, which uses ElasticSearch. The efficiency of equipment use and the stability of the entire cluster have been improved as follows: only ElasticSearch runs on the 20 data nodes (servers) so that they can achieve sufficient performance, while other daemon processes run alongside ElasticSearch on the 4 master nodes and 3 search nodes.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/202/651/nhn_nelo2_elasticsearch_topologies.png" alt="nhn_nelo2_elasticsearch_topologies.png" width="663" height="348" /></p>
<p style="text-align: center;"><b>Figure 1:&nbsp;NELO2 ElasticSearch Topologies.</b></p>
<h3>Configuring Routing</h3>
<p>When a large amount of data needs to be indexed, increasing the number of shards improves overall performance. On the other hand, as the number of shards increases, the traffic among nodes also goes up. For example, when there are 100 shards, a single search request is sent to all 100 shards and the results are aggregated, which imposes a burden on the entire cluster.</p>
<p>If you use routing, data will be stored only in a specific shard. Even if the number of shards increases, the application will still send a request only to a single shard, and consequently the traffic can be reduced dramatically. <b>Figures 2</b>, <b>3</b>, and <b>4</b> are excerpted from the <a href="http://www.slideshare.net/kucrafal/scaling-massive-elastic-search-clusters-rafa-ku-sematext">slides&nbsp;Rafal Kuc presented at Berlin Buzzwords 2012</a>. If you don't use routing, as shown in <b>Figure 2</b>, the application sends a request to all the shards. But if you use routing, it sends a request only to a specific shard, as shown in <b>Figure 3</b>.</p>
<p>According to the cited material (<b>Figure 4</b>), with 200 shards the response time is more than 10 times faster with routing than without it. When routing is applied, the number of threads increases by 10 to 20 times compared to when it is not applied, but the CPU usage is much lower. In some cases, however, performance is better without routing. For a search query whose results must be collected from multiple shards, it can be more advantageous to send the request to multiple shards. For this reason, NELO2 decides whether to use routing for each project, depending on how the project's logs are used.</p>
<p style="text-align: center;"><img src="/files/attach/images/220547/202/651/nhn_nelo_before_using_routing.png" alt="nhn_nelo_before_using_routing.png" width="470" height="324" /></p>
<p style="text-align: center;"><strong>Figure 2: Before Using Routing.</strong></p>
<p style="text-align: center;"><img src="/files/attach/images/220547/202/651/nhn_nelo2_after_using_routing.png" alt="nhn_nelo2_after_using_routing.png" width="466" height="343" /></p>
<p style="text-align: center;"><strong>Figure 3:&nbsp;After Using Routing.</strong></p>
<p style="text-align: center;"><strong><img src="/files/attach/images/220547/202/651/nhn_nelo2_performance_comparison_before_after_using_routing.png" alt="nhn_nelo2_performance_comparison_before_after_using_routing.png" width="521" height="218" /></strong></p>
<p style="text-align: center;"><b>Figure 4: Performance Comparison before and after Using Routing.</b></p>
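<p>In practice, routing is requested with the <code>routing</code> parameter on both the indexing and the search request. The sketch below is an illustration rather than the NELO2 configuration: using the project name <b>hadoop</b> as the routing value is an assumption made for the example. The same value must be given when searching so that the request goes to the shard that holds the documents.</p>
<div editor_component="code_highlighter" code_type="Bash" first_line="1" collapse="false" nogutter="false" style="border: #666 1px dotted; border-left: #2AE 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;">#&nbsp;Index&nbsp;a&nbsp;document&nbsp;with&nbsp;a&nbsp;routing&nbsp;value&nbsp;(here&nbsp;the&nbsp;project&nbsp;name,&nbsp;an&nbsp;assumption)<br /> $&nbsp;curl&nbsp;-XPOST&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/?routing=hadoop'&nbsp;-d&nbsp;'{<br /> &nbsp;&nbsp;&nbsp;&nbsp;"projectName":&nbsp;"hadoop",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"host":&nbsp;"host2.nelo2",<br /> &nbsp;&nbsp;&nbsp;&nbsp;"body":&nbsp;"org.apache.hadoop.hdfs.server.namenode.FSNamesystem"<br /> }'<br /> <br /> #&nbsp;Search&nbsp;with&nbsp;the&nbsp;same&nbsp;routing&nbsp;value&nbsp;so&nbsp;that&nbsp;only&nbsp;the&nbsp;matching&nbsp;shard&nbsp;is&nbsp;queried<br /> $&nbsp;curl&nbsp;-XGET&nbsp;'http://localhost:9200/log-2012-12-27/hadoop/_search?routing=hadoop&amp;q=host:host2.nelo2'</div>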
<h2>Conclusion</h2>
<p>The number of ElasticSearch users is increasing rapidly, thanks to its easy installation and high scalability. Version 0.90, the latest release, came out only a few days ago. Its functionality is improving very quickly thanks to its active community. In addition, more and more companies are beginning to use ElasticSearch for their services.&nbsp;Recently, some committers, including the developer Shay Banon, got together and established ElasticSearch.com, which provides consulting and training services.</p>
<p>In this article I have explained the basics of installing ElasticSearch, using it, and tuning its performance. We have started testing the latest 0.90 release and will soon migrate our current 0.20.1 ES deployment.&nbsp;In the next post I will continue this topic and describe our experience with 0.90 as well as the critical split-brain problem we experienced previously. Because solutions to this problem are scarce, I believe it will be very useful for our readers.</p>
<p>By <a href="/?mid=textyle&amp;act=dispMemberInfo&amp;member_srl=651249&amp;vid=blog&amp;tab=blogs">Lee Jae Ik</a>, Senior Software Engineer at Global Platform Development Lab, NHN Corporation.</p>
<h2>References</h2>
<ul>
<li>Official guide: <a href="http://www.ElasticSearch.org/guide/">http://www.ElasticSearch.org/guide/</a></li>
<li>Introduction to ElasticSearch and comparison of the terms of ElasticSearch and RDB: <a href="http://www.slideshare.net/clintongormley/cool-bonsai-cool-an-introduction-to-ElasticSearch">http://www.slideshare.net/clintongormley/cool-bonsai-cool-an-introduction-to-ElasticSearch</a></li>
<li>About ElasticSearch: <a href="http://www.slideshare.net/dadoonet/ElasticSearch-devoxx-france-2012-english-version">http://www.slideshare.net/dadoonet/ElasticSearch-devoxx-france-2012-english-version</a></li>
<li>Shay Banon's articles: <a href="http://2011.berlinbuzzwords.de/sites/2011.berlinbuzzwords.de/files/ElasticSearch-bbuzz2011.pdf">http://2011.berlinbuzzwords.de/sites/2011.berlinbuzzwords.de/files/ElasticSearch-bbuzz2011.pdf</a></li>
<li>Using ElasticSearch for logs: <a href="http://www.ElasticSearch.org/tutorials/2012/05/19/ElasticSearch-for-logging.html">http://www.ElasticSearch.org/tutorials/2012/05/19/ElasticSearch-for-logging.html</a></li>
<li>Concept of multitenancy: <a href="http://en.wikipedia.org/wiki/Multitenancy">http://en.wikipedia.org/wiki/Multitenancy</a></li>
<li>Shay Banon's ElasticSearch optimization: <a href="https://github.com/logstash/logstash/wiki/ElasticSearch-Storage-Optimization">https://github.com/logstash/logstash/wiki/ElasticSearch-Storage-Optimization</a></li>
<li>Rafal Kuc's article on performance tuning presented at Berlin Buzzwords 2012: <a href="http://www.slideshare.net/kucrafal/scaling-massive-elastic-search-clusters-rafa-ku-sematext">http://www.slideshare.net/kucrafal/scaling-massive-elastic-search-clusters-rafa-ku-sematext</a></li>
</ul>]]></description>
                        <pubDate>Mon, 06 May 2013 19:06:36 +0900</pubDate>
                        <category>ElasticSearch</category>
                        <category>NHN</category>
                        <category>NELO</category>
                        <category>Log Analysis System</category>
                        <category>Lucene</category>
                        <category>Scalability</category>
                        <category>Sharding</category>
                                </item>
        										        <item>
            <title>Announcing CUBRID ALL-IN-ONE Windows Downloader</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-appstools/announcing-cubrid-all-in-one-windows-downloader/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-appstools/announcing-cubrid-all-in-one-windows-downloader/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-appstools/announcing-cubrid-all-in-one-windows-downloader/#comment</comments>
                                    <description><![CDATA[<p><img style="display: block; margin-left: auto; margin-right: auto;" height="168" width="637" alt="cubrid-all-in-one-windows-downloader.png" src="/files/attach/images/220547/162/646/cubrid-all-in-one-windows-downloader.png" /></p>
<p>We are very glad to announce the immediate availability of <b>CUBRID ALL-IN-ONE Windows Downloader</b>&nbsp;version 1.0 <i>beta</i>. You can download CUBRID ALL-IN-ONE Windows Downloader from <a href="/wiki_tools/entry/cubrid-all-in-one-windows-downloader">http://www.cubrid.org/wiki_tools/entry/cubrid-all-in-one-windows-downloader</a>. The source code is available at <a href="http://svn.cubrid.org/cubridtools/cubrid-downloader/">http://svn.cubrid.org/cubridtools/cubrid-downloader/</a>&nbsp;which is open sourced under <a href="/bsd_license">BSD license</a> just like all other <a href="/wiki_tools/entry/cubrid-tools-wiki">CUBRID Tools</a>.</p>
<p>CUBRID ALL-IN-ONE Windows Downloader is an application that allows our users to easily download CUBRID components including the server engine, drivers and GUI tools. All you have to do is to select the components you want to download on your local Windows machine and the Downloader will download them for you, one by one, without any other actions required.</p>
<p>Application key features:</p>
<ul>
<li>The application can <b>auto-update</b> itself, anytime a new version is available (it uses the <a href="http://msdn.microsoft.com/en-us/library/ms227123%28v=vs.80%29.aspx">ClickOnce</a>&nbsp;technology).</li>
<li>Retrieves all component information from a remote CUBRID online location, so it is always up-to-date with the latest application releases.</li>
<li>Detects local machine specifics - CUBRID version, OS architecture &ndash; and automatically selects the appropriate list of components.</li>
<li>Handles prerequisite software dependencies and downloads them as well.</li>
<li>Supports both HTTP and FTP protocols for&nbsp;downloads.</li>
<li>Provides additional information to users like&nbsp;links&nbsp;to&nbsp;online resources.</li>
<li>Handles download errors and auto-retries in case of failures.</li>
<li>Supports alternate download locations to try in case of failures.</li>
<li>Saves the user preferences and re-uses them next time.</li>
<li>Provides comprehensive operation log information.</li>
<li>Supports UI localization.</li>
</ul>
<p>Here is a silent video which shows how to use CUBRID ALL-IN-ONE Downloader.</p>
<p><iframe width="640" height="480" src="http://www.youtube.com/embed/8Hh0SE-I7c0" frameborder="0"></iframe></p>
<p>&nbsp;</p>
<p>If you have questions or suggestions, leave your comments below.</p>
<div></div>]]></description>
                        <pubDate>Mon, 29 Apr 2013 17:05:10 +0900</pubDate>
                        <category>Downloader</category>
                        <category>Windows</category>
                        <category>downloads</category>
                                </item>
        										        <item>
            <title>CUBRID Node.js Driver 1.1 is now available at NPM</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-appstools/cubrid-nodejs-driver-1-1-is-now-available-at-npm/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-appstools/cubrid-nodejs-driver-1-1-is-now-available-at-npm/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-appstools/cubrid-nodejs-driver-1-1-is-now-available-at-npm/#comment</comments>
                                    <description><![CDATA[<p style="text-align: center;"><img editor_component="image_link" height="140" width="245" alt="cubrid_nodejs_logo.png" src="/files/attach/images/220547/207/534/cubrid_nodejs_logo.png" /></p>
<p>I am very glad to announce the immediate availability of CUBRID Node.js driver version 1.1. You can download <b>node-cubrid</b> from NPM. For more details, check out the official repository at&nbsp;<a target="_self" href="https://github.com/CUBRID/node-cubrid">https://github.com/CUBRID/node-cubrid</a>.</p>
<h2>What's New</h2>
<p>In this new release we have improved many aspects of the driver.</p>
<ul>
<li>We now follow the "<b>One driver to rule them all</b>" concept. In other words, you can use <b>node-cubrid</b>&nbsp;with any version of CUBRID Database including&nbsp;8.4.1,&nbsp;8.4.3, and&nbsp;9.0.0 (beta).</li>
<li>The new driver comes with many code fixes, bug fixes, and design changes.</li>
<li>Rich database support: Connect, Query, Fetch, Execute, Commit, Rollback, DB Schema etc.</li>
<li>Out of the box driver events model.</li>
<li>10,000+ LOC, including the driver test code and demos.</li>
<li>New test cases added. Now 50+.</li>
<li>We have significantly refactored the code using JSHint/JSLint code analysis.</li>
<li>New documentation.</li>
<li>Created new tutorials that we will publish later.</li>
</ul>
<h2>Testing</h2>
<p>The 1.1 release was successfully tested, using the driver test suite, on 2 OS x 3 CUBRID engine = 6 different test environments.</p>
<h2>What's next</h2>
<p>The upcoming version, 2.0, will bring many new features and improvements. For example:</p>
<ul>
<li>CUBRID 9.1 stable release compatibility.</li>
<li>Improved query support. In particular, implementing <b>query queuing</b> support similar to MySQL.</li>
<li>Extended database schema support</li>
<li>... and many more.</li>
</ul>
<p>If you have questions, ask at our <a target="_self" href="/questions">CUBRID Q&amp;A</a> site, or at <a target="_self" href="http://webchat.freenode.net/?channels=cubrid">#cubrid</a> IRC. We will be glad to answer you!</p>]]></description>
                        <pubDate>Mon, 24 Dec 2012 09:45:24 +0900</pubDate>
                        <category>Node.js</category>
                        <category>JavaScript</category>
                        <category>Drivers</category>
                        <category>Web development</category>
                        <category>programming</category>
                        <category>APIs</category>
                                </item>
        										        <item>
            <title>Database Sharding Platform at NHN</title>
            <dc:creator>Jeon Won Hee</dc:creator>
            <link>http://www.cubrid.org/blog/dev-platform/database-sharding-platform-at-nhn/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/dev-platform/database-sharding-platform-at-nhn/</guid>
                        <comments>http://www.cubrid.org/blog/dev-platform/database-sharding-platform-at-nhn/#comment</comments>
                                    <description><![CDATA[<p>"<i>Ins and outs of NHN</i>" is a series of articles that compares platforms and services from third-party vendors with <a target="_self" href="/blog/tags/NHN/">NHN</a>'s own solutions.  The topic of this first article in the series is <b>Database&nbsp;Sharding Platform</b>. I will introduce about the efforts being made from inside and outside of NHN to implement&nbsp;Database Sharding. I will first explain the concept of Sharding data vs. Partitioning data. Then review the common methodology to implement sharding. And finally I will compare all sharding platforms. This article is recommented for developers interested in big data management.</p>
<h2>Database Expansion</h2>
<p>To store and search a volume of data which is so big that it cannot be handled by one database, you must find a way to use multiple databases.  Although there are some databases made for distributed environments, such as Cassandra or Dynamo, these have many functional constraints, such as weakness in terms of search range or inability to use the JOIN operations.  In order to expand data while using a relatively feature-rich functionality, it is recommended to use RDBMS by sharding the databases.</p>
<p>In the past, the Sharding logic was implemented directly on the application layer, but there is now an increasing number of&nbsp;<i>Sharding platforms</i>&nbsp;which move the Sharding logic from the application layer to the database or middleware layer.  At a core level, Sharding platforms must respond effectively to ever-increasing data without failure, and handle different data characteristics and models depending on services.</p>
<p>In this article I will compare <a href="http://spockproxy.sourceforge.net/">Spock Proxy</a>, the Sharding platform based on MySQL Proxy, <a href="https://github.com/twitter/gizzard">Gizzard</a>, created by Twitter, and <b>CUBRID SHARD</b>, the native database sharding feature in CUBRID which is set to launch in the first half of 2012. A table of other solutions was previously posted at <a target="_self" href="/blog/cubrid-life/database-sharding-with-cubrid/">Database Sharding with CUBRID</a>.</p>
<h2>Horizontal Partitioning and Sharding</h2>
<p><b>Horizontal partitioning</b> is a design that divides and stores data, such as a schema, into two or more tables <em><strong>within one database</strong></em>. For example, to handle large data of user messages, a schema may be created so that messages by users from city A are stored in one table, while messages by users from city B are stored in another table. This reduces the size of each index and increases the concurrency of operations. The key point in this approach is that the data is partitioned between tables&nbsp;<i><b>within a single database</b></i>.</p>
<p><b>Sharding</b>, on the other hand, is the <i><b>distributed</b></i>&nbsp;approach where data is horizontally partitioned between tables created in <i><b>physically different databases</b></i>. &nbsp;Thus, Sharding is a method to store the&nbsp;messages by users from city A&nbsp;in <i>database A</i> and&nbsp;messages by users from city B&nbsp;in <i>database B</i>. Here each database is called a&nbsp;<b>Shard</b>.</p>
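<p>The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the city A / city B example above: the table names, database names, and function names are hypothetical and do not come from any real schema.</p>

```python
# Hypothetical names: the tables and databases below are illustrative only.
PARTITION_TABLES = {"A": "messages_city_a", "B": "messages_city_b"}
SHARD_DATABASES = {"A": "database_a", "B": "database_b"}


def partition_target(city):
    """Horizontal partitioning: one database, a different table per city."""
    return ("single_database", PARTITION_TABLES[city])


def shard_target(city):
    """Sharding: the same table name, but a different database per city."""
    return (SHARD_DATABASES[city], "messages")


print(partition_target("A"))
print(shard_target("B"))
```

<p>In both cases the routing decision is the same; what changes is whether the result names a table inside one database or a physically separate database.</p>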
<p>As you have to work on multiple databases, there could be functional limitations based on circumstances and also drawbacks in terms of consistency and replication, including JOIN operations.  In many cases, Sharding used to be implemented at the application server level.  There have been many attempts to provide this at the platform level.   These can be classified into a pattern that operates in the application server, such as  <a href="http://www.hibernate.org/subprojects/shards.html">Hibernate Shards</a>, a middle tier pattern such as CUBRID SHARD, Spock Proxy, or Gizzard, and a pattern that provides the Sharding functionality from a database itself, e.g. <a href="http://devcafe.nhncorp.com/nStore">nStore</a> or <a href="http://www.mongodb.org/">MongoDB</a>.</p>
<h2>Middle Tier Sharding Platform</h2>
<p>By default, a Sharding platform should consider the following items:</p>
<ul>
<li>Database location abstraction</li>
<li>Scalability</li>
<li>Monitoring/Ease of operations</li>
</ul>
<p><b>Database location abstraction</b> and <b>scalability</b> are different from each other but connected.  <i>Database location abstraction</i> ensures that&nbsp;on the application layer&nbsp;you do not need to know which data (which Row) is located in which database. The application is connected only with the <i>Sharding platform</i>. Connecting to a database is what the Sharding platform should do. In addition, Sharding platform should carry out the task to add a replicated storage to a specific Shard (one of the partitioned databases) in order to migrate database for replacement <b><i>without restarting or changing the application code</i></b>. When the Sharding platform comes to a stop, it is obvious that an application server will not be able to access the database.  For this reason, the Sharding platform should provide&nbsp;<b><i>redundancy</i></b>.</p>
<p>For monitoring, the Sharding platform should be able to provide the number of requests and error information per <b>Shard key</b> (the criterion that determines which data is stored in which database).</p>
<h2>Comparison between Spock Proxy and CUBRID SHARD</h2>
<p>The key function of the Sharding platform is the CRUD operation (CREATE, READ, UPDATE, DELETE) which chooses one database among many according to '<b><i>a standard set by a developer</i></b>', in other words the <b>Sharding strategy</b>.   We will compare <b>Spock Proxy</b>, a typical Sharding platform of MySQL, with the <b>CUBRID SHARD</b> platform developed by <a target="_self" href="/blog/tags/NHN/">NHN</a>.</p>
<h3>Spock Proxy</h3>
<p><b>Spock Proxy</b> is a Sharding platform designed based on MySQL Proxy. In MySQL Proxy, you can execute Lua script code, written by a developer, before and after performing SQL.  The primary purpose of using MySQL Proxy is to analyze and modify SQL. To use Spock Proxy, you need to create a MySQL database to manage the information about shards and how the data should be distributed.  Each row stored there is a rule for Sharding.</p>
<p style="text-align: center;"><img editor_component="image_link" height="219" width="477" alt="Figure 1: Specifying Sharding rules in Spock Proxy." src="/files/attach/images/220547/507/323/specifying-sharding-rules-in-spock-proxy.png" /></p>
<p style="text-align: center;"><span style="font-family: Arial; font-size: 12px;"><strong>Figure 1: Specifying Sharding rules in Spock Proxy.</strong></span></p>
<h3>CUBRID SHARD</h3>
<p><b>CUBRID SHARD</b> is the Sharding platform for CUBRID. The uniqueness of CUBRID SHARD is that it can also be used with MySQL as well as Oracle. It is set to be launched in the first half of 2012, and is planned to be used for processing the meta information database system of <a style="font-weight: bold;" target="_self" href="http://ndrive.naver.com/index.nhn">NDrive</a>&nbsp;service, the cloud storage system developed by NHN.</p>
<p>The following table shows a summary of a comparison between Spock Proxy and CUBRID SHARD.</p>
<p style="text-align: center;"><strong>Table 1 Comparison between Spock Proxy and CUBRID SHARD</strong></p>
<table style="margin: 0 auto;">
<thead> 
<tr>
<td>&nbsp;</td>
<td>Spock Proxy</td>
<td>CUBRID SHARD</td>
</tr>
</thead> 
<tbody>
<tr>
<td>Sharding rule storage</td>
<td>DBMS Table</td>
<td>Configuration file</td>
</tr>
<tr>
<td>How to create Shard keys</td>
<td>Modulo</td>
<td>
<ul>
<li>Modulo</li>
<li>Developer's own sharding strategy provided in a library</li>
</ul>
</td>
</tr>
<tr>
<td>How to find a Shard key</td>
<td>SQL parsing</td>
<td>Using HINT</td>
</tr>
<tr>
<td>Strength</td>
<td>No need to change SQL</td>
<td>
<ul>
<li>Supports CUBRID, MySQL, and Oracle</li>
<li>Higher performance</li>
</ul>
</td>
</tr>
<tr>
<td>Weakness</td>
<td>
<ul>
<li>Lower performance due to extra SQL parsing</li>
<li>Supports MySQL only</li>
</ul>
</td>
<td>Requires the change to SQL queries to insert sharding HINT</td>
</tr>
</tbody>
</table>
<p><b>Spock Proxy</b> stores Sharding rules in the table in MySQL <b>universal_db</b>&nbsp;database. The SQL received from an application server is parsed and checked whether the query has <b>shard keys</b>.  If shard keys are provided, the MySQL instance will be identified according to the standard recorded in <b>universal_db</b>, then SQL will be relayed to that MySQL instance.</p>
<p>When using this method, you do not need to describe information related to shard keys in SQL, unlike in CUBRID SHARD.  Therefore, if your code was written without Sharding but you recently had to adopt it because of data growth, you may use Spock Proxy, which will work without requiring you to change the SQL in your application.  Note that this is limited to cases where there is no need to change the schema for Sharding, or when Sharding can be applied without changing the SQL that you use.  However, the method used to find the Shard after parsing SQL, as used by Spock Proxy, has weaknesses in terms of performance.  It leads to redundant work, as SQL has to be parsed twice: once by Spock Proxy to determine the MySQL instance, then by MySQL itself.</p>
<p>CUBRID SHARD uses an <strong>SQL&nbsp;</strong><b>HINT</b> instead, which avoids parsing SQL twice. Suppose there is a table as follows:</p>
<table style="margin: 0 auto;">
<thead> 
<tr>
<td colspan="3">student</td>
</tr>
</thead> 
<tbody>
<tr>
<td>student_no</td>
<td>name</td>
<td>age</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
</tbody>
</table>
<p>To use the <b>student_no</b> column data as a <i>shard key</i>, an application server sends the following prepared SQL to  CUBRID SHARD.</p>
<div style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nocontrols="false" nogutter="true" collapse="false" first_line="1" code_type="Sql" editor_component="code_highlighter">SELECT student_no, name, age FROM student WHERE student_no = /*+ SHARD_KEY */ ?</div>
<p>CUBRID SHARD checks whether the SQL contains the <span style="font-family: monospace;">/*+ SHARD_KEY */</span> HINT, in which case the column data marked with the HINT is used as the <b>Shard key</b>. It then reads the <b>student_no</b> value, which follows the hint, identifies the RDBMS based on the configuration file, and transmits the query to it. As such, the benefit of using HINT is that you can improve processing efficiency by avoiding parsing SQL twice, and support various RDBMS without violating the database location abstraction.</p>
<ol>
<li>CUBRID SHARD provides various HINTs, in addition to <span style="font-family: monospace;">/*+ SHARD_KEY */</span>.</li>
<li>Typically, there is  <span style="font-family: monospace;">/*+ SHARD_ID(</span><em><span style="font-family: monospace;">__id__</span></em><span style="font-family: monospace;">) */</span>, a HINT that allows you to target a specific shard.</li>
<li>The&nbsp;<span style="font-family: monospace;">/*+ SHARD_VAL(</span><em><span style="font-family: monospace;">__shard_key_val__</span></em><span style="font-family: monospace;">) */&nbsp;</span>HINT can also be used to target a specific shard for tables which have no shard keys. Like the previous HINT, it is meant for queries that have no shard key column, but instead of naming a shard it supplies a shard key value directly, and the shard is then selected according to the internal rules of the middleware.</li>
</ol>
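<p>To illustrate why a HINT is cheaper than full SQL parsing, here is a minimal Python sketch that locates the bind parameter marked with <span style="font-family: monospace;">/*+ SHARD_KEY */</span>. It is not CUBRID SHARD's actual implementation: the function name is hypothetical, and counting placeholders this way would be fooled by a question mark inside a string literal.</p>

```python
import re

# Matches the hint exactly as it is written in the example query above,
# followed by the bind placeholder it marks.
SHARD_KEY_HINT = re.compile(r"/\*\+\s*SHARD_KEY\s*\*/\s*\?")


def shard_key_bind_index(sql):
    """Return the 0-based index of the bind parameter marked as shard key,
    or None when the statement carries no SHARD_KEY hint.

    Hypothetical helper: a plain text scan, with no SQL parsing involved.
    """
    match = SHARD_KEY_HINT.search(sql)
    if match is None:
        return None
    # Count the '?' placeholders that appear before the hinted one.
    return sql[:match.start()].count("?")


sql = ("SELECT student_no, name, age FROM student "
       "WHERE student_no = /*+ SHARD_KEY */ ?")
params = [1234]  # sample bind value, made up for this sketch
index = shard_key_bind_index(sql)
print(index, params[index])
```

<p>Once the index is known, the middleware only needs the bound value at that position to choose a shard; the statement itself can be forwarded unchanged.</p>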
<p>Unlike Spock Proxy, where Sharding rules are inserted into a database table,&nbsp;in CUBRID SHARD&nbsp;Sharding rules are specified in the configuration file. If there are four access addresses for actual RDBMS storage, you should specify DB addresses to CUBRID SHARD as follows.</p>
<div style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nocontrols="false" nogutter="false" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardDB&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardNODE1:3306<br /> 1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardDB&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardNODE2:3306<br /> 2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardDB&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardNODE3:3306<br /> 3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardDB&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;shardNODE4:3306</div>
<p>If you want to use the <b>Modulo</b> method, write the corresponding value in the configuration file as below.</p>
<div style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nocontrols="false" nogutter="true" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">SHARD_KEY_MODULAR = 256</div>
<p>In addition, specify <span style="font-family: monospace;">[MIN..MAX]</span> according to the value generated from <span style="font-family: monospace;">SHARD_KEY_MODULAR</span>, and describe which shard to send the corresponding query to.</p>
<div style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nocontrols="false" nogutter="false" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">#min&nbsp;max&nbsp;shard_id<br /> 0&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;63&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0<br /> 64&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;127&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1<br /> 128&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;191&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;2<br /> 192&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;255&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;3</div>
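<p>Put together, the connection list, <span style="font-family: monospace;">SHARD_KEY_MODULAR</span>, and the range table describe a two-step lookup. The Python sketch below mirrors the values from the configuration above; it assumes integer shard keys and that the modulo result is what gets matched against <span style="font-family: monospace;">[MIN..MAX]</span>. The function names are hypothetical.</p>

```python
SHARD_KEY_MODULAR = 256

# (min, max, shard_id) rows copied from the range table above.
KEY_RANGES = [(0, 63, 0), (64, 127, 1), (128, 191, 2), (192, 255, 3)]

# shard_id -> (database name, host:port), copied from the connection list.
SHARD_CONNECTIONS = {
    0: ("shardDB", "shardNODE1:3306"),
    1: ("shardDB", "shardNODE2:3306"),
    2: ("shardDB", "shardNODE3:3306"),
    3: ("shardDB", "shardNODE4:3306"),
}


def shard_id_for(shard_key):
    """Map an integer shard key to a shard ID via modulo and range lookup."""
    slot = shard_key % SHARD_KEY_MODULAR
    for low, high, shard_id in KEY_RANGES:
        if slot in range(low, high + 1):
            return shard_id
    raise ValueError("no range covers slot %d" % slot)


def connection_for(shard_key):
    """Return the (database, address) pair that stores the given key."""
    return SHARD_CONNECTIONS[shard_id_for(shard_key)]


print(connection_for(1234))
```

<p>Because the ranges, not the modulus, decide the target, moving a range to another shard only requires editing the range table, while the modulus of 256 stays fixed.</p>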
<p>If you need a more subtle method than <b>Modulo</b>, you can create program code of your own that calculates the Shard ID for Shard keys.</p>
<div style="border: #666666 1px dotted; border-left: #22aaee 5px solid; padding: 5px; background: #FAFAFA url(/modules/editor/components/code_highlighter/code.png) no-repeat top right;" nocontrols="false" nogutter="false" collapse="false" first_line="1" code_type="Plain" editor_component="code_highlighter">SHARD_KEY_LIBRARY_NAME = libshardkeyid.so<br />SHARD_FUNCTION_NAME = user_get_shard_key</div>
<p>As shown in the above example, you can register your own&nbsp;library which&nbsp;implements the&nbsp;Shard ID&nbsp;calculation.</p>
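<p>The registered library is a native shared object, and its real function signature is not shown in this article. The Python sketch below only illustrates the kind of logic such a function might contain, for example handling string keys that the Modulo method cannot divide directly. The shard count and the use of CRC32 are assumptions made for this sketch.</p>

```python
import zlib

NUM_SHARDS = 4  # assumption: matches the four shards configured above


def user_get_shard_key(key_value):
    """Hypothetical stand-in for a custom shard function.

    Maps a string key to a shard ID by hashing it first. CRC32 is used only
    because it is deterministic across runs and machines.
    """
    digest = zlib.crc32(key_value.encode("utf-8"))
    return digest % NUM_SHARDS


print(user_get_shard_key("student-1234"))
```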
<p>The common weakness of both Spock Proxy and CUBRID SHARD is that they both require additional network IO time for each additional hop&nbsp;because they are implemented as a middle tier.</p>
<p>The following figure displays a general process of internal execution in CUBRID SHARD, which is performed when a developer executes a query. When a developer executes a query, the query is analyzed by the DB shard middleware, which determines to which shard it will be sent.  Then the query is transmitted to the selected shard, and finally the middleware delivers the response to the client.</p>
<p style="text-align: center;"><img editor_component="image_link" height="510" width="648" alt="Figure 2: CUBRID SHARD process." src="/files/attach/images/220547/507/323/cubrid-shard-process.png" /></p>
<p style="text-align: center;"><span style="font-family: Arial; font-size: 12px;"><strong>Figure 2: CUBRID SHARD process.</strong></span></p>
<h2>Gizzard</h2>
<p>Gizzard is a Sharding platform developed by <a target="_self" href="/blog/tags/Twitter/">Twitter</a>. It is also a middle tier just like Spock Proxy and CUBRID SHARD. However, its usage and architecture are quite different from Spock Proxy or CUBRID SHARD.</p>
<p>The following figure shows how the data can be shared between several databases deploying two units of Gizzard.</p>
<p style="text-align: center;"><img editor_component="image_link" height="450" width="390" alt="gizzard-deployment-diagram.png" src="/files/attach/images/220547/507/323/gizzard-deployment-diagram.png" /></p>
<p style="text-align: center;"><span style="font-family: Arial; font-size: 12px;"><strong>Figure 3: Gizzard deployment diagram.</strong></span></p>
<p>In this case, the interface being used by an application server is <b>Thrift</b>, an RPC protocol, not JDBC. Therefore, it is similar to&nbsp;<b>DBGW (CUBRID SHARD)&nbsp;</b>platform&nbsp;developed by NHN. When the schema of a storage changes, Gizzard may also need modifications. While it can be a constraint, it could also open up access not only to RDBMS, but also to various other databases (e.g. Lucene).</p>
<p>Gizzard is written in Scala, and you can expand its functions by adding Scala code as needed. Instead of viewing it as a complete product, we recommend that you download the source and modify it to suit your needs.</p>
<p>The biggest advantage of Gizzard is that you can perform hotspot response and database migration in the middle tier platform level.  For example, Gizzard provides the direct replication feature, as shown in the following figure.</p>
<p style="text-align: center;"><img editor_component="image_link" height="298" width="365" alt="Figure 4: The replication feature of Gizzard." src="/files/attach/images/220547/507/323/replication-feature-of-gizzard.png" /></p>
<p style="text-align: center;"><span style="font-family: Arial; font-size: 12px;"><strong>Figure 4: The replication feature of Gizzard.</strong></span></p>
<p>Spock Proxy or  CUBRID SHARD do not offer this feature, as they do not need it.  CUBRID, MySQL, and Oracle have their own replication feature, thus they do not need to offer the feature from the middle tier.</p>
<p>The replication feature provided by Gizzard is very simple - it sends each request to several replicas -  but it cannot handle the variety of requests that may occur while replication is running.  However, it can be useful when using a database without its own replication feature.</p>
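<p>The idea of sending each request to several replicas can be sketched as follows. This is not Gizzard's API (Gizzard is written in Scala); the class name is hypothetical and plain dictionaries stand in for the storage back ends.</p>

```python
class ReplicatingShard:
    """Forwards every write to all replicas; reads go to the first replica."""

    def __init__(self, replicas):
        self.replicas = replicas  # list of dict-like stores

    def write(self, key, value):
        # The middle tier simply repeats the same write on every replica.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        return self.replicas[0][key]


replica_a, replica_b = {}, {}
shard = ReplicatingShard([replica_a, replica_b])
shard.write("message:1", "hello")
print(replica_a == replica_b, shard.read("message:1"))
```

<p>The sketch also shows where the simplicity ends: nothing here deals with a replica that is down or a write that succeeds on one replica but fails on another.</p>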
<p style="text-align: center;"><img editor_component="image_link" height="396" width="365" alt="Figure 5: The migration feature of Gizzard." src="/files/attach/images/220547/507/323/migration-feature-of-gizzard.png" /></p>
<p style="text-align: center;"><span style="font-family: Arial; font-size: 12px;"><strong>Figure 5: The migration feature of Gizzard.</strong></span></p>
<p>As shown in the figure above, Gizzard also offers a migration feature which does not require a service halt.  When migrating a database, replication is configured first so that new data is written to both the old and the new location, and then the data that was stored before replication was configured is copied to the new location.  This approach, of course, cannot be used on RDBMS due to consistency issues.</p>
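<p>The two phases of such a migration, dual writes first and a background copy second, can be sketched like this. The class and method names are hypothetical, dictionaries stand in for the old and new storage, and the sketch ignores concurrency.</p>

```python
class MigratingShard:
    """Toy model of a migration that keeps serving writes while copying."""

    def __init__(self, source, target):
        self.source = source
        self.target = target

    def write(self, key, value):
        # Phase 1: while migrating, new data is written to both locations.
        self.source[key] = value
        self.target[key] = value

    def copy_existing(self):
        # Phase 2: copy rows that predate the dual-write setup. setdefault
        # keeps any value that a dual write has already put in the target.
        for key, value in list(self.source.items()):
            self.target.setdefault(key, value)


old_store = {"a": 1, "b": 2}
new_store = {}
migration = MigratingShard(old_store, new_store)
migration.write("b", 20)   # update arriving during the migration
migration.write("c", 3)    # new row arriving during the migration
migration.copy_existing()
print(new_store == old_store)
```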
<h2>Limitations of Sharding</h2>
<p>Sharding, on the other hand, also has its own limitations. Typical constraints of Sharding are as follows:</p>
<ul>
<li>JOIN operations cannot be performed across two or more Shards. </li>
<li>auto_increment (serial) values can vary depending on the Shard. </li>
<li>The last_insert_id() value is no longer valid. </li>
<li>The <b>shard key</b> column value must not be updated; changing it requires a delete, then an insert. </li>
<li>Two or more Shards cannot be accessed from one transaction. </li>
</ul>
<p>When using Sharding, therefore, it is important that you perform the right data modeling and schema design to avoid the constraints above.</p>
<p>To learn more about Sharding and other features of CUBRID, I suggest the following resources:</p>
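<p>The shard key constraint is the easiest one to show in code. In the Python sketch below, two in-memory shards are selected by <span style="font-family: monospace;">student_no</span> modulo 2; the shard layout, the function names, and the sample row are made up for illustration.</p>

```python
# Two in-memory shards keyed by student_no; purely illustrative.
shards = [{}, {}]


def shard_for(student_no):
    return shards[student_no % len(shards)]


def insert(row):
    shard_for(row["student_no"])[row["student_no"]] = row


def change_shard_key(old_no, new_no):
    """Change a shard key as a delete on the old shard plus an insert on the
    shard the new key maps to. An in-place update would leave the row on a
    shard that lookups by the new key never search."""
    row = shard_for(old_no).pop(old_no)
    insert(dict(row, student_no=new_no))


insert({"student_no": 10, "name": "Kim", "age": 20})  # sample row
change_shard_key(10, 11)
print(shards)
```

<p>Note that the delete and the insert touch two different shards, so they cannot be wrapped in a single transaction, which is the last constraint in the list.</p>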
<ul>
<li><a target="_self" href="http://www.slideshare.net/cubrid/growing-in-the-wild-the-story-by-cubrid-database-developers">http://www.slideshare.net/cubrid/growing-in-the-wild-the-story-by-cubrid-database-developers</a>.</li>
<li><a href="/blog/cubrid-life/database-sharding-the-right-way-easy-reliable-open-source/">http://www.cubrid.org/blog/cubrid-life/database-sharding-the-right-way-easy-reliable-open-source/</a></li>
</ul>
<p>By&nbsp;Jeon Won Hee, Senior Software Engineer at CUBRID&nbsp;DBMS Development Lab,&nbsp;NHN Corporation.</p>]]></description>
                        <pubDate>Tue, 10 Apr 2012 15:40:32 +0900</pubDate>
                        <category>CUBRID Internals</category>
                        <category>Sharding</category>
                        <category>CUBRID SHARD</category>
                        <category>Spock Proxy</category>
                        <category>MySQL</category>
                        <category>Gizzard</category>
                                </item>
        										        <item>
            <title>Basic Operations of CUBRID Processes</title>
            <dc:creator>Park Kieun</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-life/basic-operations-of-cubrid-processes/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-life/basic-operations-of-cubrid-processes/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-life/basic-operations-of-cubrid-processes/#comment</comments>
                                    <description><![CDATA[<p><i>This article is a part of "<a href="/blog/tags/CUBRID%20Internals/">CUBRID Internals</a>" series.</i> In the <a target="_self" href="/blog/cubrid-life/cubrid-database-processes/">previous article</a>&nbsp;I have explained about CUBRID Processes and where to configure them. In this article I would like to dive into how those processes run. If you have not read the previous article, I recommend you to do so before reading further.</p>
<p><b>Figure 1</b> below illustrates the basic operation flow in CUBRID.</p>
<ul>
<li>First, when a user enters the <code>cubrid service start</code> command,&nbsp;CUBRID gets started.</li>
<li>Then, the <b>cub_master</b>&nbsp;and <b>cub_broker</b> processes are started.&nbsp;At this time, a number of&nbsp;<b>cub_cas</b> processes will be started which corresponds to the&nbsp;value of <code>MIN_NUM_APPL_SERVER</code> parameter in the&nbsp;<b><span style="font-family: monospace;">cubrid_broker.conf</span></b>&nbsp;file.</li>
<li>Then, a user enters the <code>cubrid server start demodb</code> command which creates the&nbsp;<b>cub_server</b> process that mounts the <b>demodb</b> database volume. As described in <a target="_self" href="/blog/cubrid-life/cubrid-database-processes/">CUBRID Processes</a>, the <b>cub_master</b> process connects the <b>cub_server</b> process with the <b>cub_cas</b> process, which sends the requests, or the <b>csql</b> program.</li>
</ul>
<p style="text-align: center;"><img editor_component="image_link" height="818" width="700" alt="cubrid_operation_procedure.png" src="/files/attach/images/220547/987/411/cubrid_operation_procedure.png" /></p>
<p style="text-align: center;"><b>Figure 1: CUBRID Operation Procedure.</b></p>
<p>As shown in <b>Figure 1</b>, if a&nbsp;JDBC application is connected to the server process which mounts the database volume file, the SQL statement can be executed in CUBRID. The two most popular ways to use CUBRID are to execute SQL manually using the command line&nbsp;<a target="_self" href="/how_to_use_csql_utilities">CSQL Interpreter</a> program or to write a program which uses one of the various APIs like JDBC, PHP, Node.js, etc.</p>
<blockquote class="q4">
<p><b>Note:<br /></b>As we have discussed, the JDBC program connects to the database server through a broker while CSQL Interpreter directly connects to the database server bypassing the broker. This is an important difference between APIs and CSQL.</p>
</blockquote>
<p>The <a target="_self" href="/wiki_tools/entry/cubrid-manager">CUBRID Manager</a>&nbsp;database administration tool is developed on top of the JDBC driver. The SQL statements executed in CUBRID Manager are passed to the server through the JDBC API. Other database management functions that are available in CUBRID Manager (creating and deleting databases, etc.) are executed through a <b>management utility</b> called <i>CUBRID Manager Server</i> which runs as a&nbsp;separate manager server daemon&nbsp;outside of the database.</p>
<p>Now, let's take a look at CUBRID Manager as an example to see how an SQL statement is executed in CUBRID. Refer to <b>Figure 2</b>&nbsp;for this example.</p>
<ol>
<li>The SQL statement entered in the query editor of the CUBRID Manager (for example, <code>SELECT * FROM olympic WHERE host_year &gt; 1988 LIMIT 4;</code>) is sent to <a target="_self" href="/wiki_apis/entry/cubrid-jdbc-driver">CUBRID JDBC driver</a>. This assumes that a database has already been selected in CUBRID Manager before executing this SQL, and&nbsp;the JDBC connection has been established.</li>
<li>In the JDBC connection process, the connection is made to the port of the host specified in the JDBC connection information (<a target="_self" href="/questions/236982">JDBC Connection URL</a>).</li>
<li>The <b>cub_broker</b>&nbsp;process receives the connection and allocates a <b>cub_cas</b> process to serve the session on that connection.</li>
<li>It then hands the socket connection over to the <b>cub_cas</b> process, so that the connection between the JDBC driver and the <b>cub_cas</b> process is established.</li>
<li>Returning to SQL statement execution, the JDBC driver bundled with CUBRID Manager sends the SQL statement from the query editor to the connected broker process, <b>cub_cas</b>.</li>
</ol>
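<p>The hand-off in steps 3 and 4 can be pictured with a small toy model. This is only a sketch under stated assumptions: the <code>Broker</code> and <code>Cas</code> classes, the pool size, and the "first idle worker" policy are hypothetical names and choices made for illustration, not CUBRID source code, and what the real broker does when every CAS is busy is not modelled.</p>
<pre>
```python
from typing import List, Optional


class Cas:
    """Hypothetical stand-in for one cub_cas worker process."""

    def __init__(self, cas_id: int) -> None:
        self.cas_id = cas_id
        self.client: Optional[str] = None  # connection currently being served

    @property
    def idle(self) -> bool:
        return self.client is None


class Broker:
    """Hypothetical stand-in for cub_broker: accept, then hand off."""

    def __init__(self, pool_size: int) -> None:
        self.pool: List[Cas] = [Cas(i) for i in range(1, pool_size + 1)]

    def accept(self, client: str) -> Cas:
        # Assumed policy: give the connection to the first idle worker.
        for cas in self.pool:
            if cas.idle:
                cas.client = client  # from now on the CAS owns the connection
                return cas
        raise RuntimeError("no idle CAS available")

    def release(self, cas: Cas) -> None:
        cas.client = None  # the worker becomes available again


broker = Broker(pool_size=2)
first = broker.accept("jdbc-conn-1")
second = broker.accept("jdbc-conn-2")
broker.release(first)
third = broker.accept("jdbc-conn-3")  # reuses the worker that was released
```
</pre>
<p>After the hand-off the client talks to its CAS only, which is why the following step refers to <b>cub_cas</b> as the connected broker process.</p>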
<p>To execute the received SQL statement, the broker calls three functions of the C API provided by the client library in sequence: <code>db_open_buffer()</code>, <code>db_compile_statement()</code> and <code>db_execute_statement()</code>. The <code>db_open_buffer()</code> function parses the SQL statement, the <code>db_compile_statement()</code> function compiles the execution plan, and the <code>db_execute_statement()</code> function executes the SQL by sending <a target="_self" href="/wiki_tutorials/entry/CUBRID_Query_Processing#XASL_Generation">XASL</a> (eXtended Access Spec Language) to the server.</p>
<p style="text-align: center;"><img editor_component="image_link" height="756" width="684" alt="sql_statement_execution_procedure.png" src="/modules/editor/styles/default/files/attach/images/220547/987/411/sql_statement_execution_procedure.png" /></p>
<p style="text-align: center;"><b>Figure 2: SQL Statement Execution Procedure in CUBRID.</b></p>
<p>As shown in <b>Figure 2</b>, the <code>qmgr_execute_query()</code> function is executed in the <b>cub_cas</b> process, and the <code>xqmgr_execute_query()</code> function is executed in the <b>cub_server</b> process. A call to <code>qmgr_execute_query()</code> in the client <b><i>is</i></b>, in effect, a call to <code>xqmgr_execute_query()</code> in the server: CUBRID implements the communication interface between a client and a server as a Remote Procedure Call (RPC). In the server, the <code>qexec_execute_query()</code>&nbsp;and <code>qexec_execute_mainblock()</code> functions of the query processing module are used to execute XASL.</p>
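<p>The client/server pairing of <code>qmgr_execute_query()</code> and <code>xqmgr_execute_query()</code> follows the usual RPC stub pattern, which the sketch below illustrates. Only the two function names come from the article; the JSON encoding, the dispatch table, and the in-process "transport" are assumptions that stand in for the real network protocol, which is not described here.</p>
<pre>
```python
import json


def xqmgr_execute_query(xasl: str) -> dict:
    """Server side: the function that does the real work (x prefix)."""
    return {"status": "ok", "executed": xasl}


# Hypothetical dispatch table: maps the call name sent by the client
# to the server-side function that implements it.
SERVER_HANDLERS = {"qmgr_execute_query": xqmgr_execute_query}


def server_dispatch(raw_request: bytes) -> bytes:
    """Decode a request, run the matching handler, encode the reply."""
    request = json.loads(raw_request)
    handler = SERVER_HANDLERS[request["call"]]
    reply = handler(*request["args"])
    return json.dumps(reply).encode()


def qmgr_execute_query(xasl: str, transport=server_dispatch) -> dict:
    """Client side: a stub that only packs arguments and forwards them."""
    raw = json.dumps({"call": "qmgr_execute_query", "args": [xasl]}).encode()
    return json.loads(transport(raw))


result = qmgr_execute_query("XASL for the olympic query")
```
</pre>
<p>The caller in the client never sees the transport: it calls the stub as if it were a local function, and the work happens on the other side.</p>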
<p>If you are interested in how queries are processed in CUBRID, I recommend reading <a target="_self" href="/wiki_tutorials/entry/CUBRID_Query_Processing">CUBRID Query Processing</a>,&nbsp;which provides a detailed step-by-step explanation using real examples.</p>
<p>By <a target="_self" href="http://www.linkedin.com/pub/kieun-park/39/a78/487">Park Kieun</a>, Senior Software Engineer and Architect at Service Platform Development Center &amp; IT Service Center, NHN Corporation.</p>]]></description>
                        <pubDate>Tue, 21 Aug 2012 12:20:30 +0900</pubDate>
                        <category>CUBRID Internals</category>
                        <category>CUBRID Processes</category>
                        <category>architecture</category>
                        <category>CUBRID Query Processing</category>
                        <category>CUBRID Service</category>
                                </item>
        										        <item>
            <title>Announcing CUBRID 9.1 stable release with big improvements</title>
            <dc:creator>Esen Sagynov</dc:creator>
            <link>http://www.cubrid.org/blog/cubrid-life/announcing-cubrid-9-1-stable-release-with-big-improvements/</link>
            <guid isPermaLink="true">http://www.cubrid.org/blog/cubrid-life/announcing-cubrid-9-1-stable-release-with-big-improvements/</guid>
                        <comments>http://www.cubrid.org/blog/cubrid-life/announcing-cubrid-9-1-stable-release-with-big-improvements/#comment</comments>
                                    <description><![CDATA[<p><img src="/files/attach/images/220547/686/612/cubrid_9_1_banner_blog.jpg" alt="cubrid_9_1_banner_blog.jpg" width="648" height="297" style="display: block; margin-left: auto; margin-right: auto;" /></p>
<p>We released the <a href="/blog/news/announcing-cubrid-9-0-with-3x-performance-increase-and-sharding-support/">CUBRID 9.0 <i>beta</i></a> version in October last year. Since then we have been working hard on stabilizing the beta features, fixing bugs, and improving the overall engine performance. Today I am excited to announce the immediate availability of the CUBRID 9.1 stable release. You can download CUBRID Database Server from&nbsp;<a href="/?mid=downloads&amp;item=cubrid&amp;os=detect&amp;cubrid=9.1.0">http://www.cubrid.org/?mid=downloads&amp;item=cubrid&amp;os=detect&amp;cubrid=9.1.0</a>.</p>
<p>I would also like to announce that we will give a talk about <a href="/blog/cubrid-life/cubrid-shard-talk-at-2013-percona-mysql-conference-dont-miss/">CUBRID Database Sharding at the Percona MySQL Conference</a> on April 24, 2013, in Santa Clara, CA. Join us there to meet CUBRID engineers and get first-hand insight into the new CUBRID 9.1.</p>
<p>Below I will provide an overview of the latest changes and improvements in CUBRID 9.1.</p>
<h2>Overview</h2>
<p>CUBRID 9.1 is an upgraded and stabilized version of CUBRID 9.0 Beta. To learn more about the biggest features introduced in the 9.x family, refer to the <a href="/blog/news/announcing-cubrid-9-0-with-3x-performance-increase-and-sharding-support/">9.0 official announcement</a>. Issues found in the 9.0 Beta version have been fixed in&nbsp;this new&nbsp;9.1 stable release. CUBRID 9.1 adds a variety of query-related features and offers improved query processing performance and query optimization. In addition, its multi-language features have been further improved. This new 9.1 release is accompanied by new <a href="/?mid=downloads&amp;item=any&amp;os=detect&amp;cubrid=9.1.0">CUBRID Tools and Drivers</a>&nbsp;releases.</p>
<h2>Backward Compatibility</h2>
<h3>Database compatibility</h3>
<p>Because the database volume of CUBRID 9.1 is not compatible with that of CUBRID 9.0 Beta, users of CUBRID 9.0 Beta or previous versions should&nbsp;migrate their databases. We have written migration instructions, which you can find in the <a href="/manual/91/en/upgrade.html">Upgrade</a>&nbsp;section&nbsp;of the Release Notes.</p>
<h3>Driver compatibility</h3>
<p>The JDBC and CCI drivers of CUBRID 9.1 are compatible with CUBRID 9.0 Beta and CUBRID 2008 R4.x versions. Some features that were fixed or improved for 9.1 are not supported when 9.1 drivers connect to previous versions.</p>
<h2>Major enhancements</h2>
<h3>New SQL functions and index hints</h3>
<ul>
<li>New SQL analytic functions such as <code><a href="/manual/91/en/sql/function/analysis_fn.html?highlight=ntile#NTILE">NTILE</a></code>, <code><a href="/manual/91/en/sql/function/analysis_fn.html?highlight=lead#LEAD">LEAD</a></code> and <code><a href="/manual/91/en/sql/function/analysis_fn.html?highlight=lag#LAG">LAG</a></code>&nbsp;have been introduced in CUBRID 9.1.</li>
<li>A new SQL numeric function, <code><a href="/manual/91/en/sql/function/numeric_fn.html?highlight=width_bucket#WIDTH_BUCKET">WIDTH_BUCKET</a></code>, has also been&nbsp;introduced.</li>
<li>The <code><a href="/manual/91/en/sql/function/numeric_fn.html?highlight=trunc#TRUNC">TRUNC</a></code> and <code><a href="/manual/91/en/sql/function/numeric_fn.html?highlight=trunc#round">ROUND</a></code> functions now also accept date types.</li>
<li>New SQL Hints:          
<ul>
<li>Support a <a href="/manual/91/en/release_note/r91.html#support-new-index-hint-clause-cubridsus-6675">new index hint clause</a>.</li>
<li><a href="/manual/91/en/release_note/r91.html#sql-hints-for-update-join-and-delete-join-statement-cubridsus-9491">SQL hints for Multi <code>UPDATE</code> and <code>DELETE</code></a> statements.</li>
<li><a href="/manual/91/en/release_note/r91.html#index-hints-for-merge-statement-cubridsus-10134">SQL hints for <code>MERGE</code></a> statement.</li>
</ul>
</li>
</ul>
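<p>For readers who have not used analytic functions before, the sketch below shows what <code>NTILE</code>, <code>LAG</code> and <code>LEAD</code> compute. It is an illustration only: it runs against an in-memory SQLite database through Python's <code>sqlite3</code> module (SQLite 3.25 or later is assumed) rather than against CUBRID, and the <code>scores</code> table is made up. Refer to the CUBRID manual pages linked above for the exact CUBRID syntax and options.</p>
<pre>
```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is not CUBRID.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 10), ("b", 20), ("c", 30), ("d", 40)],
)

rows = conn.execute(
    """
    SELECT player,
           points,
           NTILE(2)     OVER (ORDER BY points) AS half,        -- bucket number
           LAG(points)  OVER (ORDER BY points) AS prev_points, -- previous row
           LEAD(points) OVER (ORDER BY points) AS next_points  -- next row
    FROM scores
    ORDER BY points
    """
).fetchall()

for row in rows:
    print(row)
```
</pre>
<p><code>NTILE(2)</code> splits the ordered rows into two equally sized groups, while <code>LAG</code> and <code>LEAD</code> read a value from the neighbouring row and yield <code>NULL</code> (Python <code>None</code>) at the edges.</p>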
<h3>Performance improvements and optimizations</h3>
<ul>
<li>The performance of data replication in HA environment has been significantly improved in CUBRID 9.1.<br /><img src="/files/attach/images/220547/686/612/data_replication_performance_comparison.png" alt="data_replication_performance_comparison.png" width="412" height="106" /></li>
<li>Improved multi-key range optimization.</li>
<li>Enhanced optimization of <code>ORDER BY</code> and <code>GROUP BY</code> clauses.</li>
<li>Improved analytic function performance.</li>
<li>Improved performance of <code>INSERT ON DUPLICATE KEY UPDATE</code> and <code>REPLACE</code> statements.</li>
<li>Improved search and delete performance for non-unique indexes with many duplicate keys.</li>
<li>Improved delete performance when insert and delete operations are repeated.</li>
<li>The overall performance of <code>SELECT</code> operations has been improved by nearly 20%.<br /><img src="/files/attach/images/220547/686/612/select_operation_result_of_ycsb_benchmark.png" alt="select_operation_result_of_ycsb_benchmark.png" width="456" height="270" />&nbsp;</li>
<li>Based on the results of the basic performance test, the performance of <code>INSERT</code>, <code>DELETE</code>, and <code>UPDATE</code> operations is almost the same as in 9.0 Beta.</li>
</ul>
<h3>Multi-language support</h3>
<ul>
<li>In CUBRID 9.1 we now support <a href="/manual/91/en/release_note/r91.html#change-collation-coercibility-level-cubridsus-10057">collation for tables</a>.</li>
<li><code><a href="/manual/91/en/sql/query/show.html?highlight=show%20collation#show-collation">SHOW COLLATION</a></code> statement and new <code><a href="/manual/91/en/sql/function/information_fn.html?highlight=charset#CHARSET">CHARSET</a></code>, <code><a href="/manual/91/en/sql/function/information_fn.html?highlight=collation#COLLATION">COLLATION</a></code>, and <code><a href="/manual/91/en/sql/function/information_fn.html?highlight=coercibility#COERCIBILITY">COERCIBILITY</a></code>&nbsp;functions&nbsp;are now supported.</li>
<li>Added support for collation with <a href="/manual/91/en/release_note/r91.html#support-collation-with-expansion-sort-by-backward-accents-cubridsus-9407">expansion which sorts French with backward accent order</a>.</li>
<li>Fixed issues and eased restrictions that were present in the 9.0 Beta version.</li>
</ul>
<h3>CUBRID SHARD</h3>
<ul>
<li>We have added the&nbsp;<code><a href="/manual/91/en/shard.html?highlight=cubrid%20shard%20getid#checking-cubrid-shard-id">cubrid shard getid</a></code> command to verify the shard ID of a shard key.</li>
<li>CUBRID SHARD is now available on Windows as well.</li>
</ul>
<h3>Administration utility</h3>
<ul>
<li><code><a href="/manual/91/en/ha.html#cubrid-applyinfo">cubrid applyinfo</a></code> utility now also shows information about the replication delay.</li>
<li>The <code><a href="/manual/91/en/admin/admin_utils.html?highlight=killtran#killing-transactions">cubrid killtran</a></code> utility can now show the query execution information of each transaction and can&nbsp;remove transactions that are executing a designated SQL statement.</li>
<li>When a query timeout occurs,&nbsp;the query execution information is now logged to the server error log and the CAS log files.</li>
</ul>
<h3>Behavioral Changes</h3>
<ul>
<li><code>CUBRID_LANG</code> environment variable is no longer used.</li>
<li><code>CUBRID_LANG</code> is replaced by the <code>CUBRID_CHARSET</code> environment variable, which sets the database charset, and the <code>CUBRID_MSG_LANG</code> environment variable, which sets the charset for utility and error messages.</li>
<li>Array execution functions (the <code>cci_execute_array</code> and <code>cci_execute_batch</code> functions, and the <code>Statement.executeBatch</code> and <code>PreparedStatement.executeBatch</code> methods of JDBC) now commit after each individual query under auto commit mode, whereas previous versions committed once for the entire execution.</li>
<li>The behavior of the <code>cci_execute_array</code>, <code>cci_execute_batch</code> and <code>cci_execute_result</code> functions has changed for the case where an error occurs while they are executing multiple statements. These functions now continue and execute all of the given queries, whereas previous versions stopped execution and returned an error. Users can access the results and identify the errors with the <code>CCI_QUERY_RESULT_*</code> macros.</li>
<li><code>OFF</code> is no longer supported for <code>KEEP_CONNECTION</code> broker parameter.</li>
<li><code>SELECT_AUTO_COMMIT</code> broker parameter is no longer supported.</li>
<li>Change the allowed value range of a broker parameter <code>APPL_SERVER_MAX_SIZE_HARD_LIMIT</code> to <code>1 - 2,097,151</code>.</li>
<li>Change the default value of a broker parameter <code>SQL_LOG_MAX_SIZE</code> from <code>100 MB</code> to <code>10 MB</code>.</li>
<li>Change the behavior of the <code>call_stack_dump_activation_list</code> parameter.</li>
</ul>
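<p>The two batch-related changes above (a commit after each individual query under auto commit, and continuing past a failed statement) can be pictured with the following sketch. It is not CCI or JDBC code: it uses Python's <code>sqlite3</code> module in autocommit mode as a stand-in, and the <code>run_batch</code> helper and the table <code>t</code> are hypothetical names chosen for the example.</p>
<pre>
```python
import sqlite3


def run_batch(conn, statements):
    """Run every statement and record (True, rowcount) or (False, message).

    A failing statement does not stop the batch; the caller inspects the
    per-statement results afterwards.
    """
    results = []
    for sql in statements:
        try:
            cur = conn.execute(sql)
            results.append((True, cur.rowcount))
        except sqlite3.Error as exc:
            results.append((False, str(exc)))
    return results


# isolation_level=None puts the sqlite3 module into autocommit mode,
# so each successful statement is committed on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

results = run_batch(conn, [
    "INSERT INTO t VALUES (1)",
    "INSERT INTO t VALUES (1)",  # fails: duplicate primary key
    "INSERT INTO t VALUES (2)",  # still executed after the failure
])
committed = [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
```
</pre>
<p>Under the earlier behavior described above, the batch would have stopped at the second statement and the third row would not have been inserted. Code that relied on that all-or-nothing outcome under auto commit should be reviewed when upgrading.</p>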
<h3>Numerous Improvements and Bug Fixes</h3>
<ul>
<li>Fix many critical issues of the previous versions.</li>
<li>Improve or fix many issues of stability, SQL, partitioning, HA, Sharding, utilities, and drivers.</li>
</ul>
<p>For more details on changes, see the Release Notes in <a href="/manual/91/en/release_note/r91.html">English</a> or <a href="/manual/91/ko/release_note/r91.html">Korean</a>.</p>
<p>CUBRID 9.1 is our biggest release so far, and we would like you to try it. We have also released new, improved drivers for <a href="/wiki_apis/entry/cubrid-node-js-driver">Node.js</a>, <a href="/wiki_apis/entry/cubrid-php-driver">PHP</a>, <a href="/wiki_apis/entry/cubrid-pdo-driver">PDO</a>, <a href="/wiki_apis/entry/cubrid-python-driver">Python</a>, <a href="/wiki_apis/entry/cubrid-perl-driver">Perl</a>, <a href="/wiki_apis/entry/cubrid-jdbc-driver">JDBC</a>, <a href="/wiki_apis/entry/cubrid-odbc-driver">ODBC</a>, <a href="/wiki_apis/entry/cubrid-oledb-driver">OLEDB</a>, <a href="/wiki_apis/entry/cubrid-ado-net-driver">ADO.NET</a>, and <a href="/wiki_apis/entry/cubrid-cci-driver">C</a>, so you should definitely try the new, faster and more stable CUBRID 9.1 Database.</p>
<p>If you have any questions, feel free to leave your comment below.</p>]]></description>
                        <pubDate>Fri, 15 Mar 2013 20:51:03 +0900</pubDate>
                        <category>New Release</category>
                        <category>performance</category>
                        <category>Sharding</category>
                        <category>HA</category>
                        <category>multilingual support</category>
                                </item>
            </channel>
</rss>
