<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
    <channel>
        <title>CUBRID Architecture</title>
        <link>http://www.cubrid.org/?mid=cubrid_architecture</link>
        <description>CUBRID Architecture</description>
        <language>en</language>
        <pubDate>Mon, 06 Dec 2010 11:59:22 -0800</pubDate>
        <lastBuildDate>Thu, 01 Sep 2011 00:19:57 -0800</lastBuildDate>
        <generator>XpressEngine 1.4.4.1</generator>
                        										        <item>
            <title>CUBRID Architecture</title>
            <dc:creator>admin</dc:creator>
            <link>http://www.cubrid.org/cubrid_architecture</link>
            <guid isPermaLink="true">http://www.cubrid.org/cubrid_architecture</guid>
                                    <description><![CDATA[<h1>The Architecture of CUBRID</h1>

<div class="contents-table">
<h3>Table of Contents</h3>
<ul>
	<li><a class="toTop">Back to Top</a></li>
	<li><a href="#_Toc244403457">Introduction</a></li>
	<ul>
		<li><a href="#_Toc244403458">Overall Architecture of the CUBRID System</a></li>
		<li><a href="#_Toc244403459">Process Architecture</a></li>
		<ul><li><a href="#_Toc244403460">Connection Configuration</a></li></ul>
	</ul>
	<li><a href="#_Toc244403461">Broker</a></li>
	<ul>
		<li><a href="#_Toc244403462">The cub_broker Process</a></li>
		<li><a href="#_Toc244403463">The cub_cas Process</a></li>
	</ul>
	<li>
		<a href="#_Toc244403464">Client and Server Modules</a>
	</li>
	<ul>
		<li><a href="#_Toc244403465">Module Configuration</a></li>
		<ul>
			<li><a href="#_Toc244403466">Transaction Management Component</a></li>
			<li><a href="#_Toc244403467">Server Storage Management Component</a>
			</li>
			<li><a href="#_Toc244403468">Client Storage Management Component</a>
			</li>
			<li><a href="#_Toc244403469">Object Management Component</a>
			</li>
			<li><a href="#_Toc244403470">Client-Server Communications</a>
			</li>
			<li><a href="#_Toc244403471">Thread Management Component</a>
			</li>
			<li><a href="#_Toc244403472">Query Processing</a>
			</li>
		</ul>
	</ul>
	<li><a href="#_Toc244403473">Detailed Description for the Modules</a>
	</li>
	<ul>
		<li><a href="#_Toc244403474">Transaction Management Component</a>
		</li>
		<li><a href="#_Toc244403475">Object Management Component</a>
		</li>
		<li><a href="#_Toc244403476">Query Processing</a>
		</li>
	</ul>
</ul>
</div>

<div class="category">
<a class="pdf right" href="/files/docs/misc/The-Architecture-of-CUBRID.pdf" target="_self" title="Download this document in PDF">Download this document in PDF</a>
</div>

<p>CUBRID is an <b>object-relational</b> database management system (DBMS). It has a <b>3-tier architecture</b> which consists of the <i>Database Server</i>, the <i>Broker</i>, and the <i>CUBRID Manager</i>.</p>

<ul>
<li><b>Database Server:</b> The core component of the CUBRID Database Management System, which stores and manages data in a multi-threaded client/server
	architecture. The Database Server processes the queries entered by users and manages objects in the database. It provides
	seamless transactions through locking and logging, even when multiple users use the database at the same time. It also supports database
	backup and restore.
</li>
<li><b>Broker:</b>&nbsp;It is a CUBRID-specific middleware that relays the communication between the Database Server and external applications. It provides
	functions including connection pooling, monitoring, and log tracing and analysis.
</li>
<li><b>CUBRID Manager:</b>&nbsp;It is a GUI tool that manages databases and brokers. It also provides the Query Editor, a tool that allows users to execute SQL
	queries on the Database Server.
</li>
</ul>

<p>CUBRID's architecture&nbsp;allows a configuration with one Broker and many DB Servers, i.e.&nbsp;<b>Broker/DB Server = 1/N</b>.&nbsp;The basic configuration of CUBRID is shown below in <b>Figure 1</b>.</p>

<p><img width="350" height="373" src="http://www.cubrid.org/files/attach/images/49/516/003/basic-onfiguration-of-cubrid.PNG" alt="Figure 1: Basic Configuration of CUBRID" editor_component="image_link"/></p>

<h2 id="_Toc244403457">1. Introduction</h2>

<h3 id="_Toc244403458">1.1 Overall Architecture of the CUBRID System</h3>

<p><b>Figure 2</b> below&nbsp;shows a simplified version of the CUBRID Architecture.</p>

<p><img src="http://www.cubrid.org/files/attach/images/49/516/003/cubrid-architecture-simplified.PNG" alt="Figure 2: CUBRID Architecture - Simplified" width="725" height="408" editor_component="image_link"/></p>

<p><b>Figure 3</b>&nbsp;shows the overall architecture of the CUBRID system.</p>

<p class="center">
	<img width="705" height="479" src="http://www.cubrid.org/files/attach/images/49/516/003/overall-architecture-of-the-cubrid-system.PNG" alt="Figure 3: Overall Architecture of the CUBRID System" editor_component="image_link"/></p>
<p>
	The CUBRID system follows the client/server model, which allows multiple applications to access the same database simultaneously. When the client module
	(the Broker in <b>Figure 3</b>) and the server module (the Server in <b>Figure 3</b>) run on separate systems (computers), they are connected through a
	network. When a broker and a server run on the same system, the architecture is the same, because they still communicate
	via socket IPC. A server handles requests from multiple clients in a single multi-threaded process, and each server process
	manages one database.
</p>
<p>
	The client module parses SQL queries issued by users or applications and processes them through the optimization step. It then generates a
	query plan tree and sends it to the server. It receives the execution results from the server through cursor navigation and delivers them
	to the users or applications. The client caches object instances from the database in its memory so that data can be accessed quickly, either through query
	execution results or directly by users/applications. In addition, it caches locks as well as objects from the server for concurrency control.
	Triggers and methods specified by users or applications are also executed in the client module.
</p>
<p>
	The server module receives and processes requests from the client module (e.g., object requests or query execution requests that carry a query execution
	tree) and then returns the query execution results. The server handles requests from multiple clients in a single multi-threaded
	process. To support multiple client modules with an appropriate number of threads, server threads are allocated per broker request,
	not per broker. The server performs input and output operations on the database and log volumes and provides access to the database
	volume in units of files and pages. In addition, it manages the page buffer in memory and uses B+-tree indexes to increase retrieval speed. The server also
	provides concurrency control, deadlock detection, and failure recovery across multiple transactions.
</p>

<h3 id="_Toc244403459">1.2 Process Architecture</h3>

<p>Below is a simplified architecture of the CUBRID Processes.</p>

<p><img width="350" height="451" src="http://www.cubrid.org/files/attach/images/49/516/003/process-architecture-of-the-cubrid-system.PNG" alt="Figure 4: Process Architecture of the CUBRID System" editor_component="image_link"/></p>

<p>Below is a more detailed architecture of the CUBRID Processes.</p>

<p><img width="637" height="520" src="http://www.cubrid.org/files/attach/images/49/516/003/cubrid-process.png" alt="Figure 5: Process Architecture of the CUBRID System" editor_component="image_link"/></p>

<p>
	<b>Figure 5</b>&nbsp;shows the process architecture of the CUBRID system. A server host runs one master process (cub_master) and one or more database
	server processes (cub_server). Each client process (cub_cas), which may reside on any of several broker hosts, connects to a single database server process.
</p>
<p>
	The cub_broker process allocates cub_cas, passes a connection and manages cub_cas for a connect request from an application. The cub_cas process
	executes database queries from the application.
</p>
<h4 id="_Toc244403460">1.2.1 Connection Configuration</h4>
<p>
	The cub_cas process connects to the defined connection port number of the master process. The master process checks whether the requested database
	server is running; the connection request is rejected if the server is not running. If the requested database server is running, the master process
	passes the connected socket to the requested server process. Then, the server process communicates with the client process (cub_cas) directly
	through the socket.
</p>
<p>
	The database server process connects to the master process's port, registers its server name (database name), and establishes a UNIX Domain
	Socket (or Named Pipe) connection to the master process. Over this connection, the master process passes the socket descriptors of clients (cub_cas) to the server;
	the connection is also kept open for server shutdown and other future operations. After the connection between the server and a client process
	(cub_cas) is established, the server process allocates a thread for each client request and performs the task.
</p>
<h5>Master Process (cub_master)</h5>
<p>
	1. Checks whether another master process is already running by connecting to cubrid_port_id.
</p>
<p>
	2. Becomes a daemon process, opens a socket on the port defined as cubrid_port_id, and waits for connections from clients and
	servers.
</p>
<p>
	3. Registers a server name and establishes a UNIX domain socket connection to the server process if the connection is from the database server
	process.
</p>
<p>
	4. If the connection is from a client process (cub_cas), passes the connected socket (socket descriptor) to the database server requested by the client,
	which establishes a socket connection between the client and the server.
</p>
<h5>Database Server Process (cub_server)</h5>
<p>
	1. Connects to the designated port of the master process. If the connection fails, the connection attempt is aborted, assuming that the master
	process is not running.
</p>
<p>
	2. Registers its server name (database name) to the master process if the connection to the master process is established. At this time, if a
	server with the same name already exists, the registration is rejected, and the server is terminated.
</p>
<p>
	3. Creates a UNIX Domain socket (or Named Pipe), sends its connection path (socket file path) to the master process, and closes the socket
	connection to the designated port once the master process has connected.
</p>
<p>
	4. Waits for task requests from connected clients. While waiting, it also handles any new client connections relayed by the master
	process.
</p>
<p>
	5. Accepts requests from the connected client and performs tasks by allocating threads.
</p>
<h5>Client Process (cub_cas)</h5>
<p>
	1. Connects to the master process that exists on a remote or local server through the port defined as cubrid_port_id.
</p>
<p>
	2. Once the connection to the master process is established, sends the name of the database to connect to and checks whether the database server
	process is registered and running. The connection is rejected if there is no corresponding server.
</p>
<p>
	3. Receives response messages directly from the server because the master process passes the socket connection between the client and the master
	process to the corresponding server process.
</p>
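<p>
	The registration and hand-off steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and
	method names (Master, Server, register, connect, accept_handoff) are hypothetical, and plain Python objects stand in for the sockets and descriptors that the real processes exchange.
</p>

```python
class Server:
    """Toy stand-in for cub_server: remembers the client connections handed to it."""

    def __init__(self, db_name):
        self.db_name = db_name
        self.clients = []

    def accept_handoff(self, client_conn):
        # From here on the client talks to the server directly.
        self.clients.append(client_conn)


class Master:
    """Toy stand-in for cub_master: a registry of running servers by database name."""

    def __init__(self):
        self.servers = {}  # database name -> Server

    def register(self, server):
        # A second server with an already registered name is rejected.
        if server.db_name in self.servers:
            return False
        self.servers[server.db_name] = server
        return True

    def connect(self, db_name, client_conn):
        # Reject the request when no server is registered for the database.
        server = self.servers.get(db_name)
        if server is None:
            return False
        # Pass the client's connection on; the master leaves the data path.
        server.accept_handoff(client_conn)
        return True


master = Master()
demodb = Server("demodb")
print(master.register(demodb))             # True: first registration
print(master.register(Server("demodb")))   # False: duplicate name
print(master.connect("demodb", "cas-1"))   # True: handed to the server
print(master.connect("missing", "cas-2"))  # False: no such server
```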

<h2 id="_Toc244403461">2. Broker</h2>

<p>
	The Broker is middleware that relays the communication between the database server and applications. It consists of <b>cub_broker</b> and <b>cub_cas</b>. Before you continue reading, we suggest reading <a href="http://blog.cubrid.org/cubrid-story/the-cubrid-broker-story/" target="_self">The CUBRID Broker Story</a> blog post, which gives a clear idea of why there is a Broker layer in CUBRID's architecture.</p>
<h3 id="_Toc244403462">2.1 The cub_broker Process</h3>
<p>
	The cub_broker process allocates cub_cas, passes a connection and manages cub_cas for a connection request from an application. cub_broker has a
	multi-threaded architecture and consists of the following threads:
</p>
<ul>
<li>main<br />This thread creates other threads and manages the number of cub_cas processes. It increases or decreases the number of cub_cas processes depending on the number of requests in the job queue.
</li>
<li>receiver_thread<br />As a thread waiting for the accept() system call, this thread puts a connection request from an application into the job queue.</li>
<li>dispatch_thread<br />This thread finds cub_cas available to allocate to the connection requests in the job queue and passes the connection to cub_cas.
</li>
<li>cas_monitor_thread<br />If cub_cas is abnormally terminated, this thread restarts cub_cas.</li>
</ul>
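<p>
	The interplay between receiver_thread and dispatch_thread can be pictured with the following single-threaded sketch. It is a simplification, not the
	actual implementation: the Broker class and its receive, dispatch, and release methods are hypothetical names, and real threads, sockets, and process management are left out.
</p>

```python
from collections import deque


class Broker:
    """Toy model of the job queue shared by receiver_thread and dispatch_thread."""

    def __init__(self, cas_ids):
        self.job_queue = deque()        # pending application connections
        self.idle_cas = deque(cas_ids)  # cub_cas processes ready for work
        self.assigned = {}              # cas id -> application connection

    def receive(self, app_conn):
        # receiver_thread: enqueue a newly accepted connection.
        self.job_queue.append(app_conn)

    def dispatch(self):
        # dispatch_thread: pair queued connections with available cub_cas processes.
        while self.job_queue and self.idle_cas:
            cas = self.idle_cas.popleft()
            self.assigned[cas] = self.job_queue.popleft()

    def release(self, cas):
        # The application disconnected; this cub_cas can serve another one.
        del self.assigned[cas]
        self.idle_cas.append(cas)


broker = Broker(["cas-1", "cas-2"])
for app in ["app-a", "app-b", "app-c"]:
    broker.receive(app)
broker.dispatch()
print(broker.assigned, list(broker.job_queue))
```

<p>
	With two cub_cas processes and three requests, the third request stays in the job queue until one cub_cas is released. A growing queue is the signal the main thread would use to start more cub_cas processes.
</p>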

<h3 id="_Toc244403463">2.2 The cub_cas Process</h3>
<p>
	The cub_cas process executes database queries from an application and has a single-threaded architecture. This process connects to the database
	server when it receives a “connection” request from an application and calls the function corresponding to each request from the application. After
	the connection with the application is terminated, this process can receive a connection from another application. When an application
	disconnects, the connection to the database server is not terminated. If the next application uses the same database as the current one, the existing
	database connection is reused.
</p>
<p>
	Depending on the application's connection status, cub_cas has four statuses: IDLE, BUSY, CLIENT WAIT, or CLOSE WAIT.
</p>
<p>
	- IDLE: No connection is made to an application.
</p>
<p>
	- BUSY: A connection is made to an application, and the request from the application is being processed.
</p>
<p>
	- CLIENT WAIT: cub_cas is waiting for a request from the application while a transaction is in progress.
</p>
<p>
	- CLOSE WAIT: cub_cas is waiting for a request from the application, but the transaction has ended. If the connection between cub_cas and the
	application is closed in this status, the application attempts to reconnect.
</p>
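<p>
	The four statuses can be read as a function of three facts about a cub_cas process. The sketch below restates the list above in code; the function name
	cas_status and its three boolean parameters are hypothetical and are not taken from the CUBRID source.
</p>

```python
from enum import Enum


class CasStatus(Enum):
    IDLE = "IDLE"
    BUSY = "BUSY"
    CLIENT_WAIT = "CLIENT WAIT"
    CLOSE_WAIT = "CLOSE WAIT"


def cas_status(connected, processing_request, in_transaction):
    """Derive the cub_cas status from its connection and transaction state."""
    if not connected:
        return CasStatus.IDLE
    if processing_request:
        return CasStatus.BUSY
    # Connected and waiting for the next request from the application.
    if in_transaction:
        return CasStatus.CLIENT_WAIT
    return CasStatus.CLOSE_WAIT


print(cas_status(connected=True, processing_request=False, in_transaction=True).value)
```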
<p>
	The cub_cas process waits for the select() call after a connection to the application is established and processes each function passed by the
	application. Main functions that respond to requests from an application are as follows:
</p>
<ul>
<li>fn_end_tran<br />This function performs commit/rollback. If KEEP_CONNECTION is set to off in the cubrid_broker.conf file, it terminates the connection to the
	application when a transaction ends and establishes a new connection when a new transaction starts. If KEEP_CONNECTION is set to auto, the
	status of cub_cas changes to CLOSE WAIT when a transaction ends. In this case, if the application connected to cub_cas has not sent a new
	request and a new application has sent a "connection" request, the cub_broker process can select a cub_cas whose status is CLOSE WAIT,
	terminate its connection to the previous application, and ask it to accept the connection from the new application.
</li>
<li>fn_prepare<br />This function processes a prepare request from an application. It compiles the query, creates a handle for the compiled query, and sends it to the
	application. The application then sends an execution request by using the created handle. If the compiled query is a SELECT
	query, meta information on its columns is extracted and sent to the application.
</li>
<li>fn_execute<br />This function executes a prepared query statement. If the statement is a SELECT, it sends the query results in chunks of the specified buffer size;
	for other statements, it sends the execution result. If JDBC RESULT CACHE is in use and the executed query already exists in JDBC RESULT
	CACHE, this function determines whether the stored query results can be reused. If they can be reused, the query results are not sent. Instead,
	only a flag indicating reusability is sent to the JDBC driver.
</li>
<li>fn_fetch<br />This function copies the query results of a SELECT statement, up to the specified buffer size, and sends them to the application.
</li>
</ul>
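<p>
	Sending SELECT results in pieces bounded by a buffer size, as fn_execute and fn_fetch do, might look like the sketch below. The function name fetch_batches
	and the rule that an oversized row travels alone are assumptions made for illustration; the real wire format is not described here.
</p>

```python
def fetch_batches(rows, buffer_size):
    """Yield lists of encoded rows whose total size stays within buffer_size bytes.

    A single row larger than the buffer is still sent, alone, in its own batch.
    """
    batch, used = [], 0
    for row in rows:
        if batch and used + len(row) > buffer_size:
            yield batch
            batch, used = [], 0
        batch.append(row)
        used += len(row)
    if batch:
        yield batch


rows = [b"aaaa", b"bbbb", b"cc", b"dddddddddd"]
for batch in fetch_batches(rows, buffer_size=8):
    print(batch)
```

<p>
	Each yielded batch corresponds to one response to the application; the application keeps issuing fetch requests until no batch is left.
</p>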

<h2 id="_Toc244403464">3. Client and Server Modules</h2>

<p>
	This chapter describes the components of the entire server (hereinafter, the server) and the native C API &amp; other modules (hereinafter, the client)
	in the Client Library of the Broker as shown in <b>Figure </b>6.
</p>
<p class="center">
	<img width="705" height="608" src="http://www.cubrid.org/files/attach/images/49/516/003/detailed-architecture-of-the-cubrid-system.PNG" alt="Figure 6: Detailed Architecture of the CUBRID System" editor_component="image_link"/>
</p>

<h3 id="_Toc244403465">3.1 Module Configuration</h3>
<p>
	The CUBRID client and server modules consist of the following components:
</p>
<ul>
<li>Transaction Management Component<br />Handles system transactions across the client and server (including system failover).
</li>
<li>Server Storage Management Component<br />Accesses and manages database and log volume on the server (including page buffering).
</li>
<li>Client Storage Management Component<br />Allocates and manages a workspace for the object cache and access on the client.
</li>
<li>Object Management Component<br />Defines a class object, creates and modifies an object, converts the object representation structure between the disk and the memory.
</li>
<li>Client-Server Communications<br />Manages the network communication between the client and the server.
</li>
<li>Thread Management<br />Manages threads of a server process.
</li>
<li>Query Processing<br />Executes query plans on the server, which are created by translating, analyzing and optimizing SQL statements on the client.
</li>
</ul>
<p>
	The module configuration of each component is described in the following section.
</p>
<h4 id="_Toc244403466">3.1.1 Transaction Management Component</h4>
<p>
	The Transaction Management Component consists of the modules in dark blue in <b>Figure </b>7.
</p>
<p class="center">
	<img width="705" height="562" src="http://www.cubrid.org/files/attach/images/49/516/003/module-configuration-of-transaction-management-component.PNG" alt="Figure 7: Module Configuration of Transaction Management Component" editor_component="image_link"/>
</p>

<ul>
<li>Object Locator<br />A module that passes object data between the workspace on the client and the page buffer pool on the server. It caches objects in the
	workspace and acquires locks on them.
</li>
<li>Transaction Manager<br />A module that performs transaction start, commit, and rollback. It initializes the other modules (lock/log/recovery manager) of the Transaction Management
	Component. This module also supports commit, rollback, and savepoint, including 2PC (2-phase commit).
</li>
<li>Lock Manager<br />A module that performs lock management based on the 2PL (2-phase locking) protocol. It supports granularity locking protocols.
</li>
<li>Recovery Manager<br />A module that protects database consistency against system failure. It employs a recovery method that uses UNDO/REDO logging and the WAL (Write Ahead
	Logging) protocol. This module supports total rollback, partial rollback (to a savepoint), and nested top operations, and uses LSAs (Log Sequence Addresses),
	CLRs (Compensation Log Records), and so on.
</li>
</ul>
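<p>
	The WAL protocol mentioned for the Recovery Manager comes down to one ordering rule: a data page may reach the disk only after the log records describing its
	changes have. A minimal sketch follows, assuming an integer LSA (a real LSA addresses a position in the log) and hypothetical names WalLog, Page, update, and write_page_to_disk.
</p>

```python
class WalLog:
    """Toy write-ahead log: records get increasing LSAs; flushing is explicit."""

    def __init__(self):
        self.records = []
        self.flushed_lsa = -1  # highest LSA known to be on stable storage

    def append(self, undo, redo):
        self.records.append((undo, redo))
        return len(self.records) - 1  # the LSA of the new record

    def flush(self, up_to_lsa):
        self.flushed_lsa = max(self.flushed_lsa, up_to_lsa)


class Page:
    def __init__(self):
        self.value = None
        self.page_lsa = -1  # LSA of the last log record that changed this page


def update(log, page, new_value):
    # Log first (UNDO = old value, REDO = new value), then change the page.
    page.page_lsa = log.append(undo=page.value, redo=new_value)
    page.value = new_value


def write_page_to_disk(log, page, disk):
    # WAL rule: the log must reach stable storage before the data page does.
    if log.flushed_lsa < page.page_lsa:
        log.flush(page.page_lsa)
    disk["page"] = page.value


log, page, disk = WalLog(), Page(), {}
update(log, page, "v1")
write_page_to_disk(log, page, disk)
print(log.flushed_lsa, disk)
```

<p>
	Because the UNDO and REDO images are on stable storage before the page is, a restart can either reapply or reverse the change regardless of when the failure happened.
</p>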

<h4 id="_Toc244403467">3.1.2 Server Storage Management Component</h4>
<p>
	The Server Storage Management Component consists of the modules shown in <b>Figure </b>8.
</p>
<p class="center">
	<img width="705" height="471" src="http://www.cubrid.org/files/attach/images/49/516/003/module-configuration-of-server-storage-management-component.PNG" alt="Figure 8: Module Configuration of Server Storage Management Component" editor_component="image_link"/>
</p>

<ul>
<li>I/O Manager<br />As a module performing I/O tasks for the disk volume (or volume file), it performs a volume mount/unmount process and locks a volume. This module
	performs write synchronization for a log volume.
</li>
<li>Page Buffer Management<br />As a module managing the page buffer in a virtual memory that is used for disk page buffering, it employs the LRU page replacement algorithm and the
	FIX/UNFIX protocol to use page buffer. In addition, this module uses a hash table to quickly retrieve a requested page in the buffer pool.
</li>
<li>Disk Manager<br />A module managing the internal structure of the disk volume (or volume file). A volume consists of sectors, and a sector is a group of continuous
	pages. Each volume consists of system area and user area. The bit allocation map is used for page allocation in the volume.
</li>
<li>File Manager<br />A module that lets the database be accessed in terms of files and pages only, regardless of the internal structure of the volume (volume, sector, and page). It is used
	by file structures such as the B+-tree, heap, and hash. The File Manager keeps and manages information on the sectors allocated to a file in
	the file header.
</li>
<li>Slotted Page Manager<br />As a module inserting, deleting and updating records in a file page, it provides slot structure that indicates the position (offset) of records in a
	page; it can move records in a page through a slot.
</li>
<li>Overflow Page Manager<br />A module inserting, deleting, and updating records larger than one page in an overflow page area. With this module, large data
	can be treated atomically.
</li>
<li>Object Heap Manager<br />A module inserting, deleting, and updating objects in a file through the heap structure. The instances (records) of a class (table) are stored in
	an object heap file, and a unique OID (object identifier) is allocated to each record. The OID consists of "Volume ID + Page ID + Slot ID" and is
	not reused except in special cases. This OID representation is the same as disk addressing in the Disk Manager; that is, the OID indicates the physical
	location on disk where a record is stored.
</li>
<li>Extendible Hash Manager<br />As a module providing the extendible hashing to access data quickly, it is used to retrieve class OIDs with a class name.
</li>
<li>B+-tree Manager<br />As a module providing an index file structure based on the prefix B+-tree, it inserts, deletes, and retrieves a key for B+-tree.
</li>
<li>Long Data Manager<br />As a module processing ad-hoc large objects such as multimedia data, it can modify part of the data.
</li>
</ul>
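<p>
	The combination of hash lookup, LRU replacement, and the FIX/UNFIX protocol described for Page Buffer Management can be sketched as follows. This is a toy
	model under assumptions: the PageBuffer class, its fix and unfix methods, and the read_page callback are hypothetical names, and dirty-page handling is omitted.
</p>

```python
from collections import OrderedDict


class PageBuffer:
    """Toy buffer pool: hash lookup by page id, LRU eviction, FIX/UNFIX pinning."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page   # callable: page id -> page contents
        self.frames = OrderedDict()  # page id -> [contents, fix count], LRU first

    def fix(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)  # now the most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[page_id] = [self.read_page(page_id), 0]
        frame = self.frames[page_id]
        frame[1] += 1                # a fixed page cannot be replaced
        return frame[0]

    def unfix(self, page_id):
        self.frames[page_id][1] -= 1

    def _evict(self):
        # Replace the least recently used page that is not currently fixed.
        victim = next((pid for pid, frame in self.frames.items() if frame[1] == 0), None)
        if victim is None:
            raise RuntimeError("all buffered pages are fixed")
        del self.frames[victim]


buf = PageBuffer(2, read_page=lambda pid: f"page-{pid}")
for pid in (1, 2, 1, 3):
    buf.fix(pid)
    buf.unfix(pid)
print(list(buf.frames))
```

<p>
	In the usage above, page 2 is replaced rather than page 1, because page 1 was touched more recently when page 3 had to be loaded.
</p>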
<h4 id="_Toc244403468">3.1.3 Client Storage Management Component</h4>
<p>
	The Client Storage Management Component consists of the modules shown in <b>Figure </b>9.
</p>
<p class="center">
	<img width="705" height="283" src="http://www.cubrid.org/files/attach/images/49/516/003/module-configuration-of-client-storage-management-component.PNG" alt="Figure 9: Module Configuration of Client Storage Management Component" editor_component="image_link"/>
</p>

<ul>
<li>Workspace Manager<br />A module managing the database objects cached in the workspace of the client process. Through an object table implemented as a hash, it converts a disk
	object identifier OID to a memory object pointer (MOP). The MOP has a memory pointer that helps access to objects cached in the client memory.
</li>
<li>Garbage Collector<br />A module collecting garbage for the client workspace. This module releases the memory that is allocated to MOPs and cached objects.
</li>
<li>Quick Fit Storage Allocator<br />A module allocating memory in the workspace for an object.
</li>
</ul>
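<p>
	The object table of the Workspace Manager maps each OID to exactly one MOP, so every reference to an object goes through the same handle. A rough sketch,
	assuming an OID is a (volume id, page id, slot id) tuple and using hypothetical names Mop, Workspace, mop_for, and fetch_from_server:
</p>

```python
class Mop:
    """Memory object pointer: a stable handle for one OID in the workspace."""

    def __init__(self, oid):
        self.oid = oid
        self.obj = None  # cached object, or None if not fetched yet


class Workspace:
    def __init__(self, fetch_from_server):
        self.fetch_from_server = fetch_from_server
        self.object_table = {}  # hash table: OID -> Mop

    def mop_for(self, oid):
        # The same OID always maps to the same MOP.
        mop = self.object_table.get(oid)
        if mop is None:
            mop = Mop(oid)
            self.object_table[oid] = mop
        return mop

    def get(self, oid):
        mop = self.mop_for(oid)
        if mop.obj is None:
            mop.obj = self.fetch_from_server(oid)  # cache miss
        return mop.obj


ws = Workspace(fetch_from_server=lambda oid: {"oid": oid, "name": "example"})
oid = (0, 10, 3)  # (volume id, page id, slot id)
print(ws.get(oid) is ws.get(oid))
```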
<h4 id="_Toc244403469">3.1.4 Object Management Component</h4>
<p>
	The Object Management Component consists of the modules in <b>Figure </b>10.
</p>
<p class="center">
	<img width="705" height="485" src="http://www.cubrid.org/files/attach/images/49/516/003/module-configuration-of-object-management-component.PNG" alt="Figure 10: Module Configuration of Object Management Component" editor_component="image_link"/>
</p>

<ul>
<li>Representation Manager<br />A module that converts an object between its disk representation and its memory representation. On disk, object data has a structure suited to
	query execution; in memory, it has a structure that makes it easy for an application to access. The Representation Manager converts between
	these two formats. It also handles byte ordering during conversion.
</li>
<li>Schema Manager<br />As a module defining and changing a class, it creates, modifies, or manages the inheritance of a column, method, or class.
</li>
<li>Object Access Manager<br />As a module creating, deleting, modifying, checking an object or calling a method, it is closely related to the Schema Manager.
</li>
<li>Dynamic Loader<br />A module providing a dynamic link to an application that is executing methods written in C.
</li>
<li>Trigger Manager<br />A module implementing a trigger feature with a system object. This module is closely related to the Schema Manager and Object Access Manager.
</li>
<li>Authorization Manager<br />A module checking the authority of a database user. This module is implemented on top of the API provided by the Object Access Manager.
</li>
<li>Data Type and Domain<br />A module manipulating internal data structure (representation format) for data type and domain information. This module caches the information about
	the used domain to a connection list and has a domain conversion matrix.
</li>
</ul>
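<p>
	The byte-ordering part of the Representation Manager's job can be illustrated with a fixed record layout. The layout below (a 4-byte integer followed by an
	8-byte float, stored big-endian) and the names to_disk and to_memory are assumptions chosen for the example, not CUBRID's actual disk format.
</p>

```python
import struct

# Assumed layout: 4-byte signed integer "id", then 8-byte float "score".
# The leading ">" forces big-endian order, so the bytes are the same on every host.
DISK_FORMAT = ">id"


def to_disk(obj):
    """Memory representation (dict) -> disk representation (bytes)."""
    return struct.pack(DISK_FORMAT, obj["id"], obj["score"])


def to_memory(data):
    """Disk representation (bytes) -> memory representation (dict)."""
    obj_id, score = struct.unpack(DISK_FORMAT, data)
    return {"id": obj_id, "score": score}


record = to_disk({"id": 7, "score": 1.5})
print(record.hex(), to_memory(record))
```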
<h4 id="_Toc244403470">3.1.5 Client-Server Communications</h4>
<p>
	Client-Server Communications consists of the modules in <b>Figure </b>11.
</p>
<p class="center">
	<img width="703" height="393" src="http://www.cubrid.org/files/attach/images/49/516/003/module-configuration-of-client-server-communications.PNG" alt="Figure 11: Module Configuration of Client-Server Communications" editor_component="image_link"/>
</p>

<ul>
<li>Socket Manager<br />A module managing communications in the client, the server and the master process (cub_master). This module manages the procedures of connection to the
	client or server through the master process.
</li>
<li>Packet Manager<br />A module processing the packets that are used to exchange information between the client and the server. The packet types include request packet, data
	packet, close packet, out-of-band packet, and error packet. Request and data packets can be exchanged asynchronously through queues on the
	client and the server.
</li>
<li>Client-Server Interface<br />A module providing an interface to use Client-Server Communications in the system. This module handles exceptions that occur during communications
	as well as out-of-band events such as user interrupts.
</li>
</ul>
<h4 id="_Toc244403471">3.1.6 Thread Management Component</h4>
<p>
	The Thread Management Component manages multiple threads in the server process; it is implemented using pthread. It detects requests from
	clients with the select() system call and assigns a task to a thread for each request. A worker thread that processes client requests
	waits for a task in the Job Queue and wakes up when a task enters the queue. After it processes the task, it waits in the Job Queue for the next
	one. In addition to worker threads, there are system threads that handle only special system tasks.
</p>
<ul>
<li>Deadlock detection thread<br />This thread checks for deadlocks at a fixed interval or when a lock request is made, and resolves any deadlock it finds.
</li>
<li>Checkpoint thread<br />At a fixed interval, this thread performs a checkpoint, which flushes data pages that have been committed but are still cached in the page buffer and
	not yet written to disk. Periodic checkpoints reduce the recovery time after a failure.
</li>
<li>OOB (out-of-band) thread<br />This thread receives the OOB signal and passes it to the appropriate thread.
</li>
<li>Page-flush thread<br />This thread periodically flushes the dirty pages in the page buffer to the disk. This improves system performance by reducing the number of dirty pages
	that must be flushed during page replacement.
</li>
<li>Log-flush thread<br />This thread flushes log pages to the log volume. Group commit and asynchronous commit are provided through this thread.
</li>
</ul>
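<p>
	One common way to detect a deadlock is to look for a cycle in a wait-for graph, in which an edge from T1 to T2 means that transaction T1 is waiting for a lock
	held by T2. The text above does not say which method the deadlock detection thread uses, so the sketch below is an assumption, and find_deadlock is a hypothetical name.
</p>

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for maps a transaction id to the set of transaction ids it waits on.
    """
    visiting, done = [], set()

    def visit(txn):
        if txn in visiting:
            return visiting[visiting.index(txn):]  # the cycle itself
        if txn in done:
            return None
        visiting.append(txn)
        for other in sorted(waits_for.get(txn, ())):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(waits_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


print(find_deadlock({1: {2}, 2: {3}, 3: {1}}))  # [1, 2, 3]
print(find_deadlock({1: {2}, 2: {3}}))          # None
```

<p>
	Once a cycle is found, the deadlock can be resolved by rolling back one transaction in the cycle, which releases its locks and lets the others proceed.
</p>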
<h4 id="_Toc244403472">3.1.7 Query Processing</h4>
<p>
	The Query Processing consists of the following modules.
</p>
<ul>
<li>Scanner/Parser<br />As a module translating queries (SQL) from users or applications, it creates a parse tree.
</li>
<li>Semantic Checker<br />A module performing node typing, name resolution, semantic checking, or view translation, etc.
</li>
<li>XASL Generator/Optimizer<br />A module creating XASL (eXtended Access Specification Language) tree which is a query execution plan and performing query optimization by using schema
	information and database statistics. The XASL tree includes scan information (heap scan, index scan, list file scan, set scan, and method scan), a
	value list (values required for query results) and predicate. The query optimization employs cost-based optimization and rewrite optimization.
</li>
<li>Query Manager<br />A server module that executes an XASL tree received from the client. This module consists of the Query File Manager, which stores the query's XASL plan and its
	results, and the Query Evaluator, which evaluates queries and creates a result list file. This module interfaces with the Transaction Manager or
	Recovery Manager to commit or roll back a transaction.
</li>
<li>Cursor Manager<br />A module fetching data from the list file that is created as the retrieval results.
</li>
</ul>
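<p>
	The division of labor between the Query Manager on the server and the Cursor Manager on the client can be pictured with a very small plan made of the three
	parts named above: scan information, a predicate, and a value list. The dictionary-based plan, the execute_plan function, and the Cursor class are hypothetical stand-ins; a real XASL tree is far richer.
</p>

```python
def execute_plan(plan, heaps):
    """Server side: scan a heap, apply the predicate, project the value list."""
    list_file = []
    for row in heaps[plan["heap_scan"]]:
        if plan["predicate"](row):
            list_file.append(tuple(row[col] for col in plan["value_list"]))
    return list_file


class Cursor:
    """Client side: step through the list file produced by the server."""

    def __init__(self, list_file):
        self.list_file = list_file
        self.position = 0

    def fetch(self, count):
        rows = self.list_file[self.position:self.position + count]
        self.position += len(rows)
        return rows


heaps = {"athlete": [{"name": "A", "age": 30}, {"name": "B", "age": 20}, {"name": "C", "age": 40}]}
plan = {
    "heap_scan": "athlete",
    "predicate": lambda row: row["age"] >= 30,
    "value_list": ["name"],
}
cursor = Cursor(execute_plan(plan, heaps))
print(cursor.fetch(1), cursor.fetch(5))
```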
<h3 id="_Toc244403473">3.2 Detailed Description for the Modules</h3>
<h4 id="_Toc244403474">3.2.1 Transaction Management Component</h4>
<h5>
	<b>A. </b>
	<b>Object Locator</b>
</h5>
<p>
	The Object Locator is a module that delivers object data between the workspace on the client and the page buffer pool on the server. It
	provides concurrent access, use, and failure recovery for database objects by using the Transaction Management Component's locking and recovery algorithms.
</p>
<p>
	The Object Locator is divided into the Client Object Locator, the Server Object Locator, and a client/server Object Locator. The Client
	Object Locator carries out its tasks using the Workspace Manager, Representation Manager (Transformation Manager), and Heap File Manager. The Authorization
	Manager, Schema Manager, Object Access Manager, and Query Parser (Scanner/Parser) use the functions of the Client Object Locator. The Server Object Locator
	carries out its tasks using the Object Heap Manager, Representation Manager (Transformation Manager), Lock Manager, Catalog Manager, and B+-tree Manager.
	The Client Object Locator uses the functions of the Server Object Locator for object fetch and flush.
</p>
<p>
	Objects cached in a client's workspace by the Object Locator are kept coherent with the objects on the server by means of a cache
	coherency number. If the cache coherency number of an object cached in the client's workspace differs from the cache coherency
	number of the same object in the page buffer (or on disk) of the server, the cached object becomes invalid (invalidation). The Server Object Locator
	increases the cache coherency number of an object whenever the object is flushed from a client and sent to the server.
</p>
<p>
	A cached object is validated when a transaction first uses it. Because the lock is also cached (set up) when an object is
	cached, the validation remains effective for the duration of that transaction. When a transaction requests an object, the Client Object
	Locator checks whether the object and its lock are cached. If both are cached, the transaction uses the cached object in
	workspace memory, which is much faster. Otherwise, it sends a request to the Server Object Locator. The Server Object Locator sets up
	the requested lock on the object by using the Lock Manager. When the lock is acquired, the cache coherency number of the object in the workspace is compared with the
	cache coherency number of the object in the server's database (page buffer or disk). If the two values differ,
	the new object data is sent from the server to the client and replaces the old cached object.
</p>
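<p>
	The validation flow above can be illustrated with a minimal Python sketch. It is not CUBRID code: the class names <code>Server</code>,
	<code>Workspace</code>, <code>StoredObject</code>, and <code>CachedObject</code> are hypothetical, locking is reduced to a boolean flag, and the
	initial cache coherency number of 0 is an assumption made for illustration.
</p>
<pre>
```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    """An object as held by the server (page buffer or disk)."""
    chn: int      # cache coherency number
    data: dict


@dataclass
class CachedObject:
    """A copy of an object held in the client workspace."""
    chn: int
    data: dict
    lock_cached: bool = False


class Server:
    """Hypothetical stand-in for the Server Object Locator."""

    def __init__(self):
        self.objects = {}    # oid -> StoredObject

    def fetch(self, oid, client_chn):
        # A real server would first acquire the lock via the Lock Manager.
        stored = self.objects[oid]
        if client_chn == stored.chn:
            return None      # the client copy is still valid
        return StoredObject(stored.chn, dict(stored.data))

    def flush(self, oid, data):
        # Every flush received from a client increases the CHN.
        stored = self.objects[oid]
        stored.data = dict(data)
        stored.chn += 1


class Workspace:
    """Hypothetical stand-in for the Client Object Locator."""

    def __init__(self, server):
        self.server = server
        self.cache = {}      # oid -> CachedObject

    def get(self, oid):
        cached = self.cache.get(oid)
        if cached is not None and cached.lock_cached:
            return cached.data    # fast path: object and lock are cached
        client_chn = cached.chn if cached is not None else -1
        fresh = self.server.fetch(oid, client_chn)
        if fresh is not None:
            # CHNs differ (or nothing was cached): replace the cached copy.
            cached = CachedObject(fresh.chn, fresh.data)
            self.cache[oid] = cached
        cached.lock_cached = True
        return cached.data

    def end_transaction(self):
        # Cached locks are released when the transaction ends.
        for cached in self.cache.values():
            cached.lock_cached = False
```
</pre>
<p>
	In this sketch, a second <code>get</code> inside the same transaction never contacts the server, while the first <code>get</code> of the next
	transaction triggers the CHN comparison and refreshes the copy only if another client has flushed a newer version.
</p>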
<p>
	When a transaction is terminated, the cached objects are flushed to the server. When a transaction is rolled back, the objects are all de-cached. In
	addition, when a class object is invalidated (e.g., the schema is changed by a transaction of another client), all the instance objects of the class
	are flushed or de-cached together. All the objects are also flushed to the server together with a query execution request, because queries are
	executed on the server.
</p>
<p>
	To reduce the amount of communication between a client and the server, the Object Locator sends flush data together with the object fetch request
	packet and pre-fetches related class objects or other neighboring objects when caching objects.
</p>
<p>
	Upon request from the Client Object Locator, the Server Object Locator fetches an object from the database and writes updates back to the database
	by using the Heap File Manager. In addition, it manages lock setting by using the Lock Manager.
</p>
<h5>
	<b>B. </b>
	<b>Transaction Manager</b>
</h5>
<p>
	The Transaction Manager is a module that handles transaction start, commit, rollback, and so on. The Transaction Manager calls the Object Locator to
	flush the objects used by a transaction, the Lock Manager to release cached locks, and the Log Manager (Recovery Manager) to commit or roll back a
	transaction.
</p>
<p>
	The Transaction Manager is divided into a client part and a server part. When an application requests transaction termination (commit or rollback),
	the Client Transaction Manager flushes the objects in the workspace that were changed during the transaction to the page buffer of the server. (For a
	rollback request, the changed objects are not flushed to the server. Instead, they are immediately removed from the workspace.) Next, the Client
	Transaction Manager sends the commit or rollback request to the Server Transaction Manager. In case of commit, the Server Transaction Manager calls
	the Log Manager (Recovery Manager), which executes postpone actions on the database in the server and loose_end postpone actions on the client. After
	that, it releases all the acquired locks and closes all the open cursors. In case of rollback, the Log Manager (Recovery Manager) undoes the
	operations performed by the transaction by using the UNDO log and releases all the acquired locks. When a transaction is committed or rolled back,
	the locks that are cached by the Client Transaction Manager are all released.
</p>
<p>
	The Transaction Manager supports the 2PC (two-phase commit) protocol for global transactions.
</p>
<h5>
	<b>C. </b>
	<b>Lock Manager</b>
</h5>
<p>
	The Lock Manager is a module that manages locks according to the 2PL (two-phase locking) protocol and the granularity locking protocol. The Lock
	Manager looks up a transaction identifier, calls the Log Manager (Recovery Manager) to get the lock wait time of a transaction, and calls the Server
	Transaction Manager to roll back a transaction when handling a deadlock. The Server Object Locator uses the Lock Manager to acquire and release the
	lock for an object, and the Log Manager uses the Lock Manager to release all the locks of a transaction at once.
</p>
<p>
	When an instance object is accessed, a lock must be set on the class object that defines all the attributes of the instance and also on the
	superclass objects from which it inherits. When the schema of a class object is changed, an exclusive (X) lock must be set on the class and its
	subclasses.
</p>
<p>
	During query execution, the instances of a class and the instances of its subclasses are all searched. In addition, because a class object is a
	domain that defines the corresponding instance, the domain class and its subclasses are all accessed. Therefore, during query execution a shared lock
	is set on the class to be searched and its subclasses, and also on the domain class that defines an instance and its subclasses.
</p>
<p>
	To detect a deadlock, the WFG (Wait-For Graph) method is used. If a deadlock is detected in the WFG, one of the involved transactions is forcibly
	terminated by the system.
</p>
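<p>
	As an illustration of the wait-for graph method, the following Python sketch searches a graph for a cycle with a depth-first traversal. It is a
	generic textbook approach, not the CUBRID implementation. The function names <code>find_deadlock</code> and <code>choose_victim</code> are
	hypothetical, and the victim policy (terminate the transaction with the largest identifier) is an assumption chosen only to keep the example short.
</p>
<pre>
```python
def find_deadlock(waits_for):
    """Return a list of transactions that form a cycle, or None.

    waits_for maps a transaction id to the set of transaction ids
    whose locks it is waiting for.
    """
    WHITE, GREY, BLACK = 0, 1, 2    # unvisited, on current path, finished
    color = {}
    path = []

    def visit(tx):
        color[tx] = GREY
        path.append(tx)
        for other in sorted(waits_for.get(tx, ())):
            state = color.get(other, WHITE)
            if state == GREY:
                # Reached a transaction already on the path: a cycle.
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle is not None:
                    return cycle
        path.pop()
        color[tx] = BLACK
        return None

    for tx in sorted(waits_for):
        if color.get(tx, WHITE) == WHITE:
            cycle = visit(tx)
            if cycle is not None:
                return cycle
    return None


def choose_victim(cycle):
    # Assumption: the largest id stands for the youngest transaction.
    return max(cycle)
```
</pre>
<p>
	For example, with <code>{1: {2}, 2: {3}, 3: {1}}</code> the three transactions wait on each other in a ring, so the function reports all three as
	deadlocked and one of them can be rolled back to break the cycle.
</p>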
<p>
	The Lock Manager manages the Lock Table. The Lock Table is implemented as a hash table keyed by OID, and access to the table is protected by a
	critical section to maintain consistency.
</p>
<h5>
	<b>D. </b>
	<b>Recovery Manager</b>
</h5>
<p>
	When a transaction, system, or media fault occurs, the Recovery Manager ensures that the effects of all committed transactions are reflected in the
	database and that the effects of uncommitted transactions are not. To do this, the Recovery Manager records a log and recovers the database from
	the various kinds of faults based on the log. The CUBRID Recovery Manager uses an UNDO/REDO recovery protocol, which is based on the following rules:
</p>
<ul>
<li>UNDO Rule<br />The value of a data item is recorded before it is changed. This ensures that the last committed value is recorded in the log before it is
	overwritten by a value that is not yet committed.
</li>
<li>REDO Rule<br />The values updated by a transaction are always recorded in the log before the transaction is committed. That is, the new data values
	reach the log before the commit completes.
</li>
</ul>
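<p>
	The two rules can be illustrated with a toy Python model in which each log record holds both the before image (for UNDO) and the after image (for
	REDO). This is a simplified sketch, not the CUBRID log format: the <code>Database</code> class and its method names are hypothetical, the log is
	assumed to survive a crash, <code>None</code> is assumed to mean "no value", and transactions are assumed not to update the same key concurrently.
</p>
<pre>
```python
class Database:
    """Toy store with a page buffer, a disk image, and an append-only log."""

    def __init__(self):
        self.disk = {}      # values that have reached the data pages on disk
        self.buffer = {}    # page buffer contents, lost on a crash
        self.log = []       # log records, assumed to be on stable storage

    def write(self, tx, key, new_value):
        # Log the before image (UNDO) and the after image (REDO) first.
        before = self.buffer.get(key, self.disk.get(key))
        self.log.append(("update", tx, key, before, new_value))
        self.buffer[key] = new_value

    def commit(self, tx):
        # Only a log record is written; data pages need not be flushed.
        self.log.append(("commit", tx))

    def flush(self, key):
        # A page may reach the disk before its transaction commits.
        self.disk[key] = self.buffer[key]

    def crash(self):
        self.buffer = {}

    def recover(self):
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        # REDO pass: reapply committed updates in log order.
        for rec in self.log:
            if rec[0] == "update" and rec[1] in committed:
                self.disk[rec[2]] = rec[4]
        # UNDO pass: restore before images of uncommitted updates,
        # newest record first.
        for rec in reversed(self.log):
            if rec[0] == "update" and rec[1] not in committed:
                if rec[3] is None:
                    self.disk.pop(rec[2], None)
                else:
                    self.disk[rec[2]] = rec[3]
        self.buffer = dict(self.disk)
```
</pre>
<p>
	In this model a committed update whose page was never flushed is restored by the REDO pass, and an uncommitted update whose page did reach the disk
	is removed by the UNDO pass.
</p>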
<p>
	A log is a file to which records of arbitrary length are appended. To implement a log file of effectively unlimited length, recent log data is
	recorded in an active log and older log data is moved to an archive log.
</p>
<p>
	UNDO/REDO logging is designed to maximize efficiency during normal operation rather than to minimize recovery time after a database system fault.
	Thanks to the logging protocol, flushing data pages during commit or rollback can be avoided as much as possible. A data page is written to disk
	only when it is replaced by another page.
</p>
<h4 id="_Toc244403475">3.2.2 Object Management Component</h4>
<p>
	The Object Management Component defines tables, creates or modifies objects, and formats objects on disk or in memory.
</p>
<h5>
	<b>A. </b>
	<b>Representation Manager</b>
</h5>
<p>
	The Representation Manager is a module that converts an object between its disk expression format and its memory expression format. On disk, object
	data is laid out in a form suited to query execution. In memory, it has a structure that makes it easy for an application to access. The
	Representation Manager converts between these two expression formats.
</p>
<p class="center">
	<img width="437" height="424" src="http://www.cubrid.org/files/attach/images/49/516/003/disk-expression-format-of-an-object.png" alt="Figure 12: Disk Expression Format of an Object" editor_component="image_link"/>
</p>

<p>
	The disk expression format of an object is shown in <b>Figure </b>12. The class OID and Representation ID of an object come first, and these are used
	to determine which format the object has. The CHN (Cache Coherency Number) that follows is used to judge the validity of a cached object. In the disk
	expression format, the columns (attributes) are divided into fixed length columns, where all values have the same length (such as an integer), and
	variable length columns, where values can have different lengths (such as a string). The fixed length columns are saved at pre-defined locations,
	and the location of each column is obtained from the information managed by the Catalog Manager. The location of a variable length column is
	obtained from the variable length column offset table, which holds the location of each variable length column. The last entry of the offset table
	indicates the end of the object. The offset table is not saved for objects of a table that has no variable length column.
</p>
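<p>
	The following Python sketch shows how a record with this general shape could be encoded and decoded. It is only an illustration of the idea of a
	header, fixed length columns, and an offset table: the field widths (8-byte class OID, 4-byte Representation ID, 4-byte CHN, 4-byte integers and
	offsets), the byte order, the exact ordering of the sections, and the function names <code>encode</code> and <code>decode</code> are all
	assumptions, not the actual CUBRID on-disk layout. The column counts passed to <code>decode</code> stand in for the information that would come
	from the Catalog Manager.
</p>
<pre>
```python
import struct

# Class OID, Representation ID, CHN. The sizes are assumed for illustration.
HEADER = struct.Struct("!QII")


def encode(class_oid, repr_id, chn, fixed_ints, var_strings):
    """Build a record: header, fixed columns, offset table, variable data."""
    fixed = struct.pack("!%di" % len(fixed_ints), *fixed_ints)
    blobs = [s.encode("utf-8") for s in var_strings]
    # No offset table is stored when there are no variable length columns.
    table_size = 4 * (len(blobs) + 1) if blobs else 0
    offset = HEADER.size + len(fixed) + table_size
    offsets = []
    for blob in blobs:
        offsets.append(offset)
        offset += len(blob)
    if blobs:
        offsets.append(offset)    # last entry marks the end of the object
    table = struct.pack("!%dI" % len(offsets), *offsets)
    return HEADER.pack(class_oid, repr_id, chn) + fixed + table + b"".join(blobs)


def decode(record, n_fixed, n_var):
    """Read a record back, given the column counts from the catalog."""
    class_oid, repr_id, chn = HEADER.unpack_from(record, 0)
    fixed = list(struct.unpack_from("!%di" % n_fixed, record, HEADER.size))
    strings = []
    if n_var:
        table_at = HEADER.size + 4 * n_fixed
        offsets = struct.unpack_from("!%dI" % (n_var + 1), record, table_at)
        # Each value spans from its own offset to the next entry.
        for start, end in zip(offsets, offsets[1:]):
            strings.append(record[start:end].decode("utf-8"))
    return class_oid, repr_id, chn, fixed, strings
```
</pre>
<p>
	Because every variable length value ends where the next offset begins, the extra final entry is what allows the last value to be read without
	storing any explicit lengths.
</p>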
<p>
	When an object is cached in memory, the MOP points to a memory block that holds the columns of the object. The fixed length column values are
	saved contiguously in the object block, and the values of variable length columns are saved in separately allocated memory blocks. The CHN
	is also included in the memory expression format. The Object Locator compares this CHN value with the CHN value stored on disk to judge the
	validity of an object. If the two CHN values are different, the object cached in memory is not valid. In that case, the Object Locator
	de-caches the object and caches the new content of the object.
</p>
<p class="center">
	<img width="705" height="313" src="http://www.cubrid.org/files/attach/images/49/516/003/memory-expression-format-of-an-object.PNG" alt="Figure 13: Memory Expression Format of an Object" editor_component="image_link"/>
</p>

<p>
	The Representation Manager uses the Workspace Manager to obtain storage space for the memory expression of an object and uses the Schema Manager to
	determine the size and structure of an object.
</p>
<p>
	When CUBRID changes a schema, it does not change the expression format of the records that already exist under that schema. Therefore, if an object
	saved in an old expression format is found during conversion between the two expression formats, it is converted to the most recent expression
	format, using the schema information for both the recent and the old expression formats. The conversion also compensates for differences in
	hardware architecture between the client machine and the server machine, e.g., differences in byte ordering.
</p>
<h4 id="_Toc244403476">3.2.3 Query Processing</h4>
<p class="center">
	<img width="350" height="392" src="http://www.cubrid.org/files/attach/images/49/516/003/the-procedures-of-query-compile-in-a-client.PNG" alt="Figure 14: The Procedures of Query Compile in a Client" editor_component="image_link"/>
</p>

<h5>
	<b>A. </b>
	<b>Scanner/Parser</b>
</h5>
<p>
	The parser keeps the data structure used to create a parse tree during parsing, the data structure that maintains the created parse tree, the data
	structure that manages multiple SQL statements, and information about the lexer.
</p>
<h5>
	<b>B. </b>
	<b>Semantic Checker</b>
</h5>
<p>
	If a parse tree is built without an error, the input query statement is syntactically correct. Semantic checking then verifies whether the
	semantics of the input statement are valid. It performs the following tasks:
</p>
<p>
	1. Name resolution and parse tree node type checking
</p>
<p>
	Checks whether the tables and columns that are used actually exist, and infers the type of each column.
</p>
<p>
	2. Semantic checking
</p>
<p>
	Checks whether an operation is used between types that do not support it.
</p>
<p>
	3. View translation
</p>
<p>
	Translates a query on a view into a query on the underlying tables by using the definition statement of the view.
</p>
<h5>
	<b>C. </b>
	<b>XASL Generator/Optimizer</b>
</h5>
<p>
	The query statement input by a user goes through parsing and semantic checking, and is then converted into an augmented parse tree annotated with
	catalog information. Query optimization based on this augmented parse tree produces the XASL tree, i.e., the action plan. The XASL tree specifies
	the optimal access sequence and access method for the tables accessed during query execution. It represents the action plan that has the lowest
	access path cost among the possible plans. From a parse tree and the catalog statistics information, one XASL tree is created as follows:
</p>
<p>
	1. Classifying terms to configure search conditions in table units
</p>
<p>
	A term is a search condition on one or more tables. When the term applies to one table, it is a scan term (sarg). When it applies to two tables, it
	is a join term (edge). When it applies to three or more tables, it is an other term. The terms specified in the where clause of a parse tree are
	divided into join terms and scan terms, and the scan terms are classified according to the table to which each term applies.
</p>
<p>
	2. Determining the most optimized access method to each table
</p>
<p>
	For the scan terms that apply to a given table, the selectivity of each scan term is calculated, and the search method of the term whose
	selectivity is lowest is selected as the table search method. That is, the optimizer determines whether to use a sequential scan or an index scan
	for the table and, if an index scan is used, which index to use.
</p>
<p>
	3. Calculating selectivity for each table
</p>
<p>
	The selectivity of each table is calculated from the selectivity of each scan term calculated in step 2.
</p>
<p>
	4. Determining access sequence among tables
</p>
<p>
	To determine the access sequence among tables, the possible access sequences are enumerated and the access path cost of each is calculated. The
	sequence whose access cost is lowest is selected as the final execution plan.
</p>
<p>
	5. Creating XASL tree for the final execution plan
</p>
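<p>
	Steps 1 through 4 can be sketched in Python as follows. This is a toy model, not the CUBRID optimizer: the term representation, the function names
	<code>classify</code>, <code>choose_access</code>, and <code>choose_join_order</code>, the use of a product of term selectivities as the table
	selectivity, and the cost formula (sum of intermediate result sizes, ignoring join terms) are all assumptions made to keep the example small.
</p>
<pre>
```python
from itertools import permutations


def classify(terms):
    """Step 1: split terms into scan terms per table, join terms, other terms.

    Each term is (tables, selectivity, access), where tables is a tuple of
    table names and access is a label such as "index:NAME" or "seq".
    """
    scan, join, other = {}, [], []
    for term in terms:
        tables = term[0]
        if len(tables) == 1:
            scan.setdefault(tables[0], []).append(term)
        elif len(tables) == 2:
            join.append(term)
        else:
            other.append(term)
    return scan, join, other


def choose_access(scan_terms):
    """Steps 2 and 3: pick the access method and the table selectivity."""
    if not scan_terms:
        return "seq", 1.0
    best = min(scan_terms, key=lambda t: t[1])    # lowest selectivity wins
    table_selectivity = 1.0
    for term in scan_terms:
        # Assumption: the terms filter independently of each other.
        table_selectivity *= term[1]
    return best[2], table_selectivity


def choose_join_order(row_counts, selectivity):
    """Step 4: enumerate every order and keep the cheapest one."""
    best_order, best_cost = None, None
    for order in permutations(sorted(row_counts)):
        rows, cost = 1.0, 0.0
        for table in order:
            rows *= row_counts[table] * selectivity.get(table, 1.0)
            cost += rows    # toy cost: size of each intermediate result
        if best_cost is None or best_cost > cost:
            best_order, best_cost = order, cost
    return best_order, best_cost
```
</pre>
<p>
	Enumerating every permutation is practical only for a handful of tables. The sketch uses it because it mirrors the description above most directly.
</p>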
<h5>
	<b>D. </b>
	<b>Query Manager</b>
</h5>
<p>
	The Query Manager is a server module that executes an XASL tree received from a client. During query processing, a client sends the XASL tree created
	by the XASL Generator/Optimizer module to the server. The query is executed when the server receives and executes this XASL tree. Because it is
	undesirable, in terms of performance, to go through the XASL Generator/Optimizer every time a query of the same pattern arrives, CUBRID saves the
	XASL tree in the Query Plan Cache and reuses it. In addition, when the same query is executed repeatedly, CUBRID saves the query result in the Query
	Cache and returns the result the next time without executing the query.
</p>
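<p>
	The two levels of caching can be pictured with the following Python sketch. It is a simplification under stated assumptions: the
	<code>QueryManager</code> class here is hypothetical, both caches are keyed by the exact query text, compilation and execution are supplied as plain
	functions, and the rule that cached results are dropped when data changes is an assumption, not a description of CUBRID's invalidation policy.
</p>
<pre>
```python
class QueryManager:
    """Toy model of a plan cache plus a result cache (names hypothetical)."""

    def __init__(self, compile_fn, execute_fn):
        self.compile_fn = compile_fn    # parse, check, and optimize a query
        self.execute_fn = execute_fn    # run a plan and return its result
        self.plan_cache = {}            # query text -> plan (an XASL stand-in)
        self.result_cache = {}          # query text -> result

    def run(self, sql):
        if sql in self.result_cache:
            return self.result_cache[sql]    # no execution needed
        plan = self.plan_cache.get(sql)
        if plan is None:
            plan = self.compile_fn(sql)      # only on the first occurrence
            self.plan_cache[sql] = plan
        result = self.execute_fn(plan)
        self.result_cache[sql] = result
        return result

    def invalidate_results(self):
        # Assumption: cached results are dropped when the data changes,
        # while cached plans remain usable.
        self.result_cache.clear()
```
</pre>
<p>
	After <code>invalidate_results</code> is called, the next run of the same query skips compilation but executes the cached plan again.
</p>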
<p class="center">
	<img width="400" height="309" src="http://www.cubrid.org/files/attach/images/49/516/003/query-execution-on-the-server.PNG" alt="Figure 15: Query Execution on the Server" editor_component="image_link"/>
</p>

<p>
	The procedure of query processing through these components is shown in <b>Figure </b>16.
</p>
<p class="center">
	<img width="102" height="588" src="http://www.cubrid.org/files/attach/images/49/516/003/query-execution-steps.PNG" alt="Figure 16: Query Execution Steps" editor_component="image_link"/>
</p>]]></description>
                        <pubDate>Mon, 06 Dec 2010 11:04:09 -0800</pubDate>
                        <category>architecture</category>
                        <category>broker</category>
                        <category>3-tier architecture</category>
                        <category>CAS</category>
                                </item>
            </channel>
</rss>
