Background Image

BLOG

?

Shortcut

PrevPrev Article

NextNext Article

Larger Font Smaller Font Up Down Go comment Print Attachment
Written by Jaehong Kim on 06/16/2017

Previous blog article covered Vert.x, a Java application framework which provides noticeable performance advantage over competing technologies and features multi programming language support. The previous article has explained us about the philosophy of Vert.x, performance comparison with Node.js, internal structure of Vert.x, and many more. Today, I would like to continue this conversation and talk more about Vert.x architecture.

Considerations Used to Develop Vert.x

Polyglot is the feature making Vert.x stand out from other server frameworks. In the past, server frameworks could not support multiple languages. Supporting several languages does more than expand the range of users. More important thing is that services using different languages in a distributed environment can intercommunicate with ease. Of course, supporting a variety of languages is not sufficient for supporting a distributed environment. Essential functions of greater priority for a distributed environment include address system or message bus. Vert.x framework provides these functions. As Vert.x provides these functions as well as Polyglot, the benefits of Vert.x should be considered for a distributed environment.

As Vert.x supports a universal server framework, a variety of workloads should be considered. We should consider unusal cases different from Nginx, which is typically used as a Web server, or Node.js. It is to build a universal server application that processes a variety of protocols except HTTP (i.e., not a Web server which executes simple operations, considering scalability in a 3-tier environment). In order to accomplish this, Vert.x provides an additional thread pool while using the Run Loop method.

We will discuss Vert.x architecture starting from the thread pool and consideration for a distributed environment.

Run Loop and Thread Pool

Vert.x and asynchronous server applications (or frameworks), include Ngin.x and Node.js, use the Run Loop method. Vert.x uses the term 'Event Loop' instead of 'Run Loop'. However, as Run Loop is the more popular term among some developers. I use this term, Run Loop, here. Run Loop, as you will guess from the name, is a method for checking whether there is a new event in the infinite loop, and calling an event handler when an event has been received. As such, the asynchronous server application and the event-based server application are different terms indicating an identical target, similar to ‘enharmonic' for music. To use the Run Loop method, all I/Os should be managed as events.

For example, imagine a general Web server application that creates a query for a database to respond to an HTTP request from a Web browser. The CPU of the Web server is used when one thread analyzes the HTTP request to execute proper business logic, and creates a query statement. However, the CPU is not used while the thread sends the query to the database and waits for a response. However, when the thread to be created equals the number of HTTP requests (Thread per Connection), another thread may be processing a task requiring the web server CPU, while one thread is waiting for response from the database. Finally, the web server CPU is used to process HTTP requests. As you know, the weakness of Thread per Connection is the cost for context switching at the kernel level since many threads must be created. This can be called waste. 

The asynchronous event handling method can overcome this weakness (figuratively speaking, 'asynchronous event handling' is the 'purpose' and ‘Run Loop’ is the 'means'). If ‘HTTP request itself’ and ‘receiving a response from the database’ are created as an event, and the Run Loop calls the corresponding event handler whenever an event is received, the execution performance of the application can be enhanced by avoiding unnecessary context switching. In this fashion, to utilize a CPU efficiently, the number of Run Loops required equals the number of cores (i.e., thread should be created equaling the number of cores and each thread should run the Run Loop).

However, there is another problem creating threads equaling the number of cores, which is preventing as much context switching as possible. If a handler, using server resources, takes a long time to handle an event, other events received while the handler is being executed are not managed in a timely manner. A popular example is file searching on the server disk. In this case, it is better to create a separate thread for searching files. 

Therefore, to build a universal server framework with asynchronous event handling, the framework should have a function for managing a thread pool. This is the aim of Vert.x. Thread pool management is the biggest difference between Vert.x and Node.js, except for polyglot. Vert.x creates Run Loops (Event Loops) equaling the number of cores and provides thread pool-related function to handle tasks using server resources requiring long periods for event handling.

Why is Hazelcast Used?

Vert.x uses Hazelcast, an In-Memory Data Grid (IMDG). Hazelcast API is not directly revealed to users but is used in Vert.x. When Vert.x is started, Hazelcast is started as an embedded element.

Hazelcast is a type of distributed storage. When storage is embedded and used in a server framework, we can obtain expected effects from a distributed environment.

The most popular case is session data processing. Vert.x calls it Shared Data. It allows multiple Vert.x instances to share the same data. Of course, additional RDBMS, instead of Hazelcast, will bring the same effect from the functional side. It is natural that embedded memory storage can consistently provide results faster than remote RDBMS. Therefore, users who need sessions for e-commerce or chatting servers can build a system with a simple configuration by using only Vert.x.

Hazelcast allows a message queue use without additional costs or investments (without server costs or monitoring of message queue instances). As mentioned before, Hazelcast is a distributed storage. It can duplicate a storage for reliability. By using this distributed storage as a queue, the server application implemented by using Vert.x becomes a message processing server application and a distributed queue.

These benefits make Vert.x a strong framework in a distributed environment.

 

Understanding Vert.x Components

 

vertx-architecture-diagram.png

Figure 1: Vert.x Architecture (Component) Diagram.

Figure 1 above shows a diagram of Vert.x components. As shown in the figure, in all Vert.x instances (these can be understood as a JVM), a Hazelcast is embedded and runs. The embedded Hazelcast is connected to Hazelcast in other Vert.x instances. Event Bus uses functions of Hazelcast. Hazelcast itself provides a certain level of reliability (because of WAL records and data duplication). So, events can be forwarded with a certain level of reliability.

HTTP Server and Net Server

HTTP Server and Net Server control network events and event handlers. A Net Server is for events and handlers private protocol, and an HTTP Server allows registering a handler to an HTTP event such as GET or POST. The reason for preparing an HTTP Server is eliminating the need to add event types, as well as the universality of HTTP itself. HTTP Server supports WebSocket as well as HTTP.

vertx-event-and-handler-of-http-server.png

Figure 2: Event and Handler of HTTP Server.

Vert.x Thread Pool

Vert.x has three types of thread pools:

  1. Acceptor: A thread to accept a socket. One thread is created for one port.
  2. Event Loops: (same with Run Loop) equals the number of cores. When an event occurs, it executes a corresponding handler. When execution is performed, it repeats reading another event.
  3. Background: Used when Event Loop executes a handler and an additional thread is required. Users can specify the number of threads in vertx.backgroundPoolSize, an environmental variable. The default is 20. Using too many threads causes an increase in context switching costs, so be cautious.

Event Loops can be described as follows in a detailed way. Event Loops use Netty NioWorkder as it is. All handlers specified by verticles run on Event Loops. Each verticle instance has its specified NioWorker. As such, it is guaranteed that a verticle instance is always executed on an identical thread. Therefore, verticles can be written in a thread-safe manner.

Conclusion

So far, I have briefly described Vert.x architecture. Since Vert.x framework is not widely used, I believe it would be better to detail the concept of designing Vert.x than detail each Vert.x component. Even if you have no interest in network server frameworks, it is helpful to review new products and determine differences between new and existing products. Doing so helps in understanding the evolution and direction of software products that are flooding today's market.


  1. CUBRID's Development Culture: The Development Process and Improvements Behind

    Written by Hyung-Gyu, Ryoo on 07/16/2021   Foreword Hello, I am Hyung-Gyu Ryoo, I am working in CUBRID as a research engineer in the R&D department. In this post, I would like to introduce the development process of the open source project CUBRID and the efforts we have made to improve the process.   It has been almost two and a half years since I joined CUBRID.  During this period, as many great fellow developers have joined the CUBRID community, the R&D department of CUBRID has grown from one small team to three developments teams along with a QA team.   After I participated in the first major version release (CUBRID 11), I was able to look back to the release process and improve the development process with my fellow developers.     The Development Process of the Open Source Data...
    Read More
  2. TDE (Transparent Data Encryption) in CUBRID 11

    Written by Jiwon Kim on 07/07/2021 👍Increase the level of database security by utilizing various security features of CUBRID. CUBRID 11 has enhanced security by providing the Transparent Data Encryption (henceforth, TDE) feature. So, what is TDE? TDE means transparently encrypting data from the user’s point of view. This allows users to encrypt data stored on disk with little to no application change. When a hacker hacks into an organization, the number one thing they want to steal is the important data in the database. Alternatively, there may be a situation where an employee with malicious intention inside the company logs into the database and moves all data to a storage media such as a USB.  The easiest way to protect data in these situations is to encrypt the database. TDE, a technolog...
    Read More
  3. Preventing Sniffing by CUBRID- Packet Encryption

      Written by Youngjin Hwang on 05/11/2021   Nowadays, browsing the internet with PCs or our smartphones has become an essential part of our daily life. As a result, it is possible to peek into the data being transmitted over the Internet with malicious intent. In other words, being able to peek at the data being transmitted by someone is called sniffing.   A classic example of a sniffing attack would be intercepting the account’s id and password and causing physical damage by using the personal information of others.   To protect our database user data, CUBRID 11.0 has enhanced security by providing packet encryption (and TDE (Transparent Data Encryption) based data encryption, but that will be cover in another blog later). When packet encryption is applied, the packet is encrypted and tran...
    Read More
  4. Contributing to Open Source Community/Project

      Written by Charis Chau on 06/23/2020      What is an open source project? To answer this, let us start with a burger! Imagine an open source project is a burger selling in a restaurant. Every day, the chef makes thousands of burgers that have the same quality by following the same recipe from the restaurant. One day, customer A comes to the burger place to try the burger, and he/she loves it! Therefore, customer A decides to ask the chef whether he/she can get the recipe. Here, if the restaurant is open source, they will be happy to share the recipe to customer A, vice versa.     After customer A gets the receipt, he/she decide to make the burger at home by him/herself! However, customer A is a meat lover and does not like onion that much, so he/she decide to change the recipe by taking o...
    Read More
  5. CUBRID License Model

    Written by Charis Chau on 06/08/2020   Why Licenses Matter?   Open source licenses allow software to be freely used, modified, and shared. Choosing a DBMS with suitable licenses could save the development cost of your application or the Total Cost of Ownership (TCO) for your company. Choosing a DBMS without a proper license, you might find yourself situate in a legal grey area!     CUBRID Licenses   Unlike other open source DBMS vendors, CUBRID is solely under open source license instead of having a dual license in both commercial license and open source license. Which means that for you, it is not mandatory to purchase a license or annual subscription; company/organizational users can achieve the saving from Total Cost of Ownership (TCO).   Since CUBRID has been open source DBMS from 2008,...
    Read More
Board Pagination Prev 1 2 3 4 5 6 Next
/ 6

Join the CUBRID Project on