Background Image


Written by Dongsoon Choi on 06/23/2017

This is the fourth article in the series of "Become a Java GC Expert". In the first issue Understanding Java Garbage Collection we have learned about the processes for different GC algorithms, about how GC works, what Young and Old Generation is, what you should know about the 5 types of GC in the new JDK 7, and what the performance implications are for each of these GC types.

In the second article How to Monitor Java Garbage Collection we have explained how JVM actually runs the Garbage Collection in the real time, how we can monitor GC, and which tools we can use to make this process faster and more effective.

In the third article How to Tune Java Garbage Collection we have shown some of the best options based on real cases as our examples that you can use for GC tuning. Also we have explained how to minimize the number of objects passed to Old Area, decreasing Full GC time, as well as how to set GC type and the memory size.

In this fourth article, I will explain the importance of MaxClients parameter in Apache that significantly affects the overall system performance when GC occurs. I will provide several examples through which you will understand the problem MaxClients value causes. I will also explain how to reliably set the proper value for MaxClients depending on the available system memory.


The effect of MaxClients on the system

The operation environment of NHN services has a variety of Throttle valve-type options. These options are important for reliable service operations. Let's see how the MaxClients option in Apache affects the system when Full GC has occurred in Tomcat.

Most developers know that "stop the world (STW) phenomenon" occurs when GC has occurred in Java (for more refer to Understanding Java Garbage Collection). In particular, Java developers at NHN may have experienced faults caused by GC-related issues in Tomcat. Because Java Virtual Machine (JVM) manages the memory, Java-based systems cannot be free of the STW phenomenon caused by GC.

Several times a day, GC occurs in services you have developed and currently operate. In this situation, even if TTS caused by faults does not occur, services may return unexpected 503 errors to users.

Service Operation Environment

For their structural characteristics, Web services are suitable for scale-out rather than scale-up. So, generally, physical equipment is configured with Apache  * 1 + Tomcat * n according to equipment performance. However, this article assumes an environment where Apache * 1 + Tomcat * 1 are installed on one host as shown in Figure 1 below for convenient description.


Figure 1: Service Operation Environment Assumed for the Article.

For reference, this article describes options in Apache 2.2.21 (prefork MPM), Tomcat 6.0.35, jdk 1.6.0_24 on CentOS 4.72 (32-bit) environment.

The total system memory is 2 GB and the Garbage Collector uses ParallelOldGC. The AdaptiveSizePolicy option is set to true by default and the heap size is set to 600m.

STW and HTTP 503

Let's assume that requests are flowing into Apache at 200 req/s and more than 10 httpd processes are running for service, even though this situation may depend on response time for requests. In this situation, assuming that the pause time at full GC is 1 second, what will happen if full GC occurs in Tomcat?

The first thing that hits your mind is that Tomcat will be paused by full GC without responding to all requests being processed. In this case, what will happen to Apache while Tomcat is paused and requests are not processed?

While Tomcat is paused, requests will continuously flow into Apache at 200 req/s. In general, before full GC occurs, responses to requests can be sent quickly by the service with only 10 or more httpd processes. However, because Tomcat is paused now, new httpd processes will continuously be created for new inflowing requests within the range allowed by the MaxClients parameter value of the httpd.conf file. As the default value is 256, it will not care that the requests are inflowing at 200 req/s.

At this time, how about the newly created httpd processes?

Httpd processes forwards requests to Tomcat by using the idle connections in the AJP connection pool managed by the mod_jk module. If there is no idle connection, it will request to create new connections. However, because Tomcat is paused, the request to create new connections will be rejected. Therefore, these requests will be queued in the backlog queue as many as the size of the backlog queue, set in the AJP Connector of the server.xml file, allows.

If the number of queued requests exceed the size of the backlog queue, a Connection is Refused error will be returned to Apache and Apache will return the HTTP 503 error to users.

In the assumed situation, the default size of backlogs is 100 and the requests are flowing in at 200 req/s. Therefore, more than 100 requests will receive the 503 error for 1 second of the pause time caused by full GC.

In this situation, if full GC is over, sockets in the backlog queue are retrieved by Tomcat's acceptance and assigned to worker threads within the range allowed by MaxThreads (defaults to 200) in order to process requests.

MaxClients and backlog

In this situation, which option should be set in order to prevent the 503 error to users? 

First, we need to understand that the backlog value should be enough to accept requests flowing to Tomcat during the pause time in full GC. In other words, it should be set to at least 200 or greater. 

Now, is there a problem in such configuration?

Let's repeat the above situation under the assumption that the backlog setting value has been increased to 200. This result is more serious as shown below.

The system memory usage is typically 50%. However, it rapidly increases to almost 100% when full GC occurs, causing a rapid increase of swap memory usage. Moreover, because the pause time of full GC increases from 1 second to 4 or more seconds, the system is down for that time and cannot respond to any requests.

In the first situation, only 100 or more requests received the 503 error. However, after increasing the backlog size to 200, more than 500 requests will be hung for 3 or more seconds and cannot receive responses.

This situation is a good example that shows more serious situations which may occur when you do not precisely understand the organic relations between settings, i.e., their impact on the system.

Then, why does this phenomenon occur?

The reason is the characteristics of the MaxClients option.

Setting the MaxClients value to a generous value does not matter. The most important thing in setting the MaxClients option is that the total memory usage should be calculated not to exceed 80% even though httpd processes are created as much as the maximum MaxClients value.

The swappiness value of the system is set to 60 (default). As such, when memory usage exceeds 80%, swap will actively occur.

Let's see why this characteristic causes the more serious situation described above.

When requests are flowing in at 200 req/s and Tomcat is paused by full GC, the backlog setting value is 200. Approximately, an additional 100 httpd processes can be created in Apache above the first case. In this situation, when the total memory usage exceeds 80%, the OS will actively use the swap memory area, and objects for GC will be moved from the old area of JVM to the swap area since the OS considers them unused for a long period.

Finally, when the swap area is used in GC, the pause time will rapidly increase. So, the number of httpd processes will increase, causing 100% of memory usage and the situation previously described will occur.

The difference between the two cases is only the backlog setting values: 100 vs. 200. Why did this situation occur only for 200?

The reason for the difference is the number of httpd processes created in these configurations. When the setting value is set to 100 and, full GC occurs, 100 requests for creating new connections are created and then are queued in the backlog queue. The other requests receive the connection refused error message and return the 503 error. Therefore, the number of total httpd processes will be slightly more than 100.

When the value is set to 200, then 200 requests for creating new connections can be accepted. Therefore, the number of total httpd processes will be more than 200 and the value exceeds the threshold that determines the occurrence of memory swap.

Then, by setting the MaxClients option without considering the memory usage, the number of httpd processes rapidly increases with full GC, causing swap and degradation of the system performance.

If so, how can we determine the MaxClients value, what is the threshold value for the current system situation?

Calculation Method of MaxClients Setting

As the total memory of the system is 2 GB, the MaxClients value should be set to use no more than 80% of the memory (1.6 GB) in any situation in order to prevent performance degradation caused by the memory swap. In other words, the 1.6 GB memory should be shared and allocated to Apache, Tomcat, and agent-type programs, which are installed by default.

Let's assume that the agent-type programs, which are installed in the system by default, occupy the memory at about 200 m. For Tomcat, the heap size set to -Xmx is 600m. Therefore, Tomcat will always occupy 725m (Perm Gen + Native Heap Area) based on the top RES (see the figure below). Finally, Apache can use 700m of the memory.


Figure 2: Top Screen of Test System.

If so, what should the value of MaxClients be with a memory of 700m?

It will be different according to the type and the number of loaded modules. However, for NHN Web services, which use Apache as a simple proxy, 4m (based on top RES) will be enough for one httpd process (see Figure 2). Therefore, the maximum MaxClients value for 700m should be 175. 


Reliable service configuration should decrease the system downtime under overload and send successful responses to requests within the allowable range. For Java-based Web services, you must check whether the service has been configured to reliably respond to the STW under full GC.

If the MaxClients option is set to a large value, to respond to simple increase of user requests and against DDoS attacks, without considering the system memory usage, it loses its functionality as a throttle valve, causing bigger faults.

In this example, the best way to solve the problem is to expand the memory or the server, or set the MaxClients option to 175 (in the above case) so that Apache returns the 503 error to requests that only exceed 175.

The situation in this article occurs within 3 to 5 seconds, so it cannot be checked by most monitoring tools which run at regular sampling intervals.


  1. CUBRID License Model

    Written by Charis Chau on 06/08/2020   Why Licenses Matter?   Open source licenses allow software to be freely used, modified, and shared. Choosing a DBMS with suitable licenses could save the development cost of your application or the Total Cost of Ownership (TCO) for your company. Choosing a DBMS without a proper license, you might find yourself situate in a legal grey area!     CUBRID Licenses   Unlike other open source DBMS vendors, CUBRID is solely under open source license instead of having a dual license in both commercial license and open source license. Which means that for you, it is not mandatory to purchase a license or annual subscription; company/organizational users can achieve the saving from Total Cost of Ownership (TCO).   Since CUBRID has been open source DBMS from 2008,...
    Read More
  2. CUBRID Internal: Storage Management (Disk Manager, File Manager)

    Written by Jaeeun, Kim on 08/11/2021 Introduction Database, just as its name implies, it needs spaces to store data. CUBRID, the open source DBMS that operates for the operating system allocates as much space as needed from the operating system and uses it efficiently as needed. In this article, we will talk about how CUBRID internally manages the storage to store data in the persistent storage device. Through this article, we hope developers can access the open source database CUBRID more easily. - The content of this article is based on version 10.2.0-7094ba. (However, it seems to be no difference in the latest develop branch, 11.0.0-c83e33. ) CUBRID Storage Management The CUBRID server has multiple modules that operate and manage data complexly and sophisticatedly. Among them, there are ...
    Read More
  3. CUBRID INTERNAL: CUBRID Double Write Buffer

    Written by MyungGyu Kim on 03/08/2022 INTRODUCTION Data in the database is allocated from disk to memory, some data is read and then modified, and some data is newly created and allocated to memory. Such data should eventually be stored on disk to ensure that it is permanently stored. In this article, we will introduce one of the methods of storing data on disk in CUBRID to help you understand the CUBRID database. The current version at the time of writing is CUBRID 11.2. DOUBLE WRITE BUFFER First of all, I would like to give a general description of the definition, purpose, and mechanism of Double Write Buffer. What is Double Write Buffer? By default, CUBRID stores data on disk through Double Write Buffer. Double Write Buffer is a buffer area composed of both memory and disk. By default, t...
    Read More
  4. CUBRID INSIDE: Subquery and Query Rewriter (View Merging, Subquery Unnest)

    Written by SeHun Park on 08/07/2021   What is Subquery  A subquery is a query that appears inside another query statement. Subquery enables us to extract the desired data with a single query. For example, if you need to extract information about employees who have salary that is higher than last year’s average salary, you can use the following subquery:    It is possible to write a single query as above without writing another query statement to find out the average salary. Subquery like this has various special properties, and their properties vary depending on where they are written. scalar subquery: A subquery in a SELECT clause. Only one piece of data can be viewed. inline view: A subquery in the FROM clause. Multiple data inquiry is possible. subquery: A subquery in the WHERE clause. I...
    Read More

    Written by SeHun Park on 11/09/2021 - HASH SCAN Hash Scan is a scan method for hash join. Hash Scan is applied in view or hierarchical query. When a subquery such as view is joined as inner, index scan cannot be used. In this case, performance degradation occurs due to repeated inquiry of a lot of data. In this situation, Hash Scan is used. The picture above shows the difference between Nested Loop join and Hash Scan in the absence of an index. In the case of NL join, the entire data of INNER is scanned as many as the number of rows of OUTER. In contrast, Hash Scan scans INNER data once when building a hash data structure and scans OUTER once when searching. Therefore, you can search for the desired data relatively very quickly. Here, the internal structure of Hash Scan is written as the fl...
    Read More
Board Pagination Prev 1 2 3 4 5 6 Next
/ 6

Join the CUBRID Project on