Apache Performance Tuning

Tuning Apache has a large effect on web server performance. Compared with Apache 1.3, Apache 2.x brings many additional optimizations, and most of them are enabled in the default configuration. Because web server performance depends heavily on RAM, some settings still need to be adjusted to match the memory available. The more RAM your system has, the more processes and threads Apache can allocate and use.

The other important factors in web server performance are disk I/O, CPU clock speed, and the network link. These depend on hardware selection.

Selecting a suitable MPM has a large effect on Apache performance. Apache 2.0 and later are modular, so the MPM you choose changes how the web server handles requests.

MPMs must be chosen during configuration, and compiled into the server. Compilers are capable of optimizing a lot of functions if threads are used, but only if they know that threads are being used.

Once the server has been compiled, it is possible to determine which MPM was chosen by using the following command.

httpd -V | grep "Server MPM:"

The following table lists the default MPMs for various operating systems. This will be the MPM selected if you do not make another choice at compile-time.

Operating system    Default MPM
BeOS                beos
NetWare             mpm_netware
OS/2                mpmt_os2
Unix                prefork
Windows             mpm_winnt

The rest of this article focuses on Unix.

Two main MPMs are available on Unix: prefork and worker. Choose the one that fits the job.

Prefork

Prefork is the default MPM on Unix in Apache 2.0 and 2.2. This Multi-Processing Module (MPM) implements a non-threaded, pre-forking web server that handles requests in a manner similar to Apache 1.3. A single control process launches child processes which listen for connections and serve them when they arrive. Each child process handles one connection at a time. It is also the best MPM for isolating each request, so that a problem with a single request will not affect any other.

This MPM is very self-regulating, so it is rarely necessary to adjust its configuration directives. Most important is that MaxClients be big enough to handle as many simultaneous requests as you expect to receive, but small enough to assure that there is enough physical RAM for all processes.

Worker

This Multi-Processing Module (MPM) implements a hybrid multi-process multi-threaded server. By using threads to serve requests, it is able to serve a large number of requests with fewer system resources than a process-based server. However, it retains much of the stability of a process-based server by keeping multiple processes available, each with many threads.

A single control process (the parent) is responsible for launching child processes. Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a server thread for processing when they arrive. Memory and CPU usage is about 5% lower with worker than with prefork.

When running PHP via mod_php, choose prefork. When serving only static files (HTML, JPG, etc.), choose worker. When passing requests on to a back-end application server such as Mongrel (for Ruby on Rails), worker also works fine.

Using PHP with a threaded Apache is dangerous and highly discouraged. Although the PHP core supports multi-threaded operation, some PHP modules are not thread-safe. Prefork is therefore recommended for Apache servers running PHP.

The default configurations for the prefork and worker MPMs are as follows.

<IfModule prefork.c>

ServerLimit 256

StartServers 8

MinSpareServers 5

MaxSpareServers 20

MaxClients 150

MaxRequestsPerChild 1000

</IfModule>

<IfModule worker.c>

ServerLimit 256

StartServers 2

MaxClients 150

MinSpareThreads 25

MaxSpareThreads 75

ThreadsPerChild 25

MaxRequestsPerChild 0

</IfModule>

Changing the directives in the default configuration above can increase the performance of Apache.

MaxClients

The single biggest hardware issue affecting web server performance is RAM. A web server should never ever have to swap, as swapping increases the latency of each request beyond a point that users consider “fast enough”. This causes users to hit stop and reload, further increasing the load. You can, and should, control the MaxClients setting so that your server does not spawn so many children that it starts swapping. The procedure for doing this is simple: determine the size of your average Apache process by looking at your process list with a tool such as top, and divide this into your total available memory, leaving some room for other processes.

First, find out how much memory, on average, each httpd process is using. To get an accurate reading, let your VPS and site run for a few hours to allow the memory usage to reflect normal conditions. The memory usage that shows just after restarting httpd is often lower than what would usually be the case.

The formula for setting MaxClients is as follows.

MaxClients ≈ (RAM - size_all_other_processes) / size_apache_process

Check your total RAM using the command

free -m

While Apache is running, the following one-liner reports the total memory used by all Apache processes and the average memory used per process. On Debian-based systems the process is named apache2 rather than httpd.

ps -ylC httpd | awk '{x += $8; y += 1} END {print "Apache Memory Usage (MB): " x/1024; print "Average Process Size (MB): " x/((y-1)*1024)}'

Start by noting the memory needed for one Apache process. Then check the memory needed for MySQL in the same way. The server process is normally named mysqld, so that is the name to pass to ps.

ps -ylC mysqld | awk '{x += $8; y += 1} END {print "MySQL Memory Usage (MB): " x/1024; print "Average Process Size (MB): " x/((y-1)*1024)}'

Then stop the Apache and MySQL services and check how much memory all the other processes need in total. With these numbers you can calculate MaxClients.
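As a rough sketch of the MaxClients calculation, the shell function below applies the formula to three measurements in MB. The function name max_clients and the sample figures (4096 MB of RAM, 1024 MB for everything else, 30 MB per Apache process) are made up for illustration; substitute the values measured on your own server.

```shell
#!/bin/sh
# Estimate MaxClients from the formula:
#   MaxClients ~ (RAM - size_all_other_processes) / size_apache_process
# All inputs are in MB. The name max_clients and the numbers used
# below are illustrative assumptions, not measurements.
max_clients() {
    total_ram_mb=$1      # total physical RAM, e.g. from `free -m`
    other_mb=$2          # memory used by everything except Apache
    per_process_mb=$3    # average size of one Apache process
    # Integer division rounds down, which errs on the safe side.
    echo $(( (total_ram_mb - other_mb) / per_process_mb ))
}

max_clients 4096 1024 30
```

Treat the result as an upper bound and start somewhat below it, since process sizes grow under real traffic.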

If you want to raise MaxClients above 256, you also need to raise ServerLimit (MaxClients <= ServerLimit).

MinSpareServers, MaxSpareServers, and StartServers

MaxSpareServers and MinSpareServers determine how many child processes to keep active while waiting for requests. If the MinSpareServers is too low and a bunch of requests come in, Apache will have to spawn additional child processes to serve the requests. Creating child processes is relatively expensive. If the server is busy creating child processes, it won’t be able to serve the client requests immediately. MaxSpareServers shouldn’t be set too high: too many child processes will consume resources unnecessarily.

The StartServers directive sets the number of child server processes created on start-up. Apache will continue creating child processes until the MinSpareServers setting is reached. This doesn’t have much effect on performance if the server isn’t restarted frequently. If there are a lot of requests and Apache is restarted frequently, set this to a relatively high value.

The default Apache values for these directives are usually adequate.

StartServers 8

MinSpareServers 5

MaxSpareServers 20

MaxRequestsPerChild

The MaxRequestsPerChild directive sets the limit on the number of requests that an individual child server process will handle. After MaxRequestsPerChild requests, the child process will die. When it is set to 0, the child process never expires. A value of a few thousand is appropriate. This limits the damage from memory leaks, since the process dies after serving a certain number of requests. Don’t set this too low, since creating new processes does have overhead.

KeepAlive

Enable HTTP persistent connections to improve latency and reduce server load significantly. Using KeepAlive can increase speed for both the server and the client. With it disabled, serving static files such as images may be a lot slower. I think it’s best to have KeepAlive on, and KeepAliveTimeout very low, perhaps one or two seconds.

KeepAlive On

KeepAliveTimeout 2

MaxKeepAliveRequests 0

The MaxKeepAliveRequests directive sets the number of requests allowed per connection when KeepAlive is set to “On”. When it is set to “0”, unlimited requests are allowed per connection. For best performance, allow unlimited requests or set a high value.

MaxRequestsPerChild

The MaxRequestsPerChild directive is used to recycle processes. When it is set to 0, each process may serve an unlimited number of requests. If it is set too low on a busy server, Apache spends a good deal of CPU time killing and spawning child processes. A value of a few thousand is usually best.

Enabling Apache file compression

Apache 1.x and 2.x can automatically compress files, but neither one comes with a compressor enabled by default. Enabling compression reduces CSS, HTML, and JavaScript file sizes by 55-65% and speeds up overall page load times by 35-40%.

For Apache 1.x, use the free mod_gzip module to compress files. For Apache 2.x, use mod_gzip or the built-in mod_deflate module.

mod_deflate can reduce bandwidth by up to 75% and improve response time.

You can use the following command to test whether compression is enabled. Look for a Content-Encoding header in the response:

curl -I -H 'Accept-Encoding: gzip,deflate' http://yourdomain.com/test.html
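To turn that check into something a script can act on, here is a minimal sketch that reads response headers on standard input and succeeds only when a gzip or deflate Content-Encoding header is present. The function name is_compressed is hypothetical, and the sample headers are hand-written so the example runs without a network connection.

```shell
#!/bin/sh
# Succeeds (exit status 0) when the headers on stdin show that the
# response was compressed. is_compressed is an illustrative name.
is_compressed() {
    # Header names are case-insensitive, so match without regard to case.
    grep -qiE '^content-encoding:[[:space:]]*(gzip|deflate)'
}

# Against a live server the headers would come from curl, for example:
#   curl -s -I -H 'Accept-Encoding: gzip,deflate' http://yourdomain.com/test.html | is_compressed
# Here a canned response stands in for the server.
printf 'HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n' | is_compressed && echo "compressed"
```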

Enable Caching

In caching, a copy of the data is stored at the client or in a proxy server so that it need not be retrieved frequently from the server. This will save bandwidth, decrease load on the server, and reduce latency. Cache control is done through HTTP headers. In Apache, this can be accomplished through mod_expires and mod_headers modules. There is also server side caching, in which the most frequently-accessed content is stored in memory so that it can be served fast. The module mod_cache can be used for server side caching; it is production stable in Apache version 2.2.

Include mod_expires to set expiration dates for specific content. Together with the ‘If-Modified-Since’ header sent by the user’s browser or proxy, this saves bandwidth and drastically speeds up your site for repeat visitors.

Separate server for static and dynamic content

Apache processes serving dynamic content take from 3MB to 20MB of RAM. The size grows to accommodate the content being served and never decreases until the process dies. As an example, let’s say an Apache process grows to 20MB while serving some dynamic content. After completing the request, it is free to serve any other request. If a request for an image comes in, then this 20MB process is serving static content – which could be served just as well by a 1MB process. As a result, memory is used inefficiently.

Use a tiny Apache (with a minimum of modules statically compiled) as the front-end server to serve static content. Requests for dynamic content should be forwarded to the heavy-duty Apache (compiled with all required modules). A light front-end server has the advantage that static content is served fast without much memory usage, and only requests for dynamic content are passed to the big server. Request forwarding can be achieved with the mod_proxy and mod_rewrite modules.

Nginx is also a good choice for static content. It is known as a good server for sites that need fast, efficient reverse proxies or fast, efficient serving of static content.

SymLinks

If SymLinksIfOwnerMatch is set, then the server will follow symbolic links only if the target file or directory is owned by the same user as the link.

Make sure ‘Options +FollowSymLinks -SymLinksIfOwnerMatch’ is set for all directories. Otherwise, Apache will issue an extra system call, lstat() per filename component to verify whether the ownership of the link and the target file match.

On a shared server, however, enabling FollowSymLinks is not advisable for security reasons.

AllowOverride

If AllowOverride is set to All, Apache will attempt to open .htaccess files.

You should avoid using .htaccess files completely if you have access to the main httpd server config file. Using .htaccess files slows down your Apache HTTP server. Any directive that you can include in a .htaccess file is better set in a Directory block, as it will have the same effect with better performance.

The main reason is performance. When AllowOverride is set to allow the use of .htaccess files, httpd will look in every directory for .htaccess files. Thus, permitting .htaccess files causes a performance hit, whether or not you actually even use them! Also, the .htaccess file is loaded every time a document is requested.

If a file is requested out of a directory /www/htdocs/example, httpd must look for the following files:

/.htaccess

/www/.htaccess

/www/htdocs/.htaccess

/www/htdocs/example/.htaccess
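The lookup pattern can be sketched in a few lines of shell. The function htaccess_candidates is a hypothetical helper that only prints the paths; it does not touch the file system or reproduce httpd's actual implementation, and it assumes an absolute directory path without wildcard characters.

```shell
#!/bin/sh
# Print every .htaccess path that would be checked for a request
# served from the given absolute directory, from the root downward.
# htaccess_candidates is an illustrative name, not part of Apache.
htaccess_candidates() {
    dir=$1
    path=""
    echo "/.htaccess"
    # Split the directory on "/" and rebuild it one component at a time.
    old_ifs=$IFS
    IFS=/
    for part in $dir; do
        [ -n "$part" ] || continue   # skip the empty field before the leading "/"
        path="$path/$part"
        echo "$path/.htaccess"
    done
    IFS=$old_ifs
}

htaccess_candidates /www/htdocs/example
```

Each extra directory level adds one more file for httpd to look for on every request, which is why deep directory trees make the cost more noticeable.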

ExtendedStatus

If mod_status is included, make sure that directive ‘ExtendedStatus’ is set to ‘Off’. Otherwise, Apache will issue several extra time-related system calls on every request made.

ExtendedStatus Off

HostnameLookups

Using HostnameLookups requires a DNS lookup before the request is completed, adding considerable latency. Therefore HostnameLookups should be turned off. If you use Allow from domain or Deny from domain directives, Apache performs a double reverse DNS lookup: a reverse lookup to find the hostname, followed by a forward lookup to verify that it is not spoofed. For this reason, use IP addresses rather than domain names in Allow and Deny directives.

Use a Specific DirectoryIndex

Avoid the situation where the server has to figure out a wildcard option for the DirectoryIndex. Instead of using a wildcard setting like this:

DirectoryIndex index

Use this specific setting:

DirectoryIndex index.html index.cgi index.pl

You can save a small amount of resources by having the most used option first in the list.

Conclusion

There is no single set of rules that suits every web server. Each server needs its own settings, which means understanding its requirements and experimenting with each of the directives discussed above. Tools such as ab and httperf can measure web server performance. On a LAMP server, PHP and MySQL also need to be optimized to get good performance.