Thursday, 7 May 2009

Slow Weblogic Response Part 3 - Tuning File Descriptors

This article is a follow-up to the Overall Tuning considerations discussed earlier.
A very important operating system setting to tune is the file descriptor limit.

File Descriptors and relation to Sockets

A File Descriptor (FD) is a handle that a process receives when it opens a file. Each process can use only a limited number of FDs, and this limit is usually an OS-level setting.
On Solaris 8, the default is 1024. In later Solaris releases, the default is 65,536, but this needs to be set explicitly in /etc/system, as described in the section on increasing file descriptors below.
So, the default available to a process on an untuned OS is 1024.
You can check the limits for your current session with ulimit -a; the nofiles(descriptors) line is the FD limit.


# ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 2048
memory(kbytes) unlimited





In a JEE server, each incoming request uses a TCP socket and this socket consumes a file descriptor from the total available for the process.

As the number of requests coming into the server increases, you can reach a point where so many sockets are open that the process runs out of FDs.
This can happen if you have a large number of clients (~1000 or more). It can also happen if HTTP keep-alive is turned off, so that a new socket is opened for each request and a lot of sockets end up in TIME_WAIT.
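
To see whether TIME_WAIT sockets are piling up, you can count them from netstat output. The snippet below is only a sketch: the function name count_time_wait is made up for this example, and it assumes the connection state appears as the literal text TIME_WAIT on each line, as it does in netstat -an output.

```shell
# count_time_wait: hypothetical helper (not part of any tool).
# Reads netstat-style output on stdin and prints how many lines
# mention the TIME_WAIT state.
count_time_wait() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'TIME_WAIT' || true
}

# Example:
#   netstat -an | count_time_wait
```

Watching this number alongside the FD count over time gives a rough idea of how much connection churn the server is handling.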

If you have Stuck Threads, they keep their FDs open, and those FDs won't be closed until the thread is released.

File Descriptors limit exceeded

A Weblogic server throws one of the errors below when the FD limit has been exhausted.



BEA-000204 java.net.SocketException: Too many open files

OR

java.io.IOException: Too many open files
at java.lang.UNIXProcess.forkAndExec(Native Method)



When a Weblogic server starts up, there is an INFO message in the logs stating how many FDs have been allocated to the process.


<<WLS Kernel>> <> <BEA-000415> <System has file descriptor limits of - soft: 2,048, hard: 2,048>

<main> <<WLS Kernel>> <> <BEA-000416> <Using effective file descriptor limit of: 2,048 open sockets/files.>
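
If you want to pull the effective limit out of the server log from a script, something like the following could work. It is a sketch under assumptions: the function name effective_fd_limit is invented here, and it expects the BEA-000416 message on a single line in the format shown above. The log file name in the example is a placeholder.

```shell
# effective_fd_limit: hypothetical helper.
# Reads WebLogic log lines on stdin and prints the number from the
# BEA-000416 message, with the thousands separator removed.
effective_fd_limit() {
    sed -n 's/.*BEA-000416.*limit of: *\([0-9,]*\) .*/\1/p' | tr -d ','
}

# Example (the log file name is a placeholder):
#   effective_fd_limit < server.log
```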



To display a process' current file descriptor limit on Solaris, the command is:



/usr/proc/bin/pfiles <pid> | grep rlimit

Example:

$ /usr/proc/bin/pfiles 9052 | grep rlimit

Current rlimit: 8192 file descriptors



You can check the actual number of FDs being used by a running server at any time using this command:

ls /proc/<pid>/fd | wc -l

Note: This is not the max limit but the actual number of FDs in use by the process at that point in time.




If you monitor this on a regular basis, you can see how often you are nearing the max limit set, and whether the values need to be tuned to support the peak loads and peak traffic timings for your server.

As an example, we recorded the FDs in use every 5 minutes, as below:

Time     FD Used
11:00    389
11:05    429
11:10    748
11:15    488
11:20    337
11:25    595
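
Samples like these can be collected with a small script. The one below is a minimal sketch, not the script we actually used: the function names count_fds and log_fds and the output file name are made up for illustration, and it relies on /proc/<pid>/fd being readable by the user running it.

```shell
# count_fds: hypothetical helper. Prints the number of FDs currently
# open in the process with PID $1, using the /proc/<pid>/fd listing.
count_fds() {
    ls "/proc/$1/fd" | wc -l | tr -d ' '
}

# log_fds: hypothetical helper. Prints one "HH:MM count" sample.
log_fds() {
    echo "$(date +%H:%M) $(count_fds "$1")"
}

# Example: take a sample every 5 minutes (300 seconds).
#   while true; do log_fds <pid> >> fd_usage.log; sleep 300; done
```

Comparing the highest value in the output file with the limit reported at server startup shows how much headroom is left at peak load.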



Increasing File Descriptors

The hard limit is the maximum value set on the OS.

The soft limit is the value actually in effect for a particular process, and it is inherited by the child processes that process starts. It cannot be higher than the hard limit.
You can change the soft limit on the fly by using:


ulimit -Sn 8192


This applies only to that telnet session and is not a permanent change. Add this command to the user's .profile file to avoid repeating it each time.
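
If the value in .profile is higher than the hard limit, the ulimit command fails. One way to guard against that is sketched below; the function name raise_fd_soft_limit is hypothetical, and 8192 is just the value used in the example above.

```shell
# raise_fd_soft_limit: hypothetical helper for a .profile.
# Sets the soft FD limit to $1, or to the hard limit if $1 is higher,
# because the soft limit can never exceed the hard limit.
raise_fd_soft_limit() {
    want=$1
    hard=$(ulimit -H -n)
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        want=$hard
    fi
    ulimit -S -n "$want"
}

# Example call:
#   raise_fd_soft_limit 8192
```

The limit is inherited by child processes, so it has to be set in the shell that starts the server, before the server is launched.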

If the hard limit needs to be changed, the root user must update the /etc/system file and the machine must be rebooted. Even though the rlim_fd_max default on Solaris 9 is 65536, it must be set explicitly in the /etc/system file.

The values to be changed are

rlim_fd_max (default hard limit)
rlim_fd_cur (default soft limit)
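
As an illustration, the /etc/system entries could look like the fragment below. The numbers are example values only (65536 is the Solaris 9 default mentioned above, and 8192 is the soft limit used earlier in this article); pick values that suit your own load.

```
* File descriptor limits (lines starting with * are comments)
set rlim_fd_max=65536
set rlim_fd_cur=8192
```

The new values take effect only after the reboot.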

On Linux:

To increase the system-wide limit on open files to (say) 65535, use the following command (as root):

echo "65535" > /proc/sys/fs/file-max

To make this value survive a system reboot, add the following line to /etc/sysctl.conf, which specifies the maximum number of open files permitted:

fs.file-max = 65535
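
To confirm the setting, you can read the value back from /proc. The helpers below are a sketch with made-up names (show_file_max, show_file_usage); they only read kernel values and change nothing.

```shell
# show_file_max: hypothetical helper. Prints the current system-wide
# limit on open files, as exposed by the kernel.
show_file_max() {
    cat /proc/sys/fs/file-max
}

# show_file_usage: hypothetical helper. /proc/sys/fs/file-nr holds three
# numbers: allocated file handles, unused handles, and the maximum.
show_file_usage() {
    read -r allocated unused max < /proc/sys/fs/file-nr
    echo "allocated=$allocated max=$max"
}

# After editing /etc/sysctl.conf, root can load it without a reboot:
#   sysctl -p
```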


If you have any queries or need clarification, leave me a comment and I'll try to get back to you.