Mysterious Crashing of Apache Server….need help from Apache

Post by masterwenly » 05. September 2006 15:32

Hi, gurus,

We have an interesting (and serious) problem with our Apache server. It worked fine for several months until last month. Since then it has kept crashing in the middle of the night, and recently it has begun to crash during the daytime as well.

We have done some analysis using home-made tools on the Apache logs and on the output of the Linux top command. (We run top every minute on our Linux server, record its output in a text file, and analyze it with a small tool I wrote.) The logs show that whenever the server crashed, even though very few people were visiting it, it had reached the maximum number of processes allowed and memory usage was very high (not surprising).

We tried changing the Apache configuration, especially the KeepAlive settings. We tuned the MaxKeepAliveRequests and KeepAliveTimeout directives, but that did not solve the problem at all.

We checked the Apache server-status page at peak time. Most of the Apache processes were in operation mode ‘W’, which means ‘Sending Reply’. However, the requests had been received several hours earlier, and I don’t think it would take that long to answer a customer’s request. All these processes are hanging there for some reason. In addition, all of the hanging processes are serving the same PHP page, our browse.php page. We suspected that this page was causing the problem, so we tested it on another server that is almost identical to our production server, and everything was fine there.

We can work around the problem temporarily by monitoring the server and restarting it regularly, but we want to know what the cause is. We hope some Apache guru can give us some help.

Thank you very much.

Here is some data we have collected:

Logs generated using the top command on Sunday, 3rd of September, from midnight to 23:59

The server was restarted at around 4:09am; the number of processes reached its peak at 15:30, and then the server crashed.
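
A rough sketch of this kind of top-log analysis (not the actual tool used here) could look like the following. It assumes the log is a concatenation of batch-mode snapshots such as those written by `top -b -n 1`, that each snapshot begins with a `top - HH:MM:SS` header line, and that the command name is the last column of each process row. The function name `count_processes` and the default process name `httpd` are hypothetical choices for illustration.

```python
import re

# Assumption: every snapshot starts with a header like "top - 15:30:01 up ...".
SNAPSHOT_HEADER = re.compile(r"^top - (\d{2}:\d{2}:\d{2})")


def count_processes(log_text, command="httpd"):
    """Return a list of (time, count) pairs, one per top snapshot.

    A row is counted when its last whitespace-separated field equals
    `command` (assumed to be the COMMAND column of top's process table).
    """
    results = []
    current_time = None
    count = 0
    for line in log_text.splitlines():
        match = SNAPSHOT_HEADER.match(line)
        if match:
            # A new snapshot begins: store the previous one, if any.
            if current_time is not None:
                results.append((current_time, count))
            current_time, count = match.group(1), 0
        elif current_time is not None:
            fields = line.split()
            if fields and fields[-1] == command:
                count += 1
    # Store the final snapshot.
    if current_time is not None:
        results.append((current_time, count))
    return results
```

Comparing the per-snapshot counts against the configured process limit would show when the limit is being approached.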

Server status at the time it was restarted and when it was about to crash on 3rd of September

We gather data from the Apache server-status page every two minutes, from midnight to 23:59.
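
As an illustration of how the worker states on the status page can be tallied, here is a minimal Python sketch. It assumes the data is taken from the machine-readable form of the status page (`server-status?auto`), which includes a `Scoreboard:` line with one character per worker slot; whether the data above was collected that way is an assumption. The function names `tally_scoreboard` and `describe` are hypothetical; the state keys follow the legend printed by mod_status.

```python
from collections import Counter

# Scoreboard key as printed on the mod_status page.
STATES = {
    "_": "Waiting for Connection",
    "S": "Starting up",
    "R": "Reading Request",
    "W": "Sending Reply",
    "K": "Keepalive (read)",
    "D": "DNS Lookup",
    "C": "Closing connection",
    "L": "Logging",
    "G": "Gracefully finishing",
    "I": "Idle cleanup of worker",
    ".": "Open slot with no current process",
}


def tally_scoreboard(auto_text):
    """Count worker slots per state character from server-status?auto text."""
    for line in auto_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(board)
    # No Scoreboard line found: return an empty tally.
    return Counter()


def describe(counts):
    """Map state characters to their mod_status descriptions."""
    return {STATES.get(key, key): value for key, value in counts.items()}
```

A steadily growing count for ‘W’ across successive samples, with few slots returning to ‘_’, would match the hanging behaviour described above.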

Our Apache configuration file
