

server crashing, but logs are empty




Posted by supiiik, 05-02-2011, 06:38 AM
I have a big problem with my new server. It is a web server (running Apache and PowerDNS) that crashes almost every day. I used this same configuration before, and the server ran for 6 months without a single crash. Now I have installed a new server, and the only new thing on it is the NFS client. When the server crashes, a hard reboot is needed; it does not even respond to pings. I decided to move my service back to the old server (which had been running properly) and installed the NFS client there, and the problem appeared on that server too. The server is hosted at Hetzner, and they have already run an 8-hour hardware test, so the hardware is OK. I have already upgraded the CentOS kernel with yum, but the problem is still there. I have kernel.panic=10 set, but the server does not reboot automatically when it crashes, and all the logs are empty (nothing unusual in them, only firewall logs). I'm really desperate and don't know what to do. New server kernel: 2.6.18-238.5.1.el5. Old server kernel: 2.6.18-194.11.1. I'm running CentOS 5.5 and 5.6 on them. I think NFS is the cause, but with no logs I can't tell where the problem is. The NFS servers are OK and stable; only the NFS clients have this problem. They are connected via a 1Gbps switch in a dedicated rack.

Posted by akasharya, 05-02-2011, 07:35 AM
Check the sar reports and try to get the load average at the time the server goes down. Is anything besides firewall logs showing up in /var/log/messages? What options are you using for the NFS mounts? Try the intr,soft,wsize=8192,rsize=8192 options in fstab for the NFS mounts.

Posted by supiiik, 05-02-2011, 08:02 AM
Sar reports? These are the options I'm using at the moment; I have 5 NFS mounts, all with the same options: rw,hard,intr,nfsvers=3,retrans=2,rsize=32768,wsize=32768,noatime. I also tried "soft" and "timeo=30", but that only made it worse: the server crashed more frequently, and /var/log/messages showed something like NFS timeouts. With the "hard" option these errors never appeared. The load average is absolutely normal at the moment of the crash (I keep top open almost non-stop, and when the server crashes, the last screen is still there).
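
Editor's note: when comparing mount options across several clients, it can help to read back what is actually in effect rather than what fstab says. The following is a minimal Python 3 sketch that parses /proc/mounts-style text and returns the options per NFS mount point. The function name, the sample host name and the sample export path are hypothetical; only the option string is taken from the post above.

```python
def parse_nfs_mounts(mounts_text):
    """Return {mount_point: {option: value}} for NFS entries.

    Expects text in the /proc/mounts layout:
    device mount_point fstype options dump pass
    Options without a value (e.g. "hard") are stored as True.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS client mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            options[key] = value if value else True
        result[fields[1]] = options
    return result


# Hypothetical sample line; host and paths are made up for illustration.
SAMPLE = (
    "nfs1.example:/export/www /mnt/www nfs "
    "rw,hard,intr,nfsvers=3,retrans=2,rsize=32768,wsize=32768,noatime 0 0\n"
    "/dev/sda1 / ext3 rw 0 0\n"
)

mounts = parse_nfs_mounts(SAMPLE)
print(mounts["/mnt/www"]["rsize"], mounts["/mnt/www"]["wsize"])
```

On a live client, the same function can be fed the contents of /proc/mounts. The kernel may report some options under different names than the ones written in fstab, so treat the output as a starting point for comparison rather than an exact mirror of the fstab line.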

Posted by supiiik, 05-02-2011, 09:12 AM
Sorry, I didn't know what sar reports were, but now I do. Server load is not the problem.

Posted by bloodyman, 05-04-2011, 02:30 PM
What network card are you using?

Posted by supiiik, 05-06-2011, 07:09 AM
I'm using an RTL8168c/8111c there. 4 days ago I set wsize=8192,rsize=8192 instead of rsize=32768,wsize=32768, and the server seems fine now. But I can only say the problem is solved once the server has run without a crash for more than 2-3 weeks. A 32KB r/wsize is the default value for NFS, so why is this configuration not stable? Is it some bug in the stock CentOS kernel?

Posted by bloodyman, 05-07-2011, 06:06 PM
I think r8169 is buggy in stock kernels. We had some problems with random crashes on our servers, so we compiled the drivers from the Realtek website, and so far we have had no problems. I advise you to compile the custom driver. r8169 is installed as a module, so there should not be any problem changing it to r8168.
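
Editor's note: before and after swapping drivers, it is worth confirming which kernel module is actually bound to the interface. The sketch below is a minimal Python 3 example that reads the driver symlink sysfs exposes under /sys/class/net/<iface>/device/driver. The function name and the interface name eth0 are assumptions for illustration, and it presumes a Python 3 interpreter is available on the box.

```python
import os


def bound_driver(iface, sys_net="/sys/class/net"):
    """Return the name of the driver bound to iface, or None if unknown.

    sysfs exposes the bound driver as a symlink; the last path component
    of its target is the driver name (e.g. "r8169" or "r8168").
    """
    link = os.path.join(sys_net, iface, "device", "driver")
    if not os.path.islink(link):
        # Virtual interfaces (lo, bridges) have no device/driver link.
        return None
    return os.path.basename(os.readlink(link))


# "eth0" is an assumed interface name; adjust it for the actual server.
print(bound_driver("eth0"))
```

If this prints r8169 on a machine with an RTL8168/8111 chip, the stock module is still in use and the vendor driver has not taken over the device.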

Posted by bizness, 05-07-2011, 08:26 PM
If you have nothing in your logs, it may be a power supply starting to go bad.

Posted by MikeDVB, 05-07-2011, 08:33 PM
Unlikely, especially as it passed hardware tests and the issues only cropped up after installing NFS.

Posted by bizness, 05-07-2011, 10:28 PM
What if it was just a coincidence? It can still be power related. We have seen this multiple times in our datacenter with many of our clients. It starts out as unexplained reboots, and finally one day the server doesn't come back: the power supply is completely dead. I think it is worth a shot for your host to replace the power supply. Is this a dedicated box or colocation?

Posted by SoftDux, 05-18-2011, 06:32 AM
2 faulty power supplies, on 2 different servers, coinciding only when he mounts the NFS clients? Very unlikely. I would also look at the Realtek NICs; they're not the best. @OP, if you can, replace the Realtek NICs with Intel NICs, and also make sure all the firewalls between the NFS server(s) and NFS clients pass all traffic through. I would normally just whitelist the NFS client on the firewall for this purpose.
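
Editor's note: a quick way to spot a firewall in the path is to test from the client whether the usual NFS TCP ports on the server accept connections. The following Python 3 sketch checks port 111 (portmapper) and port 2049 (nfs). It is only a partial check under stated assumptions: NFSv3 also relies on mountd, lockd and statd, which often sit on other ports, and UDP is not tested at all. The function names and the host name in the example are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_nfs_ports(host, ports=(111, 2049)):
    """Return {port: reachable} for the given TCP ports on host."""
    return {port: tcp_port_open(host, port) for port in ports}


# Example call (hypothetical host name):
# print(check_nfs_ports("nfs1.example"))
```

A False result points at a firewall or a stopped service, but a True result for both ports does not prove that every NFS-related port is open.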

Posted by benjohnsonfs, 05-21-2011, 02:52 AM
Please check whether your syslog service is running.


