Server crash: detective work advice
Posted by lamp, 09-05-2007, 04:35 PM |
Hello,
I am running FC2 with cPanel.
My server stopped responding to everything from HTTP to ping, and I had to do a hard reboot.
Now I'd like to figure out what happened, but I'm not sure where to start. I've been sifting through /var/log/messages but can't pinpoint what I'm looking for.
Do you have any advice on where to start and what to look for?
Thanks,
Lamp
|
Posted by FirmbIT, 09-05-2007, 05:28 PM |
If you know roughly when the server went down, look through /var/log/messages for the gap between the last entry before the crash and the first entry after the reboot. Once you find it, paste the messages around it here so we can better understand what happened.
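For anyone who would rather not scroll for the gap by eye, here is a minimal Python sketch of the same idea. It assumes the classic syslog timestamp layout ("Sep 5 16:05:57 host ...") and an English locale for month names. The function name find_largest_gap and the year parameter are made up for illustration; syslog lines carry no year, so one has to be supplied.

```python
from datetime import datetime


def find_largest_gap(lines, year=2007):
    """Return (gap, last_line_before, first_line_after) for the largest
    time gap between consecutive timestamped lines, or None if fewer
    than two timestamped lines are found.

    Assumption: timestamps look like "Sep 5 16:05:57" and all lines
    belong to the same year, which the caller supplies.
    """
    prev_time = None
    prev_line = None
    best = None
    for raw in lines:
        line = raw.rstrip("\n")
        parts = line.split(None, 3)  # month, day, time, rest of line
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(
                f"{year} {parts[0]} {parts[1]} {parts[2]}",
                "%Y %b %d %H:%M:%S",
            )
        except ValueError:
            continue  # not a timestamped syslog line, skip it
        if prev_time is not None:
            gap = stamp - prev_time
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best


# Example use (path as mentioned in this thread):
# with open("/var/log/messages", errors="replace") as fh:
#     print(find_largest_gap(fh))
```

The lines just before the reported gap are usually the ones worth pasting. Note that the largest gap in a quiet log is not necessarily the crash, so compare it against the time you noticed the outage.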
|
Posted by lamp, 09-05-2007, 05:55 PM |
Here it is. You'll notice the gap between 16:05 and 16:54:
Sep 5 16:05:57 machine kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000000c
Sep 5 16:05:57 machine kernel: printing eip:
Sep 5 16:05:57 machine kernel: c01b4a90
Sep 5 16:05:57 machine kernel: *pde = 21391001
Sep 5 16:05:57 machine kernel: Oops: 0000 [#1]
Sep 5 16:05:57 machine kernel: SMP
Sep 5 16:05:57 machine kernel: Modules linked in: ipt_owner iptable_mangle ip_conntrack_ftp ipt_conntrack ipt_LOG ipt_limit ipt_multiport autofs4 tg3 e100 mii ipt_REJECT ipt_state ip_conntrack iptable_filter ip_tables floppy sg microcode dm_mod ohci_hcd video button battery ac ext3 jbd mptscsih mptbase sd_mod scsi_mod
Sep 5 16:05:57 machine kernel: CPU: 0
Sep 5 16:05:57 machine kernel: EIP: 0060:[] Not tainted VLI
Sep 5 16:05:57 machine kernel: EFLAGS: 00010207 (2.6.10-1.771_FC2smp)
Sep 5 16:05:57 machine kernel: EIP is at __rb_rotate_left+0x8/0x36
Sep 5 16:05:58 machine kernel: eax: e37f56c0 ebx: c04255e4 ecx: e37f56c0 edx: 00000000
Sep 5 16:05:58 machine kernel: esi: e37f56c0 edi: ea9a3e00 ebp: c04255e4 esp: e4d49ed4
Sep 5 16:05:58 machine kernel: ds: 007b es: 007b ss: 0068
Sep 5 16:05:58 machine kernel: Process exim (pid: 20541, threadinfo=e4d49000 task=e847e020)
Sep 5 16:05:58 machine pure-ftpd: (user@remote_ip) [NOTICE] /home/user//public_html/Website3/modules/gallery2/modules/imagemagick/locale/sk/LC_MESSAGES/modules_imagemagick.mo uploaded (893 bytes, 13.60KB/sec)
Sep 5 16:05:58 machine kernel: Stack: e83dccc0 c01b4ba1 ea9a3e00 ea9a3e00 e83dccc8 00007d43 c01977f4 e83dccc0
Sep 5 16:05:58 machine kernel: 0000000f e4d49f58 e4d49f67 ffffffea c01978a8 00000017 00000000 00007d43
Sep 5 16:05:58 machine kernel: c031f1e0 e4d49f58 00000000 e924ca00 00007d43 c0198a21 ffffffff 001f0000
Sep 5 16:05:58 machine kernel: Call Trace:
Sep 5 16:05:58 machine kernel: [] rb_insert_color+0xad/0xcc
Sep 5 16:05:58 machine kernel: [] key_user_lookup+0xd4/0x101
Sep 5 16:05:58 machine kernel: [] key_alloc+0x53/0x2bf
Sep 5 16:05:58 machine pure-ftpd: (user@remote_ip) [INFO] Can't change directory to /public_html/Website3/modules/gallery2/modules/imagemagick/locale/sk/LC_MESSAGES/_notes: No such file or directory
Sep 5 16:05:58 machine kernel: [] keyring_alloc+0x1a/0x48
Sep 5 16:05:58 machine kernel: [] alloc_uid_keyring+0x2b/0x7b
Sep 5 16:05:58 machine kernel: [] alloc_uid+0xb6/0x143
Sep 5 16:05:58 machine kernel: [] set_user+0xb/0x8c
Sep 5 16:05:58 machine kernel: [] sys_setreuid+0xc7/0x174
Sep 5 16:05:58 machine kernel: [] syscall_call+0x7/0xb
Sep 5 16:05:58 machine kernel: Code: 59 83 bc 82 04 01 00 00 00 75 ea 41 83 f9 01 76 ed 31 c0 5b c3 57 b9 45 00 00 00 89 c7 31 c0 f3 ab 5f c3 53 89 c1 89 d3 8b 50 08 <8b> 42 0c 85 c0 89 41 08 74 02 89 08 89 4a 0c 8b 01 85 c0 89 02
Sep 5 16:05:58 machine pure-ftpd: (user@remote_ip) [ERROR] Can't create directory: Disk quota exceeded
Sep 5 16:05:58 machine pure-ftpd: (user@remote_ip) [INFO] Can't change directory to /public_html/Website3/modules/gallery2/modules/imagemagick/locale/sk/LC_MESSAGES/_notes: No such file or directory
**** SERVER WENT DOWN HERE *****
Sep 5 16:54:16 machine syslogd 1.4.1: restart.
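The kernel lines in the excerpt above are interleaved with pure-ftpd messages, which makes the oops harder to read. A rough Python sketch for pulling out only the kernel messages from the first oops marker onward is below. The helper names extract_oops and faulting_symbol are hypothetical, and the regular expression assumes the same "Mon D HH:MM:SS host kernel: ..." layout seen in this log.

```python
import re

# Matches "Sep 5 16:05:57 machine kernel: <message>" and captures <message>.
KERNEL_RE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+kernel:\s?(.*)$")


def extract_oops(lines):
    """Return kernel message bodies starting at the first oops marker.

    Non-kernel lines (for example pure-ftpd) are dropped.
    """
    out = []
    in_oops = False
    for raw in lines:
        match = KERNEL_RE.match(raw.rstrip("\n"))
        if not match:
            continue
        msg = match.group(1)
        if not in_oops and ("Unable to handle kernel" in msg or "Oops" in msg):
            in_oops = True
        if in_oops:
            out.append(msg)
    return out


def faulting_symbol(oops_lines):
    """Return the symbol from the "EIP is at ..." line, or None."""
    for msg in oops_lines:
        match = re.search(r"EIP is at (\S+)", msg)
        if match:
            return match.group(1)
    return None
```

Run against the excerpt above, the "EIP is at" line is the one that names the function the kernel was executing when it faulted, and the "Process" line names the process that was running at the time.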
|
Posted by FirmbIT, 09-05-2007, 06:03 PM |
Well, this doesn't look particularly good:
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
What are your hardware specs?
|
Posted by lamp, 09-05-2007, 06:07 PM |
damn...
root@computer [~]# fdisk -l
Disk /dev/sda: 36.4 GB, 36420075008 bytes
255 heads, 63 sectors/track, 4427 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 1288 10241437+ 83 Linux
/dev/sda3 1289 2180 7164990 83 Linux
/dev/sda4 2181 4427 18049027+ f W95 Ext'd (LBA)
/dev/sda5 2181 2690 4096543+ 82 Linux swap
/dev/sda6 2691 2817 1020096 83 Linux
/dev/sda7 2818 2944 1020096 83 Linux
/dev/sda8 2945 4427 11912166 83 Linux
Disk /dev/sdb: 36.4 GB, 36420075008 bytes
255 heads, 63 sectors/track, 4427 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 4427 35559846 83 Linux
|
Posted by FirmbIT, 09-05-2007, 06:13 PM |
You most likely have faulty RAM and will need to get it replaced. One way to test your memory is to download the gcc source code and compile it. If the build crashes, and then crashes in a different place after you re-issue the 'make' command, the RAM is faulty.
If it isn't the RAM, then it could be a fault in the motherboard.
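The repeated-build check described above can be automated roughly as follows. This is only a sketch: run_repeatedly and looks_nondeterministic are invented names, the command to run is whatever build you choose (for example 'make' in an unpacked gcc source tree), and comparing the last line of output is a crude stand-in for "where the build crashed". Differing failure points suggest flaky hardware; they do not prove it.

```python
import subprocess


def run_repeatedly(cmd, runs=3, cwd=None):
    """Run cmd several times and return a list of
    (return_code, last_line_of_output) tuples, one per run."""
    results = []
    for _ in range(runs):
        proc = subprocess.run(
            cmd,
            cwd=cwd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
            errors="replace",
        )
        lines = proc.stdout.strip().splitlines()
        results.append((proc.returncode, lines[-1] if lines else ""))
    return results


def looks_nondeterministic(results):
    """True if at least two failed runs ended in different places.

    Assumption: a failed run has a non-zero return code, and the last
    output line identifies roughly where it stopped.
    """
    distinct_failures = {r for r in results if r[0] != 0}
    return len(distinct_failures) > 1


# Example use (the directory is a placeholder for wherever the source
# was unpacked):
# results = run_repeatedly(["make"], runs=3, cwd="/path/to/gcc-build")
# print(results, looks_nondeterministic(results))
```

A build that fails at the same spot every time points to a software or configuration problem rather than memory, which is why the comparison only counts failures that differ.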
|
Posted by david510, 09-06-2007, 01:50 AM |
Seems to be an issue with the RAM. Check the top processes at the time the server crashed using the files in the /var/log/dcpumon directory.
|