

DDOS or not?




Posted by AlexSteel, 11-30-2009, 08:40 PM
One of our web servers has packet loss and extremely low throughput for a few hours every day. The funny thing is, this always stops before we hit the peak hour of the day. My guess is that someone else on the same router has its peak hours just before we do. However, our hosting provider's opinion is that the server is being DDOS'd.

Do this x2 when the server is busy. FIN_WAIT2 and TIME_WAIT are pretty high. I analyzed those a bit further, but I do not have a lot of experience.

FIN_WAIT2 is coming from a bunch of connections which close pretty fast. However, some IP's have 20+ FIN_WAIT2 connections. I did a whois on those. Most are from schools and other educational institutions. The site is very popular with kids, so I don't think that is suspicious.

TIME_WAIT: there are some IP's with a bunch of TIME_WAIT connections. Often they come from "Wildblue Communications", some satellite internet provider. I've put them in the firewall, hoping those are not from educational institutions...

I've made some changes to the TCP/IP stack: more memory, SYN cookies enabled, lower timeout. It doesn't make any difference.

What to do next? I have a server with almost zero load, plenty of free memory, no strange bandwidth usage, and packet loss at a time that is not the busiest of the day. Maybe it has a few too many connections in the FIN_WAIT2 and TIME_WAIT states, but could that kill it?

Additional information:

Server
CPU: Intel Quad Core
MEM: 8GB
OS: CentOS 5.3 64bit
Load: 0.05 - 0.20

Website
Visitors: 60k/day, 200k pageviews/day
Platform: PHP/MySQL/APC
Bandwidth: 20 - 40 MBit

Apache
Timeout 30
ExtendedStatus Off
KeepAlive On
KeepAliveTimeout 3
MaxKeepAliveRequests 256
ServerLimit 1024
StartServers 128
MinSpareServers 256
MaxSpareServers 512
MaxRequestsPerChild 65536
ServerSignature Off
ServerTokens Prod
HostnameLookups off

Last edited by AlexSteel; 11-30-2009 at 08:46 PM.
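The per-state and per-IP breakdown described in this post can be reproduced with a short script. This is only a minimal sketch, assuming Linux-style `netstat -ant` output; the function name `summarize_netstat` and the sample addresses (taken from the documentation ranges) are illustrative and do not come from the original post.

```python
from collections import Counter

def summarize_netstat(text):
    """Return (connections per state, connections per (remote IP, state))."""
    by_state = Counter()
    by_ip_state = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Connection lines look like: proto recv-q send-q local foreign state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        # Drop the port; rsplit also copes with "::ffff:1.2.3.4:80" style addresses
        ip = foreign.rsplit(":", 1)[0]
        by_state[state] += 1
        by_ip_state[(ip, state)] += 1
    return by_state, by_ip_state

# Hypothetical sample, not the poster's real data
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:80           198.51.100.7:51234      TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.7:51235      TIME_WAIT
tcp        0      0 192.0.2.10:80           203.0.113.9:40000       FIN_WAIT2
tcp        0      0 192.0.2.10:80           203.0.113.9:40001       ESTABLISHED
"""

if __name__ == "__main__":
    states, per_ip = summarize_netstat(SAMPLE)
    print("Per state:", dict(states))
    # Show the remote IPs holding the most connections in a single state
    for (ip, state), count in per_ip.most_common(3):
        print(ip, state, count)
```

To use it on live data, capture the output of `netstat -ant` and pass the text to `summarize_netstat`. Comparing a capture from the quiet period with one from the problem window shows whether the state mix changes at all.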

Posted by madaboutlinux, 12-01-2009, 05:16 AM
The number of TIME_WAIT connections does not seem high enough to confirm a DOS attack, but it remains a possibility since packets are being dropped. Another reason for a high number of TIME_WAITs is a very busy application with short-lived connections. In that case TIME_WAIT increases, because these are connections that are closed but still waiting to make sure all data for that connection has cleared the network. So you may have to try tuning the TCP settings and see how it goes, and also make sure another server in the same rack isn't the cause of the problem.

Posted by AlexSteel, 12-01-2009, 06:01 AM
Thanks for your response. The server is running a statistics API that works with a simple GET request. It's called about 500k times a day. It's a simple PHP script, so it does not differ from a normal page. I don't think it can cause any problems. It's a simple dedicated server, so I will never get to know the IP's of other servers in the rack. I have pinged some other IP's in the same range, and some have packet loss as well. The hosting company "checked" the server (checked the load, ran netstat and exited). They say the network is fine and the server is under DDOS attack. This is my sysctl.conf. I changed it after the packet loss started, but it doesn't make any difference. Although the modifications were added carefully, I could have made a mistake (I'm far from an expert).

Posted by madaboutlinux, 12-01-2009, 06:15 AM
Well, it's quite hard to say where the problem is, but you can try one thing here. You say you get packet loss every day, so as soon as the problem occurs, stop all the services on your server for some time and see if the packet loss to your server still occurs. At the same time, ping the nearby IPs (which you already tested) and see if you get any packet loss there. This might tell you whether it's your application or something else wrong on the network.

Posted by Steven, 12-01-2009, 07:26 AM
You have high values for your Apache parameters, but you lack a MaxClients parameter.

Posted by AlexSteel, 12-01-2009, 09:39 AM
The packet loss started again, but I don't see any difference between the netstat output from last night and now. @Steven, I could (or should) set MaxClients to 1024, but that won't fix this problem. I will now look for some servers on the same router to check if they're having packet loss. Last edited by AlexSteel; 12-01-2009 at 09:43 AM.

Posted by madaboutlinux, 12-01-2009, 10:28 AM
Make sure you stop everything on your server as I have mentioned earlier.

Posted by mellow-h, 12-01-2009, 10:34 AM
Correct. Why did you set the spare servers (MinSpareServers and MaxSpareServers) that high?

Posted by AlexSteel, 12-01-2009, 11:04 AM
Increased MaxClients to 1024. The reason these values are so high is simple: it's a pretty busy website. 200 simultaneous connections is normal, and it needs some room to handle traffic spikes. Also, I don't think it's a problem to keep those processes open; there is plenty of RAM. Creating new Apache children takes a whole lot of time. I will try to change those values to about half. I've now routed the DNS of the statistics API elsewhere, which should reduce the number of req/sec quite a bit. Throughput to Europe has now dropped to < 20kbyte. No difference in the netstat output. Last edited by AlexSteel; 12-01-2009 at 11:13 AM.
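Whether "plenty of RAM" holds for 1024 prefork children can be checked with simple arithmetic. The sketch below is only an illustration: the 8 GB total comes from the first post, but the per-child size (20 MB) and the memory reserved for the OS and MySQL (2 GB) are hypothetical placeholders, since the thread never states them. Measure the real per-child resident size before drawing conclusions.

```python
def prefork_memory_mb(max_clients, per_child_mb):
    """Worst-case memory use if every allowed Apache child is running."""
    return max_clients * per_child_mb

def max_clients_that_fit(total_ram_mb, reserved_mb, per_child_mb):
    """Largest MaxClients that fits in RAM after reserving memory for other services."""
    return (total_ram_mb - reserved_mb) // per_child_mb

TOTAL_RAM_MB = 8 * 1024   # 8GB, from the first post
RESERVED_MB = 2 * 1024    # ASSUMPTION: OS + MySQL + APC, not stated in the thread
PER_CHILD_MB = 20         # ASSUMPTION: hypothetical size of one Apache/PHP child

if __name__ == "__main__":
    print("Worst case at MaxClients 1024:",
          prefork_memory_mb(1024, PER_CHILD_MB), "MB")
    print("MaxClients that fits:",
          max_clients_that_fit(TOTAL_RAM_MB, RESERVED_MB, PER_CHILD_MB))
```

Under these assumed numbers, the worst case exceeds physical RAM, which is why a MaxClients value is usually derived from measured per-child memory. Shared pages make the real figure lower than a naive per-child multiplication, so treat the result as an upper bound.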

Posted by plumsauce, 12-01-2009, 11:20 AM
For your ping test, you might try this. Simultaneously ping:
- your default gateway
- the gateway beyond that
- and the next gateway

You might also try it from your office. What you are trying to establish is whether the routers/switches being used by your host are suffering from congestion. If it is *always* the same time, consider whether a neighbour is running a remote backup job at that time.
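The simultaneous ping test suggested here could be scripted roughly as follows. This is a sketch under assumptions: it relies on the Linux `ping -c` option and on the usual "N% packet loss" summary line, and the helper names (`parse_loss`, `ping_loss`, `ping_all`) are made up for illustration.

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Matches the summary line printed by Linux ping, e.g. "15% packet loss"
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")

def parse_loss(ping_output):
    """Return the packet loss percentage from ping's summary, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None

def ping_loss(host, count=20):
    """Ping one host `count` times and return its packet loss percentage."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True,
    )
    return parse_loss(result.stdout)

def ping_all(hosts, count=20):
    """Ping all hosts in parallel so every result covers the same time window."""
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        losses = pool.map(lambda host: ping_loss(host, count), hosts)
        return dict(zip(hosts, losses))

# Example call (addresses are placeholders for the gateways on your own route):
#   ping_all(["192.0.2.1", "192.0.2.254", "198.51.100.1"])
```

Running the pings in parallel matters because loss that shows up on the first gateway and every hop after it points at the first hop, while loss that appears only at your own server points at the server or its switch port.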

Posted by AlexSteel, 12-01-2009, 11:45 AM
The problem is, the router before our server does not respond to ping, so I cannot check it. Offloading the server by changing the DNS of the API didn't make any difference (we use DNS monitoring with short TTL). And yes, it's *always* the same time. I'm now also serving large files from a CDN. Should reduce load even more. But still no difference.

Posted by AlexSteel, 12-01-2009, 12:23 PM
Small update: I found a server behind the same router (by pinging some IP's in the same block and doing a traceroute). However, this server does not have any packet loss at all. I've submitted a ticket to the hosting provider; I'll keep you guys updated. Last edited by AlexSteel; 12-01-2009 at 12:29 PM.

Posted by AlexSteel, 12-01-2009, 01:37 PM
Here comes the mystery. Now the number of connections is increasing, but more bandwidth is also becoming available. The packet loss is always gone around this time. I think I'm going to eat my shoes.

Posted by AlexSteel, 12-03-2009, 09:12 AM
For the people who care: They picked up the box and placed it in another rack and connected it to a different switch. It looks like the problem is solved.

Posted by madaboutlinux, 12-03-2009, 09:29 AM
If the problem used to occur at the same time of day and only to your server in the rack, how come changing the rack solved the problem? But well, glad to hear the problem is solved, even though they couldn't figure out what it was.

Posted by AlexSteel, 12-03-2009, 09:48 AM
I don't know for sure that there are no other servers with the same problem behind that switch. I've checked just one (found it by doing a lot of pinging and tracerouting). Maybe the switch port was broken or misconfigured. What we've learned here is that a bunch of connections in the TIME_WAIT state doesn't mean the server is under attack, especially when it's serving a medium to large website.
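A rough estimate supports that conclusion. The sketch below uses the traffic figures quoted earlier in the thread (about 500k API calls and 200k pageviews per day) and rests on several simplifying assumptions: every request uses its own connection, the server is the side that closes first (so TIME_WAIT lands on the server), traffic is spread evenly over the day, and TIME_WAIT lasts 60 seconds as it does on Linux.

```python
SECONDS_PER_DAY = 24 * 60 * 60
LINUX_TIME_WAIT_SECONDS = 60  # TIME_WAIT duration on Linux

def expected_time_wait(connections_per_day,
                       time_wait_seconds=LINUX_TIME_WAIT_SECONDS):
    """Average number of sockets in TIME_WAIT for a steady connection rate."""
    rate_per_second = connections_per_day / SECONDS_PER_DAY
    # Each closed connection lingers for time_wait_seconds, so the steady-state
    # count is simply rate multiplied by duration.
    return rate_per_second * time_wait_seconds

if __name__ == "__main__":
    api_only = expected_time_wait(500_000)
    api_and_pages = expected_time_wait(500_000 + 200_000)
    print(f"API calls only: about {api_only:.0f} sockets in TIME_WAIT")
    print(f"API plus pageviews: about {api_and_pages:.0f} sockets in TIME_WAIT")
```

Even this daily average gives several hundred TIME_WAIT sockets from ordinary traffic. Real traffic is not evenly spread and each pageview pulls in extra requests, so peak-hour counts well above that are unremarkable for a site of this size.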


