Geert's Place

Fake Google Bots

Posted on Jan. 15, 2015, under Linux

In a previous post I mentioned an enormous load on my Apache web server. I have now found the source and cause of this very annoying problem: a not-so-subtle mix of DDoS attacks and an arsenal of fake Google bots. In the logs (typically /var/log/httpd/access_log) it looks like this:


94.23.6.88 - - [15/Jan/2015:16:35:40 +0100] "POST /wp-login.php HTTP/1.1" 403 214 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
209.20.80.243 - - [15/Jan/2015:16:36:26 +0100] "POST /wp-login.php HTTP/1.1" 403 214 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
199.59.56.6 - - [15/Jan/2015:16:36:43 +0100] "POST /wp-login.php HTTP/1.1" 403 214 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
195.154.75.101 - - [15/Jan/2015:16:36:49 +0100] "POST /wp-login.php HTTP/1.1" 403 214 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Now, they may LOOK like Google bots (which are normally quite harmless), but take a closer look. Why would they use so many different IP addresses and networks? And why would they be trying to log in to my blog? 🙂
When I look up those IP addresses, the hostnames are: ns204288.ovh.net, mail.jenruno.com, synconlinemedia.com and 195-154-75-101.rev.poneytelecom.eu ...
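
If you want to see how many different addresses are posing as Googlebot before you start looking them up, a short script can tally them. What follows is only a minimal Python sketch, assuming the log uses the combined format shown above; the function name `googlebot_claims` and the regular expression are my own illustration, not part of Apache or any library.

```python
import re
from collections import Counter

# Combined log format:
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_claims(lines):
    """Count requests per client IP whose User-Agent claims to be Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not parse (other formats, garbage) are skipped.
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("ip")] += 1
    return hits
```

To use it, open the log file and pass the file object in, for example `googlebot_claims(open("/var/log/httpd/access_log", errors="replace")).most_common(20)` for the 20 busiest claimants. The path is the one mentioned above and may differ on your system.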

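They look a lot more like hacked machines, if you ask me. The main Google bot IP range is actually known: it's 66.249.*. Google can crawl from other addresses too, so the check Google itself documents is a reverse DNS lookup that should return a name under googlebot.com or google.com, followed by a forward lookup of that name that returns the same IP.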
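
A prefix match is a quick test, but Google's documented way to confirm a crawler is a reverse DNS lookup (the name should end in googlebot.com or google.com) followed by a forward lookup that returns the same address. Here is a hedged Python sketch of that two-step check. The function name `is_real_googlebot` and the injectable `reverse`/`forward` parameters are my own choices so the logic can be tried without touching the network. It relies on `socket.gethostbyname_ex`, so it only covers IPv4.

```python
import socket

# Domains Google documents for its crawlers' reverse DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, check the domain, then confirm the name maps back."""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        # No PTR record or resolver failure: treat as not verified.
        return False
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    # The forward lookup must include the address we started from,
    # otherwise the PTR record could simply be forged.
    return ip in addresses
```

Called as `is_real_googlebot("94.23.6.88")` it performs live DNS lookups, so it is better suited to an occasional check on suspicious addresses than to something run on every request.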

So, there are a few ways to tackle these fakers. First of all there is a nice plugin for WordPress called Wordfence. It adds some security, but you need to know what you're doing. A second way would be to block the IPs at the firewall level, but that turns into a nightmare and a full-time job quite fast.
The most effective way to block them and still allow the REAL Google bots to carry on is to answer the fakers with Apache's default 403 Forbidden error page. That decreases the load on the web server dramatically.

In your server (or virtual host) document root, usually /var/www/html/, put an .htaccess file that contains:

RewriteEngine on
RewriteOptions inherit
Options +FollowSymlinks
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} Googlebot
RewriteCond %{REMOTE_ADDR} !^66\.249\.
RewriteRule .* - [F]
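
To make the logic of those two RewriteCond lines explicit: consecutive conditions are ANDed, so the [F] rule only fires when the User-Agent contains "Googlebot" and the client address does not start with 66.249. The small Python sketch below mirrors that decision so you can try sample values. The function name `would_be_forbidden` is hypothetical, and the code only imitates the rule; it is not how mod_rewrite is implemented.

```python
import re

# Same pattern as the second RewriteCond: address begins with 66.249.
GOOGLE_PREFIX = re.compile(r"^66\.249\.")


def would_be_forbidden(user_agent, remote_addr):
    """Mirror the two RewriteCond lines: both must hold for [F] to apply."""
    # RewriteCond %{HTTP_USER_AGENT} Googlebot (case-sensitive substring match)
    claims_googlebot = re.search(r"Googlebot", user_agent) is not None
    # RewriteCond %{REMOTE_ADDR} !^66\.249\. (negated prefix match)
    outside_range = GOOGLE_PREFIX.search(remote_addr) is None
    return claims_googlebot and outside_range
```

With the user agent from the log lines above, `would_be_forbidden(ua, "94.23.6.88")` is true, while the same user agent from an address starting with 66.249. passes through, as does any visitor that does not claim to be Googlebot.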

For this to work, the rewrite module must be loaded in your Apache, and AllowOverride for that directory must permit these directives (an .htaccess file is ignored under AllowOverride None). Have a look in your config file (/etc/httpd/conf/httpd.conf) for a line that reads:
LoadModule rewrite_module modules/mod_rewrite.so

If it's commented out, remove the leading # and restart Apache by running "apachectl restart" as root. Running "apachectl configtest" first will catch syntax errors before the restart.
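
If you would rather not eyeball the file, the check for an uncommented LoadModule line can be scripted. This is a rough Python sketch under a clear assumption: it only scans the text you give it and does not follow Include directives, so on systems that load modules from a separate directory it can report a false negative. The function name `rewrite_module_enabled` is made up for this example.

```python
import re

# An active line starts with optional indentation, not with '#'.
# Apache directive names are case-insensitive, hence IGNORECASE.
LOAD_REWRITE = re.compile(
    r"^[ \t]*LoadModule[ \t]+rewrite_module[ \t]+\S+",
    re.MULTILINE | re.IGNORECASE,
)


def rewrite_module_enabled(conf_text):
    """True if an uncommented LoadModule line for rewrite_module is present."""
    return LOAD_REWRITE.search(conf_text) is not None
```

For example, `rewrite_module_enabled(open("/etc/httpd/conf/httpd.conf").read())` checks the config file mentioned above.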

Have fun killing off fake bots 😉

Apache extreme memory usage

Posted on Sep. 13, 2014, under Linux, Personal

My Apache web server suddenly started to use an insane amount of RAM, freezing the whole box. There are a couple of things you can tune to keep this from happening. Obviously, one or more scripts are causing the issue. You can go looking for them by running the following command:

ps -eo pmem,pcpu,pid,user,rss,vsize,args --sort=-pmem | head -n 11

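This lists the 10 processes using the most RAM. While you're troubleshooting, you can run the following command as root to drop the kernel's page cache and dentry/inode caches. It does not reclaim memory held by the Apache processes themselves: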

echo 3 > /proc/sys/vm/drop_caches

Then change the following settings in your httpd.conf (typically found in /etc/httpd/conf/):

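<IfModule prefork.c>
# Children started at boot and idle children kept in reserve
StartServers 2
MinSpareServers 2
MaxSpareServers 5
# Upper limit on simultaneous children (called MaxRequestWorkers in Apache 2.4)
MaxClients 150
# Recycle each child after 500 requests (called MaxConnectionsPerChild in Apache 2.4)
MaxRequestsPerChild 500
</IfModule>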

LoadModule deflate_module modules/mod_deflate.so
<Location />
AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml application/x-javascript
</Location>

KeepAlive On
KeepAliveTimeout 2
MaxKeepAliveRequests 80

<Directory />
Options FollowSymLinks
AllowOverride None
</Directory>

ExtendedStatus Off

Timeout 45
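
Whether a MaxClients of 150 is safe depends on how big each Apache child gets. A common rule of thumb is to divide the RAM you are willing to give Apache by the average resident size of one child. The Python sketch below does that arithmetic from the text output of `ps -eo rss,comm`. The function names, the `httpd` process name, and the sample figures in the usage note are assumptions for illustration. RSS counts shared pages once per process, so the result errs on the low side.

```python
def average_rss_kib(ps_output, command="httpd"):
    """Average resident set size in KiB of processes named `command`.

    Expects the text printed by `ps -eo rss,comm`, header line included.
    """
    sizes = []
    for line in ps_output.splitlines()[1:]:  # skip the header
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[1].strip() == command:
            sizes.append(int(fields[0]))
    return sum(sizes) / len(sizes) if sizes else 0.0


def estimate_max_clients(ram_for_apache_mib, avg_rss_kib):
    """How many children of the measured size fit in the RAM budget."""
    if avg_rss_kib <= 0:
        raise ValueError("no matching processes were measured")
    return int(ram_for_apache_mib * 1024 // avg_rss_kib)
```

As a made-up example, if the children average 40960 KiB (40 MiB) and you reserve 2048 MiB for Apache, `estimate_max_clients(2048, 40960)` gives 51, far below 150. On a box with that profile, the higher limit would let Apache grow well past the available memory under load.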

Also, it might be worth having a look at EasyApache to add mpm-prefork, or at an alternative such as Nginx or Litehttpd.
