Detecting Googlebot

  • yiqingpeng
  • 2019-03-28
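Method 1: Reverse DNS lookup (https://support.google.com/webmasters/answer/80553):
1. Get the request IP, then run host request-ip (in PHP: gethostbyaddr('66.249.90.77')) and check that the result is a hostname ending in google.com or googlebot.com, such as xxxx.google.com.
2. Run host domain (in PHP: gethostbyname('rate-limited-proxy-66-249-90-77.google.com')) to do a forward DNS lookup on the Google hostname from step 1. If the returned IP address matches the request IP, the request passes verification.
For example: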
> host 66.249.90.77
77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.
> host rate-limited-proxy-66-249-90-77.google.com
rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
Note: Windows has no host command; use nslookup instead.
The original text from Google, for reference:
1. Run a reverse DNS lookup on the accessing IP address from your logs, using the host command.
2. Verify that the domain name is in either googlebot.com or google.com.
3. Run a forward DNS lookup on the domain name retrieved in step 1 using the host command on the retrieved domain name. Verify that it is the same as the original accessing IP address from your logs.
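
The reverse-then-forward check described above can be sketched in Python, whose socket.gethostbyaddr and socket.gethostbyname_ex play roughly the same role as the PHP functions mentioned earlier. This is only a minimal sketch under stated assumptions: the names is_google_hostname and verify_googlebot_ip are hypothetical, and the reverse_lookup / forward_lookup parameters are hypothetical hooks added so the logic can be exercised without network access.

```python
import socket

# Domain suffixes named in Google's instructions. The leading dot keeps
# look-alike names such as "fakegoogle.com" from matching.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_hostname(hostname):
    """Return True if the hostname ends in googlebot.com or google.com."""
    host = hostname.rstrip(".").lower()
    return host.endswith(GOOGLE_SUFFIXES)


def verify_googlebot_ip(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse lookup, suffix check, then forward lookup on the result.

    reverse_lookup(ip) -> hostname and forward_lookup(hostname) -> list of IPs
    are hypothetical injection points; by default they use the socket module.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda name: socket.gethostbyname_ex(name)[2]

    try:
        hostname = reverse_lookup(ip)  # step 1: reverse DNS lookup
    except OSError:
        return False  # no PTR record or resolver failure

    if not is_google_hostname(hostname):  # step 2: domain suffix check
        return False

    try:
        addresses = forward_lookup(hostname)  # step 3: forward DNS lookup
    except OSError:
        return False

    # The request passes only if the forward lookup returns the original IP.
    return ip in addresses
```

Calling verify_googlebot_ip('66.249.90.77') with the defaults performs real DNS queries, so the result depends on the resolver and on Google's current DNS records. Note that socket.gethostbyname_ex returns IPv4 addresses only; an IPv6 request IP would need a different forward lookup.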


Method 2: User-Agent (https://support.google.com/webmasters/answer/1061943):
The following is Google's list of User-Agents:
Crawler: APIs-Google
  User agent token (product token): APIs-Google
  Full user agent string: APIs-Google (+https://developers.google.com/webmasters/APIs-Google.html)

Crawler: AdSense
  User agent token: Mediapartners-Google
  Full user agent string: Mediapartners-Google

Crawler: AdsBot Mobile Web Android (Checks Android web page ad quality)
  User agent token: AdsBot-Google-Mobile
  Full user agent string: Mozilla/5.0 (Linux; Android 5.0; SM-G920A) AppleWebKit (KHTML, like Gecko) Chrome Mobile Safari (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)

Crawler: AdsBot Mobile Web (Checks iPhone web page ad quality)
  User agent token: AdsBot-Google-Mobile
  Full user agent string: Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)

Crawler: AdsBot (Checks desktop web page ad quality)
  User agent token: AdsBot-Google
  Full user agent string: AdsBot-Google (+http://www.google.com/adsbot.html)

Crawler: Googlebot Images
  User agent token: Googlebot-Image or Googlebot
  Full user agent string: Googlebot-Image/1.0

Crawler: Googlebot News
  User agent token: Googlebot-News or Googlebot
  Full user agent string: Googlebot-News

Crawler: Googlebot Video
  User agent token: Googlebot-Video or Googlebot
  Full user agent string: Googlebot-Video/1.0

Crawler: Googlebot (Desktop)
  User agent token: Googlebot
  Full user agent strings:
    Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
    Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Safari/537.36
    or (rarely used): Googlebot/2.1 (+http://www.google.com/bot.html)

Crawler: Googlebot (Smartphone)
  User agent token: Googlebot
  Full user agent string: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Crawler: Mobile AdSense
  User agent token: Mediapartners-Google
  Full user agent string: (Various mobile device types) (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html)

Crawler: Mobile Apps Android (Checks Android app page ad quality. Obeys AdsBot-Google robots rules.)
  User agent token: AdsBot-Google-Mobile-Apps
  Full user agent string: AdsBot-Google-Mobile-Apps

Crawler: Feedfetcher (NOTE: Feedfetcher does not respect robots.txt rules.)
  User agent token: FeedFetcher-Google
  Full user agent string: FeedFetcher-Google; (+http://www.google.com/feedfetcher
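
A User-Agent check against the tokens listed above could look roughly like the following Python sketch. The function name match_google_token and the case-insensitive substring match are assumptions made for illustration, not something Google's documentation prescribes. Because any client can send an arbitrary User-Agent header, a match here is only a hint and is best combined with the DNS check from Method 1.

```python
# Product tokens copied from the list above. Longer tokens come first so the
# most specific one wins (e.g. "Googlebot-Image" before "Googlebot").
GOOGLE_UA_TOKENS = (
    "APIs-Google",
    "Mediapartners-Google",
    "AdsBot-Google-Mobile-Apps",
    "AdsBot-Google-Mobile",
    "AdsBot-Google",
    "Googlebot-Image",
    "Googlebot-News",
    "Googlebot-Video",
    "Googlebot",
    "FeedFetcher-Google",
)


def match_google_token(user_agent):
    """Return the first Google product token found in user_agent, or None."""
    ua = user_agent.lower()
    for token in GOOGLE_UA_TOKENS:
        # Case-insensitive substring match (an assumption of this sketch).
        if token.lower() in ua:
            return token
    return None
```

For example, match_google_token('Googlebot-Image/1.0') returns 'Googlebot-Image', while an ordinary browser User-Agent returns None.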
