Exclude Bot Hits from Counterize


Please jump to the updated version here:
Exclude Bot Hits from Counterize v2.0

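Another thing worth changing in Counterize is that it counts visits from bots (1). I just saw someone on the author's message board say that having to keep adding the IPs of different bots to the exclusion list is quite a hassle, but there is actually a shortcut. Most bots have the string "bot" somewhere in their user-agent, so we can check the user-agent to leave bots out of the count.
(See the full post for the English explanation.)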

To exclude bot hits from Counterize's counts, we can check whether the user-agent (2) contains the string "bot". Adding two lines to counterize.php is enough.

Around line 377:
$excludelist = get_option('counterize_excluded');
$checkval = strpos($excludelist, $remoteaddr);

Add these two lines after them:
if (stristr($useragent, 'bot') !== FALSE)
    $checkval = TRUE;
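
As a rough standalone illustration of the same check, outside of counterize.php: the function name is_bot_user_agent and the sample user-agent strings below are made up for this sketch and are not part of Counterize. It assumes PHP 7 or later because of the type declarations.

```php
<?php
// Hypothetical helper (not part of Counterize): returns true when the
// user-agent contains "bot", ignoring case. stristr() returns false
// when the needle is not found, so compare strictly against false.
function is_bot_user_agent(string $useragent): bool
{
    return stristr($useragent, 'bot') !== false;
}

// Sample user-agent strings, for illustration only.
$samples = [
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'msnbot/1.0 (+http://search.msn.com/msnbot.htm)',
    'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
];

foreach ($samples as $ua) {
    // Print a label followed by the user-agent that was tested.
    echo (is_bot_user_agent($ua) ? 'bot:     ' : 'visitor: ') . $ua . PHP_EOL;
}
```

The first two samples contain "bot" and are flagged; the last one is an ordinary browser string and is not.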

This excludes most bot hits. However, Yahoo's bots use a Mozilla 4.x-style user-agent that the "bot" check does not catch, so we need to add these Yahoo! Slurp IPs to the exclusion list:

Yahoo! Slurp IPs (from Search Engine Genie):
66.196.65.38
66.196.73.96
66.196.90.100
66.196.90.178
66.196.72.91
66.196.90.82
66.196.90.216
66.196.90.215
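
One thing to watch with an IP list like this: a plain strpos() substring test on the list would also match a shorter address such as 66.196.90.10, because it is a prefix of 66.196.90.100. A minimal sketch of an exact-match alternative follows. The helper name is_excluded_ip is hypothetical, and the code assumes the exclusion list is stored as one string with addresses separated by commas and/or whitespace, which may not be how Counterize actually stores it.

```php
<?php
// Hypothetical helper (not part of Counterize): exact-match an address
// against an exclusion list given as a single string.
// Assumption: entries are separated by commas and/or whitespace.
function is_excluded_ip(string $excludelist, string $remoteaddr): bool
{
    // Split on runs of commas/whitespace and drop empty entries.
    $ips = preg_split('/[\s,]+/', $excludelist, -1, PREG_SPLIT_NO_EMPTY);
    // Strict comparison so only an identical string counts as a match.
    return in_array($remoteaddr, $ips, true);
}

// The Yahoo! Slurp addresses listed above.
$excludelist = '66.196.65.38, 66.196.73.96, 66.196.90.100, 66.196.90.178, '
             . '66.196.72.91, 66.196.90.82, 66.196.90.216, 66.196.90.215';

var_dump(is_excluded_ip($excludelist, '66.196.90.100')); // bool(true)
var_dump(is_excluded_ip($excludelist, '66.196.90.10'));  // bool(false)
```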

With both changes in place, Counterize should filter out most bot hits. If I find other bots that cannot be caught by checking the user-agent, I may post their IPs here. If anyone has a better way of filtering bot hits, please share it 😀

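Once my exams are over, I will try keeping a separate record for bot hits. That takes more work, so I will not attempt it over the next few days.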

 

(1): A bot is a program that reads web pages automatically; Yahoo! and Google, for example, both use such programs to collect web page data.
(2): The user-agent indicates what software the client is using.

