
Baiduspider not blocked??


Dirty Butter


It is 0.02 minutes. We can tweak the query in statistics.index.inc.php, near line 310:

From:
$filter = '(S.session_last > S.session_start) AND ';

To:
$filter = '(S.session_last - S.session_start > 3) AND ';

This counts only someone who has been online for more than 3 seconds (0.05 minutes).
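To make the intent of that comparison concrete, here is a small sketch of the same rule in plain PHP rather than SQL. It assumes session_start and session_last are Unix timestamps in whole seconds, which nothing in this thread confirms; the function name and the sample rows are hypothetical, and it needs PHP 7 or later.

```php
<?php
// Hypothetical helper: keep only sessions that lasted longer than $min_seconds.
// Assumes both fields are Unix timestamps in seconds (an assumption, not a
// documented fact about the sessions table).
function sessions_longer_than(array $sessions, int $min_seconds): array
{
    return array_values(array_filter(
        $sessions,
        function (array $s) use ($min_seconds): bool {
            return ($s['session_last'] - $s['session_start']) > $min_seconds;
        }
    ));
}

// Made-up rows for illustration only.
$sessions = [
    ['id' => 'a', 'session_start' => 1000, 'session_last' => 1000], // single hit
    ['id' => 'b', 'session_start' => 1000, 'session_last' => 1002], // 2 seconds
    ['id' => 'c', 'session_start' => 1000, 'session_last' => 1090], // 90 seconds
];

$kept = sessions_longer_than($sessions, 3);
echo implode(',', array_column($kept, 'id')), "\n"; // prints "c"
```

If the columns hold something other than seconds, the threshold of 3 would need to be scaled to match.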

Or, as was probably mentioned earlier, make the array of bot signatures 'public' and check whether any signature appears in the session's useragent string. In the User class, line 34:

From:
protected $_bot_sigs =  array(

To:
public $_bot_sigs =  array(

Then, in statistics.index.inc.php, line 321

From:
foreach ($results as $user) {

To:
$user_bot_sigs = User::getInstance()->_bot_sigs;
foreach ($results as $user) {
  if (!empty($filter)) { // Scanning for bots
    foreach ($user_bot_sigs as $signature) {
      if (stripos($user['useragent'], $signature) !== false) {
        continue 2;
      }
    }
  }
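For anyone unsure how the continue 2 behaves, here is a standalone sketch of the same loop that can be run outside the store. The signature list and the two result rows are made-up stand-ins, not the real contents of $_bot_sigs or the sessions table, and the $filter check is left out to keep it short.

```php
<?php
// Stand-ins for the store's data; both arrays are illustrative only.
$user_bot_sigs = ['baidu', 'googlebot'];
$results = [
    ['customer' => 'Guest 1', 'useragent' => 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'],
    ['customer' => 'Guest 2', 'useragent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0'],
];

$visible = [];
foreach ($results as $user) {
    foreach ($user_bot_sigs as $signature) {
        if (stripos($user['useragent'], $signature) !== false) {
            // "continue 2" jumps to the next iteration of the OUTER loop,
            // so nothing below runs for this row.
            continue 2;
        }
    }
    // Reached only when no signature matched the useragent.
    $visible[] = $user['customer'];
}

print_r($visible); // only "Guest 2" remains
```

A plain continue would only move on to the next signature, and the bot row would still be processed afterwards.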

 


Thank you.

The method shown here uses some code I had commented out from the previous help you had given about bots on another post. After doing a little research, I find there are a HUGE number of variations of the word baidu for different Chinese bots. It looks like it would be simpler in the long run to take the previous fix out and use just the time limit method explained here. I'll be back.


I took it back to stock statistics.index.inc.php and no results at all show with your suggested change:

$filter = '(S.session_last - S.session_start > 3) AND ';

I experimented with a few variations. Using 0.05 and 0.9 shows the bot entries, and using 1 again shows no results at all.


"I find there are a HUGE number of variations of the word baidu for different Chinese bots."

Please give us a dozen examples.

This test:

if (stripos($user['useragent'], $signature) !== false)

is looking for the sequence of characters 'baidu' anywhere in the useragent string. So, regardless of the variations, if 'baidu' is in the string, that session record will be knocked out.
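A quick way to check this for yourself is a sketch like the following. The wrapper function is hypothetical, and the useragent strings are sample values written for illustration, not entries taken from anyone's session table.

```php
<?php
// Hypothetical wrapper around the same stripos() test quoted above.
function has_signature(string $useragent, string $signature): bool
{
    // stripos() is case-insensitive and returns false only when there is no
    // match; a match at position 0 returns int 0, hence the strict !== false.
    return stripos($useragent, $signature) !== false;
}

// Sample strings for illustration only.
$agents = [
    'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)',
    'Baiduspider-image+(+http://www.baidu.com/search/spider.htm)',
    'BAIDUSPIDER',
    'Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0',
];

foreach ($agents as $ua) {
    echo (has_signature($ua, 'baidu') ? 'bot:  ' : 'keep: '), $ua, "\n";
}
```

The first three lines come out as bot and only the Firefox line is kept, even though the three bot strings differ in case and in what surrounds 'baidu'.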
