Resolved - Eliminate Bots data in Users Online?


Dirty Butter

You could try:

* Filter out short sessions in the query - /admin/sources/statistics.index.inc.php, line 319:

$query = sprintf("SELECT * FROM %1\$sCubeCart_sessions AS S LEFT JOIN %1\$sCubeCart_customer AS C ON S.customer_id = C.customer_id WHERE (S.session_last <> S.session_start) AND S.session_last>".$timeLimit." ORDER BY S.session_last DESC", $glob['dbprefix']);


  • 1 year later...

I'd like to use this tweak too, but I cannot find the exact wording listed in this thread. Maybe updates have changed the script in the time since the last message?

The script I'm seeing in that section of the index.inc.php page is as follows. What should I change to get rid of these bot stats?

 

// Customers Online
$timeLimit = time()-1800;  // 30 minutes
$query = sprintf("SELECT S.*, C.first_name, C.last_name FROM %1\$sCubeCart_sessions AS S LEFT JOIN %1\$sCubeCart_customer AS C ON S.customer_id = C.customer_id WHERE S.session_last>".$timeLimit." ORDER BY S.session_last DESC", $glob['dbprefix']);
if (($results = $GLOBALS['db']->query($query)) !== false) {
    $GLOBALS['main']->addTabControl($lang['statistics']['title_customers_active'], 'stats_online', false, false, count($results));
    foreach ($results as $user) {
        $user['is_admin'] = ((int)$user['admin_id'] > 0) ? 1 : 0;
        $user['name'] = ((int)$user['customer_id'] != 0) ? sprintf('%s %s', $user['first_name'], $user['last_name']) : $lang['common']['guest'];
        $user['session_length'] = sprintf('%.2f', ($user['session_last']-$user['session_start'])/60);
        $user['session_start'] = formatTime($user['session_start']);
        $user['session_last'] = formatTime($user['session_last']);
        $smarty_data['users_online'][] = $user;
    }
    $GLOBALS['smarty']->assign('USERS_ONLINE',$smarty_data['users_online']);
}
$page_content = $GLOBALS['smarty']->fetch('templates/statistics.index.php');

Let's try modifying this statement:

$query = sprintf("SELECT S.*, C.first_name, C.last_name FROM %1\$sCubeCart_sessions AS S LEFT JOIN %1\$sCubeCart_customer AS C ON S.customer_id = C.customer_id WHERE S.session_last>".$timeLimit." ORDER BY S.session_last DESC", $glob['dbprefix']);

to (change just the WHERE part):

WHERE (S.session_last <> S.session_start) AND S.session_last>
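
For reference, here is a rough sketch of what the whole statement looks like once the WHERE part is changed. The function name buildOnlineQuery and the 'cc_' prefix are made up for illustration and are not part of CubeCart. The format string is single-quoted because inside a double-quoted PHP string the "$s" in "%1$s" would be read as a variable unless the dollar sign is escaped.

```php
<?php
// Hypothetical helper, not part of CubeCart: builds the "customers online" query.
function buildOnlineQuery(string $prefix, int $timeLimit, bool $skipZeroLength): string
{
    // A session whose last activity equals its start time never got past the first hit.
    $filter = $skipZeroLength ? '(S.session_last <> S.session_start) AND ' : '';

    // %1$s reuses the first argument (the table prefix) for both table names.
    return sprintf(
        'SELECT S.*, C.first_name, C.last_name'
        . ' FROM %1$sCubeCart_sessions AS S'
        . ' LEFT JOIN %1$sCubeCart_customer AS C ON S.customer_id = C.customer_id'
        . ' WHERE ' . $filter . 'S.session_last > %2$d'
        . ' ORDER BY S.session_last DESC',
        $prefix,
        $timeLimit
    );
}

// 'cc_' is an assumed prefix; the real one comes from $glob['dbprefix'].
echo buildOnlineQuery('cc_', time() - 1800, true), PHP_EOL;
```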

  • 1 year later...

In /classes/user.class.php, there is an array of bot signatures, most of them a single keyword that uniquely identifies a bot.

We would need to know the unique word that identifies this particular googlebot.

You may need to examine the CubeCart_sessions table directly to learn what the 'useragent' phrase is, and what word we may be able to use to distinctly identify it.
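
If you would rather not browse the table row by row, a grouped query gives a quick overview. This is only a sketch: the helper name buildUserAgentReport and the 'cc_' prefix are assumptions, and the resulting SQL is meant to be pasted into whatever database tool you use (phpMyAdmin, for instance).

```php
<?php
// Hypothetical helper: builds a query that lists each distinct user agent
// together with how many session rows carry it, most frequent first.
function buildUserAgentReport(string $prefix): string
{
    return sprintf(
        'SELECT useragent, COUNT(*) AS sessions FROM %sCubeCart_sessions'
        . ' GROUP BY useragent ORDER BY sessions DESC',
        $prefix
    );
}

// 'cc_' is an assumed prefix; use your store's own table prefix.
echo buildUserAgentReport('cc_'), PHP_EOL;
```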


This conversation started by assuming that any session lasting longer than zero seconds (session_last later than session_start) is not a BOT.

In later versions of CubeCart, a "filter" was put in place that added that condition to the database query. There is a link to refetch the list to show only records that satisfy this condition.

We can try modifying the code that fetches and displays the list of customers to use the list of BOT signatures.

While we make the edit, let's flip the default: zero-length sessions (probably BOTs) stay hidden unless we explicitly ask for them.

In /admin/sources/statistics.index.inc.php, near the bottom:

Was:
if (isset($_GET['bots']) && $_GET['bots']=='false') {
    $filter = '(S.session_last > S.session_start) AND ';
    $GLOBALS['smarty']->assign('BOTS', false);
} else {
    $filter = '';
    $GLOBALS['smarty']->assign('BOTS', true);
}
Now:
if (isset($_GET['bots']) && $_GET['bots']=='true') {
    $filter = '';
    $GLOBALS['smarty']->assign('BOTS', true);
} else {
    $filter = '(S.session_last > S.session_start) AND ';
    $GLOBALS['smarty']->assign('BOTS', false);
}
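
To see what the flipped default does without loading the admin page, the "Now" logic can be wrapped in a small function. This is just a sketch: chooseBotFilter is a made-up name, and the $get array stands in for $_GET.

```php
<?php
// Hypothetical wrapper around the "Now" logic; $get stands in for $_GET.
function chooseBotFilter(array $get): array
{
    // Bots are shown only when the URL carries bots=true.
    $showBots = isset($get['bots']) && $get['bots'] == 'true';

    return array(
        'filter' => $showBots ? '' : '(S.session_last > S.session_start) AND ',
        'bots'   => $showBots,
    );
}

var_dump(chooseBotFilter(array()));                 // no parameter: zero-length sessions hidden
var_dump(chooseBotFilter(array('bots' => 'true'))); // explicit request: everything shown
```

With the "Was" version, the same call with an empty array would have returned an empty filter, which is why zero-length sessions were showing by default.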

In order to use the protected property User->_bot_sigs, we would need to "extend" the User class with our own little class.


Extending the User class is not going to happen (based on what I know about classes). So, we will just need to copy over the bot sigs.

Was:
$query  = sprintf("SELECT S.*, C.first_name, C.last_name FROM %1\$sCubeCart_sessions AS S LEFT JOIN %1\$sCubeCart_customer AS C ON S.customer_id = C.customer_id WHERE ".$filter."S.session_last>".$timeLimit." ORDER BY S.session_last DESC", $glob['dbprefix']);
if (($results = $GLOBALS['db']->query($query)) !== false) {
    $GLOBALS['main']->addTabControl($lang['statistics']['title_customers_active'], 'stats_online', false, false, count($results));
    $smarty_data['users_online'] = array();
    foreach ($results as $user) {
        $user['is_admin']  = ((int)$user['admin_id'] > 0) ? 1 : 0;
Now:
$thisBotStringArray = array('alexa','appie','archiver','ask jeeves','baiduspider','bot','crawl','crawler','curl','eventbox','facebookexternal','fast',
'firefly','froogle','gigabot','girafabot','google','googlebot','infoseek','inktomi','java','larbin','looksmart','mechanize','monitor','msnbot','nambu',
'nationaldirectory','novarra','pear','perl','python','rabaz','radian','rankivabot','scooter','slurp','sogou web spider','spade','sphere','spider','technoratisnoop',
'tecnoseek','teoma','toolbar','transcoder','twitt','url_spider_sql','webalta','webbug','webfindbot','wordpress','www.galaxy.com','yahoo','yandex','zyborg',);
$query  = sprintf("SELECT S.*, C.first_name, C.last_name FROM %1\$sCubeCart_sessions AS S LEFT JOIN %1\$sCubeCart_customer AS C ON S.customer_id = C.customer_id WHERE ".$filter."S.session_last>".$timeLimit." ORDER BY S.session_last DESC", $glob['dbprefix']);
if (($results = $GLOBALS['db']->query($query)) !== false) {
    $GLOBALS['main']->addTabControl($lang['statistics']['title_customers_active'], 'stats_online', false, false, count($results));
    $smarty_data['users_online'] = array();
    foreach ($results as $user) {
        if (in_array($user['useragent'], $thisBotStringArray)) continue;
        $user['is_admin']  = ((int)$user['admin_id'] > 0) ? 1 : 0;

 


This approach (leaving certain session records out) initially relied only on the fact that, for these visitors, the session length is zero. The IP address and user agent string were not used.

The edit above now adds the examination of the user agent string. The IP address is still ignored.

As long as the user agent string, from whoever or wherever, contains any of those trigger words, it will be dumped.


I tried your fix on 6.0.4, but it did not get rid of these entries. And the IP's are different today than yesterday. They still show as Mountain View locations. Should I have also made the first edit you mentioned in https://forums.cubecart.com/topic/46651-resolved-eliminate-bots-data-in-users-online/?do=findComment&comment=207969 ?


"it did not get rid of these entries."

Then the useragent string must not contain any of those words in $thisBotStringArray. Look at the CubeCart_sessions table directly and reply back with those useragent strings.

"And the IP's are different"

Again, not interested in the IP addresses.

"Should I have also made the first edit?"

Yes, please. This also rejects all zero-time sessions which are a good indicator of a robot visitor.


Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

I did add the previous edit, but it did not help, either.

 

 


Instead of this part of the new code:

    foreach ($results as $user) {
        if (in_array($user['useragent'], $thisBotStringArray)) continue;
        $user['is_admin']  = ((int)$user['admin_id'] > 0) ? 1 : 0;
Use this:
    foreach ($results as $user) {
        $user_agentstring = $user['useragent'];
        $foundBots = array_filter($thisBotStringArray, function($str) use ($user_agentstring) { return (false !== strpos($user_agentstring, $str)); });
        if (!empty($foundBots) && !empty($filter)) continue;
        $user['is_admin']  = ((int)$user['admin_id'] > 0) ? 1 : 0;

Let me test this on my system just to make sure there are no syntax errors.
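
To show why the in_array() check let those entries through, here is a standalone sketch that runs both checks against the user agent strings posted above. The helper name looksLikeBot, the shortened signature list, and the last (ordinary Firefox) user agent are assumptions added for illustration. It uses stripos() rather than strpos(), so a signature like 'slurp' also matches 'Slurp'; that is a small variation on the edit, not the edit itself.

```php
<?php
// Shortened signature list for illustration; the full list in the thread is longer.
$signatures = array('bot', 'crawl', 'google', 'slurp', 'spider', 'yahoo', 'yandex');

// Hypothetical helper: true when any signature occurs anywhere in the user agent.
function looksLikeBot(string $userAgent, array $signatures): bool
{
    foreach ($signatures as $signature) {
        // stripos() ignores case, so 'slurp' also matches 'Slurp'.
        if (stripos($userAgent, $signature) !== false) {
            return true;
        }
    }
    return false;
}

$agents = array(
    'Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)',
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)',
    'Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)',
    // Assumed example of an ordinary browser, not taken from the thread.
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0',
);

foreach ($agents as $agent) {
    // in_array() compares whole strings, so a full user agent never equals 'bot'.
    printf(
        "in_array: %s  substring: %s  %s\n",
        var_export(in_array($agent, $signatures), true),
        var_export(looksLikeBot($agent, $signatures), true),
        $agent
    );
}
```

in_array() only succeeds when the whole user agent string is identical to one of the signatures, which never happens with real user agents. The substring test is what actually finds the trigger words.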


That was tricky.

Make these edits as well:

In the same area of the file where we have been making edits, find and delete:

$GLOBALS['main']->addTabControl($lang['statistics']['title_customers_active'], 'stats_online', false, false, count($results));
Then add the line marked NEW just above this existing line:
/* NEW */ $GLOBALS['main']->addTabControl($lang['statistics']['title_customers_active'], 'stats_online', false, false, count($smarty_data['users_online']));
    $GLOBALS['smarty']->assign('USERS_ONLINE', $smarty_data['users_online']);

I've also made some edits in the post above, so make a fresh copy of those statements.
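
The reason for moving addTabControl: the number on the tab was taken from count($results), which is everything the query fetched, while the list itself now skips some of those rows. A rough sketch of the difference, using made-up rows and a single 'bot' check in place of the full signature list:

```php
<?php
// Made-up rows standing in for $results; only the 'useragent' key matters here.
$results = array(
    array('useragent' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'),
    array('useragent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0'),
);

$usersOnline = array();
foreach ($results as $user) {
    if (strpos($user['useragent'], 'bot') !== false) {
        continue; // skipped rows never reach the display list
    }
    $usersOnline[] = $user;
}

// Counting after the loop keeps the tab number in step with the rows shown.
printf("fetched: %d, displayed: %d\n", count($results), count($usersOnline));
// prints "fetched: 2, displayed: 1"
```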

 
