Jump to content
sean1688

fuzzy search

Recommended Posts

can we enable fuzzy search on cubcart?

 

for example, on our website http://www.volteq.com, you can search for "hy3005d", and it works as it's a product code.

 

if you search for "3005D", it says nothing is found.

 

is there anything that we can do to make the search like "3005D" yield the same results include HY3005D?

Share this post


Link to post
Share on other sites

No and yes.

 

No, MySQL is not built for searching. There are two methods MySQL employs and CubeCart tries one first, then the other. Neither is 'fuzzy'-looking.

 

Yes, because CubeCart is (now) 100% human-readable and extensible, a plugin could be written to implement a search algorithm comparable to what Google can do, or the plugin can use Goggle itself to return it's results from having scanned your site's pages.

 

I've started to play around with Sphinx, a "man in the middle" database that scans your CubeCart database and creates a comprehensive collection of searchable stuff that is 'fuzzy'-looking. But this requires your own server.

Share this post


Link to post
Share on other sites

You can try switching the order of the two methods CubeCart uses to send queries to MySQL. In /classes/catalogue.class.php, find the searchCatalogue() function. (Line numbers have changed in CC600b6.)

 

Change the function argument list from $search_mode = 'fulltext' to $search_mode = 'like'.
 

There are some statements about 190 lines later than could be edited to have CubeCart switch back to fulltext if like didn't find anything. We won't worry about that for now.

Share this post


Link to post
Share on other sites

thanks for your reply. I tried to change the search mode per your suggestion, but nothing seems to have changed.

 

maybe I should rephrase the request. can we implement the search so that a match of subset will be presented?

 

in other words, the product code is HY3005D, but if we search for a subset of that, say 3005D, the result of HY3005D search should be presented instead of null?

Share this post


Link to post
Share on other sites

The 'like' method of searching uses the standard SQL expression:

RLIKE [[:<:]]3005D[[:>:]]

 

The [[:<:]] is a character class that means a word boundary and RLIKE is short for Regular Expression.

 

So, this is searching on whole words.

 

To get a simpler search (as you describe), make this next edit:

Was:
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if (($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR')) {
                                $regexp .= '[[:<:]]'.$searchArray[$i].'[[:>:]].*';
                            }
                        }
                        $regexp = substr($regexp, 0, strlen($regexp)-2);
                        $rlike = " AND (I.name RLIKE '".$regexp."' OR I.description RLIKE '".$regexp."' OR I.product_code RLIKE '".$regexp."')";
 
Now:
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if (($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR')) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp .= '%'.$searchArray[$i].'%';
                            }
                        }
                        // $regexp = substr($regexp, 0, strlen($regexp)-4);
                        $rlike = " AND (I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."')";

I recall testing this many moons ago, but I cannot be certain I have it all correct.

 

In admin, Store Settings, Advanced, enable Debug mode and enter your IP address (www.whatis myip.com).

 

Perform a search. In the queries section of debug, look for the query that has your search terms in it. We may need to tweak the code.

Share this post


Link to post
Share on other sites

This code works better, but still not perfect:

                        $rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                            } else { $rlike .= " ".$ucSearchTerm." "; continue; }
                            $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."') AND ";
                        }
                        $rlike = substr($rlike, 0, strlen($rlike)-5); $rlike .= " )";

Please realize that there are several syntaxes for search patterns:

Every word is a term that is AND'd (<<== what the above code does)

Every word is a term that is OR'd

Words can be force AND'd or OR'd (<<== what the above code does)

AND and OR are the combining symbols, but others exist such as && and || and NOT

 

A few statements prior, any commas and spaces are used to separate the words and then removed.

 

Tesa31 or Tesa32, returns two products.

Tesa31, Test doesn't work.

 

More later.

Share this post


Link to post
Share on other sites

I like this one:

                        $rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                                $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."')";
                            }
                            else {
                              if(strlen($ucSearchTerm) > 0){ $rlike .= " ".$ucSearchTerm." "; }
                              else { $rlike .= " AND "; }
                              continue;
                            }
                        }
                        $rlike .= " )";
 

Share this post


Link to post
Share on other sites

It doesn't seem to be working for me. I have tried all of the different codes you suggested, but when I search for freeze instead of freezer nothing shows up.

Share this post


Link to post
Share on other sites

Yes it is possible because I have it working on one of my websites. Unfortunately I can't remember how I did it but I will try and work it out and let you know.

Did this on a v3 store and it works for part word and multiple word.

http://cubecart.expandingbrain.com/free-cubecart-3-mods/simple-search-improvement-cc3/prod_160.html

Haven't tried it on a v6 store but assume it is possible. 

Edited by ayz1
Update

Share this post


Link to post
Share on other sites

Yes it is possible because I have it working on one of my websites. Unfortunately I can't remember how I did it but I will try and work it out and let you know.

Did this on a v3 store and it works for part word and multiple word.

http://cubecart.expandingbrain.com/free-cubecart-3-mods/simple-search-improvement-cc3/prod_160.html

Haven't tried it on a v6 store but assume it is possible. 

Any idea which file I'd use?

Share this post


Link to post
Share on other sites

It's beyond me unfortunatley. However if it's possible in V3 then I assume it's possible on V6. Tought the link might give someone a clue how to do it. Tried the code from bsmither above but couldn't get it to work. Probably something I am doing wrong rather than a problem with the code. Will have another go at it.

Share this post


Link to post
Share on other sites

I like this one:

                        $rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                                $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."')";
                            }
                            else {
                              if(strlen($ucSearchTerm) > 0){ $rlike .= " ".$ucSearchTerm." "; }
                              else { $rlike .= " AND "; }
                              continue;
                            }
                        }
                        $rlike .= " )";

Tried this again and it worked. Seems to do exactly what is required. Thanks bsmither.

Share this post


Link to post
Share on other sites

I like this one:

                        $rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                                $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."')";
                            }
                            else {
                              if(strlen($ucSearchTerm) > 0){ $rlike .= " ".$ucSearchTerm." "; }
                              else { $rlike .= " AND "; }
                              continue;
                            }
                        }
                        $rlike .= " )";

Tried this again and it worked. Seems to do exactly what is required. Thanks bsmither.

I must be doing something wrong because that doesn't work for me :/

[Notice] /home/content/88/5118188/html/ltcube/classes/catalogue.class.php:1530 - Undefined variable: order
[Notice] /home/content/88/5118188/html/ltcube/classes/catalogue.class.php:1598 - Undefined variable: noKeys

I am getting these errors, not sure if that helps.

Share this post


Link to post
Share on other sites

Change the function argument list from $search_mode = 'fulltext' to $search_mode = 'like'.

Where bsmither said to make the main code change you should have this (replace what is already there)

 

$rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                            } else { $rlike .= " ".$ucSearchTerm." "; continue; }
                            $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."') AND ";
                        }
                        $rlike = substr($rlike, 0, strlen($rlike)-5); $rlike .= " )";

                    }

 

between the lines

                        $noKeys = count($searchArray);
                        $regexp = '';

 

AND

 

$q2 = "SELECT I.* FROM ".$GLOBALS['config']->get('config', 'dbprefix')."CubeCart_inventory AS I LEFT JOIN (SELECT product_id, MAX(price) as price

Share this post


Link to post
Share on other sites

With this as my code nothing shows up on the page- it's totally broken, but there aren't any errors in the debug.

 

$noKeys = count($searchArray);
                        $regexp = '';
        
    $rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                            } else { $rlike .= " ".$ucSearchTerm." "; continue; }
                            $rlike .= "(I.name LIKE '".$regexp."' OR I.seo_meta_keywords LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."') AND ";
                        }
                        $rlike = substr($rlike, 0, strlen($rlike)-5); $rlike .= " )";

                    }

$q2 = "SELECT I.* FROM ".$GLOBALS['config']->get('config', 'dbprefix')."CubeCart_inventory AS I LEFT JOIN (SELECT product_id, MAX(price) as price

Share this post


Link to post
Share on other sites

I've tried it on on V6 and just looked at your store and you are on V5 so maybe that's why. Or are you on V6 with an old skin? I can email you the file if you wish.

Edited by ayz1

Share this post


Link to post
Share on other sites

I am using V6 with an old skin apparently lol. Thanks for the code. I put that in instead of what I had before and it works perfect! This has been driving me insane, but now it's fixed. Thank you so much for all of your help!!

Share this post


Link to post
Share on other sites

I've just verified the following works on CC606. There will be four edits:

Find near line 1590:
    $rlike = '';
    if (!empty($search_data['keywords'])) {
        $searchwords = preg_split( '/[\s,]+/', $GLOBALS['db']->sqlSafe($search_data['keywords']));
        foreach ($searchwords as $word) {
            $searchArray[] = $word;
        }

        $noKeys = count($searchArray);
        $regexp = '';
        for ($i=0; $i<$noKeys; ++$i) {
            $ucSearchTerm = strtoupper($searchArray[$i]);
            if (($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR')) {
                $regexp .= '[[:<:]]'.$searchArray[$i].'[[:>:]].*';
            }
        }
        $regexp = substr($regexp, 0, strlen($regexp)-2);
        $rlike = " AND (I.name RLIKE '".$regexp."' OR I.description RLIKE '".$regexp."' OR I.product_code RLIKE '".$regexp."')";
    }

Change to:
/*
    $rlike = '';
    if (!empty($search_data['keywords'])) {
        $searchwords = preg_split( '/[\s,]+/', $GLOBALS['db']->sqlSafe($search_data['keywords']));
        foreach ($searchwords as $word) {
            $searchArray[] = $word;
        }

        $noKeys = count($searchArray);
        $regexp = '';
        for ($i=0; $i<$noKeys; ++$i) {
            $ucSearchTerm = strtoupper($searchArray[$i]);
            if (($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR')) {
                $regexp .= '[[:<:]]'.$searchArray[$i].'[[:>:]].*';
            }
        }
        $regexp = substr($regexp, 0, strlen($regexp)-2);
        $rlike = " AND (I.name RLIKE '".$regexp."' OR I.description RLIKE '".$regexp."' OR I.product_code RLIKE '".$regexp."')";
    }
*/
    if (!empty($search_data['keywords'])) {
        $rlike = ' AND ( ';
        $searchwords = preg_split( '/[\s,]+/', $GLOBALS['db']->sqlSafe($search_data['keywords']));
        foreach ($searchwords as $word) {
            $searchArray[] = $word;
        }

        $noKeys = count($searchArray);
        for ($i=0; $i<$noKeys; ++$i) {
            $ucSearchTerm = strtoupper($searchArray[$i]);
            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                $searchArray[$i] = str_replace(array('%','_'), array('\%','\_'), $searchArray[$i]);
                $regexp = '%'.$searchArray[$i].'%';
                $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."')";
            } else {
                if(strlen($ucSearchTerm) > 0){ $rlike .= " ".$ucSearchTerm." "; }
                else { $rlike .= " AND "; }
                continue;
            }
        }
        $rlike .= " )";
    }

Just after that:

Find:
    if (($search = $GLOBALS['db']->query($query)) !== false) {
        $count = $GLOBALS['db']->query($q2, false, 0);
        $this->_category_count  = (int)count($count);
        $this->_category_products = $search;
        return true;
    }

Change to:
    if (($search = $GLOBALS['db']->query($query)) !== false) {
        $count = $GLOBALS['db']->query($q2, false, 0);
        $this->_category_count  = (int)count($count);
        $this->_category_products = $search;
        return true;
/**/} elseif ($search_mode == 'like') {
/**/	return $this->searchCatalogue($original_search_data, 1, $per_page, 'fulltext');
    }

Just above the first edit:

Find:
	if (($search = $GLOBALS['db']->query($query)) !== false) {
		$q2 = sprintf("SELECT COUNT(I.product_id) as count, %2\$s AS Relevance FROM %1\$sCubeCart_inventory AS I LEFT JOIN (SELECT product_id, MAX(price) as price, MAX(sale_price) as sale_price FROM %1\$sCubeCart_pricing_group $group_id GROUP BY product_id) as G ON G.product_id = I.product_id $joinString WHERE I.product_id IN (SELECT product_id FROM `%1\$sCubeCart_category_index` as CI INNER JOIN %1\$sCubeCart_category as C where CI.cat_id = C.cat_id AND C.status = 1) AND I.status = 1 AND (%2\$s) >= %4\$s %3\$s GROUP BY I.product_id %5\$s", $GLOBALS['config']->get('config', 'dbprefix'), $match, $whereString, $match_val, $order_string);
		$count = $GLOBALS['db']->query($q2, false, 0);
		$this->_category_count  = (int)count($count);
		$this->_category_products = $search;
		return true;
	} elseif ($search_mode == 'fulltext') {
		return $this->searchCatalogue($original_search_data, 1, $per_page, 'like');
	}

Change to:
	if (($search = $GLOBALS['db']->query($query)) !== false) {
		$q2 = sprintf("SELECT COUNT(I.product_id) as count, %2\$s AS Relevance FROM %1\$sCubeCart_inventory AS I LEFT JOIN (SELECT product_id, MAX(price) as price, MAX(sale_price) as sale_price FROM %1\$sCubeCart_pricing_group $group_id GROUP BY product_id) as G ON G.product_id = I.product_id $joinString WHERE I.product_id IN (SELECT product_id FROM `%1\$sCubeCart_category_index` as CI INNER JOIN %1\$sCubeCart_category as C where CI.cat_id = C.cat_id AND C.status = 1) AND I.status = 1 AND (%2\$s) >= %4\$s %3\$s GROUP BY I.product_id %5\$s", $GLOBALS['config']->get('config', 'dbprefix'), $match, $whereString, $match_val, $order_string);
		$count = $GLOBALS['db']->query($q2, false, 0);
		$this->_category_count  = (int)count($count);
		$this->_category_products = $search;
		return true;
//	} elseif ($search_mode == 'fulltext') {
//		return $this->searchCatalogue($original_search_data, 1, $per_page, 'like');
	}

Finally, at the beginning of the function searchCatalogue:

Find:
public function searchCatalogue($search_data = null, $page = 1, $per_page = 10, $search_mode = 'fulltext') {

Change to:
public function searchCatalogue($search_data = null, $page = 1, $per_page = 10, $search_mode = 'like' /* 'fulltext' */) {

The first edit discards the use of word boundaries in a regex-style search pattern, and uses a simple character match -- an underscore (_) replaces any one character, a percent (%) replaces zero or more characters. This new search code uses only the percent at the beginning and end of the search terms. The terms to be searched may contain an underscore or percent, but will be searched on literally.

The next three edits switches the search function to first use a "Simple Pattern Match" in the SQL statements sent to the database, then use the "Full Text" search of the database if nothing comes up.

This part of the new code:

$rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."')";

can be expanded upon to include other product-based fields, including:
seo_meta_keywords
short_description

Edited by bsmither

Share this post


Link to post
Share on other sites
On 26/08/2015 at 6:15 PM, ayz1 said:

Tried this again and it worked. Seems to do exactly what is required. Thanks bsmither.

Ive just applied the short version of this to V6.10 and it searches no problem. Unfortunately the Sort By doesn't seem to work. It changes the URL but doesn't refresh the page. Anyone got the search and the sort working together?

Share this post


Link to post
Share on other sites

The URL changes? It seems that in order to change the URL, the browser must have fetched a page from that URL.

 

Share this post


Link to post
Share on other sites

If i do an initial search for two keywords red bus I get the desired results with the URL domainname.com/search.html?search[keywords]=red+bus&_a=category

If I change the Sort to by price descending the URL changes to

search.html?_a=category&search[keywords]=red bus&sort[price]=DESC

but the page doesn't reload.

So the initial search works but the sorting doesn't. I've used the following code edit.

On 26/08/2015 at 8:28 PM, ayz1 said:

Change the function argument list from $search_mode = 'fulltext' to $search_mode = 'like'.

Where bsmither said to make the main code change you should have this (replace what is already there)

 

$rlike = ' AND ( ';
                        for ($i=0; $i<$noKeys; ++$i) {
                            $ucSearchTerm = strtoupper($searchArray[$i]);
                            if ( ($ucSearchTerm != 'AND') && ($ucSearchTerm != 'OR') && (strlen($ucSearchTerm) > 0) ) {
                                $searchArray[$i] = str_replace(array('%','_'), array('%','_'), $searchArray[$i]);
                                $regexp = '%'.$searchArray[$i].'%';
                            } else { $rlike .= " ".$ucSearchTerm." "; continue; }
                            $rlike .= "(I.name LIKE '".$regexp."' OR I.description LIKE '".$regexp."' OR I.product_code LIKE '".$regexp."') AND ";
                        }
                        $rlike = substr($rlike, 0, strlen($rlike)-5); $rlike .= " )";

                    }

 

between the lines

                        $noKeys = count($searchArray);
                        $regexp = '';

 

AND

 

$q2 = "SELECT I.* FROM ".$GLOBALS['config']->get('config', 'dbprefix')."CubeCart_inventory AS I LEFT JOIN (SELECT product_id, MAX(price) as price

 

Share this post


Link to post
Share on other sites

Think I've solved it. Changed all the "fulltext" to "like" and it is now working.

Hmm looked again and It isn't working. Sorts OK doing that but not returning the correct information.

Share this post


Link to post
Share on other sites

"but the page doesn't reload."

Are you sure??? Maybe a new page came in but it looks identical to the page that was already there.

Share this post


Link to post
Share on other sites
7 hours ago, bsmither said:

"but the page doesn't reload."

Are you sure??? Maybe a new page came in but it looks identical to the page that was already there.

Sorry yes you are correct. The page is going through the reload process but it is laoding the page that was already there.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×