[Resolved] .htaccess rewrites


red sun

We have old URLs still turning up on Google for a site that was launched months ago. I'm trying to modify the .htaccess file to catch any URL that contains a specific string from the old site, for example "spectacle_shop" (e.g. https://spectacles.com/spectacle_shop/cat_544283-RayBan.html?_a=category&cat_id=544283).

<IfModule mod_rewrite.c>
  RewriteEngine On

  RewriteCond %{HTTPS} off    
  RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
  
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
  
  RewriteCond %{REQUEST_URI} spectacle_shop
  RewriteRule .* index.php [L,R=301]

</IfModule>

The existing CubeCart .htaccess code is taking precedence: the site always redirects back to the root URL followed by /.html, which results in a 403 error.

Please note I've changed the actual shop URL and string we're looking for as I don't want Google indexing them again through this post. The whole site is under an SSL cert so all URLs start with https://

Many thanks for any advice.

 

 


Ending up with /.html means CubeCart could not find a matching SEO Path in the database.

So, this may be an invalid variant that CubeCart cannot figure out:

https://spectacles.com/spectacle_shop/cat_544283-RayBan.html?_a=category&cat_id=544283

First 301 would change to:

https://spectacles.com/cat_544283-RayBan.html?_a=category&cat_id=544283

A CubeCart 4 URL is of the form:

cat_([0-9]+)(\.[a-z]{3,4})?(.*)$

Such as:

https://spectacles.com/friendly-text-of-category/cat_544283.html?followed_by_querystring

I think it is the -RayBan that is causing the fault. Can you determine how reliable this extra run of characters is? That is, if we code for it, will there be any false negatives?
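If the suffix does turn out to be consistent, one way to code for it might look like the sketch below. The "-name" pattern is only an assumption about how the old URLs were built, not something CubeCart defines. Note also that the pattern above ends in (.*)$, so it may already absorb a suffix like this; the sketch just makes the expectation explicit.

```apache
# Sketch only: allow an optional "-Name" suffix between the category ID and the extension.
# The "(-[a-z0-9_]+)?" group is an assumption about the old URL format.
RewriteCond %{QUERY_STRING} (.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule cat_([0-9]+)(-[a-z0-9_]+)?(\.[a-z]{3,4})?$ index.php?_a=category&cat_id=$1&%1 [NC]
```

With the example URL, $1 would be 544283 and the -RayBan part is simply discarded.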

Your .htaccess does not have the set of CC4 Rewrite statements. Please add the following:

  ######## START v4 SEO URL BACKWARD COMPATIBILITY ########
  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule cat_([0-9]+)(\.[a-z]{3,4})?(.*)$ index.php?_a=category&cat_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule prod_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule info_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=document&doc_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule tell_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule _saleItems(\.[a-z]+)?(\?.*)?$ index.php?_a=saleitems&%1 [NC,L]
  ######## END v4 SEO URL BACKWARD COMPATIBILITY ########

ABOVE this:

  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
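To make the placement concrete, here is a rough outline of the resulting rule order, pieced together from the snippets already posted in this thread. It is a sketch, not a tested file; the numbered comments are mine.

```apache
<IfModule mod_rewrite.c>
  RewriteEngine On

  # 1. Force HTTPS first (from the original post).
  RewriteCond %{HTTPS} off
  RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

  # 2. The v4 backward-compatibility block goes here
  #    (the cat_, prod_, info_, tell_ and _saleItems rules).

  # 3. CubeCart SEO catch-all last: it matches any .html request and
  #    ends with [L], so rules placed below it are skipped once it matches.
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
</IfModule>
```

The order matters because the catch-all claims every .html URL, so anything meant to handle old-style links has to run before it.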

 


They are not required for a domain that has never been seen before. If there are CC4-style URL links out there in the vastness of the Internet for your domain, one will need to either convince the owners of those web pages where the links are located to update those links (impractical), or rewrite the URLs that show up (easy).


The new site was built in CC5 and all content was added in from scratch so there aren't any CC4 URLs to worry about. The problem is that Google still brings up the occasional reference to the old site URLs for general product searches and as you spotted they're very similar to CC4. The CC4 rewrite rules were causing a "Category not found" error in CC5.

I've asked the client to delete any old Google Shopping entries and we're hoping that will stop them showing up in searches in future. We've checked that Google is indexing the site correctly and that sitemap.xml.gz is being picked up.

In the meantime I had to come up with a solution that didn't lead to 404s when customers click the old links, as we think they've been damaging the Google ranking pretty badly, hence the need for a rewrite/redirect based on a string we know existed in all the old site URLs.


The rewrite code below strips out the query string and redirects any link that contains "spectacle_shop". Looks like we may now be going for the more painful task of redirecting each old product URL to the new product URL for several hundred products. Fine to mark this as resolved - many thanks again for your input.

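# Must sit above the CubeCart seo_path rule so it runs first.
RewriteCond %{REQUEST_URI} spectacle_shop
# "^" matches every path; the trailing "?" drops the old query string.
RewriteRule ^ https://spectacles.com/index.php? [L,R=301]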
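For the per-product redirects, a minimal sketch might look like the following. The product IDs, the prod_ prefix on the old URLs, and the new page paths are all made up for illustration, and it assumes the .htaccess file sits in the document root.

```apache
# Hypothetical IDs and target paths, shown only to illustrate the pattern.
# Place these above the SEO catch-all rule; the trailing "?" drops the old query string.
RewriteRule ^spectacle_shop/prod_1001(-[^/]*)?\.html$ /ray-ban-aviator.html? [L,R=301,NC]
RewriteRule ^spectacle_shop/prod_1002(-[^/]*)?\.html$ /ray-ban-wayfarer.html? [L,R=301,NC]

# Fallback for any old URL without a specific mapping.
RewriteRule ^spectacle_shop/ /index.php? [L,R=301,NC]
```

The specific rules come first so that a mapped product lands on its new page, and only unmapped old links fall through to the home page.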

 


Maybe consider setting up a 404 error page in the Advanced section of cPanel. You can add a message on it to tell the customer why they arrived there and what to do next. Eventually Google will flush all the old links out. Might save you a bit of time.
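If you would rather do it by hand than through cPanel, the same thing can usually be done with one line in .htaccess. The page name here is hypothetical; the file would need to exist under the document root.

```apache
# Hypothetical page path: serve a friendly page for any request that ends in a 404.
ErrorDocument 404 /old-link-help.html
```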
