red sun Posted September 5, 2016
We have old URLs still turning up on Google for a site that was launched months ago. I'm trying to modify the .htaccess file to catch any URL that contains a specific string from the old site, for example "spectacle_shop" (e.g. https://spectacles.com/spectacle_shop/cat_544283-RayBan.html?_a=category&cat_id=544283).

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
RewriteCond %{REQUEST_URI} spectacle_shop
RewriteRule .* index.php [L,R=301]
</IfModule>

The existing CubeCart .htaccess code is taking precedence, as the site always redirects back to the root URL followed by /.html, which results in a 403 error. Please note I've changed the actual shop URL and the string we're looking for, as I don't want Google indexing them again through this post. The whole site is under an SSL cert, so all URLs start with https://
Many thanks for any advice.
bsmither Posted September 5, 2016
Not being an expert, I would think that if the spectacle_shop statements appeared above CubeCart's statements, then the 301 would happen.
red sun Posted September 5, 2016
Many thanks. That worked, but the URL that I end up on shows the complete server path:
http://spectacles.com/var/www/vhosts/spectacles.com/httpdocs/index.php
Looks like my redirect is malformed - will try to hunt down the correct syntax.
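A full server path in the redirected URL is the usual symptom of a relative substitution combined with the R flag in a per-directory (.htaccess) context: mod_rewrite re-attaches the directory's filesystem path before it builds the redirect. A minimal sketch of one way around it, assuming the shop lives at the web root, keeps the old-site rule above CubeCart's SEO rule and makes the target root-relative. Setting `RewriteBase /` or giving a full URL as the target are alternatives.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# Force HTTPS first
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

# Old-site URLs: placed before CubeCart's SEO rule so they are caught first.
# The leading slash makes the target a URL path instead of a path relative
# to the .htaccess directory (assumes the shop is at the web root).
RewriteCond %{REQUEST_URI} spectacle_shop
RewriteRule .* /index.php [L,R=301]

# CubeCart SEO rule, unchanged
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
</IfModule>
```

With this form the original query string is still carried over to /index.php, because the target has no query string of its own.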
red sun Posted September 5, 2016
Aha - I had the CC4 SEO backwards compatibility in the .htaccess as well. Stripped that out and it's working now. Ended up putting the domain into the redirect as well.

RewriteCond %{REQUEST_URI} spectacle_shop
RewriteRule ^$ https://spectacles.com/index.php [L,R=301]
bsmither Posted September 5, 2016
Ending up with /.html means CubeCart could not find a matching SEO Path in the database. So, this may be an invalid variant that CubeCart cannot figure out:
https://spectacles.com/spectacle_shop/cat_544283-RayBan.html?_a=category&cat_id=544283
The first 301 would change it to:
https://spectacles.com/cat_544283-RayBan.html?_a=category&cat_id=544283
A CubeCart 4 URL is of the form:
cat_([0-9]+)(\.[a-z]{3,4})?(.*)$
Such as:
https://spectacles.com/friendly-text-of-category/cat_544283.html?followed_by_querystring
I think it is the -RayBan that is causing the fault. Can you determine how reliable this extra bit of characters is? That is, if we can code for it, will there be any false negatives?
Your .htaccess does not have the set of CC4 Rewrite statements. Please add the following:

######## START v4 SEO URL BACKWARD COMPATIBILITY ########
RewriteCond %{QUERY_STRING} (.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule cat_([0-9]+)(\.[a-z]{3,4})?(.*)$ index.php?_a=category&cat_id=$1&%1 [NC]
RewriteCond %{QUERY_STRING} (.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule prod_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]
RewriteCond %{QUERY_STRING} (.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule info_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=document&doc_id=$1&%1 [NC]
RewriteCond %{QUERY_STRING} (.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule tell_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]
RewriteCond %{QUERY_STRING} (.*)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule _saleItems(\.[a-z]+)?(\?.*)?$ index.php?_a=saleitems&%1 [NC,L]
######## END v4 SEO URL BACKWARD COMPATIBILITY ########

ABOVE this:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !=/favicon.ico
RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
red sun Posted September 5, 2016
Ah. I stripped all the CC4 Rewrite statements out. Are they still required for a CC5 site that wasn't upgraded from CC4?
bsmither Posted September 5, 2016
They are not required for a domain that has never been seen before. If there are CC4-style URL links out there in the vastness of the Internet for your domain, you will need to either convince the owners of the web pages carrying those links to update them (impractical), or rewrite the URLs as they show up (easy).
red sun Posted September 5, 2016
Thankfully there won't be any CC4 URL structures, as the site was developed and launched on CC5. Many thanks for your assistance - not sure how I mark this as resolved.
bsmither Posted September 5, 2016
But you started this conversation by saying that there exist CC4-type URLs.
red sun Posted September 5, 2016
Aha - apologies, I should have been much more specific. The previous site wasn't CubeCart, but looking at the URL structure it does look similar.
bsmither Posted September 5, 2016
Wasn't CubeCart... So the category id numbers are in no way comparable to CubeCart's category id numbers?
red sun Posted September 5, 2016
The new site was built in CC5 and all content was added from scratch, so there aren't any CC4 URLs to worry about. The problem is that Google still brings up the occasional reference to the old site URLs for general product searches and, as you spotted, they're very similar to CC4. The CC4 rewrite rules were causing a "Category not found" error in CC5.
I've asked the client to delete any old Google Shopping entries and we're hoping that will stop them showing up in searches in future. We've checked that Google is indexing the site correctly and that sitemap.xml.gz is being picked up.
In the meantime I had to come up with a solution that didn't lead to 404s when customers click the old links, as we think they've been damaging the Google ranking pretty badly, hence the need for a rewrite/redirect based on a string we know existed in all the old site URLs.
bsmither Posted September 5, 2016
Aside from maybe making things worse, give some thought to 301 bouncing straight to index.php with no querystring if the trigger phrase is found.
red sun Posted September 5, 2016
You think it's best to retain the query string. Will look into that - many thanks.
Dirty Butter Posted September 5, 2016
I'll mark it resolved when you have success.
bsmither Posted September 5, 2016
No, I think it's best to discard the querystring.
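By default mod_rewrite carries the original query string over to the target when the substitution has no query string of its own. Two common ways to drop it are sketched below. The domain and trigger string are the placeholders used in this thread, and which form applies depends on the Apache version the host runs.

```apache
# Works on Apache 2.2 and 2.4: a bare trailing "?" on the target
# replaces the original query string with an empty one.
RewriteCond %{REQUEST_URI} spectacle_shop
RewriteRule .* https://spectacles.com/index.php? [L,R=301]

# Apache 2.4 and later only: the QSD (query string discard) flag.
# Use one form or the other, not both.
# RewriteCond %{REQUEST_URI} spectacle_shop
# RewriteRule .* https://spectacles.com/index.php [L,R=301,QSD]
```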
red sun Posted September 6, 2016
The rewrite code below strips out the query string and redirects any link that contains "spectacle_shop". Looks like we may now be going for the more painful task of redirecting each old product URL to the new product URL for several hundred products. Fine to mark this as resolved - many thanks again for your input.

RewriteCond %{REQUEST_URI} spectacle_shop
RewriteRule ^$ https://spectacles.com/index.php [L,R=301]
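For the per-product redirects, one possible layout is a block of one-to-one rules followed by a catch-all, all placed above CubeCart's SEO rule. This is only a sketch: the new-site path on the right is invented for illustration, and only the old category URL on the left comes from this thread. In .htaccess, a RewriteRule pattern is matched against the path without its leading slash, and a `^$` pattern matches only a request for the site root, so the catch-all here matches on the old path segment itself.

```apache
RewriteEngine On

# One rule per old URL. The target is a hypothetical CubeCart SEO path.
# The trailing "?" drops the old query string.
RewriteRule ^spectacle_shop/cat_544283-RayBan\.html$ https://spectacles.com/ray-ban.html? [L,R=301]

# Catch-all for anything else from the old site: send it to the home page.
RewriteRule spectacle_shop https://spectacles.com/index.php? [L,R=301]
```

The specific rules need to come first, since the first matching rule with the L flag ends processing for that request.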
ayz1 Posted September 6, 2016
Maybe consider setting up a 404 error page in the Advanced section of cPanel. You can add a message on it to tell the customer why they arrived there and what to do next. Eventually Google will flush all the old links out. Might save you a bit of time.
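Outside cPanel, a similar result can be had with Apache's ErrorDocument directive in .htaccess, provided the host permits overriding it. A minimal sketch, assuming a hypothetical /old-link.html page that explains why the visitor landed there:

```apache
# Serve a custom page for any URL that is not found.
# /old-link.html is a hypothetical page; it must exist under the web root.
ErrorDocument 404 /old-link.html
```

Requests that CubeCart's own rewrite rules hand to index.php are answered by CubeCart, so this page would only appear for URLs those rules do not capture.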