
[Resolved] 404 error from site map



I have renamed some products and reset the SEO links, then generated a new site map from the Maintenance section. Links in the new site map return 404 errors when you click them, and Google is showing 700 errors on my site from its last crawl. Any ideas on how to fix this?

Also, if you browse through the site, the errors will show, but after 5 or so refreshes, the page pulls up. Then the link from the site map will refresh and pull the site with no 404 error.

I also see site map links to inactive categories and to items whose status is off. How do I get those removed?


" but after 5 or so refreshes, the page pulls up "

I was going to say this sounds like an external caching issue (not your browser and not CubeCart, but something your hosting provider is running your site through).

But that doesn't make sense. A new seo path wouldn't be in the external cache, and no external cache would block a request for a page it doesn't have.

If it is CubeCart that is returning a 404 (a "404 message" within a regular CubeCart page), then either:
The URL is not getting rewritten in the .htaccess file, or
The seo path cannot be found in the CubeCart_seo_urls database table.

Check the table for the exact, correct spelling of the seo path being used.
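
For checking the second possibility, it can help to work out exactly which seo path CubeCart will be asked for. The sketch below is Python, not part of CubeCart; it mirrors the pattern of the stock rule `RewriteRule ^(.*)\.html?$ index.php?seo_path=$1` from the .htaccess quoted later in this thread. The function name is made up for illustration, and it assumes the store sits at the web root (RewriteBase /).

```python
import re
from urllib.parse import urlsplit

# Same pattern as the stock rule quoted later in the thread:
#   RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
SEO_RULE = re.compile(r"^(.*)\.html?$")


def seo_path_from_url(url):
    """Return the seo_path the rule would capture, or None if the rule would not match.

    Hypothetical helper for illustration only; assumes the store is at the web root.
    """
    # In .htaccess context Apache matches the path without its leading slash.
    path = urlsplit(url).path.lstrip("/")
    match = SEO_RULE.match(path)
    return match.group(1) if match else None


print(seo_path_from_url("http://www.csrocketry.com/rocket-motors/cesaroni.html"))
# prints: rocket-motors/cesaroni
```

The returned string is presumably what to compare against the entries in the CubeCart_seo_urls table; the thread does not say which column holds it.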

The spelling is all exact and correct. I think once the seo link has been deleted, it takes the page being clicked on to refresh it. They are almost all fixed now that I sent a program to crawl all links.


Now Google gave me this warning: "When we tested a sample of URLs from your Sitemap, we found that some URLs redirect to other locations. We recommend that your Sitemap contain URLs that point to the final destination (the redirect target) instead of redirecting to another URL."

This is a link from the site map:

http://www.csrocketry.com/rocket-motors/cesaroni.html

This is where that link takes you when you click it in the site map:

https://www.csrocketry.com/index.php?seo_path=rocket-motors/cesaroni

This is the actual link:

https://www.csrocketry.com/rocket-motors/cesaroni.html

 

Edited by Christopher Short

That sounds like the rewrite rules in your .htaccess file are not working properly. Rename your existing .htaccess file; CC will create a new one with the appropriate standard values. Then compare the entries in your renamed file against the stock one CC creates. Maybe that will help you spot the issue.

I did this just now. It still leaves the /index.php?seo_path= part in the link. Should it remove that?

Edited by Christopher Short

Do you have this?

  ######## START v4 SEO URL BACKWARD COMPATIBILITY ########
  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule cat_([0-9]+)(\.[a-z]{3,4})?(.*)$ index.php?_a=category&cat_id=$1&%1 [NC]

  # RewriteRule prod_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]
  ###BSMITHER VERSION TO FIX GOOGLE WARNINGS
  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule p(rod)?_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$2&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule info_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=document&doc_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule tell_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule _saleItems(\.[a-z]+)?(\?.*)?$ index.php?_a=saleitems&%1 [NC,L]

  ###BSMITHER ADDED TO FIX C3 EXTRA IMAGES URLS
  RewriteCond %{QUERY_STRING} ^productId=([0-9]+)$
  RewriteRule /extra/prodImages\.php$ index.php?_a=product&product_id=%1 [NC]
  ######## END v4 SEO URL BACKWARD COMPATIBILITY ########

  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
</IfModule>

## Default store 404 page
ErrorDocument 404 yours goes here

##### END CubeCart .htaccess #####

 

Here is the new file:

##### START CubeCart .htaccess #####

### File Security ###
<FilesMatch "\.(htaccess)$">
  Order Allow,Deny
  Deny from all
</FilesMatch>

### Apache directory listing rules ###
DirectoryIndex index.php index.htm index.html
IndexIgnore *

### Rewrite rules for SEO functionality ###
<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteBase /
 
  ##### START v4 SEO URL BACKWARD COMPATIBILITY #####
  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule cat_([0-9]+)(\.[a-z]{3,4})?(.*)$ index.php?_a=category&cat_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule prod_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule info_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=document&doc_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule tell_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$1&%1 [NC]

  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule _saleItems(\.[a-z]+)?(\?.*)?$ index.php?_a=saleitems&%1 [NC,L]
  ##### END v4 SEO URL BACKWARD COMPATIBILITY #####

  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
</IfModule>

### Default store 404 page ###
ErrorDocument 404 /index.php

## Override default 404 error document for missing page resources ##
<FilesMatch "\.(gif|jpe?g|png|ico|css|js|svg)$">
  ErrorDocument 404 "<html></html>
</FilesMatch>
##### END CubeCart .htaccess #####


Note there's a section Bsmither provided that is supposed to keep Google from seeing those v4 URLs. It has helped, but I sometimes still get warnings about pages with v4-style URLs. My guess is that a hosting blip lets those old-style URLs show momentarily.

4 minutes ago, Dirty Butter said:

###BSMITHER VERSION TO FIX GOOGLE WARNINGS
  RewriteCond %{QUERY_STRING} (.*)$
  RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule p(rod)?_([0-9]+)(\.[a-z]{3,4})?$ index.php?_a=product&product_id=$2&%1 [NC]
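
To see what that change does compared with the stock prod_ rule, here is a small Python sketch (not CubeCart code) that applies both patterns the way Apache would: unanchored at the start, and case-insensitive because of [NC]. The sample paths and the function name are made up for illustration; the thread does not show actual v4 URLs.

```python
import re

# Stock v4 compatibility pattern (product id is capture group 1).
STOCK = re.compile(r"prod_([0-9]+)(\.[a-z]{3,4})?$", re.IGNORECASE)
# Bsmither's version: "rod" is optional, so the product id moves to group 2.
MODIFIED = re.compile(r"p(rod)?_([0-9]+)(\.[a-z]{3,4})?$", re.IGNORECASE)


def product_id(path, modified=True):
    """Return the product id the chosen pattern would capture, or None.

    Hypothetical helper; uses search() because the Apache patterns have no leading ^.
    """
    if modified:
        match = MODIFIED.search(path)
        return match.group(2) if match else None
    match = STOCK.search(path)
    return match.group(1) if match else None


# Made-up sample paths:
print(product_id("some-name/prod_123.html"))        # 123
print(product_id("p_45.html"))                      # 45
print(product_id("p_45.html", modified=False))      # None
```

This is also why the rewritten rule uses $2 where the stock rule uses $1: the extra (rod) group shifts the numbering.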

 

I added this, not sure how this file works though :)

5 minutes ago, bsmither said:

" It still leaves the /index.php?seo_path= part in the link. Should it remove that? "

CubeCart should be removing that. We will have to find out where it doesn't when it is supposed to.

 

What can I do to help?

10 minutes ago, bsmither said:

" I think once the seo link has been deleted, it takes the page being clicked on to refresh it. "

There is an issue with having the Maintenance kill SEO links. See: https://github.com/cubecart/v6/issues/1490

 

Hmm, I started the thread that you guys used for that. I guess I'm causing problems... ;)

Edited by Christopher Short
6 minutes ago, Dirty Butter said:

YOU are not causing problems, lol, but something sure is!

Hopefully this will be a straightforward fix. I am trying to improve my SEO rankings and generate more business. My last Google report had almost 700 404 errors, all from product name changes. Then the link warning occurred and now I'm stuck. Of course you and Bsmither are great at helping me and I truly appreciate it!

12 minutes ago, bsmither said:

In an earlier post, you gave a sequence of links.

From the first to the second, the URL is getting switched to https while retaining the rewritten URL.

This issue in the Github mentions it, but was dropped.

This fix answers why a SSL-enabled store still has non-ssl links in the sitemap.

So the fix didn't actually fix it then. Ideas how to do so?


When you created the sitemap, did you log in to admin under SSL? (Even if the store is set to use SSL, one can still log in to admin not using SSL.)

If you did not log in under SSL, then that may be the reason why the sitemap was created using non-SSL links.
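
A quick way to check a generated sitemap for this is to list every entry that does not start with https://. A minimal Python sketch, assuming the file follows the standard sitemaps.org format (a urlset of url/loc elements); the function name and the sample XML are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemap protocol (assumed to apply here).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def non_https_locs(sitemap_xml):
    """Return the <loc> values in a sitemap that are not https:// URLs."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text and not loc.text.strip().lower().startswith("https://")
    ]


# Invented sample, not the real sitemap from this store:
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/a.html</loc></url>"
    "<url><loc>https://www.example.com/b.html</loc></url>"
    "</urlset>"
)
print(non_https_locs(sample))  # ['http://www.example.com/a.html']
```

An empty list would suggest the sitemap was generated with SSL links throughout.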

Edited by bsmither

Yeah, this seems to be the issue. I added https to my admin, made a new site map and they all show https now, and appear to be linking properly. Thanks a bunch!


Here is something new for this:

URL: https://www.csrocketry.com/xxlarge-skyangle-deployment-bag.html

This is a 404 error from the Google console. I generated this site map after I did the SEO refresh from admin.

How do I remove the sale items category from the site map? It is not active on my site as a category, because I run permanent price discounts on some items.


As there is no console to select what gets in the map and what doesn't, your quickest solution is to manually edit the XML file.

But the code that makes the sitemap checks the store's sale mode: it includes the sale items link only if the sale mode is not off and is not the global sale mode (which leaves per-product).

But you say the sale mode is off, so there should be no link.
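
If you do end up editing the XML by hand, a script can do the same job. This is a rough Python sketch, assuming the sitemap uses the standard sitemaps.org namespace; the function name and the "sale-items" substring are placeholders, since the thread does not show what the sale items URL looks like in the sitemap.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard sitemap protocol (assumed to apply here).
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def drop_urls(sitemap_xml, unwanted_substring):
    """Return the sitemap XML with every <url> removed whose <loc> contains the substring."""
    ET.register_namespace("", NS)  # keep the default namespace unprefixed on output
    root = ET.fromstring(sitemap_xml)
    for url in list(root.findall(f"{{{NS}}}url")):
        loc = url.find(f"{{{NS}}}loc")
        if loc is not None and loc.text and unwanted_substring in loc.text:
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


# Invented sample; "sale-items" is a placeholder, not the real CubeCart path.
sample = (
    f'<urlset xmlns="{NS}">'
    "<url><loc>https://www.example.com/sale-items.html</loc></url>"
    "<url><loc>https://www.example.com/keep.html</loc></url>"
    "</urlset>"
)
print(drop_urls(sample, "sale-items"))
```

Keep a copy of the original file, and remember that regenerating the sitemap from admin would overwrite the edited one.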


If you cleared all SEO links from admin, Maintenance, then this issue in the Github may explain what is happening.

On the other hand, a search for skyangle includes this product:
www.csrocketry.com/recovery-supplies/skyangle/deployment-bag/xxlarge-skyangle-deployment-bag.html

Note that the category's seo_path has been prepended to the product's seo_path.

 

I know the path on the site is correct, but the site map seems to pull invalid paths. Some of those changes are months old, such as this one. I changed the path a long time ago. How badly will these 404 errors hurt my SEO rankings?


" Some of those changes are months old "

Such as the one we are just now exploring?

You may have to tell the search engines to completely drop prior crawls and/or sitemaps, then give them a fresh sitemap.

I am not one to answer questions about SEO.

