Hicens

.htaccess issues want to disable CubeCart and let me take control

Hi there

I really like CubeCart, but there is just one thing that is annoying me.

I don't like the .html extensions, so I set the extension to null in the SEO class file and updated the .htaccess to match. That is all working well.

When I redesign a website I always use 301 redirects to point to the new path, such as:

redirect 301 /old-prod.html /new-product
This keeps Google and the customer happy with page ranks etc. Well, not always, but it certainly helps.

But some paths are not working and just redirect to index.php. What I want to do is disable the following: (I don't mind modifying PHP, that is not an issue for me)

I would prefer to take control and make my own 404 page, so that I can capture all backlinks that are not valid and then update the .htaccess file accordingly.
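
A possible reason for 301s falling through to index.php: Redirect belongs to mod_alias while the SEO rule belongs to mod_rewrite, and their order in the file does not decide which one handles a request first. The sketch below is one way to sidestep that by keeping the redirects in mod_rewrite, ahead of the SEO rule. It assumes the .htaccess sits in the site root and reuses the example paths from the Redirect line above; it has not been tested against a CubeCart install.

```apache
RewriteEngine On

# Example paths only, taken from the Redirect line above.
# Old URL -> new URL as a permanent redirect, handled before the SEO catch-all.
RewriteRule ^old-prod\.html$ /new-product [R=301,L]

# Stock-style SEO rule: hand any remaining .html path to CubeCart.
RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]
```

Because the first rule carries [L], a matching request is redirected before the catch-all rule is ever evaluated.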

Also I want the site to be fully HTTPS, however using the inbuilt functions causes redirect loops, again I just want to take control of this and use something along the lines of:

RewriteCond %{HTTPS} off
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI}
But even with "Force SSL mode on all pages" and "Enable SSL" set to "No" it seems to still keep causing me issues with a redirect loop.

Has anyone modified CubeCart to take complete control over the redirecting of pages?

Of course I need the SEO path to work still, but I would prefer to fully manage the redirects if that is possible.

Thanks in advance for any advice and keep up the great work on CubeCart.

I am using the latest version of CubeCart v5 for information.


Welcome Hicens! Glad to see you made it to the forums.

 

"I want to use my own 404 page if possible"

 

With SEO enabled, CubeCart relies on the rewrite rule to grab whatever is the URI and make it a querystring. Then CubeCart looks for the seo_path value in the database.

 

If your intent is to log seo_paths that aren't in the database (considered to be 404-ish), we can add code that will log to one of the error_logs: PHP's error_log, or CubeCart's admin_error_log or system_error_log database tables.

 

"Why not just www.example.com"

 

I don't know for sure. Some of the code uses $_SERVER['PHP_SELF'] and some does not. Personally, I don't see why a URL couldn't routinely be stripped of 'index.php'.


 

BACKUP ALL FILES then try:

 

In the classes/ssl.class.php file, after:

$page = $GLOBALS['config']->get('config', 'ssl_url').str_replace(CC_ROOT_REL, '/', $_SERVER['PHP_SELF']);

add:

 

$page = str_replace('index.php','',$page);

 

 

change the .htaccess file:

 

RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]

to

RewriteRule ^(.*)\.html?$ ?seo_path=$1 [L,QSA]

and also add:

RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://www.YOURDOMAIN.com/$1 [R,L]

(Note: you may need to redo this when upgrading CubeCart.)


Thanks both for your replies.

 

@i.ahmed I will test out the code given for the SSL. I am using a private Git repository on my machine to track my modifications, so I know what to redo after upgrades.

Thanks for this. I will update later this evening when I have tested and implemented the code.

 

 

@bsmither The 404 is an important part of SEO. If the website does not return a 404 header, how will Google know that a backlink is now bad?

Although I agree that redirecting to the homepage is a great idea, in my view it just needs some extra code to handle things a little differently.

 

Normally, in my .htaccess file I would specify a 404 PHP file like this:

ErrorDocument 404 /404.php

Then in my 404.php file I would use the following code:

<?php

// Insert code here to trap bad page and store in DB

header("HTTP/1.0 404 Not Found");

echo '<html><head><meta http-equiv="Refresh" content="0;url=http://www.example.com" /></head><body></body></html>';

This is a very simple example, but the idea is the same: I need the response header to be 404, capture the request, and then return content.

Perhaps take the input, clean it, and then pass it to the search page to give them results?

The reason for the meta redirect is that you will get a "headers already sent" error if you do an include.

 

Do you think CubeCart is able to handle a modification like this?

Which files are redirecting it back to index.php? From what I can see, it's the SEO class.

 

 

Thanks again for your input both.


"Which files are redirecting it back to index.php? From what I can see, it's the SEO class."

 

Please look at the code in /classes/cubecart.class.php, near line 200, the loadPage() function.

 

The first task is to deal with $_GET['_g'] commands (line 202). _g commands generally deal with the gateways. For this conversation, there is nothing to do here.

 

Next, the task is to deal with $_GET['_a'] commands (line 261). _a commands generally deal with customer actions. There are a number of pre-defined cases, followed by a catch-all default that will call the named action routine. But, if the action name is not known, either by one of the pre-defined names, the derived name, or by a hook, CC5 does nothing but to show the header, footer, and sidebars.

 

Near line 385 is the task that is performed if there is no $_GET['_g'] or $_GET['_a']. This is important, as it is here that the HomePage is displayed when the SEO class could not find the path in the seo_paths table.

 

In the file /controllers/controller.index.inc.php, near line 40, SEO->getItem() uses the seo_path to swap out $_GET['seo_path'] (introduced in .htaccess) with $_GET['_a']=action and $_GET['type_id']=id. If the seo_path isn't found, CubeCart will bounce the browser to index.php where there will be no $_GET['_a']. From above, CubeCart sends the Homepage.

 

A good spot to try to catch a 404 condition is in SEO->getItem(). A check to make sure nothing else goes wrong will need to be done. If the query returns false, we can bounce the browser with the statement httpredir('404.php');


Thank you for your help. I have now modified CubeCart to a working solution; just some tidy-up work and possibly some vetting is required!

I only needed to edit one class in the end, following the flow you gave me: the SEO class, as follows.

Incredibly simple but very effective in capturing those nasty 404s. I have already found one link out in the wild of the internet that misspells one of the products.
A quick addition to the SEO URLs table and all is well; no more having to chase the webmaster to update the links!

OK, down to business. Again, this is a concept, and I urge anyone to check with an experienced pentester/security expert that this is safe for production!

First off, let's get our new table created to support this addition. Here is the MySQL statement (Note: replace TABLEPREFIX of course!):

CREATE TABLE IF NOT EXISTS `TABLEPREFIX_CubeCart_404s` (
  `404_id` int(11) NOT NULL AUTO_INCREMENT,
  `path` varchar(256) COLLATE utf8_unicode_ci NOT NULL,
  `last_hit` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `count` int(11) NOT NULL,
  `correct_path` varchar(256) NULL,
  PRIMARY KEY (`404_id`),
  UNIQUE KEY `path` (`path`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AUTO_INCREMENT=1 ;

Edit the file seo.class.php located in the classes directory. On lines 237 and 240 you will find httpredir('index.php');.

Replace both of those lines with the following code, which includes a 301 redirect to the correct path:

Note: On mine I had to edit functions.inc.php at line 542, as it was adding on the .html which I had removed from the SEO class.

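// 404 catch: look the requested path up in the CubeCart_404s table.
$item = $GLOBALS['db']->select('CubeCart_404s', false, array('path' => $path));
if (!empty($item[0]['correct_path'])) {
	// A corrected path is on record: send a permanent (301) redirect to it.
	httpredir($item[0]['correct_path'], '', false, 301);
} else {
	if (!empty($item[0]['path'])) {
		// Known bad path: increment its hit counter.
		$NewCount = $item[0]['count'] + 1;
		$GLOBALS['db']->update('CubeCart_404s', array('count' => $NewCount), array('path' => $path));
	} else {
		// First time this path has been seen: record it.
		$GLOBALS['db']->insert('CubeCart_404s', array('count' => 1, 'path' => $path));
	}
	httpredir('404.php');
}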

At the root of the CubeCart shop create a new PHP file named 404.php and add the following code to it (Note: replace http://www.example.com with your address  ;) ):

<?php

header("HTTP/1.0 404 Not Found");

echo '<html> <head> <meta http-equiv="Refresh" content="0;url=http://www.example.com" /> </head><body></body> </html>';

?>

You will now have a table that looks like this. The datetime highlighted in yellow is when I first made my code live; as you can see, the queries are flooding in, so I need to monitor the impact of this:

This was taken before I added the column for correct_path.
 
Capture.png
 
 
Now I just need to try out the SSL code and see if I can get that working as I want. Thanks for your help @bsmither on the 404 pointers.


@i.ahmed  The SSL suggestions didn't work for me, but thank you anyway for your kind advice.

 

In the end I just set all SSL options to "No" and cleared the root options so the text input box is empty.

Then I updated my .htaccess file to do the following:

RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
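
With the rules above, a visitor arriving on the bare domain can be redirected twice (first to http://www, then to https). A hedged variant, assuming www.example.com is the canonical host and that TLS terminates on this Apache server so %{HTTPS} is reliable, sends the bare domain straight to HTTPS in one hop:

```apache
RewriteEngine On

# Bare domain (any scheme): go straight to the canonical HTTPS www host.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^ https://www.example.com%{REQUEST_URI} [R=301,L]

# Any remaining plain-HTTP request: switch to HTTPS on the same host.
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

If TLS is instead terminated by a proxy or load balancer in front of Apache, %{HTTPS} can stay off for every request, which is one common cause of redirect loops; that setup would need a different condition.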

As for the index.php thing I have done enough for now, I will sleep on it for a few days and try again.

 

Thanks again


Updated my answer above. The wife is not happy, but I am, as the 301 redirects can now be done by the database.

Just need a GUI in the admin for my customer now!


Looking at the (image of the) database records, I am puzzled at how some paths are getting through.

 

.htaccess looks for anything that ends in .html and gives that anything to seo_path.

 

Please verify if your Rewrite rule still looks like this:

RewriteRule ^(.*)\.html?$ index.php?seo_path=$1 [L,QSA]

If so, then your first hit must have been www.store.com/sdfadsfaf.html. Is this correct?

 

Which begs the question: Did the paths that end with a file extension (.php, .jpg) also have .html?

 

Also, in controller.index.inc.php, a test is made for a non-empty seo_path. If true, then go to SEO->getItem(). At getItem(), there is another test for a non-empty $path. Because getItem() is a public function, any code anywhere else can use it. Thus, this apparent redundancy is needed.

 

But if $path is empty, meaning we came into this function not from line 40 of controller, do you really want to do your new 404 thing at (was) line 240?


Hi, sorry, I should have put this up earlier. I have modified the SEO path as well; I feel the .html is a little outdated, and it is easier for a customer to remember a path like /book than /book.html.

So my first hit was actually https://www.example.com/sdfadsfaf, which then gave the result shown in the database, as expected.

By doing this it also means I can see if there are any people linking to old images on the site :) and I replaced the image with the company's logo.

 

So on line 49 in the SEO class file I set the extension to null and then adjusted the .htaccess file to:

  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_FILENAME} !-d
  RewriteCond %{REQUEST_URI} !=/favicon.ico
  RewriteRule ^(.*)?$ ?seo_path=$1 [L,QSA]

This also removes index.php. I tested extensively, as this is a live site, and after 2 hours gave up trying to break it. :)

I think that line 542 in the functions PHP file should respect this variable too, to save adjusting it in multiple places.

I did think about this at line 240. Doing a global search for the string $GLOBALS['seo']->getItem( I could only see it used in controller.index.inc.php, so really I should leave that code as is.

If an empty path is used, it wouldn't come from user input, as that is always caught because the path is always present.

I understand what you mean, so thanks for that.

 

It would be nice to see this in the future as part of the core code of CubeCart, with options in the admin settings to disable the .html and enable 404 catching with a 301 redirect. This would really help with the SEO.

 

When I installed CubeCart you can see the drop in queries, as I had no insight into what was out there. I tried some backlink checks but still missed a few key links. The existing site had also died after the webhost upgraded to PHP 5.4 (the shop the customer had was too old to support this version), so I had no time to research the existing links and just had to get it done. That is always bad: a redesign can seriously affect your page ranking.

Here is the graph from Google Webmaster Tools to show you what happened in this case. The drop is when it went belly up and after I installed CubeCart. The increase, which came within only 24 hours, is after I implemented my code and started closely managing the 404s.

 

Capture.png

 

 

Now the next thing on my tick list is to do the microdata for the products pages.

 

Thanks for your help. It is nice to have good support when using a product, and I will be recommending CubeCart to a few of my other customers.


I tried unsuccessfully to add the google microdata to the products page on a Blueprint site. I will be very interested to see what you manage with this.


Will be more than happy to update the forum with my findings and work.

I will do so as a new post.

You said "products" page; I guess you mean the product page, as microdata will not work on a category page. From memory, the schema for product can only appear once on a page.
