
Google's new Guidelines


Guest


At the present time, CC has the following in its URL code -- &catid -- and according to Google's new rules it has to be changed to this: &amp;catid.

As the URLs are now, none of them will be indexed in Google. This renders CC useless for getting any traffic from Google, and it needs to be addressed quickly.
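To make it concrete, the change is only in the HTML source of the link: the bare & between parameters has to be written as the entity &amp; for the markup to validate. A quick sketch in Python (catid is the parameter discussed above; the other parameter name is just a placeholder, not one of CC's actual ones):

import html

# Raw URL as it works in the browser's address bar.
raw_url = "index.php?cat=1&catid=2"

# When the link is written into the HTML source, the & must become &amp;
encoded_for_html = html.escape(raw_url)
print(encoded_for_html)   # index.php?cat=1&amp;catid=2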

How will this issue be addressed? Is there a mod, or will there be one, and if so, when?


That's accurate if you want to have valid code, but your premise is off. Google has never required valid code, and as far as I know that hasn't changed recently. Sure, it would be nice if everyone coded everything properly, but I've seen plenty of sites that don't do it right and are still listed just fine in the SERPs, including my vitamin site.

The Google Sitemap program is a little different - there you do want to get it right. But their bots aren't as picky as their Sitemap program. Just think of how many pages they would have to exclude if they required links to be valid - and with Yahoo already showing an index well over double what Google claims, they don't want to exclude that many pages.

If you have a reference for your statement I'd be interested in seeing it.


I have a site I have heavily modded for a client, and I have made no changes to it in over six months to compensate for whatever changes Google makes, yet the site is still achieving page 1 rankings on many keywords.

The only things I have done are apply four or five mods to help place text in the right places and apply my secret formula when adding the products to achieve the high results.

I still believe that if you're honest and have good content, you will do well.


Webmaster guide

The link above takes you to the complete list of guidelines.

It's nothing to do with valid or invalid code. You are correct, Roban, about your code statement, and I agree that most bots are able to crawl badly written code and for the most part ignore errors in that coding. But I am talking about something different here.

It may be true that current pages are indexed and doing well in the SERPs. Sure, these pages are indexed now, but for how long? This statement is a recent addition to their guidelines. Previously it said to avoid "&id=" because of the nature of these pages: when they are crawled, Googlebot places a massive drain on the server's resources and therefore would only crawl and index these pages slowly.

The part I am referring to is at the bottom.

Technical Guidelines:

* Use a text browser such as Lynx to examine your site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.

* Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.

* Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.

* Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it's current for your site so that you don't accidentally block the Googlebot crawler. Visit http://www.robotstxt.org/wc/faq.html to learn how to instruct robots when they visit your site.

* If your company buys a content management system, make sure that the system can export your content so that search engine spiders can crawl your site.

* Don't use "&id=" as a parameter in your URLs, as we don't include these pages in our index

Going by that last guideline, Google will stop indexing pages with this parameter in the URL. It won't happen in the blink of an eye, but if they say they won't index them, then they won't.
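For what it is worth, the usual workaround shops use is to show the crawler path-style URLs and translate them back into the real query string on the server (with mod_rewrite or similar), so the indexed pages never contain an id parameter at all. A rough sketch of the mapping only - the parameter names below are made up for illustration and are not CC's actual ones:

import re

def to_static(query_url):
    # index.php?act=viewProd&productId=42  ->  /product/42
    match = re.search(r"productId=(\d+)", query_url)
    return f"/product/{match.group(1)}" if match else query_url

def to_dynamic(path):
    # /product/42  ->  index.php?act=viewProd&productId=42
    match = re.match(r"^/product/(\d+)$", path)
    return f"index.php?act=viewProd&productId={match.group(1)}" if match else path

print(to_static("index.php?act=viewProd&productId=42"))   # /product/42
print(to_dynamic("/product/42"))                          # index.php?act=viewProd&productId=42

In practice the second half would be handled by the web server's rewrite rules rather than application code, but the mapping is the same idea.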

From my server logs I can see that msnbot is crawling more pages: on its last visit it used 9.2 MB of bandwidth, while Googlebot used just 2.49 MB. It shows me the struggle Googlebot is having.
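If anyone wants to run the same comparison, the numbers come straight out of a standard Apache combined access log: sum the response sizes per user agent. A quick sketch (the file name is just an example):

import re
from collections import defaultdict

# Point this at your own access log
LOG_FILE = "access.log"

# Combined log format: the bytes-sent field comes after the status code,
# and the user agent is the last quoted field on the line.
LINE_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (\d+|-) "[^"]*" "([^"]*)"')

totals = defaultdict(int)
with open(LOG_FILE, encoding="utf-8", errors="replace") as fh:
    for line in fh:
        m = LINE_RE.match(line)
        if not m:
            continue
        size, agent = m.groups()
        if size == "-":
            continue
        if "Googlebot" in agent:
            totals["Googlebot"] += int(size)
        elif "msnbot" in agent:
            totals["msnbot"] += int(size)

for bot, byte_count in totals.items():
    print(f"{bot}: {byte_count / (1024 * 1024):.2f} MB")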

We as a community need to make sure we aren't going to get left behind when these changes happen.

