Google not caching all pages


Hi,

We currently run a number of sites on EPiServer 4.6. Recently, the number of pages that Google caches has dropped dramatically. We believe this may be related to URL rewriting; has anyone experienced similar issues?

We have not yet tested the problem on EPiServer 5 sites, but we can do so.

Any help would be great.

Thanks,

Tom

 

#24158
Sep 25, 2008 16:33

Hi,

It might be related to this issue:

GoogleBot-search-requests-failing-against-ASPNET-20

There is some more information about that issue here:

http://nick.bluecog.co.nz/2006/09/12/episerver-and-friendly-urls/

#24187
Edited, Sep 26, 2008 11:58

Hi,

I have checked the sites as per the instructions in the blog post, but all seems to be fine. We have also looked through the server logs, and there is nothing abnormal there.

Any other ideas?

Thanks,

Tom

#24205
Sep 26, 2008 17:50
Have you created an XML sitemap and submitted it to Google via Webmaster Tools? If so, check the statuses of the links in Webmaster Central. They should be the same as the statuses you got from Fiddler when identifying yourself with Googlebot's user-agent string.
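If you want to check those statuses outside of Fiddler, a quick script can request a page with Googlebot's user-agent string and report the HTTP status code. This is only a sketch: the URL is a placeholder for one of your own pages, and the user-agent value is the string Googlebot advertised at the time.

```python
import urllib.error
import urllib.request

# Placeholder URL -- substitute one of the pages from your sitemap.
URL = "http://www.example.com/en/about-us/"

# Googlebot's advertised user-agent string (the same value you would
# configure in Fiddler to reproduce what the crawler sees).
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

def fetch_status(url, user_agent=GOOGLEBOT_UA):
    """Request a page with the given user-agent and return the HTTP status."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth comparing.
        return err.code
```

Calling fetch_status on each sitemap URL should return the same codes Fiddler shows; any URL that comes back 404 or 500 only under the Googlebot user agent points at the rewriting problem.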
#24244
Sep 29, 2008 10:15

As Peter said, there is a module which generates a sitemap for the Google bot. We have one for 4.6, http://r.ep.se/projects/GoogleSitemaps/, and one for a later version, http://labs.episerver.com/en/Blogs/Jacob-Khan/Dates/2008/6/EPiGoogleSiteMaps/. Hopefully this will help you.

/Jacob
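
For reference, the files these modules generate follow the standard sitemaps.org protocol. A minimal sketch of one entry (the URL and date here are made up) looks like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/en/about-us/</loc>
    <lastmod>2008-09-25</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
```

Only loc is required per URL; lastmod, changefreq, and priority are optional hints to the crawler.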

#24250
Sep 29, 2008 13:54

We have created and submitted a sitemap to Google but are getting some unusual results:

Total URLs in sitemap: 121
Indexed URLs in sitemap: 11

I can't understand why there is such a difference.

Thanks for your help, Tom

#24724
Oct 06, 2008 13:27

Hi again Tom,

I was wondering whether the URLs in the sitemap are friendly or not, and whether there is a correlation between friendly URLs and the pages that fail to get indexed. Also, do you use a lot of redirects, or the EPiServer Shortcut to another page? Finally, make sure you state the sitemap in the robots.txt file.
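
Stating the sitemap in robots.txt is a one-line addition; a minimal sketch (the host name is made up) looks like this:

```
User-agent: *
Sitemap: http://www.example.com/sitemap.xml
```

The Sitemap line must use the full absolute URL, and crawlers pick it up regardless of which User-agent section it appears near.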

#24732
Oct 06, 2008 16:40

I had a similar problem on an old 4.6 site. The .NET error was: Cannot use a leading .. to exit above the top directory.

http://www.webmasterworld.com/google/3497779.htm

Solved it by adding a browser file from post #:3504161. I would like to know whether it's possible to add a .browser file that turns all the variations off and always serves pages the same way.

<browsers>
  <browser refID="Mozilla">
    <capabilities>
      <capability name="cookies" value="true" />
    </capabilities>
  </browser>
</browsers>

#27305
Edited, Jan 26, 2009 10:08