Google has decided to use an XML sitemap concept to help it index sites. This is a long-overdue initiative to replace Freshbot, its original indexer.
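For reference, a minimal sitemap in the schema Google announced looks like the fragment below. The URL and dates are placeholders for illustration, not part of any real site:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <!-- The page to crawl, and when it last changed -->
    <loc>http://www.example.com/faq.html</loc>
    <lastmod>2005-06-07</lastmod>
    <changefreq>monthly</changefreq>
  </url>
</urlset>
```

The point is that every changed page is listed explicitly, so Google no longer has to infer changes from the front page.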
Freshbot used to index just the front page, and Deepbot (the current Googlebot) would then do a full crawl of those sites that had changed. This worked on the assumption that new content would be reflected on the front page and that the index file's timestamp would be updated.
Quote: Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.
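The exchange behind that header is simple: the crawler sends the timestamp of its last visit, and the server answers 304 Not Modified (no body) if the page hasn't changed since. Here's a rough sketch of the server-side comparison; the crawl timestamp and temp file are made-up examples, not anything Google specifies:

```python
import os
import tempfile
from email.utils import formatdate, parsedate_to_datetime

# Timestamp of the crawler's previous visit (example value: 2005-06-07 00:00 UTC).
last_crawl = 1118102400

# The crawler sends this as the If-Modified-Since request header.
header_value = formatdate(last_crawl, usegmt=True)
print(header_value)  # Tue, 07 Jun 2005 00:00:00 GMT

# Server side: compare the header against the file's modification time.
# A temp file stands in for the page being requested.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"<html>unchanged page</html>")
    path = f.name
os.utime(path, (last_crawl, last_crawl))  # pretend the page hasn't changed

since = parsedate_to_datetime(header_value).timestamp()
status = 304 if os.path.getmtime(path) <= since else 200
print(status)  # 304 -> crawler skips the download, saving bandwidth
os.unlink(path)
```

When the file is newer than the header's date, the server returns 200 with the full page instead.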
That was all well and good in the early days, when pages weren't dynamic and new content tended to be showcased on the front page. Not so anymore. Blogs still work that way, but informational and corporate sites don't.
Consider Google itself. They update the FAQs in some of the webmaster tools. Not in a million years will that change be visible from any page other than the updated page itself. So this tool will help Google find the changes.
No more compromises
Another huge impact, for me personally, is on my Raincheck site. I've got complex forms for user navigation which use POST, so I've had to build "catalogues" of bot-friendly but human-daunting data. Now I can start to think about making my site human-friendly and letting the feed be bot-friendly.
Let's just hope that the other search engines find the feeds and decide to use them too.