Google Search Console And Your Blogger Sitemap: How To Add or Make One For Your Site

Adding Sitemaps to Google Blogger

Whether you have received it yet or not, Google has been sending out emails to those with websites registered in their "Google Search Console" with the title, "Introducing the new Search Console for http://(your site.com)/"

It's just a new format for their Search Console system.

If you took them up on it (as if we have a choice!), you saw the new interface for reviewing how Google characterizes your website. I'm actually not sure whether it will help your site, because if you're a small fry like this site, I'm not sure how you'll get any traction in web searches. But for humor's sake, let's assume it will. Being an optimist and all.

At some point you may receive an email titled "New Index coverage issue detected for site http://(your site.com)."

After following the link in that email to see what my detected issue was, I noticed a note saying that I was missing my sitemap. But unlike "real" blog sites run under WordPress or other blog platforms, Blogger sites don't have a real sitemap by default.

But according to a multitude of other sites, there is a way to create, or as I'm presuming, simulate one.

ROBOTS.TXT TERMS DEFINED, via { https://developers.google.com/search/reference/robots_txt }

crawler: A crawler is a service or agent that crawls websites. Generally speaking, a crawler automatically and recursively accesses known URLs of a host that exposes content which can be accessed with standard web-browsers. As new URLs are found (through various means, such as from links on existing, crawled pages or from Sitemap files), these are also crawled in the same way.

user-agent: a means of identifying a specific crawler or set of crawlers.

directives: the list of applicable guidelines for a crawler or group of crawlers set forth in the robots.txt file.

URL: Uniform Resource Locators as defined in RFC 1738.

Google-specific: These elements are specific to Google's implementation of robots.txt and may not be relevant for other parties.

-

Simply put, for Blogger site admins, you need to add the following text via your blog's "Custom robots.txt" option (found under Settings > Search preferences > Crawlers and indexing, at the time of this writing).

This is a sample from my other blogger-based site:

# Blogger Sitemap generated on 2013.06.06
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=1&max-results=500

(And if you have more than 500 posts, subsequent lines like these can be added; see the short script after these lines for a way to generate them:)

Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=501&max-results=500
Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=1001&max-results=500
Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=1501&max-results=500
Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=2001&max-results=500
Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=2501&max-results=500
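-

If you'd rather not write those lines out by hand, a few lines of Python can generate them. This is just a minimal sketch; the base URL and post count below are placeholders you'd swap for your own values:

# Generate the paginated Sitemap lines for a Blogger robots.txt.
# BASE_URL and TOTAL_POSTS are assumptions -- substitute your own values.
BASE_URL = "http://www.cinemastatic.org"
TOTAL_POSTS = 2600  # a rough count of your published posts
PAGE_SIZE = 500     # Blogger feeds return at most 500 entries per request

for start in range(1, TOTAL_POSTS + 1, PAGE_SIZE):
    print(f"Sitemap: {BASE_URL}/atom.xml"
          f"?redirect=false&start-index={start}&max-results={PAGE_SIZE}")

Run it, paste the output into your Custom robots.txt, and you get exactly the lines shown above.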

-

BTW: If you think you can skip this with some kind of RSS feed gadget that shows the latest content on your site, that's great, but what you'd be missing is the line in your robots.txt file that points Google's crawler at your feed, which is what lets it discover your posts.

BTW #2: If you don't create this customization, the default XML sitemap file of any Blogger blog will list only the 26 most recent posts on your blog.

Just saying.
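
If you're curious how many posts a feed actually exposes, you can count the entries yourself. A minimal sketch in Python, with my feed URL standing in as a placeholder for yours:

# Fetch the Blogger Atom feed and count how many posts it exposes.
# The feed URL is a placeholder -- substitute your own site's feed.
from urllib.request import urlopen
from xml.etree import ElementTree

FEED = "http://www.cinemastatic.org/atom.xml?redirect=false&start-index=1&max-results=500"

with urlopen(FEED) as response:
    tree = ElementTree.parse(response)

# Atom feed entries live under the Atom XML namespace.
entries = tree.findall("{http://www.w3.org/2005/Atom}entry")
print(f"This feed exposes {len(entries)} posts")

Try it once against the plain atom.xml URL and once against the max-results=500 version, and you'll see the difference BTW #2 is talking about.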

The robots.txt file explained:

The robots.txt file is a plain text file that gives information about your site to web robots like search engine spiders. It can control which parts of your site they may or may not crawl. Without a robots.txt file, most spiders will assume your entire site is crawlable. Another advantage of having one is avoiding that pesky 404 error when a bot goes looking for a robots.txt file and can't find it. Yes, it's an interesting conundrum.

From my file example above,

"User-agent: *" is the robot name. We indicate : *, to include ALL bots.

"Disallow: /search" This section usually is a place to tell most well-behaved spiders where you don't want them to go explore. Yes, you can block things from being indexed. In this example, which seems to be common to blogger robots.txt files, this particular command says DO NOT SEARCH any url that starts with "/search" in it.  If we wanted for some strange reason to block the entire site, we'd just put "/" in there.

"Allow: /:" This is telling the spider bots that they are allowed to search the entire site, "/" on down.

"Sitemap: http://www.cinemastatic.org/atom.xml?redirect=false&start-index=1&max-results=500"  This is a command statement, telling the crawlers, Hey, check out all my posts, as spelled out via this tricky formatted statement! Under normal, standard industry formatted blogs, a sitemap lists

And that's all there is to it!

- - -

There is another, slightly more complicated method of adding a sitemap for the more technically oriented folks out there... but when I tried it, I got all kinds of security warnings from my Blogger control panel, so I'm not going to get into it here.

Other resources to look into:

labnol.org
advancedhtml.co.uk
xml-sitemaps.com
