What are Sitemaps & Why to Use Sitemaps in a Website?
Sitemaps are used by webmasters to tell search engines about pages on their sites that are available for crawling. To be precise, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data so that crawlers that support Sitemaps can pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but it provides hints that help web crawlers do a better job of crawling your site.
Sitemap 0.90 is offered under the terms of the Attribution-ShareAlike Creative Commons License and has wide adoption, including support from Google, Yahoo!, and Microsoft.
This document describes the XML schema for the Sitemap protocol.
The Sitemap protocol format consists of XML tags. All data values in a Sitemap must be entity-escaped. The file itself must be UTF-8 encoded.
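As a rough illustration of the escaping and encoding rules above, the following Python sketch entity-escapes a value and encodes the result as UTF-8. The helper names escape_sitemap_value and encode_sitemap are hypothetical, and the set of escaped characters (&, <, >, single quote, double quote) is taken from the Sitemap protocol rather than from the text above.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are supplied as extra entities.
_QUOTE_ENTITIES = {"'": "&apos;", '"': "&quot;"}


def escape_sitemap_value(value: str) -> str:
    """Entity-escape a data value before placing it inside a Sitemap tag."""
    return escape(value, _QUOTE_ENTITIES)


def encode_sitemap(xml_text: str) -> bytes:
    """Encode the finished Sitemap text as UTF-8 bytes for writing to disk."""
    return xml_text.encode("utf-8")


escaped = escape_sitemap_value("http://www.example.com/catalog?item=12&desc=hawaii")
# escaped == "http://www.example.com/catalog?item=12&amp;desc=hawaii"
```

If you build the file with an XML library instead of string concatenation, the library normally performs this escaping for you, so this helper is mainly useful for hand-assembled output.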
The Sitemap must:
Begin with an opening <urlset> tag and end with a closing </urlset> tag.
Specify the namespace (protocol standard) within the <urlset> tag.
Include a <url> entry for each URL, as a parent XML tag.
Include a <loc> child entry for each <url> parent tag.
All other tags are optional. Support for these optional tags may vary among search engines. Refer to each search engine’s documentation for details.
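The requirements listed above can be sketched with Python's standard xml.etree.ElementTree module. This is a minimal example, not a complete generator: the function name build_sitemap and the dictionary layout of each entry are assumptions made for illustration. The namespace URI and the optional tag names lastmod, changefreq and priority come from the Sitemap 0.90 protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap 0.90 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Optional child tags of <url>; support may vary among search engines.
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def build_sitemap(entries):
    """Return a UTF-8 encoded Sitemap as bytes.

    Each entry is a dict with a required "loc" key and, optionally,
    any of the keys in OPTIONAL_TAGS (hypothetical input format).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree entity-escapes text content when serializing.
        ET.SubElement(url_el, "loc").text = entry["loc"]
        for tag in OPTIONAL_TAGS:
            if tag in entry:
                ET.SubElement(url_el, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap_bytes = build_sitemap([
    {"loc": "http://www.example.com/", "lastmod": "2005-01-01", "priority": 0.8},
    {"loc": "http://www.example.com/catalog?item=12&desc=hawaii"},
])
```

The xml_declaration argument requires Python 3.8 or later. Only loc is mandatory here, which mirrors the rule that all other tags are optional.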
Also, all URLs in a Sitemap must be from a single host, such as www.example.com or store.example.com. For further details, refer to the Sitemap file location.
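One way to check the single-host rule before writing a Sitemap is to compare the host part of every URL. The sketch below uses urllib.parse from the Python standard library. The function names are hypothetical, and comparing the lowercased host and port is a simplifying assumption that does not cover every detail of the file location rules.

```python
from urllib.parse import urlsplit


def hosts_in(urls):
    """Return the set of distinct hosts (lowercased, including any port)."""
    return {urlsplit(url).netloc.lower() for url in urls}


def all_same_host(urls) -> bool:
    """True when every URL shares one host, as a single Sitemap requires."""
    return len(hosts_in(urls)) <= 1


mixed = [
    "http://www.example.com/page1",
    "http://store.example.com/page2",
]
same_host = all_same_host(mixed)  # False: two different hosts
```

URLs that fail this check would belong in separate Sitemap files, one per host.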
Using Sitemap index files (to group multiple sitemap files)
You can provide multiple Sitemap files, but each Sitemap file that you provide must have no more than 50,000 URLs and must be no larger than 10MB (10,485,760 bytes). If you would like, you may compress your Sitemap files using gzip to reduce your bandwidth requirement; however, the Sitemap file, once uncompressed, must still be no larger than 10MB. If you want to list more than 50,000 URLs, you must create multiple Sitemap files.
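The limits above suggest a simple workflow: split the URL list into groups of at most 50,000, build one Sitemap per group, check the uncompressed size, then optionally gzip the result. The following Python sketch shows the splitting and compression steps. The function names are hypothetical, and the constants reflect the limits quoted in this document.

```python
import gzip

MAX_URLS_PER_SITEMAP = 50_000
MAX_UNCOMPRESSED_BYTES = 10_485_760  # 10MB, as stated above


def split_urls(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a URL list into chunks small enough for one Sitemap each."""
    urls = list(urls)
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


def compress_sitemap(xml_bytes: bytes) -> bytes:
    """Gzip a Sitemap, refusing input that exceeds the uncompressed limit."""
    if len(xml_bytes) > MAX_UNCOMPRESSED_BYTES:
        raise ValueError("Sitemap exceeds the uncompressed size limit")
    return gzip.compress(xml_bytes)


all_urls = [f"http://www.example.com/page{i}" for i in range(120_000)]
chunks = split_urls(all_urls)  # three chunks: 50,000 + 50,000 + 20,000
```

The size check runs before compression because the limit applies to the uncompressed file, not to the gzip output.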
If you do provide multiple Sitemaps, you should then list each Sitemap file in a Sitemap index file. A Sitemap index file may not list more than 50,000 Sitemaps and must be no larger than 10MB (10,485,760 bytes); like a Sitemap, it can be compressed. You can have more than one Sitemap index file. The XML format of a Sitemap index file is very similar to the XML format of a Sitemap file.
The Sitemap index file must:
Begin with an opening <sitemapindex> tag and end with a closing </sitemapindex> tag.
Include a <sitemap> entry for each Sitemap as a parent XML tag.
Include a <loc> child entry for each <sitemap> parent tag.
The optional <lastmod> tag is also available for Sitemap index files.
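A Sitemap index file that follows the rules above could be generated along these lines. This is a sketch using Python's standard xml.etree.ElementTree module: the function name build_sitemap_index and the (loc, lastmod) tuple input are assumptions for illustration, and the namespace URI is the one defined by the Sitemap 0.90 protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap 0.90 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemaps):
    """Return a UTF-8 encoded Sitemap index as bytes.

    `sitemaps` is an iterable of (loc, lastmod) pairs, where lastmod
    may be None because the <lastmod> tag is optional.
    """
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in sitemaps:
        sitemap_el = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap_el, "loc").text = loc
        if lastmod is not None:
            ET.SubElement(sitemap_el, "lastmod").text = lastmod
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


index_bytes = build_sitemap_index([
    ("http://www.example.com/sitemap1.xml.gz", "2004-10-01T18:23:17+00:00"),
    ("http://www.example.com/sitemap2.xml.gz", None),
])
```

Each loc here points at a Sitemap file rather than at a page, which is the main difference from an ordinary Sitemap.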
Note: A Sitemap index file can only specify Sitemaps that are found on the same site as the Sitemap index file. For example, http://www.yoursite.com/sitemap_index.xml can include Sitemaps on http://www.yoursite.com but not on http://www.example.com or http://yourhost.yoursite.com. As with Sitemaps, your Sitemap index file must be UTF-8 encoded.