SIMAEC.NET WEB PUBLISHING

Sitemap - The Why and How

Although a sitemap is a structural element that belongs to building a website, we include this topic under digital marketing because of its importance in promoting web pages and websites.

Why Does a Website Need a Sitemap?

There are two main reasons. First, it is a convenient way to keep track of all the pages that form a website; it can serve as a checklist when monitoring changes across all web pages. Second, it is the recommended format for communicating a website's inventory to search engines such as Google or Bing.

Sitemap XML Format

A sitemap file uses the XML format. XML is a markup language like HTML, but stricter: errors in syntax will make the data unreadable.

A url element in a sitemap has one mandatory tag named "loc" and a few optional tags: "lastmod", "priority" and "changefreq". Of the optional tags, we use only lastmod; we consider priority and changefreq artificial values. Google has stated that it ignores priority and changefreq, and considers lastmod only when it is consistently accurate.

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
	<url>
		<loc>https://www.simaec.net/digital-marketing/sitemap/</loc>
		<lastmod>2022-03-22</lastmod>
	</url>
	...
</urlset>

How to build a Sitemap

There are different ways to build the sitemap. Small websites may use a manual approach, while larger websites are better served with a tool that creates the sitemap automatically.

Manual

For small websites with fewer than 10 pages, the sitemap.xml file can be maintained manually in a text editor. Make sure, though, that the XML syntax is valid, for example with an online XML format checker.

Local Directory Crawler

For websites like simaec.net, we use a simple Python script that walks through the files and folders on the local drive.

Script on Github

Website Crawler

Another approach we use frequently is crawling the public website and retrieving the URLs of all internally linked web pages. We use this script not only for building a sitemap but also to monitor on-page SEO elements such as correct canonical links and the presence of a title tag and meta description.

Script on Github
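The published crawler is on GitHub; a simplified sketch of the core idea using only the standard library (the URL-collection logic is split out so it can be tested without network access, and the crawl limit is an arbitrary safeguard):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def internal_links(html, page_url):
    """Return absolute URLs on the same host as page_url, fragments stripped."""
    host = urlparse(page_url).netloc
    parser = LinkCollector()
    parser.feed(html)
    urls = set()
    for href in parser.links:
        absolute = urljoin(page_url, href).split("#")[0]
        if urlparse(absolute).netloc == host:
            urls.add(absolute)
    return urls

def crawl(start_url, limit=200):
    """Breadth-first crawl of internally linked pages; returns all URLs found."""
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) <= limit:
        url = queue.pop(0)
        try:
            html = urlopen(url).read().decode("utf-8", "replace")
        except OSError:
            continue
        for link in internal_links(html, url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

A production version would also respect robots.txt, throttle requests, and skip non-HTML responses.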

More Methods

For database driven websites, we use a script which combines a local drive crawler with data pulled from the DB. This is possible because our DB driven websites usually have a table containing records of all webpages with title and description.

There are online tools and programs available that crawl your public website and save the result as a sitemap. We prefer a do-it-yourself approach.

Final Thoughts

We frequently test whether the sitemap's XML is valid. The availability of a sitemap is an item on the SEO checklist.

Resources

Sitemap XML Format

Simaec.net's sitemap is built with the local directory crawler

SEO crawler tools usually provide a sitemap XML export feature. Keep in mind that these tools typically detect only linked resources; orphaned pages will be missed.