Sitemap - The Why and How
Although a sitemap is a structural element that belongs to building a website, we include this topic in digital marketing because of its importance in promoting web pages and websites.
Why Does a Website Need a Sitemap?
There are two main reasons. First, it is a convenient way to keep track of all pages that make up a website; it can serve as a checklist when monitoring changes across all web pages. Second, it is the recommended format for communicating a website's inventory to search engines such as Google or Bing.
Sitemap XML Format
A sitemap file uses the XML format. XML is a markup language like HTML, but stricter: errors in syntax will make the data unreadable.
A url element in a sitemap has one mandatory tag named "loc" and a few optional tags: "lastmod", "changefreq" and "priority". Of the optional tags, we use only lastmod. We consider priority and changefreq artificial values, and Google has stated that it ignores them, while lastmod is taken into account when it is consistently accurate.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.simaec.net/digital-marketing/sitemap/</loc>
    <lastmod>2022-03-22</lastmod>
  </url>
  ...
</urlset>
How to build a Sitemap
There are different ways to build the sitemap. Small websites may use a manual approach, while larger websites are better served with a tool that creates the sitemap automatically.
Manual
For small websites with fewer than 10 pages, the sitemap.xml file can be maintained manually with a text editor. Make sure, though, that the XML syntax is valid, for example with an online XML format checker.
Local Directory Crawler
For websites like simaec.net, we use a simple Python script that walks through the files and folders on the local drive.
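A directory crawler of this kind can be sketched in a few lines of standard-library Python. This is not simaec.net's actual script; the base URL, the local folder name, and the use of the file's modification time for lastmod are assumptions for illustration.

```python
# Sketch of a local directory crawler that builds sitemap.xml.
# BASE_URL and the site root folder are placeholder assumptions.
import os
from datetime import date


def build_sitemap(root: str, base_url: str) -> str:
    """Walk a local folder of HTML files and emit sitemap XML."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".html"):
                continue
            path = os.path.join(dirpath, name)
            # derive the public URL from the relative file path
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            url = f"{base_url}/{rel}"
            # assumption: the file's modification time serves as <lastmod>
            lastmod = date.fromtimestamp(os.path.getmtime(path)).isoformat()
            entries.append(
                f"  <url>\n    <loc>{url}</loc>\n"
                f"    <lastmod>{lastmod}</lastmod>\n  </url>"
            )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        + "\n".join(entries)
        + "\n</urlset>"
    )
```

Mapping file paths to URLs is the one site-specific step: a real script would also translate `index.html` files into their directory URLs and skip drafts or includes.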
Website Crawler
Another approach we use frequently is crawling the public website and retrieving the URLs of all internally linked web pages. We use this script not only for building a sitemap but also to monitor on-page SEO elements such as correct canonical links and the presence of a title tag, content, and meta description.
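The core of such a crawler can be sketched with the standard library alone: parse each page for anchor tags, keep only links that stay on the same host, and work through a queue. This is a minimal illustration, not the script used for simaec.net; a production crawler would also respect robots.txt and throttle requests.

```python
# Sketch of a website crawler that collects internally linked URLs.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def internal_links(page_url: str, html: str) -> set:
    """Return absolute URLs in html that stay on page_url's host."""
    parser = LinkParser()
    parser.feed(html)
    host = urlparse(page_url).netloc
    urls = set()
    for href in parser.links:
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        if parsed.netloc == host and parsed.scheme in ("http", "https"):
            # drop fragments so /page and /page#top count once
            urls.add(absolute.split("#")[0])
    return urls


def crawl(start_url: str, limit: int = 200) -> set:
    """Breadth-first crawl, returning every internal URL discovered."""
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) < limit:
        url = queue.pop(0)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except OSError:
            continue
        for link in internal_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

The same `LinkParser` pattern extends naturally to the SEO checks mentioned above: collecting `<title>`, `<link rel="canonical">`, and `<meta name="description">` during the same pass.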
More Methods
For database-driven websites, we use a script that combines a local drive crawler with data pulled from the DB. This is possible because our DB-driven websites usually have a table containing records of all web pages with title and description.
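The database half of that approach might look like the following sketch. The table name and columns (`pages` with `path` and `updated_at`) are hypothetical; the actual schema will differ per site.

```python
# Sketch of pulling page records from a database for the sitemap.
# Assumption: a hypothetical "pages" table with path and updated_at columns.
import sqlite3


def pages_from_db(db_path: str, base_url: str):
    """Yield (loc, lastmod) pairs from the hypothetical pages table."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT path, updated_at FROM pages")
        for path, updated_at in rows:
            yield f"{base_url}{path}", updated_at
    finally:
        conn.close()
```

The resulting pairs can then be merged with the URLs found by the file-system crawler before the XML is written out.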
There are online tools and programs available that crawl your public website and save the result as a sitemap. We prefer a do-it-yourself approach.
Final Thoughts
We test frequently whether the XML format of the sitemap is valid. The availability of a sitemap is a point on the SEO checklist.
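Such a validity test does not need an external tool: Python's standard library can check well-formedness and read back the listed URLs. A minimal sketch, with hypothetical function names:

```python
# Sketch of a well-formedness check for sitemap.xml.
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_is_valid(xml_text: str) -> bool:
    """True if the XML parses and the root element is <urlset>."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    # the parsed tag carries the namespace: {...}urlset
    return root.tag == f"{{{SITEMAP_NS}}}urlset"


def locs(xml_text: str) -> list:
    """Return all <loc> values, a quick sanity check of the inventory."""
    ns = {"sm": SITEMAP_NS}
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("sm:url/sm:loc", ns)]
```

Comparing the `locs` list against the expected page inventory catches both XML breakage and missing pages in one step.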
Resources
Simaec.net's sitemap is built with the local directory crawler described above.
SEO crawler tools usually provide a sitemap XML export feature. Keep in mind that these tools typically detect only linked resources; orphaned pages will be missed.