Generate a Sitemap from URL (loc) and Last Modified (lastmod) Attributes
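Site Map Generator is an open source program that creates a customized sitemap. For each URL it records two attributes, each written as an XML element: <loc> (the page address) and <lastmod> (the date the page was last modified). A typical entry looks like this:
<url>
  <loc>https://docs.searchunify.com/Content/Content-Sources/Jive.htm</loc>
  <lastmod>2020-05-07</lastmod>
</url>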
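As a rough illustration of the entry format, the sketch below builds one <url> element with Python's standard xml.etree.ElementTree module. The helper name build_url_entry is made up for this example and is not taken from sitemap_gen.py.

```python
import xml.etree.ElementTree as ET


def build_url_entry(loc: str, lastmod: str) -> ET.Element:
    """Return a <url> element with <loc> and <lastmod> children.

    Hypothetical helper for illustration; not part of sitemap_gen.py.
    """
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "lastmod").text = lastmod
    return url


entry = build_url_entry(
    "https://docs.searchunify.com/Content/Content-Sources/Jive.htm",
    "2020-05-07",
)

# Serialize to a plain string (no XML declaration) and print it.
entry_xml = ET.tostring(entry, encoding="unicode")
print(entry_xml)
```

The printed string is the same entry shown above, written on a single line.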
Site Map Generator also offers several options. The most useful is probably -b, which specifies the file types that shouldn't be part of the sitemap.
Prerequisites
- Server (or PC) with Python 3 or higher installed
- Server (or PC) with pip3 installed
Method
- Go to this page and copy the code.
- Create a file named sitemap_gen.py.
~$ touch sitemap_gen.py
- Paste the code into sitemap_gen.py.
- Generate a sitemap.
~$ python3 sitemap_gen.py https://yourwebsite.com/
- Depending on the size of the website, the command might run for a few minutes. Once it finishes, a file named sitemap.xml is created.
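Once sitemap.xml exists, one quick way to sanity-check it is to list the loc and lastmod values it contains. The following is a minimal sketch, assuming the file holds <url> entries like the one in the introduction, possibly under the standard sitemap namespace. The names read_sitemap and local_name are hypothetical helpers, and the sample string stands in for the real file.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple


def local_name(tag: str) -> str:
    """Drop a leading '{namespace}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def read_sitemap(xml_text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (loc, lastmod) pairs for every <url> element in the text."""
    root = ET.fromstring(xml_text)
    entries = []
    for element in root.iter():
        if local_name(element.tag) != "url":
            continue
        # Map each child's tag name to its trimmed text content.
        fields = {
            local_name(child.tag): (child.text or "").strip()
            for child in element
        }
        entries.append((fields.get("loc", ""), fields.get("lastmod")))
    return entries


# Sample input standing in for the contents of sitemap.xml (assumed layout).
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc> https://docs.searchunify.com/Content/Content-Sources/Jive.htm </loc>
    <lastmod>2020-05-07</lastmod>
  </url>
</urlset>"""

for loc, lastmod in read_sitemap(sample):
    print(loc, lastmod)
```

To check the real file, read it with open("sitemap.xml", encoding="utf-8") and pass the text to read_sitemap.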
Tips
To exclude certain file types from the sitemap, use the argument -b. Repeat it once for each file type.
~$ python3 sitemap_gen.py -b png -b jpeg https://yourwebsite.com/
The above command leaves png and jpeg files out of the sitemap. Images of other types are still included.
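The effect of -b can be pictured as a filter on each URL's file extension. The sketch below illustrates that idea only; it is not the tool's actual implementation, and the function name has_blocked_extension and the sample URLs are assumptions made for this example.

```python
import posixpath
from typing import Iterable
from urllib.parse import urlparse


def has_blocked_extension(url: str, blocked: Iterable[str]) -> bool:
    """Return True if the URL path ends in one of the blocked extensions.

    Hypothetical helper; the comparison ignores case and leading dots.
    """
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    blocked_set = {item.lstrip(".").lower() for item in blocked}
    return extension in blocked_set


# Made-up URLs standing in for what a crawl might discover.
urls = [
    "https://yourwebsite.com/index.html",
    "https://yourwebsite.com/logo.png",
    "https://yourwebsite.com/photo.JPEG",
    "https://yourwebsite.com/banner.gif",
]

# Mirrors "-b png -b jpeg": keep everything except png and jpeg files.
kept = [u for u in urls if not has_blocked_extension(u, ["png", "jpeg"])]
print(kept)
```

In this sketch a file ending in .jpg would not match jpeg and would be kept. Whether the tool treats the two spellings as the same type is not stated here, so passing both -b jpg and -b jpeg may be the safer choice.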