Exclude pages from Indexing
Question
What is the best way to exclude multiple pages from search when they are not in the same folder structure?
Environment: Production
Reported product version: M'21
Resolved in version: M'21
Module: Content Sources
Answer
The best way to exclude these pages during crawling is to add each URL to the "Should not crawl" section of the content source. Follow these steps:
- Navigate to Content Sources in the Admin Panel.
- Edit the website-type Content Source from which you want to exclude the pages.
- Open the Rule tab and click "By Filter".
- Add the page URLs to the "Should not crawl" section. This prevents those pages from being indexed and from appearing in search results.
Note: These URLs are also skipped by the automatic crawls that run at the configured frequency. This feature is available only for website-type content sources.
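To illustrate the idea behind a "Should not crawl" list, here is a minimal Python sketch of how a crawler might drop listed URLs from its queue even when they sit in unrelated folders. It is not the product's implementation: the function names (normalize, filter_crawl_queue), the example.com URLs, and the matching rule (exact match after lowercasing the host and ignoring a trailing slash) are all assumptions made for illustration. How the Admin Panel actually matches entries is not described here, so verify the behaviour with a test crawl.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop query/fragment and any trailing slash.

    Assumed matching rule for this sketch only.
    """
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"


def filter_crawl_queue(candidates, should_not_crawl):
    """Return the candidate URLs that are not on the exclusion list."""
    blocked = {normalize(u) for u in should_not_crawl}
    return [u for u in candidates if normalize(u) not in blocked]


# Hypothetical exclusion list: pages in different folder structures.
should_not_crawl = [
    "https://example.com/docs/legacy/install",
    "https://example.com/blog/2019/old-release-notes/",
]

# Hypothetical URLs discovered during a crawl.
candidates = [
    "https://example.com/docs/current/install",
    "https://example.com/docs/legacy/install/",
    "https://EXAMPLE.com/blog/2019/old-release-notes",
    "https://example.com/support/faq",
]

to_crawl = filter_crawl_queue(candidates, should_not_crawl)
for url in to_crawl:
    print(url)
```

Because the list holds full URLs rather than a folder prefix, pages from any part of the site can be excluded together, which is the situation the question describes.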