This article provides guidance for customers who need to allow-list SearchUnify crawler traffic at the CDN or WAF layer (e.g., Cloudflare, WP Engine Advanced Network) to prevent crawl failures caused by rate limiting, bot protection, or managed challenges.
When is allow-listing required?
Allow-listing may be required if:
- Your site is protected by Cloudflare, Akamai, Fastly, or a similar WAF/CDN
- You see crawl failures such as Error 1015 (rate limited), bot challenges, or repeated retries in SearchUnify crawl logs
- Your infrastructure blocks or challenges automated traffic by default
SearchUnify crawler IP addresses
SearchUnify crawler IP addresses can vary by environment (production, sandbox) and may change over time for security and infrastructure reasons.
👉 To obtain the correct and current crawler IP addresses for your environment, please raise a support ticket with SearchUnify Support.
When raising the ticket, include:
- Your SearchUnify instance name
- Environment (Production / Sandbox)
- Your CDN/WAF provider (e.g., Cloudflare, WP Engine)
Our support team will share the applicable IPv4/IPv6 details for safe allow-listing.
User-Agent strings used by SearchUnify crawlers
SearchUnify crawlers currently use standard browser-based User-Agent strings. Commonly observed examples include:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36
⚠️ Note: These User-Agent strings are not unique to SearchUnify and should not be relied on alone for bypass rules.
Recommended allow-listing approach
For best results, SearchUnify recommends:
- Primary match: Allow-list crawler traffic by IP address (obtained via support ticket)
- Secondary match (optional): Combine IP allow-listing with User-Agent matching if required by your CDN
- Bypass scope: Ensure bypass applies to:
  - Rate limiting rules
  - Bot protection
  - Managed challenges
  - WAF rules
Scoping the bypass to SearchUnify crawler IP addresses keeps crawling and indexing uninterrupted without weakening security for the rest of your site traffic.
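The matching logic recommended above can be sketched in a few lines of Python. This is only an illustration of the decision a CDN/WAF rule would make, not SearchUnify's or any provider's actual implementation. The names `CRAWLER_NETWORKS`, `UA_MARKER`, and `should_bypass` are hypothetical, and the IP ranges are reserved documentation placeholders, not real crawler addresses. Use the addresses SearchUnify Support shares for your environment.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (assumption).
# Replace them with the IPv4/IPv6 details provided by SearchUnify Support.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]

# Hypothetical substring for the optional secondary match. The crawler
# User-Agent strings are ordinary browser strings, so this is never
# sufficient on its own.
UA_MARKER = "Chrome/"


def should_bypass(client_ip, user_agent="", require_user_agent=False):
    """Return True if a request should skip rate limiting, bot protection,
    managed challenges, and WAF rules."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable source address: never bypass.
        return False

    # Primary match: source IP must fall inside an allow-listed range.
    if not any(addr in net for net in CRAWLER_NETWORKS):
        return False

    # Secondary match (optional): narrow the bypass further by User-Agent.
    if require_user_agent and UA_MARKER not in user_agent:
        return False

    return True
```

For example, `should_bypass("192.0.2.17")` returns `True`, while a request from outside the listed ranges is rejected even if it presents a matching browser User-Agent. The IP check comes first because the User-Agent is trivially spoofable and only ever narrows the rule.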