ShopFlixBot - Crawler Documentation & Seller Integration Guide
Overview
Shopflix uses a web crawler (ShopFlixBot) to download and process XML product feeds from seller websites. This document covers the crawler's behavior, network identity, and the steps sellers (or their technical teams) need to take to ensure uninterrupted access.
User-Agent
ShopFlixBot identifies itself using the HTTP header:
User-Agent: ShopFlixBot
In some cases, it may also use a generic browser-style header:
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/37.0.2062.94 Chrome/37.0.2062.94 Safari/537.36
Crawled URLs
ShopFlixBot only accesses the following resources:
- XML product feed - the feed URL as registered in the Merchant Portal.
- Image URLs - image links referenced within the XML, required for product creation.
- Product page URLs - product links referenced within the XML, used for occasional validation.
IP Addresses
All requests originate from the following IPs:
| Protocol | Address |
|---|---|
| IPv4 | 18.193.166.252 |
| IPv4 | 3.125.27.163 |
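Because the crawler may also send a generic browser-style User-Agent, the source IP is likely the more dependable signal when deciding whether a request came from ShopFlixBot. A minimal shell sketch of that check follows. The function name `is_shopflixbot_ip` is hypothetical, and the address list is copied by hand from the table above, so it would need updating if the table changes.

```shell
# Hypothetical helper: succeeds (exit 0) only for the two published ShopFlixBot IPs.
is_shopflixbot_ip() {
  case "$1" in
    18.193.166.252|3.125.27.163) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: classify a client address taken from a request or log line.
if is_shopflixbot_ip "3.125.27.163"; then
  echo "ShopFlixBot"
else
  echo "other"
fi
```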
Request Behavior
- ShopFlixBot maintains a reasonable request rate at all times.
- ShopFlixBot does not perform any form of authentication (no cookies, tokens, or login flows).
- ShopFlixBot requires access to seller XML URLs without rate limits.
Seller-Side Configuration
To ensure uninterrupted feed access, sellers must whitelist ShopFlixBot's IPs and User-Agent in their server and security infrastructure. The instructions below cover the most common scenarios.
General (Any Firewall / WAF / Rate Limiter)
- Whitelist the IPs listed above in your firewall and any rate-limiting tools.
- Allow the User-Agent ShopFlixBot in any ACL or bot-filtering mechanism.
- Exclude the IPs from rate limiting - ShopFlixBot needs unrestricted access to the XML feed endpoint.
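For a Linux host that filters traffic with iptables, the IP allow step might look like the sketch below. This is an assumption about the seller's stack, not a Shopflix requirement. The function name is hypothetical, ports 80 and 443 are assumed to be where the feed is served, and the function only prints the commands so they can be reviewed first. It does not cover rate limiters or WAFs, which need their own exclusions.

```shell
# Hypothetical helper: print (do not execute) iptables rules that accept
# HTTP/HTTPS traffic from the two ShopFlixBot IPs listed in this guide.
print_shopflixbot_iptables_rules() {
  for ip in 18.193.166.252 3.125.27.163; do
    # -I INPUT inserts at the top of the chain so the rule is evaluated first.
    printf 'iptables -I INPUT -p tcp -s %s -m multiport --dports 80,443 -j ACCEPT\n' "$ip"
  done
}

print_shopflixbot_iptables_rules
```

After review, the printed commands can be run as root, for example with `print_shopflixbot_iptables_rules | sudo sh`.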
Cloudflare-Specific Setup
Cloudflare security features (Bot Fight Mode, Managed Challenge, custom WAF rules) frequently block or challenge ShopFlixBot requests. Both steps below are required.
Step 1 - Add IPs to IP Access Rules
Navigate to Cloudflare Dashboard → Security → WAF → Tools → IP Access Rules and add each IP with the action Allow:
18.193.166.252
3.125.27.163
Why this is critical: Bot Fight Mode can ignore WAF skip rules entirely. Adding the IPs to IP Access Rules is the only reliable way to bypass it. Additionally, non-Greek traffic may be subject to Managed Challenge via custom rules unless the source IP is explicitly allowed.
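Sellers who prefer scripting over the dashboard could create the same Allow rules through Cloudflare's API. The sketch below assumes the v4 IP Access Rules endpoint, where the Allow action is expressed as mode `whitelist`, and an API token with permission to edit firewall rules. The variables `CF_ZONE_ID` and `CF_API_TOKEN` and both function names are hypothetical, and the request shape should be checked against Cloudflare's current API documentation before use.

```shell
# Assumed inputs (not from this guide): CF_ZONE_ID and CF_API_TOKEN in the environment.
CF_API="https://api.cloudflare.com/client/v4"

# Build the JSON body for one IP Access Rule with action Allow
# (the API calls this mode "whitelist").
cf_allow_payload() {
  printf '{"mode":"whitelist","configuration":{"target":"ip","value":"%s"},"notes":"ShopFlixBot"}' "$1"
}

# Create an Allow rule for each ShopFlixBot IP. Defined only; call it explicitly.
cf_allow_shopflixbot() {
  for ip in 18.193.166.252 3.125.27.163; do
    curl -sS -X POST "$CF_API/zones/$CF_ZONE_ID/firewall/access_rules/rules" \
      -H "Authorization: Bearer $CF_API_TOKEN" \
      -H "Content-Type: application/json" \
      --data "$(cf_allow_payload "$ip")"
    echo
  done
}
```

Running `cf_allow_shopflixbot` sends one request per IP and prints each JSON response.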
Step 2 - Create a WAF Skip Rule for the User-Agent
Navigate to Security → WAF → Custom Rules and create a rule that skips all security checks for requests matching:
(http.user_agent contains "ShopFlixBot")
Note: If additional control mechanisms are in place (e.g., custom rules, managed challenges, third-party bot protection policies), requests may still be blocked even with this rule. Step 1 (IP Access Rules) takes precedence and should always be configured.
Testing & Troubleshooting
To simulate a ShopFlixBot request from the command line:
curl -v -A "ShopFlixBot" -L https://www.example.com/feed.xml
If the feed is inaccessible, verify that both the IP whitelist and User-Agent skip rule are in place, and check your server/CDN access logs for blocked or challenged requests.
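A curl test run from your own machine exercises only the User-Agent rule, since the request does not originate from ShopFlixBot's IPs. To see how real crawler requests were handled, the access log can be filtered as sketched below. The function name is hypothetical, and the field positions assume the common/combined log format (client IP in field 1, request path in field 7, status in field 9), so they may need adjusting for other layouts. If the site sits behind a CDN, field 1 may hold the CDN's address unless the server is configured to log the original client IP.

```shell
# Hypothetical helper: list requests from ShopFlixBot's IPs whose status
# is not 2xx or 3xx. Prints "client_ip status path" for each such request.
shopflixbot_non_success() {
  awk '
    ($1 == "18.193.166.252" || $1 == "3.125.27.163") && $9 !~ /^[23]/ {
      print $1, $9, $7
    }
  ' "$1"
}
```

It takes the log file path as its only argument, for example `shopflixbot_non_success /var/log/nginx/access.log` (the path is an assumption). Statuses such as 403, 429, or 503 in the output would suggest that a firewall, rate limiter, or challenge page is still intercepting the crawler.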