...
Web scraping is increasingly used to extract a website's content and data, often through automated tools such as price bots and crawlers. For instance, competitors may target your site this way to retrieve content for various reasons. To discourage scraping of your Customer Self Service eCommerce Platform site, you can enable the Honeypot setting. This helps detect suspicious IP addresses and temporarily restricts them from accessing your site. Administrators can view the list of restricted IP addresses and remove them if needed. Separately, a suspicious activity report can be set up and automatically emailed to specific recipients to enhance monitoring.
Info |
---|
Not all bots are bad, but you will want to be aware of the ones that could be bad for business. |
How it works
IP addresses are flagged as suspicious when they access a special trap route on your site. This route leads to a 'hidden' page that is not reached through usual browsing, so legitimate customers and website visitors will not access it.
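The mechanism described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the platform's implementation: the class name `HoneypotRegistry`, the route `/clearance-archive`, and the status strings are all hypothetical, and the 60-minute timeout simply mirrors the documented default.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict

# Hypothetical trap route; on a real site this is whatever is configured.
TRAP_ROUTE = "/clearance-archive"


@dataclass
class HoneypotRegistry:
    """Illustrative in-memory record of restricted IP addresses."""
    timeout: timedelta = timedelta(minutes=60)
    restricted_until: Dict[str, datetime] = field(default_factory=dict)

    def handle_request(self, ip: str, path: str, now: datetime) -> str:
        # Any hit on the trap route flags the IP and starts (or restarts) the timeout.
        if path == TRAP_ROUTE:
            self.restricted_until[ip] = now + self.timeout
            return "trap"
        # A flagged IP is refused until its timeout has passed.
        expiry = self.restricted_until.get(ip)
        if expiry is not None and now < expiry:
            return "blocked"
        # Expired entries are kept in the record; they just no longer block.
        return "ok"
```

In this sketch an expired entry stays in `restricted_until`, which mirrors the way restricted IP addresses remain listed until an administrator deletes them.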
Step-by-step guide
To enable and configure the setting:
- In the CMS, navigate to Settings → Feature Management.
- Select System, then enable Honeypot.
- Click Configure.
- To enable Honeypot on your website, toggle ON Enable Honeypot.
- In Honeypot Trap Route, enter the path of the hidden page. The route can be any name. NOTE - Change this name from time to time so that scrapers are less likely to detect that it is a 'hidden' page.
- In Ip Timeout Minutes, enter the number of minutes a restricted IP address will be blocked from accessing your site. After the timeout, the IP address can access your site again, but it remains on the list of IP addresses that have been restricted. NOTE - The timeout must be a number higher than zero. If '0' is entered, it defaults to '1'. Default: 60 minutes.
- In Honeypot Code, leave the default code as it is.
- In Response Type, select either '404 - Not Found' or 'Response Message Content'. This determines the page type returned when the trap route is accessed.
  - 404 - Not Found: the route has no page, so the server returns a not found error.
  - Response Message Content: the route leads to a meaningless content page.
If 'Response Message Content' is selected, the Response Message Content editor displays automatically. Enter the content for the page, including formatting and styling.
Tip: Edit this page so that it resembles other pages on your site.
- To save the settings, click Save or Save & Exit.
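The rules stated in the steps above (a timeout of '0' becomes '1', the default timeout is 60 minutes, and the response type is one of two options) can be summarised in a short Python sketch. The `HoneypotSettings` class, its field names, and the `normalise` function are hypothetical and do not reflect how the CMS stores these values. Treating negative numbers like '0' is an assumption, since the guide only states what happens for '0'.

```python
from dataclasses import dataclass, replace

# The two response types named in the guide.
VALID_RESPONSE_TYPES = ("404 - Not Found", "Response Message Content")


@dataclass(frozen=True)
class HoneypotSettings:
    """Hypothetical container mirroring the fields on the settings screen."""
    enabled: bool = False
    trap_route: str = ""
    ip_timeout_minutes: int = 60  # documented default
    response_type: str = "404 - Not Found"
    response_message_content: str = ""


def normalise(settings: HoneypotSettings) -> HoneypotSettings:
    """Return a copy with the documented timeout rule applied."""
    if settings.response_type not in VALID_RESPONSE_TYPES:
        raise ValueError(f"Unknown response type: {settings.response_type!r}")
    # Documented: '0' defaults to '1'.
    # Assumption: negative values are treated the same way.
    timeout = max(1, settings.ip_timeout_minutes)
    return replace(settings, ip_timeout_minutes=timeout)
```

For example, `normalise(HoneypotSettings(enabled=True, trap_route="/x", ip_timeout_minutes=0))` yields settings with a one-minute timeout.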
View restricted IP addresses list
All IP addresses that are currently restricted, or were restricted in the past, are listed in the Restricted IP Maintenance section of the Honeypot Settings screen. They remain in this list after the restriction expires unless manually deleted.
...
To delete an IP address, tick its Delete checkbox, then click Save or Save & Exit.
Send suspicious activity report
Info |
---|
Implementing this function requires consultation with Commerce Vision. |
A scheduled task can be set up so that a suspicious activity report (CSV file) is emailed to specific recipients at regular intervals. The report will contain the following information:
- users whose number of requests per time period is over a threshold,
- an unexpected number of total requests for a set time period,
- and other custom data that can be queried.
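To make the first two report items concrete, here is a minimal Python sketch that counts requests per user within a time window and writes the result as CSV. It is purely illustrative: the actual report is set up in consultation with Commerce Vision, and the function name `build_report`, the column names, and the input shape (a list of `(user, timestamp)` pairs) are assumptions.

```python
import csv
import io
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple


def build_report(
    requests: Iterable[Tuple[str, datetime]],
    window_start: datetime,
    window_end: datetime,
    per_user_threshold: int,
) -> str:
    """Return CSV text listing users over the threshold and the total request count."""
    # Keep only requests inside the half-open window [window_start, window_end).
    in_window = [(user, ts) for user, ts in requests if window_start <= ts < window_end]
    counts = Counter(user for user, _ in in_window)

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["metric", "user", "requests"])
    # One row per user whose request count is strictly above the threshold.
    for user, count in sorted(counts.items()):
        if count > per_user_threshold:
            writer.writerow(["user_over_threshold", user, count])
    # A final row with the total number of requests in the window.
    writer.writerow(["total_requests", "", len(in_window)])
    return buffer.getvalue()
```

Whether the total is "unexpected" would depend on a baseline chosen for the site, which this sketch leaves to the reader.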
...