PreciseTarget Crawler Policy
Overview
PreciseTarget operates an automated web crawler to collect publicly available product information from ecommerce retailers for the purposes of product research, affiliate content creation, and affiliate marketing analysis.
Our crawler retrieves structured product metadata in a manner similar to major search engines, focusing primarily on structured data (e.g., JSON-LD), canonical product URLs, and publicly visible product attributes.
We are committed to responsible crawling practices and respect for retailer infrastructure.
Purpose
The PreciseTarget crawler is used to:
Refresh publicly available product metadata (e.g., name, price, availability, structured data)
Improve accuracy of affiliate product content
Conduct market research on product attributes and trends
Validate catalog information against publicly published product pages
We do not use the crawler to access gated, private, or paywalled content.
Scope of Crawling
Scope is limited to:
Publicly accessible product detail pages
Individual products as published in affiliate catalogs
Publicly embedded structured metadata (e.g., JSON-LD, Open Graph, schema.org markup)
Associated product metadata necessary to describe individual SKUs
We do not intentionally crawl:
Checkout flows
Customer accounts
Search result pagination at scale
APIs or endpoints not intended for public access
Administrative or backend interfaces
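For readers who want a concrete picture of the publicly embedded structured metadata in scope above, the following Python sketch shows one way JSON-LD product data could be read from a public product page. It is illustrative only and is not a description of our production code: the class name JsonLdExtractor, the function extract_products, and the sample HTML are all hypothetical, and the sketch uses only the Python standard library.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than guessing at their content.


def extract_products(html):
    """Return the top-level JSON-LD objects whose @type is Product."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Product"
    ]


# Hypothetical page content, used only to exercise the sketch.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Mug",
 "offers": {"@type": "Offer", "price": "12.50", "priceCurrency": "USD"}}
</script>
</head><body></body></html>
"""

products = extract_products(SAMPLE_HTML)
```

The sketch reads only markup that is already present in the public HTML response. It submits no forms and follows no links.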
Robots.txt and Technical Controls
PreciseTarget respects:
robots.txt directives
Crawl-delay instructions where specified
Disallow rules that apply to our user-agent
HTTP status codes indicating restricted access
If a site disallows crawling via robots.txt or direct request, we will honor that directive.
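Site operators who want to check how a robots.txt file would be interpreted can use a sketch like the one below. It relies on Python's standard urllib.robotparser module. The user-agent token PreciseTargetBot is assumed from the example user-agent format in this policy, and the helper names and sample rules are hypothetical. It is an illustration of the checks listed above, not our production implementation.

```python
from urllib.robotparser import RobotFileParser

# Assumed token, taken from the example user-agent string in this policy.
USER_AGENT = "PreciseTargetBot"


def build_parser(robots_txt):
    """Parse robots.txt content that has already been retrieved."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl_decision(parser, url):
    """Return (allowed, crawl_delay_seconds) for the given URL."""
    allowed = parser.can_fetch(USER_AGENT, url)
    delay = parser.crawl_delay(USER_AGENT)  # None when no Crawl-delay applies
    return allowed, delay


# Hypothetical rules: product pages allowed with a 10-second delay,
# checkout pages disallowed for this crawler.
SAMPLE_ROBOTS = """\
User-agent: PreciseTargetBot
Disallow: /checkout/
Crawl-delay: 10
"""

robots = build_parser(SAMPLE_ROBOTS)
product_decision = crawl_decision(robots, "https://example.com/product/123")
checkout_decision = crawl_decision(robots, "https://example.com/checkout/cart")
```

Replacing the Disallow line in the sample with "Disallow: /" would make every URL on the site return a disallowed decision for this user-agent.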
Rate Limiting and Infrastructure Impact
We design our crawler to:
Operate at conservative request rates
Distribute traffic over time
Avoid burst behavior
Immediately back off on elevated error rates (e.g., 429, 5xx responses)
We continuously monitor request volume and error signals to prevent operational impact.
If your infrastructure is affected, please contact us (crawler@precisetarget.com) and we will promptly investigate.
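The back-off behavior described in this section can be pictured with the following Python sketch. All names and numbers in it (BASE_DELAY, MAX_DELAY, next_delay, the doubling and halving factors) are assumptions chosen for illustration, not our actual settings. It shows one common approach: double the delay between requests after a 429 or 5xx response, honor a Retry-After value when the server supplies one, and return to the base rate gradually after successful responses.

```python
from typing import Optional

BASE_DELAY = 5.0     # assumed normal delay between requests, in seconds
MAX_DELAY = 3600.0   # assumed upper bound on the delay, in seconds


def should_back_off(status: int) -> bool:
    """True for 429 (Too Many Requests) and any 5xx server error."""
    return status == 429 or 500 <= status <= 599


def next_delay(current: float, status: int,
               retry_after: Optional[float] = None) -> float:
    """Compute the delay to wait before the next request to the same site."""
    if should_back_off(status):
        doubled = min(current * 2, MAX_DELAY)
        if retry_after is not None:
            # Never wait less than the server asked for.
            return min(max(doubled, retry_after), MAX_DELAY)
        return doubled
    # On success, recover gradually rather than jumping back to full speed.
    return max(BASE_DELAY, current / 2)
```

Under these assumed values, a crawler waiting 5 seconds between requests would move to 10 seconds after a single 429 response, and to at least 60 seconds if a later 503 response carried a Retry-After value of 60.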
User-Agent Identification
All crawler requests identify themselves with a descriptive user-agent string that includes:
Crawler name
Company name (PreciseTarget)
Contact email
Policy page URL
Example format:
PreciseTargetBot/1.0 (+https://precisetarget.com/crawler; crawler@precisetarget.com)
The user-agent string and policy page URL are consistent across documentation and production systems.
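Site operators who want to recognize these requests in their access logs could match the example format with a sketch like the one below. The regular expression and the function name parse_user_agent are hypothetical and are based only on the example format shown above. The version number in production traffic may differ from 1.0, so the pattern accepts any dotted version.

```python
import re

# Matches strings shaped like the example format in this policy:
#   Name/Version (+policy-url; contact-email)
UA_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z]+)/(?P<version>[\d.]+) "
    r"\(\+(?P<policy_url>https?://[^;\s]+); (?P<contact>[^)\s]+)\)$"
)


def parse_user_agent(user_agent):
    """Return the named parts of the user-agent string, or None if it does not match."""
    match = UA_PATTERN.match(user_agent)
    return match.groupdict() if match else None


example = parse_user_agent(
    "PreciseTargetBot/1.0 (+https://precisetarget.com/crawler; crawler@precisetarget.com)"
)
```

A user-agent header can be set by any client, so a match on this string alone does not prove that a request came from PreciseTarget. If in doubt, contact us at the address given in this policy.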
Opt-Out and Contact
If you would prefer that PreciseTarget not crawl your website, or if you have questions or concerns, please contact us at crawler@precisetarget.com.
We aim to respond to opt-out and abuse inquiries within 4 business days.
Upon receiving a verified opt-out request, we will:
Cease crawling activity
Remove the domain from active crawl queues
Confirm suppression
Privacy and Data Handling
The PreciseTarget crawler collects only publicly available information, including:
Public URLs
Public HTML content
Public structured metadata (e.g., JSON-LD, schema.org data)
Robots.txt directives and crawl decisions
Timestamped retrieval metadata
We do not collect:
Customer account data
Form submissions
Personal data entered into interactive workflows
Collected product metadata is retained for research, analytics, and affiliate content generation purposes. Data retention policies are reviewed periodically to ensure proportional storage.
For more information about how we handle data, please refer to our Privacy Policy.