PreciseTarget Crawler Policy

Overview

PreciseTarget operates an automated web crawler that collects publicly available product information from e-commerce retailers for product research, affiliate content creation, and affiliate marketing analysis.

Our crawler retrieves structured product metadata in a manner similar to major search engines, focusing primarily on structured data (e.g., JSON-LD), canonical product URLs, and publicly visible product attributes.

We are committed to responsible crawling practices and respect for retailer infrastructure.

Purpose

The PreciseTarget crawler is used to:

Refresh publicly available product metadata (e.g., name, price, availability, structured data)
Improve accuracy of affiliate product content
Conduct market research on product attributes and trends
Validate catalog information against publicly published product pages

We do not use the crawler to access gated, private, or paywalled content.

Scope of Crawling

Scope is limited to:

Publicly accessible product detail pages
Individual products as published in affiliate catalogs
Publicly embedded structured metadata (e.g., JSON-LD, Open Graph, schema.org markup)
Associated product metadata necessary to describe individual SKUs
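
As an illustration of what reading publicly embedded structured metadata involves, the sketch below pulls schema.org Product objects out of JSON-LD script tags using only the Python standard library. It is not PreciseTarget's production code: the class name JsonLdExtractor, the helper extract_products, and the sample page are all hypothetical, and it only handles a top-level Product object (not @graph arrays or lists).

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed JSON-LD rather than guessing at its content.


def extract_products(html):
    """Return top-level JSON-LD objects whose @type is exactly "Product"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [
        item for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Product"
    ]


# Hypothetical product page used only for this example.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Mug",
 "offers": {"@type": "Offer", "price": "12.99", "priceCurrency": "USD",
            "availability": "https://schema.org/InStock"}}
</script>
</head><body></body></html>
"""

products = extract_products(SAMPLE_HTML)
```

Here products holds one dictionary whose name, offers.price, and offers.availability fields correspond to the product attributes listed under Purpose.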

We do not intentionally crawl:

Checkout flows
Customer accounts
Search result pagination at scale
APIs or endpoints not intended for public access
Administrative or backend interfaces

Robots.txt and Technical Controls

PreciseTarget respects:

robots.txt directives
Crawl-delay instructions where specified
Disallow rules that apply to our user-agent
HTTP status codes indicating restricted access

If a site disallows crawling, whether through robots.txt or by contacting us directly, we will honor that directive.
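
For site operators who want to see how these rules are typically evaluated, here is a minimal sketch using Python's standard urllib.robotparser module. The user-agent token PreciseTargetBot comes from the example string in this policy; the sample robots.txt, the shop.example URLs, and the helper name build_parser are assumptions made for illustration, and the crawler's actual parser may differ.

```python
from urllib.robotparser import RobotFileParser

# Token taken from the example user-agent string in this policy.
USER_AGENT = "PreciseTargetBot"

# Hypothetical robots.txt a retailer might publish.
SAMPLE_ROBOTS_TXT = """\
User-agent: PreciseTargetBot
Crawl-delay: 10
Disallow: /checkout/
Disallow: /account/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A product page is not covered by any Disallow rule, so it may be fetched.
allowed = parser.can_fetch(USER_AGENT, "https://shop.example/products/123")

# Anything under /checkout/ is disallowed for this user-agent.
blocked = parser.can_fetch(USER_AGENT, "https://shop.example/checkout/cart")

# Crawl-delay is read per user-agent; None means no delay was specified.
delay = parser.crawl_delay(USER_AGENT)
```

With this sample file, allowed is True, blocked is False, and delay is 10 (seconds). A site that wants no crawling at all can publish a single Disallow: / rule under the same User-agent line.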

Rate Limiting and Infrastructure Impact

We design our crawler to:

Operate at conservative request rates
Distribute traffic over time
Avoid burst behavior
Back off immediately when error rates rise (e.g., HTTP 429 or 5xx responses)

We continuously monitor request volume and error signals to prevent operational impact.

If your infrastructure is affected, please contact us (crawler@precisetarget.com) and we will promptly investigate.
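
The back-off behavior described above can be pictured with a short sketch. The delay values, the doubling and halving factors, and the function name next_delay are assumptions chosen for the example, not PreciseTarget's actual settings.

```python
from typing import Optional

BASE_DELAY = 5.0    # assumed minimum seconds between requests to one host
MAX_DELAY = 600.0   # assumed ceiling for the doubled delay


def next_delay(current: float, status: int,
               retry_after: Optional[float] = None) -> float:
    """Return the delay to use before the next request to the same host.

    On 429 or any 5xx response the delay doubles (capped at MAX_DELAY), and a
    server-supplied Retry-After value is honored if it is longer. On other
    responses the delay eases back toward BASE_DELAY.
    """
    if status == 429 or 500 <= status <= 599:
        backed_off = min(current * 2, MAX_DELAY)
        if retry_after is not None:
            backed_off = max(backed_off, retry_after)
        return backed_off
    return max(BASE_DELAY, current / 2)


# Example: one throttled response followed by a successful one.
delay = BASE_DELAY
delay = next_delay(delay, 429)   # doubles to 10.0
delay = next_delay(delay, 200)   # eases back to 5.0
```

Doubling on errors and halving on success keeps the reaction to trouble fast while the return to the normal rate stays gradual, which avoids bursts right after a site recovers.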

User-Agent Identification

All crawler requests identify themselves with a descriptive user-agent string that includes:

Crawler name
Company name (PreciseTarget)
Contact email
Policy page URL

Example format:

PreciseTargetBot/1.0 (+https://precisetarget.com/crawler; crawler@precisetarget.com)

The user-agent string and policy page URL are consistent across documentation and production systems.
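
Site operators who want to recognize the crawler in access logs could parse the header along the following lines. The regular expression is an assumption derived from the example format above, and parse_user_agent is a hypothetical helper. Because any client can send any User-Agent header, a match identifies the claimed sender only.

```python
import re

# Pattern inferred from the example format in this policy (an assumption).
UA_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z]+)/(?P<version>[\d.]+) "
    r"\(\+(?P<policy_url>https?://[^;]+); (?P<contact>[^)]+)\)$"
)


def parse_user_agent(header):
    """Split a crawler user-agent string into its parts, or return None."""
    match = UA_PATTERN.match(header)
    return match.groupdict() if match else None


info = parse_user_agent(
    "PreciseTargetBot/1.0 (+https://precisetarget.com/crawler; "
    "crawler@precisetarget.com)"
)
```

For the example string, info contains the crawler name, the version, the policy page URL, and the contact email as separate fields.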

Opt-Out and Contact

If you would prefer that PreciseTarget not crawl your website, or if you have questions or concerns, please contact:

crawler@precisetarget.com

We aim to respond to opt-out and abuse inquiries within 4 business days.

Upon a verified opt-out request, we will:

Cease crawling activity
Remove the domain from active crawl queues
Confirm suppression
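
A rough sketch of how an opt-out could be applied to a crawl queue follows. The data structures, the function names, and the choice to treat subdomains as covered by the opt-out are assumptions for illustration and do not describe PreciseTarget's internal systems.

```python
from urllib.parse import urlsplit


def is_suppressed(url, suppressed):
    """True if the URL's host is a suppressed domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in suppressed)


def apply_opt_out(domain, suppressed, queue):
    """Record the opt-out and return the queue without that domain's URLs."""
    suppressed.add(domain.lower())
    return [url for url in queue if not is_suppressed(url, suppressed)]


# Hypothetical queue used only for this example.
suppressed_domains = set()
crawl_queue = [
    "https://shop.example/p/1",
    "https://www.shop.example/p/2",
    "https://other.example/p/3",
    "https://notshop.example/p/4",
]

crawl_queue = apply_opt_out("shop.example", suppressed_domains, crawl_queue)
```

Matching on the full host or a dot-prefixed suffix keeps an unrelated domain such as notshop.example from being dropped by accident.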

Privacy and Data Handling

The PreciseTarget crawler collects only publicly available information, including:

Public URLs
Public HTML content
Public structured metadata (e.g., JSON-LD, schema.org data)
Robots.txt directives and crawl decisions
Timestamped retrieval metadata

We do not collect:

Customer account data
Form submissions
Personal data entered into interactive workflows

Collected product metadata is retained for research, analytics, and affiliate content generation. We review data retention policies periodically to keep storage proportionate to those purposes.

For more information about how we handle data, please refer to our Privacy Policy.