PreciseTarget Crawler Policy

Overview

PreciseTarget operates an automated web crawler that collects publicly available product information from ecommerce retailers for product research, affiliate content creation, and affiliate marketing analysis.

Our crawler retrieves structured product metadata in a manner similar to major search engines, focusing primarily on structured data (e.g., JSON-LD), canonical product URLs, and publicly visible product attributes.

We are committed to responsible crawling practices and respect for retailer infrastructure.

Purpose

The PreciseTarget crawler is used to:

  • Refresh publicly available product metadata (e.g., name, price, availability, structured data)

  • Improve accuracy of affiliate product content

  • Conduct market research on product attributes and trends

  • Validate catalog information against publicly published product pages

We do not use the crawler to access gated, private, or paywalled content.

Scope of Crawling

Scope is limited to:

  • Publicly accessible product detail pages

  • Individual products as published in affiliate catalogs

  • Publicly embedded structured metadata (e.g., JSON-LD, Open Graph, schema.org markup)

  • Associated product metadata necessary to describe individual SKUs
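
As an illustration of what reading publicly embedded structured metadata can look like, the Python sketch below pulls schema.org Product objects out of JSON-LD script tags using only the standard library. It is a minimal example under stated assumptions, not the production crawler: the names JsonLdExtractor and extract_products are hypothetical, and it handles only top-level objects whose "@type" is exactly "Product".

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # The type attribute may be missing or valueless, hence the "or".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the page.


def extract_products(html):
    """Return top-level JSON-LD objects whose @type is "Product"."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return [
        item for item in parser.items
        if isinstance(item, dict) and item.get("@type") == "Product"
    ]
```

Real pages often wrap products in an "@graph" array or a list of types; a fuller extractor would need to handle those shapes as well.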

We do not intentionally crawl:

  • Checkout flows

  • Customer accounts

  • Search result pagination at scale

  • APIs or endpoints not intended for public access

  • Administrative or backend interfaces

Robots.txt and Technical Controls

PreciseTarget respects:

  • robots.txt directives

  • Crawl-delay instructions where specified

  • Disallow rules that apply to our user-agent

  • HTTP status codes indicating restricted access

If a site disallows crawling via robots.txt or by direct request, we will honor that directive.
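
For site operators who want to check how a robots.txt file would be interpreted, the sketch below uses Python's standard urllib.robotparser module to evaluate Disallow and Crawl-delay rules for the PreciseTargetBot token. The helper names (build_rules, crawl_decision) and the 5-second default delay are assumptions made for this example, not documented crawler settings.

```python
from urllib.robotparser import RobotFileParser

# Product token taken from the example user-agent string in this policy.
USER_AGENT = "PreciseTargetBot"


def build_rules(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawl_decision(parser, url, default_delay=5.0):
    """Return (allowed, delay_seconds) for a URL.

    default_delay is a hypothetical fallback used when the site
    specifies no Crawl-delay for the matching user-agent group.
    """
    if not parser.can_fetch(USER_AGENT, url):
        return (False, None)
    delay = parser.crawl_delay(USER_AGENT)
    return (True, float(delay) if delay is not None else default_delay)
```

A robots.txt group such as "User-agent: PreciseTargetBot" followed by "Disallow: /" would make crawl_decision return (False, None) for every URL on that site.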

Rate Limiting and Infrastructure Impact

We design our crawler to:

  • Operate at conservative request rates

  • Distribute traffic over time

  • Avoid burst behavior

  • Immediately back off on elevated error rates (e.g., 429, 5xx responses)

We continuously monitor request volume and error signals to prevent operational impact.

If your infrastructure is affected, please contact us (crawler@precisetarget.com) and we will promptly investigate.
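
The back-off behavior described above can be sketched as a small pacing function. This is an illustrative Python example only: the function names, the 2-second base delay, the doubling factor, and the 600-second cap are all assumed values chosen for the sketch, and retry_after stands for a Retry-After header value already converted to seconds.

```python
def should_back_off(status):
    """True for 429 (Too Many Requests) and any 5xx server error."""
    return status == 429 or 500 <= status <= 599


def next_delay(current_delay, status, retry_after=None,
               base=2.0, factor=2.0, max_delay=600.0):
    """Compute the wait (in seconds) before the next request to a host.

    All numeric defaults are hypothetical values for illustration.
    """
    if not should_back_off(status):
        # Healthy response: return to the conservative base pacing.
        return base
    if retry_after is not None:
        # Honor the server's hint, but never speed up because of it.
        return min(max(retry_after, current_delay), max_delay)
    # No hint from the server: grow the delay exponentially, up to a cap.
    return min(current_delay * factor, max_delay)
```

Keeping the delay per host, rather than global, means one struggling site slows only its own queue.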

User-Agent Identification

All crawler requests identify themselves with a descriptive user-agent string that includes:

  • Crawler name

  • Company name (PreciseTarget)

  • Contact email

  • Policy page URL

Example format:

PreciseTargetBot/1.0 (+https://precisetarget.com/crawler; crawler@precisetarget.com)

The user-agent string and policy page URL are consistent across documentation and production systems.
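
Site operators who want to recognize this traffic in their logs could match the example format with a regular expression, as in the Python sketch below. The pattern and the parse_user_agent helper are assumptions derived only from the example string above. A user-agent header can be spoofed, so a match is a hint and not proof of origin.

```python
import re

# Pattern derived from the example format shown in this policy;
# it is an illustration, not an official specification.
UA_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z]+)/(?P<version>[\d.]+) "
    r"\(\+(?P<policy_url>https?://[^;\s]+); (?P<contact>[^)\s]+)\)$"
)


def parse_user_agent(user_agent):
    """Return the crawler name, version, policy URL, and contact, or None."""
    match = UA_PATTERN.match(user_agent.strip())
    return match.groupdict() if match else None
```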

Opt-Out and Contact

If you would prefer that PreciseTarget not crawl your website, or if you have questions or concerns, please contact:

crawler@precisetarget.com

We aim to respond to opt-out and abuse inquiries within 4 business days.

Upon a verified opt-out request, we will:

  • Cease crawling activity

  • Remove the domain from active crawl queues

  • Confirm suppression

Privacy and Data Handling

The PreciseTarget crawler collects only publicly available information, including:

  • Public URLs

  • Public HTML content

  • Public structured metadata (e.g., JSON-LD, schema.org data)

  • Robots.txt directives and crawl decisions

  • Timestamped retrieval metadata

We do not collect:

  • Customer account data

  • Form submissions

  • Personal data entered into interactive workflows

Collected product metadata is retained for research, analytics, and affiliate content generation. We review our data retention policies periodically to keep storage proportionate to these purposes.

For more information about how we handle data, please refer to our Privacy Policy.