SeenRank

SeenRank Blog

How to find out which AI engines are sending you traffic

Updated 2026-05-13. By the SeenRank team.

Short answer: most AI engines don’t pass a clean Referer header to your server. That makes “how much traffic is ChatGPT sending me” surprisingly hard to answer with normal analytics. Four methods exist, each with tradeoffs. The right combination for most operators is referrer logs for the engines that do pass them (Perplexity, Google AI Overview), plus a lightweight client-side pixel that fingerprints the visit patterns AI-referred users leave behind. Below is the full playbook.

Why this question is harder than it looks

Traditional analytics (Google Analytics, Plausible, server logs) work by reading the Referer HTTP header on inbound visits. The header tells you which page sent the user. For Google blue-link traffic this is trivial: the referrer is google.com/search, often with the keyword query string. AI engines fall into three buckets:

  • Pass a clean referrer: Perplexity (always), Google AI Overview (the source domain shows as google.com, partially distinguishable), Bing Copilot.
  • Pass an ambiguous referrer: Claude with Web (intermittent, often empty), Gemini (often shows as google.com).
  • Pass no referrer: ChatGPT inside the app (mobile/desktop), most embedded AI assistants. The user clicks your link and arrives with no Referer header at all, so from your analytics’ point of view the visit is classified as direct.

The result: a brand getting real ChatGPT traffic often sees zero attributable AI traffic in Google Analytics, because the inbound visits look like direct or organic-search visits. This is the “dark traffic” problem that AI attribution tools are built to solve.

Method 1: Referrer logs (free, partial coverage)

Open your server logs or analytics tool and filter by referrer hostname. The hostnames worth filtering for:

  • perplexity.ai – clean Perplexity referrals.
  • chat.openai.com, chatgpt.com – rare, but appears when users click from the ChatGPT web app.
  • claude.ai – intermittent.
  • copilot.microsoft.com – Bing Copilot.
  • gemini.google.com – rare standalone, more often inside google.com attribution.
  • bing.com with Copilot/AI-answer parameters (less reliable; inspect the full URL).

Pros

Free, no instrumentation, retroactive. You can look at the last 90 days right now.

Cons

Massively undercounts ChatGPT (the highest-volume engine for most categories), undercounts Claude, undercounts Gemini-inside-Google. The number you get is real, but it’s the floor, not the ceiling. Often the true number is 3-10x larger.

How to set it up in 5 minutes

In Google Analytics 4: build a Custom Report with “Session source” as the dimension and filter on the hostnames above. In Plausible: filter by source. In raw server logs (combined format), the referrer is the second quoted field, so awk -F'"' '{print $4}' access.log | sort | uniq -c | sort -rn gives you a frequency table of referrers you can then search for the hostnames above.
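If you prefer a script to a shell one-liner, the same filtering can be sketched in Python. The hostname-to-engine mapping and the combined-log regex below are illustrative assumptions about a typical nginx/Apache setup, not a definitive list:

```python
# Sketch: count inbound visits by AI-engine referrer hostname from an
# nginx/Apache "combined" format access log.
import re
from collections import Counter
from urllib.parse import urlparse

# Illustrative hostname map; extend as engines change their domains.
AI_HOSTS = {
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "claude.ai": "Claude",
    "copilot.microsoft.com": "Bing Copilot",
    "gemini.google.com": "Gemini",
}

# In combined log format the referrer is the second quoted field:
# "<request>" <status> <bytes> "<referrer>" "<user-agent>"
REFERRER = re.compile(r'"[^"]*" \d+ \S+ "([^"]*)"')

def count_ai_referrals(log_lines):
    counts = Counter()
    for line in log_lines:
        m = REFERRER.search(line)
        if not m or m.group(1) in ("-", ""):
            continue  # no referrer recorded: the "dark traffic" bucket
        host = urlparse(m.group(1)).hostname or ""
        if host in AI_HOSTS:
            counts[AI_HOSTS[host]] += 1
    return counts
```

Feed it `open("access.log")` and you get the same floor number the GA4 filter gives you, engine by engine.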

Method 2: UTM-tagged links you place in your own canonical content

The opposite-side approach: put UTM tags on the links inside content AI engines are likely to cite, so when an AI mentions you and the user clicks through, the attribution survives in the query string even when the Referer header doesn’t.

What to tag

Don’t tag every link on your site; that breaks normal SEO attribution. Tag the canonical pages you expect AI engines to cite (your pillar page, your free-tool landing, your strongest comparison pages) only when an external source links to them. For your own internal links, no UTM. For partner pages, podcast show notes, and guest posts, use ?utm_source=podcast-show-x&utm_medium=ai-content.
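Tagging by hand invites typos that fragment your reports, so it can help to generate the links programmatically. A minimal sketch, assuming you want to preserve any query parameters the URL already carries (the parameter values are illustrative):

```python
# Sketch: append UTM parameters to a canonical URL without clobbering
# any query parameters already on it.
from urllib.parse import urlparse, urlunparse, urlencode, parse_qsl

def tag_url(url, source, medium="ai-content", campaign=None):
    parts = urlparse(url)
    params = dict(parse_qsl(parts.query))  # keep existing parameters
    params["utm_source"] = source
    params["utm_medium"] = medium
    if campaign:
        params["utm_campaign"] = campaign
    return urlunparse(parts._replace(query=urlencode(params)))
```

For example, `tag_url("https://example.com/tools/checker", "podcast-show-x")` yields `https://example.com/tools/checker?utm_source=podcast-show-x&utm_medium=ai-content`, matching the pattern above.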

Pros

Lets you measure conversion attribution from specific content placements that AI engines tend to surface.

Cons

Doesn’t measure direct AI-engine referrals (you only see traffic from the partner page that linked to you). Mostly a complement to method 1 and method 4, not a standalone solution.

Method 3: Server-side log parsing for AI crawler signatures

The crawlers themselves leave fingerprints. GPTBot, ClaudeBot, PerplexityBot, Google-Extended, CCBot, and Applebot-Extended all identify themselves in User-Agent when they crawl. This doesn’t tell you which engines are sending users, but it tells you which engines are reading you, which is the upstream signal.

What to look for

  • Pages crawled by PerplexityBot in the last 30 days – these are the pages Perplexity is most likely to cite.
  • Pages crawled by GPTBot – same for ChatGPT.
  • Crawl frequency – a page hit by AI crawlers weekly is being treated as a citation candidate; a page hit once a quarter is being de-prioritized.
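The tally above can be pulled straight from the same access log. The bot tokens below match the crawlers’ self-reported User-Agent substrings; the log layout (combined format, User-Agent as the last quoted field) is an assumption about your server config:

```python
# Sketch: tally which pages each AI crawler fetched, from a combined
# format access log.
import re
from collections import defaultdict, Counter

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot",
           "Google-Extended", "CCBot", "Applebot-Extended"]

# Request path is inside the first quoted field; the User-Agent is the
# last quoted field on the line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+)[^"]*".*"([^"]*)"$')

def crawler_hits(log_lines):
    hits = defaultdict(Counter)  # bot -> page path -> fetch count
    for line in log_lines:
        m = LOG_LINE.search(line)
        if not m:
            continue
        path, user_agent = m.groups()
        for bot in AI_BOTS:
            if bot in user_agent:
                hits[bot][path] += 1
    return hits
```

Run it over 30 days of logs and `hits["PerplexityBot"].most_common(10)` is your list of likely citation candidates for Perplexity.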

Pros

Free, no instrumentation, signals exactly which pages and which engines care about your content.

Cons

Doesn’t measure traffic, only crawler activity. Useful for content prioritization, not attribution.

Method 4: A purpose-built AI traffic pixel

The most accurate method: a small client-side script that runs on every page load, captures the signals AI-referred visits carry (the Referer header where present, document.referrer, known click-source URL patterns, user-agent strings from AI in-app browsers), and attributes the visit to the correct upstream engine even when the standard Referer header is missing.

What a good AI traffic pixel does

  1. Captures document.referrer at the JavaScript level (sometimes available when the HTTP Referer header is stripped).
  2. Pattern-matches the inbound URL against known AI-engine query patterns (e.g. ChatGPT shared links, Perplexity result IDs, Claude conversation IDs in the URL fragment).
  3. Cross-references the user-agent against known AI in-app browser strings.
  4. Stores the attribution server-side so it survives the user’s session, not just the first page load.
  5. Surfaces results in a dashboard: which engines sent how much traffic, which queries triggered which visits, conversion rates by engine.
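The core of steps 1–3 is a cascade of fallbacks: trust the referrer when you have it, then URL patterns, then the user-agent. A minimal sketch of that server-side heuristic, where the hostname list, the `utm_source=chatgpt` pattern, and the user-agent substring are illustrative assumptions rather than the actual SeenRank pixel logic:

```python
# Sketch: the fallback cascade a pixel backend might run to attribute
# a visit to an upstream AI engine. Patterns here are illustrative.
import re

AI_REFERRER_HOSTS = [
    ("perplexity.ai", "perplexity"),
    ("chatgpt.com", "chatgpt"),
    ("chat.openai.com", "chatgpt"),
    ("claude.ai", "claude"),
    ("copilot.microsoft.com", "bing-copilot"),
    ("gemini.google.com", "gemini"),
]

def attribute_visit(referrer="", user_agent="", landing_url=""):
    """Return a best-guess engine label, or 'unattributed'."""
    ref, ua = referrer.lower(), user_agent.lower()
    # 1. A clean referrer is the strongest signal: use it when present.
    for host, engine in AI_REFERRER_HOSTS:
        if host in ref:
            return engine
    # 2. Fall back to known query-string patterns on the landing URL
    #    (hypothetical parameter pattern, for illustration).
    if re.search(r"[?&]utm_source=chatgpt", landing_url.lower()):
        return "chatgpt"
    # 3. Last resort: in-app browser user-agent substrings
    #    (illustrative token).
    if "chatgpt" in ua:
        return "chatgpt"
    return "unattributed"
```

The cascade ordering matters: the referrer is near-certain when present, the URL pattern is strong, and the user-agent match is the weakest signal, which is why heuristic attribution tops out around 70-90% rather than 100%.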

Pros

Solves the ChatGPT dark-traffic problem that methods 1-3 can’t. Catches the visits standard analytics miss.

Cons

Requires installing a script (one line of JS), and the attribution is heuristic rather than perfect. The accuracy is typically 70-90% depending on engine and category.

The SeenRank pixel

SeenRank includes an AI traffic pixel at no extra cost for paid customers. One JavaScript snippet, drops on every page, surfaces AI-attributed traffic in your weekly report alongside visibility tracking. The pixel pairs naturally with the weekly audit data: you see both whether AI engines mention you and which ones are sending real visits.

The recommended combination for most operators

Start free, layer up as the data matters more:

  1. Week 1: Filter your existing analytics by AI-engine hostnames (method 1). Five minutes, free, gives you the floor number for Perplexity, Bing Copilot, and partial Gemini.
  2. Week 1, same session: Add UTM tags to your most important outbound links from podcast notes, guest posts, and partner placements (method 2). Free, takes 30 minutes.
  3. Week 2: Add server-log parsing for AI crawler signatures (method 3). Free, 30 minutes. Tells you which pages AI engines are reading, which is the upstream signal for what they’ll cite.
  4. Month 1+: If AI traffic matters to your business (and for most brands selling software or content products, B2B or B2C, it does), install an AI traffic pixel (method 4). This is what closes the ChatGPT gap.

Run a check first, then worry about attribution

One caveat: attribution is mostly useful when you already know you’re being mentioned. If you haven’t yet checked whether AI engines cite you at all, do that first. A 30-second free check tells you whether AI traffic is even a thing for your brand. If it is, the attribution work above is worth doing. If it isn’t yet, fix that first and revisit attribution in 4-8 weeks.

Run a free SeenRank check →

FAQ

Why doesn’t Google Analytics show my ChatGPT traffic?

Because ChatGPT mostly doesn’t pass a Referer header on outbound clicks. Visits show up with source (direct) and medium (none) in GA, indistinguishable from typed-URL traffic. This is the “dark traffic” problem.

Can I see which prompt sent me the traffic?

Almost never from the user side. Perplexity sometimes preserves the question in the URL, but ChatGPT, Claude, and Gemini generally don’t. The only way to know the prompt is to run audits from the engine side (which is what SeenRank does: it asks the engines the prompts, sees who they recommend, and tracks the trend).

Does Cloudflare Analytics catch AI traffic better than GA4?

Slightly. Cloudflare sees raw HTTP traffic and so captures the Referer header more reliably than GA4’s script-based tracking. But for the ChatGPT case (no referrer at all), it’s no better than GA4. The fix is still a client-side pixel, not a server-side analytics tool.

How accurate is heuristic AI attribution?

Honest answer: 70-90% depending on engine and category. Perplexity attribution is near-perfect because it passes a clean referrer. ChatGPT attribution is the hardest, and even purpose-built pixels miss some visits where the user has aggressive referrer-policy settings. The number is directionally accurate; treat it as a trend signal, not a precise count.

Should I bother tracking AI traffic if it’s under 5% of total?

Yes, because the curve is moving. AI search share is growing in nearly every category, and the brands that build attribution infrastructure now will have clean trend data by the time AI traffic becomes a top-three source. The work is small; the option value is large.

Run a free SeenRank check now →

Related: Free AI visibility checker: what to look for  ·  Can you pay to appear in ChatGPT answers?  ·  AI Search Visibility: the 2026 guide.