## AI Assistant Tracking Requires New Metrics
**Fewer clicks can still mean more revenue—if you’re cited.** Holding AOV constant, if incremental AI sessions (net of cannibalization) are $S_{\mathrm{ai}}/S_0 \approx 0.10$ (10%) and AI-referred sessions convert at $\mathrm{CVR}_{\mathrm{ai}}/\mathrm{CVR}_{\mathrm{avg}} \approx 4.4$ (4.4× the site average), then the relative revenue lift is $\approx 0.10 \times 4.4 = 44\%$:
$
\frac{\Delta R}{R_0}=\frac{S_{\mathrm{ai}}}{S_0}\cdot\frac{\mathrm{CVR}_{\mathrm{ai}}}{\mathrm{CVR}_{\mathrm{avg}}}.
$
This shows how AI citations can offset lower search CTR—provided we measure what matters.
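As a quick sanity check, here is the formula in Python; the 10% session share and 4.4× CVR ratio are the illustrative figures above, not benchmarks:

```python
def relative_revenue_lift(ai_session_share: float, cvr_ratio: float) -> float:
    """Relative revenue lift from incremental AI sessions, holding AOV constant.

    ai_session_share: incremental AI sessions net of cannibalization, S_ai / S_0
    cvr_ratio: CVR of AI-referred sessions relative to average, CVR_ai / CVR_avg
    """
    return ai_session_share * cvr_ratio

# Illustrative figures from the text: 10% incremental sessions at 4.4x CVR.
print(f"{relative_revenue_lift(0.10, 4.4):.0%}")  # -> 44%
```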
So CTR alone no longer captures performance. Instead, track performance across assistants (AIO, AI Mode, ChatGPT, Perplexity, Gemini, Claude, etc.) with:
1. **AI-referred Sessions:** Identify traffic from assistants (standardized UTMs, referrer allowlist, server-side tagging); a classifier sketch follows this list.
2. **AI Session Outcomes:** $\mathrm{CVR}$, $\mathrm{RPS}$ (revenue/session), AOV, LTV, refunds/returns.
3. **AI Citations:** Count brand mentions and links in assistant responses (owned domain and brand name variants).
4. **Share of Answer (SOA):** Your share of citations within an answer or a set of answers.
5. **Sub-query Rank (SQR):** Traditional SERP rank adapted to “query fan-out.”
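One possible implementation of item 1, sketched in Python: classify a session by referrer hostname against an allowlist, falling back to a standardized UTM convention. The hostnames and the `utm_source`/`utm_medium` scheme below are assumptions to replace with your own tagging, not a canonical list:

```python
from urllib.parse import parse_qs, urlparse

# Assumed referrer hostnames for major assistants; extend as you observe new ones.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "www.perplexity.ai": "perplexity",
    "gemini.google.com": "gemini",
    "claude.ai": "claude",
}

def classify_ai_session(referrer: str, landing_url: str) -> str | None:
    """Return an assistant label for a session, or None if not AI-referred."""
    host = urlparse(referrer).hostname or ""
    if host in AI_REFERRER_HOSTS:
        return AI_REFERRER_HOSTS[host]
    # Fallback for assistants that strip referrers, via an assumed UTM
    # convention: utm_medium=ai&utm_source=<assistant>.
    params = parse_qs(urlparse(landing_url).query)
    if params.get("utm_medium") == ["ai"]:
        return (params.get("utm_source") or [None])[0]
    return None

print(classify_ai_session("https://chatgpt.com/", "https://example.com/pricing"))  # chatgpt
```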
[Google Analytics 4 (GA4)](https://en.wikipedia.org/wiki/Google_Analytics) can capture AI-referred sessions and conversions; tracking AI citations, SOA, and SQR requires a lightweight study loop:
**Study loop (minimum viable implementation)**
1. **Build** a representative query set $Q$ (100–200 long-tail prompts across intents—informational, evaluation, decision—branded and unbranded; grouped by topic).
2. **Sample** responses across target assistants; record whether you’re cited, how (mention vs link), and where (primary answer, collapsed source, footnote); see the record schema after this list.
3. **Generate sub-queries** for each topic (3–10 diverse, assistant-style variants) and **pull organic ranks** for each search engine.
4. **Compute metrics** (SOA, SQR, lift estimates) and **re-test** at fixed intervals (e.g., weekly) with adequate sample sizes to handle variance.
5. **Segment** by assistant, topic cluster, intent, and brand/non-brand to spot where to invest.
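To make steps 2–4 concrete, a minimal record schema in Python; every field name is illustrative, but logging this much per sampled response is enough to compute SOA and SQR downstream and to segment as in step 5:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class CitationObservation:
    """One sampled assistant response for one query in Q (illustrative schema)."""
    query: str
    topic: str                    # topic cluster
    intent: str                   # informational / evaluation / decision
    branded: bool                 # brand vs non-brand prompt
    assistant: str                # e.g. "perplexity"
    model_version: str            # log model/version to control for variability
    sampled_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    our_citations: int = 0        # brand mentions + links to owned domains
    total_citations: int = 0      # all brand citations in the response
    placement: str | None = None  # "primary", "collapsed", or "footnote"

obs = CitationObservation(
    query="best crm for small teams", topic="crm", intent="evaluation",
    branded=False, assistant="perplexity", model_version="unknown",
    our_citations=1, total_citations=5, placement="collapsed",
)
```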
**Per-query SOA (single response)**
$
\mathrm{SOA}(q)=\frac{\text{your brand citations in the response to }q}{\text{all brand citations in that response}}.
$
**Aggregate SOA (over a query set $Q$)**
$
\mathrm{SOA}(Q)=\frac{1}{|Q|}\sum_{q\in Q}\mathrm{SOA}(q).
$
*Notes:* (i) “Citations” = count of your brand’s mentions or links to your company site/blog; (ii) consider a **weighted** SOA if you score prominence (e.g., top answer $>$ collapsed source $>$ footnote); (iii) to aggregate across assistants, compute $\mathrm{SOA}_a(Q)$ per assistant $a$ and average—optionally impression-weighted.
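A direct Python translation of the two SOA formulas, plus the weighted variant from note (ii); the prominence weights are placeholder values to calibrate yourself:

```python
# Placeholder prominence weights for weighted SOA (note ii); calibrate to taste.
PROMINENCE = {"primary": 1.0, "collapsed": 0.5, "footnote": 0.25}

def soa(citations: list[tuple[str, str]], brand: str) -> float:
    """Per-query SOA: your share of all brand citations in one response.

    citations: (brand, placement) pairs extracted from a single answer.
    """
    if not citations:
        return 0.0
    ours = sum(1 for b, _ in citations if b == brand)
    return ours / len(citations)

def weighted_soa(citations: list[tuple[str, str]], brand: str) -> float:
    """Weighted SOA: same share, but each citation scored by prominence."""
    total = sum(PROMINENCE.get(p, 0.0) for _, p in citations)
    if total == 0:
        return 0.0
    ours = sum(PROMINENCE.get(p, 0.0) for b, p in citations if b == brand)
    return ours / total

def aggregate_soa(per_query: list[float]) -> float:
    """Aggregate SOA over Q: simple mean of per-query SOA values."""
    return sum(per_query) / len(per_query) if per_query else 0.0

resp = [("us", "primary"), ("rival-a", "collapsed"), ("rival-b", "footnote")]
print(soa(resp, "us"), weighted_soa(resp, "us"))  # 0.333..., ~0.571
```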
**Sub-query Rank (SQR).** Using the assistant-generated sub-query set $Q$, measure how often you appear in the top $k$ organic results:
$
\mathrm{SQR}@k=\frac{\left|\{\,q\in Q:\ r_q\le k\,\}\right|}{|Q|}.
$
*Notes:* (i) $Q$ is the assistant-style sub-query set for a topic cluster; (ii) $r_q$ is your best organic rank for $q$ (set $r_q=\infty$ if you don’t rank within the window), so $\mathrm{SQR}@k$ is equivalent to recall@k; (iii) pick $k$ to match your funnel (e.g., $k\in\{3,10\}$); (iv) de-duplicate hosts when computing $r_q$ if you care about domain coverage rather than page count; (v) compute $\mathrm{SQR}_e@k$ per engine/assistant $e$ and average (optionally impression-weighted).
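And the $\mathrm{SQR}@k$ formula in Python, with `math.inf` standing in for $r_q=\infty$ when you don’t rank inside the window:

```python
import math

def sqr_at_k(ranks: dict[str, float], k: int) -> float:
    """SQR@k: fraction of sub-queries where our best organic rank r_q <= k.

    ranks: sub-query -> best organic rank (math.inf if we don't rank in the window).
    Equivalent to recall@k over the sub-query set Q.
    """
    if not ranks:
        return 0.0
    return sum(1 for r in ranks.values() if r <= k) / len(ranks)

ranks = {"crm pricing tiers": 2, "crm for startups": 7, "crm migration checklist": math.inf}
print(sqr_at_k(ranks, 3), sqr_at_k(ranks, 10))  # 0.333..., 0.666...
```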
> Are you growing SOA across your query set—and improving SQR on the sub-questions assistants actually generate—fast enough to turn declining CTR into higher RPS, LTV, and total revenue?
**Implementation notes & gotchas**
- Use **referrer allowlists** plus **server-side tagging** to recover assistant traffic that strips referrers.
- Maintain a **brand dictionary** (name variants, ticker, product lines, author names) for reliable citation matching; a matcher sketch follows this list.
- Store **citation position** to enable weighted SOA.
- Log **assistant, model/version, timestamp, locale** to control for response variability.
- Track **cannibalization** explicitly (e.g., model uplift vs holdout) when estimating $\Delta R/R_0$.
- For SQR, cache engine responses and enforce **host de-duplication** if you’re measuring domain footprint.
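Finally, a minimal sketch of the brand-dictionary matching mentioned above; the name variants and owned domains are stand-ins for your own entries:

```python
import re
from urllib.parse import urlparse

# Stand-in brand dictionary: name variants plus owned domains (replace with yours).
BRAND_VARIANTS = ["Acme", "Acme Corp", "ACME", "AcmeCRM"]
OWNED_DOMAINS = {"acme.com", "blog.acme.com"}

# Word-boundary pattern over all variants, longest first so "Acme Corp"
# isn't shadowed by the shorter "Acme".
_pattern = re.compile(
    r"\b(" + "|".join(map(re.escape, sorted(BRAND_VARIANTS, key=len, reverse=True))) + r")\b"
)

def count_citations(answer_text: str, cited_urls: list[str]) -> int:
    """Count brand mentions in the answer text plus links to owned domains."""
    mentions = len(_pattern.findall(answer_text))
    links = sum(1 for u in cited_urls if (urlparse(u).hostname or "") in OWNED_DOMAINS)
    return mentions + links

print(count_citations("Acme Corp and Rival both offer...", ["https://blog.acme.com/post"]))  # 2
```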