AMC Lookalike Audience Evaluation: Compare Prime Day Lift Against Last Year's Rule-Based Audience

TL;DR

Evaluate an Amazon Marketing Cloud (AMC) lookalike audience after the promotional flight closes, not while attribution is still filling in. Compare the lookalike against last year's rule-based audience on reach, conversion rate, new-to-brand (NTB) share, cost per NTB customer, and spend-normalized lift. If the lookalike scales reach but loses NTB efficiency, rotate the seed strategy before the next event.

Environment requirements

Table variants used: dsp_impressions, dsp_clicks, amazon_attributed_events_by_traffic_time
Required subscriptions: Amazon Marketing Cloud access, Amazon DSP campaign reporting, AMC audience activation history
Lookback window: Compare matched promotional flight windows; wait until post-flight attribution has settled before reading lift (the representative job configuration in this post waits 14 days after flight end).
API compatibility: AMC reporting query output and Amazon DSP audience reporting
Schema version: AMC promotional-event evaluation pattern, verified against Atlas corpus notes current to 2026-05-13
Last verified: 2026-05-13

What success and failure look like

Scenario	Signal	Operator decision
Lookalike wins	NTB share up 18%, cost per NTB customer down 12%	Keep the seed and test a tighter expansion type next event.
Reach wins, efficiency loses	Reach up 41%, NTB share down 9%, cost per NTB customer up 22%	Rotate from Broad or Balanced toward Similar, or rebuild the seed with stricter loyalty signals.
Baseline still wins	Rule-based audience has higher conversion rate and lower cost per order	Use the rule-based audience for lead-out retargeting and reserve lookalikes for earlier prospecting.
Read is inconclusive	Spend or impressions differ by more than 25% between cohorts	Normalize by spend or rerun the comparison with matched budget windows.
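
The table can be read as a small decision function. The sketch below is one possible encoding in Python and is not part of the playbook: the CohortDeltas fields, the return labels, and the order in which the rules are checked are assumptions made for illustration. Only the 25% spend or impressions gap comes from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortDeltas:
    """Relative change, lookalike vs. rule-based baseline (0.18 means +18%)."""
    reach: float
    ntb_share: float
    cost_per_ntb: float
    conversion_rate: float
    cost_per_order: float
    spend: float
    impressions: float


def operator_decision(d: CohortDeltas, max_volume_gap: float = 0.25) -> str:
    # A spend or impressions gap above 25% makes the read inconclusive,
    # so this is checked before any win/lose rule (assumed ordering).
    if abs(d.spend) > max_volume_gap or abs(d.impressions) > max_volume_gap:
        return "inconclusive"
    # Lookalike wins: NTB share up and cost per NTB customer down.
    if d.ntb_share > 0 and d.cost_per_ntb < 0:
        return "keep_seed"
    # Reach wins, efficiency loses: more reach, worse NTB share or NTB cost.
    if d.reach > 0 and (d.ntb_share < 0 or d.cost_per_ntb > 0):
        return "rotate_or_rebuild_seed"
    # Baseline still wins: lower conversion rate and higher cost per order.
    if d.conversion_rate < 0 and d.cost_per_order > 0:
        return "use_rule_based_for_lead_out"
    # Anything else does not match a row in the table (hypothetical label).
    return "needs_manual_review"


# The "Reach wins, efficiency loses" row: reach +41%, NTB share -9%, cost +22%.
example = CohortDeltas(reach=0.41, ntb_share=-0.09, cost_per_ntb=0.22,
                       conversion_rate=-0.02, cost_per_order=0.05,
                       spend=0.10, impressions=0.12)
print(operator_decision(example))
```

Checking the volume gap first means a cohort with badly mismatched spend never reaches the keep or rotate rules, which matches the table's instruction to normalize or rerun before deciding.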

Query-to-API payloads

Scheduled AMC evaluation job

Representative job configuration for running the comparison after the promotional attribution window settles.

{
  "jobName": "prime_day_lookalike_vs_rule_based_evaluation",
  "queryName": "amc_lookalike_lift_comparison",
  "schedule": "once_after_flight_close",
  "waitDaysAfterFlightEnd": 14,
  "parameters": {
    "lookalikeCampaignId": "LOOKALIKE_CAMPAIGN_ID",
    "baselineCampaignId": "BASELINE_CAMPAIGN_ID",
    "lookalikeWindowStart": "2026-07-01T00:00:00Z",
    "lookalikeWindowEnd": "2026-07-15T23:59:59Z",
    "baselineWindowStart": "2025-07-01T00:00:00Z",
    "baselineWindowEnd": "2025-07-15T23:59:59Z"
  },
  "outputs": ["reached_users", "conversion_rate", "ntb_share", "cost_per_ntb_order", "roas"]
}
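
Before scheduling a job like this, it can help to confirm that the two windows are the same length and to work out the earliest date the read should run. The Python sketch below is a hypothetical pre-flight check, not an AMC or Amazon Ads API call. It assumes that waitDaysAfterFlightEnd counts days after the later of the two window end timestamps, and the function names parse_utc and check_job are invented for this example.

```python
from datetime import datetime, timedelta, timezone

# Trimmed copy of the representative job configuration (parameters only).
JOB = {
    "waitDaysAfterFlightEnd": 14,
    "parameters": {
        "lookalikeWindowStart": "2026-07-01T00:00:00Z",
        "lookalikeWindowEnd": "2026-07-15T23:59:59Z",
        "baselineWindowStart": "2025-07-01T00:00:00Z",
        "baselineWindowEnd": "2025-07-15T23:59:59Z",
    },
}


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing Z into an aware UTC datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00")).astimezone(timezone.utc)


def check_job(job: dict) -> dict:
    p = job["parameters"]
    look_start, look_end = parse_utc(p["lookalikeWindowStart"]), parse_utc(p["lookalikeWindowEnd"])
    base_start, base_end = parse_utc(p["baselineWindowStart"]), parse_utc(p["baselineWindowEnd"])
    return {
        # Matched windows: both cohorts cover the same duration.
        "windows_matched": (look_end - look_start) == (base_end - base_start),
        # Assumed meaning: wait N days after the later flight end before reading lift.
        "earliest_run": max(look_end, base_end) + timedelta(days=job["waitDaysAfterFlightEnd"]),
    }


result = check_job(JOB)
print(result["windows_matched"], result["earliest_run"].isoformat())
```

With the windows shown in the configuration, both cohorts span the same duration, so the check passes and the read is held until 14 days after the 2026 window closes.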

The Slack message came in two days after the Prime Day campaign closed: "The lookalike audience shipped. Did it actually beat last year's rule-based audience, or did we just add a new prospecting line item?"

That question should not be answered from an Amazon DSP screenshot. The operator needs a matched-window read: same promotional period, comparable spend, and the same conversion definition. If the lookalike audience reached more shoppers but acquired new-to-brand customers at a worse cost, it did not win. It scaled noise. If it held cost per new-to-brand customer while expanding reach, the seed is worth keeping.

So I asked our agent in Claude. The agent has Amazon Agent Atlas behind it: a corpus of AMC playbooks, instructional queries, reporting patterns, and audience notes. The same prompt would work in ChatGPT because Kuudo exposes that Amazon context through standards-based Model Context Protocol (MCP), not through one chat app. If the follow-up belongs in code, Cursor or Codex can use the same server. If it needs to run weekly, n8n or Make can call the same workflow. The client changes; the grounded Amazon context does not.

What ChatGPT or Claude without Atlas gets wrong in lookalike evaluation

I ran the same ask without retrieval. The answer sounded reasonable: pull conversions, compare totals, report lift. That is not enough for a Prime Day audience decision.

Five failure modes in the un-grounded response:
It compared the lookalike audience to total campaign performance instead of to last year's rule-based audience baseline. That makes the lookalike inherit credit from every other tactic in the flight.
It treated reach as the win condition. For Prime Day prospecting, reach only matters if new-to-brand efficiency holds.
It ignored the attribution settling period. Reading the audience too early undercounts late conversions and can make the model look worse than the baseline.
It did not normalize for spend. A line item with 40% more spend should produce more conversions; the question is whether it produced cheaper or more incremental customers.
It did not create a next action. The operator needs a keep, rotate, or rebuild decision, not a dashboard summary.
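
The spend-normalization point is easiest to see with numbers. The Python sketch below uses invented figures, not campaign data: a lookalike line item with 40% more spend and 26% more orders than the baseline. The helper names and the choice of "orders per 1,000 units of spend" as the normalized metric are assumptions for illustration.

```python
def relative_change(new: float, old: float) -> float:
    """Relative change of new against old, e.g. 0.26 for +26%."""
    return (new - old) / old


def orders_per_1k_spend(orders: float, spend: float) -> float:
    """Orders bought per 1,000 currency units of spend."""
    return 1000.0 * orders / spend


# Hypothetical cohort totals for illustration only.
baseline = {"spend": 10_000.0, "orders": 500}
lookalike = {"spend": 14_000.0, "orders": 630}

raw_lift = relative_change(lookalike["orders"], baseline["orders"])
normalized_lift = relative_change(
    orders_per_1k_spend(lookalike["orders"], lookalike["spend"]),
    orders_per_1k_spend(baseline["orders"], baseline["spend"]),
)

print(f"raw order lift: {raw_lift:+.0%}")                 # +26%
print(f"spend-normalized lift: {normalized_lift:+.0%}")   # -10%
```

The raw total says the lookalike won. Once spend is held constant, the same data says each unit of budget bought fewer orders than the rule-based baseline did.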

This is the seed post's problem in reverse. An ungrounded model can produce something query-shaped. The Atlas-grounded agent knows which comparison decides the next Prime Day seed.

What Atlas retrieves for Prime Day lookalike measurement

When the agent gets the question, it retrieves the playbook pieces that sit downstream of the audience-building post before writing any SQL:

The AMC Lookalike Audiences for Promotional Events playbook: the source of the Prime Day lead-in and lead-out logic, plus the expectation that performance is evaluated after activation rather than at seed creation.
The Introduction to AMC Lookalike Audiences instructional query: the expansion-type context that explains why Balanced, Similar, and Broad should be evaluated differently.
The Customer Journey Analytics playbook: the follow-on analysis path for checking whether the lookalike is creating new conversion journeys or entering journeys the rule-based audience already covered.
The DSP and AMC conversion reporting table notes: the table-shape reminders for dsp_impressions, dsp_clicks, and amazon_attributed_events_by_traffic_time.
The Subscribe & Save lift analysis pattern: the upstream check that tells the operator whether SnS behavior was strong enough to be a seed in the first place.

The seed can also be rebuilt around Amazon Standard Identification Number (ASIN) behavior if the evaluation shows that the subscription-buyer seed delivered reach but did not produce enough new customers.

The retrieved set matters because this is not just a measurement query. It is a decision about whether the next seed should stay SnS, rotate to multi-ASIN purchasers, or use a spend threshold.

The agent's working output: AMC lookalike lift SQL

The agent produced an evaluation skeleton that compares the activated lookalike line item with a prior rule-based audience baseline. The placeholder IDs are intentionally visible so the operator can audit what is being compared before running it.

/* AMC lookalike audience evaluation skeleton
   Run after the promotional flight closes and attribution has settled.
   Last verified: 2026-05-13.

   Replace:
   - LOOKALIKE_CAMPAIGN_ID with the Prime Day lookalike line item/campaign
   - BASELINE_CAMPAIGN_ID with last year's rule-based audience campaign
   - flight windows with matched Prime Day periods

   This is a skeleton: confirm column names, data types, and cost units
   against your AMC instance schema before running.
*/

WITH comparison_windows AS (
  SELECT
    'lookalike_2026' AS cohort,
    TIMESTAMP '2026-07-01 00:00:00' AS window_start,
    TIMESTAMP '2026-07-15 23:59:59' AS window_end,
    'LOOKALIKE_CAMPAIGN_ID' AS campaign_id
  UNION ALL
  SELECT
    'rule_based_2025' AS cohort,
    TIMESTAMP '2025-07-01 00:00:00' AS window_start,
    TIMESTAMP '2025-07-15 23:59:59' AS window_end,
    'BASELINE_CAMPAIGN_ID' AS campaign_id
),
impressions AS (
  SELECT
    w.cohort,
    i.user_id,
    COUNT(*) AS impressions,
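    -- AMC reports DSP cost fields in millicents; dividing by 100000 puts spend
    -- in the same currency units as sales. Verify the unit in your instance.
    SUM(i.total_cost) / 100000.0 AS spend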
  FROM comparison_windows w
  JOIN dsp_impressions i
    ON i.campaign_id = w.campaign_id
   AND i.impression_dt_utc BETWEEN w.window_start AND w.window_end
  WHERE i.user_id IS NOT NULL
  GROUP BY 1, 2
),
conversions AS (
  SELECT
    w.cohort,
    a.user_id,
    COUNT(*) AS orders,
    SUM(a.total_product_sales) AS sales,
    SUM(CASE WHEN a.new_to_brand = TRUE THEN 1 ELSE 0 END) AS ntb_orders
  FROM comparison_windows w
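  JOIN amazon_attributed_events_by_traffic_time a
    ON a.campaign_id = w.campaign_id
   -- Verify which timestamp this column holds in your schema. Bounding the ad-traffic
   -- time keeps late conversions from in-flight traffic; bounding conversion time drops them.
   AND a.event_dt_utc BETWEEN w.window_start AND w.window_end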
  WHERE a.conversion_event_subtype = 'order'
    AND a.user_id IS NOT NULL
  GROUP BY 1, 2
),
cohort_rollup AS (
  SELECT
    i.cohort,
    COUNT(DISTINCT i.user_id) AS reached_users,
    SUM(i.impressions) AS impressions,
    SUM(i.spend) AS spend,
    COUNT(DISTINCT c.user_id) AS converting_users,
    SUM(c.orders) AS orders,
    SUM(c.sales) AS sales,
    SUM(c.ntb_orders) AS ntb_orders
  FROM impressions i
  LEFT JOIN conversions c
    ON i.cohort = c.cohort
   AND i.user_id = c.user_id
  GROUP BY 1
)
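SELECT
  cohort,
  reached_users,
  spend,
  orders,
  sales,
  ntb_orders,
  -- Multiply by 1.0 so count-over-count ratios are not truncated by integer division.
  -- conversion_rate here is orders per reached user; use converting_users from
  -- cohort_rollup instead if you want a user-level rate.
  1.0 * orders / NULLIF(reached_users, 0) AS conversion_rate,
  1.0 * ntb_orders / NULLIF(orders, 0) AS ntb_share,
  spend / NULLIF(ntb_orders, 0) AS cost_per_ntb_order,
  sales / NULLIF(spend, 0) AS roas
FROM cohort_rollup
ORDER BY cohort;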

Three choices matter in that SQL. First, the comparison is campaign-scoped, not account-scoped. The lookalike line item should not borrow credit from branded search, retargeting, or other Prime Day tactics.

Second, the windows are explicit. If you compare a 2026 lead-in window to a 2025 full-event window, the result will look precise and be useless. The agent forces both cohorts into matched promotional periods.

Third, the output includes cost_per_ntb_order, not just return on ad spend (ROAS). For a lookalike audience, the question is not only whether it sold product. The question is whether it found new customers at a cost worth repeating in Amazon DSP activation. Note that the query counts new-to-brand orders, so cost_per_ntb_order stands in for cost per new-to-brand customer in the readout.
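
Once the query returns its two rows, the lift read is a relative change per metric between the cohorts. The Python sketch below is one way to do that outside AMC. The row values are invented, the function names are hypothetical, and None is assumed to represent a SQL NULL, which the query produces whenever a NULLIF denominator is zero.

```python
from typing import Optional

# Ratio columns emitted by the final SELECT, plus reach.
METRICS = ("reached_users", "conversion_rate", "ntb_share", "cost_per_ntb_order", "roas")


def relative_change(new: Optional[float], old: Optional[float]) -> Optional[float]:
    """Return (new - old) / old, or None when a value is NULL or the baseline is 0."""
    if new is None or old is None or old == 0:
        return None
    return (new - old) / old


def cohort_lift(lookalike: dict, baseline: dict) -> dict:
    """Relative change of the lookalike row against the baseline row, per metric."""
    return {m: relative_change(lookalike.get(m), baseline.get(m)) for m in METRICS}


# Hypothetical query output for illustration only.
lookalike_row = {"cohort": "lookalike_2026", "reached_users": 200_000,
                 "conversion_rate": 0.003, "ntb_share": 0.5,
                 "cost_per_ntb_order": 40.0, "roas": 2.25}
baseline_row = {"cohort": "rule_based_2025", "reached_users": 160_000,
                "conversion_rate": 0.003, "ntb_share": 0.4,
                "cost_per_ntb_order": 50.0, "roas": 2.5}

lift = cohort_lift(lookalike_row, baseline_row)
for metric, change in lift.items():
    print(metric, "n/a" if change is None else f"{change:+.1%}")
```

Returning None instead of raising on a missing or zero baseline keeps an empty cohort visible in the readout rather than hiding it behind a division error.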

The footnotes the agent surfaces before you trust the lift read

Things Atlas surfaced that the operator did not ask for:
Do not evaluate before attribution settles. A same-day read can undercount late attributed conversions and distort the lookalike result.
Spend-normalize before calling lift. More spend usually means more orders. Efficiency metrics decide whether the audience actually improved the plan.
Compare to the right baseline. Last year's rule-based audience is a better baseline than total Prime Day performance because it isolates audience strategy.
Treat expansion type as a test variable. If Balanced scaled reach but lost new-to-brand efficiency, test Similar next. If Similar is efficient but too small, test Broad.
Use path analysis as the next diagnostic. If the lookalike appears in the same journeys as existing retargeting, it may not be creating new demand.

The footnotes change the shape of the readout. Without them, the operator gets a table. With them, the operator gets a decision tree.

What happens next: keep, rotate, or rebuild the seed

The agent turns the output into three possible decisions. If the lookalike improves new-to-brand share and lowers cost per new-to-brand order, keep the seed and test a tighter expansion type. If reach improves but efficiency falls, rotate the seed from SnS subscribers to multi-ASIN purchasers or a spend threshold. If the rule-based audience still wins on conversion rate and cost, keep lookalikes in the lead-in phase and use rule-based audiences for lead-out retargeting.

This is where the standards-based setup matters operationally. If you are in Claude, ask for the recommendation in plain language. If your team works in ChatGPT, ask the same question there. If the marketing analyst wants to inspect the SQL, open it in Cursor or Codex. If the read should run every Monday after a promotional flight, schedule it in n8n or Make. Kuudo speaks MCP, so the same Amazon context follows the work into whichever standards-supporting client your team prefers.

The follow-on analysis is the conversion path. If the lookalike wins, use the path-to-conversion Sankey workflow to see whether it is creating new journeys or merely joining paths your existing campaigns already owned.

Why standards-based AI clients make this more useful

This is not a Claude workflow, a ChatGPT workflow, or a Cursor workflow. It is an Amazon Marketing Cloud workflow exposed through a standard interface. The useful part is the stable context: the tables, playbooks, thresholds, and audience history. The chat or coding tool is just where the operator happens to be working.

That is the same reason the evaluation produces an action instead of a report. The agent can move from question to SQL to scheduling because the context is portable. It can answer in chat, edit in code, and run in a workflow tool without rewriting the operating logic.

If your lookalike readout stops at reach and ROAS, it has not answered the Prime Day question.

Next: use path-to-conversion analysis to see whether the lookalike audience created new customer journeys or just overlapped with existing retargeting paths.

Keep exploring this topic

Use these companion guides to understand the inputs, follow-on analysis, and adjacent workflows behind this playbook.

Before this: build and size the Prime Day seed

Use the seed SQL, sizing query, and API payload checks before any evaluation exists.

Next step: map the conversion path

Use path-to-conversion analysis to see whether the lookalike audience is creating new journeys or just entering existing ones.

Also useful: Subscribe & Save lift

Validate whether subscription behavior is strong enough to remain the seed strategy.

FAQ

How do I evaluate an AMC lookalike audience after Prime Day?

Compare the activated lookalike audience against a matched rule-based audience or prior-year baseline on reach, conversion rate, new-to-brand share, cost per new-to-brand customer, and spend-normalized lift.

How long should I wait before reading AMC audience performance?

Wait until the promotional flight has closed and attribution has settled. The representative job configuration in this post waits 14 days after flight end. Reading too early can undercount conversions and make the lookalike audience look weaker than it is.

What does it mean if lookalike reach is high but new-to-brand share is low?

The audience is scaling but not finding meaningfully new customers. Tighten the expansion type or rebuild the seed around stronger loyalty or high-value signals.

Should I compare a lookalike audience to a rule-based audience?

Yes. Rule-based audiences are the practical baseline because they show how known, directly targeted users performed without model expansion.

Why does my lookalike audience look worse than last year's audience?

The common causes are mismatched flight windows, unmatched spend levels, an expansion setting that is too broad, or a seed that captured high-reach users instead of high-value buyers.

Can I run this evaluation from ChatGPT, Claude, Cursor, or n8n?

Yes, if those clients connect to the same standards-based MCP server. Ask in ChatGPT or Claude, edit the SQL in Cursor or Codex, and schedule the recurring read in n8n or Make.
