
AMC Lookalike Audience for Prime Day: Seed SQL, Sizing Query, and the 500-500,000 user_id Window

How an Atlas-grounded agent chooses a Prime Day seed, sizes it before submission, and turns it into an AMC Audiences API payload.

Six weeks out from Prime Day, an operator asks for a lookalike audience built on best customers. An agent grounded in Amazon Agent Atlas returns three seed strategies, a sizing query, API payload guidance, and the playbook footnotes that decide whether the audience activates.

Maintained by Kuudo
TL;DR

Amazon Marketing Cloud (AMC) lookalike audience seeds should contain between 500 and 500,000 distinct user_id values before submission. The seed query runs in AMC Audiences against conversions_for_audiences, while the sizing companion runs in the main AMC editor against conversions_all. Do not end the seed query on a -- comment line, and submit the Prime Day audience with a fixed time window plus refreshRateDays set to 7.

Environment requirements

Table variants used
conversions_for_audiences, conversions_all
Required subscriptions
Amazon Marketing Cloud access, AMC Audiences access, Flexible Shopping Insights only if using Subscribe & Save event subtypes
Lookback window
Use last year's promotional event window for Prime Day seed construction; size the seed immediately before submission.
API compatibility
AMC Audiences API through Amazon Ads API
Schema version
AMC Audiences playbook pattern, verified against Atlas corpus notes current to 2026-05-13
Last verified
2026-05-13

What success and failure look like

| Seed strategy | Sizing result | Status | Operator decision |
| --- | --- | --- | --- |
| Subscribe & Save subscribers | 28,400 user_ids | Activates | Strong brand-affinity seed; test first if SnS behavior is strategically important. |
| Multi-ASIN purchasers | 161,750 user_ids | Activates | Broad cross-sell seed; likely the safest first Prime Day prospecting audience. |
| Total spend >= $250 | 4,890 user_ids | Activates | High-value long-tail seed; monitor reach because it can be narrow. |
| Total spend >= $500 | 470 user_ids | Fails refresh | Below the 500-user minimum; loosen the threshold before submitting. |

Query-to-API payloads

AMC Audiences lookalike submission payload

Representative payload shape after the seed query and sizing companion pass validation.

{
  "audienceName": "prime_day_sns_lookalike_balanced_2026",
  "advertiserId": "1234567890",
  "audienceType": "LOOKALIKE",
  "query": "<flattened seed SQL ending with GROUP BY 1>",
  "timeWindowStart": "2025-07-01T00:00:00Z",
  "timeWindowEnd": "2025-07-15T23:59:59Z",
  "timeWindowRelative": false,
  "refreshRateDays": 7,
  "lookalikeAudienceExpectedReach": "BALANCED",
  "destination": { "type": "AMAZON_DSP" }
}
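The same shape can be assembled and sanity-checked programmatically before submission. A minimal Python sketch, assuming only the field names shown in the payload above; `build_lookalike_payload` is a hypothetical helper, not part of any Amazon SDK, and submission and auth are out of scope:

```python
import json

def build_lookalike_payload(audience_name, advertiser_id, seed_sql,
                            window_start, window_end,
                            expected_reach="BALANCED", refresh_days=7):
    """Assemble the lookalike payload shape shown above, refusing
    seed SQL whose last non-blank line is a -- comment."""
    last_line = seed_sql.rstrip().splitlines()[-1].lstrip()
    if last_line.startswith("--"):
        raise ValueError("seed SQL must not end on a comment line")
    return {
        "audienceName": audience_name,
        "advertiserId": advertiser_id,
        "audienceType": "LOOKALIKE",
        "query": " ".join(seed_sql.split()),  # flatten to one line
        "timeWindowStart": window_start,
        "timeWindowEnd": window_end,
        "timeWindowRelative": False,          # fixed promotional window
        "refreshRateDays": refresh_days,
        "lookalikeAudienceExpectedReach": expected_reach,
        "destination": {"type": "AMAZON_DSP"},
    }

payload = build_lookalike_payload(
    "prime_day_sns_lookalike_balanced_2026", "1234567890",
    "SELECT user_id\nFROM conversions_for_audiences\nGROUP BY 1",
    "2025-07-01T00:00:00Z", "2025-07-15T23:59:59Z")
print(json.dumps(payload, indent=2))
```

The comment-line guard at the top mirrors the submission restriction discussed later in this guide.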

The Slack message landed six weeks before Prime Day: "Can we build a lookalike audience of customers similar to our best buyers, sized correctly so it actually activates?"

This sounds like one SQL query. It is really five decisions: which customers count as "best," which Amazon Marketing Cloud table variant allows SELECT user_id, which Amazon Standard Identification Number (ASIN) filters belong in the seed, how large the seed can be before the lookalike model refuses it, and how to size the seed before spending two days waiting on a refresh. Get one of those wrong and the audience either fails silently or trains on the wrong population.

So I asked our agent. The agent has Amazon Agent Atlas behind it: a corpus of AMC playbooks, instructional queries, and audience patterns indexed for semantic retrieval. Here's what came back.

What a model without Atlas gets wrong in AMC lookalike seed generation

I ran the same prompt through a frontier model with no retrieval. The output was confident, would have compiled in the wrong query editor, and was wrong about every threshold that decides whether the audience activates.

Five failure modes in the un-grounded response:

  1. It picked one seed definition, "top 20% of spenders," and committed to it. The actual playbook recommends three distinct seed strategies tested separately first: Subscribe & Save (SnS) subscribers, multi-ASIN purchasers, and a total-spend threshold.
  2. It quoted a seed size minimum of "a few thousand users" and no upper bound. The real guardrail is specific: the seed should contain between 500 and 500,000 distinct user_id values.
  3. It wrote the seed query against conversions and used SELECT DISTINCT user_id in the outer query. AMC Audiences requires the _for_audiences table family, specifically conversions_for_audiences, for this seed shape.
  4. It did not size the seed before submission. The playbook has a companion measurement query that runs in the main editor against conversions_all and returns the seed count before the audience refresh starts.
  5. It ended the query on a -- comment line. AMC Audiences can reject submissions whose last line is a comment, so the template intentionally ends on executable SQL.

None of these failures look wrong on a casual read. They are the kind of mistakes you catch after submitting an audience, waiting for refresh, and watching the status sit there.
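Several of these failure modes are mechanical enough to lint for before submission. A hedged sketch covering only the rules listed above; `lint_seed_query` is an illustrative helper, not an AMC feature:

```python
def lint_seed_query(sql: str) -> list[str]:
    """Flag the mechanical failure modes above; returns problems found."""
    problems = []
    lines = [ln.strip() for ln in sql.strip().splitlines() if ln.strip()]
    if lines and lines[-1].startswith("--"):
        problems.append("query ends on a -- comment line")
    if "conversions_for_audiences" not in sql.lower():
        problems.append("seed is not reading the _for_audiences table family")
    return problems

# The un-grounded model's query hits both lintable failures at once.
bad = "SELECT DISTINCT user_id FROM conversions\n-- TODO tighten filter"
print(lint_seed_query(bad))
```

Seed sizing cannot be linted statically; that still requires the companion measurement query described below.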

What Atlas retrieves for an AMC Prime Day lookalike audience

When the agent gets the question, it does a semantic search across the amazon_ads collection and pulls five chunks before writing a line of SQL:

  • The AMC Lookalike Audiences for Promotional Events playbook, version-tagged 2023-10-01: the canonical Prime Day and Black Friday workflow covering ASIN selection, seed creation, flight-time analysis, and activation.
  • The Introduction to AMC Lookalike Audiences instructional query (IQ): the mechanics behind seed scoring, the five expansion types, and the three-seed template for high-value customers.
  • The Companion measurement query chunk from the lookalike audiences IQ: the SELECT COUNT(user_id) FROM (...) pattern that runs in the main AMC editor.
  • The Creating Audiences Based on High Value Customer Segments playbook: the percentile-rank variant for "top X% by spend" seeds.
  • The Flexible Shopping Insights Trial Guide Section 5: SnS-specific seed patterns, including firstSnSOrder and repeatSnSOrder event subtype notes for advertisers with Flexible Shopping Insights (FSI).

The agent does not invent the SQL. It surfaces the right template with the right caveats, then adapts it.

Seed strategies and expansion types for AMC lookalike audiences

The playbook gives the operator two taxonomies to make explicit before submission.

| Taxonomy | Option | When to use it |
| --- | --- | --- |
| Seed strategy | SnS subscribers | Use when repeat subscription behavior is the clearest signal of loyalty. |
| Seed strategy | Multi-ASIN purchasers | Use when cross-catalog buying is more important than a single-product purchase. |
| Seed strategy | Total spend threshold | Use when revenue concentration matters more than purchase frequency. |
| Expansion type | Most Similar | Start here when performance matters more than reach. |
| Expansion type | Similar | Use when the seed is strong but the campaign needs more scale. |
| Expansion type | Balanced | Practical default for Prime Day prospecting when you need both reach and relevance. |
| Expansion type | Broad | Use when the seed is valid but projected audience size is too constrained. |
| Expansion type | Most Broad | Use for reach-first testing, not for the first high-efficiency launch. |

This block matters because it prevents the model from collapsing three separate operator decisions into one vague "high-value lookalike" audience.
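One way to keep the expansion-type decision explicit in tooling is to encode the ladder from tightest to loosest. A small sketch; the API enum spellings other than BALANCED (which appears in the payload earlier in this guide) are assumptions:

```python
# The five expansion types ordered tightest to loosest, per the table above.
# Spellings other than "BALANCED" are assumed, not confirmed API values.
EXPANSION_LADDER = ["MOST_SIMILAR", "SIMILAR", "BALANCED", "BROAD", "MOST_BROAD"]

def loosen(current: str) -> str:
    """Step one rung looser when the seed is valid but projected reach
    is too constrained; stays at MOST_BROAD once reached."""
    i = EXPANSION_LADDER.index(current)
    return EXPANSION_LADDER[min(i + 1, len(EXPANSION_LADDER) - 1)]

print(loosen("BALANCED"))
```

Stepping one rung at a time keeps the reach-performance tradeoff a deliberate choice rather than a jump to the loosest setting.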

The agent's working output: AMC seed SQL for Subscribe & Save

The agent produced two artifacts. First, the seed query, lifted from the three-seed template in the Introduction to AMC Lookalike Audiences instructional query with the optional clauses set for the SnS strategy:

/* Audience instructional query: Introduction to lookalike audiences (High Value Customers)
   Run in the AMC Audiences query editor, not the main editor.
   Last verified: 2026-05-13.

   Three seed strategies are supported below. Test them separately first:
   [1 of 4]: ASIN filter
   [2 of 4]: SnS subscribers
   [3 of 4]: Multi-ASIN purchasers
   [4 of 4]: Total purchase value threshold

   Keep the final GROUP BY 1 as executable SQL so the query does not end
   on a comment line.
*/

WITH user_sales_cte AS (
  SELECT
    user_id,
    CASE
      WHEN event_subtype = 'snsSubscription' THEN 1
      ELSE 0
    END AS sns_flag,
    SUM(total_units_sold) AS total_purchases,
    SUM(total_product_sales) AS total_product_sales,
    COUNT(DISTINCT tracked_item) AS unique_items_purchased
  FROM conversions_for_audiences
  WHERE event_subtype IN ('snsSubscription', 'order')
    /* AND tracked_asin IN ('B0HERO0001','B0HERO0002') */
    AND user_id IS NOT NULL
  GROUP BY 1, 2
),
user_aggregate AS (
  SELECT
    user_id,
    CASE WHEN unique_items_purchased > 1 THEN 1 ELSE 0 END AS multi_purchase_flag,
    MAX(sns_flag) AS sns_flag,
    SUM(total_purchases) AS total_purchases,
    SUM(total_product_sales) AS total_product_sales,
    SUM(unique_items_purchased) AS unique_items_purchased
  FROM user_sales_cte
  GROUP BY 1, 2
),
audience_grouping AS (
  SELECT
    user_id,
    sns_flag,
    MAX(multi_purchase_flag) AS multi_purchase_flag,
    SUM(total_purchases) AS total_purchases,
    SUM(total_product_sales) AS total_product_sales,
    SUM(unique_items_purchased) AS unique_items_purchased
  FROM user_aggregate
  GROUP BY 1, 2
)
SELECT user_id
FROM audience_grouping
WHERE sns_flag = 1
  -- AND multi_purchase_flag = 1
  -- AND total_product_sales >= 250
GROUP BY 1;

The event_subtype IN ('snsSubscription', 'order') filter keeps the inner CTE focused on purchase behavior. That prevents cart additions, wishlist saves, or other engagement events from diluting the seed. Filtering there is cheaper and cleaner than trying to fix the population at the final WHERE.

The audience_grouping CTE looks redundant, and the agent kept it anyway. The playbook leaves that pass in place because the model trainer expects a clean row-grain. Stripping it can still work, but the failure mode is opaque enough that the safer version is worth the extra CTE.

The optional clauses stay visible because the operator should run SnS, multi-ASIN, and spend-threshold seeds separately before combining anything. Pre-baking that comparison into the template matches the way operators actually test seeds.

How to size an AMC seed audience before submission

Before submitting any seed to AMC Audiences, the agent produced the sizing companion. This is the part the un-grounded model skipped.

/* Companion measurement query.
   Run in the main AMC query editor, not the Audiences editor.
   Last verified: 2026-05-13.

   Change conversions_for_audiences to conversions_all for sizing.
   Keep SELECT user_id inside the subquery; COUNT() wraps it outside.
*/

SELECT COUNT(user_id) AS user_count
FROM (
  WITH user_sales_cte AS (
    SELECT
      user_id,
      CASE WHEN event_subtype = 'snsSubscription' THEN 1 ELSE 0 END AS sns_flag,
      SUM(total_units_sold) AS total_purchases,
      SUM(total_product_sales) AS total_product_sales,
      COUNT(DISTINCT tracked_item) AS unique_items_purchased
    FROM conversions_all
    WHERE event_subtype IN ('snsSubscription', 'order')
      AND user_id IS NOT NULL
    GROUP BY 1, 2
  ),
  user_aggregate AS (
    SELECT
      user_id,
      MAX(sns_flag) AS sns_flag
    FROM user_sales_cte
    GROUP BY 1
  )
  SELECT user_id
  FROM user_aggregate
  WHERE sns_flag = 1
  GROUP BY 1
);

The query returns one number: the distinct user_id count of the seed. If it is under 500, the audience refresh can fail. If it is over 500,000, the refresh can also fail. The practical habit is to stay comfortably inside the band so a seasonal data swing does not push the audience across either edge.
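The band check itself is one comparison, but encoding a safety margin makes the "comfortably inside the band" habit concrete. A sketch assuming the playbook's 500-500,000 limits; the 10% buffer is illustrative, not a documented threshold:

```python
SEED_MIN, SEED_MAX = 500, 500_000

def check_seed_size(user_count: int, margin: float = 0.10) -> str:
    """Classify a sizing-query result against the 500-500,000 band,
    flagging counts close enough to either edge that a seasonal data
    swing could push the refresh into failure."""
    if user_count < SEED_MIN or user_count > SEED_MAX:
        return "fail"
    if (user_count < SEED_MIN * (1 + margin)
            or user_count > SEED_MAX * (1 - margin)):
        return "near-edge"
    return "ok"

print(check_seed_size(470))      # the spend >= $500 seed from the table
print(check_seed_size(540))      # valid today, risky after a swing
print(check_seed_size(28_400))   # the SnS seed from the table
```

Treating "near-edge" as a prompt to loosen or tighten the seed filter avoids resubmitting after a failed refresh.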

The diagnostic table above is the operator checkpoint. The fourth row, Total spend >= $500 returning 470 users, is the negative result that saves the most time. Without the sizing companion, the operator would discover that failure only after submitting the audience.

The footnotes the agent surfaces for AMC Audiences

This is the part that separates an Atlas-grounded agent from a fluent one. The agent did not wait to be asked. It surfaced the caveats the operator was about to need:

Things Atlas surfaced that the operator did not ask for:

  1. Seed size has hard boundaries. The seed should contain 500 to 500,000 user_id values. Always run the sizing query first.
  2. Test the three seed strategies separately. Combining SnS, multi-ASIN, and spend thresholds too early can create a seed that is technically valid but strategically empty.
  3. Expansion type is a decision, not a default. Balanced is a practical starting point, but Most Similar, Similar, Broad, and Most Broad change the reach-performance tradeoff.
  4. Lookalikes address the non-ad-exposed gap. Rule-based audiences are useful for remarketing; Prime Day prospecting usually needs users who share seed traits but were not already in the ad-exposed pool.
  5. The query should not end on a comment line. Keep an executable final line such as GROUP BY 1.
  6. The promotional-event variant uses a fixed window. For Prime Day, the API payload should use the prior promotional window rather than a rolling relative window.

Any one of these can eat an afternoon after submission. Getting all six before the first API call is the difference between a day-one launch and a day-three debugging thread.

What happens next: submit the AMC Audiences API payload

The seed query, sizing query, and diagnostic table close the loop on creation but not activation. The next move is to flatten the chosen seed SQL into an AMC Audiences API payload, submit it with audienceName, advertiserId, timeWindowStart, timeWindowEnd, refreshRateDays, timeWindowRelative, and lookalikeAudienceExpectedReach, then wait for the audience to become available for Amazon demand-side platform (DSP) activation.

For Prime Day, timeWindowRelative should be false because the seed is tied to a fixed promotional window. refreshRateDays: 7 keeps the audience fresh without turning the seed into a rolling interpretation of last year's event. lookalikeAudienceExpectedReach: "BALANCED" is a defensible first pass because it gives the team enough scale to test without starting at the loosest expansion setting.
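Flattening the chosen seed SQL into the payload's query field can be mechanical: strip block and line comments, collapse whitespace, and keep the executable tail. A sketch; it is naive about comment markers inside string literals, which the seed template does not use:

```python
import re

def flatten_seed_sql(sql: str) -> str:
    """Strip /* */ and -- comments, then collapse whitespace to one line,
    so the flattened query cannot end on a comment."""
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)  # block comments
    sql = re.sub(r"--[^\n]*", " ", sql)                    # line comments
    flat = " ".join(sql.split())
    if not flat:
        raise ValueError("nothing left after stripping comments")
    return flat

flat = flatten_seed_sql("""
/* header comment */
SELECT user_id
FROM conversions_for_audiences  -- seed table
GROUP BY 1
""")
print(flat)
```

Because comments are removed entirely, the flattened string always ends on executable SQL, which sidesteps the comment-line rejection described earlier.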

The evaluation pass comes after activation. Compare the lookalike audience against last year's rule-based audience baseline, then decide whether to keep the SnS seed, rotate to multi-ASIN purchasers, or loosen the spend threshold.

Why this matters for Prime Day audience activation

A lookalike audience that fails silently is worse than no audience. The operator has staged creative and media against an audience assumption, and the refresh can still be pending when the event opens. The rules that decide success are knowable, but they are scattered across a playbook, an instructional query, an API reference, and paid-feature notes.

Atlas is not a model upgrade. It is the corpus made available at the moment of need. The agent did not have to remember the 500-to-500,000 band, the _for_audiences suffix rule, the fixed-window payload, or the comment-line restriction. It looked them up.

If your agents are guessing at AMC seed sizes, they do not have to be.

Next: use the flight-time conversion workflow to evaluate whether the shipped lookalike audience outperformed the rule-based audience from the previous Prime Day.

Related reading

Keep exploring this topic

Use these companion guides to understand the inputs, follow-on analysis, and adjacent workflows behind this playbook.

Start here
Before this: validate whether SnS should be a seed

Use Subscribe & Save lift analysis to decide whether subscribers are valuable enough to seed prospecting.

Next step
After this: evaluate the activated audience

Compare the shipped lookalike audience against conversion-path and flight-time performance signals.

Also useful
Related: retargeting audience sizing

The cart-abandoner workflow uses the same _for_audiences table family and pre-submit sizing habit.

FAQ

What seed size does an AMC lookalike audience need?

Use a seed between 500 and 500,000 distinct user_id values. Below 500 or above 500,000, the refresh can fail before the lookalike audience becomes usable.

Can AMC lookalike seeds use the conversions table?

No. The seed query should use conversions_for_audiences in the AMC Audiences editor. Use conversions_all only in the companion sizing query that runs in the main AMC editor.

Why does AMC say user_id is not selectable?

That usually means the query is running against the wrong table variant for an audience build. Switch the seed query from conversions or conversions_all to conversions_for_audiences.

Why did my AMC audience refresh fail with no useful error?

The most common causes are seed size outside the 500-500,000 user_id window, a query ending on a -- comment line, or submitting the sizing-query table variant instead of the Audiences table variant.

Which AMC lookalike expansion type should I start with for Prime Day?

Balanced is a practical starting point because it trades off reach and similarity. Move toward Most Similar for tighter performance or Broad and Most Broad when the seed is valid but reach is too constrained.

Should I combine Subscribe & Save, multi-ASIN, and spend-threshold seeds?

Test them separately first. Combining seed filters too early can create an over-specific audience that passes SQL validation but produces a weak or tiny lookalike model.
