Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit

Key Takeaways

  • The same prompt -- "best CRM software" -- reaches AI assistants from buyers in widely different contexts: a solo founder, an enterprise VP, a UK SMB owner.
  • We audit how strongly that contextual variation reshapes which brands the model recommends.
  • The effect is sharply prominence-stratified: category leaders are persona-resistant (~80% same-brand consistency across personas), but mid-market brands swap up to 75% of the recommendation set as the persona changes.
  • The Anthropic model shows a larger point-estimate effect than the OpenAI configurations, though clustered CIs overlap for the closer contrast (sonnet vs. OpenAI/high).
Paper Abstract

The same prompt -- "best CRM software" -- reaches AI assistants from buyers in widely different contexts: a solo founder, an enterprise VP, a UK SMB owner. We audit how strongly that contextual variation reshapes which brands the model recommends. The audit samples 2,000 runs over a design space of 10 personas x 8 prompts x 3 model configurations x N=10 reps, with the two OpenAI cells at full 8-prompt coverage and the Anthropic sonnet-4.6 / low cell at 4-prompt coverage. Prefixing the user message with a persona drops the recommendation-set similarity (Jaccard) by Delta = -0.12 to -0.20 relative to a same-persona baseline (clustered 95% CIs exclude zero on all three measured cells; the sonnet cell's CI rests on only 4 prompt clusters and is correspondingly wider). The effect is sharply prominence-stratified: category leaders are persona-resistant (~80% same-brand consistency across personas), but mid-market brands swap up to 75% of the recommendation set as the persona changes. The Anthropic model shows a larger point-estimate effect than the OpenAI configurations, though clustered CIs overlap for the closer contrast (sonnet vs. OpenAI/high); the asymmetry is consistent with Anthropic's more retrieval-unattributed generation route (43-52% recommendations without observed retrieval-layer evidence, vs OpenAI's 8-29%, documented in Jack 2026). Any measurement of AI brand perception must condition on the buyer persona supplying the query: the same prompt produces materially different recommendation sets depending on who the model thinks is asking, and a measurement protocol that aggregates across personas systematically obscures that variation. The effect concentrates at mid-market and is largest on the most priors-reliant generation route in our audit, consistent with persona responsiveness growing as models lean more on training-data priors and richer context integration.
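The stated total of 2,000 runs can be reproduced from the design described in the abstract. The sketch below is one reading of that design, assuming every persona, prompt, and repetition combination counts as a single run; the cell labels are shorthand chosen for illustration, not identifiers from the paper.

```python
# Design constants taken from the abstract.
PERSONAS = 10
REPS = 10

# Prompt coverage per model configuration. The keys are illustrative labels
# (assumptions), not the paper's own cell names.
PROMPTS_PER_CELL = {
    "openai_cell_1": 8,
    "openai_cell_2": 8,
    "anthropic_sonnet_low": 4,
}


def runs_per_cell(prompts, personas=PERSONAS, reps=REPS):
    """Runs in one model configuration: personas x prompts x repetitions."""
    return personas * prompts * reps


def total_runs(prompts_per_cell=PROMPTS_PER_CELL):
    """Total runs across all model configurations."""
    return sum(runs_per_cell(p) for p in prompts_per_cell.values())


print(total_runs())  # 800 + 800 + 400 = 2000
```

A full 10 x 8 x 3 x 10 grid would be 2,400 runs; the reduced 4-prompt coverage of the Anthropic cell accounts for the difference.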

Persona Conditioning of Brand Recommendations in Retrieval-Augmented Commercial Chat: A Prominence-Stratified Cross-Provider Audit
When you ask an AI for a recommendation, the answer is shaped by more than your prompt; it also depends on who the AI thinks you are. This research measures how much a user's persona (for example, a solo founder versus an enterprise executive) changes the brands an AI recommends. Across 2,000 audited runs, the authors show that recommendations are not persona-agnostic: they shift measurably with the persona supplied, most strongly for mid-market brands, so any measure of AI-driven brand perception depends on the buyer's identity.

How the Audit Was Conducted

The researchers tested three AI model configurations using 10 distinct personas and 8 common commercial prompts, with 10 repetitions per combination. The two OpenAI configurations covered all 8 prompts; the Anthropic sonnet-4.6 / low configuration covered 4. To isolate the effect of the persona, they kept the prompt, temperature, and system instructions constant and changed only the persona description prefixed to the user's message. They measured Jaccard similarity, the number of brands two recommendation lists share divided by the number of distinct brands across both lists, and compared it with a same-persona baseline to see how much the recommendations change when the persona changes. They also categorized brands by market prominence (from category leaders to regional players) to see whether some brands are more "persona-resistant" than others.
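The overlap measure can be sketched in a few lines. This is a minimal illustration, not the authors' code: the summary does not say how run pairs were formed or averaged, so the pairing scheme below (all within-persona pairs as the baseline, all cross-persona pairs as the comparison) is an assumption, and the brand names are placeholders.

```python
from itertools import combinations, product
from statistics import mean


def jaccard(a, b):
    """Shared brands divided by distinct brands across both lists."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty lists count as identical
    return len(a & b) / len(a | b)


def mean_within(runs):
    """Mean Jaccard over all pairs of runs from one persona (needs >= 2 runs)."""
    return mean(jaccard(x, y) for x, y in combinations(runs, 2))


def mean_between(runs_a, runs_b):
    """Mean Jaccard over all cross-persona pairs of runs."""
    return mean(jaccard(x, y) for x, y in product(runs_a, runs_b))


def persona_delta(runs_a, runs_b):
    """Cross-persona similarity minus the same-persona baseline."""
    baseline = mean([mean_within(runs_a), mean_within(runs_b)])
    return mean_between(runs_a, runs_b) - baseline


# Placeholder data: two repetitions per persona for one prompt.
founder_runs = [{"BrandA", "BrandB", "BrandC"}, {"BrandA", "BrandB", "BrandC"}]
enterprise_runs = [{"BrandA", "BrandD", "BrandE"}, {"BrandA", "BrandD", "BrandE"}]

delta = persona_delta(founder_runs, enterprise_runs)
print(round(delta, 2))  # negative: the persona change lowered the overlap
```

A negative delta means the lists overlap less across personas than they do across repeated runs of the same persona, which is the direction of the effect the audit reports.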

Key Findings: The Mid-Market Shift

The audit found that changing the persona lowers recommendation-set similarity by 0.12 to 0.20 on the Jaccard scale relative to a same-persona baseline, and the clustered 95% confidence intervals exclude zero in all three measured configurations. The effect is not uniform across brands. Category leaders (the most famous brands) are largely "persona-resistant," with roughly 80% same-brand consistency across personas. In contrast, mid-market brands are highly sensitive to persona, with up to 75% of the recommendation set changing with the user's context. The Anthropic model showed a larger point-estimate shift than the OpenAI configurations, which the authors describe as consistent with that model relying more on its internal training data than on retrieved search results.
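One way to see a prominence-stratified effect is to compute overlap separately within each prominence tier. The sketch below is an assumption about how such a breakdown could be operationalized; the summary does not specify how the tier-level figures were computed, and the tier labels and brand names are hypothetical.

```python
def jaccard(a, b):
    """Shared brands divided by distinct brands across both sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def stratified_jaccard(set_a, set_b, tier_of):
    """Jaccard similarity computed separately for each prominence tier.

    tier_of maps every brand to a tier label (hypothetical labels here).
    Only tiers that appear in at least one of the two sets are reported.
    """
    tiers = {tier_of[brand] for brand in set_a | set_b}
    return {
        tier: jaccard(
            {b for b in set_a if tier_of[b] == tier},
            {b for b in set_b if tier_of[b] == tier},
        )
        for tier in tiers
    }


# Hypothetical brands and tiers, for illustration only.
tier_of = {
    "LeaderCo": "leader",
    "MidOne": "mid",
    "MidTwo": "mid",
    "MidThree": "mid",
    "MidFour": "mid",
}
founder_set = {"LeaderCo", "MidOne", "MidTwo"}
enterprise_set = {"LeaderCo", "MidThree", "MidFour"}

by_tier = stratified_jaccard(founder_set, enterprise_set, tier_of)
print(by_tier["leader"], by_tier["mid"])  # leader tier stable, mid tier fully swapped
```

In this toy case the overall Jaccard is 1/5, but the breakdown shows that all of the turnover comes from the mid tier, which is the kind of pattern an aggregate score hides.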

Why This Matters for AI Recommendations

The findings challenge the idea that AI assistants provide a neutral, objective list of "best" products. Instead, the study suggests that AI models are already performing a form of de facto market segmentation. The pattern is consistent with models leaning on their internal "priors" (the patterns learned during training) and tailoring suggestions to the persona they perceive. A brand that wants to understand how AI perceives it therefore cannot rely on a single aggregate measurement, because aggregating across personas hides this variation. Brand perception must instead be analyzed per buyer persona, since the "best" recommendation depends on who the AI thinks is asking.

Important Considerations

While the results show a clear impact of persona on AI output, the researchers note a few limitations. The study covered a specific set of models and a hand-curated list of personas, so the results might not apply to every AI assistant. Although the Anthropic model showed a larger point-estimate effect, the clustered confidence intervals overlap for the closer contrast (sonnet vs. OpenAI/high), and the sonnet interval rests on only 4 prompt clusters, so the ranking of which model is "most" responsive to personas is not definitive. Finally, the researchers emphasize that the study measures the distribution of recommendations rather than the accuracy of the advice, as there is no single "correct" answer in commercial recommendations.
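To make the role of prompt clusters concrete, here is a minimal sketch of a percentile cluster bootstrap that resamples whole prompts rather than individual runs. The summary does not state which clustered-interval method the authors used, so this procedure is an assumption, and the delta values are made-up illustration data, not results from the paper.

```python
import random
from statistics import mean


def cluster_bootstrap_ci(deltas_by_prompt, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the mean delta, resampling prompts as clusters.

    deltas_by_prompt maps a prompt id to the list of per-comparison deltas
    observed for that prompt. Whole prompts are drawn with replacement, so
    correlation among runs of the same prompt is respected.
    """
    rng = random.Random(seed)
    prompts = list(deltas_by_prompt)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(prompts, k=len(prompts))
        pooled = [d for p in sample for d in deltas_by_prompt[p]]
        estimates.append(mean(pooled))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up deltas for four prompt clusters (illustration only).
example_deltas = {
    "prompt_1": [-0.10, -0.14],
    "prompt_2": [-0.20, -0.18],
    "prompt_3": [-0.12, -0.16],
    "prompt_4": [-0.15, -0.13],
}

lower, upper = cluster_bootstrap_ci(example_deltas)
print(lower < upper < 0)  # interval excludes zero for this toy data
```

With only four clusters there are few distinct resamples, so an interval built this way is coarse, which matches the caution about the 4-prompt cell.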
