Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas

Key Takeaways

  • Behavioral logs provide rich signals for user modeling, but are noisy and interleaved across diverse intents.
  • Recent work uses LLMs to generate interpretable natural-language personas from user logs, yet evaluation often emphasizes downstream utility, providing limited assurance of persona quality itself.
  • We propose a hierarchical framework that aggregates user actions into intent memories and induces multiple evidence-grounded personas by clustering and labeling these memories.
  • Experiments on a large-scale service log and two public datasets show that our method induces more coherent, evidence-grounded, and trustworthy personas, while also improving future interaction prediction.
Paper Abstract

Behavioral logs provide rich signals for user modeling, but are noisy and interleaved across diverse intents. Recent work uses LLMs to generate interpretable natural-language personas from user logs, yet evaluation often emphasizes downstream utility, providing limited assurance of persona quality itself. We propose a hierarchical framework that aggregates user actions into intent memories and induces multiple evidence-grounded personas by clustering and labeling these memories. We formulate persona induction as an optimization problem over persona quality (captured by cluster cohesion, persona-evidence alignment, and persona truthfulness) and train the persona model using a groupwise extension of Direct Preference Optimization (DPO). Experiments on a large-scale service log and two public datasets show that our method induces more coherent, evidence-grounded, and trustworthy personas, while also improving future interaction prediction.

Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas
This paper introduces a new framework designed to turn messy, fragmented user activity logs into clear, reliable, and human-readable "personas." While many systems currently use AI to summarize user behavior, these summaries often lack transparency or fail to accurately reflect the underlying data. The authors propose a hierarchical method that organizes raw actions into meaningful "intent memories," which are then grouped into distinct personas. By training models to prioritize specific quality standards, the researchers aim to create user profiles that are not only useful for predicting future behavior but are also grounded in actual evidence and free from hallucinations.

From Raw Logs to Structured Memories

The process begins by summarizing a user's daily behavioral logs (such as search queries and clicks) into "intent memories" with an AI model. These memories represent specific goals or interests. Instead of treating a user's history as a single, jumbled stream, the framework clusters these memories into multiple personas. Each persona is a distinct profile of one facet of the user's behavior, and, crucially, each is linked to an explicit set of supporting evidence: the specific memories used to create it. This ensures that every part of a persona can be traced back to the user's actual actions.
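The memory-to-persona step can be sketched in code. This is a toy illustration only, not the paper's implementation: the authors use LLMs to summarize actions into memories and to label clusters, whereas the sketch below clusters plain keyword sets with a greedy single-pass rule and labels each persona by its most frequent evidence keywords. The names (`Persona`, `induce_personas`) and the Jaccard threshold are assumptions for the example; the key property it preserves is that every persona carries an explicit evidence list pointing back to concrete memories.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Persona:
    label: str
    evidence: list  # indices of the intent memories that support this persona

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

def induce_personas(memories, threshold=0.1):
    # Greedy single-pass clustering over keyword sets -- a crude stand-in
    # for the paper's clustering of LLM-summarized intent memories.
    clusters = []  # each entry: (keyword union, member indices)
    for i, mem in enumerate(memories):
        kws = set(mem.lower().split())
        best, best_sim = None, threshold
        for c in clusters:
            sim = jaccard(kws, c[0])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append((kws, [i]))
        else:
            best[0].update(kws)
            best[1].append(i)
    personas = []
    for kws, members in clusters:
        # Label each persona by its most frequent evidence keywords; the
        # evidence list keeps the persona traceable to concrete actions.
        counts = Counter(w for m in members for w in memories[m].lower().split())
        label = " / ".join(w for w, _ in counts.most_common(2))
        personas.append(Persona(label=label, evidence=members))
    return personas

memories = [
    "compare trail running shoes",
    "trail running race schedule",
    "python pandas groupby tutorial",
    "python dataframe merge examples",
]
for p in induce_personas(memories):
    print(p.label, "<-", [memories[i] for i in p.evidence])
```

Separating the "athletic" and "programming" activity into two personas with disjoint evidence sets is exactly the behavior the framework's multi-persona design targets, rather than compressing both into one blended profile.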

Defining and Measuring Persona Quality

A central challenge in user modeling is determining what makes a "good" persona. The authors define persona quality through three core criteria:

  • Cohesion: The memories grouped into a single persona must be semantically consistent and distinct from other memories.

  • Alignment: The persona’s label and description must accurately reflect the shared traits of the assigned memories.

  • Truthfulness: The persona must be supported by the evidence and avoid overgeneralizing or inventing information that isn't present in the logs.

To enforce these standards, the researchers developed a reward-based training system. They use Direct Preference Optimization (DPO), extended to operate groupwise over candidate personas, which trains the model to prefer outputs that score highly on these quality metrics. This teaches the model to generate personas that are both accurate and trustworthy.
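The quality criteria and the preference signal they produce can be mocked up concretely. The sketch below is a hedged illustration, not the paper's actual reward: `cohesion` and `truthfulness` here are crude keyword proxies (alignment would be scored analogously, comparing a persona's description against its memories), and a real pipeline would score LLM-generated candidate personas before feeding chosen/rejected pairs into DPO training.

```python
from itertools import combinations

def words(text: str) -> set:
    return set(text.lower().split())

def cohesion(evidence) -> float:
    # Average pairwise keyword overlap among a persona's evidence memories.
    pairs = list(combinations(evidence, 2))
    if not pairs:
        return 1.0
    return sum(
        len(words(a) & words(b)) / len(words(a) | words(b)) for a, b in pairs
    ) / len(pairs)

def truthfulness(label: str, evidence) -> float:
    # Fraction of label terms appearing somewhere in the evidence -- a crude
    # proxy for "no invented traits".
    support = set().union(*(words(m) for m in evidence))
    terms = words(label)
    return len(terms & support) / len(terms)

def quality(label: str, evidence) -> float:
    # Equal weighting is an arbitrary choice for this example.
    return 0.5 * cohesion(evidence) + 0.5 * truthfulness(label, evidence)

evidence = ["compare trail running shoes", "trail running race schedule"]
candidates = ["trail running gear", "marathon coaching business"]  # second is unsupported
ranked = sorted(candidates, key=lambda c: quality(c, evidence), reverse=True)
preference_pair = {"chosen": ranked[0], "rejected": ranked[1]}
print(preference_pair)
```

The unsupported label scores zero on truthfulness, so it lands on the "rejected" side of the pair; preference-based training then pushes the persona model away from producing such hallucinated labels.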

Improving Performance and Reliability

The experimental results demonstrate that this quality-focused approach does more than just create better-looking profiles; it also improves the model's ability to predict future user interactions. By ensuring that personas are grounded in evidence, the system becomes more effective at understanding a user's long-term preferences. The researchers tested their method across different datasets, including search and shopping logs, and found that their approach consistently outperformed traditional clustering-based methods and general-purpose large language models. This suggests that explicitly optimizing for persona quality is a highly effective strategy for building more transparent and useful personalization systems.
