Teaching Values to Machines: Simulating Human-Like...

Key Takeaways

  • Large Language Models (LLMs) demonstrate a remarkable capacity to adopt different personas and roles; however, it remains unclear whether they can manifest behavior that adheres to a coherent, human-like value structure.
  • In this work, we draw on established psychological value theory to induce human-like values in LLMs and assess their alignment with patterns observed in human studies.
  • Using validated psychological questionnaires, we conduct large-scale experiments -- over 5 million questions -- to evaluate value structures and value-behavior relationships in leading LLMs and compare them to humans.
  • Our findings reveal strong agreement between value-prompted LLMs and humans across both dimensions.

Paper Abstract

Large Language Models (LLMs) demonstrate a remarkable capacity to adopt different personas and roles; however, it remains unclear whether they can manifest behavior that adheres to a coherent, human-like value structure. In this work, we draw on established psychological value theory to induce human-like values in LLMs and assess their alignment with patterns observed in human studies. Using validated psychological questionnaires, we conduct large-scale experiments -- over 5 million questions -- to evaluate value structures and value-behavior relationships in leading LLMs and compare them to humans. Our findings reveal strong agreement between value-prompted LLMs and humans across both dimensions. Moreover, incorporating human value distributions enhances population-level simulations with value-induced LLMs. These findings highlight the potential of value-induced LLMs as effective, psychologically grounded tools for simulating human behavior.

Teaching Values to Machines: Simulating Human-Like Behavior in LLMs explores whether Large Language Models (LLMs) can be guided to adopt coherent, human-like value systems. Applying psychological value theory to LLMs, the researchers investigate whether the models can mirror the way humans prioritize values such as power, benevolence, or tradition, and whether those values shape decisions the way they do in people. The study aims to determine whether these models can serve as effective, psychologically grounded tools for simulating human behavior and social dynamics.

Inducing Human Values

To influence the models, the researchers used a "value-prompting" technique based on Schwartz’s theory of basic human values. This framework arranges ten basic values on a circular continuum, where adjacent values are compatible and opposing values conflict. By giving the LLMs specific descriptions of these values, the researchers steered the models toward distinct, consistent behavioral patterns. For example, models prompted to value "power" exhibited different political and social leanings than those prompted toward "self-transcendence" (the higher-order grouping of universalism and benevolence, which sits opposite power on the circle), mirroring the motivational conflicts observed in human psychology.
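
As a rough illustration of the two ideas in this section, the sketch below encodes the ten Schwartz values as a ring and builds a value-inducing prompt. The ordering is one common linearization of the circle, and the prompt wording and the names `SCHWARTZ_CIRCLE`, `circular_distance`, and `build_value_prompt` are assumptions made for this example, not the prompts or code used in the study.

```python
# One common linearization of the Schwartz circle (an assumption of this sketch).
# Neighbors in the list are adjacent on the circle, and the list wraps around.
SCHWARTZ_CIRCLE = [
    "self-direction", "stimulation", "hedonism", "achievement", "power",
    "security", "conformity", "tradition", "benevolence", "universalism",
]


def circular_distance(a: str, b: str) -> int:
    """Number of steps between two values along the shorter arc of the circle."""
    i, j = SCHWARTZ_CIRCLE.index(a), SCHWARTZ_CIRCLE.index(b)
    steps = abs(i - j)
    return min(steps, len(SCHWARTZ_CIRCLE) - steps)


def build_value_prompt(value: str, description: str) -> str:
    """Hypothetical system prompt asking a model to prioritize one value."""
    if value not in SCHWARTZ_CIRCLE:
        raise ValueError(f"unknown value: {value!r}")
    return (
        f"You are a person who places great importance on {value}. "
        f"{description} Answer every question as this person would."
    )


# Illustrative description only; the study's actual value descriptions are not shown here.
prompt = build_value_prompt(
    "power",
    "You care about social status, prestige, and control over people and resources.",
)
print(prompt)
print(circular_distance("power", "achievement"))   # 1: adjacent, expected to be compatible
print(circular_distance("power", "universalism"))  # 5: opposite, expected to conflict
```

Under the theory, a small circular distance should go with similar response patterns and a large one with opposing patterns, which is what a check of the "circular" structure looks for.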

Testing Against Human Behavior

The researchers ran a large-scale experiment of over 5 million questions across seven leading LLMs. They used validated psychological instruments, such as the Portrait Values Questionnaire and the Big Five Inventory, to measure how the models responded to social dilemmas, charitable causes, and personality assessments. The results showed strong alignment between the models and human data: the LLMs reproduced the "circular" structure of values, and the relationships between values and specific behaviors were consistent with those seen in human studies.
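
To make "value-behavior relationship" concrete, here is a minimal sketch of one way such a relationship could be measured: center each respondent's value ratings on their own mean (a common convention for Schwartz-style questionnaires), then correlate a value score with a behavioral measure across respondents. The summary does not say how the study scored responses, so the procedure, the function names, and all numbers below are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean


def center_scores(raw: dict) -> dict:
    """Subtract the respondent's own mean rating from each value score."""
    m = mean(raw.values())
    return {value: score - m for value, score in raw.items()}


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up ratings (1-6 scale) for three simulated respondents.
respondents = [
    {"power": 5, "benevolence": 2, "tradition": 3},
    {"power": 2, "benevolence": 6, "tradition": 4},
    {"power": 3, "benevolence": 4, "tradition": 3},
]
# Made-up behavioral measure, e.g. an amount given to a charitable cause.
donations = [1.0, 9.0, 5.0]

benevolence = [center_scores(r)["benevolence"] for r in respondents]
print(round(pearson(benevolence, donations), 3))
```

Running the same computation on model responses and on human data, then comparing the resulting correlations, is the kind of comparison the paragraph above describes.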

Simulating Populations

A key challenge in this research was moving from individual behavior to population-level simulation. Because human populations are diverse, the researchers tested different strategies for combining value-prompted LLMs to represent a society. They found that incorporating human-informed distributions, and in particular accounting for the fact that many people do not have a single dominant value, significantly improved the accuracy of the simulations. Using "naive" (unprompted) models to represent individuals without a dominant value proved to be the most effective strategy for capturing a realistic human population.
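
The mixing strategy described above can be sketched as weighted sampling over personas, where one persona is the "naive" model with no value prompt. The shares below are placeholders rather than the human distributions used in the study, and `sample_population` and `persona_prompt` are hypothetical helper names introduced for this example.

```python
import random
from collections import Counter
from typing import Optional


def sample_population(n: int, shares: dict, seed: int = 0) -> list:
    """Draw n persona labels; the key None stands for 'no dominant value'."""
    weights = list(shares.values())
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("shares must be non-negative and sum to a positive number")
    rng = random.Random(seed)  # seeded so a simulated population is reproducible
    return rng.choices(list(shares.keys()), weights=weights, k=n)


def persona_prompt(label: Optional[str]) -> Optional[str]:
    """Return a value-inducing prompt, or None to use the naive (unprompted) model."""
    if label is None:
        return None
    return f"You are a person who places great importance on {label}."


# Placeholder shares (assumed for illustration, not taken from human data).
shares = {None: 0.4, "power": 0.1, "benevolence": 0.3, "tradition": 0.2}
population = sample_population(1000, shares, seed=42)
print(Counter(population))
```

Each sampled label would then determine which prompt, if any, is sent along with a questionnaire item, and the answers are aggregated over the whole simulated population.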

Implications for Future Research

The study concludes that value-induced LLMs can manifest coherent, human-like value structures that align with established psychological research. This suggests that these models could become useful tools for running large-scale, cost-effective psychological experiments. Although the models agreed closely with human patterns, the researchers noted that the effectiveness of these simulations depends heavily on how the population is structured and on the specific prompting techniques used, rather than just on the size or general performance of the model.
