An Infectious Disease Spread Simulation Based on Large Language Model Decision Making

Key Takeaways

  • Modelling individual decision-making during infectious disease outbreaks is crucial for understanding behavioural dynamics and informing effective public health interventions.
  • Prior work has shown that large language models can simulate realistic human behaviour by generating agent decisions based on demographic prompts and situational context.
  • We implement and compare three decision scenarios, independent reasoning, household influence, and message framing, and simulate self-reporting outcomes in San Francisco and Atlanta.
  • Results reveal that income and education are the dominant drivers of reporting rate variation, with smaller but consistent effects from geography, LLM model choice, and message framing.
Paper Abstract

Modelling individual decision-making during infectious disease outbreaks is crucial for understanding behavioural dynamics and informing effective public health interventions. Prior work has shown that large language models can simulate realistic human behaviour by generating agent decisions based on demographic prompts and situational context. We build on this foundation with a spatially grounded, agent-based simulation framework that integrates LLM-generated decisions about self-reported influenza-like illness into a census-based synthetic population of agents. Location is treated as a central feature: agents are assigned to spatial units within cities, capturing the spatial distributions of different demographic groups using real-world census data and enabling geographically diverse behavioural modelling. We implement and compare three decision scenarios, independent reasoning, household influence, and message framing, and simulate self-reporting outcomes in San Francisco and Atlanta. Results reveal that income and education are the dominant drivers of reporting rate variation, with smaller but consistent effects from geography, LLM model choice, and message framing. Our framework generates synthetic data that captures both social and geographic heterogeneity, supporting spatial epidemiological modelling and bias-aware behavioural analysis.

An Infectious Disease Spread Simulation Based on Large Language Model Decision Making
This paper introduces a new way to simulate how people make health decisions during an infectious disease outbreak. By replacing traditional, rigid rule-based systems with Large Language Models (LLMs), the researchers aim for a more realistic agent-based simulation. The framework lets individual digital "agents" make context-aware choices, such as whether to report symptoms, based on their demographic backgrounds and the social or informational environment they inhabit.

How the Simulation Works

The researchers built their framework on a "Patterns of Life" simulator that models daily human routines, such as going to work or visiting restaurants. They added an SEIR (susceptible, exposed, infectious, recovered) infectious disease model to track how illnesses spread through physical contact. To make the agents act like real people, the team used census data to assign each agent a demographic profile, including age, income, education, and race.
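
For readers unfamiliar with SEIR, the following is a minimal sketch of the textbook compartmental form of the model. It is not the agent-level, contact-driven version that runs inside the paper's simulator, and the function names and the parameter values (`beta`, `sigma`, `gamma`) are illustrative assumptions.

```python
def seir_step(s, e, i, r, beta, sigma, gamma):
    """Advance a textbook SEIR compartment model by one day (Euler step)."""
    n = s + e + i + r
    new_exposed = beta * s * i / n   # susceptible -> exposed
    new_infectious = sigma * e       # exposed -> infectious
    new_recovered = gamma * i        # infectious -> recovered
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - new_recovered,
        r + new_recovered,
    )


def run_seir(days, s0, e0, i0, r0, beta=0.3, sigma=0.2, gamma=0.1):
    """Return the list of (S, E, I, R) states for day 0 through `days`.

    The default rates are arbitrary placeholders, not values from the paper.
    """
    state = (float(s0), float(e0), float(i0), float(r0))
    history = [state]
    for _ in range(days):
        state = seir_step(*state, beta, sigma, gamma)
        history.append(state)
    return history
```

Calling `run_seir(50, 990, 0, 10, 0)` yields 51 daily states whose compartments always sum to the initial population of 1000.
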
Instead of using simple mathematical formulas to decide whether an agent reports an illness, the team prompted LLMs with these demographic profiles and situational contexts. To keep the simulation fast and consistent, they pre-generated these decisions and stored them in a "decision bank." When an agent in the simulation becomes symptomatic, it retrieves a decision from this bank based on its demographic profile.
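
The decision-bank idea can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the demographic categories, the function names, and in particular `stub_llm_decision` (a random placeholder standing in for a real LLM call) are all hypothetical.

```python
import random
from itertools import product

# Hypothetical demographic categories; the paper derives its profiles from census data.
INCOME_LEVELS = ["low", "middle", "high"]
EDUCATION_LEVELS = ["high_school", "bachelor", "graduate"]


def stub_llm_decision(profile, rng):
    """Stand-in for an LLM call: returns True ("report") or False at random.

    A real pipeline would send the profile and context to a model and parse
    its answer. The 50/50 coin flip is a placeholder, not a finding.
    """
    return rng.random() < 0.5


def build_decision_bank(samples_per_profile=20, seed=0):
    """Pre-generate a list of decisions for every demographic profile."""
    rng = random.Random(seed)
    return {
        profile: [stub_llm_decision(profile, rng) for _ in range(samples_per_profile)]
        for profile in product(INCOME_LEVELS, EDUCATION_LEVELS)
    }


def draw_decision(bank, profile, rng):
    """Look up a stored decision when an agent with `profile` becomes symptomatic."""
    return rng.choice(bank[profile])
```

Because all model calls happen up front, the simulation loop only performs dictionary lookups, which is what keeps runs fast and repeatable for a given seed.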

Testing Different Scenarios

The study explored three specific scenarios to see how different factors influence public health outcomes:

  • Independent Reasoning: Agents make decisions based solely on their own demographic background.

  • Household Influence: Agents are more likely to report symptoms if they know a family member has already reported an illness.

  • Message Framing: Agents receive different types of public health messages (for example, messages focused on personal risk, altruism, or statistical data) to see which approach is most effective at encouraging reporting.
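
One way to picture the three scenarios is as variations on a single prompt template. The sketch below is an assumption-laden illustration: the prompt wording, the `FRAMES` texts, the profile keys, and the `build_prompt` function are hypothetical and are not taken from the paper.

```python
# Hypothetical framing texts; the paper's actual messages are not reproduced here.
FRAMES = {
    "personal_risk": "Reporting your symptoms helps you get care sooner.",
    "altruism": "Reporting your symptoms helps protect people around you.",
    "statistics": "Health officials use symptom reports to track cases in your area.",
}


def build_prompt(profile, scenario, household_reported=False, frame=None):
    """Assemble the text sent to the LLM for one agent and one scenario."""
    parts = [
        f"You are a {profile['age']}-year-old with {profile['education']} education "
        f"and {profile['income']} income living in {profile['city']}.",
        "You have influenza-like symptoms.",
    ]
    if scenario == "household_influence":
        if household_reported:
            parts.append("A member of your household has already reported an illness.")
        else:
            parts.append("No one in your household has reported an illness.")
    elif scenario == "message_framing":
        parts.append(f"You receive this public health message: {FRAMES[frame]}")
    elif scenario != "independent_reasoning":
        raise ValueError(f"unknown scenario: {scenario}")
    parts.append("Do you report your symptoms? Answer yes or no.")
    return " ".join(parts)
```

For example, `build_prompt(profile, "message_framing", frame="altruism")` adds only the altruism message to the base prompt, so differences in the model's answers can be attributed to the framing alone.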

Key Findings

The simulation results show that an agent's socioeconomic status is the most significant factor in whether they report an illness. Specifically, income and education levels were the primary drivers of variation in reporting rates. Geography, the choice of LLM, and the framing of public health messages also had an impact, but their effects were smaller, though consistent. By capturing these social and geographic differences, the framework provides a way to study how systemic inequities might influence the accuracy of disease surveillance data.
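
To make "variation in reporting rates" concrete, the sketch below shows one simple way such rates could be tabulated from simulated records and compared across a grouping. The function names and the record format are assumptions for illustration, and no data from the paper is used.

```python
from collections import defaultdict


def reporting_rates(records):
    """Compute the share of symptomatic agents who reported, per group.

    `records` is an iterable of (group_label, did_report) pairs.
    """
    totals = defaultdict(int)
    reported = defaultdict(int)
    for group, did_report in records:
        totals[group] += 1
        reported[group] += int(did_report)
    return {group: reported[group] / totals[group] for group in totals}


def rate_spread(rates):
    """Gap between the highest and lowest group rate: a crude measure of variation."""
    return max(rates.values()) - min(rates.values())
```

Computing `rate_spread` once per grouping (income bracket, education level, city, model, message frame) would give a rough ranking of which attribute separates the groups most.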

Considerations for Future Research

The researchers note that while LLMs are not designed to perfectly replicate human health behavior, they are useful for exploring how different variables influence population-level dynamics. Because the agents are built using real-world census data, this framework helps researchers understand how demographic disparities can lead to "reporting bias," where certain groups are underrepresented in official health data. This tool is intended to support public health experts in designing better interventions and analyzing how different communication strategies might affect diverse communities.
