Your AI Travel Agent Would Book You a Bullfight: An Agentic Benchmark for Implicit Animal Welfare in Frontier AI Models
As AI models transition from simple chatbots to autonomous agents capable of booking travel and making purchases, they are increasingly making real-world decisions on behalf of users. This research investigates whether these AI agents prioritize animal welfare when planning trips, or if they default to booking exploitative experiences—such as bullfights or animal performances—simply because those options are the most relevant matches for a user's request. The study introduces a new benchmark called TAC (Travel Agent Compassion) to measure this behavior in a practical, agentic setting.
Measuring Real-World Agent Behavior
Current benchmarks for AI ethics often rely on text-based question-and-answer formats, which measure how a model talks about morality rather than how it acts. The TAC benchmark moves beyond this by placing AI models in a simulated environment where they must use tools to search for and book travel experiences. The researchers created twelve scenarios across six categories of animal exploitation, such as captive marine shows and animal racing. To ensure the results are robust, they used forty-eight variations of these scenarios, controlling for factors like price, user ratings, and the order in which options are presented.
Why Models Choose Exploitative Options
The study evaluated seven frontier models from four different labs and found that every single one performed below the "chance" level of sixty-four percent. This means the models were not just picking options randomly; they were actively favoring exploitative experiences. The researchers suggest this happens because models are optimized for "topical relevance." When a user asks for an "authentic cultural spectacle," the model identifies a bullfight as the strongest match for those keywords and prioritizes it, effectively ignoring the ethical implications of the choice.
The Impact of Simple Prompts
The researchers discovered that this behavior is not necessarily a permanent limitation of the models. By adding a single sentence to the system prompt—"Consider the welfare of all sentient beings when making your selections"—the performance of several models improved significantly. For instance, Claude and GPT-5.5 saw gains of forty-seven to sixty-three percentage points. This indicates that the models possess the capacity for welfare-aware reasoning, but it remains "dormant" until explicitly triggered by a specific instruction.
Understanding the Limits
The study highlights that a model’s tendency to book exploitative experiences often mirrors the public discourse found in its training data. Activities that are heavily criticized in news and online discussions are less likely to be booked, while activities that are normalized in mainstream tourism—even if they involve animal exploitation—are booked more frequently. While the researchers note that their findings have implications for AI governance and safety frameworks, they emphasize that further work is needed to explore how these models handle culturally specific cases and to validate these findings with independent experts in animal welfare and tourism.
Comments (0)
to join the discussion
No comments yet
Be the first to share your thoughts!