AI Outperforms Humans in Personalized Image Aesthetics Assessment via LLM-Based Interviews and Semantic Feature Extraction
This research addresses the challenge of predicting how specific individuals evaluate the aesthetics of images. Traditional AI models often rely on objective, low-level image features such as brightness or texture, and so they struggle to account for the subjective, personal nature of aesthetic taste. The authors developed an integrated system that combines deep learning with Large Language Models (LLMs) to actively interview users about their preferences, allowing the AI to incorporate high-level semantic and contextual information into its predictions.
How the System Works
The proposed system functions through two main components: an interview process and a prediction engine. First, an LLM-based interview system conducts a semi-structured conversation with a participant. Two AI agents work in parallel: an "Interviewer" that asks questions and an "Analyzer" that interprets the user's responses to build a profile of their aesthetic values.
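The Interviewer/Analyzer split described above can be illustrated with a rough sketch. This is not the authors' implementation: both agents are replaced by simple stand-in functions (in the real system each would be an LLM call), the two roles take turns instead of running in parallel, and every name here (Profile, stub_interviewer, stub_analyzer, run_interview, KEYWORDS) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword-to-theme map; a real Analyzer would be an LLM call.
KEYWORDS = {"story": "narrative", "feel": "emotion", "tradition": "cultural context"}


@dataclass
class Profile:
    """Running tally of aesthetic themes the participant has mentioned."""
    values: Dict[str, int] = field(default_factory=dict)


def stub_interviewer(profile: Profile, transcript: List[Tuple[str, str]]) -> str:
    """Stand-in Interviewer: open broadly, then probe the most frequent theme."""
    if not profile.values:
        return "What kinds of images do you find beautiful?"
    top = max(profile.values, key=profile.values.get)
    return f"Can you say more about why {top} matters to you?"


def stub_analyzer(answer: str) -> List[str]:
    """Stand-in Analyzer: map keywords in the answer to aesthetic themes."""
    text = answer.lower()
    return [theme for keyword, theme in KEYWORDS.items() if keyword in text]


def run_interview(
    get_answer: Callable[[str], str],
    interviewer: Callable[[Profile, List[Tuple[str, str]]], str] = stub_interviewer,
    analyzer: Callable[[str], List[str]] = stub_analyzer,
    max_turns: int = 3,
) -> Profile:
    """Alternate question and analysis steps, updating the profile each turn."""
    profile = Profile()
    transcript: List[Tuple[str, str]] = []
    for _ in range(max_turns):
        question = interviewer(profile, transcript)
        answer = get_answer(question)
        transcript.append((question, answer))
        for theme in analyzer(answer):
            profile.values[theme] = profile.values.get(theme, 0) + 1
    return profile
```

The key design point is that the Interviewer sees the evolving profile, so each follow-up question can depend on what the Analyzer has extracted so far.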
Second, the prediction system uses this profile to identify high-level features—such as narrative, emotion, or cultural context—that are meaningful to that specific individual. These high-level features are combined with low-level image data processed by a deep learning model. A machine learning module then integrates these inputs to generate a personalized aesthetic score for any given image.
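As a minimal sketch of how low-level and high-level features might be combined, the example below concatenates the two feature groups and fits a plain linear model by gradient descent. The summary does not say which machine learning module the authors used, so the linear model is an assumption chosen for brevity, and the function names (fuse, fit_linear, predict) are hypothetical.

```python
from typing import List, Sequence, Tuple


def fuse(low_level: Sequence[float], high_level: Sequence[float]) -> List[float]:
    """Concatenate low-level image features with high-level semantic scores."""
    return list(low_level) + list(high_level)


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Linear score: weighted sum of the fused features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def fit_linear(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    lr: float = 0.1,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on squared error.

    Assumption: a linear model stands in for the unspecified learning module.
    """
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, target in zip(X, y):
            error = predict(weights, bias, features) - target
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias
```

In use, each training row would be fuse(low, high) for one image the participant has already rated, with the rating as the target. Because the model is fit per person, the learned weights reflect how much that individual's scores depend on, say, narrative versus brightness.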
Key Findings
The researchers tested their system against conventional deep learning models, other LLM-based predictors, and human participants. The results showed that the proposed system outperformed all other methods. Notably, the AI demonstrated its strongest performance when predicting scores for images that users rated highly, suggesting that the system is particularly effective at capturing the specific, subjective reasons why a person finds an image appealing.
AI as an Interpreter of Taste
A significant finding of the study is that the AI’s prediction error was smaller than the natural variability found within a single person’s own evaluations over time. In contrast, human predictors—who were asked to guess the preferences of others—showed the largest errors, likely because they were biased by their own personal aesthetic values. These results suggest that AI may be uniquely positioned to act as a precise interpreter of human aesthetic sensibility, potentially understanding an individual's preferences better than other humans or even the individual themselves at a different point in time.
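The comparison between prediction error and within-person variability can be made concrete with a small sketch. The summary does not state which error metric the authors used, so mean absolute difference is an assumption here, and the function name and the idea of two rating sessions per person are illustrative only.

```python
from typing import Sequence


def mean_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length rating sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def beats_self_consistency(
    predicted: Sequence[float],
    session_1: Sequence[float],
    session_2: Sequence[float],
) -> bool:
    """True if prediction error is below the person's own rating drift.

    session_1 and session_2 are the same person's ratings of the same images
    at two different times; predicted is compared against session_1.
    """
    prediction_error = mean_abs_diff(predicted, session_1)
    self_variability = mean_abs_diff(session_1, session_2)
    return prediction_error < self_variability
```

Under this framing, the finding corresponds to beats_self_consistency returning True: the model's predictions sit closer to a person's ratings than that person's own ratings do when repeated later.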