
Key Takeaways

  • Chart-to-table translation converts chart images into structured tabular data.
  • Accurate translation is crucial for Multimodal Language Models (MLMs) to answer complex queries.
  • We observe imbalances in the number of images across different aspects of the y-axis information in public chart datasets.
  • Such imbalances can introduce unintended biases, causing uneven MLM performance.
Paper Abstract

Chart-to-table translation converts chart images into structured tabular data. Accurate translation is crucial for Multimodal Language Models (MLMs) to answer complex queries. We observe imbalances in the number of images across different aspects of the y-axis information in public chart datasets. Such imbalances can introduce unintended biases, causing uneven MLM performance. Previous works have not systematically examined these biases. To address this gap, we propose a new framework, FairChart2Table, for analyzing y-axis-related bias on five state-of-the-art models. Key Findings: (1) There are significant y-axis biases related to the digit length of the major tick values, the number of major ticks, the range of values, and the tick value format (e.g., abbreviation or scientific format). (2) The number of legends/entities in chart images impacts MLM performance. (3) Prompting MLMs with y-axis information can significantly enhance performance for some MLMs.

Assessing Y-Axis Influence: Bias in Multimodal Language Models on Chart-to-Table Translation

Multimodal Language Models (MLMs) are increasingly used to convert chart images into structured tables, a step that matters for answering complex data-driven queries. The paper observes that public chart datasets are imbalanced: the number of images varies widely across different aspects of the y-axis information. Such imbalances can introduce unintended biases, so that an MLM performs unevenly depending on the y-axis characteristics of the chart. The paper introduces a framework to analyze these biases systematically and measure their effect on translation accuracy.

The FairChart2Table Framework

Because earlier work had not examined these biases systematically, the authors propose a framework called FairChart2Table. It analyzes how y-axis-related factors affect the chart-to-table performance of five state-of-the-art multimodal models. Examining one y-axis factor at a time makes it possible to see which chart properties are associated with weaker translation results.
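As a rough illustration of this kind of per-factor analysis, the sketch below groups per-chart scores by one y-axis attribute and compares the group averages. It is not the FairChart2Table implementation: the function names, the `num_ticks` and `score` fields, and the numbers are all hypothetical, and the gap between the best and worst group is only one simple way to summarize uneven performance.

```python
from collections import defaultdict
from statistics import mean


def accuracy_by_group(records, attribute):
    """Average the per-chart score within each value of one y-axis attribute."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[attribute]].append(rec["score"])
    return {key: mean(scores) for key, scores in groups.items()}


def performance_gap(group_means):
    """Difference between the best and worst group average."""
    return max(group_means.values()) - min(group_means.values())


# Hypothetical per-chart results; the values are made up for illustration.
records = [
    {"num_ticks": 5, "score": 1.0},
    {"num_ticks": 5, "score": 0.5},
    {"num_ticks": 10, "score": 0.5},
    {"num_ticks": 10, "score": 0.0},
]

by_ticks = accuracy_by_group(records, "num_ticks")
gap = performance_gap(by_ticks)
print(by_ticks, gap)
```

A large gap for one attribute and a small gap for another would suggest that the first attribute is more strongly associated with uneven performance, although a real analysis would also need to account for group sizes.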

Key Drivers of Bias

The study identified several specific factors that contribute to uneven performance across models. The researchers found that MLMs are sensitive to the following y-axis characteristics:

  • Numerical Complexity: The digit length of major tick values and the overall range of values displayed.

  • Scale Density: The number of major ticks present on the axis.

  • Formatting: How the data is presented, such as the use of abbreviations or scientific notation.

Beyond the y-axis, the study also found that the number of legends or entities in a chart image affects how well a model translates the chart into a table.
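The y-axis characteristics listed above can be made concrete with a small sketch that derives them from a list of tick labels. This is an assumption-laden illustration, not the paper's method: the helper names, the regular expressions, the K/M/B suffix convention, and the definition of digit length as the count of digit characters in a label are all choices made here for the example.

```python
import re

# Hypothetical patterns for two non-plain tick formats.
ABBREVIATION = re.compile(r"^-?\d+(\.\d+)?[KMBkmb]$")
SCIENTIFIC = re.compile(r"^-?\d+(\.\d+)?[eE][+-]?\d+$")
MULTIPLIER = {"k": 1e3, "m": 1e6, "b": 1e9}


def clean(label):
    """Drop thousands separators and surrounding whitespace."""
    return label.replace(",", "").strip()


def tick_format(label):
    """Classify a tick label as scientific, abbreviation, or plain."""
    label = clean(label)
    if SCIENTIFIC.match(label):
        return "scientific"
    if ABBREVIATION.match(label):
        return "abbreviation"
    return "plain"


def tick_value(label):
    """Convert a tick label to a float, expanding K/M/B suffixes."""
    label = clean(label)
    if ABBREVIATION.match(label):
        return float(label[:-1]) * MULTIPLIER[label[-1].lower()]
    return float(label)


def y_axis_features(labels):
    """Summarize the y-axis attributes discussed in the text."""
    values = [tick_value(label) for label in labels]
    return {
        "num_ticks": len(labels),
        # Assumed definition: most digit characters in any one label.
        "max_digit_length": max(sum(ch.isdigit() for ch in label) for label in labels),
        "value_range": max(values) - min(values),
        "formats": sorted({tick_format(label) for label in labels}),
    }


features = y_axis_features(["0", "5K", "10K", "15K"])
print(features)
```

Features like these could then be used to bucket chart images, for example to count how many images a dataset contains per tick format.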

Improving Model Performance

The research also finds that performance can be improved through prompting. When the prompt explicitly includes y-axis information, chart-to-table accuracy increases significantly for some of the evaluated MLMs. This suggests that, although the biases exist, targeted interventions can partly offset them.
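A minimal sketch of what such a prompt might look like follows. The wording of the prompt, the `build_prompt` function, and the choice to pass tick labels as a list are assumptions for illustration; the summary does not state the exact prompt or which y-axis details the authors supplied.

```python
BASE_PROMPT = (
    "Convert the chart image into a table. "
    "Return the table as CSV with a header row."
)


def build_prompt(y_axis_ticks=None):
    """Return the base prompt, optionally extended with a y-axis hint."""
    if not y_axis_ticks:
        return BASE_PROMPT
    hint = "The y-axis has {n} major ticks, from {lo} to {hi}: {ticks}.".format(
        n=len(y_axis_ticks),
        lo=y_axis_ticks[0],
        hi=y_axis_ticks[-1],
        ticks=", ".join(y_axis_ticks),
    )
    return BASE_PROMPT + " " + hint


prompt = build_prompt(["0", "5K", "10K", "15K"])
print(prompt)
```

Comparing a model's output with and without the hint, on the same images, is one way to check whether the extra y-axis information helps that particular model.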
