How Sensitive Are Radiomic AI Models to Acquisition Parameters?

Key Takeaways

  • A main barrier for the deployment of AI radiomic systems in clinical routine is their drop in performance under heterogeneous multicentre acquisition protocols.
  • We formulate a mixed-effects framework for quantifying the influence that clinically relevant acquisition parameters have on model performance, while accounting for subject-level random effects.
  • We have applied our framework to lung cancer diagnosis in CT scans using two independent multicentre datasets (a public database and our own collected data) and several state-of-the-art (SoA) architectures.
Paper Abstract

A main barrier for the deployment of AI radiomic systems in clinical routine is their drop in performance under heterogeneous multicentre acquisition protocols. This work presents a performance-oriented framework for quantifying the scan parameter sensitivity of radiomic AI models, while identifying clinically significant parameter regions associated with improved cross-dataset robustness. We formulate a mixed-effects framework for quantifying the influence that clinically relevant acquisition parameters have on model performance, while accounting for subject-level random effects. We have applied our framework to lung cancer diagnosis in CT scans using two independent multicentre datasets (a public database and our own collected data) and several state-of-the-art (SoA) architectures. To evaluate across-database reproducibility, CT parameters have been adjusted using our collected data and tested on the public set. The selected optimal configuration is an X-ray tube current ≥ 200 mA, a spiral pitch ≤ 1.5, and a slice thickness ≤ 1.25 mm, which balances diagnostic quality with low radiation dose. This configuration pushes metrics from 0.79 ± 0.04 sensitivity and 0.47 ± 0.10 specificity in low-quality scans to 0.90 ± 0.10 sensitivity and 0.79 ± 0.13 specificity in high-quality ones.

How Sensitive Are Radiomic AI Models to Acquisition Parameters?

Artificial intelligence models used for medical imaging, such as those that detect lung cancer in CT scans, often struggle when moved from a controlled research environment to real-world clinical settings. This performance drop occurs because different hospitals use varying scanning protocols, leading to "domain shifts" that confuse AI models. This research introduces a new statistical framework designed to quantify how specific CT scan settings affect AI performance and identifies the optimal scan configurations that ensure the most reliable and robust diagnostic results.

A New Way to Measure Sensitivity

The researchers developed a "generalized mixed-effects modeling" framework to analyze how acquisition parameters (X-ray tube current, spiral pitch, and slice thickness) influence the accuracy of AI models. Unlike analyses that treat every scan as independent, this approach accounts for "random effects," such as differences between individual patients, so that the estimated parameter effects are less tied to the particular subjects in the test data. From the fitted model, the researchers derive an "Odds Ratio" for each parameter, which quantifies how much the odds of a diagnostic error change when moving from high-quality to low-quality scan conditions.
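As a rough illustration of what such a model can look like, the sketch below writes a logistic mixed-effects model with a subject-level random intercept. This summary does not give the paper's exact formulation, so every symbol here is an assumption: y_ij marks whether the AI model misclassifies scan j of subject i, x_ijk is an indicator that acquisition parameter k falls in its low-quality range, and u_i is the random effect of subject i.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1 \mid u_i)
  = \beta_0 + \sum_{k} \beta_k \, x_{ijk} + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \mathrm{OR}_k = e^{\beta_k}
```

Under this assumed form, an odds ratio above 1 for parameter k means the odds of an error are higher in the low-quality range of that parameter, for a given subject and with the other parameters held fixed.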

Optimizing Scan Protocols

To find the best balance between diagnostic accuracy and radiation dose, the team used a multi-objective optimization strategy. They treated the various scan settings as a geometric space and used an iterative process to identify "Pareto layers"—groups of settings that provide the best possible trade-offs. This allows clinicians to see not just one "perfect" setting, but a range of acceptable configurations that maintain high performance across different hospital environments.
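The idea of peeling off successive Pareto layers can be sketched with a generic non-dominated sorting loop. This is a minimal sketch, not the authors' algorithm: it assumes two objectives (a performance score to maximize and a radiation dose proxy to minimize), and the candidate names and numbers are made up purely for illustration.

```python
from typing import List, Tuple

# (name, performance, dose_proxy): hypothetical objectives, not from the paper.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_layers(candidates: List[Candidate]) -> List[List[Candidate]]:
    """Iteratively extract the non-dominated set, remove it, and repeat."""
    remaining = list(candidates)
    layers: List[List[Candidate]] = []
    while remaining:
        front = [c for c in remaining
                 if not any(dominates(other, c) for other in remaining)]
        layers.append(front)
        remaining = [c for c in remaining if c not in front]
    return layers


# Toy candidates with invented values, used only to exercise the function.
toy_candidates: List[Candidate] = [
    ("A", 0.90, 3.0),
    ("B", 0.85, 2.0),
    ("C", 0.80, 2.5),
    ("D", 0.70, 1.0),
    ("E", 0.75, 3.5),
]

layer_names = [[c[0] for c in layer] for layer in pareto_layers(toy_candidates)]
print(layer_names)
```

The first layer holds the configurations that no other candidate beats on both objectives at once; later layers hold progressively weaker trade-offs, which is what gives clinicians a range of acceptable settings rather than a single point.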

Improving Diagnostic Reliability

The framework was tested on lung cancer diagnosis using both a public database and a private, multi-center dataset. The AI models performed markedly better on scans meeting the selected configuration: an X-ray tube current of at least 200 mA, a spiral pitch of 1.5 or less, and a slice thickness of 1.25 mm or less. On low-quality scans, the models reached 0.79 ± 0.04 sensitivity and 0.47 ± 0.10 specificity. On high-quality scans meeting this configuration, these metrics rose to 0.90 ± 0.10 sensitivity and 0.79 ± 0.13 specificity, indicating that controlling scan parameters can substantially improve AI diagnostic performance.
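The reported thresholds and metrics can be turned into a small worked example. The sketch below checks whether a scan meets the reported configuration and computes crude odds ratios of error from the mean sensitivity and specificity values quoted above. The class and function names are hypothetical, and these crude ratios ignore subject-level effects, so they are not the odds ratios estimated by the paper's mixed-effects model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanParams:
    """Hypothetical container for the three CT parameters discussed in the text."""
    tube_current_ma: float
    spiral_pitch: float
    slice_thickness_mm: float


def meets_reported_configuration(scan: ScanParams) -> bool:
    """Thresholds as reported: current >= 200 mA, pitch <= 1.5, thickness <= 1.25 mm."""
    return (
        scan.tube_current_ma >= 200
        and scan.spiral_pitch <= 1.5
        and scan.slice_thickness_mm <= 1.25
    )


def error_odds(rate_correct: float) -> float:
    """Odds of an error given the proportion of correct decisions."""
    return (1.0 - rate_correct) / rate_correct


def crude_odds_ratio(low_quality_rate: float, high_quality_rate: float) -> float:
    """Unadjusted ratio of error odds, low-quality versus high-quality scans."""
    return error_odds(low_quality_rate) / error_odds(high_quality_rate)


# Mean values reported in the text (standard deviations are ignored here).
missed_cancer_or = crude_odds_ratio(0.79, 0.90)  # from sensitivity
false_alarm_or = crude_odds_ratio(0.47, 0.79)    # from specificity

print(f"missed-cancer odds ratio: {missed_cancer_or:.2f}")
print(f"false-alarm odds ratio:  {false_alarm_or:.2f}")
print(meets_reported_configuration(ScanParams(250, 1.0, 1.0)))
```

Read this way, the odds of missing a cancer are roughly 2.4 times higher, and the odds of a false alarm roughly 4.2 times higher, on low-quality scans than on scans meeting the configuration.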

Key Considerations for Clinical Use

The study highlights that while feature-level reproducibility is important, it does not automatically guarantee that an AI model will perform well in practice. By explicitly modeling acquisition-induced variability, this framework provides a practical tool for developers and clinicians to validate AI systems before they are deployed. The researchers note that while their current study focused on three specific CT parameters for clarity, the framework is flexible and can be applied to an arbitrary number of acquisition dimensions to suit different clinical needs.
