How Sensitive Are Radiomic AI Models to Acquisition Parameters?
Artificial intelligence models for medical imaging, such as those that detect lung cancer in CT scans, often lose accuracy when moved from a controlled research environment to real-world clinical settings. The drop occurs because hospitals use different scanning protocols, producing "domain shifts" between the data a model was trained on and the data it sees in practice. This research introduces a statistical framework that quantifies how specific CT scan settings affect AI performance and identifies the scan configurations that give the most reliable and robust diagnostic results.
A New Way to Measure Sensitivity
The researchers developed a "generalized mixed-effects modeling" framework to analyze how acquisition parameters (such as the X-ray tube current, spiral pitch, and slice thickness) influence the accuracy of AI models. Unlike traditional methods, this approach accounts for "random effects," such as differences between individual patients, which helps the findings generalize to the wider population rather than to one specific set of test data. From the fitted model, the researchers calculate an "Odds Ratio" that expresses how much the odds of a diagnostic error change when moving from high-quality to low-quality scan conditions.
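To make the odds ratio concrete, here is a minimal Python sketch. It is not the study's model: the function names and the error rates are hypothetical, and it shows only the arithmetic behind an odds ratio, plus the standard fact that in a logistic-type model the odds ratio for a predictor is the exponential of its fitted coefficient.

```python
import math


def odds(p):
    """Convert a probability p into odds p / (1 - p)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_error_low_quality, p_error_high_quality):
    """Odds of an error under low-quality scans relative to high-quality scans."""
    return odds(p_error_low_quality) / odds(p_error_high_quality)


def odds_ratio_from_coefficient(beta):
    """In a logistic-type model, a coefficient beta maps to an odds ratio exp(beta)."""
    return math.exp(beta)


# Hypothetical error rates chosen for illustration; they are not from the study.
example_or = odds_ratio(0.20, 0.10)
print(round(example_or, 2))  # 2.25
```

An odds ratio above 1 here would mean errors are more likely under the low-quality condition. In a mixed-effects fit, the coefficient is estimated while accounting for patient-level random effects, so it generally differs from this crude two-rate comparison.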
Optimizing Scan Protocols
To find the best balance between diagnostic accuracy and radiation dose, the team used a multi-objective optimization strategy. They treated the various scan settings as a geometric space and used an iterative process to identify "Pareto layers"—groups of settings that provide the best possible trade-offs. This allows clinicians to see not just one "perfect" setting, but a range of acceptable configurations that maintain high performance across different hospital environments.
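The idea of peeling off successive Pareto layers can be sketched in Python as follows. This is a generic non-dominated sorting routine, not necessarily the procedure the researchers used. It assumes each candidate configuration is summarized by two objectives to minimize, a hypothetical error rate and a hypothetical dose value, and all numbers are made up for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_layers(points):
    """Split points into successive Pareto layers: layer 1 is the set of
    non-dominated points, layer 2 is the non-dominated set of what remains,
    and so on until no points are left."""
    remaining = list(points)
    layers = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        layers.append(front)
        remaining = [p for p in remaining if p not in front]
    return layers


# Hypothetical (error rate, dose) pairs, one per candidate scan configuration.
configs = [(0.10, 5.0), (0.15, 3.0), (0.12, 6.0), (0.20, 3.5), (0.25, 2.0)]

for depth, layer in enumerate(pareto_layers(configs), start=1):
    print(depth, layer)
# 1 [(0.1, 5.0), (0.15, 3.0), (0.25, 2.0)]
# 2 [(0.12, 6.0), (0.2, 3.5)]
```

The first layer holds the configurations for which no alternative is better on both objectives. Later layers hold progressively less favorable but possibly still acceptable trade-offs.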
Improving Diagnostic Reliability
The framework was tested on lung cancer diagnosis using both a public database and a private, multi-center dataset. The reported optimal configuration was an X-ray tube current of at least 200 mA, a spiral pitch of 1.5 or less, and a slice thickness of 1.25 mm or less. On low-quality scans, the AI models reached 0.79 sensitivity and 0.47 specificity. On scans acquired with the optimized high-quality configuration, these metrics rose to 0.90 sensitivity and 0.79 specificity, indicating that adjusting scan parameters can substantially stabilize AI performance.
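The reported thresholds are easy to encode as a protocol check. The Python sketch below is only an illustration: the `ScanProtocol` class, its field names, and the function name are hypothetical, and only the three threshold values come from the summary above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanProtocol:
    """Hypothetical container for the three acquisition parameters discussed."""
    tube_current_ma: float
    spiral_pitch: float
    slice_thickness_mm: float


def meets_reported_configuration(protocol):
    """Check a protocol against the thresholds reported in the study summary:
    tube current >= 200 mA, spiral pitch <= 1.5, slice thickness <= 1.25 mm."""
    return (
        protocol.tube_current_ma >= 200
        and protocol.spiral_pitch <= 1.5
        and protocol.slice_thickness_mm <= 1.25
    )


# Example protocols with made-up values.
print(meets_reported_configuration(ScanProtocol(250, 1.0, 1.25)))  # True
print(meets_reported_configuration(ScanProtocol(120, 1.0, 2.5)))   # False
```

A check like this could flag incoming scans that fall outside the range in which the model was shown to perform well, before its output is used.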
Key Considerations for Clinical Use
The study highlights that while feature-level reproducibility is important, it does not automatically guarantee that an AI model will perform well in practice. By explicitly modeling acquisition-induced variability, this framework provides a practical tool for developers and clinicians to validate AI systems before they are deployed. The researchers note that while their current study focused on three specific CT parameters for clarity, the framework is flexible and can be applied to an arbitrary number of acquisition dimensions to suit different clinical needs.