Enhancing Fitness Intelligence through Domain-Speci... | AI Research

Key Takeaways

  • Scientific Fitness Coaching (SFC) is a specialized field that requires deep knowledge of exercise physiology, sports medicine, and nutrition.
  • Scientific Fitness Coaching (SFC) is typically delivered by human professionals, making it costly and inaccessible to many.
  • While recent advances in Large Language Models (LLMs) show considerable promise for more inclusive fitness coaching, directly deploying prevailing general-purpose LLMs in SFC reveals critical limitations.
  • These models often lack sufficient domain-specific knowledge integration, leading to weak performance on complex SFC scenarios.
  • The paper introduces FitOne, a series of fitness LLMs (with 8B and 32B parameters) designed to improve reliability and domain specialization for SFC applications.
Paper Abstract

Scientific Fitness Coaching (SFC) is typically delivered by human professionals, making it costly and inaccessible to many. While recent advances in Large Language Models (LLMs) show considerable promise for more inclusive fitness coaching, directly deploying prevailing general-purpose LLMs in SFC reveals critical limitations. These models often lack sufficient domain-specific knowledge integration, leading to weak performance on complex SFC scenarios. In this paper, we introduce FitOne, a series of fitness LLMs (with 8B and 32B parameters) designed to improve reliability and domain specialization for SFC applications. Built upon the Qwen3 foundation models, FitOne is developed through a three-stage post-training pipeline consisting of continual pre-training, supervised fine-tuning, and reinforcement learning, using large-scale, high-quality datasets derived from rigorous knowledge engineering. We conduct comprehensive evaluations of FitOne on professional fitness certification exams, including ACSM-EP and NSCA-CSCS, as well as general capabilities such as knowledge reasoning and instruction following. Experimental results show that, while retaining strong general capabilities, FitOne-8B/32B achieves average improvements of up to 10.09%/9.29% and 12.73%/7.01% on the ACSM-EP and NSCA-CSCS exams, respectively, compared with the Qwen3 base models. Furthermore, in-depth ablation studies confirm the necessity of each training stage, highlighting the pipeline's effectiveness in balancing domain expertise enhancement with general ability retention. We believe this research advances LLM systems toward more reliable fitness intelligence and will inspire future research on developing domain-specific LLMs.

Scientific Fitness Coaching (SFC) is a specialized field that requires deep knowledge of exercise physiology, sports medicine, and nutrition. While human professionals provide the gold standard for this guidance, their services are often expensive and difficult to access. Recent advancements in Large Language Models (LLMs) offer a way to make fitness coaching more inclusive, but general-purpose models often struggle with the complex, safety-sensitive, and multi-faceted nature of fitness advice. The paper introduces FitOne, a series of domain-specific LLMs designed to bridge this gap by providing more reliable and specialized fitness intelligence.

A Specialized Training Pipeline

To turn general-purpose models into fitness experts, the researchers developed a three-stage post-training pipeline built upon the Qwen3 foundation models. First, the models undergo continual pre-training on a large-scale, high-quality corpus of fitness literature, textbooks, and exercise guidelines. Second, the team uses supervised fine-tuning to improve the model's step-by-step reasoning, so that the advice it gives is verifiable and logically sound. Finally, the models are refined through reinforcement learning with the DAPO algorithm. This stage aligns the model's outputs with real-world requirements, such as following specific fitness formatting principles, while keeping the model helpful and accurate.
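
The summary above does not say how DAPO is configured for FitOne. As a rough illustration only, the sketch below shows three ideas commonly associated with DAPO as publicly described: group-normalized advantages, dropping sampled groups that carry no learning signal, and a token-level clipped loss with separate lower and upper clip bounds. The function names, the reward values, and the clip bounds (0.2 and 0.28) are assumptions chosen for illustration, not settings confirmed for FitOne.

```python
import math
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def keep_group(rewards):
    """Dynamic sampling (sketch): skip groups whose rewards are all identical,
    since their normalized advantages would all be zero."""
    return len(set(rewards)) > 1


def clipped_token_loss(logp_new, logp_old, advantages, eps_low=0.2, eps_high=0.28):
    """Token-level clipped surrogate loss with decoupled clip bounds.

    logp_new, logp_old: one list of per-token log-probabilities per response.
    advantages: one scalar advantage per response.
    eps_low, eps_high: illustrative values, assumed rather than taken from the source.
    """
    total, n_tokens = 0.0, 0
    for new_seq, old_seq, adv in zip(logp_new, logp_old, advantages):
        for lp_new, lp_old in zip(new_seq, old_seq):
            ratio = math.exp(lp_new - lp_old)
            clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
            total += min(ratio * adv, clipped * adv)
            n_tokens += 1
    # Average over all tokens in the batch, not per response.
    return -total / n_tokens
```

Averaging over tokens rather than over responses means longer responses contribute proportionally more to the gradient, which is one of the design choices that separates this style of loss from a per-sample average.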

Rigorous Knowledge Engineering

A key strength of the FitOne approach is its focus on high-quality data. The researchers categorized fitness coaching into eight core domains, including weight loss, sports nutrition, and exercise prescription. They utilized a systematic four-step preprocessing pipeline—parsing, normalization, refinement, and data mixture—to ensure the training data was accurate and free of noise. By combining this domain-specific data with general-purpose datasets, the researchers ensured that the models gained deep expertise in fitness without losing their ability to handle general language tasks.
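
The four preprocessing steps named above can be sketched as a small chain of functions. The summary does not describe what each step does internally, so every function body here is an assumption chosen for illustration: tag stripping for parsing, Unicode and whitespace cleanup for normalization, length filtering and deduplication for refinement, and a fixed sampling ratio for the data mixture. The function names and the `general_fraction` parameter are hypothetical.

```python
import random
import re
import unicodedata


def parse(raw_docs):
    """Parsing (assumed): strip markup tags and keep the plain text."""
    return [re.sub(r"<[^>]+>", " ", doc) for doc in raw_docs]


def normalize(texts):
    """Normalization (assumed): unify Unicode forms and collapse whitespace."""
    out = []
    for text in texts:
        text = unicodedata.normalize("NFKC", text)
        out.append(re.sub(r"\s+", " ", text).strip())
    return out


def refine(texts, min_chars=20):
    """Refinement (assumed): drop very short fragments and exact duplicates."""
    seen, kept = set(), []
    for text in texts:
        key = text.lower()
        if len(text) >= min_chars and key not in seen:
            seen.add(key)
            kept.append(text)
    return kept


def mix(domain, general, general_fraction=0.3, seed=0):
    """Data mixture (assumed): add general-purpose samples so they make up
    roughly general_fraction of the result. Requires general_fraction < 1."""
    rng = random.Random(seed)
    n_general = round(len(domain) * general_fraction / (1.0 - general_fraction))
    n_general = min(n_general, len(general))
    mixed = list(domain) + rng.sample(general, n_general)
    rng.shuffle(mixed)
    return mixed
```

A typical call would be `mix(refine(normalize(parse(raw_docs))), general_docs)`. Keeping each step as a separate function makes it easy to inspect how many documents survive each one.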

Proven Performance Gains

The researchers evaluated FitOne on professional fitness certification exams, specifically the ACSM-EP and the NSCA-CSCS. FitOne outperformed the base Qwen3 models at both the 8B and 32B scales: FitOne-8B achieved average improvements of up to 10.09% on the ACSM-EP exam and 12.73% on the NSCA-CSCS exam, while FitOne-32B achieved up to 9.29% and 7.01%, respectively. According to the paper, these gains come while the models retain strong general capabilities such as knowledge reasoning and instruction following.

Advancing Fitness Intelligence

The paper's ablation studies indicate that each stage of the training pipeline is needed to balance domain expertise with general performance. By moving beyond simple text generation and focusing on verifiable reasoning and structured output, FitOne represents a step toward more reliable, automated fitness coaching. The authors believe this research advances LLM systems toward more reliable fitness intelligence and will inspire future work on domain-specific LLMs in other specialized fields where accuracy and safety are paramount.
