Scientific Fitness Coaching (SFC) is a specialized field that requires deep knowledge of exercise physiology, sports medicine, and nutrition. While human professionals provide the gold standard for this guidance, their services are often expensive and difficult to access. Recent advancements in Large Language Models (LLMs) offer a way to make fitness coaching more inclusive, but general-purpose models often struggle with the complex, safety-sensitive, and multi-faceted nature of fitness advice. The paper introduces FitOne, a series of domain-specific LLMs designed to bridge this gap by providing more reliable and specialized fitness intelligence.
A Specialized Training Pipeline
To transform general-purpose models into fitness experts, the researchers developed a three-stage post-training pipeline built upon the Qwen3 foundation models. First, the models undergo continual pre-training on a large-scale, high-quality corpus of fitness literature, textbooks, and exercise guidelines. Second, the team uses supervised fine-tuning to improve the model's step-by-step reasoning, so that the advice it gives can be checked and follows logically from its premises. Finally, the models are refined through reinforcement learning with DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization), a general-purpose policy optimization algorithm for LLMs. This stage aligns the model's outputs with real-world requirements, such as following specific fitness formatting principles, while keeping the model helpful and accurate.
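The reinforcement learning stage can be made more concrete with a small sketch of two ideas DAPO is known for: an asymmetric ("decoupled") clipping range and the skipping of sample groups that carry no learning signal. This is a minimal illustration in plain Python, not FitOne's training code. The function names, the `eps_low` and `eps_high` values, and the use of a population standard deviation are assumptions chosen for the example; the summary above does not state the settings the authors used.

```python
import statistics


def group_advantages(rewards):
    """Normalize rewards within one group of responses to the same prompt.

    Returns None when every reward is identical: such a group gives no
    learning signal, so a dynamic-sampling scheme would skip it and draw
    another prompt. (Hypothetical helper; population std is an assumption.)
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return None
    return [(r - mean) / std for r in rewards]


def clipped_term(ratio, advantage, eps_low=0.2, eps_high=0.28):
    """Per-token clipped surrogate with separate lower and upper clip bounds.

    ratio is new_policy_prob / old_policy_prob for one token. The eps values
    are placeholders for illustration, not values taken from FitOne.
    """
    clipped = min(max(ratio, 1.0 - eps_low), 1.0 + eps_high)
    return min(ratio * advantage, clipped * advantage)


# Toy usage: two of four sampled answers were rewarded.
advantages = group_advantages([1.0, 0.0, 1.0, 0.0])
surrogate = [clipped_term(1.1, a) for a in advantages]
```

Allowing a larger upper bound than lower bound lets the policy raise the probability of rewarded tokens somewhat more freely than it can lower others, which is the motivation usually given for decoupling the two clip ranges.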
Rigorous Knowledge Engineering
A key strength of the FitOne approach is its focus on high-quality data. The researchers categorized fitness coaching into eight core domains, including weight loss, sports nutrition, and exercise prescription. They used a systematic four-step preprocessing pipeline (parsing, normalization, refinement, and data mixture) to keep the training data accurate and free of noise. By combining this domain-specific data with general-purpose datasets, the researchers ensured that the models gained deep expertise in fitness without losing their ability to handle general language tasks.
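The four preprocessing steps named above can be sketched as a small Python pipeline. This is only an illustration of the general idea under simple assumptions: the function names, the tag-stripping rule, the minimum-length filter, and the `domain_ratio` parameter are all hypothetical and are not taken from the paper, which may implement each step quite differently.

```python
import random
import re
import unicodedata


def parse(raw: str) -> str:
    """Parsing: strip simple HTML-like tags left over from extraction."""
    return re.sub(r"<[^>]+>", " ", raw)


def normalize(text: str) -> str:
    """Normalization: apply Unicode NFKC and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def refine(docs: list[str], min_words: int = 5) -> list[str]:
    """Refinement: drop very short documents and exact duplicates."""
    seen = set()
    kept = []
    for doc in docs:
        if len(doc.split()) < min_words or doc in seen:
            continue
        seen.add(doc)
        kept.append(doc)
    return kept


def mix(domain: list[str], general: list[str], domain_ratio: float,
        n: int, seed: int = 0) -> list[str]:
    """Data mixture: sample n documents (with replacement) so that roughly
    domain_ratio of them come from the domain-specific pool."""
    rng = random.Random(seed)
    n_domain = round(n * domain_ratio)
    sample = [rng.choice(domain) for _ in range(n_domain)]
    sample += [rng.choice(general) for _ in range(n - n_domain)]
    rng.shuffle(sample)
    return sample


# Toy usage with made-up documents.
raw_docs = [
    "<p>Squat depth  depends on hip and ankle mobility.</p>",
    "<p>Squat depth  depends on hip and ankle mobility.</p>",
    "<b>Too short</b>",
]
fitness_docs = refine([normalize(parse(r)) for r in raw_docs])
general_docs = ["General text sample number one here."]
training_mix = mix(fitness_docs, general_docs, domain_ratio=0.75, n=8)
```

Keeping the mixture ratio as an explicit parameter reflects the trade-off described above: too little domain data limits fitness expertise, while too little general data risks degrading general language ability.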
Proven Performance Gains
The researchers evaluated FitOne on questions from two professional fitness certification exams, the ACSM-EP and the NSCA-CSCS. FitOne outperformed the base Qwen3 models at both the 8B and 32B parameter scales. For instance, the FitOne-8B model achieved improvements of up to 10.09% on the ACSM-EP exam and 12.73% on the NSCA-CSCS exam compared to its base model. These results indicate that the domain-specialized training pipeline enhances professional fitness knowledge while maintaining strong general reasoning capabilities.
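For readers who want to reproduce this kind of comparison, the sketch below shows one simple way to score a model on multiple-choice exam questions and compare it with a base model. It assumes each answer is a single option letter; the function names and the toy data are made up for illustration and are not the paper's results or evaluation harness. Note that a reported gain can be read either as percentage points (absolute) or as a percentage of the base score (relative), so the sketch computes both.

```python
def accuracy(predictions: list[str], answers: list[str]) -> float:
    """Percentage of questions where the predicted option matches the key."""
    if not answers or len(predictions) != len(answers):
        raise ValueError("predictions and answers must be non-empty and equal length")
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return 100.0 * correct / len(answers)


def absolute_gain(base_acc: float, tuned_acc: float) -> float:
    """Improvement in percentage points."""
    return tuned_acc - base_acc


def relative_gain(base_acc: float, tuned_acc: float) -> float:
    """Improvement as a percentage of the base model's accuracy."""
    return 100.0 * (tuned_acc - base_acc) / base_acc


# Toy data only: four questions, hypothetical model outputs.
answer_key = ["A", "B", "C", "D"]
base_predictions = ["A", "B", "D", "A"]
tuned_predictions = ["a", "B", "C", "A"]

base_acc = accuracy(base_predictions, answer_key)
tuned_acc = accuracy(tuned_predictions, answer_key)
```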
Advancing Fitness Intelligence
The study confirms that each stage of the training pipeline is essential for balancing domain expertise with general performance. By moving beyond simple text generation and focusing on verifiable reasoning and structured output, FitOne represents a significant step toward more reliable, automated fitness coaching. The authors believe this research provides a blueprint for developing domain-specific LLMs in other specialized fields where accuracy and safety are paramount.