Paper Abstract

Electrocardiogram (ECG) monitoring in Internet of Medical Things (IoMT) networks is constrained by strict data-sharing regulations and privacy concerns. Federated learning (FL) enables collaborative learning by keeping raw ECG data on devices, but frequent transmissions of high-dimensional model updates incur heavy per-round traffic over bandwidth-limited links. To alleviate this bottleneck, federated distillation (FD) replaces parameter exchange with logit-based knowledge transfer. However, the performance of FD often degrades under the non-independent and identically distributed (non-IID) and long-tailed label distributions in ECG deployments. To address these challenges, we propose a bidirectional federated knowledge distillation (BiFedKD) framework that employs an aggregation-by-distillation pipeline with temperature scaling to produce a stable global distillation signal for cross-client alignment. Experiments on the MIT-BIH Arrhythmia dataset show that BiFedKD improves accuracy and Macro-F1 over the baseline by 3.52% and 9.93%, respectively. Moreover, to reach the same Macro-F1, BiFedKD reduces communication overhead by 40% and computation cost by 71.7% compared with the baseline.

BiFedKD: Bidirectional Federated Knowledge Distillation Framework for Non-IID and Long-Tailed ECG Monitoring
This research addresses the challenge of training accurate machine learning models for electrocardiogram (ECG) heart monitoring across multiple medical devices without compromising patient privacy. In medical settings, data are often "non-IID" (different devices see different mixes of heart conditions) and "long-tailed" (normal heart rhythms dominate, while dangerous arrhythmias are rare). Standard collaborative learning methods struggle with these imbalances and transmit large model updates that can overwhelm bandwidth-limited medical networks. The authors propose a framework called BiFedKD, which uses a "bidirectional" distillation process to produce a more stable global distillation signal while reducing the communication and computation required.

How the Framework Works

Instead of sharing raw data or large model updates, BiFedKD uses a process called knowledge distillation. Each local device (client) trains its own model and then shares only its "logits" (the model's raw, pre-softmax output scores) on a small, shared public dataset. The central server collects these outputs and uses a "teacher model" to aggregate them. By applying temperature scaling, the server smooths the predictions so that the aggregate is less dominated by the most common heart rhythms. This refined knowledge is then sent back to the local devices as a "global soft target," which guides the local models to learn from the collective experience of all other devices without ever seeing their private data.
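The exchange described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the example logits, the temperature values, and the use of plain averaging on the server are all hypothetical. The paper describes a server-side teacher model that aggregates by distillation, which this sketch replaces with a simple mean.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_soft_targets(client_logits, temperature):
    """Average the clients' softened predictions for one public sample.

    Assumption: plain averaging stands in for the server-side teacher model.
    """
    probs = [softmax(logits, temperature) for logits in client_logits]
    num_clients = len(probs)
    num_classes = len(probs[0])
    return [sum(p[k] for p in probs) / num_clients for k in range(num_classes)]

def distillation_loss(student_logits, soft_target, temperature):
    """KL divergence from the global soft target to the softened student output."""
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(soft_target, student) if t > 0)

# Hypothetical logits from three clients for one public ECG sample (3 classes).
client_logits = [[4.0, 1.0, 0.5], [3.5, 0.5, 2.0], [5.0, 0.0, 1.0]]

sharp = aggregate_soft_targets(client_logits, temperature=1.0)
smooth = aggregate_soft_targets(client_logits, temperature=4.0)

# A client would minimize this loss on the public data, alongside its local loss.
loss = distillation_loss([2.0, 1.0, 0.0], smooth, temperature=4.0)
```

With the higher temperature, the dominant class receives less of the probability mass in `smooth` than in `sharp`, so the minority classes contribute more to the signal that clients distill from.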

Key Performance Improvements

The researchers evaluated the framework on the MIT-BIH Arrhythmia dataset. Compared with the baseline, BiFedKD improved accuracy by 3.52% and Macro-F1 by 9.93%. Macro-F1 averages the per-class F1 scores with equal weight, which makes it a key metric for performance on rare or imbalanced classes. By distilling a stable global signal rather than relying on any single device's predictions, the framework is designed to keep the global model robust even when some devices have very little data on specific heart conditions.
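To see why Macro-F1 is reported alongside accuracy, consider the following sketch. The labels and counts are hypothetical and are not drawn from the MIT-BIH data. It scores a degenerate classifier that always predicts the common class on a long-tailed label set. Treating an undefined F1 as 0 is a convention assumed here.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Convention assumed here: F1 is 0 when the class is never seen or predicted.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical long-tailed test set: 95 normal beats ("N"), 5 rare beats ("V").
y_true = ["N"] * 95 + ["V"] * 5
# A degenerate model that always predicts the common class.
y_pred = ["N"] * 100

acc = accuracy(y_true, y_pred)              # 0.95
mf1 = macro_f1(y_true, y_pred, ["N", "V"])  # about 0.487
```

The model scores 95% accuracy while missing every rare beat, and Macro-F1 drops to roughly 0.49 because the rare class contributes an F1 of 0 with the same weight as the common class.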

Efficiency and Resource Savings

A major focus of this research is the practical constraints of medical hardware. Because BiFedKD relies on logit-based distillation rather than full model parameter exchange, it generates much less network traffic. The study found that, to reach the same Macro-F1 as the baseline, BiFedKD reduced communication overhead by 40% and computation cost by 71.7%. This makes the framework well suited to Internet of Medical Things (IoMT) environments, where devices have limited battery life, processing power, and network connectivity.
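A back-of-the-envelope calculation shows why exchanging logits is generally lighter than exchanging parameters. Every number below (model size, public dataset size, class count, 32-bit floats) is a hypothetical assumption chosen for illustration. None comes from the paper, and the calculation does not reproduce the reported 40% figure, which is measured against the paper's own baseline.

```python
BYTES_PER_FLOAT32 = 4

def parameter_payload_bytes(num_parameters):
    """Upload size per round when a client sends its full model as float32."""
    return num_parameters * BYTES_PER_FLOAT32

def logit_payload_bytes(num_public_samples, num_classes):
    """Upload size per round when a client sends logits on the public dataset."""
    return num_public_samples * num_classes * BYTES_PER_FLOAT32

# Hypothetical sizes: a 1M-parameter model versus logits for
# 2,000 public samples over 5 classes.
params = parameter_payload_bytes(1_000_000)  # 4,000,000 bytes
logits = logit_payload_bytes(2_000, 5)       # 40,000 bytes
ratio = params / logits                      # 100.0
```

The logit payload scales with the size of the public dataset and the number of classes, not with the model size, which is why the approach suits clients on bandwidth-limited links.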

Considerations for Implementation

The effectiveness of the framework depends on a shared public proxy dataset, which gives the server and clients a common set of inputs on which to compare predictions. The researchers also noted that the choice of server-side "teacher model" architecture affects the trade-off between performance and computation. More complex models such as CNN-Transformers can yield the highest accuracy, while lighter models such as smaller CNNs reduce server-side costs, offering flexibility depending on the resource constraints of the medical deployment.
