MIT researchers developed the Federated Tiny Training Engine (FTTE), a framework that speeds up privacy-preserving AI training on edge devices by 81 percent.
MIT researchers have developed a new framework that accelerates privacy-preserving artificial intelligence training by approximately 81 percent, potentially enabling more accurate and efficient models on resource-constrained edge devices. The system, known as the Federated Tiny Training Engine (FTTE), allows devices like smartwatches, sensors, and mobile phones to participate in collaborative AI training without compromising user data security.
Federated learning typically involves a network of devices that train a shared AI model by processing data locally and sending only model updates to a central server. While this keeps sensitive information on the device, the approach often struggles with heterogeneous networks in which devices have varying levels of memory, computational power, and connectivity. When the server waits for all devices to complete their tasks, the resulting lag can slow the training process or even cause it to fail.
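To make that baseline concrete, here is a minimal sketch of one synchronous federated round in Python. The least-squares model and every name in it are illustrative stand-ins, not the researchers' implementation; the point is only that the server cannot update the model until the slowest device replies.

```python
import numpy as np

def local_update(w, X, y, lr=0.01, steps=5):
    # On-device training: plain least-squares gradient descent, standing in
    # for whatever model the devices would actually train.
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def synchronous_round(w_global, shards):
    # Baseline federated round: the server blocks until every device replies,
    # then averages the locally trained weights. One slow device stalls the
    # entire round -- the bottleneck FTTE is built to avoid.
    updates = [local_update(w_global, X, y) for X, y in shards]
    return np.mean(updates, axis=0)

# Toy usage: four devices, each holding its own private data shard.
rng = np.random.default_rng(0)
shards = [(rng.normal(size=(32, 8)), rng.normal(size=32)) for _ in range(4)]
w = np.zeros(8)
for _ in range(3):
    w = synchronous_round(w, shards)
```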
The FTTE framework addresses these challenges by optimizing how data is processed and transmitted. By reducing the memory and communication overhead, the researchers have made it possible for a wider array of devices to contribute to model training, even in under-resourced settings where hardware capabilities are limited.
The researchers implemented three primary innovations to improve training efficiency. First, instead of broadcasting the entire AI model, the server sends only a subset of model parameters, sized to fit the memory capacity of the most constrained device in the network. Second, the server uses an asynchronous approach, accumulating incoming updates until a fixed capacity is reached rather than waiting for every device to respond.
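The first idea fits in a few lines. In the illustrative snippet below, the broadcast size is set by the weakest device's capacity; the random index selection is an assumption, since the article does not say how FTTE decides which parameters to send.

```python
import numpy as np

def choose_subset(n_params, device_capacities, rng):
    # Size the broadcast to fit the most memory-constrained device.
    # Random index selection is an assumption; the article does not
    # specify FTTE's actual selection criterion.
    k = min(device_capacities)
    return rng.choice(n_params, size=k, replace=False)

# Toy usage: a 10,000-parameter model and three devices that can hold
# 2,000, 4,000, and 8,000 parameters respectively.
rng = np.random.default_rng(0)
idx = choose_subset(10_000, [2_000, 4_000, 8_000], rng)
full_model = np.zeros(10_000)
payload = full_model[idx]   # the server transmits only this slice
```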
Finally, the system weights incoming updates based on their age. By giving newer updates more influence, the framework keeps stale information from dragging down the shared model. This semi-asynchronous method lets less powerful devices contribute without forcing more capable devices to sit idle, making better use of the available resources.
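The fixed-capacity buffer and the age-based weighting together might look something like the sketch below. The exponential staleness decay and all names here are assumptions made for illustration; the article says only that the server aggregates once a fixed capacity is reached and that newer updates are prioritized.

```python
import numpy as np

class SemiAsyncServer:
    # Sketch of the aggregation rule the article describes: buffer updates
    # until a fixed capacity is reached, then combine them with weights that
    # shrink as updates get staler. The exponential decay is an assumption.

    def __init__(self, n_params, buffer_size=3, decay=0.5):
        self.w = np.zeros(n_params)   # global model
        self.version = 0              # bumped on every aggregation
        self.buffer = []              # pending (staleness, update) pairs
        self.buffer_size = buffer_size
        self.decay = decay

    def receive(self, update, sent_at_version):
        # Staleness = how many aggregations happened since the device
        # downloaded the weights it trained on.
        staleness = self.version - sent_at_version
        self.buffer.append((staleness, update))
        if len(self.buffer) >= self.buffer_size:   # no waiting for stragglers
            self._aggregate()

    def _aggregate(self):
        weights = np.array([self.decay ** s for s, _ in self.buffer])
        weights /= weights.sum()                   # older updates count less
        updates = np.stack([u for _, u in self.buffer])
        self.w += weights @ updates
        self.version += 1
        self.buffer.clear()

# Toy usage: the third update fills the buffer and triggers aggregation.
server = SemiAsyncServer(n_params=8)
for _ in range(3):
    server.receive(np.ones(8), sent_at_version=0)
```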
In simulations involving hundreds of heterogeneous devices, the FTTE framework reduced on-device memory overhead by 80 percent and communication payloads by 69 percent, while reaching accuracy comparable to standard federated training. The researchers also tested the framework on a small network of real-world devices, demonstrating that the technology can function effectively across varying hardware specifications.
Lead author Irene Tenison, an electrical engineering and computer science graduate student, notes that the work is a significant step toward running powerful AI models on the devices people carry every day. Because the framework shortens training, it also helps conserve battery life on mobile devices. Future work will focus on improving the personalized performance of models on individual devices and on running larger experiments with real hardware. The research will be presented at the IEEE International Joint Conference on Neural Networks.