Federated learning is a machine learning approach that trains models across multiple devices or servers without centrally aggregating the data. Instead of collecting all the data in a single location, a shared model is trained locally on each device that holds data. Each device computes on its own data and sends only a model update (such as new weights or gradients) back to a central server. The server then averages these updates to improve the global model, and the process repeats over many rounds. This enables the model to learn from diverse data sources while preserving user privacy, since the raw data never leaves the device.
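To make the round-by-round flow concrete, here is a minimal sketch of one version of this loop in Python, assuming a simple linear model whose parameters are NumPy arrays. The function names (local_update, federated_average) and the hyperparameters are illustrative, not a standard API:

```python
import numpy as np

def local_update(global_weights, X, y, lr=0.1, epochs=5):
    """On-device: train a linear model by gradient descent; raw data stays here."""
    w = global_weights.copy()
    for _ in range(epochs):
        preds = X @ w
        grad = X.T @ (preds - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w  # only the updated weights leave the device

def federated_average(client_weights):
    """Server: average the clients' updates into a new global model."""
    return np.mean(client_weights, axis=0)

# Simulate a few devices, each holding private data the server never sees.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(10):  # iterative rounds of local training plus averaging
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates)
print(global_w)  # approaches [2.0, -1.0]
```

Note that the server only ever sees the weight vectors returned by local_update; each client's X and y never leave its own scope.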
One example of federated learning is a smartphone keyboard app that improves its predictive-text feature. Each user's typing data stays on their device. The keyboard app trains a model on that local input and periodically sends the resulting model updates to the server, which combines them to improve the keyboard for all users. The individual users' data is never stored in the cloud, reducing the risk of privacy violations while still leveraging the diverse typing patterns across users to make the model more accurate.
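In the spirit of the keyboard example, the toy sketch below stands in a bigram count table for the "model" and the new counts for the "update". A production keyboard would use a neural language model rather than counts, but the data flow is the same: raw text stays local, and only aggregate updates leave the device. All names here are illustrative:

```python
from collections import Counter

def local_bigram_update(sentences):
    """On-device: count word bigrams from the user's typing; text never leaves."""
    counts = Counter()
    for s in sentences:
        words = s.lower().split()
        counts.update(zip(words, words[1:]))
    return counts  # only aggregate counts are sent, never the raw text

def server_aggregate(global_counts, client_updates):
    """Server: merge count updates from all devices into the global model."""
    for update in client_updates:
        global_counts.update(update)
    return global_counts

def predict_next(global_counts, word):
    """Suggest the most likely next word from the aggregated counts."""
    candidates = {b: c for (a, b), c in global_counts.items() if a == word}
    return max(candidates, key=candidates.get) if candidates else None

user_a = ["see you soon", "see you tomorrow", "see you soon"]
user_b = ["see you later", "thank you so much"]

global_model = Counter()
updates = [local_bigram_update(s) for s in (user_a, user_b)]
global_model = server_aggregate(global_model, updates)
print(predict_next(global_model, "you"))  # "soon" (most frequent across users)
```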
Federated learning must also contend with practical challenges such as communication costs and data heterogeneity. Because training happens on local devices, only compact model updates, rather than raw datasets, travel to the central server, which can be especially beneficial in environments with limited connectivity. Aggregation can likewise account for the varying data distributions found across devices, for example by weighting each client's contribution by its local sample count, as sketched below. By taking advantage of local data while maintaining privacy and efficiency, federated learning helps create robust machine learning models well suited to applications where data privacy is a priority.
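One common refinement, used in the FedAvg algorithm, is to weight each client's update by its local sample count, so that clients with little data do not pull the average as strongly as clients with a lot. The sketch below is illustrative and assumes NumPy-array parameters:

```python
import numpy as np

def weighted_federated_average(client_weights, client_sizes):
    """Server: weight each client's update by its local sample count
    (FedAvg-style), so influence is proportional to data held."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Clients holding very different amounts of local data.
updates = [np.array([1.0, 1.0]), np.array([3.0, 3.0])]
sizes = [90, 10]  # the first device holds 9x more examples
print(weighted_federated_average(updates, sizes))  # [1.2 1.2], not [2. 2.]
```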