What can we learn about federated learning?


We all know that machine learning models have to be trained on data before they become fully functional systems that produce results. The same goes for AI. The traditional approach is to collect data, prepare it, train models, and deploy them. But considering the amount of data and the number of devices we use daily, that approach has shifted, and in comes federated learning.

If there are a lot of devices, smartphones for example, how do you efficiently learn from their data and deploy machine learning models back to them, without lag and without disturbing the user experience?

But first, theory

Federated learning is an old-new concept. It has been around for some time, but it still has major ground to cover, and some questions still need to be answered.

Federated learning is a decentralized approach to training machine learning models. Models aren't centered on one server; rather, a copy is deployed on each individual edge device, at the source, where it trains on raw local data. After the models are trained on each device, only the resulting model updates are sent to the central server, where they are aggregated and anonymized to improve a global model. The next step in the cycle is to send that improved model back to the edge devices, and so on.

So, on one side we have local models, and on the other the central server model. They all communicate and share aggregated local updates to make the global model more precise and efficient.

The main reason for federated learning's rising popularity is privacy. Data doesn't leave the local device but is utilized there. What travels to the central server is aggregated, which makes it harder to compromise data privacy or access private user information. In other words, it's harder to link the updates back to a user.

How does it actually work?

Federated learning is based on decentralization and a continuous, cyclic movement of machine learning models and their results. The best description is that it doesn't move the data to the model; it moves the model to the data.

The main model is stored at the central server. It is then shared across multiple edge devices, and this is where local training happens: on each device, the model uses and trains on local raw data. Note that at this phase the devices don't share that data with the central model or with other devices. After local training, the locally trained models are sent back to the central server, where the updates get aggregated into a global model. The central server applies the accumulated updates to the central model and validates its accuracy. Next, the cycle repeats: the server sends the improved model back to the edge devices for new iterations. This goes on and on, optimizing model performance until it reaches the expected accuracy and a better user experience.
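The cycle above can be sketched as federated averaging, the classic aggregation scheme. This is a minimal toy illustration, not the code any real platform uses: the linear model, the device count, and names like `local_update` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_devices, n_features = 5, 3

# The global model lives on the (simulated) central server.
global_weights = np.zeros(n_features)

def local_update(weights, data_x, data_y, lr=0.1, steps=10):
    """One round of local gradient descent on a device's private data
    (toy linear regression). Only the resulting weights leave the device."""
    w = weights.copy()
    for _ in range(steps):
        grad = 2 * data_x.T @ (data_x @ w - data_y) / len(data_y)
        w -= lr * grad
    return w

# Each device holds its own raw data, which never leaves it.
datasets = [(rng.normal(size=(20, n_features)), rng.normal(size=20))
            for _ in range(n_devices)]

for _ in range(5):  # a few federated rounds
    # 1. Devices train locally, starting from the current global model.
    local_weights = [local_update(global_weights, x, y) for x, y in datasets]
    # 2. The server aggregates the updates by simple averaging (FedAvg).
    global_weights = np.mean(local_weights, axis=0)

print(global_weights.shape)  # one shared model, shaped like each local one
```

In a real deployment, the averaging is typically weighted by each device's data size, and the "devices" are separate machines communicating over a network rather than entries in a Python list.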

Why is this good? Because the model improves over cyclic iterations without breaching user privacy, while still providing a personalized experience to device users.

[Figure: the federated learning model cycle. Source: https://www.altexsoft.com/blog/federated-learning/]

Here we can identify a couple of types of federated learning. The two main types are centralized and decentralized federated learning. Of course, some taxonomies identify more than those two, but we'll focus on the main ones.

Centralized federated learning

This type of federated learning follows the process described above. A central server is tasked with aggregating results and re-training the model, then sending it back to the edge devices and their local models. Only the central server communicates with the edge devices; the devices don't necessarily interact with each other.

Decentralized federated learning

The decentralized approach implies that there is no central server; instead, edge devices communicate with each other. They aggregate among themselves: model updates are shared only among the interconnected edge devices.
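One common way to realize this serverless aggregation is gossip averaging: each device repeatedly averages its model with its direct neighbours. The sketch below assumes a ring topology and one-number "models" purely for illustration.

```python
import numpy as np

n_devices = 4
# Each device starts with its own locally trained "model" (a single weight
# here, standing in for a full parameter vector).
weights = [np.array([float(i)]) for i in range(n_devices)]

def gossip_round(weights):
    """Each device averages its model with its two ring neighbours.
    No central server is involved; only neighbour-to-neighbour messages."""
    n = len(weights)
    return [(weights[(i - 1) % n] + weights[i] + weights[(i + 1) % n]) / 3
            for i in range(n)]

for _ in range(20):
    weights = gossip_round(weights)

# After repeated gossip rounds, all devices converge to the same model:
# the average of the initial models (here, 1.5).
print([round(float(w[0]), 3) for w in weights])  # [1.5, 1.5, 1.5, 1.5]
```

The design point is that neighbour averaging preserves the global mean while shrinking disagreement each round, so the devices reach consensus without any coordinator.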

Other types of federated learning would include horizontal, vertical, and federated transfer learning. 

The benefits of federated learning

Why is federated learning taking the ML and AI world by storm? The number of devices used daily, the vast volume of data they generate, and the need for real-time analytics and experiences have driven its development. The more accurate and safer the model, the better the final user experience. Data privacy is also a big determinant.

Increased data privacy and security

Since user data doesn't leave the local device, there is a higher level of data privacy. What is sent to the central server are just model updates, and with security measures in place it becomes harder to link results to specific users. In traditional machine learning, data is collected and sent to a central location where it is processed and prepared, which can raise privacy and security concerns. Federated learning allows private data to be held at the device, not in central storage.
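One hardening step often layered on top, in the spirit of differential privacy, is to clip each device's update and add noise before it leaves the device. This is a hedged sketch of that idea; `clip_norm` and `noise_std` are illustrative values, not tuned privacy parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

def privatize_update(update, clip_norm=1.0, noise_std=0.1):
    """Bound the update's L2 norm, then add Gaussian noise, so no single
    user's contribution can dominate or be read off the sent values."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / norm)
    return clipped + rng.normal(scale=noise_std, size=update.shape)

raw = np.array([3.0, 4.0])        # L2 norm 5.0, well over the clip bound
private = privatize_update(raw)   # clipped to norm 1.0, then noised
print(private.shape)
```

Real differential-privacy deployments also track a privacy budget across rounds; this sketch only shows the per-update mechanics.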

Collaborative nature

Because machine learning happens on many different devices at the same time, it promotes collaboration between those devices. They all contribute model updates to raise the accuracy of the main model.

Real-time predictions

Since models are trained on local devices, they get updates in real time and provide users with instant results based on their behavior. If the model learns from the user's device usage right away, it can improve and optimize performance and provide recommendations at that moment. That's also why it can make predictions in real time and anticipate user behavior.

Scalability, adaptability, and cost-efficiency

Of course, more data means a better and more precise model. However, storing all that data in one centralized repository can lead to inefficient, slow systems. Because federated learning keeps all data on local devices, it improves scalability and accelerates the ML and AI deployment and learning cycle. Not moving vast amounts of data to the cloud also decreases data storage costs.

Data accuracy and diversity

By exposing models to a wider range of devices and data sources, federated learning trains them on more diverse data sets. This enhances and extends their ability to accurately represent the data and make predictions. Access to a fuller range of data reduces bias, improves predictions, and uncovers new ways to utilize data.

Well, there are some questions

Federated learning may bring loads of benefits, but there are still some open questions about it. Nothing is perfect, and there will always be something that needs to be improved and worked on.

Low communication efficiency

Since federated learning involves many devices, communication among those same devices will inevitably be an issue. Sharing updates can be slow for many reasons, and bottlenecks occur if all updates are sent at once; they should instead be sent out in staggered iterations.
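One widely used way to reduce the communication load is to compress updates before sending them, for example by keeping only the largest-magnitude entries ("top-k sparsification"). The sketch below is illustrative; the function names and the tiny update vector are assumptions, not a specific library's API.

```python
import numpy as np

def sparsify(update, k):
    """Device side: keep only the k largest-magnitude entries of the
    update, so (indices, values) is all that goes over the network."""
    idx = np.argsort(np.abs(update))[-k:]
    return idx, update[idx]

def densify(idx, values, size):
    """Server side: rebuild a dense update vector, with zeros for the
    entries the device chose not to send."""
    dense = np.zeros(size)
    dense[idx] = values
    return dense

update = np.array([0.01, -2.0, 0.3, 1.5, -0.02])
idx, vals = sparsify(update, k=2)        # send 2 of 5 entries
restored = densify(idx, vals, update.size)
print(restored)  # [ 0.  -2.   0.   1.5  0. ]
```

Quantization (sending lower-precision values) is the other common lever, and the two are often combined; both trade a little accuracy per round for a large drop in bandwidth.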

Heterogeneity of systems and data

Multiple different devices mean that each one generates different data. They also differ in connectivity, usage, and how they work in general. Data therefore comes in different forms and at different velocities, especially since various devices are connected to different types of networks. If not handled beforehand, this makes it harder for federated learning to stay efficient and accurate.

Obvious limitations to data cleaning and labeling

With heterogeneous systems and devices, there will naturally be issues with the data itself. In the traditional machine learning approach, data is stored in a central repository, where it can be cleaned, prepared, and labeled. Since federated learning happens at the edge-device level, on raw data, it's harder to guarantee fully clean data.

What comes next?

Well, federated learning is something that has already taken root in machine learning. Since it still has some issues to resolve, there is room for improvement and future exploration, and it will keep growing in importance. Given the increasing number of devices we use each day, federated learning has already found a home on most of them.

As technology evolves, the communication limitations will keep shrinking, not to mention the tools and methods that already address issues with raw data. So don't be surprised when federated learning becomes a staple of the ML community and a widely discussed topic.
