How will federated learning influence your everyday life?

Artificial intelligence (AI), including machine learning, is widely considered one of the biggest innovations driving the next industrial revolution. Just as crude oil and electricity became the fundamental resources of modern industry, data is the crucial resource for AI and machine learning.

CyberVein
Jul 1, 2020

The Conflict between Data Privacy and Demand

The size of the training dataset strongly affects the reliability and accuracy of the machine learning (ML) results used to enhance the performance of AI. However, getting useful data from the internet isn't always an easy task. Web crawling is commonly used to feed hungry ML algorithms with the freshest data, but large-scale data harvesting can turn into an ethical issue, as in the Facebook–Cambridge Analytica data scandal. The personal data harvested in that case was reportedly used for political advertising around the United States' 2016 presidential election and the UK's Brexit referendum. Data protection regulations have since come into force in many jurisdictions, such as the General Data Protection Regulation (GDPR) in Europe, which governs how organizations use and protect personal data.

What Is Federated Learning

How does federated learning resolve the conflict between private information protection and data demand?

Federated learning is a distributed learning method that aims for a modeling effect comparable to that of traditional ML algorithms. However, rather than centralizing all the raw data as traditional ML does, federated learning distributes computing tasks to multiple nodes. The data stays in each participant's local database, where participants keep control over their data and devices.

Let's use a classic analogy. The ML model is a sheep, and the data is grass. The traditional way to rear sheep is to buy grass and transport it to the sheep's location, much like when we buy datasets and move them to a central server. However, privacy concerns and regulations prevent us from moving the data: the grass can no longer travel outside its local area. Federated learning takes the reverse approach. We let the sheep graze on multiple grasslands, so the ML model is built in a distributed manner without the data traveling outside its local area. In the end, the ML model grows from everyone's data, just as the sheep feeds on everyone's grass (2020, Federated Learning).

Participants in federated learning include not only enterprise servers and IoT devices but also personal devices, such as your smartphone or PC, depending on where the data is located. Initially, each participant receives model parameters from the server that organizes the federated learning. The participant applies the model to its local data and uses the result to update the model repeatedly until the model becomes stable. In the next step, the server collects the model parameters from all participants and integrates them into a final model. This process is known as the federated averaging algorithm. During the entire process, the data is never transferred; only the model parameters are transferred.
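To make the loop above more concrete, here is a minimal sketch of federated averaging in plain Python. It is not CyberVein's implementation. The linear model, the toy data, and every name in it (`local_update`, `federated_average`, `run_rounds`) are assumptions made for illustration, and it repeats the collect-and-average step over several rounds, as the algorithm is commonly described.

```python
# Minimal federated averaging sketch (hypothetical names, toy data).
# Model: y = w * x + b, trained with plain gradient descent on squared error.

def local_update(params, data, lr=0.1, epochs=5):
    """Participant step: train on local data only, return updated parameters."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / n
            grad_b += 2 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def federated_average(updates, sizes):
    """Server step: average parameters, weighted by each local sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b

def run_rounds(clients, rounds=50):
    """Repeat: send parameters out, train locally, average the results."""
    params = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(params, data) for data in clients]
        params = federated_average(updates, [len(d) for d in clients])
    return params

# Three participants hold private samples of the same relation, y = 2x + 1.
# Only (w, b) ever leaves a participant; the (x, y) pairs stay local.
clients = [
    [(-1.0, -1.0), (1.0, 3.0)],
    [(-2.0, -3.0), (0.0, 1.0), (2.0, 5.0)],
    [(0.0, 1.0), (1.0, 3.0)],
]
w, b = run_rounds(clients)
```

On this toy data the averaged parameters settle very close to w = 2 and b = 1, even though no participant ever shares its raw samples with the server.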

CyberVein provides a federated learning platform that allows database owners to put their stagnant resources to use and researchers to feed their hungry AI models in a safe and effective environment, without data leaving the local server. The Zhejiang University CyberVein R&D Center is CyberVein's headquarters for research and development of the technology; the goal is to eliminate data silos and make data much more valuable.

CyberVein Federated Learning supports both vertical and horizontal federated learning. Vertical federated learning aims to increase the number of features per sample by combining databases that cover the same people but hold different attributes about them. Horizontal federated learning aims to increase the sample size by combining databases that hold the same attributes but cover different people. In both cases, firms train their models on local data before the models from all firms are integrated into a joint model. The models are exchanged under encryption, which enhances the level of information security.
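The difference between the two settings can be illustrated with a small sketch. Everything here is hypothetical: the toy databases, the feature names, and the `suggest_setting` heuristic are assumptions for illustration, not part of any CyberVein API.

```python
# Hypothetical toy databases: {user_id: {feature_name: value}}.
bank = {"u1": {"income": 52}, "u2": {"income": 38}, "u3": {"income": 71}}
insurer = {"u2": {"claims": 1}, "u3": {"claims": 0}, "u4": {"claims": 2}}
bank_b = {"u7": {"income": 45}, "u8": {"income": 60}}

def overlap(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def features(db):
    """All feature names that appear in a database."""
    return set().union(*(row.keys() for row in db.values()))

def suggest_setting(db_a, db_b):
    """Crude heuristic: compare user overlap with feature overlap."""
    user_ov = overlap(set(db_a), set(db_b))
    feat_ov = overlap(features(db_a), features(db_b))
    if feat_ov > user_ov:
        return "horizontal"  # same features, different people: more samples
    if user_ov > feat_ov:
        return "vertical"    # same people, different features: more attributes
    return "unclear"

print(suggest_setting(bank, insurer))  # bank and insurer share users u2 and u3
print(suggest_setting(bank, bank_b))   # two banks share only the feature set
```

The bank and the insurer know different things about overlapping customers, which is the vertical case. The two banks record the same attribute for different customers, which is the horizontal case.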

Between enterprises, federated learning creates opportunities to mine data across industries with far less risk of leaking data or violating data protection laws, because the data never leaves the original databases.

Potential Applications

Auto Insurance

If you want to buy car insurance, the information you provide plays a vital role in determining your level of risk. Car insurance companies want to minimize risk and maximize profit by setting customized insurance plans for each client. Therefore, their databases should include information from many directions, such as consumption data from banking institutions, IoV (Internet of Vehicles) data from motor corporations, and credit ratings from various sources. The databases also need to be updated frequently, have wide coverage, and contain information matched to the client. However, most insurance companies, especially small firms, have little internal history on past clients and limited data about new clients.

Information asymmetry can lead to unfair pricing for all clients: without detailed data, an insurer's only option is to charge each client a premium based on the averaged, aggregated risk. Limited information can also introduce biases or flaws when determining a premium.

With federated learning, insurance companies can draw on information from different sectors to enrich their pricing models and predict risk more accurately and dynamically. As a result, low-risk clients can benefit from lower insurance rates without bearing the cost of high-risk clients, and insurance companies can improve their profits.

Health Care

CyberVein's federated learning platform supports various applications through its medical big-data platform, such as diagnosing keratitis. The algorithm model can reinforce the accuracy of doctors' diagnoses, especially for less experienced doctors. Because the diagnosis model has been trained with federated learning, it learns from case samples from different hospitals without compromising patient privacy or data confidentiality.

Different types of keratitis caused by bacteria, fungi, and viruses have subtle visual differences, making it hard to diagnose them correctly with the naked eye and to determine the correct treatment plan for the patient. A serious misdiagnosis can cost the patient their sight. A tested federated learning model achieved a diagnostic accuracy rate of 80%, which was better than 96% of the doctors who volunteered for the experiment.

This new method of diagnosis can strengthen the ability of all doctors and improve diagnostic accuracy for patients. With its support, even less experienced doctors can approach the diagnostic accuracy of experienced doctors. With an accurate diagnosis, doctors can construct appropriate treatment plans, ultimately improving the cure rate of the disease.

User-data protection policies are being enforced in more countries, so companies have to collaborate and develop their AI using a new method that doesn’t sacrifice people’s privacy but still brings convenience to all.

CyberVein reinvents decentralized databases and the way we secure and monetize information.