Something You Should Know About Federated Learning

Ryan
Published in CyberVein
May 11, 2020

There are now an estimated 7 billion connected devices in the world, and they are constantly generating new data. Traditional analytics and machine learning need that data to be collected centrally before it is processed to yield insights. The downside of this architecture is that all the data collected by local devices and sensors is sent to a central server for processing, and the results are then returned to the devices. This round trip limits a model’s ability to learn in real time.

Federated learning, in contrast, is an approach in which each device downloads the current model and computes an updated model locally using its own data. These locally trained models are then sent from the devices back to the central server, where they are aggregated (for example, by averaging their weights), and a single consolidated and improved global model is sent back to the devices.
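The round described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular framework's implementation: it assumes a toy linear regression model, and the names `local_update`, `federated_average`, `run_round` and `make_clients` are hypothetical helpers invented for this example. The aggregation step weights each device by its number of samples, which is one common way to average.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Device side: start from the global weights and take a few
    gradient steps of least-squares regression on local data only."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server side: average the client models, weighting each one
    by the number of samples it was trained on."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

def run_round(global_w, clients):
    """One round: broadcast the model, train locally, aggregate.
    Only weights travel to the server; raw data stays on the device."""
    updates = [local_update(global_w, X, y) for X, y in clients]
    sizes = [len(y) for _, y in clients]
    return federated_average(updates, sizes)

def make_clients(true_w, sizes, seed=0):
    """Synthetic, noise-free local datasets (assumption for the demo)."""
    rng = np.random.default_rng(seed)
    clients = []
    for n in sizes:
        X = rng.normal(size=(n, len(true_w)))
        clients.append((X, X @ true_w))
    return clients

if __name__ == "__main__":
    true_w = np.array([2.0, -1.0])
    clients = make_clients(true_w, [20, 50, 30])
    w = np.zeros(2)
    for _ in range(50):
        w = run_round(w, clients)
    print(w)  # approaches true_w without any client sharing its data
```

In a real deployment the local model would typically be a neural network and the devices would not all participate in every round, but the broadcast, local training and aggregation steps keep the same shape.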

Approaches for Federated Learning

Federated learning can be broadly classified as Single Party or Multi-Party. In a Single Party system, only one entity governs the distributed data capture and flow system. That system can take several forms, such as a smartphone or IoT app, network devices, distributed data warehouses, or machines used by employees. Models are trained in a federated manner on data that has the same structure across all client devices, and in most cases each data point is unique to the device or user.

More generally, federated learning allows machine learning algorithms to gain experience from a broad range of data sets held at different locations. The approach enables multiple organizations to collaborate on developing models without directly sharing sensitive data with each other. Over the course of several training iterations, the shared models are exposed to a significantly wider range of data than any single organization possesses in-house. In other words, federated learning decentralizes machine learning by removing the need to pool data in a single location. Instead, the model is trained over multiple iterations at different locations.

Smarter Medical Systems with Federated Learning

If you have read CyberVein’s earlier reports or articles, you should already be familiar with federated learning. Compared with the PISR distributed database, CyberVein Federated Learning is more like a house’s intelligent access control system, which guarantees the security of the “properties” in the room. CyberVein’s aim is to prevent web crawler companies from selling the data of various enterprises, so that AI models can be trained safely and effectively without data leaving the local server. On the premise of ensuring data security, the data can be converted into services and products that meet people’s needs more accurately, so that it better serves society and continuously improves people’s well-being.

CyberVein proposes a federated learning workflow for medical research that transfers the knowledge a model has learned across different distributed datasets in order to maintain federated learning performance. Knowledge distillation is used to reduce the volume of gradient data transmitted, which saves substantial communication overhead.
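The article does not describe how CyberVein applies knowledge distillation, so the following is only a general illustration of the technique. In distillation, a "student" model is trained to match the softened output probabilities of a "teacher" model. One common way this can save bandwidth in a federated setting is that participants exchange model outputs (logits) on a shared set of examples instead of full weight or gradient tensors; that design is an assumption here, not something the source confirms. The function names and the temperature value are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a higher temperature gives softer
    probabilities that reveal more about how classes relate."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL divergence between the softened teacher and student
    distributions, scaled by temperature squared as is conventional."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    return float(kl.mean() * temperature ** 2)

if __name__ == "__main__":
    # Two examples, three classes (made-up numbers for illustration).
    teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
    student = np.array([[2.0, 1.5, 0.5], [0.5, 1.0, 0.3]])
    print(distillation_loss(student, teacher))  # positive: outputs differ
    print(distillation_loss(teacher, teacher))  # 0.0: outputs match
```

For a classifier with a handful of output classes, a batch of logits is far smaller than the model's full parameter set, which is where the communication saving would come from under this assumption.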

The decentralized features of the blockchain and the decentralized federated learning algorithm effectively protect the security of communications and computing. Whether the data is static or dynamic, and on the premise of satisfying data privacy, security and regulatory requirements, CyberVein Federated Learning provides a machine learning framework that lets participants use their data together more efficiently and accurately. The resulting knowledge has potential value for improving medical quality, controlling expenses effectively, and ensuring medical safety.

Built on the medical big data platform, the platform supports a range of applications that improve diagnosis and treatment by medical personnel, assist hospital managers in decision-making, accelerate the implementation of scientific research results, and provide patients with accurate medical services. These applications include clinical decision support, large-scale statistical analysis of single-disease cases, comparison of treatment methods and efficacy, precise diagnosis and personalized treatment, adverse reaction and error analysis reminders, health prediction and early warning, refined management decision support, verification of scientific research results, auxiliary medication analysis, and drug development.

The Zhejiang University-CyberVein R&D Center has implemented this workflow with several hospitals to diagnose the different types of keratitis accurately. Keratitis can be caused by bacteria, fungi or viruses, and each type shows only subtle visual differences, which makes it extremely hard to diagnose correctly. The R&D Center successfully used federated learning to train a model on data from the hospitals and achieved an 80% accuracy in diagnosing keratitis. This performance was better than that of 96% of the doctors who participated in the experiment. With an accurate diagnosis, doctors can construct more appropriate treatment plans for their patients, ultimately improving the cure rate of the disease.
