Federated Learning 101 with FEDn
Training good machine learning models requires access to good data. The problem is that most of the world’s data is not accessible for training machine learning models: it often cannot be moved or shared because of privacy concerns, regulatory constraints, or infrastructure limitations.
In federated learning, we distribute the training of machine learning models to where the data is, addressing critical issues such as data privacy, data security, data access rights, and access to distributed data. This not only enables machine learning on problems unreachable with traditional methods, but it also unlocks new and highly valuable business areas and enables new business models.
The following is a short, easy-to-follow tutorial for setting up your first federated learning project with FEDn. It deploys a pseudo-distributed federated learning project in your local environment. In the ODSC 2021 workshop, we will use the STACKn SaaS platform (which can run anywhere in the distributed cloud) for an even easier setup of a FEDn federated learning project.
Getting started with your first federated learning project
The easiest way to start with FEDn is to use the provided docker-compose templates to launch a pseudo-distributed environment consisting of the core federated learning components — one Reducer, one Combiner, and a few Clients. Together with the supporting storage and database services, this makes up a minimal system for developing a federated model and learning the FEDn architecture. FEDn projects are templated projects that contain the user-provided model application components needed for federated training. The GitHub repository at https://github.com/scaleoutsystems/fedn bundles two such test projects in the ‘test’ folder, and many more are available in external repositories. These projects can be used as templates for creating your own custom federated model.
We provide docker-compose templates for a minimal standalone, pseudo-distributed Docker deployment, useful for local testing and development on a single host machine.
Clone the repository (make sure to use git-lfs!) and then follow these steps:
1. Create a default docker network
We need to make sure that all services deployed on our single host can communicate on the same docker network. Therefore, our provided docker-compose templates use a default external network ‘fedn_default’. First, create this network:
$ docker network create fedn_default
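Running the create command a second time fails, because Docker refuses to create a network whose name already exists. A minimal sketch of an idempotent variant follows, assuming a POSIX shell. The function name ensure_network is our own and is not part of FEDn or Docker.

```shell
#!/bin/sh
# Create a Docker network only if it does not exist yet.
# ensure_network is a hypothetical helper name, not part of FEDn or Docker.
ensure_network() {
    name="$1"
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        docker network create "$name" >/dev/null && echo "network $name created"
    fi
}

# Example call (uncomment to run against a local Docker daemon):
# ensure_network fedn_default
```

This makes the step safe to repeat when you tear the sandbox down and bring it back up.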
2. Deploy the base services (Minio and MongoDB)
$ docker-compose -f config/base-services.yaml up
Make sure you can access the following services before proceeding to the next steps:
- Minio: localhost:9000
- Mongo Express: localhost:8081
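If you prefer to check this from the command line, the sketch below polls an address until it answers. It assumes that both services speak plain HTTP on the ports listed above and that curl is installed. The helper name wait_for_http is our own.

```shell
#!/bin/sh
# Poll an HTTP endpoint up to N times, roughly one second apart.
# wait_for_http is a hypothetical helper name; it needs curl on the PATH.
wait_for_http() {
    url="$1"
    tries="${2:-30}"
    attempt=0
    while [ "$attempt" -lt "$tries" ]; do
        # Any HTTP response (even 401 or 403) means the service is listening.
        if curl -s -o /dev/null --max-time 2 "$url"; then
            echo "$url is reachable"
            return 0
        fi
        sleep 1
        attempt=$((attempt + 1))
    done
    echo "$url not reachable after $tries attempts" >&2
    return 1
}

# Example calls (uncomment once the base services are starting up):
# wait_for_http http://localhost:9000 60   # Minio
# wait_for_http http://localhost:8081 60   # Mongo Express
```

The check only confirms that something answers on the port. It does not verify credentials or that the service is fully initialized.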
3. Start the Reducer
Copy the settings config file for the reducer, ‘config/settings-reducer.yaml.template’ to ‘config/settings-reducer.yaml’. You do not need to make any changes to this file to run the sandbox. To start the reducer service:
$ docker-compose -f config/reducer.yaml up
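The copy step can be scripted so that it never overwrites a settings file you have already edited. This is a small sketch under that assumption. The helper name copy_template is our own and not part of FEDn.

```shell
#!/bin/sh
# Copy a settings template to its live name unless the target already exists.
# copy_template is a hypothetical helper name, not part of FEDn.
copy_template() {
    template="$1"
    # Strip the trailing ".template" to get the target file name.
    target="${template%.template}"
    if [ ! -f "$template" ]; then
        echo "missing template: $template" >&2
        return 1
    fi
    if [ -f "$target" ]; then
        echo "keeping existing $target"
    else
        cp "$template" "$target" && echo "created $target"
    fi
}

# Example call from the repository root (uncomment to run):
# copy_template config/settings-reducer.yaml.template
```

The same helper works for any file that follows the ‘.yaml.template’ naming used in the config folder.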
4. Start a combiner
Copy the settings config file for the combiner, ‘config/settings-combiner.yaml.template’ to ‘config/settings-combiner.yaml’. You do not need to make any changes to this file to run the sandbox. To start the combiner service and attach it to the reducer:
$ docker-compose -f config/combiner.yaml up
Make sure that you can access the Reducer UI at https://localhost:8090 and that the combiner is up and running before proceeding to the next step.
Train a federated model
Training a federated learning model on the FEDn network involves uploading a compute package, seeding the model, and attaching clients to the network. Follow the instructions for the MNIST getting-started example in the GitHub repository to set up the environment and train a model for digit classification using the MNIST dataset.
Updating/changing the compute package and/or the seed model
By design, it is not possible to simply delete the compute package to restart the alliance. This is a security constraint that prevents arbitrary code packages from being swapped into an already configured federation. To restart and reseed the alliance in development mode, navigate to Mongo Express (localhost:8081), log in (the credentials are in config/base-services.yaml), delete the entire collection ‘fedn-test-network’, and then restart all services.
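The restart of all services can be sketched as a small script. The text above only says to restart the services, so the order used here (combiner, reducer, base services on the way down, and the reverse on the way up) is our assumption. So are the helper name restart_sandbox and the COMPOSE variable. The sketch does not delete the collection; do that in Mongo Express first.

```shell
#!/bin/sh
# Stop and restart the sandbox services.
# restart_sandbox and the COMPOSE variable are hypothetical names for this sketch.
# Set COMPOSE="docker compose" if you use the Compose v2 plugin.
restart_sandbox() {
    compose="${COMPOSE:-docker-compose}"
    # Tear down in reverse order of startup.
    for f in config/combiner.yaml config/reducer.yaml config/base-services.yaml; do
        $compose -f "$f" down || return 1
    done
    # Bring the services back up detached (-d), base services first.
    for f in config/base-services.yaml config/reducer.yaml config/combiner.yaml; do
        $compose -f "$f" up -d || return 1
    done
}

# Example call from the repository root (uncomment to run):
# restart_sandbox
```

The sketch does not wait for one service to become ready before starting the next, so the reducer or combiner may need a moment to connect after startup.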
Using FEDn in STACKn
STACKn lets a user set up FEDn networks as ‘Apps’ directly from the WebUI. STACKn also provides useful additional functionality such as serving the federated model using e.g. TensorFlow Serving, TorchServe, MLflow or custom serving. Refer to the STACKn documentation to set this up, or join the ODSC workshop on June 10, 2021, to learn how to use STACKn for deploying a FEDn federated learning environment.
The deployment, node sizing, and tuning of a FEDn network in production depend heavily on the use case (cross-silo, cross-device, etc.), the size of model updates, the available infrastructure, and the strategy for providing end-to-end security.
To learn more, see the fully distributed reference deployment in the GitHub documentation, under ‘Distributed deployment’.
Where to go from here for Federated Learning
On GitHub, there are a number of different example projects/clients that can be used for further experimentation:
– PyTorch version of the MNIST getting-started example in test/mnist-pytorch
– Sentiment analysis with a Keras CNN-LSTM trained on the IMDB dataset (cross-silo): https://github.com/scaleoutsystems/FEDn-client-imdb-keras
– Sentiment analysis with a PyTorch CNN trained on the IMDB dataset (cross-silo): https://github.com/scaleoutsystems/FEDn-client-imdb-pytorch.git
– VGG16 trained on CIFAR-10 with a PyTorch client (cross-silo): https://github.com/scaleoutsystems/FEDn-client-cifar10-pytorch
– Human activity recognition with a Keras CNN based on the casa dataset (cross-device): https://github.com/scaleoutsystems/FEDn-client-casa-keras
Wrapping up, I hope this has been a good introduction to federated learning with FEDn, and that we get a chance to meet at ODSC 2021 in my session “Federated Learning from Scratch to Production with Scaleout,” or somewhere else in the future. For more information, please contact Scaleout at firstname.lastname@example.org or @scaleout on Twitter.
About the author/ODSC Europe 2021 speaker on Federated Learning:
Daniel Zakrisson is the CEO and co-founder of Scaleout and has a long background as an entrepreneur and leader in deep tech companies. He co-founded Scandinavia’s first personal DNA-testing company in 2008, was CTO at a growing multinational medtech company for 7 years, and then co-founded the first international accelerator for blockchain startups. As CTO and CEO, he has many years of experience in leading deep tech projects and taking them to market.