VeryFL: A GitHub Benchmark for Blockchain-Based Federated Learning Frameworks

Introduction

VeryFL is a lightweight federated learning framework integrated with an Ethereum blockchain. It uses PyTorch for the federated learning components and Solidity for the on-chain logic, which runs on an Ethereum network and gives blockchain-based federated learning algorithms a real execution environment. VeryFL serves several purposes for the education and research communities.

First, it is an educational resource for learning the basic workflow of federated learning. Second, it provides a platform for validating centralized federated learning algorithms. Most importantly, it offers an environment for testing and verifying blockchain-based federated learning algorithms on a real Ethereum network, which makes it a useful benchmark for researchers and developers in the field.

Dependencies

To ensure VeryFL operates smoothly, you’ll need to set up the following environments:

Ethereum Environment:

Node.js & npm: Ensure you have Node.js version 16.0.0 or higher and npm version 7.10.0 or later installed. Both are needed to install and run Ganache.
Ganache: Ganache is required for setting up a local Ethereum blockchain for development and testing. You can install it globally using npm:
```
npm install ganache --global
```

Python Environment:

Anaconda: Anaconda is recommended to manage your Python environment and dependencies effectively.
Python: VeryFL is compatible with Python versions 3.6 to 3.9.
PyTorch: Version 1.13 of PyTorch is a necessary dependency for the federated learning aspects of VeryFL.
Brownie: Brownie is a Python-based development and testing framework for smart contracts targeting the Ethereum Virtual Machine. Install it using pip:
```
pip install eth-brownie
```
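The requirements above can be checked before running any experiment. The following is a minimal sketch, not part of VeryFL: the helper names `python_supported` and `missing_tools` are hypothetical, and the executable names (`node`, `npm`, `ganache`, `brownie`) are assumptions about a default installation.

```python
import shutil
import sys

# Supported Python range taken from the dependency list above.
MIN_PYTHON = (3, 6)
MAX_PYTHON = (3, 9)

# Command-line tools named in the dependency list. The executable names
# are assumptions about a default installation.
REQUIRED_TOOLS = ("node", "npm", "ganache", "brownie")


def python_supported(version=None):
    """Return True if (major, minor) lies within the supported range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_PYTHON <= tuple(version[:2]) <= MAX_PYTHON


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Report problems with the current environment, if any.
if not python_supported():
    print("Unsupported Python version:", sys.version.split()[0])
for name in missing_tools():
    print("Not found on PATH:", name)
```

The check only confirms that the tools are present; it does not verify the Node.js, npm, or PyTorch versions.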

Figure: VeryFL framework architecture, showing how the federated learning and blockchain components are integrated.

Core Functionalities of VeryFL

VeryFL provides three core functions as a benchmark and experimental platform:

Federated Learning Experiment Execution

VeryFL can simulate federated learning experiments in both centralized and decentralized settings. The framework includes a collection of image classification datasets and implementations of classic federated learning algorithms, so users can investigate and compare different federated learning approaches in a controlled environment.

On-Chain Mechanisms via Solidity Smart Contracts

A primary goal of VeryFL is to offer an experimental platform for blockchain-integrated federated learning. To this end, it embeds an Ethereum network on which on-chain mechanisms can be implemented in Solidity. These smart contracts are deployed and executed within VeryFL, so researchers can study the role of the blockchain in federated learning in practice rather than in simulation.

Model Copyright Protection and Transaction Framework

VeryFL demonstrates a framework for protecting model copyright and supporting model transactions on the blockchain. Using model watermarking techniques, VeryFL embeds watermarks into models. The watermarks are then managed on the blockchain to establish model ownership and track transactions. Further details on this functionality can be found in the research article referenced below [2].
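The bookkeeping side of this idea can be illustrated without a blockchain. The sketch below is an assumption-laden illustration, not VeryFL's implementation: the watermark embedding itself is not shown, a SHA-256 digest stands in for the on-chain watermark record, and the `OwnershipRegistry` class is a hypothetical in-memory substitute for a smart contract.

```python
import hashlib


def watermark_fingerprint(watermark: bytes) -> str:
    """Hash the watermark so that only a fixed-size digest is recorded."""
    return hashlib.sha256(watermark).hexdigest()


class OwnershipRegistry:
    """In-memory stand-in for an on-chain ownership contract (hypothetical)."""

    def __init__(self):
        self._owner = {}    # fingerprint -> current owner
        self._history = {}  # fingerprint -> list of (sender, receiver)

    def register(self, fingerprint, owner):
        """Record the first owner of a watermarked model."""
        if fingerprint in self._owner:
            raise ValueError("model already registered")
        self._owner[fingerprint] = owner
        self._history[fingerprint] = []

    def transfer(self, fingerprint, sender, receiver):
        """Move ownership, allowed only for the current owner."""
        if self._owner.get(fingerprint) != sender:
            raise PermissionError("only the current owner can transfer")
        self._owner[fingerprint] = receiver
        self._history[fingerprint].append((sender, receiver))

    def owner_of(self, fingerprint):
        return self._owner.get(fingerprint)

    def history(self, fingerprint):
        return list(self._history.get(fingerprint, []))


# Usage: register a model, then sell it once.
registry = OwnershipRegistry()
fp = watermark_fingerprint(b"example-watermark-bits")
registry.register(fp, "alice")
registry.transfer(fp, "alice", "bob")
print(registry.owner_of(fp), registry.history(fp))
```

On a real chain the same checks would live in contract code, and the transaction log would provide the transfer history.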

Code Structure and Practical Usage

Quick Start Guide

To quickly initiate a federated learning experiment using VeryFL with a benchmark dataset like FashionMNIST, execute the following command in your terminal:

python test.py --benchmark FashionMNIST

This command runs the test.py script, which is designed to set up and execute a federated learning task based on the specified benchmark. The script utilizes configurations defined in ./config/benchmark.py to initialize the global parameters, training arguments, and the chosen federated learning algorithm.

Customizing Task Parameters

The ./config/benchmark.py file is central to configuring and customizing your federated learning experiments in VeryFL. Each benchmark defined in this file is structured into three key components:

global_args: These parameters define global federated learning settings such as the number of clients, the dataset to be used, and the model architecture.
train_args: This section specifies training hyperparameters including the learning rate and weight decay, allowing fine-tuning of the training process.
Algorithm: This component dictates the federated learning algorithm to be employed, defined by the Aggregator (server-side logic), Client (client-side logic), and Trainer (training process).

By modifying these parameters in the benchmark configuration file, users can tailor experiments to specific research questions or application requirements.
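To make the three-part structure concrete, here is a minimal sketch of what a benchmark definition and its lookup by name could look like. It is not the actual content of ./config/benchmark.py: the `Benchmark` dataclass, the `BENCHMARKS` table, every key name, and every value are assumptions chosen for illustration.

```python
import argparse
from dataclasses import dataclass, field


@dataclass
class Benchmark:
    """Hypothetical container mirroring the three components described above."""
    name: str
    global_args: dict = field(default_factory=dict)
    train_args: dict = field(default_factory=dict)
    algorithm: dict = field(default_factory=dict)


# Illustrative values only; the real keys and defaults may differ.
BENCHMARKS = {
    "FashionMNIST": Benchmark(
        name="FashionMNIST",
        global_args={"client_num": 10, "dataset": "FashionMNIST", "model": "simpleCNN"},
        train_args={"lr": 0.01, "weight_decay": 1e-5},
        algorithm={"aggregator": "FedAvgAggregator", "client": "BaseClient", "trainer": "SGDTrainer"},
    ),
}


def get_benchmark(name):
    """Look up a benchmark by the name passed to --benchmark."""
    try:
        return BENCHMARKS[name]
    except KeyError:
        raise ValueError(f"unknown benchmark: {name}") from None


def parse_args(argv=None):
    """Parse the --benchmark option used in the quick start command."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--benchmark", default="FashionMNIST")
    return parser.parse_args(argv)


# Usage: resolve the benchmark selected on the command line.
args = parse_args(["--benchmark", "FashionMNIST"])
bench = get_benchmark(args.benchmark)
print(bench.global_args["client_num"], bench.train_args["lr"], bench.algorithm["trainer"])
```

Keeping the three components separate means an experiment can swap the algorithm without touching the dataset or hyperparameter settings.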

Extending VeryFL with New FL Algorithms

VeryFL is designed to be extensible, allowing users to incorporate their own federated learning algorithms. To add a new algorithm:

Client-side Algorithm Implementation: Implement a new Trainer class in the ./client/trainer directory. This class should encapsulate the client-side training logic of your federated learning algorithm.
Server-side Algorithm Implementation: Develop a new aggregator class in the ./server/aggregation_alg directory. This class should define the server-side aggregation logic for your algorithm.

Once implemented, these new components can be integrated into VeryFL by updating the benchmark configurations to utilize your custom algorithm.
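The split between a client-side trainer and a server-side aggregator can be sketched as follows. This is an illustration under stated assumptions, not VeryFL's base classes: the class names and method signatures are hypothetical, FedAvg-style weighted averaging is used as the example algorithm, and plain Python floats replace PyTorch tensors so the example runs without PyTorch.

```python
class LocalSGDTrainer:
    """Client side: gradient descent on a 1-D least-squares model y = w * x."""

    def __init__(self, lr=0.1, epochs=5):
        self.lr = lr
        self.epochs = epochs

    def train(self, weights, data):
        """Return the updated weights and the number of local samples."""
        w = weights["w"]
        for _ in range(self.epochs):
            # Gradient of the mean squared error with respect to w.
            grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
            w -= self.lr * grad
        return {"w": w}, len(data)


class FedAvgAggregator:
    """Server side: average client weights, weighted by sample count."""

    def aggregate(self, updates):
        total = sum(n for _, n in updates)
        keys = updates[0][0].keys()
        return {k: sum(w[k] * n for w, n in updates) / total for k in keys}


# Usage: two clients whose data both follow y = 2 * x.
client_data = [[(1.0, 2.0)], [(2.0, 4.0)]]
trainer = LocalSGDTrainer()
aggregator = FedAvgAggregator()

global_weights = {"w": 0.0}
for _ in range(20):  # communication rounds
    updates = [trainer.train(global_weights, data) for data in client_data]
    global_weights = aggregator.aggregate(updates)

print(global_weights)  # w approaches 2.0
```

In VeryFL the equivalent classes would operate on PyTorch models and be referenced from the benchmark configuration.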

Integrating New On-Chain Mechanisms

To incorporate new on-chain functionalities into VeryFL:

Solidity Smart Contract Development: Implement the desired on-chain logic using Solidity within the ./chainEnv/contracts directory.
Smart Contract Deployment: Configure the deployment of your smart contracts when the network starts. This is managed within the ./chainfl/interact directory.
Function Call Wrapping with Brownie SDK: Wrap the function calls to your smart contracts using the Brownie SDK within the chainProxy class, located in ./chainfl/interact.
Blockchain Interaction during Training: Use the chainProxy class to call the blockchain from within your federated learning training process, so that on-chain mechanisms run as part of the training workflow.

By following these steps, VeryFL can be adapted to explore a wide range of blockchain-based federated learning innovations.
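The wrapping pattern in the steps above can be sketched as a thin proxy that hides contract calls behind ordinary Python methods. Everything here is an assumption for illustration: `ChainProxySketch` is not the real chainProxy class, its method names are hypothetical, and `FakeRegistryContract` is an in-memory object standing in for a deployed contract so the example runs without Brownie or an Ethereum node.

```python
import hashlib


class FakeRegistryContract:
    """In-memory stand-in for a deployed smart contract (hypothetical)."""

    def __init__(self):
        self._clients = {}  # address -> client id
        self._uploads = []  # (client id, round id, digest)

    def register(self, address):
        client_id = len(self._clients) + 1
        self._clients[address] = client_id
        return client_id

    def recordUpdate(self, client_id, round_id, digest):
        self._uploads.append((client_id, round_id, digest))

    def updateCount(self, round_id):
        return sum(1 for _, r, _ in self._uploads if r == round_id)


class ChainProxySketch:
    """Wraps contract calls so training code never touches the contract directly."""

    def __init__(self, contract, accounts):
        self._contract = contract
        self._accounts = list(accounts)
        self._next = 0

    def add_account(self):
        """Hand out the next unused account address."""
        address = self._accounts[self._next]
        self._next += 1
        return address

    def client_register(self, address):
        return self._contract.register(address)

    def upload_update(self, client_id, round_id, model_bytes):
        """Record a digest of the model update instead of the full weights."""
        digest = hashlib.sha256(model_bytes).hexdigest()
        self._contract.recordUpdate(client_id, round_id, digest)
        return digest

    def updates_in_round(self, round_id):
        return self._contract.updateCount(round_id)


# Usage inside a training loop: register two clients, then log one round.
proxy = ChainProxySketch(FakeRegistryContract(), ["0xA1", "0xB2"])
ids = [proxy.client_register(proxy.add_account()) for _ in range(2)]
for cid in ids:
    proxy.upload_update(cid, round_id=0, model_bytes=f"weights-{cid}".encode())
print(ids, proxy.updates_in_round(0))
```

With a real deployment, the proxy would hold a contract object obtained through Brownie and each method would issue a transaction or call; the training code using the proxy would not need to change.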

Relevant Publications

For deeper insights into VeryFL and its applications, refer to the following articles:

[1] [VeryFL Design] VeryFL: A Verify Federated Learning Framework Embedded with Blockchain (arXiv) – Details the design and architecture of the VeryFL framework.

[2] [Model Copyright] Tokenized Model: A Blockchain-Empowered Decentralized Model Ownership Verification Platform (arXiv) – Explores the model copyright protection and transaction capabilities implemented in VeryFL.

[3] [Overall Background] Towards Reliable Utilization of AIGC: Blockchain-Empowered Ownership Verification Mechanism (OJCS 2023) – Provides broader context on the role of blockchain in securing AI and AIGC.

[4] [Using VeryFL] A Decentralized Federated Learning Framework via Committee Mechanism with Convergence Guarantee (TPDS 2022) – Shows VeryFL applied in a decentralized federated learning scenario.