
fedgen's Introduction

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Research code that accompanies the paper Data-Free Knowledge Distillation for Heterogeneous Federated Learning. It contains implementations of the following algorithms: FedAvg, FedProx, FedDistill, and FedGen.

Install Requirements:

pip3 install -r requirements.txt

Prepare Dataset:

  • To generate a non-iid Mnist dataset following the Dirichlet distribution D(α=0.1) for 20 clients, using 50% of the total available training samples (see the partitioning sketch after this list):
cd FedGen/data/Mnist
python generate_niid_dirichlet.py --n_class 10 --sampling_ratio 0.5 --alpha 0.1 --n_user 20
### This will generate a dataset located at FedGen/data/Mnist/u20c10-alpha0.1-ratio0.5/
  • Similarly, to generate a non-iid EMnist dataset using 10% of the total available training samples:
cd FedGen/data/EMnist
python generate_niid_dirichlet.py --sampling_ratio 0.1 --alpha 0.1 --n_user 20 
### This will generate a dataset located at FedGen/data/EMnist/u20-letters-alpha0.1-ratio0.1/
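
For intuition, Dirichlet-based non-iid partitioning typically draws, for each class, a vector of client proportions from Dir(α) and splits that class's samples accordingly; smaller α yields more skewed clients. A minimal illustrative sketch of the idea (not the generation script itself):

import numpy as np

def dirichlet_partition(labels, n_clients=20, alpha=0.1, seed=0):
    # split sample indices across clients using per-class Dirichlet proportions
    rng = np.random.default_rng(seed)
    client_idxs = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idxs = np.where(labels == c)[0]
        rng.shuffle(idxs)
        proportions = rng.dirichlet(alpha * np.ones(n_clients))  # share of class c per client
        cut_points = (np.cumsum(proportions)[:-1] * len(idxs)).astype(int)
        for i, split in enumerate(np.split(idxs, cut_points)):
            client_idxs[i].extend(split.tolist())
    return client_idxs

# example: partition 6000 fake labels over 20 clients
labels = np.random.randint(0, 10, size=6000)
parts = dirichlet_partition(labels, n_clients=20, alpha=0.1)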

Run Experiments:

All experiments are run through the main file "main.py".

Run experiments on the Mnist Dataset:

python main.py --dataset Mnist-alpha0.1-ratio0.5 --algorithm FedGen --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3 
python main.py --dataset Mnist-alpha0.1-ratio0.5 --algorithm FedAvg --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3 
python main.py --dataset Mnist-alpha0.1-ratio0.5 --algorithm FedProx --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3 
python main.py --dataset Mnist-alpha0.1-ratio0.5 --algorithm FedDistill-FL --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3 

Run experiments on the EMnist Dataset:
python main.py --dataset EMnist-alpha0.1-ratio0.1 --algorithm FedAvg --batch_size 32 --local_epochs 20 --num_users 10 --lamda 1 --model cnn --learning_rate 0.01 --personal_learning_rate 0.01 --num_glob_iters 200 --times 3 
python main.py --dataset EMnist-alpha0.1-ratio0.1 --algorithm FedGen --batch_size 32 --local_epochs 20 --num_users 10 --lamda 1 --model cnn --learning_rate 0.01 --personal_learning_rate 0.01 --num_glob_iters 200 --times 3 
python main.py --dataset EMnist-alpha0.1-ratio0.1 --algorithm FedProx --batch_size 32 --local_epochs 20 --num_users 10 --lamda 1 --model cnn --learning_rate 0.01 --personal_learning_rate 0.01 --num_glob_iters 200 --times 3 
python main.py --dataset EMnist-alpha0.1-ratio0.1 --algorithm FedDistill-FL --batch_size 32 --local_epochs 20 --num_users 10 --lamda 1 --model cnn --learning_rate 0.01 --personal_learning_rate 0.01 --num_glob_iters 200 --times 3 


Plot

For the --algorithms argument, list the algorithm names separated by commas, e.g. --algorithms FedAvg,FedGen,FedProx

  python main_plot.py --dataset EMnist-alpha0.1-ratio0.1 --algorithms FedAvg,FedGen --batch_size 32 --local_epochs 20 --num_users 10 --num_glob_iters 200 --plot_legend 1

fedgen's People

Contributors

avivbick, xcrossd, zhuangdizhu


fedgen's Issues

Question about the implementation of "FedProx"

Hi.

Does your implementation of FedProx correspond to Algorithm 2 in the original FedProx paper? More specifically, the update formula at lines 53-54 of "fedoptimizer.py" looks a little strange. In particular, what does lambda mean in the FedProx algorithm?

The update formula as I understand it should be:
p.data = p.data - group['lr'] * (p.grad.data + group['mu'] * (p.data - pstar.data.clone()))

Looking forward to your reply.
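
For reference, here is how the update above would look as a standalone PyTorch optimizer. This is a minimal sketch of the formula, assuming mu plays the role of the proximal coefficient; it is not the repository's fedoptimizer.py:

import torch
from torch.optim import Optimizer

class FedProxSGD(Optimizer):
    # sketch of the proximal update above; not the repository's implementation
    def __init__(self, params, lr=0.01, mu=0.01):
        super().__init__(params, dict(lr=lr, mu=mu))

    @torch.no_grad()
    def step(self, global_params):
        # global_params: the frozen server weights w^t (the proximal anchor)
        for group in self.param_groups:
            for p, pstar in zip(group['params'], global_params):
                if p.grad is None:
                    continue
                # w <- w - lr * (grad F(w) + mu * (w - w^t))
                p.add_(p.grad + group['mu'] * (p - pstar), alpha=-group['lr'])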

Trainloader is not shuffled

The performance of FedAvg is not as good as FedGen's simply because the trainloader does not shuffle the data. After fixing this bug, FedGen is not as effective as FedAvg.

Wrong tensor type error

If you hit wrong-tensor-type errors when running experiments with the FedGen algorithm, see the changes in #3.

Cannot utilize GPU for FedGen

I ran the example FedGen experiment on Mnist from the README with the option "--device cuda", but found no process deployed on the GPU. Exploring the code, it seems "args.device" is not handled in all scripts. I also added "os.environ["CUDA_VISIBLE_DEVICES"] = '0'" in main.py, but the model is still deployed only on the CPU. How can I utilize the GPU for FedGen? I really appreciate your help!
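
For context, the usual PyTorch pattern for honoring a --device flag is to resolve the device once and move both the model and every batch onto it. A minimal sketch under that assumption (placeholder model, not the repository's code):

import torch
import torch.nn as nn

def resolve_device(requested: str) -> torch.device:
    # fall back to CPU when CUDA is requested but not available
    if requested.startswith("cuda") and not torch.cuda.is_available():
        return torch.device("cpu")
    return torch.device(requested)

device = resolve_device("cuda")          # e.g. the value passed via --device
model = nn.Linear(784, 10).to(device)    # move the model's parameters once
X = torch.randn(32, 784).to(device)      # move every batch before the forward pass
output = model(X)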

run the code on cuda device

It seems that the code does not support CUDA?

--device "cuda" can be set, but it always seems to run on the CPU.

Thanks

plot problem

I think that in plot_utils.py the variable 'all_curves', used outside the loop, only holds the last algorithm's results. As a result, when several algorithms are listed in the config, the plot's axis range follows the last algorithm's scope and cuts off the other algorithms' curves.

max_acc = np.max([max_acc, np.max(all_curves) ]) + 4e-2
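
A minimal sketch of the fix implied above, accumulating every algorithm's curve before computing the axis limit (the array names here are hypothetical, not the repository's):

import numpy as np

curves_by_algorithm = {
    "FedAvg": np.array([0.60, 0.72, 0.80]),
    "FedGen": np.array([0.65, 0.78, 0.85]),
}

# collect all curves rather than overwriting all_curves inside the loop
all_curves = []
for name, curve in curves_by_algorithm.items():
    all_curves.append(curve)
max_acc = np.max([np.max(c) for c in all_curves]) + 4e-2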

python main_plot.py --dataset EMnist-alpha0.1-ratio0.1 --algorithms FedAvg,FedGen,FedProx,FedDistill --batch_size 32 --local_epochs 20 --num_users 10 --num_glob_iters 200 --plot_legend 1

Reproduce "FedDF" baseline

Thank you for open-sourcing your project. I notice that "FedDF" (Ensemble Distillation for Robust Model Fusion in Federated Learning) is one of the baselines in your paper; however, you provide code only for FedAvg, FedProx, FedDistill, and FedGen. Could you please help me reproduce the results of FedDF? I really appreciate your help.

Partial Parameter Sharing Not Supported

It seems the code does not implement partial parameter sharing. As shown at line 103 of serverpFedGen.py, the partial-parameter option defaults to False, but the pseudo-code in the paper shows that only the classifier layer of the user's model is shared. Is this a bug, or is there something I misunderstand in the code?
self.aggregate_parameters()
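
For illustration, partial sharing would typically aggregate only the classifier parameters. A hypothetical sketch, assuming the user models share an architecture and the classifier layers' names start with a known prefix (both assumptions, not taken from the repository):

import torch

def aggregate_classifier_only(global_model, user_models, prefix="fc"):
    # average only the parameters whose names match the classifier layer(s)
    with torch.no_grad():
        for name, param in global_model.named_parameters():
            if name.startswith(prefix):
                stacked = torch.stack(
                    [dict(m.named_parameters())[name] for m in user_models])
                param.copy_(stacked.mean(dim=0))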

Unable to perform Mnist experiments

When I run "python main.py --dataset Mnist-alpha0.01-ratio0.05 --algorithm FedAvg --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3", I get the following problem. How can I solve it?

Average Global Accurancy = 0.0950, Loss = 2.31.
Traceback (most recent call last):
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\users\userbase.py", line 163, in get_next_train_batch
(X, y) = next(self.iter_trainloader)
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 633, in next
data = self._next_data()
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 676, in _next_data
index = self._next_index() # may raise StopIteration
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 623, in _next_index
return next(self._sampler_iter) # may raise StopIteration
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\main.py", line 85, in
main(args)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\main.py", line 42, in main
run_job(args, i)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\main.py", line 37, in run_job
server.train(args)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\servers\serveravg.py", line 35, in train
user.train(glob_iter, personalized=self.personalized) #* user.train_samples
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\users\useravg.py", line 23, in train
result =self.get_next_train_batch(count_labels=count_labels)
File "C:\kust\xuesu\code\FedGen-main\FedGen-main\FLAlgorithms\users\userbase.py", line 167, in get_next_train_batch
(X, y) = next(self.iter_trainloader)
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 633, in next
data = self._next_data()
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 676, in _next_data
index = self._next_index() # may raise StopIteration
File "C:\Users\Administrator\anaconda3\envs\FedGen\lib\site-packages\torch\utils\data\dataloader.py", line 623, in _next_index
return next(self._sampler_iter) # may raise StopIteration
StopIteration
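
For context, the double StopIteration in this traceback means get_next_train_batch restarted its iterator and the fresh iterator was also empty, which typically happens when a client's DataLoader yields zero batches (for instance, fewer samples than batch_size with drop_last=True). The usual cycling pattern looks like this sketch (not the repository's exact code):

from torch.utils.data import DataLoader

class UserSketch:
    def __init__(self, dataset, batch_size):
        self.trainloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
        self.iter_trainloader = iter(self.trainloader)

    def get_next_train_batch(self):
        try:
            return next(self.iter_trainloader)
        except StopIteration:
            # epoch exhausted: restart the iterator and try once more;
            # this still fails if the loader produces zero batches
            self.iter_trainloader = iter(self.trainloader)
            return next(self.iter_trainloader)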

Can't run EMNIST experiment

When I ran the EMNIST experiment after generating the EMnist dataset, I got:

(pt) wangshu@ubuntu:~/projects/FedGen$ CUDA_VISIBLE_DEVICES=3 python main.py --dataset EMnist-alpha0.1-ratio0.1 --algorithm FedGen --batch_size 32 --local_epochs 20 --num_users 10 --lamda 1 --model cnn --learning_rate 0.01 --personal_learning_rate 0.01 --num_glob_iters 200 --times 3 
================================================================================
Summary of training process:
Algorithm: FedGen
Batch size: 32
Learing rate       : 0.01
Ensemble learing rate       : 0.0001
Average Moving       : 1.0
Subset of users      : 10
Number of global rounds       : 200
Number of local rounds       : 20
Dataset       : EMnist-alpha0.1-ratio0.1
Local Model       : cnn
Device            : cpu
================================================================================


         [ Start training iteration 0 ]           


Creating model for emnist
Network configs: [6, 16, 'F']
Dataset emnist
/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='none' instead.
  warnings.warn(warning.format(ret))
Build layer 57 X 256
Build last layer 256 X 32
ensemble_lr: 0.0001
ensemble_batch_size: 128
unique_labels: 25
latent_layer_idx: -1
label embedding 0
ensemeble learning rate: 0.0001
ensemeble alpha = 1, beta = 0, eta = 1
generator alpha = 10, beta = 1
Number of Train/Test samples: 12480 8120
Data from 20 users in total.
Finished creating FedAvg server.


-------------Round number:  0  -------------


Traceback (most recent call last):
  File "/home/wangshu/projects/FedGen/main.py", line 85, in <module>
    main(args)
  File "/home/wangshu/projects/FedGen/main.py", line 42, in main
    run_job(args, i)
  File "/home/wangshu/projects/FedGen/main.py", line 37, in run_job
    server.train(args)
  File "/home/wangshu/projects/FedGen/FLAlgorithms/servers/serverpFedGen.py", line 78, in train
    self.evaluate()
  File "/home/wangshu/projects/FedGen/FLAlgorithms/servers/serverbase.py", line 226, in evaluate
    test_ids, test_samples, test_accs, test_losses = self.test(selected=selected)
  File "/home/wangshu/projects/FedGen/FLAlgorithms/servers/serverbase.py", line 165, in test
    ct, c_loss, ns = c.test()
  File "/home/wangshu/projects/FedGen/FLAlgorithms/users/userbase.py", line 137, in test
    loss += self.loss(output, y)
  File "/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/modules/loss.py", line 216, in forward
    return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
  File "/home/wangshu/miniconda3/envs/pt/lib/python3.9/site-packages/torch/nn/functional.py", line 2388, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 25 is out of bounds.
(pt) wangshu@ubuntu:~/projects/FedGen$ 

PyTorch 1.8.1, Python 3.9.4.
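
The printed unique_labels: 25 together with IndexError: Target 25 is out of bounds suggests the classifier has one output fewer than the number of label values (a target of 25 needs at least 26 outputs; EMNIST letters has 26 classes). A hypothetical diagnostic, assuming the model's forward pass returns plain logits:

import torch

def check_label_range(model, loader):
    # NLL/cross-entropy requires at least max_label + 1 output units
    max_label = max(int(y.max()) for _, y in loader)
    X, _ = next(iter(loader))
    num_outputs = model(X).shape[-1]
    assert max_label < num_outputs, (
        f"target {max_label} requires >= {max_label + 1} outputs, got {num_outputs}")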

Question: Broadcasting Updated Generative Model to Users After Training

Description

Hello,

I have been working with the FedGen implementation and have a question regarding the broadcasting of the updated generative model w to users after it has been trained on the server.

Context

In the FedGen class, the generative model w is trained using the train_generator method. However, I couldn't find the part of the code where the updated generative model parameters are broadcast to the users after each iteration.

I noticed that the send_parameters method broadcasts the global model parameters to users but does not broadcast the generative model parameters.

Code Snippets

def train(self, args):
    #### pretraining
    for glob_iter in range(self.num_glob_iters):
        print("\n\n-------------Round number: ",glob_iter, " -------------\n\n")
        self.selected_users, self.user_idxs=self.select_users(glob_iter, self.num_users, return_idx=True)
        if not self.local:
            self.send_parameters(mode=self.mode)  # broadcast averaged prediction model
        self.evaluate()
        chosen_verbose_user = np.random.randint(0, len(self.users))
        self.timestamp = time.time() # log user-training start time
        for user_id, user in zip(self.user_idxs, self.selected_users): # allow selected users to train
            verbose = user_id == chosen_verbose_user
            # perform regularization using generated samples after the first communication round
            user.train(
                glob_iter,
                personalized=self.personalized,
                early_stop=self.early_stop,
                verbose=verbose and glob_iter > 0,
                regularization= glob_iter > 0 )
        curr_timestamp = time.time() # log  user-training end time
        train_time = (curr_timestamp - self.timestamp) / len(self.selected_users)
        self.metrics['user_train_time'].append(train_time)
        if self.personalized:
            self.evaluate_personalized_model()

        self.timestamp = time.time() # log server-agg start time
        self.train_generator(
            self.batch_size,
            epoches=self.ensemble_epochs // self.n_teacher_iters,
            latent_layer_idx=self.latent_layer_idx,
            verbose=True
        )
        self.aggregate_parameters()
        curr_timestamp=time.time()  # log  server-agg end time
        agg_time = curr_timestamp - self.timestamp
        self.metrics['server_agg_time'].append(agg_time)
        if glob_iter  > 0 and glob_iter % 20 == 0 and self.latent_layer_idx == 0:
            self.visualize_images(self.generative_model, glob_iter, repeats=10)

    self.save_results(args)
    self.save_model()
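
For what it's worth, broadcasting the generator would presumably mirror what send_parameters does for the prediction model. A hypothetical method sketch for the server class, assuming each user object holds a generative_model attribute (that attribute name is an assumption, not taken from the repository):

def send_generator(self):
    # hypothetical: copy the server's generator weights to every selected user
    state = self.generative_model.state_dict()
    for user in self.selected_users:
        user.generative_model.load_state_dict(state)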

Training with CIFAR-10

Thank you for the great work.

Also, has anyone tried training with CIFAR-10? I followed the setup for Mnist: replaced the Mnist data loader with CIFAR-10, changed the input dimension from 1 to 3, and kept the same models. However, the result is not good (about 31%) with FedAvg.

Is there any special setting when experimenting with a new dataset? Thank you!


Network configs: [6, 16, 'F']

Hi, I'm unable to run any of the files.
This is what was churned out. What does Network configs: [6, 16, 'F'] mean?
python main.py --dataset Mnist-alpha0.1-ratio0.5 --algorithm FedDistll-FL --batch_size 32 --num_glob_iters 200 --local_epochs 20 --num_users 10 --lamda 1 --learning_rate 0.01 --model cnn --personal_learning_rate 0.01 --times 3

Summary of training process:
Algorithm: FedDistll-FL
Batch size: 32
Learing rate : 0.01
Ensemble learing rate : 0.0001
Average Moving : 1.0
Subset of users : 10
Number of global rounds : 200
Number of local rounds : 20
Dataset : Mnist-alpha0.1-ratio0.5
Local Model : cnn
Device : cpu

     [ Start training iteration 0 ]

Creating model for mnist
Network configs: [6, 16, 'F']
Algorithm FedDistll-FL has not been implemented.
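
(Note: the command above spells the algorithm name FedDistll-FL, whereas the run commands earlier in this document use FedDistill-FL, so the "has not been implemented" error is likely caused by the missing "i".)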

Question about main_plot.py

Hello, sorry, I have a problem with main_plot.py.

The problem:
FileNotFoundError: [Errno 2] No such file or directory: 'figs\Mnist/ratio0.5\Mnist-ratio0.5.png'

I hope you can take a look when you have time; I have only just started in this direction. Thank you!
