xtra-computing / niid-bench
Federated Learning Benchmark - Federated Learning on Non-IID Data Silos: An Experimental Study (ICDE 2022)
License: MIT License
Hello, thank you for your nice code.
Maybe this should be "default='mlp'" instead of "default='MLP'".
Line 27 in b56de2a
Hi! I've read your paper 'Federated Learning on Non-IID Data Silos: An Experimental Study' and 'Model-Contrastive Federated Learning'. I think both works are very interesting!
However, I notice that the performance of Scaffold is unstable and even extremely poor (e.g. nan) in some cases with your implementation. In other works such as FedDyn (ICLR 2021) and FedDC (CVPR 2022), Scaffold is reliably stable and outperforms many baselines across different datasets and data heterogeneity settings. I therefore read your source code carefully and found that the control variate is not initialized as described in Sec. 4 of 'SCAFFOLD: Stochastic Controlled Averaging for Federated Learning'. A randomly initialized control variate is meaningless and can mislead local training, especially in the early rounds, so I wonder whether this accounts for the instability and poor performance of Scaffold.
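To make the point about initialization concrete, here is a minimal pure-Python sketch, not the repository's code. It assumes flat parameter lists and the zero initialization that the SCAFFOLD paper allows; `init_control_variates` and `corrected_gradient` are hypothetical helper names. With zeroed control variates the correction term is zero, so the first local steps reduce to plain SGD steps.

```python
def init_control_variates(n_params, n_clients):
    """Return zeroed server and client control variates (hypothetical helper)."""
    c_global = [0.0] * n_params
    c_locals = [[0.0] * n_params for _ in range(n_clients)]
    return c_global, c_locals


def corrected_gradient(grad, c_global, c_local):
    """SCAFFOLD local direction: g - c_i + c, applied element-wise."""
    return [g - ci + c for g, ci, c in zip(grad, c_local, c_global)]


c_global, c_locals = init_control_variates(n_params=3, n_clients=4)
grad = [0.5, -1.0, 2.0]

# With zero control variates the correction vanishes, so the corrected
# direction equals the raw gradient in the first round.
first_round = corrected_gradient(grad, c_global, c_locals[0])
```

A random initialization would instead add an arbitrary offset `c - c_i` to every local step before any gradient information has been collected.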
Hi, when I run the following command, I get the error shown after it:
python experiments.py --model=simple-cnn --dataset=cifar10 --alg=all_in --lr=0.01 --batch-size=64 --epochs=10 --n_parties=10 --rho=0.9 --comm_round=50 --partition=noniid-labeldir --beta=0.5 --device=cuda:0 --datadir=./data/ --logdir=./logs/ --noise=0 --init_seed=0
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
Maybe this line of code should be preceded by "nets[0].to(device)"
Line 1218 in 885648a
Thanks for your work. I think there is probably a small mistake in line 673 of experiments.py. According to the pseudocode of SCAFFOLD, the denominator here should be the number of all clients instead of the number of clients selected per round. Please correct me if I'm wrong. Thank you!
Line 725 in e22b421
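The denominator in question can be illustrated with a short sketch of the server-side control variate update from the SCAFFOLD pseudocode, c ← c + (1/N) Σ_{i∈S} Δc_i, where N is the total number of clients and S is the set sampled in the round. This is an illustration with flat Python lists, not the code in experiments.py, and `update_global_control` is a hypothetical name.

```python
def update_global_control(c_global, delta_cs, n_total_clients):
    """Add the summed client deltas, divided by the TOTAL number of clients.

    delta_cs holds one delta per *selected* client; the divisor is still
    n_total_clients, not len(delta_cs).
    """
    return [
        c + sum(delta[k] for delta in delta_cs) / n_total_clients
        for k, c in enumerate(c_global)
    ]


# Two of ten clients were selected in this round.
c_global = [0.0, 0.0]
delta_cs = [[1.0, 2.0], [3.0, 4.0]]
new_c = update_global_control(c_global, delta_cs, n_total_clients=10)
```

Dividing by the number of selected clients instead would make the update five times larger in this example. The global model average in the same pseudocode does divide by the number of selected clients, which may be why the two are easy to mix up.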
Number of local samples
Although the batch size is the same for every dataloader, the number of local samples is not always divisible by the batch size, so the last batch of an epoch can be smaller than the others.
I think this makes the implementation slightly different from the pseudocode.
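As a rough illustration of the mismatch, the sketch below counts local steps and the size of the final batch. It assumes a loader that keeps the last partial batch unless `drop_last` is set, which is how PyTorch's `DataLoader` behaves by default; `local_steps` and `last_batch_size` are hypothetical helpers, not functions from the repository.

```python
import math


def local_steps(n_samples, batch_size, epochs, drop_last=False):
    """Number of optimizer steps a client takes over all local epochs."""
    if drop_last:
        per_epoch = n_samples // batch_size
    else:
        per_epoch = math.ceil(n_samples / batch_size)
    return per_epoch * epochs


def last_batch_size(n_samples, batch_size):
    """Size of the final batch in an epoch when partial batches are kept."""
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size


steps_kept = local_steps(100, 64, epochs=10)                     # partial batch kept
steps_dropped = local_steps(100, 64, epochs=10, drop_last=True)  # partial batch dropped
tail = last_batch_size(100, 64)
```

With 100 samples and a batch size of 64, each epoch has one full batch and one batch of 36 samples, so counting steps and counting samples give different weights per client.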
Thanks for your contribution, it is useful to me
When I change the mlp model to a bigger one, I find that Scaffold's loss becomes 'nan' after several rounds. Do you have any idea what causes this?
According to the readme file, there are different types of noise, and selecting 'level' as the 'noise_type' should add Gaussian noise to all pixels of the images. However, in the class AddGaussianNoise, the list named 'filt' selects only part of the pixels to receive Gaussian noise, which seems odd to us. We would therefore like to ask whether a condition on 'noise_type' should be added before 'filt' is applied (e.g. when noise_type == 'level', 'filt' is not applied).
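The conditional proposed above could look roughly like the following pure-Python sketch. It is not the repository's AddGaussianNoise class: images are plain nested lists, `mask` stands in for 'filt', and `add_gaussian_noise` with its parameters is a hypothetical name chosen for illustration.

```python
import random


def add_gaussian_noise(image, std, noise_type, mask=None, seed=None):
    """Add zero-mean Gaussian noise to a 2D image given as nested lists.

    When noise_type == 'level', every pixel receives noise and the mask is
    ignored. For any other noise_type, only pixels where mask is truthy
    receive noise.
    """
    rng = random.Random(seed)
    noisy = []
    for r, row in enumerate(image):
        new_row = []
        for c, pixel in enumerate(row):
            use_pixel = noise_type == "level" or mask is None or bool(mask[r][c])
            noise = rng.gauss(0.0, std) if use_pixel else 0.0
            new_row.append(pixel + noise)
        noisy.append(new_row)
    return noisy


image = [[0.0, 0.0], [0.0, 0.0]]
mask = [[1, 0], [0, 0]]

everywhere = add_gaussian_noise(image, std=1.0, noise_type="level", mask=mask, seed=0)
masked = add_gaussian_noise(image, std=1.0, noise_type="other", mask=mask, seed=0)
```

In this sketch the mask only restricts the noise when the noise type is something other than 'level', which is the behaviour the question asks about.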
The test_ds_local is loaded by test_ds = dl_obj(datadir+'./val/', transform=transform_test).
Therefore, test_ds is the same for all clients.
Line 771 in 692569f
Line 860 in 692569f
The femnist dataset seems to have only 10 classes?
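FedNova is four thousandths lower than FedAvg, FedAvg is four thousandths lower than FedProx, and SCAFFOLD is far higher than the other methods. In the figure, however, FedNova is relatively low while the curves of the other three methods are very close to each other, which seems inconsistent with the data.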
Hi, your implementation is good. The FedAvg algorithm in the original paper https://arxiv.org/pdf/2007.07481.pdf includes the learning rate
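Hello, thank you very much for providing the code. I have a question: when I partition the data with noniid-#label2, the result is different every time the program runs, so I cannot reproduce the data partition of a previous experiment. I hope you can help. Many thanks.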
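One common cause of this kind of irreproducibility is a partition that draws from an unseeded random generator. The sketch below shows the general remedy under that assumption; it is not the repository's partition code, and `assign_labels` is a hypothetical helper that gives each party a fixed number of labels from a privately seeded generator.

```python
import random


def assign_labels(n_parties, n_classes, labels_per_party, seed):
    """Pick labels_per_party distinct labels for each party, reproducibly."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    return [
        sorted(rng.sample(range(n_classes), labels_per_party))
        for _ in range(n_parties)
    ]


first_run = assign_labels(n_parties=10, n_classes=10, labels_per_party=2, seed=0)
second_run = assign_labels(n_parties=10, n_classes=10, labels_per_party=2, seed=0)
other_seed = assign_labels(n_parties=10, n_classes=10, labels_per_party=2, seed=1)
```

Using a dedicated generator keeps the partition stable even if other code consumes random numbers before the partition is built.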
The feature imbalances discussed in Readme are:
There are a few issues with it.
It would be great if these algorithms were implemented, or if we could at least get pseudocode showing how to implement them. This would be very useful for literature surveys and future work on non-IID data distributions, since few works actually discuss feature skew.
Hi, I want to know where I can find "-load_path".
Thanks
Hello,
There is a missing comma in line
Line 202 in 42e6035