talwalkarlab / leaf
Leaf: A Benchmark for Federated Settings
License: BSD 2-Clause "Simplified" License
I tried your script with the small FEMNIST data:
python3 main.py -dataset femnist -model cnn
the average accuracy is less than 0.1 after 400 rounds.
I also ran the SGD algorithm:
python3 main.py -dataset femnist -model cnn --minibatch 1.0
the accuracy is still less than 0.1.
Any thoughts?
thanks!
hi,
Thanks for publishing this lib. AFAIK, this library supports either the original non-i.i.d. data partitions or i.i.d. equal-sized partitions. Have you considered supporting a third option, where the size of each partition is maintained but the data is re-sampled i.i.d. from the dataset? (Useful for some experiments; it should be part of the benchmark IMO.)
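The third option described above can be sketched as follows. This is a minimal illustration, not part of LEAF's API: it pools all samples, shuffles them, and re-splits using the original partition sizes, which is equivalent to an i.i.d. re-draw that preserves each client's sample count.

```python
import random

def reshuffle_partitions(partitions, seed=0):
    # Keep each client's partition size, but redistribute the pooled
    # samples uniformly at random: an i.i.d. re-partition that
    # preserves the original size distribution.
    pool = [x for part in partitions for x in part]
    random.Random(seed).shuffle(pool)
    out, i = [], 0
    for part in partitions:
        out.append(pool[i:i + len(part)])
        i += len(part)
    return out

clients = [[1, 2, 3], [4, 5]]
new = reshuffle_partitions(clients)
assert [len(p) for p in new] == [3, 2]                    # sizes preserved
assert sorted(x for p in new for x in p) == [1, 2, 3, 4, 5]  # same pooled data
```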
I have Ubuntu 16.04 with two GPUs, and I did these steps, but it fails to run paper_experiments
but after some rounds with zero
Failed to import the site module
Traceback (most recent call last):
  File "/lib/python3.5/site.py", line 703, in <module>
    main()
  File "/lib/python3.5/site.py", line 670, in main
    virtual_install_main_packages()
  File "/lib/python3.5/site.py", line 553, in virtual_install_main_packages
    f = open(os.path.join(os.path.dirname(__file__), 'orig-prefix.txt'))
mv: cannot stat 'sys_metrics.csv': No such file or directory
mv: cannot stat 'stat_metrics.csv': No such file or directory
I have a question that should I add activation function, such as relu, between the lstm output and last dense layer?
leaf/models/sent140/stacked_lstm.py
Line 37 in 70b85d9
fc1 = tf.layers.dense(inputs=outputs[:, -1, :], units=128, activation=tf.nn.relu)
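For reference, tf.layers.dense(..., activation=tf.nn.relu) already applies ReLU after the affine transform, so the snippet above already has an activation between the LSTM output and the next layer. A NumPy sketch of that equivalence (the dense helper here is illustrative, not TensorFlow's):

```python
import numpy as np

def dense(x, W, b, activation=None):
    # Mirrors the behavior of tf.layers.dense: affine transform,
    # then an optional activation applied to the result.
    out = x @ W + b
    return activation(out) if activation else out

relu = lambda z: np.maximum(z, 0.0)

x = np.array([[1.0, -2.0]])
W = np.array([[1.0], [1.0]])
b = np.array([0.5])

# dense(..., activation=relu) equals relu applied to the affine output
assert np.allclose(dense(x, W, b, relu), relu(dense(x, W, b)))
```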
Hi Can you pinpoint the implementation details in the code related to differential privacy?
I am downloading the FEMNIST dataset from s3.amazonaws.com, but the speed is too slow (4.07 KB/s, 984 MB total). Are there other ways to download the data files, such as by_class.zip?
Since Mac doesn't come preloaded with wget, I had issues running the preprocessing script for FEMNIST. Installing wget with Homebrew didn't seem to work on its own, as the data was still not downloading; I ended up running get_data.sh
separately, and then starting preprocess.sh
afterward. Figured this might be useful for anyone else having the same problem.
Some TensorFlow functions are no longer working, e.g. set_random_seed. These should either be updated to the new API, or the required TensorFlow version should be pinned in requirements.txt.
In the newest versions of Pillow, Image.ANTIALIAS has been deprecated.
It has been renamed to LANCZOS, which behaves the same as before.
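A small compatibility shim (a sketch, assuming Pillow >= 9.1 provides Image.Resampling) lets resize code run on both old and new Pillow versions:

```python
from PIL import Image

# Compatibility shim: newer Pillow removed Image.ANTIALIAS; the same
# filter is available as Image.Resampling.LANCZOS (Pillow >= 9.1).
try:
    RESAMPLE = Image.Resampling.LANCZOS
except AttributeError:
    RESAMPLE = Image.ANTIALIAS  # older Pillow

img = Image.new("L", (28, 28))         # blank grayscale image
small = img.resize((14, 14), RESAMPLE)  # downscale with the LANCZOS filter
```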
Traceback (most recent call last):
File "/home/hiccup/Desktop/leaf-master/data/utils/split_data.py", line 126, in
file_dir = os.path.join(subdir, files[0])
IndexError: list index out of range
Line 126 in b71d271
Names should be 'batch-size' and 'num-epochs'.
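Assuming the flags are defined with argparse (the defaults below are illustrative, not LEAF's), hyphenated flag names are exposed as underscored attributes, which is a common source of this kind of docs mismatch:

```python
import argparse

# Hyphenated command-line flags become underscored attributes on the
# parsed namespace: --batch-size -> args.batch_size.
parser = argparse.ArgumentParser()
parser.add_argument("--batch-size", type=int, default=10)
parser.add_argument("--num-epochs", type=int, default=1)

args = parser.parse_args(["--batch-size", "32", "--num-epochs", "5"])
assert args.batch_size == 32 and args.num_epochs == 5
```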
Hi all!
I have updated my old FL simulation repo. (I made it for personal research purpose, FYI):
(https://github.com/vaseline555/Federated-Learning-PyTorch)
It now supports torchvision.datasets, torchtext.datasets, and the LEAF benchmark (plus datasets from the transformers library), as well as the FedAvg, FedSGD, and FedProx algorithms. If you have interest, please visit my repository.
Thank you, and any feedback or PRs are welcome.
Hi guys, I want to use federated learning for anomaly detection and attack classification. Which model do you think best matches this problem, and how can I use my own dataset with these models? Also, my dataset does not originally contain client IDs; may I just use each client's number as its ID?
@scaldas Hi!
I have a question about the model for CelebA dataset.
In the paper, it said that
For the CelebA experiments, we use 10% of the total clients and the same model we described above for FEMNIST.
...
For the FEMNIST experiments in the same figure, we subsample 5% of the data, and use a model with two convolutional layers followed by pooling, and a final dense layer with 2048 units.
However, the code contains 4 convolutional layers, etc.
What is the correct version?
Thanks.
Hi, I hope you are fine, and thank you for sharing the reference implementations. I am trying to reproduce the plots mentioned in the paper for the sent140
dataset. The steps I have followed are as follows:
The above command generated the following plot:
Can you please help out what can be the issue? Is there anything I am missing?
Thank you
Hi - thanks for the great library.
E-MNIST can be split into various kinds, depending on whether class-balance is needed or not.
The "balanced" dataset header in this link shows that it is derived from the ByMerge
split and has 47 classes, unlike the 62 of the unbalanced splits.
Does each user in the FEMNIST dataset have balance across classes? Directly using your code, I got an unbalanced dataset per user, with many classes having as few as 1 image within that task.
Is there a way to generate class-balanced user splits from the code?
Thank you
Hello, thank you very much for this informative project.
I would like to ask how the FedProx algorithm can be added to this code.
I mainly have some problems with entering the global weights (in the server) into the model in order to add them to the sum.
When I download the FEMNIST data, I want to partition users into train/test groups, so I run sh preprocess.sh -s niid --sf 1.0 -k 0 -t user.
But I get the same result as sh preprocess.sh -s niid --sf 1.0 -k 0 -t sample.
Why? Thank you!
Using the splitting method provided in the paper_experiment, I found that the testing data appears verbatim in the training data, resulting in 100% testing accuracy if you use SGD + 2-layer LSTM to train it.
For example, in the training set, 'THE_FIRST_PART_OF_HENRY_THE_SIXTH_MORTIMER''s words are:
[..., g age, Let dying Mortimer here rest himself. Even like a man new haled from the ',
' age, Let dying Mortimer here rest himself. Even like a man new haled from the r',
'age, Let dying Mortimer here rest himself. Even like a man new haled from the ra',
'ge, Let dying Mortimer here rest himself. Even like a man new haled from the rac',
'e, Let dying Mortimer here rest himself. Even like a man new haled from the rack',
', Let dying Mortimer here rest himself. Even like a man new haled from the rack,',
'Let dying Mortimer here rest himself. Even like a man new haled from the rack, S',
'et dying Mortimer here rest himself. Even like a man new haled from the rack, So',
't dying Mortimer here rest himself. Even like a man new haled from the rack, So ',
' dying Mortimer here rest himself. Even like a man new haled from the rack, So f',
'dying Mortimer here rest himself. Even like a man new haled from the rack, So fa',
'ying Mortimer here rest himself. Even like a man new haled from the rack, So far',
'ing Mortimer here rest himself. Even like a man new haled from the rack, So fare',
'ng Mortimer here rest himself. Even like a man new haled from the rack, So fare ',
'g Mortimer here rest himself. Even like a man new haled from the rack, So fare m',...]
and in the testing set, you can find the exact sentence:
[' Let dying Mortimer here rest himself. Even like a man new haled from the rack, ']
My model reaches a testing accuracy of about 1.0 after about 35 epochs.
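The overlap follows from how the samples are built: consecutive sliding windows over the same play share all but one character, so a random sample-level split almost guarantees near-duplicates across train and test. A toy illustration (window length shortened for readability):

```python
text = "Let dying Mortimer here rest himself."
SEQ_LEN = 10

# Next-character samples as overlapping sliding windows, one per start index.
windows = [text[i:i + SEQ_LEN] for i in range(len(text) - SEQ_LEN)]

# A sample-level split puts near-identical strings on both sides.
train, test = windows[::2], windows[1::2]

# Each test window shares SEQ_LEN - 1 characters with its train neighbor.
assert test[0][:-1] == train[0][1:]
```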
The documentation states that the plots shown below can be generated using the plots.py file in the repo root, but I don't appear to be able to locate this file.
Also, the docs refer to the statistics files leaf/models/metrics/stat_metrics.csv and leaf/models/metrics/sys_metrics.csv, but the generated files appear to be called metrics_stat.csv and metrics_sys.csv.
Can you clarify the correct file names, please?
I use the following script to generate the shakespeare data.
./preprocess.sh -s niid --sf 1.0 -k 0 -t sample -tf 0.8
The statistics are:
###################################
DATASET: shakespeare
557 users
2177224 samples (total)
3908.84 samples per user (mean)
num_samples (std): 7226.23
num_samples (std/mean): 1.85
num_samples (skewness): 4.38
num_samples  num_users
0 336
2000 77
4000 43
6000 17
8000 24
10000 13
12000 14
14000 5
16000 5
18000 2
But the paper says that Shakespeare has 2,288 users.
Since I am rushing a paper based on the LEAF dataset, could you help fix this problem? Thanks!
The Reddit subsampled dataset is located at https://drive.google.com/file/d/1PwBpAEMYKNpnv64cQ2TIQfSc_vPbq3OQ/view?usp=sharing. However, on downloading this zip, we find that the train, validation, and test sets are exactly identical. Is there any plan to address this? Also, are the results presented in the LEAF paper based on this set of subsampled data? @scaldas
Hi,
Thank you for the benchmark. I am seeing some strange behavior. I am trying to run the experiment using the synthetic
dataset. I am observing that a simple network using tf.layers.dense does not give accurate predictions, but when I use the following code in place of tf.layers.dense, I get values closer to the ground truth:
Original logits calculation:
logits = tf.layers.dense(features, self.num_classes, tf.nn.relu)
Modified logits calculation:
init_value = tf.constant(value=-0.0826, shape=[self.input_dim, self.num_classes])
weights = tf.Variable(initial_value=init_value,shape=[self.input_dim, self.num_classes], name="Weights")
init_value = tf.constant(value=-0.0826, shape=[self.num_classes])
biases = tf.Variable(initial_value=init_value, shape=[self.num_classes], name="Biases" )
logits = tf.nn.relu(tf.matmul(features, weights, name="MatMulLogits") + biases)
Input: [-0.9564917406583653, 0.6703038000763276, -0.8226291466995398, -1.0770311337470495, -0.785290071358798, 0.3809045777819794, -1.3688283052201289, -0.653962343061565, -0.7081657613676491, 1.3065677857572335]
LEAF output using tf.layers.dense: [[0.24843943, 0.25128123, 0.24843943, 0.24843943, 0.24843943]]
LEAF output using tf.nn.relu(tf.matmul(features, weights, name="MatMulLogits") + biases): [[0.2490078 0.2490078 0.2490078 0.2490078 0.2490078]]
Actual Values i.e., ground truth multiplication value (input * weights + bias): [[0.249007804 0.249007804 0.249007804 0.249007804 0.249007804]]
As you can see, tf.matmul produces values closer to the ground truth, whereas tf.layers.dense deviates from the actual values from the third decimal place. Can you please advise what the issue is and how we can get logits closer to the ground truth?
Thank you
Best Regards,
Line 40 in client.py will always be 2, as it counts the number of elements in a dictionary and not the number of samples.
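A client's dataset is a dict with keys 'x' and 'y', so len() on it counts keys, not samples; the count should come from one of the value lists. A minimal illustration (the dict shape mirrors LEAF's JSON format):

```python
# A LEAF client's dataset is a dict like {'x': [...], 'y': [...]}.
data = {"x": [[0.1], [0.2], [0.3]], "y": [0, 1, 0]}

assert len(data) == 2        # counts the dict's two keys, not the samples
assert len(data["y"]) == 3   # the actual number of samples
```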
I ran the command bash preprocess.sh -s niid --sf 1.0 -k 0 -t sample -tf 0.8 under the shakespeare dataset and visualized the number of users in the training data set; the number I get is 1129, which does not match the statistics in the README file (2288 users).
When I run python main.py -dataset femnist -model cnn -lr 0.06 --minibatch 0.1 --clients-per-round 3 --num-rounds 20
======================End of Report==========================
Clients in Total: 0
--- Random Initialization ---
--- Round 1 of 20: Training 3 Clients ---
Traceback (most recent call last):
File "main.py", line 186, in
main()
File "main.py", line 87, in main
server.update_model()
File "/mnt/c/ul/ai/Thesis/DataSet/FEMNIST/leaf-master/leaf-master/models/server.py", line 72, in update_model
base = [0] * len(self.updates[0][1])
IndexError: list index out of range
Any ideas welcome.
Is there a PyTorch version of leaf?
When preprocessing sent140, the intermediate .csv
file saved by combine_data.py
will have blank lines, causing data_to_json.py
to fail to run.
In addition, an error in the encoding format will also be reported.
It is suggested to change line 27
of combine_data.py
into the following form:
with open(out_file_name, 'w', encoding='ISO-8859-1', newline='') as f:
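For context, the blank lines come from newline translation interacting with the csv module's own '\r\n' line terminator; opening the file with newline='' (as suggested above) sidesteps it. A small demonstration using an in-memory buffer:

```python
import csv
import io

rows = [["user1", "hello"], ["user2", "world"]]

# With newline='' there is no newline translation, so csv.writer's own
# '\r\n' terminator is written exactly once per row and no blank lines
# appear between records.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)

assert buf.getvalue().split("\r\n")[:-1] == ["user1,hello", "user2,world"]
```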
I was going to preprocess the FEMNIST dataset following the instructions in leaf/data/femnist/README.md
. The command that I used is
./preprocess.sh -s niid --sf 1.0 -k 0 -t sample
However, I only got four files generated in the folder leaf/data/femnist/data/all_data
. When I inspected the printed log, I found something abnormal:
...
converting data to .json format
writing all_data_0.json
writing all_data_1.json
writing all_data_2.json
writing all_data_3.json
./data_to_json.sh: line 56: 2906 Killed python3 data_to_json.py
finished converting data to .json format
------------------------------
sampling data
...
The memory seemed to be exhausted, which caused the process running python3 data_to_json.py
to terminate early.
To resolve the problem, the code should be modified to limit its memory usage.
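One way to cap the memory footprint (a sketch under assumed structure, not LEAF's actual data_to_json.py) is to stream users to disk in fixed-size chunks instead of accumulating the whole dataset before serializing:

```python
import json
import os
import tempfile

def write_in_chunks(users, out_dir, max_users_per_file=100):
    # Flush every `max_users_per_file` users to a separate JSON file, so
    # the full dataset never has to sit in memory at once.
    chunk, idx = {}, 0
    for name, data in users:
        chunk[name] = data
        if len(chunk) == max_users_per_file:
            with open(os.path.join(out_dir, f"all_data_{idx}.json"), "w") as f:
                json.dump({"user_data": chunk}, f)
            chunk, idx = {}, idx + 1
    if chunk:  # write the final partial chunk
        with open(os.path.join(out_dir, f"all_data_{idx}.json"), "w") as f:
            json.dump({"user_data": chunk}, f)

out = tempfile.mkdtemp()
# 250 toy users with max 100 per file -> 3 output files (100 + 100 + 50)
write_in_chunks(((f"u{i}", [i]) for i in range(250)), out, max_users_per_file=100)
assert len(os.listdir(out)) == 3
```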
On the site https://leaf.cmu.edu/, FEMNIST is said to have 3,550 users and 805,263 samples. However, I ran the provided command
./preprocess.sh -s niid --sf 1.0 -k 0 -t sample
to get the full-sized dataset, and then ran ./stats.sh to get the statistics. The outputs are as follows:
0 1
20 4
40 11
60 5
80 16
100 66
120 125
140 394
160 1241
180 329
200 47
220 62
240 95
260 107
280 125
300 167
320 168
340 185
360 172
380 149
400 87
420 36
440 3
460 1
480 0
Summing up the number of clients, we get 3,597 rather than 3,550. Actually, I've also counted the total number of samples in train/ and test/, and got 817,851 rather than 805,263.
Is there anything wrong with my command for data processing, which leads to such inconsistency?
Clients in Total: 5
--- Random Initialization ---
train_accuracy: 0.00500833, 10th percentile: 0, 50th percentile: 0.00304878, 90th percentile 0.0118636
train_loss: 4.13098, 10th percentile: 4.12663, 50th percentile: 4.13026, 90th percentile 4.13544
test_accuracy: 0.0148149, 10th percentile: 0, 50th percentile: 0, 90th percentile 0.0368978
test_loss: 4.12193, 10th percentile: 4.10857, 50th percentile: 4.12467, 90th percentile 4.13296
--- Round 1 of 2000: Training 3 Clients ---
Traceback (most recent call last):
File "main.py", line 186, in
main()
File "main.py", line 83, in main
sys_metrics = server.train_model(num_epochs=args.num_epochs, batch_size=args.batch_size, minibatch=args.minibatch)
File "/hy-tmp/leaf/models/server.py", line 60, in train_model
comp, num_samples, update = c.train(num_epochs, batch_size, minibatch)
File "/hy-tmp/leaf/models/client.py", line 39, in train
comp, update = self.model.train(data, num_epochs, num_data)
File "/hy-tmp/leaf/models/model.py", line 88, in train
self.run_epoch(data, batch_size)
File "/hy-tmp/leaf/models/model.py", line 96, in run_epoch
for batched_x, batched_y in batch_data(data, batch_size, seed=self.seed):
File "/hy-tmp/leaf/models/utils/model_utils.py", line 18, in batch_data
np.random.shuffle(data_x)
File "mtrand.pyx", line 4865, in mtrand.RandomState.shuffle
File "mtrand.pyx", line 4868, in mtrand.RandomState.shuffle
TypeError: 'tuple' object does not support item assignment
Solution: in File "/hy-tmp/leaf/models/utils/model_utils.py", line 18, in batch_data,
change np.random.shuffle(data_x) -> np.random.shuffle(list(data_x))
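Note that np.random.shuffle(list(data_x)) shuffles only a temporary copy, and x and y must be permuted together to keep the pairs aligned. A safer sketch (an assumed drop-in for batch_data's shuffle, not the repo's actual fix):

```python
import numpy as np

def shuffled_pairs(data_x, data_y, seed=0):
    # Shuffle features and labels with one shared permutation so (x, y)
    # pairs stay aligned; indexing instead of in-place mutation also
    # works when the inputs are tuples.
    rng = np.random.RandomState(seed)
    order = rng.permutation(len(data_x))
    return [data_x[i] for i in order], [data_y[i] for i in order]

xs, ys = shuffled_pairs(("a", "b", "c", "d"), (1, 2, 3, 4))
assert sorted(xs) == ["a", "b", "c", "d"]                 # same elements
pairing = {"a": 1, "b": 2, "c": 3, "d": 4}
assert all(ys[i] == pairing[xs[i]] for i in range(4))      # pairs intact
```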