While reading the paper and the relevant code in this repo, I came up with a question about the setup of the Tiny-ImageNet experiments:
I went through your paper and did not find a description of this point. I printed the model architecture used for the Tiny-ImageNet experiments and found that each classification head has 200 outputs. Based on my understanding, shouldn't there be 40 classifiers, each with output=5?
SubNet(
  (drop1): Dropout(p=0.0, inplace=False)
  (drop2): Dropout(p=0.0, inplace=False)
  (conv1): SubnetConv2d(3, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (conv2): SubnetConv2d(160, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (conv3): SubnetConv2d(160, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (conv4): SubnetConv2d(160, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (linear1): SubnetLinear(in_features=2560, out_features=640, bias=False)
  (linear2): SubnetLinear(in_features=640, out_features=640, bias=False)
  (last): ModuleList(
    (0): Linear(in_features=640, out_features=200, bias=False)
    (1): Linear(in_features=640, out_features=200, bias=False)
    (2): Linear(in_features=640, out_features=200, bias=False)
    (3): Linear(in_features=640, out_features=200, bias=False)
    (4): Linear(in_features=640, out_features=200, bias=False)
    (5): Linear(in_features=640, out_features=200, bias=False)
    (6): Linear(in_features=640, out_features=200, bias=False)
    (7): Linear(in_features=640, out_features=200, bias=False)
    (8): Linear(in_features=640, out_features=200, bias=False)
    (9): Linear(in_features=640, out_features=200, bias=False)
    (10): Linear(in_features=640, out_features=200, bias=False)
    (11): Linear(in_features=640, out_features=200, bias=False)
    (12): Linear(in_features=640, out_features=200, bias=False)
    (13): Linear(in_features=640, out_features=200, bias=False)
    (14): Linear(in_features=640, out_features=200, bias=False)
    (15): Linear(in_features=640, out_features=200, bias=False)
    (16): Linear(in_features=640, out_features=200, bias=False)
    (17): Linear(in_features=640, out_features=200, bias=False)
    (18): Linear(in_features=640, out_features=200, bias=False)
    (19): Linear(in_features=640, out_features=200, bias=False)
    (20): Linear(in_features=640, out_features=200, bias=False)
    (21): Linear(in_features=640, out_features=200, bias=False)
    (22): Linear(in_features=640, out_features=200, bias=False)
    (23): Linear(in_features=640, out_features=200, bias=False)
    (24): Linear(in_features=640, out_features=200, bias=False)
    (25): Linear(in_features=640, out_features=200, bias=False)
    (26): Linear(in_features=640, out_features=200, bias=False)
    (27): Linear(in_features=640, out_features=200, bias=False)
    (28): Linear(in_features=640, out_features=200, bias=False)
    (29): Linear(in_features=640, out_features=200, bias=False)
    (30): Linear(in_features=640, out_features=200, bias=False)
    (31): Linear(in_features=640, out_features=200, bias=False)
    (32): Linear(in_features=640, out_features=200, bias=False)
    (33): Linear(in_features=640, out_features=200, bias=False)
    (34): Linear(in_features=640, out_features=200, bias=False)
    (35): Linear(in_features=640, out_features=200, bias=False)
    (36): Linear(in_features=640, out_features=200, bias=False)
    (37): Linear(in_features=640, out_features=200, bias=False)
    (38): Linear(in_features=640, out_features=200, bias=False)
    (39): Linear(in_features=640, out_features=200, bias=False)
  )
)
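To make the question concrete, here is a minimal sketch of the two head layouts I am comparing. It is not taken from the repo: the function names are hypothetical, and the contiguous split of the 200 classes into 40 tasks of 5 is my own assumption about how the tasks might be formed.

```python
# Hypothetical sketch, not code from the repo: compares the printed head
# layout (40 heads x 200 outputs) with the one I expected (40 heads x 5).
NUM_TASKS = 40      # number of heads in the printed ModuleList
NUM_CLASSES = 200   # out_features of each printed head
FEATURES = 640      # in_features of each printed head

CLASSES_PER_TASK = NUM_CLASSES // NUM_TASKS  # 5


def head_params(out_features, in_features=FEATURES, num_heads=NUM_TASKS):
    """Total weight count over all heads (bias=False, as in the printout)."""
    return num_heads * in_features * out_features


def task_class_range(task_id, classes_per_task=CLASSES_PER_TASK):
    """Global class indices of one task, ASSUMING a contiguous, in-order split."""
    start = task_id * classes_per_task
    return range(start, start + classes_per_task)


def to_local_label(global_label, classes_per_task=CLASSES_PER_TASK):
    """Map a global label (0..199) to (task_id, label within that task)."""
    return divmod(global_label, classes_per_task)


printed = head_params(NUM_CLASSES)        # 40 x Linear(640, 200)
expected = head_params(CLASSES_PER_TASK)  # 40 x Linear(640, 5)
print(printed, expected)
print(list(task_class_range(3)), to_local_label(17))
```

If the labels are kept as global indices (0 to 199), a 200-output head would work without remapping them, while a 5-output head would need something like `to_local_label`. I could not tell from the code which of these is intended.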
Sorry if I have missed something obvious; I am new to the field of Continual Learning. I look forward to your reply.