Comments (7)
@qingzhouzhen Sorry for the slow response; I didn't notice the issues on my repo. I don't have the ImageNet dataset right now, but I trained it on MNIST and CIFAR, and the accuracy increases normally. I notice that with the Xavier initializer it cannot converge on CIFAR; it does converge with the uniform initializer.
You can verify it on cifar now by
git clone https://github.com/ZiyueHuang/MXShuffleNet.git
cd MXShuffleNet/image-classification
python train_cifar10.py --network shufflenet --batch-size 128 --gpus 0 --lr 0.01
Or on mnist by
python test_mnist.py
I am not sure this implementation is totally correct.
Could you please train it on imagenet for a few epochs again? If it cannot converge, please let me know.
I would really appreciate your response.
Here is the commit that adds shufflenet: fdb1e77
from mxshufflenet.
@qingzhouzhen By the way, how is your shufflenet in Gluon coming along? Did you reproduce the paper's results?
OK, but I am doing some other work right now; I will tell you the result once I have tested your net.
No, not exactly. I do not know how to use Gluon yet; for example, the 'concat' operation is not defined in gluon.nn, so I need some time to understand how to use it.
Thanks. Feel free to contact me if there are any problems.
To use operators like concat that are not in gluon.nn, there are two ways:
1. Use Block, which internally uses ndarray:
```python
from mxnet.gluon import Block
from mxnet import ndarray as F

class Net(Block):
    def __init__(self, **kwargs):
        ...

    def forward(self, x):
        # x is an NDArray
        return F.concat(x, ...)
```
2. Use HybridBlock, which internally uses either ndarray or symbol:
```python
from mxnet.gluon import HybridBlock

class Net(HybridBlock):
    def __init__(self, **kwargs):
        ...

    def hybrid_forward(self, F, x):
        # x is an NDArray or a Symbol, depending on whether
        # the block has been hybridized
        return F.concat(x, ...)
```
Thanks a lot. Could you show me a detailed example of how to use operators that are not in gluon.nn, if you know one?
For example, transpose is not in gluon.nn:
```python
import mxnet as mx
from mxnet.gluon import Block, nn
from mxnet import ndarray as F

class Net(Block):
    def __init__(self, num_class, **kwargs):
        super(Net, self).__init__(**kwargs)
        self.dense = nn.Dense(num_class, flatten=False)

    def forward(self, x):
        out = F.transpose(x, axes=(1, 0, 2))  # swap the first two axes
        out = self.dense(out)
        return out

net = Net(num_class=11)
net.initialize()
out = net(F.zeros((2, 3, 4), ctx=mx.cpu(0)))  # out has shape (3, 2, 11)
```
@qingzhouzhen have you reproduced the result?