Comments (23)
@mostafa-saad data is the input for the LSTM. Clip is a binary indicator of the continuity of the data (sequence).
For example, you can give different input sequences (e.g., [1 2 3 4] and [1 2 3]) as one concatenated input as follows: data = [1 2 3 4 1 2 3], clip = [0 1 1 1 0 1 1]. A "0" indicates the head of a sequence. By default, clip is [0 1 1 1 ... 1], which assumes that only one sequence, starting from its head, is given as input.
You can also do several forward passes for a very long or variable-length sequence.
For example, data = [1 2 3 4 5] can be divided into 5 forward passes as follows:
- data=[1], clip=[0]
- data=[2], clip=[1]
- data=[3], clip=[1]
- data=[4], clip=[1]
- data=[5], clip=[1]
Although this seems very inefficient, it is necessary when a prediction is used as the input for the next time step (e.g., text modeling).
I don't think you have to use "clip" in your case: the input always comes from the data, and the input sequence is complete (starting from its head and continuous), so the default clip value should work.
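To make the encoding concrete, here is a small, framework-free Python sketch of how data and clip line up (`concat_with_clip` is my own helper for illustration, not part of caffe-lstm):

```python
def concat_with_clip(sequences):
    """Concatenate several sequences into one data stream and build
    the matching clip markers: 0 at each sequence head, 1 elsewhere."""
    data, clip = [], []
    for seq in sequences:
        for i, x in enumerate(seq):
            data.append(x)
            clip.append(0 if i == 0 else 1)
    return data, clip

# The example from above: [1 2 3 4] and [1 2 3] concatenated.
data, clip = concat_with_clip([[1, 2, 3, 4], [1, 2, 3]])
# data == [1, 2, 3, 4, 1, 2, 3]; clip == [0, 1, 1, 1, 0, 1, 1]
```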
from caffe-lstm.
https://github.com/jeffdonahue/caffe/tree/recurrent would be helpful.
I think it's no different from the example I provided.
You can define the LSTM layer on top of the K images in the prototxt.
As @nakosung mentioned, Jeff Donahue has another LSTM implementation (it seems likely to be merged into the master branch soon) with examples on images.
You can find the prototxt in his branch.
Thanks.
@junhyukoh
Thanks so much. One more question: what are the inputs for your LSTM layer? In the example, it takes two inputs, data and clip?
Thanks. Just to make sure I understand you: assume I am extending AlexNet with one LSTM layer, and say I have 3 videos for training, one with 4 frames, another with 3 frames, and a third with 5 frames. Should clip be as follows:
clip = [0 1 1 1 0 1 1 0 1 1 1 1]?
How can I use LevelDB to feed the clip input from the hard disk instead of from memory? Is it possible to just provide a text file?
I am just a novice in Caffe and still learning; sorry for the many questions.
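The clip array in the question can be generated mechanically from the per-video frame counts; a minimal sketch (`clip_from_lengths` is a made-up helper name, not a caffe-lstm function):

```python
def clip_from_lengths(lengths):
    """Clip markers for concatenated sequences: 0 at each sequence
    head, 1 for every continuing step."""
    clip = []
    for n in lengths:
        clip += [0] + [1] * (n - 1)
    return clip

# Three videos with 4, 3, and 5 frames:
print(clip_from_lengths([4, 3, 5]))
# [0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1]
```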
That's correct.
The current data_layer implementation (src/caffe/layers/data_layer.cpp) doesn't support clip.
So, you may have to implement your own data layer whose output is data/clip if you want to use LevelDB.
Another way is to feed data/clip directly from your own program, as in my example code (lstm_sequence.cpp), without using LevelDB. However, this doesn't run on a separate thread, so it might be slower than implementing a new data layer.
What about using an ImageData input layer with <image, label> pairs, where the images are dummies and the labels are the binary clip input? Do you think this would work?
I think it would work if you give the pair correctly.
Excuse me if this is a very simple question, but I am just starting to learn neural networks with Caffe.
Is it possible to use this network to train on a continuous sequence of 2 variables, e.g., [(2.77, 9.03), (2.01, 10.48), ...],
and then predict the next element in the sequence for a supplied input?
So for training I could have the sequence [[t0] ... [t9]] (a sequence of 10 time steps) as input and [t10] as the expected output, and then do the prediction in the same manner.
@mecp Yes. It's possible to train the network on multi-dimensional input/output.
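The windowed training pairs described in the question (10 steps in, 1 step out) can be built with a simple sliding window; a sketch, independent of Caffe (`make_windows` is my own illustrative helper):

```python
def make_windows(series, window=10):
    """Slide a fixed-length window over the series: each run of
    `window` steps is an input, the step that follows is the target."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets

# Toy 2-variable series like the one in the question:
series = [(2.77, 9.03), (2.01, 10.48)] * 8
X, y = make_windows(series, window=10)
# Each X[k] holds 10 two-dimensional steps; y[k] is the step after them.
```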
@junhyukoh what's the difference between the batch size N_ and the sequence length T_?
@HaiboShi In RNN training, a training example is a sequence x_{1}, x_{2}, ..., x_{T_}. We can define N_ such sequences as a mini-batch.
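As a conceptual sketch of how N_ sequences of length T_ might be packed into one flat blob, here is a time-major interleaved layout (row t*N + n holds step t of sequence n). This is only one common convention; check the layer source for the layout caffe-lstm actually expects:

```python
def pack_minibatch(sequences):
    """Pack N equal-length sequences into a flat time-major list:
    slot t*N + n holds step t of sequence n, so all step-t entries
    sit next to each other."""
    N = len(sequences)
    T = len(sequences[0])
    flat = [None] * (N * T)
    for n, seq in enumerate(sequences):
        for t, x in enumerate(seq):
            flat[t * N + n] = x
    return flat

# N_ = 2 sequences, T_ = 3 steps each:
flat = pack_minibatch([['a1', 'a2', 'a3'], ['b1', 'b2', 'b3']])
# flat == ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```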
@junhyukoh and the diffs of that mini-batch are summed together for updating the weights?
@HaiboShi Yes, the diff is accumulated over the mini-batch.
However, loss layers usually give normalized diffs to the bottom blobs (by dividing them by the size of the mini-batch).
So, the weight diff is actually normalized by the size of the mini-batch.
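A tiny numeric sketch of that normalization, independent of Caffe (the helper name is made up): summing per-example gradients and dividing by the batch size is equivalent to averaging them.

```python
def normalized_weight_diff(per_example_diffs):
    """Accumulate per-example gradients, then divide by the
    mini-batch size, mirroring a loss layer that normalizes its
    bottom diff by the batch size."""
    n = len(per_example_diffs)
    total = [0.0] * len(per_example_diffs[0])
    for g in per_example_diffs:
        for i, v in enumerate(g):
            total[i] += v
    return [v / n for v in total]

# Two examples with gradients [2, 4] and [4, 8] -> averaged [3.0, 6.0].
print(normalized_weight_diff([[2, 4], [4, 8]]))
```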
@junhyukoh Thanks, that helps a lot. Another specific question:
in the LSTM layer class there is a member Blob h_to_h_;
what does that variable stand for? I noticed that it appears only in the backward propagation step. Thanks!
@junhyukoh Also, it seems that there is no top diff data in the backward_cpu() function; I wonder how the gradient from the next layer is passed to the LSTM layer? Thanks! 💯
@HaiboShi h_to_h_ is an intermediate blob used to compute the h_{t+1} -> h_{t} gradient.
There is a top diff in the backward_cpu() function at Line 209:
Dtype* top_diff = top_.mutable_cpu_diff()
top_ shares its memory with the actual top blob.
@junhyukoh Hi, thanks for your reply. One more question:
what does clipping_threshold_ stand for? Is it related to the pre-gate gradient?
I notice you do the accumulation for the batch:
caffe_add(H_, dh_t_1, h_to_h, dh_t_1);
Does it mean that the h_{t} gradient is composed of the h_{t+1} gradients of all elements in one batch?
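For background, this is my reading of the mechanics being asked about, not an authoritative answer: in standard BPTT, the gradient for h_t is the sum of the diff arriving from the layer above at step t and the recurrent gradient flowing back from step t+1 (an elementwise sum over the H_ hidden units, not a sum across batch elements). A sketch, with threshold clipping shown only to illustrate what a clipping threshold does in general:

```python
def accumulate_dh(top_diff, h_to_h, clip_threshold=None):
    """One BPTT step (a generic sketch, not caffe-lstm's exact code):
    dh_t = diff from the layer above at step t
         + recurrent gradient from step t+1, elementwise over H_ units.
    If a threshold is given, clamp each component to [-thr, thr]."""
    dh = [a + b for a, b in zip(top_diff, h_to_h)]
    if clip_threshold is not None:
        dh = [max(-clip_threshold, min(clip_threshold, v)) for v in dh]
    return dh

# Elementwise sum over hidden units (exact in binary floats):
print(accumulate_dh([0.5, -0.25], [0.25, 0.5]))   # [0.75, 0.25]
# With clipping: 2.0 + 1.0 = 3.0 is clamped to the threshold 1.5.
print(accumulate_dh([2.0], [1.0], clip_threshold=1.5))  # [1.5]
```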
@junhyukoh Hi junhyukoh, I am new to Caffe. I have read your example and I have two questions:
First, in your example TotalLength = seq_length = 320, which means there is only one input sequence. However, if I have more sequences and train thousands of times, after the first pass the clip array becomes all 1s. What does it mean when clip is [1, 1, ...]? Should I continue to input another sequence after the first one with a head 0 in clip? (I mean this line: train_clip_blob->mutable_cpu_data()[0] = seq_idx > 0;)
Second, you reshape the input data during the test phase, which I cannot fully understand; also, there is no input data during the test, is there? Can you explain it, please? (These lines: test_data_blob->Reshape(shape);
test_clip_blob->Reshape(shape);)
I'll appreciate your answer, thanks a lot!
@junhyukoh And also, what is the difference between data and label? There is an object named 'data', but it is not mentioned in your code!
Hi @junhyukoh, I have a question about the "clip" array. Let's say that during the training phase my input "data" is [A B C (eos)] and the desired label is [W X Y Z (eos)]. Do "data", "label", and "clip" become something like this:

| Data | A | B | C | (EOS) | W | X | Y | Z |
|---|---|---|---|---|---|---|---|---|
| Label | W | X | Y | Z | (EOS) | | | |
| Clip | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
@junhyukoh What is the sequence length if I have a feature vector of shape (10, 50, 4, 4)?
@junhyukoh When training an LSTM with a single (long) repeated sequence and multiple epochs, should the clip value be 0 at the start of each epoch/data sequence, or just the first epoch?