Comments (3)
OK, it was simple:

```python
lstm = Bidirectional(LSTM(100, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(inputNet)  # worse with stateful=True
# lstm = SeqSelfAttention(attention_activation='sigmoid')(lstm)
lstm = attention_3d_block(lstm, timeSteps)
lstm = Bidirectional(LSTM(50, recurrent_dropout=0.4, dropout=0.4, return_sequences=True), merge_mode='concat')(lstm)  # worse with stateful=True
lstm = attention_3d_block(lstm, timeSteps)
lstm = Bidirectional(LSTM(20, recurrent_dropout=0.4, dropout=0.4, return_sequences=False), merge_mode='concat')(lstm)  # worse with stateful=True
```
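The snippet above is only the middle of a network. A minimal end-to-end sketch in `tf.keras` (the input shape, the single-output regression head, and the dimensions are hypothetical; the `attention_3d_block` is the Conv1D version posted in this thread) might look like:

```python
import tensorflow as tf
from tensorflow.keras.layers import (Input, LSTM, Bidirectional, Conv1D,
                                     Multiply, Dense)
from tensorflow.keras.models import Model

def attention_3d_block(inputs, timesteps):
    # Conv1D-based attention: one weight per (step, feature), softmax-normalised
    input_dim = int(inputs.shape[2])
    a_probs = Conv1D(input_dim, 3, strides=1, padding='same',
                     activation='softmax')(inputs)
    return Multiply()([inputs, a_probs])

time_steps, n_features = 30, 8  # hypothetical dimensions
inp = Input(shape=(time_steps, n_features))
x = Bidirectional(LSTM(100, recurrent_dropout=0.4, dropout=0.4,
                       return_sequences=True), merge_mode='concat')(inp)
x = attention_3d_block(x, time_steps)
x = Bidirectional(LSTM(50, recurrent_dropout=0.4, dropout=0.4,
                       return_sequences=True), merge_mode='concat')(x)
x = attention_3d_block(x, time_steps)
x = Bidirectional(LSTM(20, recurrent_dropout=0.4, dropout=0.4,
                       return_sequences=False), merge_mode='concat')(x)
out = Dense(1)(x)  # hypothetical single-value output head
model = Model(inp, out)
```

The final `Bidirectional(LSTM(20, ...))` with `return_sequences=False` collapses the time axis, so the model maps `(batch, 30, 8)` to `(batch, 1)`.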
from keras-attention.
By the way, I tried using attention with Conv1D so that the neighborhood length (set by the kernel size) contributes to the importance of the time step in question, and the results improved:
```python
def attention_3d_block(inputs, timesteps):
    # inputs: (batch, timesteps, input_dim)
    input_dim = int(inputs.shape[2])
    # kernel size 3: each step's weight also depends on its neighbors
    a_probs = Conv1D(input_dim, 3, strides=1, padding='same',
                     activation='softmax')(inputs)
    output_attention_mul = Multiply()([inputs, a_probs])  # name='attention_mul'
    return output_attention_mul
```
This way you also do not need to permute: it builds the attention vector for the time steps rather than for the variables, without any Permute layer.
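As a pure-NumPy shape check (sizes are hypothetical; the `softmax` stand-in mirrors what Keras's `'softmax'` activation does, normalizing over the last axis), the block multiplies the input elementwise by weights of the same shape, so the sequence shape is preserved:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

batch, timesteps, features = 2, 5, 4  # hypothetical sizes
inputs = np.random.randn(batch, timesteps, features)

# stand-in for the Conv1D output: same shape as the input,
# normalized the way Keras's 'softmax' activation would do it
scores = np.random.randn(batch, timesteps, features)
a_probs = softmax(scores, axis=-1)

weighted = inputs * a_probs  # Multiply() is elementwise

assert weighted.shape == (batch, timesteps, features)
```

Because the weighting is elementwise, the block can be dropped between any two `return_sequences=True` layers without changing tensor shapes.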
from keras-attention.
@rjpg Thanks! The attention block got updated, so this may be deprecated now.
from keras-attention.