cnn_lstm_seq2seq's Introduction

CNN_LSTM_Seq2Seq

Abstractive Text Summarization Using Sequence to Sequence Model

Project Overview

Abstractive text summarization generates summaries by compressing the information in the input text in a lossy manner while preserving the main ideas. Unlike extractive approaches, which copy spans from the source, it can use words that do not appear in the text and reword the information to make the summaries more readable. This project uses a CNN-LSTM encoder and an LSTM decoder to generate headlines for articles from the Gigaword dataset. To improve the quality of the generated summaries, the model adds a Bahdanau attention mechanism, a pointer-generator network, and a beam-search inference decoder.
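To illustrate the pointer-generator step mentioned above, here is a minimal pure-Python sketch of how a generator distribution and a copy (pointer) distribution can be blended for one decoding step. It is not the repository's code: the function name, the argument names, and the assumption that the pointer weight equals 1 - p_gen are illustrative, and out-of-vocabulary handling is left out.

```python
def mix_distributions(p_gen, vocab_dist, attention, source_ids):
    """Blend generator and pointer distributions for one decoding step.

    p_gen       -- probability of generating from the vocabulary (0..1)
    vocab_dist  -- generator probabilities, one per vocabulary id
    attention   -- attention weights over source positions (sum to 1)
    source_ids  -- vocabulary id of the token at each source position
    """
    # Assumption: the pointer weight is the complement of p_gen.
    p_pointer = 1.0 - p_gen
    # Scale the generator distribution by p_gen.
    final = [p_gen * p for p in vocab_dist]
    # Add the attention mass of each source position to that token's id.
    for weight, token_id in zip(attention, source_ids):
        final[token_id] += p_pointer * weight
    return final


vocab_dist = [0.1, 0.6, 0.2, 0.1]   # toy vocabulary of four tokens
attention = [0.7, 0.3]              # two source positions
source_ids = [3, 1]                 # source tokens map to ids 3 and 1
final = mix_distributions(0.8, vocab_dist, attention, source_ids)
print(final)
```

Because both inputs are probability distributions and the two weights sum to one, the blended result is again a distribution, which is why copying a source word only shifts probability mass rather than adding to it.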

Install

This project requires Python 3.6 and the following Python libraries installed:

You will also need software installed to run a Jupyter Notebook.

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already has the above packages and more included. Make sure that you select the Python 3.6 installer.

Architecture

Hyperparameters

Parameters	Values
Kernel Size	[1,3,5]
Filter Size	100
Encoder Hidden Units	256
Encoder Layers	1
Decoder Hidden Units	512
Decoder Layers	1
Beam Width	10
Embedding	300d - GloVe
Dropout	0.5
Loss Function	torch.nn.CrossEntropyLoss
Optimizer	Adam Optimizer
Learning Rate	0.001
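The Beam Width entry in the table above controls how many partial hypotheses the inference decoder keeps at each step. The following is a minimal sketch of beam search over a toy next-token table, not the repository's decoder: `beam_search`, `toy_step`, and the token names are hypothetical, and no length normalization is applied.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=10, max_len=20):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(prefix) must return a dict {token: probability} for the next token.
    """
    beams = [(0.0, [start_token])]  # (log-probability, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, tokens in beams:
            for token, prob in step_fn(tokens).items():
                if prob > 0.0:
                    candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the beam_width best hypotheses from this step.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == end_token:
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses at max_len
    return max(finished, key=lambda c: c[0])[1]


def toy_step(prefix):
    """Hypothetical next-token model that depends only on the last token."""
    table = {
        "<s>": {"a": 0.6, "b": 0.4},
        "a": {"x": 0.5, "y": 0.5},
        "b": {"z": 0.9, "</s>": 0.1},
        "x": {"</s>": 1.0},
        "y": {"</s>": 1.0},
        "z": {"</s>": 1.0},
    }
    return table[prefix[-1]]


greedy_like = beam_search(toy_step, "<s>", "</s>", beam_width=1)
wider = beam_search(toy_step, "<s>", "</s>", beam_width=2)
print(greedy_like, wider)
```

With a width of 1 the search behaves like greedy decoding and commits to the locally best first token. A wider beam keeps the second-best first token alive and finds a sequence with a higher overall probability.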

Dataset

The model is trained on the Gigaword corpus found at https://github.com/harvardnlp/sent-summary. The dataset contains the first sentence of articles as the input text and the headlines as the ground-truth summaries.
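As a rough illustration of the input format described above, the sketch below pairs each article sentence with its headline. It assumes the data is stored as two line-aligned plain-text files; the file names and the `read_pairs` helper are hypothetical and are not taken from the repository.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Read a text file and return its lines without surrounding whitespace."""
    with open(path, encoding=encoding) as handle:
        return [line.strip() for line in handle]


def read_pairs(article_path, title_path):
    """Return (article_sentence, headline) pairs from two line-aligned files."""
    articles = read_lines(article_path)
    titles = read_lines(title_path)
    if len(articles) != len(titles):
        raise ValueError("article and title files must have the same number of lines")
    return list(zip(articles, titles))


# Demo with two tiny temporary files standing in for the real corpus.
with tempfile.TemporaryDirectory() as tmp:
    article_file = os.path.join(tmp, "demo.article.txt")
    title_file = os.path.join(tmp, "demo.title.txt")
    with open(article_file, "w", encoding="utf-8") as handle:
        handle.write("first sentence of article one .\nfirst sentence of article two .\n")
    with open(title_file, "w", encoding="utf-8") as handle:
        handle.write("headline one\nheadline two\n")
    pairs = read_pairs(article_file, title_file)

print(pairs)
```

The length check is there because a single missing line in either file would silently shift every later headline onto the wrong article.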

Results

The generated summaries achieved a ROUGE-1 score of 29.79, as measured with files2rouge.
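For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and the reference. The sketch below is a simplified pure-Python version for intuition only; it is not files2rouge, whose preprocessing may differ, and it does not reproduce the reported score. The README does not state whether 29.79 is recall or F1, so the function returns precision, recall, and F1.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, f1) for unigram overlap on whitespace tokens."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = rouge_1("police kill the gunman", "police killed the gunman")
print(precision, recall, f1)
```

In this toy pair three of the four tokens match ("kill" and "killed" differ without stemming), so precision, recall, and F1 are all 0.75.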

cnn_lstm_seq2seq's Issues

How to input a text and get a summarized one ?

Sorry to bother you, but I don't know how to get the summary text once the model has finished training. Could you please help? I would appreciate it.

Greedy Decoder: RuntimeError: CUDA out of memory during

I get the error below when running the Greedy Decoder script. Can you suggest the type of GPU and the amount of memory required to run it?
The code, run in Spyder, and the resulting error are given below. The machine is a Dell Precision Tower 7920.
with torch.no_grad():
    greedy_time = time.time()  # start timer
    loss_greedy = []
    greedy_predict = []
    model.eval()
    # initialize the encoder hidden states
    val_hidden = model.encoder.init_hidden(batch_size=32)
    for x_val, y_val in get_batches(val_text,val_summary,batch_size=32):
        # convert data to PyTorch tensor
        x_val = torch.from_numpy(x_val).to(device)
        y_val = torch.from_numpy(y_val).to(device)
        val_hidden = tuple([each.data for each in val_hidden])
        # run the greedy decoder
        val_loss, prediction = model.inference_greedy(x,y,val_hidden,criterion,batch_size=32)
        loss_greedy.append(val_loss.item())
        greedy_predict.append(prediction)

model.train()
print("Greedy Test: {0} s".format(time.time()-greedy_time))
print("Val Greedy Loss: {:.4f}".format(np.mean(loss_greedy)))

Traceback (most recent call last):

File "", line 14, in
val_loss, prediction = model.inference_greedy(x,y,val_hidden,criterion,batch_size=32)

File "", line 59, in inference_greedy
logits, d_hidden = self.decoder(dec_input,enc_output,d_hidden,x,batch_size)

File "/home/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)

File "", line 74, in forward
output_probability = torch.mul(p_pointer.unsqueeze(1),pointer_prob) + torch.mul(p_gen.unsqueeze(1),generator_prob)

RuntimeError: CUDA out of memory. Tried to allocate 11.75 MiB (GPU 0; 7.92 GiB total capacity; 5.84 GiB already allocated; 20.75 MiB free; 717.94 MiB cached)

Repository: murak038 / cnn_lstm_seq2seq
