
deeppavlov / question_generation



License: GNU General Public License v3.0

Languages: Python 54.68%, Shell 19.51%, Dockerfile 25.82%
Topics: seq2seq, deep-learning, natural-language-processing, docker, corenlp, opennmt

question_generation's Introduction

Description

This is a question-generator model: it takes text and an answer as input and outputs a question.

The question generator is trained in a seq2seq setup using OpenNMT (http://opennmt.net).

Environment

Setup

  • Run ./setup. This script downloads the Torch question-generation model, installs the Python requirements, pulls the Docker images, and starts the OpenNMT and CoreNLP servers.

Usage

./get_qnas "<text>" - takes text as input and outputs TSV, where:

  • the first column is the question,
  • the second column is the answer,
  • the third column is the score.

Example

Input:

./get_qnas "Waiting had its world premiere at the \
  Dubai International Film Festival on 11 December 2015 to positive reviews \
  from critics. It was also screened at the closing gala of the London Asian \
  Film Festival, where Menon won the Best Director Award."

Output:

who won the best director award ? menon -2.38472032547
when was the location premiere ?  11 december 2015  -6.1178450584412
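
The TSV output can also be consumed programmatically. Below is a minimal sketch, assuming ./get_qnas is invoked from the repository root and prints tab-separated question/answer/score lines as the TSV output above implies; the wrapper itself is illustrative and not part of the repository.

import subprocess

# Illustrative wrapper, not part of the repository: run ./get_qnas on a text
# and split each tab-separated output line into (question, answer, score).
def generate_questions(text):
    result = subprocess.run(["./get_qnas", text], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    triples = []
    for line in result.stdout.strip().splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            question, answer, score = parts
            triples.append((question.strip(), answer.strip(), float(score)))
    return triples

for question, answer, score in generate_questions(
        "Menon won the Best Director Award at the London Asian Film Festival."):
    print(question, "|", answer, "|", score)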

Notes

  • The first request may take a long time because the CoreNLP modules are still loading.
  • Do not forget to install the pyzmq dependencies (a quick import check follows below).
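
A minimal way to verify that the two Python requirements (pycorenlp and pyzmq, as listed in requirements.txt) are importable in the current environment; this check is illustrative and not part of the repository scripts.

import importlib

# Illustrative check, not part of the repository: confirm that the two Python
# requirements (pycorenlp==0.3.0, pyzmq==16.0.2 per requirements.txt) import.
for module_name in ("pycorenlp", "zmq"):
    try:
        importlib.import_module(module_name)
        print(module_name, "is importable")
    except ImportError as error:
        print(module_name, "is missing:", error)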

question_generation's People

Contributors

madrugado, sld


question_generation's Issues

How to set up on Windows?

Hello,

I'm having issues setting this up on Windows 10. When I run the setup file, it gives me this error: "setup: line 15: pip3: command not found"

and then:

Traceback (most recent call last):
  File "convert_text_to_opennmt_format.py", line 3, in <module>
    from pycorenlp import StanfordCoreNLP
ModuleNotFoundError: No module named 'pycorenlp'

Traceback (most recent call last):
  File "get_qnas.py", line 2, in <module>
    import zmq, sys, json
ModuleNotFoundError: No module named 'zmq'

When I try to install pyzmq==16.0.2 manually, it fails; pycorenlp==0.3.0 installs successfully.

Model.t7 not extracted. 404 Not found

It seems as though the Dropbox link for the model is no longer working. Here is my output, clipped:

shasum: data/model.t7: 
data/model.t7: FAILED open or read
shasum: WARNING: 1 listed file could not be read
----------------Downloading question-generation model----------------
--2018-08-25 10:13:17--  https://www.dropbox.com/s/jtp6ll263lz32xl/model.t7
Resolving www.dropbox.com (www.dropbox.com)... 162.125.6.1, 2620:100:601c:1::a27d:601
Connecting to www.dropbox.com (www.dropbox.com)|162.125.6.1|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /s/raw/jtp6ll263lz32xl/model.t7 [following]
--2018-08-25 10:13:17--  https://www.dropbox.com/s/raw/jtp6ll263lz32xl/model.t7
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 404 Not Found
2018-08-25 10:13:17 ERROR 404: Not Found.

----------------Installing python requirements----------------
The directory '/home/ndesai/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/ndesai/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Requirement already satisfied: pycorenlp==0.3.0 in /usr/local/lib/python3.5/dist-packages (from -r requirements.txt (line 1)) (0.3.0)
Requirement already satisfied: pyzmq==16.0.2 in /usr/local/lib/python3.5/dist-packages (from -r requirements.txt (line 2)) (16.0.2)
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from pycorenlp==0.3.0->-r requirements.txt (line 1)) (2.9.1)
----------------Pulling corenlp and opennmt docker images----------------
3.6.0: Pulling from sld3/corenlp
Digest: sha256:9cdd0d21326868cfcaaaed6e2d1db2507c059c9634378a99df7fd3e0317cb490
Status: Image is up to date for sld3/corenlp:3.6.0
latest: Pulling from sld3/opennmt
Digest: sha256:af916e064bd1e2414f8e3eb7608475e692abdde8265111a7736a88dacf8f1410
Status: Image is up to date for sld3/opennmt:latest
----------------Running corenlp and opennmt servers----------------
c2f76b73b2dc0fad18c0608dfe5b15d723b2595cadbf684e2fa6106e05909d90
353c38e804e0ff8b0fee829ac6c942640d2d953e50b3ad1bbd8840644035ac57
----------------Test output----------------
Waiting had its world premiere at the   Dubai International Film Festival on 11 December 2015 to positive reviews   from critics. It was also screened at the closing gala of the London Asian   Film Festival, where Menon won the Best Director Award.

The code just gets stuck right after this.

Code is stuck

I installed all the required libraries. When I run ./setup, it gets stuck after some time and remains stuck for at least 8 hours.

Here is the output:

data/model.t7: OK
----------------Installing python requirements----------------
Collecting pycorenlp==0.3.0 (from -r requirements.txt (line 1))
Collecting pyzmq==16.0.2 (from -r requirements.txt (line 2))
Using cached pyzmq-16.0.2-cp35-cp35m-manylinux1_x86_64.whl
Collecting requests (from pycorenlp==0.3.0->-r requirements.txt (line 1))
Using cached requests-2.18.4-py2.py3-none-any.whl
Collecting chardet<3.1.0,>=3.0.2 (from requests->pycorenlp==0.3.0->-r requirements.txt (line 1))
Using cached chardet-3.0.4-py2.py3-none-any.whl
Collecting idna<2.7,>=2.5 (from requests->pycorenlp==0.3.0->-r requirements.txt (line 1))
Using cached idna-2.6-py2.py3-none-any.whl
Collecting urllib3<1.23,>=1.21.1 (from requests->pycorenlp==0.3.0->-r requirements.txt (line 1))
Using cached urllib3-1.22-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->pycorenlp==0.3.0->-r requirements.txt (line 1))
Using cached certifi-2018.1.18-py2.py3-none-any.whl
Installing collected packages: chardet, idna, urllib3, certifi, requests, pycorenlp, pyzmq
Successfully installed certifi-2018.1.18 chardet-3.0.4 idna-2.6 pycorenlp-0.3.0 pyzmq-16.0.2 requests-2.18.4 urllib3-1.22
You are using pip version 8.1.1, however version 9.0.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
----------------Pulling corenlp and opennmt docker images----------------
3.6.0: Pulling from sld3/corenlp
Digest: sha256:9cdd0d21326868cfcaaaed6e2d1db2507c059c9634378a99df7fd3e0317cb490
Status: Image is up to date for sld3/corenlp:3.6.0
latest: Pulling from sld3/opennmt
Digest: sha256:af916e064bd1e2414f8e3eb7608475e692abdde8265111a7736a88dacf8f1410
Status: Image is up to date for sld3/opennmt:latest
----------------Running corenlp and opennmt servers----------------
506572c0a313494f01a96970cd94086e8203a9f9e9b7e82ad26e8a0ba2a9acc3
8b8a841f687bb8e460a85f1f22ed9ae40c6fb0f36b66698bf7ba196b79f7d04e
----------------Test output----------------
Waiting had its world premiere at the Dubai International Film Festival on 11 December 2015 to positive reviews from critics. It was also screened at the closing gala of the London Asian Film Festival, where Menon won the Best Director Award.
[{'src': 'waiting│O│UP│VBG│O had│O│LOW│VBD│O its│O│LOW│PRP$│O world│O│LOW│NN│O premiere│O│LOW│NN│O at│O│LOW│IN│O the│O│LOW│DT│O dubai│O│UP│NNP│O international│O│UP│NNP│O film│O│UP│NNP│O festival│O│UP│NNP│O on│O│LOW│IN│O 11│B│LOW│CD│DATE december│I│UP│NNP│DATE 2015│I│LOW│CD│DATE to│O│LOW│TO│O positive│O│LOW│JJ│O reviews│O│LOW│NNS│O from│O│LOW│IN│O critics│O│LOW│NNS│O .│O│LOW│.│O\n'}, {'src': 'it│O│UP│PRP│O was│O│LOW│VBD│O also│O│LOW│RB│O screened│O│LOW│VBN│O at│O│LOW│IN│O the│O│LOW│DT│O closing│O│LOW│NN│O gala│O│LOW│NN│O of│O│LOW│IN│O the│O│LOW│DT│O london│O│UP│NNP│O asian│O│UP│NNP│O film│O│UP│NNP│O festival│O│UP│NNP│O ,│O│LOW│,│O where│O│LOW│WRB│O menon│B│UP│NNP│PERSON won│O│LOW│VBD│O the│O│LOW│DT│O best│O│UP│JJS│O director│O│UP│NN│O award│O│UP│NN│O .│O│LOW│.│O\n'}]
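
For reference, each whitespace-separated item in the 'src' strings above is a token annotated with │-separated features; judging from the log, the order looks like token│answer-BIO-tag│case│POS│NER, but that ordering is inferred from the output rather than taken from the repository code. A small parsing sketch under that assumption:

# Illustrative sketch: split one of the 'src' strings shown above into
# per-token features. The field order (token, answer BIO tag, case, POS, NER)
# is inferred from the log output and is an assumption, not repository code.
def parse_src(src_line):
    tokens = []
    for item in src_line.split():
        token, answer_bio, case, pos, ner = item.split("│")
        tokens.append({"token": token, "answer_bio": answer_bio,
                       "case": case, "pos": pos, "ner": ner})
    return tokens

example = "menon│B│UP│NNP│PERSON won│O│LOW│VBD│O the│O│LOW│DT│O"
for features in parse_src(example):
    print(features)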

OpenNMT Container Error

I am still receiving this error even on the latest build:

bash: line 1: 6 Illegal instruction (core dumped) th tools/translation_server.lua -host 0.0.0.0 -port 5556 -model /root/data/model.t7 -beam_size 12

The setup never finishes.

Tuning the Model!

This project uses a pre-trained model. How can we fine-tune it or train our own model?

How can we train the model?

I'm doing some research on QG models and want to build a Chinese QG model. Can I take your training setup as a reference? I would appreciate it very much.

Issue while running ./get_qnas "<text>"

Hi,

I was trying this project with Docker. I was able to successfully install the project's dependencies using ./setup; everything is running and all the containers are up. However, I encounter an error when using ./get_qnas. Below is the error stack.

----------------Test output----------------
Waiting had its world premiere at the Dubai International Film Festival on 11 December 2015 to positive reviews from critics. It was also screened at the closing gala of the London Asian Film Festival, where Menon won the Best Director Award.
Traceback (most recent call last):
  File "convert_text_to_opennmt_format.py", line 120, in <module>
    main(text)
  File "convert_text_to_opennmt_format.py", line 106, in main
    output = json.loads(output, encoding='utf-8', strict=False)
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/json/__init__.py", line 367, in loads
    return cls(**kw).decode(s)
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/prakritidevvema/anaconda3/envs/rk/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
----------------Stopping containers----------------
corenlp
corenlp
opennmt
opennmt

Did anyone else encounter this issue, or did I do something wrong?

Thanks
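
The JSONDecodeError above means json.loads was handed something that is not JSON, typically an empty string or a plain-text error reply from one of the servers. A defensive parsing sketch; the parse_server_response helper is hypothetical and not part of the repository.

import json

# Hypothetical helper, not part of the repository: surface the raw server
# reply instead of letting json.loads fail with "Expecting value ... (char 0)".
def parse_server_response(raw):
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    if not text.strip():
        raise RuntimeError("Empty response from server; check that the "
                           "corenlp and opennmt containers are running.")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        raise RuntimeError("Non-JSON response from server: %r" % text[:200])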

[BUG] On some machines the opennmt container fails

Description

The opennmt container fails to process the input:

[07/12/17 14:35:29 INFO] Loading model
[07/12/17 14:35:29 INFO] Loading '/root/data/model.t7'...
[07/12/17 14:35:30 INFO] Model seq2seq trained on bitext
[07/12/17 14:35:30 INFO] Server initialized at tcp://0.0.0.0:5556
bash: line 1:     6 Illegal instruction     (core dumped) th tools/translation_server.lua -host 0.0.0.0 -port 5556 -model /root/data/model.t7 -beam_size 12

Reproducing

This bug can be reproduced only on specific machines.
You can create a Digital Ocean one-click app droplet with Docker 17.06 in the Amsterdam 01 datacenter.
Then clone the repo, install the dependencies, and run ./setup.
After that you will see the bug in the opennmt container logs.

But this bug cannot be reproduced in another datacenter!
For example, try the same steps in the Frankfurt 01 datacenter to see that everything works fine.

Related issue: http://forum.opennmt.net/t/translation-server-lua-crashes-with-core-dump/267

Update #2: it must be a problem with the particular VM that Docker was running on. I spun up the container on a different (physical) machine, and it's working fine.

Not running anyway!!

I am also trying to run this model, but it fails with different issues at different times. After installing all the necessary libraries, I am getting the following error and the code gets stuck for a while:

Exception: Check whether you have started the CoreNLP server e.g.
$ cd stanford-corenlp-full-2015-12-09/
$ java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer

Please help!!
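
The exception above is the generic message pycorenlp raises when it cannot reach a CoreNLP server at all. A minimal connectivity check, assuming the corenlp container started by ./setup exposes the standard CoreNLP port 9000 on localhost; the port mapping is an assumption, so adjust the URL to match the container.

from pycorenlp import StanfordCoreNLP

# Assumption: the corenlp container listens on the standard port 9000 on
# localhost; change the URL if ./setup maps the port differently.
nlp = StanfordCoreNLP("http://localhost:9000")
try:
    nlp.annotate("Menon won the Best Director Award.",
                 properties={"annotators": "tokenize,ssplit",
                             "outputFormat": "json"})
    print("CoreNLP server is reachable.")
except Exception as error:
    print("CoreNLP server is not reachable:", error)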

Zmq connection issue

File "zmq/backend/cython/socket.pyx", line 693, in zmq.backend.cython.socket.Socket.recv (zmq/backend/cython/socket.c:7683)
File "zmq/backend/cython/socket.pyx", line 727, in zmq.backend.cython.socket.Socket.recv (zmq/backend/cython/socket.c:7460)
File "zmq/backend/cython/socket.pyx", line 150, in zmq.backend.cython.socket._recv_copy (zmq/backend/cython/socket.c:2437)
File "zmq/backend/cython/socket.pyx", line 145, in zmq.backend.cython.socket._recv_copy (zmq/backend/cython/socket.c:2344)
File "zmq/backend/cython/checkrc.pxd", line 19, in zmq.backend.cython.checkrc._check_rc (zmq/backend/cython/socket.c:9823)
zmq.error.Again: Resource temporarily unavailable

The code gets blocked after displaying the output text. Kindly provide a solution for this issue.

Problem in ConnectionHandler class

I tried many times, and sometimes the program gets stuck; I think this is because self.sock.recv() does not receive a message. Is there any way to solve this problem?

import json
import zmq
from signal import signal, SIGPIPE, SIG_DFL

class ConnectionHandler:
    def __init__(self):
        signal(SIGPIPE, SIG_DFL)
        # REQ socket talking to the OpenNMT translation server
        self.sock = zmq.Context().socket(zmq.REQ)
        self.sock.connect("tcp://125.0.0.1:5556")

    def __call__(self, data):
        self.sock.send_string(json.dumps(data))
        recieved = json.loads(str(self.sock.recv(), "utf-8"), strict=False)
        recieved = [(row[0]['tgt'], row[0]['pred_score'], row[0]['src']) for row in recieved]
        # get_with_answers is defined elsewhere in the repository
        return get_with_answers(recieved)
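
A REQ socket's recv() blocks forever if the translation server never replies (for example when the opennmt container has crashed, as in the issues above). One way to make the hang visible is to poll with a timeout before receiving; the sketch below is illustrative and not the repository's code, and the address and timeout values are assumptions.

import json
import zmq

# Illustrative sketch, not repository code: poll the socket with a timeout so
# a dead translation server raises an error instead of blocking forever.
def request_with_timeout(data, addr="tcp://127.0.0.1:5556", timeout_ms=30000):
    sock = zmq.Context.instance().socket(zmq.REQ)
    sock.connect(addr)
    try:
        sock.send_string(json.dumps(data))
        if sock.poll(timeout_ms, zmq.POLLIN):
            return json.loads(sock.recv().decode("utf-8"))
        raise TimeoutError("No reply within %d ms; check the opennmt "
                           "container logs." % timeout_ms)
    finally:
        sock.close(linger=0)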
