cdqa-suite / cdqa-annotator

⛔ [NOT MAINTAINED] A web-based annotator for closed-domain question answering datasets with SQuAD format.

License: Apache License 2.0

JavaScript 6.13% HTML 5.58% Vue 88.30%
reading-comprehension question-answering natural-language-processing annotator vuejs

cdqa-annotator's People

Contributors

dependabot[bot], fmikaelian, rsjain1978

cdqa-annotator's Issues

Is there any limit on the number of rows in a data-frame for the annotator to load the JSON file?

Is there a limit on the number of rows a data-frame can have when it is passed to the df2squad function for conversion into a .json file?

I passed a data-frame of "1115 rows × 2 columns" to df2squad and loaded the resulting JSON into the cdqa annotator, but the tool showed only 341 documents.
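
To see whether the documents go missing during conversion or only when the annotator loads the file, it can help to count what the JSON actually contains. The following is a minimal sketch, assuming the file follows the usual SQuAD nesting (`data` > `paragraphs` > `qas`); `summarize_squad` is a hypothetical helper, not part of cdQA.

```python
import json


def summarize_squad(squad):
    """Count documents (titles), paragraphs, and questions in a SQuAD-style dict."""
    docs = squad.get("data", [])
    n_paragraphs = sum(len(doc.get("paragraphs", [])) for doc in docs)
    n_questions = sum(
        len(par.get("qas", []))
        for doc in docs
        for par in doc.get("paragraphs", [])
    )
    return {
        "documents": len(docs),
        "paragraphs": n_paragraphs,
        "questions": n_questions,
    }


if __name__ == "__main__":
    # Small inline sample; in practice, load the df2squad output with json.load().
    sample = json.loads(
        '{"version": "v2.0", "data": [{"title": "doc-1", "paragraphs": '
        '[{"context": "first", "qas": []}, {"context": "second", "qas": []}]}]}'
    )
    print(summarize_squad(sample))
```

If the document count reported here matches the number of data-frame rows, the conversion is probably fine and the discrepancy lies on the annotator side.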

Using tool to validate semi-automatically created question/answers pairs

This tool has the potential to encourage sharing more QA data from various domains.
One idea I think is worth looking into is combining it with semi-automatically created question/answer pairs (and documents) and using it to validate them,
for example by pre-populating the question/answer fields.

At the end of the excellent video tutorial
https://www.youtube.com/watch?v=YhVgl70Tn_k
an open source system is mentioned which, if I understood correctly, could help with this. (Close?)
Can someone share the link so we can explore whether it can be used for that purpose?

Upload a pre-defined set of questions

Hello Everybody

First of all, thanks for the great work you provided here. I think this tool is the one I needed for what I am about to do.

My question is similar to the one in this issue: since my questions will ALWAYS be the same across my ~1k documents, is there a way to pre-load them in the JSON I upload, so that I don't have to re-type the text of the answer every time and only need to select the answer in the context?
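
One possible approach is to add the fixed questions to every paragraph of the JSON before uploading it. The sketch below assumes the standard SQuAD layout; `prepopulate_questions` is a hypothetical helper, and whether the annotator displays questions that arrive with empty answers is an assumption that would need to be checked against the tool.

```python
import copy
import uuid


def prepopulate_questions(squad, questions):
    """Return a copy of a SQuAD-style dict with the same questions on every paragraph."""
    result = copy.deepcopy(squad)  # leave the caller's data untouched
    for doc in result.get("data", []):
        for par in doc.get("paragraphs", []):
            qas = par.setdefault("qas", [])
            existing = {qa.get("question") for qa in qas}
            for question in questions:
                if question in existing:
                    continue  # skip questions that are already present
                qas.append(
                    {
                        "id": str(uuid.uuid4()),
                        "question": question,
                        "answers": [],  # left empty for the annotator to fill in
                    }
                )
    return result
```

The function returns a new dictionary, so the original dataset can be kept as a backup and the result written out with `json.dump` for upload.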

Thanks a lot to everyone

Dependencies not found running vue serve

Linux Mint 19.xx
npm 6.13.1
vue 4.1.1
These dependencies were not found:

  • core-js/library/fn/array/from in ../node_modules/bootstrap-vue/esm/utils/array.js
  • core-js/library/fn/array/is-array in ../node_modules/bootstrap-vue/esm/utils/array.js
  • core-js/library/fn/object/assign in ../node_modules/bootstrap-vue/esm/utils/object.js
  • core-js/library/fn/object/is in ../node_modules/bootstrap-vue/esm/utils/object.js

Update JSON to SQuAD2.0

  • Add question id on addAnnotation() method
  • If answer is empty on addAnnotation() add "is_impossible":true otherwise false
{
    "version":"v2.0",
    "data":[
       {
          "title":"Beyonc\u00e9",
          "paragraphs":[
             {
                "qas":[
                   {
                      "question":"When did Beyonce start becoming popular?",
                      "id":"56be85543aeaaa14008c9063",
                      "answers":[
                         {
                            "text":"in the late 1990s",
                            "answer_start":269
                         }
                      ],
                      "is_impossible":false
                   },
                   {
                      "question":"What areas did Beyonce compete in when she was growing up?",
                      "id":"56be85543aeaaa14008c9065",
                      "answers":[
                         {
                            "text":"singing and dancing",
                            "answer_start":207
                         }
                      ],
                      "is_impossible":false
                   },
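
The two bullet points above can be sketched as follows. This is an illustration of the logic in Python, not the annotator's actual `addAnnotation()` code (which lives in the Vue front end); `make_qa` is a hypothetical name, and locating the answer with `str.find` is a simplification, since a real annotator would use the offset of the user's selection.

```python
import uuid


def make_qa(question, context, answer_text=""):
    """Build one SQuAD 2.0 style question entry with an id and is_impossible flag."""
    answer_text = answer_text.strip()
    qa_id = uuid.uuid4().hex  # random id; any unique string would do
    if not answer_text:
        # No answer selected: mark the question as unanswerable.
        return {
            "question": question,
            "id": qa_id,
            "answers": [],
            "is_impossible": True,
        }
    start = context.find(answer_text)  # simplification: first occurrence only
    if start == -1:
        raise ValueError("answer text not found in context")
    return {
        "question": question,
        "id": qa_id,
        "answers": [{"text": answer_text, "answer_start": start}],
        "is_impossible": False,
    }
```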

TypeError: _vm.json is null

[Vue warn]: Error in render: "TypeError: _vm.json is null"

found in

---> <AnnotationsPage> at src/components/AnnotationsPage.vue
       <HomePage> at src/components/HomePage.vue
         <App> at src/App.vue
           <Root>

Error in output json

The output file of the annotator is not recognised as JSON.

When we tried to load the JSON in a Jupyter notebook, we got the following error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xac in position 3852: invalid start byte
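
An error like this usually means the file was not saved as UTF-8. As a diagnostic, one can try a few common encodings in turn. The sketch below is an assumption about the cause, not a confirmed fix; `load_json_lenient` and its list of encodings are hypothetical, and a wrong guess can silently produce garbled characters, so the result should be inspected.

```python
import json


def load_json_lenient(path, encodings=("utf-8", "utf-8-sig", "cp1252", "latin-1")):
    """Try several encodings in order and return (data, encoding_used)."""
    with open(path, "rb") as handle:
        raw = handle.read()
    last_error = None
    for encoding in encodings:
        try:
            return json.loads(raw.decode(encoding)), encoding
        except (UnicodeDecodeError, json.JSONDecodeError) as error:
            last_error = error  # remember why this attempt failed, then try the next
    raise ValueError(f"could not parse {path!r} as JSON") from last_error
```

Once the working encoding is known, the file can be re-saved as UTF-8 with `json.dump(data, open(path, "w", encoding="utf-8"))` so that other tools read it without special handling.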

I get an AttributeError when I try your example code on my system during the prediction step, though it works fine in Colab

query = 'Since when does the Excellence Program of BNP Paribas exist?'
prediction = cdqa_pipeline.predict(query)

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
File "\localhost\c$\myfiles\new_cdqa\cdqa-master\cdqa\reader\bertqa_sklearn.py", line 326, in _example_to_features_parallel
query_tokens = tokenizer.tokenize(example.question_text)
AttributeError: 'NoneType' object has no attribute 'tokenize'
"""

The above exception was the direct cause of the following exception:

AttributeError Traceback (most recent call last)
in ()
1 query = 'Since when does the Excellence Program of BNP Paribas exist?'
----> 2 prediction = cdqa_pipeline.predict(query)

\localhost\c$\myfiles\new_cdqa\cdqa-master\cdqa\pipeline\cdqa_sklearn.py in predict(self, query, n_predictions, retriever_score_weight, return_all_preds)
182 retrieve_by_doc=self.retrieve_by_doc,
183 )
--> 184 examples, features = self.processor_predict.fit_transform(X=squad_examples)
185 prediction = self.reader.predict(
186 X=(examples, features),

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in fit_transform(self, X, y, **fit_params)
515 if y is None:
516 # fit method of arity 1 (unsupervised transformation)
--> 517 return self.fit(X, **fit_params).transform(X)
518 else:
519 # fit method of arity 2 (supervised transformation)

\localhost\c$\myfiles\new_cdqa\cdqa-master\cdqa\reader\bertqa_sklearn.py in transform(self, X)
1054 is_training=self.is_training,
1055 verbose=self.verbose,
-> 1056 n_jobs=self.n_jobs,
1057 )
1058

\localhost\c$\myfiles\new_cdqa\cdqa-master\cdqa\reader\bertqa_sklearn.py in convert_examples_to_features(examples, tokenizer, max_seq_length, doc_stride, max_query_length, is_training, verbose, n_jobs)
292 verbose,
293 )
--> 294 for (example_index, example) in enumerate(examples)
295 ],
296 )

C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py in map(self, func, iterable, chunksize)
264 in a list that is returned.
265 '''
--> 266 return self._map_async(func, iterable, mapstar, chunksize).get()
267
268 def starmap(self, func, iterable, chunksize=None):

C:\ProgramData\Anaconda3\lib\multiprocessing\pool.py in get(self, timeout)
642 return self._value
643 else:
--> 644 raise self._value
645
646 def _set(self, i, obj):

AttributeError: 'NoneType' object has no attribute 'tokenize'

Local host issue

Hi. I am a newbie and need some help. When I run the command npm run server, it says the app is running on localhost at http://localhost:8080/, but when I open that address it is not reachable and I get an error. How can I resolve this issue?

"Next" Button doesn't work

My dataset had 3 sets of data, each set having one paragraph and context. When I upload the JSON file to the annotator, the "Next" button works only on the first click; the second click causes an error: [Vue warn]: Error in render: "TypeError: Cannot read property 'context' of undefined"

JSON output cannot be used for training.

I created a SQuAD-compatible dataset from a set of PDF documents and uploaded it to the annotator. However, the downloaded annotation JSON is missing question ids. Is this a feature yet to be implemented?
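
Until the annotator writes ids itself, one workaround is to add them to the downloaded file before training. A minimal sketch, assuming the standard SQuAD nesting; `add_missing_ids` is a hypothetical helper, and random UUIDs are only one of many acceptable id schemes.

```python
import uuid


def add_missing_ids(squad):
    """Give every question without an "id" a fresh unique one (in place).

    Returns the number of ids that were added.
    """
    added = 0
    for doc in squad.get("data", []):
        for par in doc.get("paragraphs", []):
            for qa in par.get("qas", []):
                if not qa.get("id"):
                    qa["id"] = uuid.uuid4().hex
                    added += 1
    return added
```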

'Next' button freezes after 3 documents.

I have tested with the MLQA dataset files as well as my custom dataset. The tool can process any number of single or multiple paragraphs up to the third 'document', then it freezes.
If I put all the context paragraphs in the same document, i.e. under the same 'title' in the JSON file, it works fine.

So:
1] I cannot browse previously existing datasets as they are.
2] I cannot put context paragraphs under different titles in new datasets.

I've tested on Chrome as well as the Edge browser.
I have installed the latest version from git.
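
The workaround described above (all context paragraphs under a single 'title') can be applied to an existing file with a short script. This is a sketch under the assumption of the standard SQuAD layout; `merge_under_one_title` and the default title are hypothetical, and the original titles are lost in the process.

```python
def merge_under_one_title(squad, title="merged"):
    """Return a new SQuAD-style dict with every paragraph under one document."""
    paragraphs = [
        par
        for doc in squad.get("data", [])
        for par in doc.get("paragraphs", [])
    ]
    # Keep top-level keys such as "version"; replace only the "data" list.
    merged = {key: value for key, value in squad.items() if key != "data"}
    merged["data"] = [{"title": title, "paragraphs": paragraphs}]
    return merged
```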
Issue with running vue serve on windows

Hello, I am having an issue with running vue serve on my Windows machine. When I try to run it, I get an error saying I'm missing (parts of) core-js. I have tried to install core-js with npm install core-js from the cdQA-annotator directory and am still getting the same issue.

I'm on Node v 12.13.0 and npm 6.12.0.

Full error:

 ERROR  Failed to compile with 4 errors                                                                    9:40:29 PM

These dependencies were not found:

* core-js/library/fn/array/from in ../node_modules/bootstrap-vue/esm/utils/array.js
* core-js/library/fn/array/is-array in ../node_modules/bootstrap-vue/esm/utils/array.js
* core-js/library/fn/object/assign in ../node_modules/bootstrap-vue/esm/utils/object.js
* core-js/library/fn/object/is in ../node_modules/bootstrap-vue/esm/utils/object.js

It suggests I install these with npm install --save core-js/library/fn/array/from [...], but that didn't solve the issue either.
