epfl-dlab / genie
The autoregressive information extraction system GenIE (Generative Information Extraction) implemented in PyTorch.
License: MIT License
I found the aNLG component leaderboard from AI2: https://leaderboard.allenai.org/genie-anlg/submissions/public
They use ROUGE to evaluate performance, but I cannot find which variant of ROUGE is used. Any idea?
Running bash setup.sh raises the following error:
ERROR conda.cli.main_run:execute(32): Subprocess for 'conda run ['pip', 'install', '-r', 'pip_requirements.txt']' command failed. (See above for error)
Collecting numpy==1.20.3
Downloading numpy-1.20.3-cp38-cp38-macosx_10_9_x86_64.whl (16.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.0/16.0 MB 11.6 MB/s eta 0:00:00
Collecting jsonlines==2.0.0
Downloading jsonlines-2.0.0-py3-none-any.whl (6.3 kB)
ERROR: Could not find a version that satisfies the requirement pytorch==1.8.0 (from versions: 0.1.2, 1.0.2)
ERROR: No matching distribution found for pytorch==1.8.0
The error is raised because the pip package name for PyTorch is torch rather than pytorch.
A simple fix is to replace pytorch==1.8.0 with torch==1.8.0 in pip_requirements.txt.
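The rename described above can be sketched as a small text rewrite. This is a minimal illustration, not part of the repository: the function name fix_torch_requirement and the sample requirements text are assumptions, and only the pytorch==1.8.0 line is taken from the error log.

```python
import re


def fix_torch_requirement(requirements_text: str) -> str:
    """Rename a `pytorch` requirement to `torch`, keeping its version pin."""
    # Match `pytorch` only at the start of a line and only when it is followed
    # by a version specifier or the end of the line, so that other packages
    # whose names merely start with "pytorch" are left untouched.
    pattern = re.compile(r"^pytorch(?=[ \t]*(?:[=<>!~]|$))", flags=re.MULTILINE)
    return pattern.sub("torch", requirements_text)


# Hypothetical excerpt of a requirements file, for illustration only.
sample = "numpy==1.20.3\njsonlines==2.0.0\npytorch==1.8.0\n"
fixed = fix_torch_requirement(sample)
print(fixed)
```

Applying the function to the contents of pip_requirements.txt and writing the result back would have the same effect as editing the line by hand.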
Hi, I'm trying to use your code.
The demo code contains the import "from .utils import label_smoothed_nll_loss",
but I could not find that .utils module.
Where can I find it?
Thank you for reading my issue.
Hi, I am working on a tool to translate text to RDF format under the DBpedia ontology. I think your tool is great and I would like to use it in my project, but I have seen that you use the Wikidata ontology.
Do you think it is possible to use another ontology? What would be needed?
I believe that DBpedia Spotlight does something similar to what GENRE does.
Thanks in advance
May I ask what the license of the GenIE code and data is?
Hi @martinj96,
I have a question regarding the random initialisation (which is the default, right?). It is not clear to me how the tokenizer loaded with
tokenizer = transformers.BartTokenizer.from_pretrained("martinjosifoski/genie-rw")
is trained/obtained beforehand.
Thank you!
Hello,
The link to the custom prefix tree construction cell seems to be broken in these two cells in demo.ipynb:
To construct a prefix trie for your custom set of strings see this section.
and
The last two examples illustrate how the generation for any of the GenIE models can be constrained with an arbitrary prefix tries. See how you can construct your custom prefix trie
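Since the linked cell is unavailable, here is a minimal sketch of what a prefix trie over token-ID sequences looks like. It is a generic illustration and not GenIE's own trie class: the PrefixTrie name, its methods, and the token IDs are all hypothetical.

```python
from typing import Dict, Iterable, List


class PrefixTrie:
    """A minimal prefix trie over sequences of token IDs (illustrative only)."""

    def __init__(self, sequences: Iterable[List[int]] = ()):
        # Each node is a dict mapping a token ID to its child node.
        self.root: Dict[int, dict] = {}
        for sequence in sequences:
            self.add(sequence)

    def add(self, sequence: List[int]) -> None:
        """Insert one token-ID sequence into the trie."""
        node = self.root
        for token in sequence:
            node = node.setdefault(token, {})

    def next_tokens(self, prefix: List[int]) -> List[int]:
        """Return the token IDs that may follow `prefix`.

        An empty list means the prefix is either not in the trie or is a
        complete sequence with no continuation.
        """
        node = self.root
        for token in prefix:
            if token not in node:
                return []
            node = node[token]
        return sorted(node.keys())


# Hypothetical token-ID sequences; in practice they would come from
# tokenizing each allowed string with the model's tokenizer.
trie = PrefixTrie([[5, 7, 9], [5, 7, 2], [5, 3]])
print(trie.next_tokens([5]))
print(trie.next_tokens([5, 7]))
```

During constrained decoding, a lookup like next_tokens would be consulted at each step to restrict which tokens the model may generate next.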
Hello, thank you for your valuable work. I found it very interesting!
I have tried to run the GenIE/notebooks/Demo.ipynb notebook and I found some mismatches with the provided outputs. I was wondering if you have any idea of why this is happening.
For instance, under the Unconstrained Generation subsection, I get the following output:
[[{'text': ' <sub> KSAZ <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.1369926631450653}, {'text': ' <sub> KSAZ-TV <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.200978085398674}]]
while the expected one is:
[[{'text': ' <sub> KTRK, Carson <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.19589225947856903}, {'text': ' <sub> KTRK, Carson <rel> located in the administrative territorial entity <obj> Arizona <et> <sub> KSAZ <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.2037668377161026}]]
Same behaviour under the Constrained Generation subsection. For instance, the Small Schema Constrainted Generation output is:
[[{'text': ' <sub> Arizona <rel> capital <obj> Phoenix, Arizona <et>', 'log_prob': -0.21632088720798492}, {'text': ' <sub> Phoenix, Arizona <rel> capital of <obj> Arizona <et> <sub> Arizona <rel> capital <obj> Phoenix, Arizona <et>', 'log_prob': -0.3067542612552643}]]
while the expected one is:
[[{'text': ' <sub> Fox Broadcasting Company <rel> located in the administrative territorial entity <obj> Arizona <et> <sub> Phoenix, Arizona <rel> capital of <obj> Arizona <et> <sub> Arizona <rel> capital <obj> Phoenix, Arizona <et>', 'log_prob': -0.43319371342658997}, {'text': ' <sub> Fox Broadcasting Company <rel> headquarters location <obj> Arizona <et> <sub> Phoenix, Arizona <rel> capital of <obj> Arizona <et> <sub> Arizona <rel> capital <obj> Phoenix, Arizona <et>', 'log_prob': -0.4518451988697052}]]
Similarly, the output under the Large Schema Constrainted Generation is:
[[{'text': ' <sub> KSAZ <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.1369926631450653}, {'text': ' <sub> KSAZ-TV <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.200978085398674}]]
while the expected one is:
[[{'text': ' <sub> KTRK <rel> headquarters location <obj> Phoenix, Arizona <et> <sub> KSAZ-TV <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.22215303778648376}, {'text': ' <sub> KTRK <rel> headquarters location <obj> Phoenix, Arizona <et>', 'log_prob': -0.22950957715511322}]]
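To compare such outputs at the level of triples rather than raw strings, the linearized text can be parsed back into (subject, relation, object) tuples. The sketch below assumes only the <sub>, <rel>, <obj>, <et> markers visible in the outputs above; the helper name parse_triples is hypothetical and not part of GenIE.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# One triple is "<sub> subject <rel> relation <obj> object <et>".
TRIPLE_PATTERN = re.compile(r"<sub>(.*?)<rel>(.*?)<obj>(.*?)<et>")


def parse_triples(text: str) -> List[Triple]:
    """Split a linearized output string into (subject, relation, object) tuples."""
    triples = []
    for match in TRIPLE_PATTERN.finditer(text):
        subject, relation, obj = (part.strip() for part in match.groups())
        triples.append((subject, relation, obj))
    return triples


# Top-ranked strings from the Large Schema outputs quoted above.
obtained = parse_triples(
    " <sub> KSAZ <rel> headquarters location <obj> Phoenix, Arizona <et>"
)
expected = parse_triples(
    " <sub> KTRK <rel> headquarters location <obj> Phoenix, Arizona <et>"
    " <sub> KSAZ-TV <rel> headquarters location <obj> Phoenix, Arizona <et>"
)

# Triples present in one output but not in the other.
only_obtained = set(obtained) - set(expected)
only_expected = set(expected) - set(obtained)
print(only_obtained)
print(only_expected)
```

Comparing sets of triples makes it easier to see whether two runs differ in the extracted facts or only in ordering and scores.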
PS: I also had to downgrade torchmetrics to 0.6.0, as the default conda installation through the provided bash.sh script threw the following ImportError:
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).
The version of protobuf installed by default is 4.22.0 in my case; a fix is to downgrade the package to 3.20.x.
We could also consider specifying the version in the requirements.txt
setup.py
Demo.ipynb notebook:
"""Load the Model"""
import os

from genie.models import GeniePL

# DATA_DIR is assumed to be defined earlier in the notebook.
ckpt_name = "genie_r.ckpt"
path_to_checkpoint = os.path.join(DATA_DIR, 'models', ckpt_name)
model = GeniePL.load_from_checkpoint(checkpoint_path=path_to_checkpoint)
Explicitly set the protobuf package version in requirements.txt: protobuf==3.20
There may be better ways to fix the error.
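As an alternative to pinning, the second workaround listed in the error message can be applied from Python before the model is loaded. This is a sketch under stated assumptions: the helper protobuf_needs_downgrade is hypothetical, and the 3.20 threshold is taken from the "3.20.x or lower" wording of the error message.

```python
import os

# Workaround 2 from the error message: fall back to the pure-Python protobuf
# implementation. This must run before any module that imports generated
# _pb2 code is imported, and parsing will be slower.
os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"


def protobuf_needs_downgrade(version: str) -> bool:
    """Return True if `version` is newer than the 3.20.x series."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) > (3, 20)


# Versions mentioned in this issue; the installed version could be obtained
# with importlib.metadata.version("protobuf") if the package is present.
print(protobuf_needs_downgrade("4.22.0"))
print(protobuf_needs_downgrade("3.20.3"))
```

Pinning the version in the requirements file remains the more reproducible option, since the environment variable only hides the incompatibility at the cost of speed.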