Comments (3)
At the moment, the model only supports the `title` and `abstract` fields of a paper, so the valid options for `--included-text-field` are `title`, `abstract`, or `title abstract`.
If you'd like to use sections other than the abstract and are using your own data, one workaround is to replace the content of `abstract` with the content of that specific section in your data, or to concatenate the two.
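The workaround above could be sketched roughly as follows. This is only an illustration, not code from the repository: it assumes your data is a dict mapping paper IDs to records with `title` and `abstract` keys, and the `introduction` field and the helper name `merge_section_into_abstract` are hypothetical.

```python
import copy


def merge_section_into_abstract(papers, section_field, replace=False):
    """Return a copy of `papers` whose abstracts also carry `section_field`.

    `papers` is assumed to map paper IDs to dicts with at least "title"
    and "abstract"; `section_field` is a hypothetical extra key in your data.
    """
    merged = copy.deepcopy(papers)  # leave the caller's data untouched
    for paper in merged.values():
        section_text = (paper.get(section_field) or "").strip()
        if not section_text:
            continue  # nothing to add; keep the original abstract
        abstract = (paper.get("abstract") or "").strip()
        if replace or not abstract:
            # Replace the abstract with the section text.
            paper["abstract"] = section_text
        else:
            # Concatenate the section text after the abstract.
            paper["abstract"] = abstract + " " + section_text
    return merged


# Illustrative records only; the "introduction" key is an assumption.
papers = {
    "p1": {
        "paper_id": "p1",
        "title": "A title",
        "abstract": "Short abstract.",
        "introduction": "Intro text.",
    },
    "p2": {"paper_id": "p2", "title": "Another title", "abstract": "Only an abstract."},
}

merged = merge_section_into_abstract(papers, "introduction")
replaced = merge_section_into_abstract(papers, "introduction", replace=True)
```

The merged records can then be written back out in whatever format you already feed to the model. Keep in mind that very long concatenated text may be cut off by the model's maximum sequence length, so replacing may be preferable to concatenating for long sections.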
from specter.
Thanks for your reply. I have tried both `abstract` and `title abstract` and got the same embedding.
Is this normal for the model?
from specter.
Sorry for the bad naming; we should fix this. The model includes the title by default, so `abstract` == `title abstract`.
If you want to modify this behavior or add other sections, you can easily change these lines:
Lines 248 to 259 in 673346f
from specter.