Comments (4)
Hi @ninalopatina, thanks for experimenting with OpenKiwi!
The WMT 2017 test data is available on their website here (there is a purple link with the gold-standard labels).
We should probably also include a link for these in the Quickstart document to make them more visible.
On the other hand, the test tags for 2018 are not public, because they are the same as the 2019 ones and 2019 is currently accepting submissions.
I'm closing this issue as it is solved and not really a bug with OpenKiwi. Feel free to re-open if you have any other questions!
Edit: I had wrongly stated that the 2017 gold files were also not available.
from openkiwi.
Thanks for looking into this so quickly, @captainvera. I had attempted to run this with the test data for 2017 & 2018, which I had obtained from the same site you linked. For both years, the test data includes only a .mt, a .src, and an .align file. There is no .tags file for either year, nor for 2016. Should I replace the test set links with the dev set, so that I have a .tags file to evaluate with?
Hey @ninalopatina! The .tags file is downloaded from a different location than the other test files. It's pointed out here:
You can download the .tags file from there and evaluate your model! You could also replace it with the dev set, but if you trained using the dev set for validation, that would just give you the validation scores you are already familiar with, not a "real" evaluation.
Thanks so much, @captainvera, I missed those links! I was planning to test out the pipeline with the dev data until the test data became available, but this will work much better.