
mcl-wic's People

Contributors

federicomartelli, navigli

Forkers

strategist922

mcl-wic's Issues

Wrong indexes

Hi!
It seems to me that some of the target-token indexes are wrong.
I found the problem in the following instances:
ru-ru: 0, 4
en-en: 2, 3
en-ru: 13
Other language pairs may have the same problem.

Examples:
trial.en-en.2, resolution, NOUN, en, 17, en, 8, 'Ability to detect vertical movements , variations and dislocations of ground areas to a resolution of a few centimetres ;',
'Although Iraq was fully implementing all United Nations resolutions , it continued to suffer the devastating effects of the sanctions .'

trial.ru-ru.4, год, NOUN, ru, 3, ru, 12, 'В мае 1987 года на вершине горы Окубо в 4,1 км к западу от пусковой площадки был построен новый телеметрический центр.',
'Так , более 100 000 пуэрториканцев приняли участие в демонстрации 14 июля прошлого года в знак протеста против присоединения .'
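
A quick way to reproduce this kind of check, as a minimal sketch only. It assumes the index is a 0-based position in the whitespace-split sentence and uses a crude prefix match between lemma and token; both are my assumptions, not the documented data format, and the helper names are hypothetical.

```python
def token_at(sentence, index):
    """Return the whitespace token at `index`, or None if out of range."""
    tokens = sentence.split()
    return tokens[index] if 0 <= index < len(tokens) else None


def candidate_indices(sentence, lemma):
    """Indices of tokens that start with the lemma (crude heuristic)."""
    lemma = lemma.lower()
    return [i for i, tok in enumerate(sentence.split())
            if tok.lower().startswith(lemma)]


def index_looks_right(sentence, lemma, index):
    """True if the token at `index` starts with the lemma."""
    tok = token_at(sentence, index)
    return tok is not None and tok.lower().startswith(lemma.lower())


# First sentence of trial.en-en.2 and second sentence of trial.ru-ru.4
en_1 = ("Ability to detect vertical movements , variations and dislocations "
        "of ground areas to a resolution of a few centimetres ;")
ru_2 = ("Так , более 100 000 пуэрториканцев приняли участие в демонстрации "
        "14 июля прошлого года в знак протеста против присоединения .")

print(token_at(en_1, 17), candidate_indices(en_1, "resolution"))
print(token_at(ru_2, 12), candidate_indices(ru_2, "год"))
```

Under these assumptions, index 17 in the first en-en.2 sentence lands on "few" while "resolution" sits at 14. For ru-ru.4, index 12 lands on "прошлого" with plain whitespace splitting; it would land on "года" only if "100 000" were counted as a single token, so the tokenization used for the indexes may matter here.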

Regarding the mode of communication

What is the main channel of communication for the task? We have a few queries about it, but we are unable to find the Google group mentioned on the official SemEval page. Also, when will the training dataset be released?

Labels wrongly include numbers

Hi,
It seems to me that some of the labels for the Chinese and Arabic multilingual trial data are wrong, because they include a number in addition to the label, e.g.:
trial.zh-zh.0 3F

List of all ids where this is the case:

  • trial.ar-ar.0 2T
  • trial.ar-ar.1 3T
  • trial.ar-ar.2 2T
  • trial.ar-ar.3 3T
  • trial.ar-ar.4 2F
  • trial.ar-ar.5 3F
  • trial.ar-ar.6 2T
  • trial.ar-ar.7 3T
  • trial.ar-ar.8 2T
  • trial.ar-ar.9 3T
  • trial.zh-zh.0 3F
  • trial.zh-zh.1 3F
  • trial.zh-zh.2 2T
  • trial.zh-zh.3 3T
  • trial.zh-zh.4 2T
  • trial.zh-zh.5 3T
  • trial.zh-zh.6 2R
  • trial.zh-zh.7 2R
  • trial.zh-zh.8 3R
  • trial.zh-zh.9 3R
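
As a possible workaround until the files are fixed, here is a minimal sketch that strips the leading digits and flags anything that is still not a plain label. It assumes the only valid labels are "T" and "F" (my assumption); the function names are hypothetical.

```python
import re

VALID_LABELS = {"T", "F"}  # assumption: gold labels are exactly "T" or "F"


def strip_leading_digits(label):
    """Drop any digits at the start of a label, e.g. '3F' -> 'F'."""
    return re.sub(r"^\d+", "", label)


def audit(pairs):
    """Split (id, label) pairs into cleaned labels and still-invalid ones."""
    cleaned, invalid = {}, {}
    for instance_id, label in pairs:
        fixed = strip_leading_digits(label)
        target = cleaned if fixed in VALID_LABELS else invalid
        target[instance_id] = fixed
    return cleaned, invalid


# A few of the ids listed above
sample = [("trial.ar-ar.0", "2T"), ("trial.zh-zh.0", "3F"), ("trial.zh-zh.6", "2R")]
cleaned, invalid = audit(sample)
print(cleaned)
print(invalid)
```

Note that stripping the digit is not enough for trial.zh-zh.6 to trial.zh-zh.9: the remaining "R" is still neither "T" nor "F" under this assumption, so those labels would need a real correction.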

POS tags with extra trailing suffixes among the training instances

Hi!
There are 4 instances in the training data where an extra string follows the POS tag.
These are the instances at indices 2000, 2001, 3396 and 3397 in the list of training instances.
Is it safe to simply remove these trailing suffixes from the POS tags, or are these instances still pending and likely to be updated in a later release?
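
For anyone who wants to locate such instances themselves, a rough sketch follows. It assumes the training data is loaded as a list of dicts with a "pos" key and that the clean tag set is NOUN, VERB, ADJ and ADV; both are assumptions on my part, and the toy data (including the "_x" suffix) is made up for illustration.

```python
# Assumption: the set of clean POS tags; not taken from the task documentation.
ALLOWED_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def find_bad_pos(instances, key="pos"):
    """Return (index, tag) for each instance whose POS tag is not in ALLOWED_POS."""
    return [(i, inst[key]) for i, inst in enumerate(instances)
            if inst[key] not in ALLOWED_POS]


# Hypothetical toy data; the "_x" suffix is invented for this example only.
toy = [{"pos": "NOUN"}, {"pos": "VERB_x"}, {"pos": "ADJ"}]
print(find_bad_pos(toy))
```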
