ucca_english-ewt's People

Contributors

danielhers, dotdv, omriabnd

Forkers

ruixiangcui

ucca_english-ewt's Issues

Relational noun coordination with outer possessive

In a786df3:

"my wife and kids"
"my family or friends"
"my friends and family"

Shouldn't the possessive be repeated as a remote, as suggested at huji-nlp/ucca#62? I.e. "my wife and (my) kids".

Consider that the possessive could be used with a coordinated relational noun and nonrelational noun:

my kids and dog
[my_A kids_S+A]_C and_N [(my)_S+A dog_A]_C

Tokenization divergences between UCCA and UD

Some tokenizations don't match between UCCA and UD.
E.g., in doc 020851, "Jack-s" (paragraph_position 18 in UCCA; sentence 2, tokens 13-14 in UD) is split in UD but is a single token in UCCA.
The same holds for doc 020992, "#2", paragraph_position 25;
doc 059005, "Max's", paragraph_position 3;
and doc 059416, "Fraiser's", paragraph_position 18.

This might be more complicated to fix as it affects annotations, but it would be nice if there is some way to make them match.
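
One way to list such divergences systematically is to align the two token sequences by character offsets in the whitespace-free text. The following is only a minimal sketch: the function names are hypothetical, the example sentence is invented (only the token "Max's" comes from the cases above), and it assumes both tokenizations spell exactly the same text once whitespace is removed.

```python
def char_spans(tokens):
    """Return (start, end) offsets of each token in the whitespace-free text."""
    spans, pos = [], 0
    for tok in tokens:
        spans.append((pos, pos + len(tok)))
        pos += len(tok)
    return spans


def tokenization_divergences(ucca_tokens, ud_tokens):
    """Return (ucca_group, ud_group) pairs where token boundaries differ."""
    if "".join(ucca_tokens) != "".join(ud_tokens):
        raise ValueError("token sequences do not spell the same text")
    a, b = char_spans(ucca_tokens), char_spans(ud_tokens)
    i = j = 0
    divergences = []
    while i < len(a) and j < len(b):
        i0, j0 = i, j
        end_a, end_b = a[i][1], b[j][1]
        # Advance whichever side is behind until both groups end together.
        while end_a != end_b:
            if end_a < end_b:
                i += 1
                end_a = a[i][1]
            else:
                j += 1
                end_b = b[j][1]
        i += 1
        j += 1
        # A one-to-one group means the tokenizations agree here.
        if i - i0 != 1 or j - j0 != 1:
            divergences.append((ucca_tokens[i0:i], ud_tokens[j0:j]))
    return divergences


# Invented sentence; only "Max's" is taken from the reported cases.
ucca = ["I", "love", "Max's", "food", "."]
ud = ["I", "love", "Max", "'s", "food", "."]
print(tokenization_divergences(ucca, ud))
```

Each reported pair groups the smallest run of tokens on both sides that covers the same characters, so a clitic split in UD shows up as one UCCA token against two UD tokens.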

Some bugs in passage id mapping

I noticed that something is off with the STREUSLE-mapped passage IDs.
E.g., 010820.xml in this repo really corresponds to UD doc_id 108338 (and the real UD document 10820 is in 107692.xml).
Another file that seems to be affected is 011257.xml (there might be more, but these were the ones I noticed).
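
A quick sanity check for the mapping could compare the text of each UCCA passage with the text of the UD document it is mapped to. The sketch below is an assumption about how one might do this, not a description of the existing mapping script: the function names, the dictionary inputs, and the placeholder IDs and texts are all hypothetical.

```python
import re


def normalize(text):
    """Lowercase and drop whitespace and punctuation so tokenization is irrelevant."""
    return re.sub(r"\W+", "", text).lower()


def suspicious_mappings(ucca_texts, ud_texts, id_map, prefix_len=40):
    """Return (ucca_id, ud_id) pairs whose normalized text prefixes differ."""
    bad = []
    for ucca_id, ud_id in id_map.items():
        u = normalize(ucca_texts[ucca_id])[:prefix_len]
        d = normalize(ud_texts.get(ud_id, ""))[:prefix_len]
        if u != d:
            bad.append((ucca_id, ud_id))
    return bad


# Placeholder IDs and invented texts, for illustration only.
ucca_texts = {"p1": "Great food, great service!", "p2": "Never going back."}
ud_texts = {"d1": "Great food , great service !", "d2": "Best pizza in town ."}
id_map = {"p1": "d1", "p2": "d2"}
print(suspicious_mappings(ucca_texts, ud_texts, id_map))
```

Comparing only a normalized prefix keeps the check tolerant of tokenization differences and of passages that cover only part of a document.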

UD/UCCA Mismatches

@nschneid wrote:

Other types of mismatches I noticed:

@danielhers wrote:
Punctuation is ignored in the matching, so that's not the reason.
I guess many cases are due to Function and Relator units,
and for the Participant Scenes it's mostly linkage mismatches.

When a noun is modified by a relative clause (which seems rather common in reviews), in UCCA the relative clause will sometimes be split to a Parallel Scene and then it's a separate unit.

Is it a UCCA error to not treat a relative clause as an E-scene?
E.g. "the software I included on my resume" (https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-EWT/blob/master-images/270502-0003.svg) has "included" as the main relation, which has to be an error.

Are nonrestrictive relative clauses treated as parallel scenes in UCCA?

There are also NP coordinations which are annotated as Parallel Scenes; https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-EWT/blob/master-images/225632-0008.svg looks like an error.

@danielhers wrote: There are also many cases of copular clauses, where in UD both the copula and any modifier of the noun are dependents of the noun, but in UCCA only the NP is a Participant.

Ah, this is huge. Because of how UD treats copulas, the subject and copula are dependents of the complement, but in UCCA usually the complement alone is a Participant.
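
The copula effect can be illustrated with a toy example. Everything here is an assumption made for illustration: the sentence is invented, the UD heads and relations are a hand-written analysis following the UD copula convention, the UCCA Participant span is assumed rather than taken from the corpus, and skipping `nsubj` and `cop` dependents is only one possible normalization, not something the matching script is known to do.

```python
def subtree_yield(heads, deprels, root, skip=()):
    """Return sorted 1-based token ids under `root`, pruning dependents
    whose relation is in `skip` (together with their subtrees)."""
    children = {}
    for tok, head in heads.items():
        children.setdefault(head, []).append(tok)
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(c for c in children.get(node, []) if deprels[c] not in skip)
    return sorted(out)


# Invented sentence: "This place is a real gem"
tokens = ["This", "place", "is", "a", "real", "gem"]
heads = {1: 2, 2: 6, 3: 6, 4: 6, 5: 6, 6: 0}  # the complement "gem" is the UD root
deprels = {1: "det", 2: "nsubj", 3: "cop", 4: "det", 5: "amod", 6: "root"}
ucca_participant = [4, 5, 6]  # assumed UCCA Participant: "a real gem"

full = subtree_yield(heads, deprels, 6)
pruned = subtree_yield(heads, deprels, 6, skip=("nsubj", "cop"))
print(full == ucca_participant)    # full UD yield covers the whole clause
print(pruned == ucca_participant)  # pruned yield covers only the complement NP
```

Under these assumptions the full yield of the complement's head spans the entire clause and cannot match the Participant, while the pruned yield does.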

Interestingly, "X's own" is a unit in UCCA but usually not a UD constituent (UniversalDependencies/docs#638). I'm not actually sure what the ideal semantics is for this construction.
