nlp2rdf / ontologies

All ontologies used in NIF 2.0 (NIF-Core + vocabulary modules + helper modules)

Home Page: http://nlp2rdf.org

ontologies's People

Contributors

cirola2000, der-bruemmer, jimkont, jimregan, kurzum, neradis, ricardousbeck, vladimiralexiev, znerol

ontologies's Issues

ITSRDF properties for confidence and provenance

@VladimirAlexiev wrote:

Just noticed that this proposal also disregards the direct props that exist in ITSRDF [...]

I added a new section to the docs (http://nif.readthedocs.org/en/2.1-rc/prov-and-conf.html#relation-of-nif-2-1-companion-properties-to-itsrdf-properties) discussing the ITS semantics and why we decided to create our own complementary versions.

Commit 464c0d7 also added notes to the ontology documents, plus formal OWL declarations of how the properties relate. From my point of view these go as far as possible without imposing OWL inference ramifications on the ITSRDF project without prior coordination.

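nif:topic assertions contain illegal punning

This property structurally violates OWL 2 DL constraints: it is declared as an owl:DatatypeProperty, yet its rdfs:range is the class nif:Annotation. That forces nif:Annotation to be read as both a class and a datatype, a form of punning that OWL 2 DL does not allow: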

nif:topic 
    a owl:DatatypeProperty ;
    owl:versionInfo "0.0.1" ; 
    rdfs:label "topic" ;
    rdfs:comment """The topic of a string
    Changelog:
    * 0.0.1 initial commit of property"""@en ;
    rdfs:domain nif:String ;
    rdfs:range nif:Annotation .

@der-bruemmer Sebastian told me you added this property a bit 'quick-and-dirty' for GERBIL. How is it actually used there? (ObjectProperty pointing to nif:Annotation vs. DatatypeProperty with some string description)
@der-bruemmer Can you maybe also suggest a more fleshed-out description, with better guidance on when and how to use this property?
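
For discussion, here is a minimal sketch of two ways the declaration could be made OWL 2 DL compliant. Which one is right depends on how GERBIL actually uses the property, which is not settled in this thread, so both options are assumptions:

```turtle
@prefix nif:  <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Option A (assumption): topic values are resources, so the property
# becomes an object property and may keep a class as its range.
nif:topic
    a owl:ObjectProperty ;
    rdfs:label "topic"@en ;
    rdfs:domain nif:String ;
    rdfs:range nif:Annotation .

# Option B (assumption): topic values are plain strings, so the property
# stays a datatype property but gets a datatype as its range.
# Use this INSTEAD of Option A, never together with it.
# nif:topic
#     a owl:DatatypeProperty ;
#     rdfs:label "topic"@en ;
#     rdfs:domain nif:String ;
#     rdfs:range xsd:string .
```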

Are lots of sub-classes and sub-properties needed?

Consider the consequences of making sub-classes and sub-properties. Economy of representation (number of triples) is an important consideration to keep NLP as RDF a feasible idea (because NLP generates a lot of data), and NIF 2 thought carefully about that (counting triples for the Simple, Stanbol and OpenAnnotation profiles).
Injudicious use of sub-classes and sub-properties might induce NIF users to abandon RDFS... or NIF itself.
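
To make the triple-count concern concrete, here is a small illustration. It assumes a super-property nif:annotation above itsrdf:taIdentRef and nif:oliaLink (as proposed elsewhere in this tracker) and a store that materializes RDFS inferences; the document IRI is made up:

```turtle
@prefix nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dbr:    <http://dbpedia.org/resource/> .

# Assumed schema: two sub-properties of one general annotation property.
itsrdf:taIdentRef rdfs:subPropertyOf nif:annotation .
nif:oliaLink      rdfs:subPropertyOf nif:annotation .

# Asserted data: 2 annotation triples on one string.
<http://example.org/document/1#offset_20_24>
    itsrdf:taIdentRef dbr:Bill_Clinton ;
    nif:oliaLink <http://acoli.cs.uni-frankfurt.de/resources/olia/penn.owl#NNP> .

# Added when RDFS rule rdfs7 (subPropertyOf) is materialized: 2 more triples.
<http://example.org/document/1#offset_20_24>
    nif:annotation dbr:Bill_Clinton ,
                   <http://acoli.cs.uni-frankfurt.de/resources/olia/penn.owl#NNP> .
```

Each level of property hierarchy adds one materialized triple per asserted annotation triple. Query-time inference avoids the storage cost but moves the work to the query engine.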

remove nif:topic

@der-bruemmer says in #10: "The point of this property was to annotate a complete text with topical tags or entity classes, as opposed to a single entity... I'll ask the Gerbillians for more information."

These questions of mine were not answered, and no further information came from the Gerbillians:

  • If it's a property of the complete text, why not define the domain as nif:Context?
  • For "topical tag", why not use dct:subject but invent a new property?
    (Actually, its:taIdentRef is better for this purpose)
  • For "entity class" why not use its:taClassRef but invent a new property?

I don't see a need for such a property; we can use its:taIdentRef or its:taClassRef.
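
As a sketch of what that could look like, the whole-text annotation below uses only existing properties on a nif:Context. The document IRI and the chosen values are made up for illustration, and the prefix itsrdf: stands for what is written its: above:

```turtle
@prefix nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix dbr:    <http://dbpedia.org/resource/> .
@prefix nerd:   <http://nerd.eurecom.fr/ontology#> .

# Hypothetical document about Bill Clinton; IRI and values are illustrative.
<http://example.org/document/1#char=0,2000>
    a nif:Context ;
    # "topical tag": either of the next two lines, no new property needed
    dct:subject dbr:Bill_Clinton ;
    itsrdf:taIdentRef dbr:Bill_Clinton ;
    # "entity class" for the text as a whole
    itsrdf:taClassRef nerd:Person .
```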

comment on itsrdf:taAnnotatorsRef

In http://vladimiralexiev.github.io/Multisensor/#sec-4 I got this:

<#char=1116,1123> a nif:Word;
  itsrdf:taClassRef nerd:Organization;
  itsrdf:taConfidence 0.9; # means the same as "0.9"^^xsd:decimal
  itsrdf:taAnnotatorsRef "text-analysis|http://linguatec.com" . # !!!! a string literal, not an IRI

And later: "itsrdf:taAnnotatorsRef is not a URL but a specially formatted string (coming from the XML heritage of ITS, see 5.7 ITS Tools Annotation)"

But ITSRDF defines itsrdf:taAnnotatorsRef as an owl:ObjectProperty, so its value should be a URL, not a "specially formatted string". My mind must have been badly twisted to carry XML data over into RDF in such a twisted way.

To help people like me, please add a comment, e.g.:

itsrdf:taAnnotatorsRef rdfs:comment 
"""URL or URI of the software or person that made the annotation, eg 
<http://some-company.com> or <http://some-company.com/some-software> or 
<http://some-company.com/some-software/v1.23>.

Unlike ITS XML which uses a specially formatted string 
(e.g. "text-analysis|http://some-company.com"),
use a proper URL or URI, so you can attach extra info to it, eg

<http://some-company.com/some-software/v1.23>
  a prov:SoftwareAgent, doap:Version ;
  doap:shortdesc "Some Company's powerful Software" ;
  doap:revision "1.23".

Both itsrdf:taAnnotatorsRef and itsrdf:taConfidence apply to all annotations attached to the same node,
including itsrdf:taClassRef, itsrdf:taIdentRef, nif:oliaLink, nif:oliaClass, etc.
""" .

(Note: this assumes that NIF uses itsrdf:taAnnotatorsRef and not nif:provenance, as per #15. If not, put this comment on nif:provenance.)
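
Applied to the Multisensor snippet above, the suggested usage might look like the sketch below. The base IRI is made up, and typing <http://linguatec.com> as prov:Agent is only an assumption about how one would describe the annotator:

```turtle
@base <http://example.org/document/1> .
@prefix nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix nerd:   <http://nerd.eurecom.fr/ontology#> .
@prefix prov:   <http://www.w3.org/ns/prov#> .

<#char=1116,1123> a nif:Word ;
    itsrdf:taClassRef nerd:Organization ;
    itsrdf:taConfidence 0.9 ;
    # an IRI instead of the string "text-analysis|http://linguatec.com"
    itsrdf:taAnnotatorsRef <http://linguatec.com> .

# Because the annotator is now a node, extra information can hang off it.
<http://linguatec.com> a prov:Agent .
```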

nif:AnnotationUnit vs nifs:Annotation vs fise:EntityAnnotation vs fam:EntityAnnotation

@kurzum "do you also agree to merge nifs:Annotation from https://github.com/NLP2RDF/ontologies/blob/master/nif-core/nif-stanbol.ttl#L78 into nif-core and rename it AnnotationUnit ? We really need the feature to express basic alternative annotations."

Such a node is definitely needed. But let's examine the "prior art". In chronological order, I think it is:

  1. FISE (Stanbol): http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.html
  2. http://jens-lehmann.org/files/2013/iswc_nif.pdf fig.3, which uses FISE
  3. NIF Stanbol: http://persistence.uni-leipzig.org/nlp2rdf/specification/stanbol.html, incl nifs:Annotation
  4. FAM (Fusepool P3): https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md

2 is contradicted by 3, e.g.:

2                     3
nif:EntityAnnotation  fise:EntityAnnotation
nifs:extractedFrom    fise:extracted-from
nif:oliaConf          fise:confidence

But I guess 2 is older and 3 is current.

I don't know if someone has used 3 (NIF Stanbol). Multisensor hasn't.

I'd be happy if you consolidate all this into AnnotationUnit and friends.
KEY QUESTION: I guess that with NIF 2.1, the Stanbol profile will go away?

Ideally and if possible, this should be coordinated with the FAM people, but I don't know any of them...

An additional property for annotation assertions grouped by distinct individual?

As previously discussed, we want a hybrid scheme for annotations in NIF 3.0:

  • a lean way to attach annotations directly to nif:Strings when there is only one annotation
    per aspect
  • a scheme with an intermediate (blank) node for alternative annotations concerning the
    same aspect

Here is my idea in detail, already drafted as documentation text:

==== QUOTE START====

NIF 2.1 annotation schemes

NIF 2.1 offers two schemes to attach annotations to a nif:String individual S:

direct attachment

Assertions comprising the information for a given annotation are attached
directly, S being the subject of the corresponding triples. Several
sub-properties of nif:annotation are provided for this approach,
e.g. itsrdf:taIdentRef, nif:oliaLink.

If one intends to specify confidence and provenance information, no more than
one annotation property assertion per aspect may be attached directly, to prevent ambiguity:

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
    itsrdf:taConfidence "0.9"^^xsd:decimal .

To allow use of the direct attachment scheme to annotate several aspects of S
simultaneously, corresponding specialisations of nif:confidence and nif:provenance
are provided for some of the specialisations of nif:annotation (e.g. nif:oliaConf
and nif:oliaProv for nif:oliaLink):

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
    itsrdf:taConfidence "0.9"^^xsd:decimal ;
    nif:oliaLink <http://acoli.cs.uni-frankfurt.de/resources/olia/penn.owl#NNP> ;
    nif:oliaConf "0.95"^^xsd:decimal .

related T-Box assertions / constraints

SubObjectPropertyOf(itsrdf:taIdentRef nif:annotation)
SubObjectPropertyOf(nif:oliaLink  nif:annotation)

SubDataPropertyOf(itsrdf:taConfidence nif:confidence)
SubDataPropertyOf(nif:oliaConf nif:confidence)
FunctionalDataProperty(itsrdf:taConfidence) # this still has to be coordinated with the itsrdf maintainers
FunctionalDataProperty(nif:oliaConf)
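
For readers more used to Turtle, the axioms above could be serialized roughly as follows. This is a transcription of the functional-syntax statements (reading the super-property as nif:annotation); the explicit property typing is added as an assumption so that the fragment stands on its own:

```turtle
@prefix nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

# SubObjectPropertyOf(...)
itsrdf:taIdentRef a owl:ObjectProperty ;
    rdfs:subPropertyOf nif:annotation .
nif:oliaLink a owl:ObjectProperty ;
    rdfs:subPropertyOf nif:annotation .

# SubDataPropertyOf(...) and FunctionalDataProperty(...)
# The functional declaration on itsrdf:taConfidence still has to be
# coordinated with the ITSRDF maintainers.
itsrdf:taConfidence a owl:DatatypeProperty , owl:FunctionalProperty ;
    rdfs:subPropertyOf nif:confidence .
nif:oliaConf a owl:DatatypeProperty , owl:FunctionalProperty ;
    rdfs:subPropertyOf nif:confidence .
```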

individual per annotation

A new nif:AnnotationUnit individual is introduced for each annotation and
connected to S using nif:annotationUnit. Using this scheme, several
annotations for the same annotation aspect can be expressed for S
(for example, different links to Linked Data resources for the same
token obtained from several Named Entity Recognition and
Disambiguation systems):

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans> ;
        itsrdf:taConfidence "0.8"^^xsd:decimal ;
    ] .

related T-Box assertions / constraints

ObjectPropertyDomain(nif:annotationUnit nif:String)
ObjectPropertyRange(nif:annotationUnit nif:AnnotationUnit)

Each nif:AnnotationUnit carries exactly one piece of annotation
information. This can be a property assertion with nif:annotation,
nif:classAnnotation or nif:literalAnnotation, or the inherent
annotation of a text span if the annotation unit is also
a nif:TextSpanAnnotation.

To prevent ambiguity, each unit has at most one confidence and one
provenance assertion:

SubClassOf(nif:AnnotationUnit DataMaxCardinality( 1 nif:confidence))
SubClassOf(nif:AnnotationUnit ObjectMaxCardinality( 1 nif:provenance))
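
In Turtle, the domain, range and cardinality axioms of this section could be written roughly as below. This is a sketch that transcribes the functional-syntax statements; only the explicit typing of the property and the class is added:

```turtle
@prefix nif:  <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# ObjectPropertyDomain / ObjectPropertyRange
nif:annotationUnit
    a owl:ObjectProperty ;
    rdfs:domain nif:String ;
    rdfs:range nif:AnnotationUnit .

# DataMaxCardinality(1 nif:confidence), ObjectMaxCardinality(1 nif:provenance)
nif:AnnotationUnit
    a owl:Class ;
    rdfs:subClassOf
        [ a owl:Restriction ;
          owl:onProperty nif:confidence ;
          owl:maxCardinality "1"^^xsd:nonNegativeInteger ] ,
        [ a owl:Restriction ;
          owl:onProperty nif:provenance ;
          owl:maxCardinality "1"^^xsd:nonNegativeInteger ] .
```

Note that under OWL semantics a max-cardinality axiom does not by itself reject a second provenance IRI: a reasoner would instead infer that the two IRIs denote the same individual. Catching such cases in actual data is a job for a validation tool.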

unified use and querying of both schemes

Naturally, both annotation schemes can be combined for a nif:String individual:

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
    itsrdf:taConfidence "0.9"^^xsd:decimal ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans_(saxophonist)> ;
        itsrdf:taConfidence "0.4"^^xsd:decimal 
    ] ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans> ;
        itsrdf:taConfidence "0.8"^^xsd:decimal ;
    ] ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Stewart_(musician)> ;
        itsrdf:taConfidence "0.2"^^xsd:decimal
    ].

The property path feature in SPARQL allows for a concise way to access
annotations expressed using both schemes simultaneously:

PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>

select ?str ?link ?conf {
 ?str nif:annotationUnit? [
      itsrdf:taIdentRef ?link ;
      itsrdf:taConfidence ?conf
  ] ;
  a nif:String .
}

If the SPARQL processor in use supports RDFS inference (or if the relevant inferences are
materialized), it is possible to query more generally for all (object) annotations in a
NIF document in a similar fashion:

PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>

select ?str ?anno ?conf {
 ?str nif:annotationUnit? [
      nif:annotation ?anno ;
      nif:confidence ?conf
  ] ;
  a nif:String .
}

==== QUOTE END====

Do you also deem a specific nif:annotationUnit property justified and useful?

location of NIF3.0 and issue tracker

  1. Dimitris Kontokostas told me NIF 3.0 is coming out at the end of this month. Where can I take a look at a draft and a change log?
  2. What's the preferred tracker? I see issues tracked here, but some months ago I posted a couple at https://github.com/NLP2RDF/specification/issues
  3. Is it correct to make a reference to a linguistic resource, e.g. itsrdf:taIdentRef bn:123456? I thought that taIdentRef is for named entities. I like such usage, but it should be clarified with an example.

Cheers!

annotations over strings

The idea is to enable annotations over strings, so that we can track provenance at the annotation level if needed. Currently, it is proposed to use nif:AnnotationGroup for this purpose.

URL persistence vs modularisation

Modularisation of NIF moved a bunch of elements from the nif: namespace to the nif-ann: namespace. See "deprecated" in https://github.com/NLP2RDF/ontologies/blob/nif2.1/nif-core/nif-core.ttl.

Modularization is all nice, but have you considered the tons (megatons!) of NIF data already in existence? Is the word persistence in http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core# merely a charade?

I think this one change will prevent us in Multisensor from using NIF 2.1. I think that an overriding priority in NIF 2.1 must be backward compatibility.

Note: backward compatibility only starts with URL permanence. Other issues to consider are the consequences of making sub-classes and sub-properties. Economy of representation (number of triples) is an important consideration to keep NLP as RDF a feasible idea, and NIF 2 thought carefully about that. Injudicious use of sub-classes and sub-properties will force NIF users to abandon RDFS... or NIF itself.

nif:lang has multiple domains

nif-core.ttl defines nif:lang to have two domains:

# in manchester: "String and not Context"
rdfs:domain nif:String ;
# additional constraint, must not be used on a nif:Context.
rdfs:domain [ rdf:type owl:Class ; owl:complementOf nif:Context ] . 
# too complex: rdfs:domain [ owl:intersectionOf (  nif:String 
    #                         [ rdf:type owl:Class ; owl:complementOf nif:Context ]     ) ] . 

Unfortunately this doesn't work: rather than constraining the subject of nif:lang, it will infer said subject to have both types. So either:

  1. Use the "too complex" definition.
  2. Use just nif:String and leave the rest to a RDFUnit test case.
    I favor 2.

For 1, you should mention it's an owl:Class as follows.

     rdfs:domain [ rdf:type owl:Class ;
                   owl:intersectionOf ( nif:String
                                        [ rdf:type owl:Class ;
                                          owl:complementOf nif:Context ])].

Don't know if it's critical, but you can verify it by converting this Manchester notation to Turtle using this converter:

Prefix: nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
Class: nif:String
Class: nif:Context
ObjectProperty: nif:lang
    Domain: 
        nif:String and not nif:Context
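
The effect of the original two-domain declaration can be illustrated on a small example. The instance data is hypothetical, and the inferred triples are shown as comments because they come from a reasoner rather than from the file:

```turtle
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .

# Hypothetical data that breaks the intended rule: nif:lang on a nif:Context.
<http://example.org/document/1#char=0,100>
    a nif:Context ;
    nif:lang <http://lexvo.org/id/iso639-3/eng> .

# With both rdfs:domain statements in place, RDFS rule rdfs2 adds two types:
#   <http://example.org/document/1#char=0,100> a nif:String .
#   <http://example.org/document/1#char=0,100> a [ owl:complementOf nif:Context ] .
# An RDFS-only processor stops there and reports nothing.
# An OWL reasoner finds the second type incompatible with the asserted
# nif:Context and reports the data as inconsistent.
```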
