

Prolog versions of the WordNet databases

License: Other



wordnet-prolog

https://github.com/ekaf/wordnet-prolog

Wordnet-prolog includes new versions of the WNprolog databases, compiled by Eric Kafe (https://github.com/ekaf/wordnet-prolog), and bundled with a copy of the original WNprolog-3.0 documentation (c) 2012 Princeton University.

WNprolog-3.1

WNprolog-3.1 is a Prolog version of WordNet 3.1. The Prolog databases were generated from the original WordNet 3.1 databases (c) 2011 Princeton University.

Some missing links were added to enforce full symmetry of the symmetric relations. This version also avoids duplicates, so each file contains only unique clauses. Clause counts per file:

  • wn_ant.pl: 7988
  • wn_at.pl: 1278
  • wn_cls.pl: 9559
  • wn_cs.pl: 221
  • wn_der.pl: 74781
  • wn_ent.pl: 408
  • wn_exc.pl: 6053
  • wn_fr.pl: 21684
  • wn_g.pl: 117791
  • wn_hyp.pl: 89172
  • wn_ins.pl: 8589
  • wn_mm.pl: 12288
  • wn_mp.pl: 9111
  • wn_ms.pl: 797
  • wn_per.pl: 8074
  • wn_ppl.pl: 73
  • wn_sa.pl: 4054
  • wn_sim.pl: 21434
  • wn_sk.pl: 207272
  • wn_s.pl: 207272
  • wn_syntax.pl: 1054
  • wn_vgp.pl: 1744
  • total: 804644
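What "full symmetry" means for a symmetric relation such as sim/2 can be illustrated with a small check. This is only a sketch under stated assumptions: the three facts and their synset ids are invented stand-ins for wn_sim.pl, and missing_reverse/2 is a hypothetical helper name, not a predicate shipped with the databases.

```prolog
% Toy stand-in for wn_sim.pl; the synset ids are invented for illustration.
sim(300000001, 300000002).
sim(300000002, 300000001).
sim(300000003, 300000001).   % reverse link deliberately left out

% missing_reverse(-A, -B): sim(A, B) holds but sim(B, A) does not.
% (Hypothetical helper, not part of the repository.)
missing_reverse(A, B) :-
    sim(A, B),
    \+ sim(B, A).
```

On a fully symmetric database the query ?- missing_reverse(A, B). would simply fail; on the toy facts it reports the one pair whose reverse link is absent.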

Other Prolog versions of WordNet

The wordnet-prolog repository also includes alternative branches with Prolog versions of WordNet 3.0 and Open English WordNet 2022.

Utilities:

wn_morphy.pl is a SWI-Prolog lemmatizer, similar to morphy, the morphological processor from WordNet.

wn_valid.pl is a SWI-Prolog program that tests for some potential issues in WordNet:

  • check_keys: ambiguous sense keys, pointing to more than one synset
  • symcheck: missing symmetry in the symmetric relations
  • asymcheck: direct loops in the asymmetric relations
  • hypself: self-hyponymous word forms
  • check_duplicates: duplicate clauses
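Two of the checks listed above can be sketched in a few lines. The code below is not taken from wn_valid.pl; it is a minimal illustration on invented facts, and the helper names direct_loop/2 and ambiguous_key/3 are hypothetical. It assumes the documented layouts hyp(synset_id, synset_id) and sk(synset_id, w_num, sense_key).

```prolog
% Toy facts; ids and sense keys are invented for illustration only.
hyp(100000002, 100000001).
hyp(100000001, 100000002).   % deliberate direct loop
hyp(100000003, 100000001).

sk(100000001, 1, 'alpha%1:03:00::').
sk(100000002, 1, 'beta%1:03:00::').
sk(100000003, 1, 'alpha%1:03:00::').   % deliberate ambiguous key

% direct_loop(-A, -B): hyp/2 holds in both directions.
% The A @< B test reports each loop once instead of twice.
direct_loop(A, B) :-
    hyp(A, B),
    hyp(B, A),
    A @< B.

% ambiguous_key(-Key, -S1, -S2): one sense key attached to two synsets.
ambiguous_key(Key, S1, S2) :-
    sk(S1, _, Key),
    sk(S2, _, Key),
    S1 @< S2.
```

Both queries are expected to fail on a clean database, so any solution points at a clause worth inspecting.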

The accompanying wn_query.pl file is a SWI-Prolog program that implements some common WordNet use cases and a few formal checks, such as symmetry and transitive loop detection.
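Transitive loop detection can be pictured as a reachability search that remembers visited synsets. The following is a sketch, not the code in wn_query.pl: the hyp/2 facts and ids are invented, and on_cycle/1 and reach/3 are hypothetical names.

```prolog
% Toy hyp/2 facts, read as hyp(Hyponym, Hypernym). Ids are invented.
hyp(100000003, 100000002).
hyp(100000002, 100000001).
hyp(100000001, 100000003).   % deliberately closes a cycle
hyp(100000004, 100000003).   % leads into the cycle but is not on it

% on_cycle(+Start): Start can reach itself by following hyp/2 links.
on_cycle(Start) :-
    reach(Start, [Start], Start).

% reach(+Node, +Visited, +Target): Target is reachable from Node
% without passing through an already visited node, so the search
% terminates even when the graph contains cycles.
reach(Node, _, Target) :-
    hyp(Node, Target).
reach(Node, Visited, Target) :-
    hyp(Node, Next),
    \+ memberchk(Next, Visited),
    reach(Next, [Next|Visited], Target).
```

The visited list grows linearly with path length, which is fine for a sketch; a check over the full hypernym graph would probably want a more efficient set representation.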

For convenient interoperation with other projects, the wn2csv.pl program converts the Prolog databases to CSV files, which can be easily imported into most database systems.
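The general idea of such a conversion can be sketched with SWI-Prolog's library(csv). This is not how wn2csv.pl is necessarily written; the facts, the ids, and the export_hyp/1 helper are assumptions made for illustration.

```prolog
:- use_module(library(csv)).

% Toy stand-in for wn_hyp.pl; ids are invented.
hyp(100000002, 100000001).
hyp(100000003, 100000001).

% export_hyp(+File): write every hyp/2 fact as one CSV row.
% (Hypothetical helper, not part of the repository.)
export_hyp(File) :-
    findall(row(A, B), hyp(A, B), Rows),
    csv_write_file(File, Rows).
```

A call such as ?- export_hyp('hyp_sample.csv'). (the file name is arbitrary) would write one line per clause, with the two synset ids as columns.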

Type "make valid" or "make query" to run the SWI-Prolog programs, or "make csv" to generate CSV databases.

News (2020):

CSV versions of the WordNet databases (output by wn2csv.pl) are now available through the wncsv project at:

https://github.com/ekaf/wncsv


wordnet-prolog's Issues

Swapped frames arguments

Thanks to Todd Kelley from algonquincollege.com for catching this bug:

The fr(synset_id,f_num,w_num) predicate seems to behave as if it is actually fr(synset_id,w_num,f_num) with the f_num and w_num reversed.

You're completely right: in the fr/3 predicate, the second and third arguments are swapped w.r.t. the documentation. This is an "undocumented change" which has persisted since WNprolog-2.1 from back in 2006. Before that, the fr/3 predicate followed the order in the documentation.

I would rather fix the documentation than the dbs, since having the word number as second argument corresponds to the order used in all the other lexical predicates.
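For code written against the documented argument order, a thin wrapper is one possible workaround. The sketch below uses invented facts and a hypothetical predicate name, documented_fr/3; it only illustrates the swap described above.

```prolog
% Toy facts in the order actually used by the databases:
% fr(SynsetId, WNum, FNum). The synset id is invented.
fr(200000001, 1, 8).
fr(200000001, 2, 2).

% documented_fr(?Synset, ?FNum, ?WNum): the same facts viewed in the
% order given by the original documentation, fr(synset_id, f_num, w_num).
% (Hypothetical helper, not part of the repository.)
documented_fr(Synset, FNum, WNum) :-
    fr(Synset, WNum, FNum).
```

With this in place, ?- documented_fr(200000001, FNum, WNum). enumerates frame numbers before word numbers, while direct calls to fr/3 keep the word number in second position like the other lexical predicates.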

Failures caused by current_functor/2 in newer swipl versions

Some goals that used to succeed with swipl versions up to 8.x now fail with swipl 9.0. The failures occur when current_functor/2 returns unexpected functor/arity pairs that are neither in the loaded program files nor Prolog builtins. In particular, this causes trouble with WordNet's s/6 predicate, because current_functor(s, Arity) reports Arity=1 and Arity=3 in addition to the expected Arity=6.

The Swipl documentation explains this behaviour by the fact that current_functor/2 also returns noncurrent functors, which have not been garbage collected. Still, it is a mystery where s/1 and s/3 come from, and why they did not cause trouble with previous Swipl versions.

However, the analogous current_predicate/2 does not suffer from the same problem.
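One way to sidestep the problem is to ask which predicates are defined rather than which functors exist. The sketch below uses current_predicate/1, the Name/Arity form related to the current_predicate/2 mentioned above; the s/6 fact and the defined_arities/2 helper are invented for illustration.

```prolog
% Toy stand-in for wn_s.pl; the id, word, and counts are invented.
s(100000001, 1, example, n, 1, 0).

% defined_arities(+Name, -Arities): sorted arities for which a predicate
% called Name is currently defined, ignoring functors that merely exist
% somewhere in the atom/functor tables.
% (Hypothetical helper, not part of the repository.)
defined_arities(Name, Arities) :-
    findall(A, current_predicate(Name/A), As),
    sort(As, Arities).
```

Calling ?- defined_arities(s, As). should then include 6 once the database is loaded, without picking up functors that are not backed by a predicate definition.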

Morphy

Thanks to Chris from micallef.io for raising this question:

How do you query lemmas using your library of predicates? For example, the lemma for "walking", "walks", "walked" is "walk".

Older Princeton WordNet releases included a morphological analyser called "morphy", but it was not part of the WordNet databases, and was therefore included neither in the old WNprolog nor in the newer wordnet-prolog releases.

However, it may be time to consider including the morphological exceptions file, since this would be very easy, and perhaps even a Prolog equivalent of the "morphy" program.
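A Prolog equivalent could combine an exception lookup with suffix detachment, roughly as follows. This is a minimal sketch for verbs only and not the repository's lemmatizer: the facts and ids are invented, the exc/3 argument layout is an assumption, and lemma/2, known_verb/1, and verb_suffix/2 are hypothetical names. The suffix pairs follow the verb detachment rules listed in the WordNet morphy documentation.

```prolog
% Toy lexicon standing in for wn_s.pl and the exceptions file.
% Ids are invented; the exc(Pos, Inflected, Lemma) layout is assumed.
s(200000001, 1, walk, v, 1, 0).
s(200000002, 1, go, v, 1, 0).
exc(v, went, go).

% verb_suffix(Ending, Replacement): verb detachment rules.
verb_suffix(s, '').
verb_suffix(ies, y).
verb_suffix(es, e).
verb_suffix(es, '').
verb_suffix(ed, e).
verb_suffix(ed, '').
verb_suffix(ing, e).
verb_suffix(ing, '').

% known_verb(?Word): Word occurs as a verb in the toy lexicon.
known_verb(Word) :-
    s(_, _, Word, v, _, _).

% lemma(+Form, -Lemma): try the exception list, then the form itself,
% then each detachment rule, keeping only candidates found in the lexicon.
lemma(Form, Lemma) :-
    exc(v, Form, Lemma).
lemma(Form, Form) :-
    known_verb(Form).
lemma(Form, Lemma) :-
    verb_suffix(Ending, Replacement),
    atom_concat(Stem, Ending, Form),
    atom_concat(Stem, Replacement, Lemma),
    known_verb(Lemma).
```

On the toy lexicon, the forms walking, walks, and walked from the question all resolve to walk, and went resolves to go through the exception entry. A fuller version would add the noun and adjective rules and remove duplicate answers.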
