Comments (2)
@fhardison Thanks for flagging this.
We are (currently!) shipping three versions of data in this repo:
- `nodes` contains this data in a set of nested Node elements suitable for many NLP systems and other systems that use recursive algorithms.
- `lowfat` contains the same data in a form more suitable for some kinds of query systems and some kinds of display.
- `TSV` contains the word-level data in a TSV table, without syntactic tree structure. This is simpler for many programs that do not need the complexity of graph structures.
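To illustrate why nested Node elements suit recursive algorithms, here is a minimal Python sketch that walks a small tree and collects its leaf words. The sample XML, the `Cat` attribute, and the helper name `leaf_words` are assumptions made up for illustration; the actual files in the repo may be structured differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element nesting and the Cat attribute are
# illustrative only and may not match the real nodes files.
SAMPLE = """
<Node Cat="S">
  <Node Cat="pp">
    <Node Cat="prep">in</Node>
    <Node Cat="noun">beginning</Node>
  </Node>
  <Node Cat="verb">created</Node>
</Node>
""".strip()


def leaf_words(node):
    """Recursively collect the text of leaf Node elements in document order."""
    children = node.findall("Node")
    if not children:
        # A Node without Node children is treated as a word-level leaf.
        text = (node.text or "").strip()
        return [text] if text else []
    words = []
    for child in children:
        words.extend(leaf_words(child))
    return words


root = ET.fromstring(SAMPLE)
words = leaf_words(root)
print(words)
```

A program that only needs the flat word sequence can skip this traversal entirely by reading the TSV table instead.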
We have an open issue for fixing the `lowfat` dataset, where there are indeed some gaps in the data.
I merged a PR last month (#110) that updated the `TSV` dataset; it is derived from the `nodes` dataset, and both `nodes` and `TSV` have all of the text from the WLC.
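If you want to check a derived export for gaps yourself, one rough approach is to compare its words against the word column of the TSV table. The sketch below is only an illustration: the column names `ref` and `text`, the sample rows, and the helpers `tsv_words` and `missing_words` are hypothetical and not taken from the repo.

```python
import csv
import io
from collections import Counter

# Hypothetical TSV content: the real column names and reference format
# in the repo may differ.
SAMPLE_TSV = (
    "ref\ttext\n"
    "GEN 1:1!1\tin\n"
    "GEN 1:1!2\tbeginning\n"
    "GEN 1:1!3\tcreated\n"
)


def tsv_words(tsv_file, text_column="text"):
    """Read the word column from a tab-separated file object."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row[text_column] for row in reader]


def missing_words(reference, candidate):
    """Return words that occur in reference more often than in candidate."""
    diff = Counter(reference) - Counter(candidate)
    return sorted(diff.elements())


reference = tsv_words(io.StringIO(SAMPLE_TSV))
candidate = ["in", "created"]  # stand-in for an export with a gap
gaps = missing_words(reference, candidate)
print(gaps)
```

This compares word counts only, so it would not notice words that are present but in the wrong order.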
I apologize for any confusion this may have caused you; I will be sure to loop back when we have an updated version of the `lowfat` data that corrects the issue from #65.
Please let us know if the `nodes` or `TSV` datasets won't cover what you're hoping to do with the data, and I'll see if there is anything else we can do to help.
from macula-hebrew.
Is a duplicate of #65