havarot

A TypeScript package for getting syllabic data about Hebrew text with niqqud.

example

import { Text } from "havarot";
const heb: string = "אֱלֹהִים";
const text: Text = new Text(heb);
const sylText = text.syllables.map((syl) => syl.text);
sylText;
// [
//    "אֱ",
//    "לֹ",
//    "הִים"
//  ]

DOCS

The general idea of this package is that a Text is composed of Words which are composed of Syllables which are composed of Clusters which are composed of Characters.
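The containment described above can be pictured with a few plain interfaces. This is only a sketch: the interface names (`TextNode`, `WordNode`, and so on) and the `collectChars` helper are hypothetical and are not havarot's own classes.

```typescript
// Hypothetical shapes that mirror the containment described above.
// These are NOT havarot's own class definitions.
interface CharNode { text: string }
interface ClusterNode { chars: CharNode[] }
interface SyllableNode { clusters: ClusterNode[] }
interface WordNode { syllables: SyllableNode[] }
interface TextNode { words: WordNode[] }

// Walk the hierarchy top-down and collect every character in order.
function collectChars(text: TextNode): string[] {
  const out: string[] = [];
  for (const word of text.words) {
    for (const syl of word.syllables) {
      for (const cluster of syl.clusters) {
        for (const ch of cluster.chars) out.push(ch.text);
      }
    }
  }
  return out;
}

// "yad" (yod + qamats, dalet): one word, one syllable, two clusters.
const yad: TextNode = {
  words: [
    {
      syllables: [
        {
          clusters: [
            { chars: [{ text: "\u05D9" }, { text: "\u05B8" }] },
            { chars: [{ text: "\u05D3" }] },
          ],
        },
      ],
    },
  ],
};
const yadChars = collectChars(yad);
```

Each level only needs to know about the level directly beneath it, which is why the accessors documented below (`.words`, `.syllables`, `.clusters`, `.chars`) can be offered at several levels.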

Text

Text() requires an input string.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");

Text.original

Returns the original string.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.original;
// "הֲבָרֹות"

Text.text

Returns a string that has been decomposed and sequenced, with qamets qatan patterns converted to the appropriate Unicode character (U+05C7) and holem-waw sequences corrected.

import { Text } from "havarot";
const text: Text = new Text("וַתָּשָׁב");
text.text;
// "וַתָּשׇׁב"
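To give a feel for what "decomposed" can mean, here is a minimal sketch that uses the standard `String.prototype.normalize("NFD")`. That havarot decomposes this way is an assumption, and `decompose` is a hypothetical helper name.

```typescript
// Assumption: decomposition is approximated with the standard
// String.prototype.normalize("NFD"); havarot may do this differently.
function decompose(input: string): string {
  return input.normalize("NFD");
}

// U+FB2A is the precomposed "shin with shin dot" presentation form.
const precomposed = "\uFB2A";
const decomposed = decompose(precomposed);

// List the resulting code points as U+XXXX strings for inspection.
const codePoints = decomposed
  .split("")
  .map((c) => "U+" + ("0000" + c.charCodeAt(0).toString(16).toUpperCase()).slice(-4));
// codePoints is ["U+05E9", "U+05C1"]: the base letter, then the shin dot.
```

Note that NFD also applies Unicode's canonical ordering to combining marks, which is not necessarily the order a Hebrew text tool wants. That is one reason a separate sequencing step is useful.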

Text.words

Returns a one-dimensional array of Words.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.words;
// [Word { original: "הֲבָרֹות" }]

Text.syllables

Returns a one-dimensional array of Syllables.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.syllables;
// [
//    Syllable { original: "הֲ" },
//    Syllable { original: "בָ" },
//    Syllable { original: "רֹות" }
//  ]

Text.clusters

Returns a one-dimensional array of Clusters.

import { Text } from "havarot";
const text: Text = new Text("יָד");
text.clusters;
// [
//    Cluster { original: "יָ" },
//    Cluster { original: "ד" }
//  ]

Text.chars

Returns a one-dimensional array of Chars.

import { Text } from "havarot";
const text: Text = new Text("יָד");
text.chars;

//  [
//    Char { original: "י" },
//    Char { original: "ָ" },
//    Char { original: "ד" }
//  ]

Word

Text.text is split at each space and maqqef (U+05BE), both of which are captured. Thus, the string passed to instantiate each Word has already been decomposed and sequenced, with qamets qatan patterns converted to the appropriate Unicode character (U+05C7).
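Splitting while keeping the delimiters can be sketched with a single regular expression. The pattern and the `splitWords` name are hypothetical, and the sketch assumes each delimiter stays attached to the word before it; the package may attach it differently.

```typescript
// Assumption: each delimiter stays attached to the word that precedes it.
// MAQQEF is U+05BE; the only whitespace handled here is a plain space.
const WORD_PATTERN = /[^\u05BE ]*[\u05BE ]|[^\u05BE ]+/g;

function splitWords(text: string): string[] {
  // match() returns null when nothing matches, so fall back to [].
  return text.match(WORD_PATTERN) || [];
}

// Latin placeholders keep the example readable: "a" maqqef "b" space "c".
const sample = "a\u05BEb c";
const parts = splitWords(sample);
// parts is ["a\u05BE", "b ", "c"]; joining the parts restores the input.
```

Because no character is dropped, joining the pieces gives back the input string, which is what makes a separate trimmed `.text` value useful alongside `.original`.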

Word.original

Returns the original string passed to the Word, which has already been decomposed and sequenced, with qamets qatan patterns converted to the appropriate Unicode character (U+05C7) and holem-waw sequences corrected.

import { Text } from "havarot";
const text: Text = new Text("אֵיפֹה־אַתָּה מֹשֶה");
const words = text.words.map((word) => word.original);
words;
// [
//    "אֵיפֹה־",
//    "אַתָּה",
//    "מֹשֶׁה " (note the included space)
//  ]

Word.text

Returns a trimmed string built up from the .text of its constituent parts.

import { Text } from "havarot";
const text: Text = new Text("אֵיפֹה־אַתָּה מֹשֶה");
const words = text.words.map((word) => word.text);
words;
// [
//    "אֵיפֹה־",
//    "אַתָּה",
//    "מֹשֶׁה"
//  ]

Word.syllables

Returns a one-dimensional array of Syllables.

import { Text } from "havarot";
const text: Text = new Text("אֵיפֹה־אַתָּה מֹשֶה");
text.words[0].syllables;
// [
//    Syllable { original: "אֵי" },
//    Syllable { original: "פֹה־" }
//  ]

Word.clusters

Returns a one-dimensional array of Clusters.

import { Text } from "havarot";
const text: Text = new Text("אֵיפֹה־אַתָּה מֹשֶה");
text.words[0].clusters;
// [
//    Cluster { original: "אֵ" },
//    Cluster { original: "י" },
//    Cluster { original: "פֹ" },
//    Cluster { original: "ה־" }
//  ]

Word.chars

Returns a one-dimensional array of Chars.

import { Text } from "havarot";
const text: Text = new Text("אֵיפֹה־אַתָּה מֹשֶה");
text.words[0].chars;
// [
//    Char { original: "א" },
//    Char { original: "ֵ" }, (tsere)
//    Char { original: "י" },
//    Char { original: "פ" },
//    Char { original: "ֹ" }, (holem)
//    Char { original: "ה" },
//    Char { original: "־" }
//  ]

Syllable

A Syllable is created from an array of Clusters.

This is where things get tricky. The string from Word.original is passed into syllabify() from "./src/utils/syllabifier". This string is then converted into Clusters, which are grouped into syllables, since a syllable can have more than one cluster. The syllabify() function determines what constitutes a syllable and whether it is closed, accented, or final.

See the syllabification doc for how a syllable is determined.

Syllable.text

Returns a string that has been built up from the .text of its constituent parts.

import { Text } from "havarot";
const text: Text = new Text("וַיִּקְרָ֨א");
const sylText = text.syllables.map((syl) => syl.text);
sylText;
// [
//    "וַ",
//    "יִּקְ",
//    "רָ֨א"
//  ]

Syllable.clusters

Returns a one-dimensional array of Clusters.

import { Text } from "havarot";
const text: Text = new Text("וַיִּקְרָ֨א");
text.syllables[1].clusters;
// [
//    Cluster { original: "יִּ" },
//    Cluster { original: "קְ" }
//  ]

Syllable.chars

Returns a one-dimensional array of Chars.

import { Text } from "havarot";
const text: Text = new Text("וַיִּקְרָ֨א");
text.syllables[2].chars;
// [
//    Char { original: "ר" },
//    Char { original: "ָ" },
//    Char { original: "" }, i.e. \u{05A8} (does not print well)
//    Char { original: "א" }
//  ]

Syllable.isClosed

Returns a boolean.

import { Text } from "havarot";
const text: Text = new Text("וַיִּקְרָ֨א");
text.syllables[0].isClosed;
// true
text.syllables[2].isClosed;
// false

Syllable.isAccented

Returns a boolean.

Though Hebrew words are typically accented on the final syllable, this is not always the case.

import { Text } from "havarot";
const text: Text = new Text("וַיִּקְרָ֨א"); // note the taamei over the ר
text.syllables[0].isAccented; // i.e. "וַ"
// false
text.syllables[2].isAccented; // i.e. "רָ֨א"
// true

Syllable.isFinal

Returns a boolean.

import { Text } from "havarot";
const text: Text = new Text("וַיִּקְרָ֨א"); // note the taamei over the ר
text.syllables[0].isFinal; // i.e. "וַ"
// false
text.syllables[2].isFinal; // i.e. "רָ֨א"
// true

Cluster

A Cluster is a group of Hebrew characters consisting of:

  • an obligatory Hebrew consonant character
  • an optional ligature mark
  • an optional vowel
  • an optional taamei

A Syllable is a linguistic unit, whereas a Cluster is an orthographic one. The word יֹו֑ם is only one syllable, but it has three clusters: יֹ, ו֑, ם.
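The orthographic grouping can be approximated with one regular expression. This is a sketch under stated assumptions, not havarot's implementation: `CLUSTER_PATTERN` and `toClusters` are hypothetical names, and the sketch assumes a cluster starts at a Hebrew letter (U+05D0 to U+05EA) and absorbs every following mark in U+0591 to U+05C7.

```typescript
// Assumptions: a cluster starts at a Hebrew letter (U+05D0-U+05EA) and
// absorbs every following mark in U+0591-U+05C7 (accents, points, maqqef).
const CLUSTER_PATTERN = /[\u05D0-\u05EA][\u0591-\u05C7]*/g;

function toClusters(word: string): string[] {
  // match() returns null when nothing matches, so fall back to [].
  return word.match(CLUSTER_PATTERN) || [];
}

// yod + holam, vav + etnahta, final mem: the one-syllable word cited above.
const yom = "\u05D9\u05B9\u05D5\u0591\u05DD";
const yomClusters = toClusters(yom);
// yomClusters has three entries even though the word is a single syllable.
```

Grouping by letter-plus-marks needs no linguistic knowledge, which is exactly the difference between a Cluster and a Syllable.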

Because Hebrew orthography is both sublinear and supralinear, clusters can be encoded in various ways. For issues concerning normalization, see the SBL Hebrew Font Manual, p. 8.

Cluster.text

Returns a string that has been built up from the .text of its constituent parts.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
const clusters = text.clusters.map((cluster) => cluster.text);
// [
//  "הֲ",
//  "בָ",
//  "רֹ",
//  "ו",
//  "ת"
// ]

Cluster.chars

Returns a one-dimensional array of Chars.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.clusters[0].chars;
// [
//  Char { original: "ה" },
//  Char { original: "ֲ " },   i.e. \u{05B2} (does not print well)
// ]

Cluster.hasLongVowel

Returns true if any of the following long vowel characters is present:

  • \u{05B5} TSERE
  • \u{05B8} QAMATS
  • \u{05B9} HOLAM
  • \u{05BA} HOLAM HASER FOR VAV

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.clusters[0].hasLongVowel;
// false
text.clusters[1].hasLongVowel;
// true

Cluster.hasShortVowel

Returns true if any of the following short vowel characters is present:

  • \u{05B4} HIRIQ
  • \u{05B6} SEGOL
  • \u{05B7} PATAH
  • \u{05BB} QUBUTS
  • \u{05C7} QAMATS QATAN

import { Text } from "havarot";
const text: Text = new Text("מַלְכָּה");
text.clusters[0].hasShortVowel;
// true
text.clusters[2].hasShortVowel;
// false

Cluster.hasHalfVowel

Returns true if any of the following half vowel characters is present:

  • \u{05B1} HATAF SEGOL
  • \u{05B2} HATAF PATAH
  • \u{05B3} HATAF QAMATS

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.clusters[0].hasHalfVowel;
// true
text.clusters[1].hasHalfVowel;
// false

Cluster.hasVowel

Returns true if Cluster.hasLongVowel, Cluster.hasShortVowel, or Cluster.hasHalfVowel is true.

For the purposes of syllabification, a shewa is treated as a vowel and serves as the nucleus of a syllable. Because a Cluster is concerned with orthography, a shewa is not counted as a vowel character here.

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.clusters[0].hasVowel;
// true
text.clusters[4].hasVowel;
// false
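The vowel checks boil down to testing a cluster's string against small sets of code points. The sketch below copies the code points from the lists in the preceding sections; the constant and function names are hypothetical stand-ins, not the package's internals.

```typescript
// Code points copied from the lists above; all names are hypothetical.
const LONG_VOWELS = /[\u05B5\u05B8\u05B9\u05BA]/;
const SHORT_VOWELS = /[\u05B4\u05B6\u05B7\u05BB\u05C7]/;
const HALF_VOWELS = /[\u05B1-\u05B3]/;

function hasLongVowel(cluster: string): boolean {
  return LONG_VOWELS.test(cluster);
}
function hasShortVowel(cluster: string): boolean {
  return SHORT_VOWELS.test(cluster);
}
function hasHalfVowel(cluster: string): boolean {
  return HALF_VOWELS.test(cluster);
}
// A vowel of any of the three kinds; shewa (U+05B0) is deliberately excluded.
function hasVowel(cluster: string): boolean {
  return hasLongVowel(cluster) || hasShortVowel(cluster) || hasHalfVowel(cluster);
}

const heHatafPatah = "\u05D4\u05B2"; // he + hataf patah
const betQamats = "\u05D1\u05B8"; // bet + qamats
const lamedShewa = "\u05DC\u05B0"; // lamed + shewa

const results = [hasVowel(heHatafPatah), hasVowel(betQamats), hasVowel(lamedShewa)];
// results is [true, true, false]
```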

Cluster.hasMetheg

Returns true if the following character is present:

  • \u{05BD} METEG

import { Text } from "havarot";
const text: Text = new Text("הֲבָרֹות");
text.clusters[0].hasMetheg;
// false

Cluster.hasShewa

Returns true if the following character is present:

  • \u{05B0} SHEWA

import { Text } from "havarot";
const text: Text = new Text("מַלְכָּה");
text.clusters[0].hasShewa;
// false
text.clusters[1].hasShewa;
// true

Cluster.hasTaamei

Returns true if any of the following characters is present:

  • \u{0591}-\u{05AF}\u{05BF}\u{05C0}\u{05C3}-\u{05C6}\u{05F3}\u{05F4}

import { Text } from "havarot";
const text: Text = new Text("אֱלֹהִ֑ים");
text.clusters[0].hasTaamei;
// false
text.clusters[2].hasTaamei;
// true
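The meteg, shewa, and taamei checks follow the same pattern as the vowel checks: one character class per property. The code points below are taken from the lists above; the function names are hypothetical and only illustrate the idea.

```typescript
// Code points copied from the lists above; all names are hypothetical.
const METEG = /\u05BD/;
const SHEWA = /\u05B0/;
const TAAMEI = /[\u0591-\u05AF\u05BF\u05C0\u05C3-\u05C6\u05F3\u05F4]/;

function hasMeteg(cluster: string): boolean {
  return METEG.test(cluster);
}
function hasShewa(cluster: string): boolean {
  return SHEWA.test(cluster);
}
function hasTaamei(cluster: string): boolean {
  return TAAMEI.test(cluster);
}

const alefHatafSegol = "\u05D0\u05B1"; // alef + hataf segol
const heHiriqEtnahta = "\u05D4\u05B4\u0591"; // he + hiriq + etnahta
const lamedWithShewa = "\u05DC\u05B0"; // lamed + shewa

const taameiFlags = [hasTaamei(alefHatafSegol), hasTaamei(heHiriqEtnahta)];
// taameiFlags is [false, true]
```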

Char

A Hebrew character and its position number, used to sequence it correctly. See Cluster for notes on normalization.

Char.text

Returns a string of the character that is passed in.

import { Text } from "havarot";
const text: Text = new Text("אֱלֹהִ֑ים");
text.chars[0].text;
// "א"

Char.sequencePosition

Returns a number used for sequencing:

  • consonants = 0
  • ligatures = 1
  • dagesh or rafe = 2
  • niqqud (vowels) = 3
  • taamei (accents) = 4

import { Text } from "havarot";
const text: Text = new Text("אֱלֹהִ֑ים");
text.chars[0].sequencePosition; // the aleph
// 0
text.chars[1].sequencePosition; // the hataf segol
// 3
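A position number like this makes sequencing a cluster a matter of sorting its characters. The sketch below is not havarot's code. It assumes that "ligatures" means the shin and sin dots (U+05C1, U+05C2), and the fallback value 10 for unrecognized characters is an arbitrary choice made for illustration.

```typescript
// Assumptions (not confirmed by the text above): "ligatures" means the shin
// and sin dots (U+05C1, U+05C2), and anything unrecognized gets position 10.
function sequencePosition(ch: string): number {
  if (/[\u05D0-\u05EA]/.test(ch)) return 0; // consonants
  if (/[\u05C1\u05C2]/.test(ch)) return 1; // ligatures (assumed)
  if (/[\u05BC\u05BF]/.test(ch)) return 2; // dagesh or rafe
  if (/[\u05B0-\u05BB\u05C7]/.test(ch)) return 3; // niqqud (vowels)
  if (/[\u0591-\u05AF\u05C0\u05C3-\u05C6\u05F3\u05F4]/.test(ch)) return 4; // taamei
  return 10; // hypothetical fallback
}

// Reorder the characters of a single cluster by position. Array.prototype.sort
// is stable in ES2019+, so characters with equal positions keep their order.
function sequenceCluster(cluster: string): string {
  return cluster
    .split("")
    .sort((a, b) => sequencePosition(a) - sequencePosition(b))
    .join("");
}

// kaf + qamats + dagesh: the vowel was typed before the dagesh.
const unsequenced = "\u05DB\u05B8\u05BC";
const sequenced = sequenceCluster(unsequenced);
// sequenced is kaf + dagesh + qamats, i.e. "\u05DB\u05BC\u05B8"
```

Sorting within a cluster rather than across the whole string keeps each mark attached to its own consonant.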

Contributing

See the TODO list for some ideas of what needs to get done. Or feel free to open an issue or pull request.

See the terms list for a list of naming conventions.

Contributors

charlesloder, rivkahcarl
