dhowe / ritajs-v2

RiTa: generative language tools

Home Page: https://rednoise.org/rita

License: GNU General Public License v3.0

Languages: JavaScript 98.94%, CSS 0.65%, HTML 0.32%, ANTLR 0.06%, Shell 0.02%
Topics: generative-text, text-analysis, natural-language, rita

ritajs-v2's Introduction

RiTa: tools for generative natural language

RiTa is implemented in Java and JavaScript, with a common API for both, and is free/libre/open-source via the GPL license.

Features in v2.0

  • Smart lexicon search for words matching part-of-speech, syllable, stress and rhyme patterns
  • Fast, heuristic algorithms for inflection, conjugation, stemming, tokenization, and more
  • Letter-to-sound engine for feature analysis of arbitrary words (with/without lexicon)
  • Integration of the RiScript scripting language, designed for writers
  • New options for generation via grammars and Markov chains

Note: version 2.0 contains breaking changes -- please check the release notes

Installation

  • For node: npm install rita
  • For browsers: <script src="https://unpkg.com/rita"></script>
  • For developers

Example (node)

let RiTa = require('rita');

// to find rhymes
let rhymes = RiTa.rhymes('sweet');
console.log(rhymes);

// to analyze a sentence
let data = RiTa.analyze("The elephant took a bite!");
console.log(data);

// to load a grammar
let grammar = RiTa.grammar(jsonRules);
console.log(grammar.expand());

API

RiTa
RiTa.addTransform()
RiTa.alliterations()
RiTa.analyze()
RiTa.concordance()
RiTa.conjugate()
RiTa.evaluate()
RiTa.grammar()
RiTa.hasWord()
RiTa.isAbbrev()
RiTa.isAdjective()
RiTa.isAdverb()
RiTa.isAlliteration()
RiTa.isNoun()
RiTa.isPunct()
RiTa.isQuestion()
RiTa.isStopWord()
RiTa.isRhyme()
RiTa.isVerb()
RiTa.kwic()
RiTa.markov()
RiTa.pastPart()
RiTa.phones()
RiTa.pos()
RiTa.posInline()
RiTa.presentPart()
RiTa.pluralize()
RiTa.randomOrdering()
RiTa.randomSeed()
RiTa.randomWord()
RiTa.rhymes()
RiTa.search()
RiTa.sentences()
RiTa.singularize()
RiTa.soundsLike()
RiTa.spellsLike()
RiTa.stem()
RiTa.stresses()
RiTa.syllables()
RiTa.tokenize()
RiTa.untokenize()

RiMarkov
addText()
completions()
generate()
probability()
probabilities()
size()
toString()
toJSON()
fromJSON()

RiGrammar
addRule()
addRules()
expand()
removeRule()
toJSON()
toString()
fromJSON()

RiScript

RiScript is a writer-focused scripting language integrated with RiTa. It enables simple generative primitives within plain text for dynamic expansion at runtime. RiScript primitives can be used as part of any RiTa grammar or executed directly using RiTa.evaluate(). For more info, see this interactive notebook.
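To give a feel for what a generative primitive embedded in plain text looks like, here is a minimal, self-contained sketch of choice expansion. It is not RiScript and does not use RiTa: the "(a | b)" notation, the function name expandChoices, and the injectable rand parameter are all assumptions made for this illustration.

```javascript
// Illustration only: a toy expander, not the RiScript implementation.
// Replaces each innermost "(a | b | c)" group with one randomly chosen option.
function expandChoices(text, rand = Math.random) {
  const group = /\(([^()]*)\)/; // matches one innermost parenthesized group
  let result = text;
  let match;
  while ((match = group.exec(result)) !== null) {
    const options = match[1].split('|').map((s) => s.trim());
    const pick = options[Math.floor(rand() * options.length)];
    result =
      result.slice(0, match.index) +
      pick +
      result.slice(match.index + match[0].length);
  }
  return result;
}

console.log(expandChoices('The (elephant | tiger) took a (bite | nap)!'));
```

Passing a fixed rand function (for example () => 0) makes the output repeatable, which is handy when testing generated text.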




Developing

To install/build the library and run tests (with npm/mocha and node v14.x):

$ git clone https://github.com/dhowe/ritajs.git
$ cd ritajs 
$ npm install
$ npm run build 
$ npm run test

If all goes well, you should see a list of successful tests and find the library built in 'dist'.


During development it is faster to run tests directly on the source, rather than the built library:

$ npm run test.src

You can also watch the source code and build automatically on any change:

$ npm run watch.src

Please make contributions via fork-and-pull - thanks!


 

Visual Studio Code

Once you have things running with npm/mocha, you might also try VSCode.

Some of the following extensions may also be useful:

  • hbenl.vscode-mocha-test-adapter
  • hbenl.vscode-test-explorer
  • ms-vscode.test-adapter-converter

Here you can see the tests in the VSCode Testing view

[Screenshot: vscode-tests]

 

About

 

Quick Start

A simple sketch

Create a new file on your desktop called 'test.html' with the following lines, save and drag it into a browser:

<html>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
  <script src="https://unpkg.com/rita"></script>
  <script>
    window.onload = function() {
      let words = RiTa.tokenize("The elephant took a bite!");
      $('#content').text(words);
    };
  </script>
  <div id="content" width=200 height=200></div>
</html>

With p5.js

Create a new file on your desktop called 'test.html' with the following lines, save and drag it into a browser:

<html>
  <script src="https://unpkg.com/p5"></script>
  <script src="https://unpkg.com/rita"></script>
  <script>
  function setup() {

    createCanvas(200,200);
    background(50);
    textSize(20);
    noStroke();

    let words = RiTa.tokenize("The elephant took a bite!")
    for (let i=0; i < words.length; i++) {
        text(words[i], 50, 50 + i*20);
    }
  }
  </script>
</html>

With node.js and npm

To install: $ npm install rita

let RiTa = require('rita');
let data = RiTa.analyze("The elephant took a bite!");
console.log(data);

 

Contributors

Code Contributors

This project exists only because of the people who contribute. Thank you!

Financial Contributors

ritajs-v2's People

Contributors

cqx931, dependabot[bot], dhowe, karliezhao, kennyviperhk, real-john-cheung, vansul

ritajs-v2's Issues

New examples for various platforms

Based on the old ritajs readme file
Shall we create a new wiki page for these?

A simple sketch


Create a new file on your desktop called 'test.html' and download the latest rita.js from here, add the following lines, save and drag it into a browser:

<html>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js"></script>
  <script src="./rita.js"></script>
  <script>
    window.onload = function() {
      let words = RiTa.tokenize("The elephant took a bite!");
      $('#content').text(words);
    };
  </script>
  <div id="content" width=200 height=200></div>
</html>

With node.js and npm


To install: $ npm install rita

let rita = require('rita');
let features = rita.analyze("The elephant took a bite!");
console.log(features);

With p5.js


Create a new file on your desktop called 'test.html' and download the latest rita.js from here, add the following lines, save and drag it into a browser:

<html>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.4.3/p5.min.js"></script>
  <script src="./rita.js"></script>
  <script>
  function setup() {

    createCanvas(200,200);
    background(50);
    textSize(20);
    noStroke();

    let words = RiTa.tokenize("The elephant took a bite!")
    for (let i=0; i < words.length; i++) {
        text(words[i], 50, 50 + i*20);
    }
  }
  </script>
</html>

With browserify and npm


Install browserify (if you haven't already)

$ sudo npm install -g browserify

Create a file called 'main.js' with the following code:

let RiTa = require('rita');

let features = RiTa.analyze("The elephant took a bite!");
console.log(features);

Now install RiTa

$ npm install rita

Now use browserify to pack all the required modules into bundle.js

$ browserify main.js -o bundle.js

Create a file called 'test.html' with a single script tag as below, then open it in a web browser and check the output in the 'Web Console'

<script src="bundle.js"></script>

With processing.js


Create a new file on your desktop called 'test.html' and download the latest rita.js from here, add the following lines, save and drag it into a browser:

<html>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/processing.js/1.4.8/processing.min.js"></script>
  <script src="./rita.js"></script>
  <script type="text/processing" data-processing-target="processing-canvas">
    void setup() {

      size(200,200);
      background(50);
      textSize(20);
      noStroke();

      String[] words = RiTa.tokenize("The elephant took a bite!");
      for (int i=0; i < words.length; i++) {
          text(words[i], 50, 50 + i*20);
      }
    }
  </script>
  <canvas id="processing-canvas"> </canvas>
</html>

Finish documentation

  • Finish docs for all functions, especially the 'options' argument
  • Function-specific constants
  • Generic constants (ready)

Function-specific constants

// Array of all phonemes used by RiTa
RiTa.PHONES; // document in phones() function

// Constant for use in Tokenizer
RiTa.SPLIT_CONTRACTIONS = false; // documented in tokenize() function

// Constants for the conjugator (document in conjugate() function only)
RiTa.FIRST;
RiTa.SECOND;
RiTa.THIRD;
RiTa.PAST_;
RiTa.PRESENT;
RiTa.FUTURE;
RiTa.SINGULAR;
RiTa.PLURAL;
RiTa.NORMAL;  // is used
RiTa.INFINITIVE; // is used
RiTa.GERUND;  // is used
RiTa.IMPERATIVE;  // not used
RiTa.BARE_INFINITIVE;  // not used
RiTa.SUBJUNCTIVE;  // not used

Generic Constants

// Current version of RiTa
RiTa.VERSION;

// Silence all info/warnings from RiTa
RiTa.SILENT = false;

// Silence warnings for words not in lexicon
RiTa.SILENCE_LTS = false;

// Set to false to reduce memory (likely slower)
RiTa.CACHING = true;

Remove all TODOs from code

Search for 'TODO' in codebase. For each TODO you find, create a ticket. If you can solve the ticket relatively quickly, then do so. Otherwise just note what needs to be done. Replace the TODO comment with the issue number, as follows:

// TODO: see #14
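The search-and-replace step could be scripted along the following lines. This is only a sketch: the helper names findTodos and tagTodo are hypothetical, and it assumes TODOs are written as single-line // comments.

```javascript
// Illustration only: locate TODO comments in a source string and tag them
// with an issue number. Helper names are hypothetical.
function findTodos(source) {
  return source
    .split('\n')
    .map((line, i) => ({ line: i + 1, text: line.trim() }))
    .filter((entry) => /\/\/\s*TODO\b/.test(entry.text));
}

function tagTodo(line, issueNumber) {
  // Rewrite "// TODO ..." as "// TODO: see #<issueNumber>"
  return line.replace(/\/\/\s*TODO\b.*$/, `// TODO: see #${issueNumber}`);
}

const sample = [
  'let x = 1; // TODO handle negative input',
  'let y = 2;',
  '// TODO: cache this result',
].join('\n');

console.log(findTodos(sample));
console.log(tagTodo('let x = 1; // TODO handle negative input', 14));
```

The original TODO text is dropped by tagTodo, so copy it into the ticket before rewriting the comment.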

Conjugate test failing

There was a stray return; statement on line 79, so the test below was never run.

With the return removed, the test now fails on this case:

args = {
  number: RiTa.PLURAL,
  person: RiTa.SECOND_PERSON,
  tense: RiTa.PAST_TENSE
};
equal(RiTa.conjugate("barter", args), "bartered");
equal(RiTa.conjugate("run", args), "ran");

s = ["compete", "complete", "eject"];
a = ["competed", "completed", "ejected"];
ok(a.length === s.length);
for (let i = 0; i < s.length; i++) {
  c = RiTa.conjugate(s[i], args);
  equal(c, a[i]);
}

Further minimize number of dictionary entries

If we have a 'jj' entry and an 'rb' entry that simply adds -ly, can we remove the 'rb' entry?

e.g. 'triumphant/triumphantly', 'autonomous/autonomously', etc.

what about 'sympathetic/sympathetically'?
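One way to decide which 'rb' entries are safe to drop is to check whether a simple rule reproduces them from the 'jj' form. The sketch below is an assumption-laden illustration, not RiTa's lexicon logic: deriveAdverb and isDerivable are hypothetical names, and only two rules are modeled (-ic becomes -ically, everything else adds -ly).

```javascript
// Illustration only: naive adjective-to-adverb rules, not RiTa's lexicon logic.
function deriveAdverb(adjective) {
  if (adjective.endsWith('ic')) return adjective + 'ally'; // sympathetic -> sympathetically
  return adjective + 'ly'; // triumphant -> triumphantly
}

// An 'rb' entry is a removal candidate only if the rule reproduces it exactly.
function isDerivable(adjective, adverb) {
  return deriveAdverb(adjective) === adverb;
}

console.log(isDerivable('triumphant', 'triumphantly')); // true
console.log(isDerivable('autonomous', 'autonomously')); // true
console.log(isDerivable('sympathetic', 'sympathetically')); // true
console.log(isDerivable('public', 'publicly')); // false
```

Pairs that the rule does not reproduce, such as 'public/publicly', would have to stay in the dictionary.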

RiTa.STOP_WORDS not found

RiTa.STOP_WORDS does not seem to be defined yet.

In concorder.js:

  _parseOptions(options) {
...
      this.wordsToIgnore = this.wordsToIgnore.concat(RiTa.STOP_WORDS);
...
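Until the constant exists, the option parsing could fall back to a default list. The sketch below shows that pattern in isolation; it is not the concorder.js code. DEFAULT_STOP_WORDS is a hypothetical stand-in for RiTa.STOP_WORDS, and the option names (ignoreStopWords, stopWords, wordsToIgnore) are assumptions.

```javascript
// Illustration only: build an ignore list that tolerates a missing stop-word list.
// DEFAULT_STOP_WORDS is a hypothetical stand-in for RiTa.STOP_WORDS.
const DEFAULT_STOP_WORDS = ['a', 'an', 'the', 'and', 'of'];

function countWords(text, options = {}) {
  let wordsToIgnore = options.wordsToIgnore || [];
  if (options.ignoreStopWords) {
    // Fall back to the default list when no stop words are supplied
    wordsToIgnore = wordsToIgnore.concat(options.stopWords || DEFAULT_STOP_WORDS);
  }
  const ignored = new Set(wordsToIgnore);
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[a-z']+/g) || []) {
    if (!ignored.has(word)) counts.set(word, (counts.get(word) || 0) + 1);
  }
  return Object.fromEntries(counts);
}

console.log(countWords('The elephant took a bite of the apple', { ignoreStopWords: true }));
```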

Port examples to new API

I've copied the old examples here but now they need to be made to work. You can find a simple index page at /examples/index.html

  • /examples/dom
  • /examples/p5

You will likely need to ask me questions, as the documentation is not quite finished. That is useful in itself, since we can update the documentation files as we go, so do ask if anything is not fully obvious.

Center buttons in KWIC example

  • rita2js/examples/dom/KWICmodel/ (ready)
  • rita2js/examples/p5/KWICmodel (need to do #25 first)

It would also be great to add some simple CSS styling to make the buttons and text look better.

Implement phoneme features ?

From https://github.com/aparrish/nonsense-verse-pycon-2020/blob/master/pincelate-tutorial-and-cookbook.ipynb

Characteristics of phonemes are called "features." We could provide a function to list the features of a specific phoneme. For example, to get the features for the vowel /UW/ (vowel sound in "toot"):

RiTa.phoneFeatures('UW');
OR
RiTa.phones.UW.features;

-> ('hgh', 'bck', 'rnd', 'vwl')

The features are referred to here with short three-letter abbreviations. Here's a full list:

alv: alveolar
apr: approximant
bck: back
blb: bilabial
cnt: central
dnt: dental
fnt: front
frc: fricative
glt: glottal
hgh: high
lat: lateral
lbd: labiodental
lbv: labiovelar
lmd: low-mid
low: low
mid: mid
nas: nasal
pal: palatal
pla: palato-alveolar
rnd: rounded
rzd: rhoticized
smh: semi-high
stp: stop
umd: upper-mid
unr: unrounded
vcd: voiced
vel: velar
vls: voiceless
vwl: vowel
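A table-driven lookup is one possible shape for this. The sketch below is standalone and only illustrative: the abbreviation table is copied from the list above, the /UW/ entry is the single phoneme given in the text, and the function name phoneFeatures is a placeholder rather than an existing RiTa call.

```javascript
// Illustration only: a lookup built from the abbreviations listed above.
const FEATURE_NAMES = {
  alv: 'alveolar', apr: 'approximant', bck: 'back', blb: 'bilabial',
  cnt: 'central', dnt: 'dental', fnt: 'front', frc: 'fricative',
  glt: 'glottal', hgh: 'high', lat: 'lateral', lbd: 'labiodental',
  lbv: 'labiovelar', lmd: 'low-mid', low: 'low', mid: 'mid',
  nas: 'nasal', pal: 'palatal', pla: 'palato-alveolar', rnd: 'rounded',
  rzd: 'rhoticized', smh: 'semi-high', stp: 'stop', umd: 'upper-mid',
  unr: 'unrounded', vcd: 'voiced', vel: 'velar', vls: 'voiceless',
  vwl: 'vowel',
};

// Only the /UW/ entry comes from the text; a real table would cover every phoneme.
const PHONE_FEATURES = {
  UW: ['hgh', 'bck', 'rnd', 'vwl'],
};

function phoneFeatures(phone) {
  const features = PHONE_FEATURES[phone];
  if (!features) throw new Error(`No features recorded for phone: ${phone}`);
  return features.map((abbr) => FEATURE_NAMES[abbr]);
}

console.log(phoneFeatures('UW')); // [ 'high', 'back', 'rounded', 'vowel' ]
```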

Fix revSort function (exercise)

The test at /test/util-tests.js::Line39 is failing. Fix the revSort() function (at /test/util-tests.js::Line46) so it correctly sorts in reverse order.

Make the changes in your own fork. Then make sure all tests are passing (should be ~180 tests), then submit a PR.
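For reference, a generic reverse sort can be written as below. This is a sketch only: the real revSort() in /test/util-tests.js may take different arguments or be expected to sort in place, so treat the signature here as an assumption.

```javascript
// Illustration only: one way to sort in reverse (descending) order.
function revSort(items) {
  // Copy first so the caller's array is not mutated, then sort descending.
  return [...items].sort((a, b) => (a < b ? 1 : a > b ? -1 : 0));
}

console.log(revSort(['b', 'c', 'a'])); // [ 'c', 'b', 'a' ]
console.log(revSort([3, 10, 2])); // [ 10, 3, 2 ]
```

An explicit comparator matters for numbers, because the default Array.prototype.sort() compares elements as strings.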

Implement phoneme regex search in lexicon ?

phones = RiTa.phonemes("sighs");
RiTa.matchPhones(phones); -> ['incise', 'incised', 'incisor', 'incisors', 'malloseismic']

stresses = RiTa.stresses("favorite");
RiTa.matchStresses(stresses);

Or possibly add this as an option to similarBy.

Either way, it should support regex search.
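A rough sketch of what regex search over phonemes might look like is given below. It runs against a tiny hand-written lexicon rather than RiTa's: the hyphen-separated phoneme strings, the LEXICON object, and this standalone matchPhones signature are all assumptions for illustration.

```javascript
// Illustration only: regex search over a tiny hand-written lexicon.
// The phoneme strings below are illustrative assumptions, not RiTa's data.
const LEXICON = {
  sighs: 's-ay-z',
  size: 's-ay-z',
  incise: 'ih-n-s-ay-z',
  seismic: 's-ay-z-m-ih-k',
  cat: 'k-ae-t',
};

function matchPhones(pattern, lexicon = LEXICON) {
  // Accept either a RegExp or a string to compile into one
  const regex = pattern instanceof RegExp ? pattern : new RegExp(pattern);
  return Object.keys(lexicon)
    .filter((word) => regex.test(lexicon[word]))
    .sort();
}

console.log(matchPhones('s-ay-z')); // [ 'incise', 'seismic', 'sighs', 'size' ]
console.log(matchPhones(/^s-ay-z$/)); // [ 'sighs', 'size' ]
```

A plain substring pattern can match across phoneme boundaries (for example 's-ay-z' would also match a word containing 's-ay-zh'), so a real implementation would probably anchor patterns on the separators.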

Controls alignment in workbench

path: /web/workbench/index.html

Center-align div in red below with text area above (note that new buttons will be added, and it still should be centered):

[Screenshot: workbench controls, with the div to be centered shown in red]

Finish examples updates

General

Use setTimeout() rather than setInterval() (see code)
Use === or !== rather than == or !=
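The first guideline can be followed with a self-rescheduling loop like the sketch below. The helper name repeat and its injectable schedule parameter are assumptions made for this illustration; they are not taken from the examples' code.

```javascript
// Illustration only: a self-rescheduling loop built on setTimeout.
// The next run is scheduled only after the current one finishes, so slow
// iterations never pile up the way they can with setInterval.
function repeat(task, delayMs, schedule = setTimeout) {
  let stopped = false;
  function tick() {
    if (stopped) return;
    task();
    if (!stopped) schedule(tick, delayMs);
  }
  schedule(tick, delayMs);
  return () => {
    stopped = true; // call the returned function to stop the loop
  };
}

let count = 0;
const stop = repeat(() => {
  count += 1;
  console.log('tick', count);
  if (count === 3) stop();
}, 100);
```

The strict comparison (count === 3) also follows the second guideline.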

Rhymes / Markov / Grammar all good

KWICmodel

  • Redo using p5's createButton() function [p5]
  • Match layout in dom - button row is center-aligned on red word [p5]

Analysis

  • Redo with new class style: 'class Bubble { ... ' [p5]
  • Fix vertical alignment for large/small bubbles [dom]


Large slowdown on similarBy with webpack (dev build)

webpacked rita

✓ Should correctly call similarBy.letter (3435ms)
✓ Should correctly call similarBy.sound (2597ms)
✓ Should correctly call similarBy.soundAndLetter (3220ms)

regular rita

✓ Should correctly call similarBy.letter (453ms)
✓ Should correctly call similarBy.sound (560ms)
✓ Should correctly call similarBy.soundAndLetter (529ms)

Not a huge deal as it only happens in dev builds, but I can't IMAGINE why webpack bundling would slow function execution by nearly an order of magnitude!
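To compare the two builds outside the test runner, a small timing helper along these lines may help. It is only a sketch: timeIt is a hypothetical helper, and the workload shown is a placeholder to be swapped for a similarBy call against each build.

```javascript
const { performance } = require('perf_hooks');

// Illustration only: time a function over several runs and report the mean.
function timeIt(fn, runs = 5) {
  let total = 0;
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    total += performance.now() - start;
  }
  return total / runs; // mean milliseconds per run
}

// Placeholder workload; replace with the call being measured.
const meanMs = timeIt(() => {
  let sum = 0;
  for (let i = 0; i < 1e5; i++) sum += i;
  return sum;
});
console.log(`mean: ${meanMs.toFixed(2)}ms`);
```

Averaging over several runs smooths out one-off costs such as JIT warm-up, which can otherwise dominate a single measurement.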

RiTa v2 API

RiTa
RiTa.VERSION

RiTa.alliterations()
RiTa.concordance()
RiTa.conjugate()
RiTa.hasWord()
RiTa.env()
RiTa.pastParticiple()
RiTa.phonemes()
RiTa.posTags()
RiTa.posTagsInline()
RiTa.presentParticiple()
RiTa.stresses()
RiTa.syllables()
RiTa.isAbbreviation()
RiTa.isAdjective()
RiTa.isAdverb()
RiTa.isAlliteration()
RiTa.isNoun()
RiTa.isPunctuation()
RiTa.isQuestion()
RiTa.isRhyme()
RiTa.isVerb()
RiTa.kwic()
RiTa.pluralize()
RiTa.random()
RiTa.randomOrdering()
RiTa.randomSeed()
RiTa.randomWord()
RiTa.rhymes()
RiTa.evaluate()
RiTa.similarBy()
RiTa.singularize()
RiTa.sentences()
RiTa.stem()
RiTa.tokenize()
RiTa.untokenize()
RiTa.words()

RiGrammar
load()
addRule()
expand()
expandFrom()
removeRule()
toString()

RiMarkov
RiMarkov.fromJSON()
addText()
addSentences()
generate()
generateSentences()
completions()
probability()
probabilities()
toJSON()
size()
toString()
