unfoldingword / uw-content-validation

Functions for checking user&format errors in content fields/files

Home Page: https://unfoldingword.github.io/uw-content-validation

License: Other

JavaScript 99.70% Shell 0.02% Python 0.29%
tsv usfm scripture-open-components

uw-content-validation's Introduction

unfoldingWord

uw-content-validation's People

Contributors

ancienttexts-net, dependabot[bot], jag3773, mandolyte, photonomad0, richmahn, robh123

uw-content-validation's Issues

Add React button to demo

For clearCache in the demos, you have to change 'N' to 'Y'. That's only because my first attempt at putting a clickable button there failed.

If you'd like to do this for a bit of relaxed pleasure sometime @PhotoNomad0, feel free. :-)

Seeing messages with odd information

Two things in particular:

  1. This message appears to have JavaScript leftovers in it: (not '57-TIT.usfm') [object Object]
  2. This message claims that an invalid book abbreviation was supplied: Bad function call: should be given a valid book abbreviation

Here is a screenshot:

image

checkTN_TSVDataRow() should warn if link has language code rather than *

This should give warning:

GEN\t1\t9\tzu6f\tfigs-activepassive\t\t0\tLet the waters…be gathered\tThis can be translated with an active verb. This is a command. By commanding that the waters gather together, God made them gather together. Alternate translation: “Let the waters…gather” or “Let the waters…come together” (See: [[rc://en/ta/man/translate/figs-activepassive]]
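
One way such a warning could be detected is sketched below. It assumes the rule is simply "the language field of an rc:// link should be `*`"; the function name `findLanguageSpecificRcLinks` is hypothetical and is not part of uw-content-validation.

```javascript
// Hypothetical helper, not part of the package's API.
// Returns every [[rc://...]] link whose language field is a concrete
// code (such as 'en') rather than the wildcard '*'.
function findLanguageSpecificRcLinks(noteText) {
  // Group 1 captures the language field, i.e. the text between
  // 'rc://' and the next '/'.
  const rcLinkRegex = /\[\[rc:\/\/([^\/\]]+)\/([^\]]*)\]\]/g;
  const offenders = [];
  for (const match of noteText.matchAll(rcLinkRegex)) {
    const [wholeLink, languageField] = match;
    if (languageField !== '*') {
      offenders.push({ link: wholeLink, languageField });
    }
  }
  return offenders;
}
```

A caller could turn each returned entry into a warning notice; the priority and wording of that notice are left to the package.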

Code unconditionally fetches original language files directly

The snippet below is from quote-check.js. It unconditionally fetches a new copy of the original language text.

If this were in tC Create, it would be fetching a new fresh copy for each row in a tN TSV (if I understand what it is doing correctly).

Propose that this code be refactored to take originalUSFM as an input parameter. Additionally, we should review any other cases where this module is fetching data. All data should be provided to it -- it should never fetch any on its own. Fetching must be left to the app using this code.

Is this agreeable?

        let originalUSFM;
        // console.log(`Need to check against ${originalLanguageRepoCode}`);
        const getFile_ = (optionalCheckingOptions && optionalCheckingOptions.getFile) ? optionalCheckingOptions.getFile : getFileCached;
        if (originalLanguageRepoCode === 'UHB') {
            try {
                originalUSFM = await getFile_({ username, repository: originalLanguageRepoName, path: filename, branch });
                // console.log("Fetched file_content for", repoName, filename, typeof originalUSFM, originalUSFM.length);
            } catch (gcUHBerror) {
                console.log("ERROR: Failed to load", username, originalLanguageRepoCode, filename, branch, gcUHBerror.message);
                addNotice6({ priority: 601, message: "Failed to load", filename, location: `${ourLocation}: ${gcUHBerror}`, extra: originalLanguageRepoName });
            }
        } else if (originalLanguageRepoCode === 'UGNT') {
            try {
                originalUSFM = await getFile_({ username, repository: originalLanguageRepoName, path: filename, branch });
                // console.log("Fetched file_content for", repoName, filename, typeof originalUSFM, originalUSFM.length);
            } catch (gcUGNTerror) {
                console.log("ERROR: Failed to load", username, originalLanguageRepoCode, filename, branch, gcUGNTerror.message);
                addNotice6({ priority: 601, message: "Failed to load", filename, location: `${ourLocation}: ${gcUGNTerror}`, extra: originalLanguageRepoName });
            }
        }
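
A minimal sketch of the proposed shape follows, assuming the caller has already loaded the original-language text. The function name, the notice wording, and the priority values are hypothetical placeholders, and the substring test stands in for whatever matching quote-check.js really performs.

```javascript
// Hypothetical sketch: originalUSFM is supplied by the calling app,
// so this function performs no fetching of its own.
function checkQuoteAgainstOriginal({ originalUSFM, quote, filename }) {
  const notices = [];
  if (typeof originalUSFM !== 'string' || originalUSFM.length === 0) {
    // Placeholder priority and message, chosen only for illustration.
    notices.push({ priority: 1, message: 'Original language text was not supplied', filename });
    return notices;
  }
  // Simplified stand-in for the real quote matching.
  if (!originalUSFM.includes(quote)) {
    notices.push({ priority: 2, message: 'Quote not found in original language text', filename });
  }
  return notices;
}
```

With this shape, an app such as tC Create could load the original text once and pass the same string in for every row of a tN TSV.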

Export problem from NPM package

Version 0.8.10 when used in the content validation app will not compile. The error message is:

Module not found: Can't resolve './checkRepo' in 'C:\Users\mando\Projects\feature-cn-28-add-cv-to-bpa\content-validation-app\node_modules\uw-content-validation\dist\demos\repo-check'

I have seen this or similar several times and keep fixing it locally. Now it has happened again, so I'm making an issue of it (LOL).

The above case implies that the demos folder is being published to NPM. Is this desired? @RobH123 @PhotoNomad0

Use of IndexedDB

I deleted all my indexed DB databases and ran the app. It created 3 different stores.

  • the zip store only had one entry (for tq)
  • the web cache had lots of individual files: Greek, ult, and ust usfm files; ta articles; and tn TSV files

Two things:

image

checkTN_TSVDataRow() does not validate tA links

checkTN_TSVDataRow() does not give an error on invalid tA links. Example that should return error:

GEN\t1\t6\turb3\tfigs-imperative\t\t0\tLet there be an expanse…let it divide\tThese are commands. By commanding that the expanse should exist and that it divide the waters, God made it exist and divide the waters. (See: [[rc://en/ta/man/figs-imperative]])

Another example that should return an error for doublet link:

RUT\t2\t12\tgnn5\tfigs-parallelism\tוּ⁠תְהִ֨י מַשְׂכֻּרְתֵּ֜⁠ךְ שְׁלֵמָ֗ה מֵ⁠עִ֤ם יְהוָה֙֙\t1\tmay your full wages come from Yahweh This is a poetic expression that is very similar to the previous sentence. Alternate translation: “May Yahweh fully give to you everything that you deserve” (See: [[rc://en/ta/man/translate/figs-parallelism]], [Doublet](../figs-doublet/01.md))
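
The two cases above could be caught with a purely textual check before any file lookup, as in the sketch below. It assumes a well-formed tA link has exactly two path segments after `ta/man/` (a manual folder such as `translate`, then the article name), which is inferred only from the examples in this issue. The function name is hypothetical.

```javascript
// Hypothetical helper, not part of the package's API.
function findSuspectTALinks(noteText) {
  const problems = [];
  // Case 1: [[rc://<lang>/ta/man/<path>]] where <path> is not exactly
  // '<manual>/<article>' (assumed shape, see lead-in).
  const rcRegex = /\[\[rc:\/\/[^\/\]]+\/ta\/man\/([^\]]*)\]\]/g;
  for (const [link, path] of noteText.matchAll(rcRegex)) {
    const parts = path.split('/');
    if (parts.length !== 2 || parts.some((part) => part.length === 0)) {
      problems.push({ link, reason: 'unexpected tA link path' });
    }
  }
  // Case 2: a relative Markdown link such as [Doublet](../figs-doublet/01.md).
  const relativeRegex = /\[[^\[\]]+\]\(\.\.?\/[^)]*\)/g;
  for (const [link] of noteText.matchAll(relativeRegex)) {
    problems.push({ link, reason: 'relative Markdown link' });
  }
  return problems;
}
```

This only inspects the link text; confirming that the referenced article exists would still need the tA content to be available to the checker.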

Untangle Yarn and Yalc

While I like the idea of keeping pure code separate in the same repo and only pushing the pure (core) code to NPM, we do need to make it easier to test the package locally without tripping on styleguidist. Here is what happens...

When I publish locally using yalc publish, all sorts of stuff happens, when the only thing needed is to build the dist folder!

$ yalc publish
Running prepublishOnly script: rm -fr ./dist & babel ./src --out-dir ./dist -s inline
yarn run v1.22.4
$ rm -fr ./dist & babel ./src --out-dir ./dist -s inline
Successfully compiled 41 files with Babel (7769ms).
Done in 8.45s.
Running postpublish script: yarn deploy && git tag $npm_package_version && git push origin $npm_package_version
yarn run v1.22.4
$ yarn deploy && git tag $npm_package_version && git push origin $npm_package_version
$ yarn build
$ styleguidist build
Building style guide...
 WARN  Compiled with warnings
... warnings elided ...

Style guide published to:
C:\Users\mando\Projects\feature-cn-28-add-cv-to-bpa\uw-content-validation\styleguide
$ gh-pages -d styleguide
Published
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:unfoldingWord/uw-content-validation.git
 * [new tag]         $npm_package_version -> $npm_package_version    
Done in 91.71s.
[email protected]+7a9a5c6a published in store.
$

Original Languages should not be validated

In the general case, original languages (Greek and Hebrew) should not be validated, since a GL organization does not own them and is not responsible for maintaining them.

The only exception might be when the organization that owns the repos being validated is unfoldingWord, which actually does own them and is responsible.

Version 0.8.16: Ambiguous error message

For BCS Hindi, the tQ repo exists (although it uses an uppercase "Q"), but it has no manifest. The "content" folder also uses an uppercase "C", so things aren't in a standard location either. So at least two things are wrong.

Here is the message as shown:
image

Remove programmer's asserts for parameter checks

Many asserts were added to check programmer errors, esp. as the function parameters were changing. Once it's more stable, these can be removed.

Presumably should give a speed up.

Maybe should leave some on functions called from the outside???

checkTN_TSVDataRow() gives Duplicate errors on invalid links

Here is the test line:

"GEN\t1\t9\tzu6f\tfigs-activepassive\t\t0\tLet the waters…be gathered\tThis can be translated with an active verb. This is a command. By commanding that the waters gather together, God made them gather together. Alternate translation: “Let the waters…gather” or “Let the waters…come together” (See: [[rc://en/ta/man/translate/figs-activepassive]] and [[rc://en/ta/man/translate/figs-imperativez]])"

here are the duplicated entries returned by checkTN_TSVDataRow():

Object {
  "noticeList": Array [
    Object {
      "C": "1",
      "V": "2",
      "bookID": "GEN",
      "extract": "[[rc://en/ta/man/translate/figs-imperativez]]",
      "fieldName": "OccurrenceNote",
      "location": " that was supplied translate/figs-imperativez/01.md: Could not find __tests__/fixtures/unfoldingWord/en_ta/translate/figs-imperativez/01.md",
      "message": "Error loading OccurrenceNote TA link",
      "priority": 885,
      "rowID": "zu6f",
    },
    Object {
      "C": "1",
      "V": "2",
      "bookID": "GEN",
      "extract": "[[rc://en/ta/man/translate/figs-imperativez]]",
      "fieldName": "OccurrenceNote",
      "location": " that was supplied translate/figs-imperativez/01.md",
      "message": "Unable to find OccurrenceNote TA link",
      "priority": 886,
      "rowID": "zu6f",
    },
  ],
}
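
If both notices really describe the same failure, one possible stopgap is to collapse notices that point at the same extract in the same field of the same row. The sketch below assumes that rule; the function name is hypothetical, and keeping the first notice rather than the second is an arbitrary choice.

```javascript
// Hypothetical post-processing step, not part of the package's API.
// Keeps only the first notice for each (bookID, C, V, rowID, fieldName, extract).
function dropRepeatedLinkNotices(noticeList) {
  const seen = new Set();
  return noticeList.filter((notice) => {
    const key = JSON.stringify([
      notice.bookID, notice.C, notice.V,
      notice.rowID, notice.fieldName, notice.extract,
    ]);
    if (seen.has(key)) return false; // a notice for this link was already kept
    seen.add(key);
    return true;
  });
}
```

Fixing the checker so that it emits only one notice per bad link would be the cleaner solution; filtering afterwards only hides the duplication.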

Add documentation of API parameters

  • Need to document API parameters and return values in styleguidist. For example, https://unfoldingword.github.io/content-validation/#/Core%20Checking%20Functions?id=section-tsv-table-line-check does not document the parameters to be passed:
    Screen Shot 2020-08-17 at 10 36 16 AM

  • Also need to add JSDoc comments for API methods in js files (such as checkTN_TSVDataRow). An example would be:
    Screen Shot 2020-08-17 at 10 40 57 AM

For reference, https://datatable-translatable.netlify.app/#/DataTable has an example for React components that could probably be adapted for API methods:
Screen Shot 2020-08-18 at 8 25 46 AM

Update USFM Grammar package

Discover why we get package errors

Check that USFM line numbers now account for blank lines. Could be that they're currently wrong???

Miscellaneous

Just some notes and things that aren't that important...

  • The GlBookPackageCheck demo: it would be nice to include the documented "wait" parameter in the code, but commented out (or set to 'N'). It wasn't immediately obvious what was being asked.
  • The AllBookPackagesCheck demo shows this react error:
Error: Minified React error #301; visit https://reactjs.org/docs/error-decoder.html?invariant=301 for the full message or use the non-minified dev environment for full errors and additional helpful warnings.

At another point, I thought I saw a different message about recursion being too deep (or words to that effect).

Problems with using NPM component

Here is a list of things I have changed or will change in my branch.

  • As far as I know, there is no way to "export" a JSON file. Thus, I added the books.json into the books.js file itself.
  • Exported some of the demo components... maybe this shouldn't be done -- please advise. If I don't use them, then I will need to study the core API to acquire and populate the data.
  • Changed bookId to TIT so it doesn't take so long to run
  • Removed capital letter from package name (not allowed and doesn't work anyway)
  • Updated use of Material Table in RenderProcessedResults... but it still doesn't take care of the warning messages from a dependency
  • Uncommented "useBuiltIns" in the babel config file so the corejs warnings are suppressed

Convert console assertions into validation messages

At present, at least for me, failed assertions cause the app to stop and go into the debugger. If the failures are already validation messages, then maybe convert them to ordinary log messages? Here are the two examples I ran into today:

Assertion failed: lineNumber is repeated in location in {"priority":276,"message":"Missing OrigQuote field","bookID":"JUD","C":"1","V":"1","lineNumber":3,"location":" with ID 'ek3q' en JUD book package from unfoldingWord","extra":"TN"}
processNoticesCommon @ notice-processing-functions.js:131
processNoticesToSevereMediumLow @ notice-processing-functions.js:396

Assertion failed: lineNumber is repeated in location in {"priority":276,"message":"Missing OrigQuote field","bookID":"JUD","C":"1","V":"3","lineNumber":8,"location":" with ID 'yfa8' en JUD book package from unfoldingWord","extra":"TN"}
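
One way to do the conversion is a small wrapper that logs instead of asserting, sketched below. The name `softAssert` and its signature are hypothetical; the default logger is `console.warn`, and a different sink can be passed in.

```javascript
// Hypothetical replacement for console.assert(): it reports a failed
// self-check through an ordinary log call and lets execution continue.
function softAssert(condition, message, details, log = console.warn) {
  if (condition) return true;
  log(`Validation self-check failed: ${message}`, details);
  return false;
}
```

A call site would pass the same condition and message it gives to the assertion today, for example `softAssert(ok, 'lineNumber is repeated in location', notice)`. The boolean result lets the caller decide whether to skip the offending notice.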

Data orientation of validation results

Basic idea: return raw atomic data in an object and let the app rendering make it into something fit for human consumption. For example consider the location field:

location: " with ID 'r3jx' in line 67 en jud book package from unfoldingword"
  • Put the line number in the dedicated field for that
  • Don't need "jud" since we already have bookID field
  • You can add a langId and ownerId and remove both "en" and "unfoldingword"

After this, then "location" is left with only "r3jx". But for human consumption, it would be trivial to reconstruct the longer messages based on the data fields.

Here is a picture from the console log:
image
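
To illustrate how little the rendering side would need, here is a sketch that rebuilds the longer text from atomic fields. The field names `langId` and `ownerId` are the ones suggested above and `rowID` and `lineNumber` follow the notice objects quoted elsewhere on this page; the function name is hypothetical.

```javascript
// Hypothetical renderer: rebuilds the human-readable location string
// from the individual data fields of a notice.
function describeLocation({ rowID, lineNumber, langId, bookID, ownerId }) {
  return ` with ID '${rowID}' in line ${lineNumber} ${langId} ${bookID} book package from ${ownerId}`;
}
```

Because the wording lives in one place, an app could localize or shorten it without touching the checking code.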

Fix non-English bible names?

CV is currently using GLT and GST for languages other than English. We need to make sure that will always work going forward.

    const ULT = languageCode === 'en' ? 'ULT' : 'GLT';
    const UST = languageCode === 'en' ? 'UST' : 'GST';

checkTN_TSVDataRow() does not validate verse links:

This example should generate an error for the verse link:

"GEN\t1\t9\tha33\t\t\t0\tIt was so\t“It happened like that” or “That is what happened.” What God commanded happened just as he said it should. This phrase appears throughout the chapter and has the same meaning wherever it appears. See how you translated it in Genesis 1:7.;

Make various columns narrower or wider

Many columns in the results table are too wide. Reduce the size of the following:
Resource
Priority
Chapter
Verse
Line
RowID
Character POS

Increase the width of:
Message
