translationdatabaseweb's Introduction

translationDatabase

Goals

The goals for translationDatabase are to manage and track data for languages and the progress of getting unrestricted biblical content into every language.

For more information on the unfoldingWord project, see the About page.

Data Sources

Many of the data sources are pulled into and managed in a repository that is part of the Debian project, called simply ISO Codes.

In Use

Other Potential Sources

Getting Started

To set up a new working environment for this project, several items are needed:

  • Python (consult the requirements.txt for specific libraries/packages)
  • Redis
  • Postgres
  • Node

Building Static Media

npm install
npm run watch     # run a watcher on the static folder
npm run build     # builds static and exits
npm run buildprod # builds for production (uglify/minification)

Initialize the Database

After installing requirements (via pip) within your environment or virtualenv:

  • python manage.py migrate
  • python manage.py loaddata sites
  • python manage.py loaddata uw_network_seed
  • python manage.py loaddata uw_region_seed
  • python manage.py loaddata uw_title_seed
  • python manage.py loaddata uw_media_seed
  • python manage.py loaddata additional-languages
  • python manage.py reload_imports

At this point, the basic country and language datasets will be populated but without many optional fields or extra data.
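
The initialization steps above could be scripted so they always run in the same order. The following is a minimal sketch, assuming `manage.py` is in the current directory and the project requirements are already installed; the helper names `init_commands` and `run_all` are hypothetical and not part of the project.

```python
import subprocess
import sys

# Fixture names taken from the list above, in the same order.
FIXTURES = [
    "sites",
    "uw_network_seed",
    "uw_region_seed",
    "uw_title_seed",
    "uw_media_seed",
    "additional-languages",
]


def init_commands(python=sys.executable):
    """Return the manage.py invocations in the order they should run."""
    commands = [[python, "manage.py", "migrate"]]
    commands += [[python, "manage.py", "loaddata", name] for name in FIXTURES]
    commands.append([python, "manage.py", "reload_imports"])
    return commands


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in init_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Running with `dry_run=True` only prints the commands, which makes it easy to review the sequence before touching a database.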

Updating the /exports/langnames.json and /exports/langnames_short.json endpoints

When languages are added or updated, run this command to update the data locally:

python manage.py rebuild_langnames

Switch to the master branch and run this command to update the data on the server:

ec run web python manage.py rebuild_langnames

Docker Deployments

translationDatabase was previously built using the Heroku-18 stack and deployed on Heroku dynos.

It is now being deployed using Heroku's Docker container support.

This was configured via:

heroku stack:set container -a ${HEROKU_APP_NAME:-translation-database-demo}

(and repeated using HEROKU_APP_NAME=translation-database).

Deploying via heroku.yml

This application can be deployed to Heroku via Git.

Heroku's documentation on Git / GitHub deployments can be found here:

To deploy the master branch to the translation-database-demo site:

git checkout master
heroku git:remote -a translation-database-demo
git push heroku master:main

For additional documentation, see Building Docker Images with heroku.yml

Building the Docker image manually

These instructions are provided as a convenience; the application should be deployable following Deploying via heroku.yml above.

NOTE: This assumes that you have a version of Docker installed.

  1. Build the production image
rm -Rf archive archive.tgz
git archive HEAD > archive.tgz
mkdir -p archive
tar -xvf archive.tgz -C archive
cd archive

docker build --platform=linux/amd64 -f Dockerfile -t td .
cd ..
rm -Rf archive
  2. Run via
# assumes environment variables populated in
# .dev-env file
docker run --name=td --rm -d --env-file ./.dev-env -p 8000:8000 td

Push the Docker image to Heroku manually

If you wanted to deploy the pre-built image to Heroku, you would need to:

  1. Tag and push for $HEROKU_APP_NAME:
docker tag td registry.heroku.com/${HEROKU_APP_NAME:-translation-database-demo}/web
docker tag td registry.heroku.com/${HEROKU_APP_NAME:-translation-database-demo}/worker
docker push registry.heroku.com/${HEROKU_APP_NAME:-translation-database-demo}/web
docker push registry.heroku.com/${HEROKU_APP_NAME:-translation-database-demo}/worker
  2. Release to Heroku
heroku container:release web worker -a ${HEROKU_APP_NAME:-translation-database-demo}

Repeat the steps above with the HEROKU_APP_NAME variable set for the production environment:

export HEROKU_APP_NAME=translation-database

translationdatabaseweb's People

Contributors

cdaield, jacobwegner, jag3773, paltman, phillip-hopper, rbnswartz, robh123, smcoll, swilcox, vleong2332, yakob-aleksandrovich

translationdatabaseweb's Issues

No Scripture Report

There is some interest in having a "no scripture" flag for the languages that have no scripture. @jag3773, could this be generated from the "resources" block? Those with empty resources would be flagged as "no scripture." Would that be close to accurate, or do we have massive holes in our data?
--Perry
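
The idea in this issue can be sketched as a simple filter. This is only an illustration under assumptions: the record shape (an `lc` language code plus a `resources` list) and the function name `flag_no_scripture` are hypothetical, and whether an empty list really means "no scripture" depends on how complete the data is.

```python
def flag_no_scripture(languages):
    """Return the language codes whose "resources" list is missing or empty."""
    return [lang["lc"] for lang in languages if not lang.get("resources")]


# Hypothetical sample records; real entries may carry more fields.
sample = [
    {"lc": "aa", "resources": []},
    {"lc": "en", "resources": [{"slug": "obs"}]},
    {"lc": "xx"},
]
print(flag_no_scripture(sample))
```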

Offline Progress Form

We need a form for GL teams to record offline progress. Such a form should include:

  • language name
  • resource name (drop down)
  • Status (drop down)
  • Comment (plain text)

The list of status options needs to be defined.

Add Missing 'Translation Suggestions' Section

From @jag3773 on July 29, 2015 20:16

The translationNotes are now utilizing a new section, 'Translation Suggestions'. This needs to be added to the publish utility https://github.com/unfoldingWord-dev/tools/blob/master/obs/json/json_tn_export.py and probably https://github.com/unfoldingWord-dev/tools/blob/master/uwb/tn-kt_export.py. Note that @swilcox may be in the process of importing these to the uW Admin site.

Copied from original issue: unfoldingWord-dev/uwadmin#50

Bible Publishing

From @jag3773 on February 9, 2015 18:1

We need to add an interface and guts for publishing Scripture.

This should be based on USFM input. Ideally, the administrator could upload a zip file of a USFM formatted text and then add publishing information about it.

Some cobbled together scripts that are currently publishing from our Etherpad texts are:
https://github.com/Door43/tools/blob/master/uwb/ep_export.py
https://github.com/Door43/tools/blob/master/uwb/api_publish.py
https://github.com/Door43/tools/blob/master/uw/update_catalog.py

Note that ep_export should stay as it is; it just takes the text out of Etherpad and puts it into a GitHub repo as USFM files.

Copied from original issue: unfoldingWord-dev/uwadmin#5

Allow ordering of networks by the user

The order of the networks seems to be the order in which they are selected. Allow the user to drag network names left or right without having to delete and reselect them in order to change the order.

The user might want to list the networks in order of importance or use.

Create Visualizations for Gateway Languages

@jag3773 mentioned in previous conversations that he would like to have something like the dendrogram or Node-Link Tree.

We should have visualizations:

  • By Country
  • By Gateway Language

Considerations:

  • We should also consider the Tri-Fold Tree where the part you click on rotates to the 3 O'Clock position.
  • English is in the center as the source text
    • The next layer of nodes should be the 47 gateway languages.
    • See the attached Gateway Languages PDF. Note that the DB needs to be seeded with this information based on region. In other words, every language in the DB needs to have its gateway set based on its region.
    • Note that we'll need to have some meta gateway languages. For instance, for now Papua New Guinea languages should show up with a gateway of "English/Tok Pisin". We know that the languages in PNG should be able to read one or the other of those, but we don't necessarily know the exact mapping at this point.
  • The outer layer is the rest of the languages.
  • Every language should be color-coded based on the region it's in (see page 2 of the attached PDF).
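
The seeding and layering described in the considerations above could be sketched as follows. This is a rough illustration, not the project's implementation: the record keys (`lc`, `region`, `gateway`), the lookup table, and both function names are hypothetical, and only the "English/Tok Pisin" value for Papua New Guinea comes from the text.

```python
# Hypothetical lookup table; only the Papua New Guinea value comes from the text above.
REGION_GATEWAY = {
    "Papua New Guinea": "English/Tok Pisin",
}


def assign_gateways(languages, region_gateway, default=None):
    """Return copies of the records with "gateway" set from each record's region."""
    return [
        dict(lang, gateway=region_gateway.get(lang.get("region"), default))
        for lang in languages
    ]


def group_by_gateway(languages):
    """Group language codes under their gateway (middle layer -> outer layer)."""
    tree = {}
    for lang in languages:
        tree.setdefault(lang["gateway"], []).append(lang["lc"])
    return tree


# Placeholder language codes, for illustration only.
records = [
    {"lc": "aaa", "region": "Papua New Guinea"},
    {"lc": "bbb", "region": "Papua New Guinea"},
]
print(group_by_gateway(assign_gateways(records, REGION_GATEWAY)))
```

Languages whose region has no entry in the table end up under the `default` key, which makes gaps in the seed data easy to spot.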

The visualization URL should be http://td.unfoldingword.org/gl/

The ring style visualization doesn't seem to scale with the sheer number of languages (over 7500) split over only 47 gateway languages:

screenshot_2015-02-07_21 56 32

Fix Anticipated Completion Date

  • The calendar beginning with year 0001 isn't very helpful. Maybe start the calendar at today().
  • Clicking on the right arrow advances the year by one but does not rapidly advance the years.
  • Show the allowed format after the field name.

Harvest Ethnologue Maps

The Ethnologue makes low-res, watermarked language maps available for free download on their website, located at consistent URLs based on the ISO 3166 country code (VN = Vietnam, etc.), e.g. http://www.ethnologue.com/country/VN/maps

Given that we know the full list of ISO 3166 country codes, and if the URL structure is consistent, we need a script to download all the maps available for every country and show them in tD.

Then, generate a PDF with a TOC for each country code (and name) and map thereof.
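
As a starting point for such a script, the per-country page URLs can be built from the country codes. This is a minimal sketch that assumes the URL pattern shown in the example above holds for every country; the function name `map_page_url` is hypothetical, and the sketch does not download anything.

```python
def map_page_url(country_code):
    """Build the maps page URL for a two-letter ISO 3166 country code."""
    code = country_code.strip().upper()
    if len(code) != 2 or not code.isalpha():
        raise ValueError(f"not a two-letter country code: {country_code!r}")
    return f"http://www.ethnologue.com/country/{code}/maps"


for cc in ["VN"]:
    print(cc, map_page_url(cc))
```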

Allow multiple books to be selected

Allow for multiple books to be selected.

Might even have selections such as Pentateuch (G, E, L, N, D), History, Poetic, Major Prophets, Minor Prophets, Gospels, Synoptic Gospels, Gospels and Acts, Pauline, yada, yada. Not sure of the best way, but Dave especially saw a need for selection of multiple books.
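
One possible way to support named groups is to expand each selection into individual books. The sketch below is an assumption about how that might look, not an agreed design: the group table, the use of USFM-style three-letter book codes, and the function name `expand_selection` are all illustrative, and only a few of the groups named above are filled in.

```python
# Illustrative groups only; the remaining groups would be added the same way.
BOOK_GROUPS = {
    "Pentateuch": ["GEN", "EXO", "LEV", "NUM", "DEU"],
    "Gospels": ["MAT", "MRK", "LUK", "JHN"],
    "Synoptic Gospels": ["MAT", "MRK", "LUK"],
    "Gospels and Acts": ["MAT", "MRK", "LUK", "JHN", "ACT"],
}


def expand_selection(selected):
    """Expand group names into book codes, keeping order and dropping duplicates."""
    books = []
    for item in selected:
        # Anything that is not a known group is treated as a single book code.
        for book in BOOK_GROUPS.get(item, [item]):
            if book not in books:
                books.append(book)
    return books


print(expand_selection(["Synoptic Gospels", "ACT", "MAT"]))
```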

Update Scripture Field

Given the P3 we are using, the first field under Scripture should be “Is there a desire for translation?” or words to that effect with Yes, No, Unknown/Not Available options.

Rationale: Before capturing the rest of the information about the Scripture, we should document if the local church (network) and people want or have a translation. If not, why spend time now capturing the information? If we want to know what networks don't desire a translation, the system can easily display/print that list for whatever purpose like inputting the available info if someone has the time when the local network isn't interested.

Add a “Year” field to describe the Yes/No/NA field in #1 above.

Add a “How much/What part” field.

Either typed in descriptive field or selection list

Add a “What format” field.

Written, video, etc.

Networks translating

I've noticed some glitches in the tD: under "Networks translating" there is a long dropdown list of bizarre entities, and SIL is not there and can't be typed in, although they do most of the translation work in the world.

Check out l'Observatoire linguistique as a data source

Jesse,

I am excited to see this slowly coming together, and I’m convinced it will be a very good tool in the end. You mentioned that you currently import data from the Ethnologue, SIL and Wikipedia, with ‘glottolog’ being “worked on” right now. I strongly suggest we also draw information from the Linguasphere Register.

This register covers not only all languages, and where they are spoken, but also indicates here and there which script is in use (e.g. the Cyrillic script for some Caucasian languages). I think the Linguasphere coding system is the one that WCD uses (World Christian Database). (I took just a very quick look on this, so this would need to be verified).

At any rate, we should utilize the huge amount of data that Linguasphere provides.

Blessings

Ralph

Add Alternate Language Names to langnames.json

We need to add all of the known alternate language names to the langnames.json export.
First priority:

Secondary priority:

  • See if we can find and import other "alternate" names for languages. Possibly scraping ethnologue.

Database Backups

We need to be making daily database backups to our offsite backup server. @swilcox, what do we have on the Gondor side to facilitate this?

Add ability to submit a request for a new Living Language

We realize this is the ISO identifier. If we find a language without an identifier, can we (should we) add it to the tD? If so, we need an Add option.

Is the tD only interested in capturing data about “living” languages? This might be a “duh” question; just clarifying.

Version 1 and 2 Export

From @jag3773 on February 9, 2015 18:15

Translations with a source text version number that is lower than the latest source text version number should be listed in an API endpoint of some sort. This will allow a plugin on our Door43 server to place a notice message for that language that provides information on how to update to the latest version.

Note that the English version should be considered the "latest source text version".
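
The comparison behind such an endpoint could look roughly like this. It is a sketch under assumptions: the record keys (`lc`, `version`, `source_version`), dotted numeric version strings, and the function names are hypothetical and not taken from the catalog.

```python
def parse_version(text):
    """Turn a dotted version string such as "3.1" into a comparable tuple."""
    return tuple(int(part) for part in str(text).split("."))


def outdated_translations(translations, source_lang="en"):
    """Return codes of translations based on an older source text than the latest."""
    latest = parse_version(
        next(t["version"] for t in translations if t["lc"] == source_lang)
    )
    return [
        t["lc"]
        for t in translations
        if t["lc"] != source_lang and parse_version(t["source_version"]) < latest
    ]


# Hypothetical records for illustration.
catalog = [
    {"lc": "en", "version": "4"},
    {"lc": "fr", "source_version": "3"},
    {"lc": "es", "source_version": "4"},
]
print(outdated_translations(catalog))
```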

Copied from original issue: unfoldingWord-dev/uwadmin#9

Publish tA vol1 to uW Catalog

From @jag3773 on July 29, 2015 20:5

Related to #41, unfoldingWord-dev/tools#113, and unfoldingWord-dev/tools#112.

We need to write a publishing routine to get tA vol1 into the catalog.

Each entry should have an ID that is the "slug" field in the source YAML matter. It's possible that we'll need to prefix the slug with vol1-slug and/or vol2-slug in order to avoid collisions (please first check to see if there are any existing collisions before implementing that though). This comment replaces #72.
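
The collision check requested above could be sketched like this. The function names and the sample slugs are hypothetical, and prefixing every slug once any collision is found is only one possible reading of the request.

```python
def colliding_slugs(vol1_slugs, vol2_slugs):
    """Return the slugs that appear in both volumes, sorted."""
    return sorted(set(vol1_slugs) & set(vol2_slugs))


def entry_ids(vol1_slugs, vol2_slugs):
    """Use bare slugs as IDs unless a collision exists, then prefix by volume."""
    if not colliding_slugs(vol1_slugs, vol2_slugs):
        return list(vol1_slugs) + list(vol2_slugs)
    return [f"vol1-{s}" for s in vol1_slugs] + [f"vol2-{s}" for s in vol2_slugs]


# Hypothetical slugs for illustration.
print(colliding_slugs(["intro", "checking"], ["intro", "formats"]))
print(entry_ids(["intro", "checking"], ["intro", "formats"]))
```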

Copied from original issue: unfoldingWord-dev/uwadmin#49

Versioned Data Dumps

We need to discuss versioned data dumps of information we are allowed to dump.

Add Geographic Data to langnames.json

We need the country code and the geographic region added to the langnames.json endpoint. An entry should look something like this:

{  "lc": "aa",
   "cc": "NG",
   "lr": "Africa",
   "ln": "Afaraf"
}
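
Adding the two fields to existing entries might look like the sketch below. It assumes the current entries already carry `lc` and `ln`, and that a lookup from language code to country code and region is available; the `geo` table and the function name `add_geo` are hypothetical, and the sample values simply reuse the example above.

```python
import json


def add_geo(entries, geo):
    """Return copies of the entries with "cc" and "lr" filled in where known."""
    enriched = []
    for entry in entries:
        cc, lr = geo.get(entry["lc"], ("", ""))
        enriched.append({"lc": entry["lc"], "cc": cc, "lr": lr, "ln": entry["ln"]})
    return enriched


# Values reuse the example entry above; the lookup table itself is hypothetical.
entries = [{"lc": "aa", "ln": "Afaraf"}]
geo = {"aa": ("NG", "Africa")}
print(json.dumps(add_geo(entries, geo), indent=2))
```

Entries without a match in the lookup keep empty strings for `cc` and `lr`, so consumers of the export always see the same keys.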

Update OBS Available Translations

This mockup shows the general direction that I would like to see for the static HTML rendering of the "Available Translations of Open Bible Stories" that is embedded on the page unfoldingWord.org/stories.

obs_-_available_translations

Some points about it:

  • the HTML bullet should be the level checking indicator for each respective translation. Note the new icon for "in progress" (fourth language in the list).
  • the language names are self-names where we have them
  • the language codes are IETF-compliant
  • icon 1: low-res icon - links to SD in-browser display
  • icon 2: hi-res icon - links to HD in-browser display
  • icon 3: download icon - links directly to PDF download
  • icon 4: app icon - links directly to unfoldingWord mobile app in Google Play (eventually we will add different icons for Android, iOS, etc.) Note: this icon could be moved to a larger icon at the top of the page for "get all these languages in the mobile apps"
  • note the "in progress" icon which should link directly to the translation draft for that language in Door43. Note: this link could also be provided for checked translations as well.

Copyright saving issue

And when you type something in the "copyright" field and click "Save," it doesn't save it.

Merge "Is Gateway" Field

In the language lists, the "Is Gateway" column should be merged into the Gateway Language column to conserve space.

Import America's Data

Hi Jesse and Patrick

I mentioned that the Americas has data we'd appreciate if you import into the tD. This email has all that info. In addition, we believe we have a list of additional languages to be added. Would you also help with that effort?

I've attached two files. The one whose name begins with "Languages" is a spreadsheet compiled by a WA person working his deputation (as I understand). His name is Todd, hence you'll read "Todd's spreadsheet" in various places. This spreadsheet ("ss" in places) is the source of the info I'd like you to import into the tD.

The other attachment (2 db - 2 "database" - comparison) has 3 tabs. The first is my field by field comparison of Todd's ss with the fields in the tD. On the left side of this ss, I've listed Todd's field name (exactly from the other attachment) and then my best guess at the definition for the field. In fields where it made sense, I have also listed the current values in Todd's ss.

In the first column on the right side of the first tab of the 2 db comparison ss, I have listed the 4 or 5 data fields which I believe are a match (the data in the field means exactly the same thing in Todd's ss as in the tD). Then, I have listed the fields (without an attempt at definitions) of the tD in a hierarchy read from left to right.

The second tab of the "2 db comparison" ss holds a new topic altogether: identification of new languages. I've listed possible new languages in Guinea-Bissau and have compared the list of new languages to those currently listed for G-B. Dave Byron received this list from a Brazilian woman and believes the list does represent as yet unidentified languages. I'm not sure what the procedure is to "add new languages" since there likely will be some official comparison to the ISO and, if new, assignment of new ISO codes.

Please let me know if there's anything I can do to help or clarify with either of these efforts.

Thanks! Karen

Single Text Box Entry Form

Create a form that is a single entry box, similar to Twitter, that allows people to dump information into it.

The text box should be hashtag aware so that the entry can be tagged with various topics. Once inserted, the entry should become a comment on the relevant page and the data should be mined and inserted into the appropriate fields in the appropriate page.

See example mock up.
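
The hashtag handling could start from something like the sketch below. It only covers pulling tags out of the text; mapping tags to pages and fields is left open. The function name `extract_tags` and the sample entry are hypothetical.

```python
import re

# A hashtag is "#" followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def extract_tags(text):
    """Return the hashtags in the text, lowercased, in order, without repeats."""
    tags = []
    for tag in HASHTAG.findall(text):
        tag = tag.lower()
        if tag not in tags:
            tags.append(tag)
    return tags


print(extract_tags("Started the #OBS draft in #Vietnam this week #obs"))
```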

Published Date Issue

Also, under "Published date" if you type in a year, it adds another number that seems to be the current month. I think most of the time we will only know the year and that's really all we care about, so I would recommend getting rid of that month feature.
