The jbovlaste from lojban

From Nicholas Pinney via email:

The dictionary at http://jbovlaste.lojban.org//lookup.pl performs
case-sensitive queries. A search for "Green" finds no words, but a search for
"green" does. I'm pretty sure this isn't intended behavior; it certainly
took me a few tries to work out. (I'm typing on a phone and the keyboard
automatically capitalises the first letter of a sentence.)
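
One possible fix is to fold case on both sides of the comparison before matching. The sketch below is only an illustration in Python, not jbovlaste's actual lookup code; the function names and the sample data are assumptions made for the example.

```python
def normalize_query(query):
    """Trim whitespace and fold case so that "Green" and "green" compare equal."""
    return query.strip().casefold()


def lookup(query, glosswords):
    """Return the valsi whose gloss word matches the query, ignoring case.

    glosswords is a list of (gloss word, valsi) pairs; this shape is an
    assumption for the sketch, not the real database layout.
    """
    wanted = normalize_query(query)
    return [valsi for gloss, valsi in glosswords if normalize_query(gloss) == wanted]


# Hypothetical sample data standing in for the English gloss words.
GLOSSWORDS = [("green", "crino"), ("blue", "blanu")]

print(lookup("Green", GLOSSWORDS))
print(lookup("green", GLOSSWORDS))
```

Folding probably only makes sense for natural language queries, since capital letters can carry meaning in Lojban words themselves (for example stress marking in cmene).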

Full email notifications

Right now, the only way to find out about the newest words and discussions on jbovlaste is to go to http://jbovlaste.lojban.org/recent.html
The only emails I get are when someone edits my words or adds a comment.

I would like to have the option to receive email containing as much information as I like, including the full content of every new discussion (whether or not I'm involved in it) and even all new definitions added.

Chinese pdf export doesn't use necessary fonts

I remember discussing this more than a year ago. We need to use East Asian fonts in Chinese/Korean/Japanese pdf exports. See for yourself: the export shows squares instead of characters. It would also be better to embed the fonts into the resulting pdf.

Please add words that are incompatible with vlatai

From gleki:

Since vlatai has a bug could you please add the following words
with any definition (even empty) in Test Language?

februari, martio, prilio, madjio, djunio, djulio, matriocka, patriarka, minstreli

Denote experimental lujvo and fu'ivla as such

Experimental gismu and cmavo are clearly denoted as such, so there’s no reason why experimental lujvo and fu'ivla aren’t.
Maybe this could be automated?

A lujvo is called “experimental” if at least one of its rafsi is derived from an experimental gismu or cmavo.
A fu'ivla is called “experimental” if it is in stage 3 and the used “topic rafsi” is derived from an experimental gismu or cmavo.
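
The two rules above could be automated roughly as follows. This is a sketch under stated assumptions: the rafsi table, the set of experimental roots, and the function names are all hypothetical, and the rafsi "xxa" with its root is a made-up placeholder rather than a real word.

```python
# Hypothetical lookup tables; a real implementation would read these from the database.
RAFSI_TO_ROOT = {
    "zba": "zbasu",           # official gismu
    "sai": "sanmi",           # official gismu
    "xxa": "placeholder-root",  # made-up rafsi of a made-up experimental root
}
EXPERIMENTAL_ROOTS = {"placeholder-root"}


def is_experimental_lujvo(rafsi_list):
    """A lujvo is experimental if at least one rafsi comes from an experimental root."""
    return any(RAFSI_TO_ROOT.get(rafsi) in EXPERIMENTAL_ROOTS for rafsi in rafsi_list)


def is_experimental_fuhivla(stage, topic_rafsi):
    """A fu'ivla is experimental if it is stage 3 and its topic rafsi is experimental."""
    return stage == 3 and RAFSI_TO_ROOT.get(topic_rafsi) in EXPERIMENTAL_ROOTS


print(is_experimental_lujvo(["zba", "sai"]))
print(is_experimental_lujvo(["zba", "xxa"]))
```

Splitting a lujvo into its rafsi is assumed to have happened already; that step is left out of the sketch.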

jvs 2.0 should have Chinese Simple<=>Traditional decoder

The future jvs 2.0 should have a switch for viewing Chinese text in either traditional or simplified characters.

There is no need to have them as separate languages, as those are just two alternative orthographies (as if we had a switch for reading Lojban words in Tengwar).

lojban should be first language in prompt

tsani wrote:

Seeing as how important it is, could "lojban" be placed as the first language
in the list of languages in which one can define a word in jbovlaste?

https://groups.google.com/forum/#!topic/lojban/E1LxnaiMQoc

Change titles in pdf export to lojban

https://github.com/lojban/jbovlaste/blob/master/export/latex-export.html#L163
\title{). $escapedlang . q( To lojban and lojban To ). $escapedlang .q( Dictionary}
As you can see, the English words are hardcoded. I suggest changing the relevant part to the following:

lo vlaste noi fanva fi lo escapeall($langrealname) la .lojban. gi'e fi la .lojban. escapeall($langrealname)

We also have
\chapter{lojban to ). escapeall($langrealname) .q(}).

This can be translated as
fanva fi la lojban. fo lo escapeall($langrealname)

\author{Lojbanic Community} => \author{lo lojbo cecmu}
\date{\today} => Should be in Lojban format!

What are your opinions?

Etymology

The etymology field is awkward. There are two possible solutions:

Move the etymology field into the editdef.html page.
Retain etymology only for the Lojban-Lojban dictionary. Indeed, if you are interested enough in Lojban to want to know where words come from, then you should be able to read Lojban. Having etymology explanations translated into all languages is tiresome and unproductive. We should have as much as we can written in Lojban itself.

Expose edit history

From selpa'i:

I know some people wanted to be able to easily see who created a word

Make LaTeX/PDF errors more visible

From rlpowell in #6:

It would be super nice to have the definition submission make a
micro latex file to test that you didn't screw up, and show the user errors as
necessary.

In the absence of fixing on a per-definition basis, the export should at least report the
LaTeX output to the user.  :P:P:P  But: The errors don't tell you where you are without
grepping the latex.  -_-

"Word listings for ''"

Clicking on natlang (all or preferred) in the menu yields a defective header:

Word listings for ""

Please delete the words I added by mistake from the jvs database

From gleki:

word - definition id
bamju - 20732 
bamju - 36750
balnu - 39121 
balnu - 39123 
balnu - 39122 
balnu - 56024 

vetli - 36700
tamsi - 36698 
tamsi - 36699

Fix bad LaTeX

LaTeX support seems to be broken. Here’s the example for the official definition of dekto (English, definition ID 1): “x1 mathend000# is ten [10; 1*101 mathend000#] of x2 mathend000# in dimension/aspect x3 mathend000# (default is units).”

LaTeX math breaks PDF export and errors are swallowed

From rlpowell:

Something to work on: Math, in particular ^, breaks PDF export if it's not wrapped 
properly in $...$.  It would be super nice to have the definition submission make a
micro latex file to test that you didn't screw up, and show the user errors as
necessary.

In the absence of fixing on a per-definition basis, the export should at least report the
LaTeX output to the user.  :P:P:P  But: The errors don't tell you where you are without
grepping the latex.  -_-

New valsitype: "bu letteral"

Valsi such as {denpa bu} and {slaka bu} are currently identified as "cmavo", although this is not entirely accurate. These should be reassigned a new category, "bu letteral", and the morphology check should assign similar compounds to this category.

Public vote information

I think the votes should be public information, and possibly with required commentary. For instance, if your word gets downvoted, it would be helpful to know what is wrong and who thinks so.

xml export includes words with insignificant vote scores

Currently the export includes all words (even downvoted ones), but it should include only words whose vote balance is >= 1. It seems to have worked exactly like this until recently, so something got broken.
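
The intended filter can be stated in a few lines. This sketch assumes each entry carries separate upvote and downvote counts under hypothetical field names; the real export may store the balance differently.

```python
def exportable(entries, minimum_balance=1):
    """Keep only entries whose vote balance (upvotes minus downvotes) meets the minimum."""
    return [
        entry for entry in entries
        if entry["upvotes"] - entry["downvotes"] >= minimum_balance
    ]


# Hypothetical sample entries.
ENTRIES = [
    {"word": "klama", "upvotes": 4, "downvotes": 1},
    {"word": "broda", "upvotes": 1, "downvotes": 1},
    {"word": "brode", "upvotes": 0, "downvotes": 2},
]

print([entry["word"] for entry in exportable(ENTRIES)])
```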

optimize XML exports

dag reports that his XML exports, written in Haskell, are dramatically quicker than jbovlaste's.

Weird voting system regarding multiple definitions

Currently, each user can cast exactly one vote per language, either positive or negative, on a single definition. If there are multiple definitions, it is therefore not possible to vote on each definition.
I understand this makes sense for positive votes, to indicate a preference.

But what if you think all definitions for a given language are bad? You still have to choose one, while staying neutral on all the others.

Plus, you can’t vote positive if you voted negative and vice-versa.

This voting system is weird.
Maybe the voting system should be rethought.

rename "cmavo cluster" valsitype to "cmavo compound"

"cmavo clusters" are referred to as "compound cmavo" in CLL. xorxes points out that this is misleading, since the compounds do not form a single cmavo, so "cmavo compound" is more accurate.

CLL should probably be updated to use "cmavo compound", but even without such an update, if jbovlaste uses "cmavo compound" instead of "cmavo cluster", it's more likely that users of jbovlaste and CLL will correctly identify them as the same thing.

verify morphology: fu'ivla

Per #26 and #37, there are issues with vlatai (part of jbofi'e) which are preventing some words from being entered. As part of a process of evaluating camxes as a replacement, I'm auditing the classification of words that are already entered into jbovlaste, to ensure that the classification performed by camxes conforms to expectations.

This issue identifies words that are marked as "fu'ivla" in jbovlaste, which are not identified as such under camxes.

Non-lojban words ("nonLojbanWord" per camxes)

balnkpi
bliardo
blueta
cidjrbentou
cidjrsurstrmi
cipnrbuteo
ckoala
cmiai
driomedeida
etreo
fiprkti'omizo
fiprpsefuru
fiprpseudoskafirinku
fiprtetrapleurodo
freskeo
infarkta
kriofla
ku'urpicea
melpsita
mudrfselia
mudrpterokarpu
pervu'ui
ricrbeaukarne'a
ricrpterokarpu
ridrdverga
rozrbanksi
rutrskuamosa
samcrarkti
skientia
sparanakampti
spararktantemu
spararkti
spareukomi
sparipeastru
sparksitropi
sparnimfea
sparpersea
sparpsikopsi
sparpsofokarpu
spatrleoxari
stagrleoxari
tanxebentou
tarksako
tirxpardu
trueno
xarjrngiri

Other non-fu'ivla per camxes

aierne: "cmavo + fu'ivla" per camxes
selda'ergau: "lujvo" per camxes
siukrida: "cmavo + gismu" per camxes

Restricted marking of gismu and cmavo as obsolete

per gleki:

we have "experimental gismu" in jbovlaste. in pdf export they are marked with a
triangle. I think we should have a similar class of gismu and cmavo called "obsolete".
However, placing this or that word into this category would be done only by directly
editing the database and on behalf of Robin and nobody else. in pdf export they
should be marked as [obsolete] after the word (or something similar)

Use/display of definition #s is inconsistent

definitions.definitionnum seems to be intended as a serial number for ordering the definitions for a particular valsi and language. But it appears that sometimes it is set to the same value as definitionid, e.g. in one of the Chinese definitions for finti.

Either jbovlaste should always display the definitionid, or definitionnum as displayed should be set to a different, meaningful value.

Global score threshold

A global score threshold would filter out any words or definitions that fail to meet a certain scoring threshold. It could be deactivated on a per user or per session basis by users who wish to browse all content, even low-scoring content.

New words should start out with more than 1 point

acolotl: Sometimes it seems people downvote words for no reason whatsoever.
acolotl: Even with completely normal words.
durka42: this happens anywhere a downvote button is provided :p
acolotl: Yeah, but here it causes the words not to show up.

durka42: a word could start with more than 1 vote initially
durka42: like, I dunno, 5
acolotl: Yeah, at this point even 3 would make a big difference.

Automatically redirect to lowest-scoring lujvo form

Here’s a usability suggestion:
If a user searches for a lujvo that is not in its lowest-scoring form, jbovlaste should still recognize the lujvo and redirect the user to the lowest-scoring form. If there is no entry, the request fails, but if there is, the user finds the lujvo they requested. This would be helpful because a user is unlikely to do the score calculation in their head. ;-)

Currently, jbovlaste simply would not find the entry, although the user entered a perfectly valid lujvo.

Adding this feature would also be closer to Lojban, because according to the CLL, all valid forms of a lujvo are equal.
It may be necessary to add support for manually added redirects for the extremely rare cases where a lujvo has two or more lowest-scoring forms.

New valsitype: "zei lujvo"

Lujvo formed with {zei} are currently split by jbovlaste, with the various parts parsed by vlatai. If the parsing is approved, the word is entered as a lujvo. This issue proposes to reclassify those lujvo as "zei lujvo" and to change the morphology check so that it assigns this new category as appropriate.

Separate field for 'See also'

Currently in order to place links to similar words one has to use "Notes" field. However, I propose splitting it into two fields:

"Notes" field for more precise explanations of the meaning of words.
"Tags" field for hyperlinks to other words.

verify morphology: lujvo

Per #26 and #37, there are issues with vlatai (part of jbofi'e) which are preventing some words from being entered. As part of a process of evaluating camxes as a replacement, I'm auditing the classification of words that are already entered into jbovlaste, to ensure that the classification performed by camxes conforms to expectations.

This issue identifies words that are marked as "lujvo" in jbovlaste, which are not identified as such under camxes. They are all {zei} lujvo. Jbovlaste currently partially bypasses vlatai when a word contains {zei}.

cirla zei burgere (gismu, cmavo, fu'ivla)
cu'a zei fancu (cmavo, cmavo, gismu)
dropanra zei ionti (lujvo, cmavo, fu'ivla)
ga'e zei lerfu (cmavo, cmavo, gismu)
gregori zei nanca (fu'ivla, cmavo, gismu)
jgitrviolino zei konceto (fu'ivla, cmavo, fu'ivla)
jgitrviolono zei konceto (fu'ivla, cmavo, fu'ivla)
konceto zei pagbu (fu'ivla, cmavo, gismu)
ma'u zei ionti (cmavo, cmavo, fu'ivla)
ni'u zei ionti (cmavo, cmavo, fu'ivla)
pipnrpiano zei konceto (fu'ivla, cmavo, fu'ivla)
simfoni zei pagbu (fu'ivla, cmavo, fu'ivla)
tarbi zei asna (gismu, cmavo, fu'ivla)
te'o zei dugri (cmavo, cmavo, gismu)
to'a zei lerfu (cmavo, cmavo, gismu)
ubutycys zei cacra (cmene, cmavo, gismu)
ze'i zei seldejni (cmavo, cmavo, lujvo)
ze'u zei seldejni (cmavo, cmavo, lujvo)
zo si si zei fa'o (cmavo x 5)

Add field for cmavo terminators

From gleki:

we need a new field for cmavo in jbovlaste 2.0: a field for the terminator for each
cmavo (if applicable). Just like we have now with selma'o.

ofc. they also must be clickable but that's too easy. just dont forget

Menu links break in subdirectories

In a subdirectory such as ...

http://jbovlaste.lojban.org/needed/

... the menu links render as relative, and thus are broken, e.g.

http://jbovlaste.lojban.org/needed/export/latex.html

They also don't include the port number, for those who wish to run jbovlaste on a non-standard port.
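
The breakage follows from ordinary URL resolution, which Python's standard urllib.parse.urljoin reproduces. The sketch below only demonstrates the resolution rules; the localhost address with port 8080 is a hypothetical example of a non-standard port, and emitting root-relative links is one possible fix, not necessarily the one jbovlaste should adopt.

```python
from urllib.parse import urljoin

PAGE = "http://jbovlaste.lojban.org/needed/"

# A relative link is resolved against the current directory, which gives the broken URL.
broken = urljoin(PAGE, "export/latex.html")

# A root-relative link (leading slash) is resolved against the site root instead.
fixed = urljoin(PAGE, "/export/latex.html")

# Root-relative links keep whatever host and port the page was served from.
# The localhost address is a hypothetical non-standard-port instance.
with_port = urljoin("http://localhost:8080/needed/", "/export/latex.html")

print(broken)
print(fixed)
print(with_port)
```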

Report of words missing in XML export of some languages

From guskant via gleki:

btw guskant wrote that xml export doesn't export all the db in simple english direction
and probably in chinese

verify morphology: cmene

Per #26 and #37, there are issues with vlatai (part of jbofi'e) which are preventing some words from being entered. As part of a process of evaluating camxes as a replacement, I'm auditing the classification of words that are already entered into jbovlaste, to ensure that the classification performed by camxes conforms to expectations.

This issue identifies words that are marked as "cmene" in jbovlaste, which are not considered as lojban words ("nonLojbanWord") under camxes:

altfor
arktik
bivmast
cibmast
daumast
djordj
feimast
gaimast (previously entered as "geimast" but may be a transcription error)
guanJOUS
maskFAS
mumymast
pavmast
relmast
slovensk
sozmast
tai,UAN
tcesk
vonmast
xavmast
xrvatsk
zelmast

Some natural language word votes are assigned to wrong words

There are currently more than 350 natural language votes which attribute natural language words (glosswords or place keywords) from one language to definitions from another language:

select count(*)
from natlangwordvotes v
join natlangwords w on w.wordid = v.natlangwordid
join definitions d on d.definitionid = v.definitionid
where w.langid <> d.langid;

It seems likely this is at least largely due to "dict/votebits" looking up natlangwords without including langid in the query.
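
If that diagnosis is right, the fix is to make langid part of the lookup key. The sketch below shows the difference on an in-memory SQLite table so that it is self-contained; the column names come from the query above, while the function name, the sample rows, and the langid values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table natlangwords (wordid integer primary key, langid integer, word text)"
)
# The same spelling exists in two languages (hypothetical langids 2 and 3).
conn.executemany(
    "insert into natlangwords values (?, ?, ?)",
    [(1, 2, "test"), (2, 3, "test")],
)


def find_wordid(conn, word, langid):
    """Look up a natural language word, restricted to the definition's language."""
    row = conn.execute(
        "select wordid from natlangwords where word = ? and langid = ?",
        (word, langid),
    ).fetchone()
    return row[0] if row else None


print(find_wordid(conn, "test", 3))
```

Without the langid condition the query matches both rows, and which one is returned first is not guaranteed, which would explain votes landing on a word from another language.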

vote information in xml export

jvs xml export should include "the number of upvotes minus the number of downvotes" for each word.
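
As a rough illustration, the score could be carried as an attribute on each exported word. The element name, the attribute names, and the function below are hypothetical and not taken from the actual export format.

```python
import xml.etree.ElementTree as ET


def valsi_element(word, upvotes, downvotes):
    """Build an element for one word, with score = upvotes minus downvotes."""
    element = ET.Element("valsi")  # hypothetical element name
    element.set("word", word)
    element.set("score", str(upvotes - downvotes))  # hypothetical attribute name
    return element


element = valsi_element("klama", upvotes=4, downvotes=1)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```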

verify morphology: cmavo and cmavo clusters

Per #26 and #37, there are issues with vlatai (part of jbofi'e) which are preventing some words from being entered. As part of a process of evaluating camxes as a replacement, I'm auditing the classification of words that are already entered into jbovlaste, to ensure that the classification performed by camxes conforms to expectations.

This issue identifies words that are marked as "cmavo" or "cmavo clusters" (CLL: "compound cmavo") in jbovlaste, which are not identified as such under camxes.

cmavo

y: "initialSpaces" per camxes

cmavo cluster

denpa bu: "gismu + cmavo" per camxes
slaka bu: "gismu + cmavo" per camxes
ybu: single cmavo per camxes

double UI can't be entered into JVS

{.ua .uu} can't be entered into JVS. But this poses an interesting question: if compound cmavo can be entered, why can't any other compound, like a tanru or even a complete bridi, be entered?

In fact I can enter {do du} but can't enter {do klama}!

Fix morphological check

I am not actually sure about this one, but I heard jbovlaste uses vlatai (an (almost?) undocumented tool which comes with jbofi'e) to check the morphology of words. The problem with that is that vlatai is buggy and does not correctly check the morphology.

I am not sure if I am correct here. Somebody please verify.

http://jbovlaste.lojban.org/languages.html not seen from the frontend

I can't find a link for http://jbovlaste.lojban.org/languages.html from the frontend.
It must be hidden somewhere, which is a bit silly. Maybe just place it in the main (side) menu?

Cannot specify a single keyword at a position greater than 1

It is not possible to skip a place keyword. If you want to specify place keyword 2 only, you also have to specify place keyword 1. If you specify only keyword 2 anyway, jbovlaste will actually assign this keyword to place 1.

An ugly workaround currently exists: fill the places to be skipped with some nonsense character, like “-”.
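
The underlying fix is to key each keyword by the position of its form field rather than by its order among the non-empty fields. A minimal sketch, assuming the form delivers one string per place in order (the function name and input shape are hypothetical):

```python
def collect_place_keywords(form_fields):
    """Map place numbers to keywords, keeping the position of each form field.

    form_fields holds one string per place, in order (index 0 is place 1).
    Blank fields are skipped without shifting the places that follow.
    """
    return {
        place: keyword.strip()
        for place, keyword in enumerate(form_fields, start=1)
        if keyword.strip()
    }


# Only place 2 is filled in; it stays at place 2 instead of sliding to place 1.
print(collect_place_keywords(["", "destination"]))
```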

"Data inconsistencies" in natlangwords

Natural language words that are entered with both a "default" sense as well as at least one explicit sense are displayed with an error:

Data Inconsistency
There appears to be both a default meaning for this word, and one or more
specialized meanings. This makes little sense, and should be corrected.

e.g. http://jbovlaste.lojban.org/natlang/en/extreme

There are currently more than 800 language/word pairs in the database which exhibit this problem, as indicated by this query:

select w.langid, w.word, count(*)
from natlangwords w
where exists
    (select 1 from natlangwords ww
     where ww.langid = w.langid and ww.word = w.word
     and ww.meaning is null)
and exists
    (select 1 from natlangwords www
     where www.langid = w.langid and www.word = w.word
     and www.meaning is not null)
group by w.langid, w.word;

Either this rule should be enforced and the problem cases cleaned up, or the rule should be scrapped along with this error message.

Syntactic parser in JVS (first and general ideas)

We should think about implementing camxes.js in JVS or JVS 2.0. Obviously it shouldn't be just a link to http://ilmen.tk/lojban/camxes.html

Maybe you enter a sentence, it parses it, the result gives links to entries in the database, and a switch allows you to change the output language of the glossing.

"obsolete" as a new class of Lojban words

we have "experimental gismu". in pdf export
they are marked with a triangle. we should have a
similar class of gismu and cmavo called "obsolete". However,
placing this or that word into this category would be done only
by directly editing the database and on behalf of Robin and
nobody else. in pdf export they should be marked as [obsolete]
after the word (or something similar )

Multilingual interface

Vlasisku's entire interface is in English. This is completely unacceptable. Vlasisku.ru runs with the same interface although it is meant to be used by speakers of Russian.

Can't export gua\spi dict. pdf

I can't generate the gua\spi pdf.
It looks like the name of gua\spi needs to be escaped for TeX.
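
For illustration, escaping a language name for LaTeX could look like the sketch below. It is written in Python and is not the export's existing escaping routine; the function name and the replacement table are assumptions. All characters are replaced in a single pass so that the braces introduced by \textbackslash{} are not escaped a second time.

```python
import re

# LaTeX special characters and a text-mode replacement for each.
TEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
TEX_PATTERN = re.compile("|".join(re.escape(char) for char in TEX_SPECIALS))


def escape_tex(text):
    """Replace every LaTeX special character in one pass over the text."""
    return TEX_PATTERN.sub(lambda match: TEX_SPECIALS[match.group(0)], text)


print(escape_tex("gua\\spi"))
```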

Reclassify morphologically invalid fu'ivla and cmene

Per #39 and #40, some words entered into jbovlaste as fu'ivla and cmene are not valid according to camxes. In anticipation of replacing vlatai (jbofi'e) as the morphological classifier in jbovlaste (per #26), these words will be assigned new types, "obsolete fu'ivla" and "obsolete cmene". These types may be the first steps to address #9.

Move "Test Language" definitions into separate instance

There's no need to mix test data with production data. Let's make a test instance of jbovlaste, then remove all of the test data from the production instance.

Images for JVS 2.0

It would be nice to illustrate some words with images taken from Wikimedia and/or properly licensed for us (free modification and copying, even commercially, would be the best option). I guess any of us can upload an image illustrating {plise}.
