Comments (7)

stanhebben commented on August 16, 2024

Hadn't considered Chinese yet 😀. I can take care of this, but what would be the best way to determine whether text is written in scriptio continua?

We could decide per character by recognizing character ranges. Maybe we can use Character.isIdeographic() for that, which recognizes the CJKV ideographs used in Chinese, Japanese, Korean and Vietnamese writing, but would that handle all cases of scriptio continua?

Alternatively, it could be done according to the book's language, but then foreign language snippets inside such books would not be formatted properly.
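
A minimal sketch of the character-based check, assuming Java 7 or later (where Character.isIdeographic(int) is available). The class and method names are hypothetical and not part of Patchouli; the sketch only reports whether a string contains at least one ideographic code point.

```java
final class IdeographCheck {
    private IdeographCheck() {}

    /** Returns true if any code point in the text has the Unicode Ideographic property. */
    static boolean containsIdeograph(String text) {
        // Iterate by code point so supplementary-plane ideographs are handled too.
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            if (Character.isIdeographic(codePoint)) {
                return true;
            }
            i += Character.charCount(codePoint);
        }
        return false;
    }
}
```

Because the check runs per character, a foreign-language snippet inside a book would be detected regardless of the book's declared language.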

from patchouli.

3TUSK commented on August 16, 2024

Off-topic: that BookTextParser is more like a BookTextTokener to me.

Character.isIdeographic might work - but...

  1. Modern Vietnamese uses a romanized script. So the issue described does not apply to Vietnamese. One less thing to consider, which is good.
  2. Modern Korean uses hangul - usually it uses whitespace to separate words; otherwise there is no readability at all. Another thing crossed out.
  3. Within the set of languages that Vanilla Minecraft supports - the only case where Character.isIdeographic may not work is probably Thai. After all, modern languages rarely use scriptio continua. Unfortunately, I have zero knowledge of how the Thai language actually works, and worse - I am not sure if vanilla Minecraft can handle the rendering of Thai script...
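
To make the points above concrete: Character.isIdeographic returns false for Hangul syllables, for Thai letters, and also for Japanese kana. One possible alternative is to test the Unicode script of each code point instead. The sketch below assumes Java 7 or later; the class name, the method name and the choice of scripts in the set are hypothetical and deliberately incomplete.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

final class ScriptCheck {
    private ScriptCheck() {}

    // Assumption: an illustrative, incomplete set of scripts that are
    // usually written without spaces between words.
    private static final Set<UnicodeScript> UNSPACED = EnumSet.of(
            UnicodeScript.HAN,
            UnicodeScript.HIRAGANA,
            UnicodeScript.KATAKANA,
            UnicodeScript.THAI);

    /** Returns true if the code point belongs to one of the scripts listed above. */
    static boolean isUnspacedScript(int codePoint) {
        return UNSPACED.contains(UnicodeScript.of(codePoint));
    }
}
```

With this set, a Thai letter such as U+0E01 is matched while a Hangul syllable such as U+D55C is not, which mirrors the distinction drawn in the list.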

I personally believe that this issue must be solved via boundary analysis (for example java.text.BreakIterator can do that; Mojang also uses icu4j, but I can't get that to work).
A few months ago, I wrote this because I feel that vanilla isn't doing line wrapping correctly either, and I also wrote an explanation of what I did and why I did so. I was thinking of adapting my work into BookTextParser, but I soon realized that command handling makes the situation even trickier...
At least, I hope that can provide some hints.
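
As a rough illustration of boundary analysis, the sketch below uses java.text.BreakIterator.getLineInstance to list the offsets where a line may be wrapped. The class and method names are hypothetical, and this is not the code referred to above; it only shows the JDK API involved.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class LineBreakSketch {
    private LineBreakSketch() {}

    /** Returns every offset where the text may be wrapped, including 0 and text.length(). */
    static List<Integer> breakOffsets(String text, Locale locale) {
        BreakIterator it = BreakIterator.getLineInstance(locale);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        // first() returns 0; next() returns DONE once the end has been passed.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            offsets.add(pos);
        }
        return offsets;
    }
}
```

For space-separated text the offsets fall after each run of whitespace; for a run of Han characters the JDK's default rules are expected to allow a break between every pair of characters, so no explicit character-range check is needed.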

stanhebben commented on August 16, 2024

Yeah, command handling makes it difficult to implement a solution with BreakIterator, since text is expanded on the fly (and styles need to be applied to words) so we can't simply feed the input text to a BreakIterator. Even a custom CharacterIterator may be difficult to implement, but I'm thinking about it.

I don't understand your reply concerning Hangul; is it crossed out because Korean does use spaces; or because Hangul isn't assumed Ideographic by Character.isIdeographic?

stanhebben commented on August 16, 2024

I think I may have a solution in mind with the BreakIterator, provided that processing of commands and positioning of text are performed in separate steps. Command handlers can first generate a list of annotated spans. Once these spans are determined, I can have a custom character iterator iterate over these spans, splitting them into lines and performing positioning. Since the BreakIterator doesn't insert or delete characters, I can look up the spans in the original list using the character indices, applying styles appropriately.
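
The two-step idea could be sketched roughly as follows. This is not the actual fix: the Span class, its fields and the method name are hypothetical, the span text is concatenated into one String instead of being read through a custom CharacterIterator, and no width measurement or positioning is done. It only shows how break offsets can be mapped back to styled spans by character index.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SpanBreakSketch {
    private SpanBreakSketch() {}

    /** Hypothetical annotated span: a run of text carrying one style label. */
    static final class Span {
        final String text;
        final String style;
        Span(String text, String style) { this.text = text; this.style = style; }
    }

    /** Splits the spans into breakable units; each unit keeps the style of its source span. */
    static List<List<Span>> breakableUnits(List<Span> spans, Locale locale) {
        // Step 1: concatenate span text and remember where each span starts.
        StringBuilder all = new StringBuilder();
        int[] starts = new int[spans.size() + 1];
        for (int i = 0; i < spans.size(); i++) {
            starts[i] = all.length();
            all.append(spans.get(i).text);
        }
        starts[spans.size()] = all.length();

        // Step 2: find break opportunities over the whole text.
        BreakIterator it = BreakIterator.getLineInstance(locale);
        it.setText(all.toString());

        List<List<Span>> units = new ArrayList<>();
        int spanIndex = 0;
        int from = it.first();
        for (int to = it.next(); to != BreakIterator.DONE; from = to, to = it.next()) {
            List<Span> unit = new ArrayList<>();
            int pos = from;
            while (pos < to) {
                // Advance to the span containing pos (this also skips empty spans).
                while (starts[spanIndex + 1] <= pos) {
                    spanIndex++;
                }
                int end = Math.min(to, starts[spanIndex + 1]);
                unit.add(new Span(all.substring(pos, end), spans.get(spanIndex).style));
                pos = end;
            }
            units.add(unit);
        }
        return units;
    }
}
```

A unit that straddles a style change comes back as two fragments with different style labels, so a later layout step can keep the unit on one line while still rendering each fragment with its own style.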

3TUSK commented on August 16, 2024

> I don't understand your reply concerning Hangul; is it crossed out because Korean does use spaces; or because Hangul isn't assumed Ideographic by Character.isIdeographic?

You can safely ignore my comments regarding hangul - all I wanted to say is that "Modern Korean does not have this issue because of the use of whitespace". Apologies for the confusion.

stanhebben commented on August 16, 2024

A fix for this has been implemented and should be available in the next release.

3TUSK commented on August 16, 2024

For future reference: fixed by #17.
