Comments (4)

vanzytay commented on May 27, 2024

The brackets are required. This issue has been fixed internally, and we are going to push an update to this codebase soon (in January). Since this public version acts as a mirror, we have not synced it for a while now.

from long-range-arena.

adamsolomou commented on May 27, 2024

Thanks for clarifying!

So would the correct approach be to tokenize according to this vocabulary?

(']', '1', '4', '7', '9', '8', '2', '5', '3', '0', '6', '[MIN', '[SM', '[MAX', '[MED')

cifkao commented on May 27, 2024

I suppose the fix would be to pass something like reserved_tokens=['[', ']'] to tfds.deprecated.text.Tokenizer to prevent it from ignoring these?

adamsolomou commented on May 27, 2024

If you address the issue this way, it gives rise to the following vocabulary:

('MED', '5', '8', '4', '0', ']', '1', '3', '6', '7', '2', 'MIN', '9', '[', 'MAX', 'SM')

which differs from this one:

(']', '1', '4', '7', '9', '8', '2', '5', '3', '0', '6', '[MIN', '[SM', '[MAX', '[MED')

in that [ is encoded as a token of its own rather than being part of '[MIN', '[SM', '[MAX', '[MED'.
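To make the difference between the two vocabularies concrete, here is a minimal sketch using only Python's standard library. It does not call tfds.deprecated.text.Tokenizer; the regular expression merely imitates what treating the brackets as standalone tokens would plausibly produce. The sample string and both function names are made up for illustration and are not taken from the dataset or the codebase.

```python
import re

# Hypothetical ListOps-style input, written by hand for illustration.
SAMPLE = "[MAX 4 3 [MIN 2 3 ] 1 0 ]"


def tokenize_fused(text):
    """Split on whitespace, so '[' stays attached to the operator ('[MAX')."""
    return text.split()


def tokenize_separate(text):
    """Emit '[' and ']' as tokens of their own, next to alphanumeric runs."""
    return re.findall(r"\[|\]|[A-Za-z0-9]+", text)


fused = tokenize_fused(SAMPLE)
separate = tokenize_separate(SAMPLE)

print(fused)
print(separate)
```

With the first scheme the opening bracket only ever appears fused to an operator, so the vocabulary has no '[' entry; with the second, '[' becomes its own entry, the operators lose their prefix, and each sequence grows by one token per operator.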

@vanzytay could you please clarify which is the right way to go?
