Comments (3)
TokenMonster doesn't support this format. It supports either its own TokenMonster .vocab format or a YAML file (example here). You can write a simple script to convert your vocabulary into the YAML format and then import it using the exportvocab executable, or the Go or Python libraries.
from tokenmonster.
Thanks. This is what I thought, but the error message was confusing. Is it possible to separate permission errors from format errors?
I'll close this issue then.
It would be better for the error message to say exactly what the issue is. However, the Python script doesn't know the exact error, because the error originates in the tokenmonsterserver subprocess. That's why you get this vague "cannot open or save" error message. I don't think this is important enough to update everything with new error codes, but I'll keep it in mind if I do a larger update at some point.
If you use the original Go implementation, instead of Python (which wraps the Go implementation), you'll get more detailed error messages.
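Until the error codes are more specific, one workaround on the Python side is to check the file yourself before handing the path to the library. The sketch below is only an illustration: `diagnose_vocab_path` is a hypothetical helper, not part of TokenMonster, and the accepted extensions (`.vocab`, `.yaml`, `.yml`) are an assumption based on the two formats mentioned above. It uses only the Python standard library and never calls TokenMonster itself.

```python
import os
from pathlib import Path

# Assumption: vocabularies are stored with one of these extensions.
# Adjust the set if your files are named differently.
EXPECTED_SUFFIXES = {".vocab", ".yaml", ".yml"}


def diagnose_vocab_path(path):
    """Hypothetical pre-flight check, not part of TokenMonster.

    Returns a short reason string if a likely problem is found,
    or None if the path looks loadable.
    """
    p = Path(path)
    if not p.exists():
        return "missing: file does not exist"
    if not p.is_file():
        return "missing: path is not a regular file"
    if not os.access(p, os.R_OK):
        return "permission: file is not readable by this user"
    if p.suffix.lower() not in EXPECTED_SUFFIXES:
        return "format: unexpected extension " + repr(p.suffix)
    return None
```

Calling `diagnose_vocab_path("my_vocab.json")` on an existing, readable file would return a string starting with `format:`, which at least tells you the vague "cannot open or save" message is probably not a permission problem. It only inspects the path and extension, so a file with the right extension but invalid contents will still pass this check.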