Comments (14)
Awww, hell yeah. I’m very happy to see this.
You can’t instance objects at all here unless you’re certain they're v4 compatible, because the object definitions for most of the resources changed between godot versions.
This is the reason I wrote ResourceLoaderCompat; I needed to be able to load binary resources without actually instancing any of the objects and then convert them to text. I ended up using it to extract properties from resources for exporting (like textures) without having to instance them.
Since ResourceLoaderCompat doesn't instance any of the objects, you wouldn't have access to a "real" resource with all its functions, but you'd have access to the properties. So you'd be able to get the messages that way.
So there are two options for the solution:
- You have to manually extract the properties from translations by loading the resources with ResourceLoaderCompat and getting them from that, then extracting the messages property and whatever else you need, then outputting them as CSVs. For an example of how to do this, take a look at what I'm doing in texture_loader_compat.cpp for loading v2 textures and bitmaps. You'll have to add whatever class your thing is in (I suggest making a new one, like TranslationLoaderCompat) as a friend to ResourceLoaderCompat because I don't currently make the internal resource properties openly available, though I should when it comes time to refactor.
OR, you can modify the "real load" in ResourceLoaderCompat so that it loads a compatibility class that is just backported from v3 and v2, respectively. This part is probably harder than the above, and I don't actually use "real loading" in ResourceLoaderCompat for anything yet, so I don't recommend it unless you feel like taking on a challenge.
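For the "outputting them as CSVs" step of the first option, here is a minimal standalone sketch (plain C++, not gdsdecomp code). It assumes the usual Godot translation CSV layout, a "keys" column followed by one column per locale; the function names csv_field and translations_to_csv are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Quote a field only when it contains a comma, quote, or line break;
// embedded quotes are doubled, as in common CSV dialects.
std::string csv_field(const std::string &s) {
    if (s.find_first_of(",\"\n\r") == std::string::npos) {
        return s;
    }
    std::string out = "\"";
    for (char c : s) {
        if (c == '"') {
            out += '"';
        }
        out += c;
    }
    out += '"';
    return out;
}

// messages[k][l] is the message for keys[k] in locales[l] (hypothetical layout).
std::string translations_to_csv(const std::vector<std::string> &locales,
                                const std::vector<std::string> &keys,
                                const std::vector<std::vector<std::string>> &messages) {
    std::string csv = "keys";
    for (const std::string &locale : locales) {
        csv += "," + csv_field(locale);
    }
    csv += "\n";
    for (std::size_t k = 0; k < keys.size(); k++) {
        csv += csv_field(keys[k]);
        for (std::size_t l = 0; l < locales.size(); l++) {
            // A missing message becomes an empty field.
            const bool present = k < messages.size() && l < messages[k].size();
            csv += "," + (present ? csv_field(messages[k][l]) : std::string());
        }
        csv += "\n";
    }
    return csv;
}
```

In the real tool the keys and messages would come from the properties pulled out via ResourceLoaderCompat; here they are just passed in as vectors.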
If you'd like, you can PR your current changes so I can see what you're currently doing and give you tips.
from gdsdecomp.
Also, if you want to take a look at how the translation resources are structured when stored, you can use the bin to text option in the GDRE tools menu; it's good to have a reference to just look back at. I'd recommend doing that for each major version; v2, v3, and v4, so you can see what the differences are.
edit: this is unnecessary, the structure didn't change, see below.
Here is an example of a bin to text .translation file from v3:
[gd_resource type="PHashTranslation" format=2]
[resource]
hash_table = PoolIntArray( -1, -1, -1, -1, -1, -1, -1, 0, 6, 16, 2, <...>
bucket_table = PoolIntArray( 1, 1, -558281573, 507, 50, 76, 2, 1, <...>
strings = PoolByteArray( 254, 80, 33, 3, 3, 71, 117, 6, 22, 36, 18, <...>
Taking a look at the history of PHashTranslation, it doesn't actually have the messages property, it's just an optimized hash table. However, we got lucky here in that there aren't any actual changes to the underlying structure from v2 to v4, it was just pointlessly renamed to OptimizedTranslation. So all you would have to do is create an object pointer that is instantiated with the type OptimizedTranslation, set it with the properties extracted from ResourceLoaderCompat, then reference it as an actual OptimizedTranslation.
Example:
Object *obj = ClassDB::instantiate(type);
if (!obj) {
    return ERR_PARSE_ERROR;
}
// set properties
// Properties in OptimizedTranslation:
//   Vector<int> hash_table;
//   Vector<int> bucket_table;
//   Vector<uint8_t> strings;
obj->set("hash_table", hash_table);
<etc..>
Ref<OptimizedTranslation> ref = Ref<OptimizedTranslation>(Object::cast_to<OptimizedTranslation>(obj));
Then get the messages that way.
However, looking at the function implementations here, there doesn't seem to be a way to dump all the messages at once, and it's not a real HashMap so you can't dump the keys and values that way. You may have to create a child class of OptimizedTranslation and cast the OptimizedTranslation object, and write custom functions to get the individual elements.
But in either case, I'd try get_message_list and see what happens; it may be empty since it's not actually implemented in OptimizedTranslation and the parent function Translation::get_message_list() references the translation_map, which doesn't seem to be set in OptimizedTranslation.
calling ClassDB::add_compatibility_class("PHashTranslation", "OptimizedTranslation"); ahead of time causes a segfault when I try to load it, so they don't seem to be compatible
btw, I tried to reproduce this using your examples, but I couldn't do so. I think you may have added this to the inner loop and added it multiple times, causing it to overflow and cause a seg fault. Try adding it outside of it.
If that works, then this becomes a lot easier. You can do a real load using ResourceFormatLoaderCompat (which is recommended because ResourceFormatLoader can pollute the path cache):
Error ImportExporter::export_translation(const String &output_dir, Ref<ImportInfo> &iinfo) {
    Error err = OK;
    ResourceFormatLoaderCompat rlc;
    // translation files are usually imported from one CSV and converted to multiple "<LOCALE>.translation" files
    for (String path : iinfo->dest_files) {
        Ref<Translation> tr = rlc.load(path, "", &err);
        ERR_FAIL_COND_V_MSG(err != OK, err, "Could not load translation file " + iinfo->get_path());
        // err is OK at this point, so return an explicit error code for an invalid resource
        ERR_FAIL_COND_V_MSG(!tr.is_valid(), ERR_INVALID_DATA, "Translation file " + iinfo->get_path() + " was not valid");
        List<StringName> messages;
        tr->get_message_list(&messages);
        for (const StringName &s : messages) {
            print_line(s, tr->get_message(s));
        }
    }
    return OK;
}
BTW, I did test get_message_list and it does not work, unfortunately. The unit test even checks to make sure it doesn't work. So, you will have to create a child class of OptimizedTranslation and figure out how to get the individual elements out of the hash map; take a look at struct Bucket in optimized_translations.h.
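As a rough idea of what "getting the individual elements out" could look like, here is a standalone sketch in plain C++ (no Godot types). It assumes each bucket in bucket_table is laid out as [size, func, then size elements of (key hash, string offset, compressed size, uncompressed size)], and that hash_table holds offsets into bucket_table with -1 for empty slots. That layout is my reading of struct Bucket and it lines up with the v3 dump earlier in the thread, but treat it as an assumption. StoredElem and list_elements are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One stored entry: the hashed key plus where its message bytes sit in `strings`.
struct StoredElem {
    uint32_t key_hash;
    uint32_t str_offset;
    uint32_t comp_size;
    uint32_t uncomp_size;
};

// Walk every non-empty hash_table slot and collect the elements of its bucket.
// Assumed bucket layout: [size, func, elem0(4 ints), elem1(4 ints), ...].
std::vector<StoredElem> list_elements(const std::vector<int32_t> &hash_table,
                                      const std::vector<int32_t> &bucket_table) {
    std::vector<StoredElem> out;
    for (int32_t start : hash_table) {
        if (start < 0) {
            continue; // -1 marks an empty slot
        }
        const std::size_t pos = static_cast<std::size_t>(start);
        if (pos + 2 > bucket_table.size()) {
            continue; // offset points outside the table
        }
        const int32_t size = bucket_table[pos];
        // bucket_table[pos + 1] is the per-bucket hash seed ("func"); not needed for listing.
        if (size < 0 || pos + 2 + 4 * static_cast<std::size_t>(size) > bucket_table.size()) {
            continue; // truncated or malformed bucket
        }
        for (std::size_t i = 0; i < static_cast<std::size_t>(size); i++) {
            const std::size_t e = pos + 2 + 4 * i;
            out.push_back({static_cast<uint32_t>(bucket_table[e]),
                           static_cast<uint32_t>(bucket_table[e + 1]),
                           static_cast<uint32_t>(bucket_table[e + 2]),
                           static_cast<uint32_t>(bucket_table[e + 3])});
        }
    }
    return out;
}
```

This only yields the hashed keys and the location of each message inside strings. Turning those bytes into text still requires decoding them, and decompressing them when the compressed and uncompressed sizes differ; the sketch does not attempt that.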
thanks for looking into this and explaining everything
I wasn't using ResourceFormatLoaderCompat, just regular ol' ResourceLoader::load
I have bad news though: the developer gave me the imported translation CSV, so this went from the top of my priority list to the bottom...
😭
I decided to implement it anyway. Give the standalone build artifacts from the CI run a try once they're finished building. https://github.com/bruvzg/gdsdecomp/actions/runs/3317312034
wow, nice!
when I click to download "GDRE_tools-standalone-linux" on that page, the little blue progress bar at the top just slowly crawls but it never loads. when I curl it, it says HTTP request sent, awaiting response... 404 Not Found
shame that we're not always able to recover the keys :(
wow, nice! when I click to download "GDRE_tools-standalone-linux" on that page, the little blue progress bar at the top just slowly crawls but it never loads. when I curl it, it says HTTP request sent, awaiting response... 404 Not Found
You have to be logged in to download it; try opening it up in a new tab.
shame that we're not always able to recover the keys :(
Yeah, and there’s no real way to do it programmatically either. You can’t recover them from the hash values, and because the key can be literally anything and stored as any member value, there’s no way to search the project for it.
The best we could do is a Translation editor, where people could edit in new translations and we then store them as a new OptimizedTranslation with the hash values from other translations. That’s a lot of work though, which is why I just tell people in the warning message to ask the creator.
just tried the build and the .assets/translations.csv output is correct! it says they're missing keys but the game uses one of the languages as the keys and it either found that or that's the default translation or something (I didn't entirely understand the default_messages guessing code). if there's ever a discrepancy between the sheet I have and the game assets, this will help
How that works is: We search for the locale/fallback setting in the pck's project.godot to determine what the default language is. If it's not set, then it defaults to English. Then we retrieve the message values for each translation, and if one of them is the default fallback language, we store the message values for that language as default_messages. This is because it is likely that the message value for the default language will be the key or part of the key.
We then cycle through all the message values in the default translation, and try get_message(key) to determine the key by matching the message value with the message retrieved from get_message. The keys that we try are the message value itself, and several permutations thereof (appending $$, TL_, stripping punctuation, etc.). For example, the key for the message displayed in a "Password" box may be "$$Password". If one of them results in us getting a message value that matches what we have, we use that. If we can't find it, we store it as <MISSING KEY [message]>.
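The guessing loop described above can be sketched roughly like this. It is a standalone illustration in plain C++, not the actual gdsdecomp code: HashedTranslation is a toy stand-in for a translation that only keeps hashed keys, its hash is illustrative only, and the candidate list (the message itself, "$$" and "TL_" used as prefixes following the "$$Password" example, and the punctuation-stripped forms) is an assumption about what the permutations look like.

```cpp
#include <cctype>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-in for a translation that keeps only hashed keys (not Godot's real class).
struct HashedTranslation {
    std::unordered_map<uint32_t, std::string> by_hash;

    // Illustrative hash only; the real hashing lives inside OptimizedTranslation.
    static uint32_t hash_key(const std::string &key) {
        uint32_t d = 0x1000193;
        for (unsigned char c : key) {
            d = (d * 0x1000193) ^ c;
        }
        return d;
    }
    void add(const std::string &key, const std::string &message) {
        by_hash[hash_key(key)] = message;
    }
    // Returns the message for `key`, or an empty string when the key is unknown.
    std::string get_message(const std::string &key) const {
        auto it = by_hash.find(hash_key(key));
        return it == by_hash.end() ? std::string() : it->second;
    }
};

std::string strip_punctuation(const std::string &s) {
    std::string out;
    for (unsigned char c : s) {
        if (!std::ispunct(c)) {
            out += static_cast<char>(c);
        }
    }
    return out;
}

// Try a handful of likely keys derived from the default-language message and
// keep the first one whose lookup returns that same message.
std::string guess_key(const HashedTranslation &tr, const std::string &message) {
    const std::string missing = "<MISSING KEY " + message + ">";
    if (message.empty()) {
        return missing; // an empty message would "match" every unknown key
    }
    const std::string stripped = strip_punctuation(message);
    const std::vector<std::string> candidates = {
        message, "$$" + message, "TL_" + message,
        stripped, "$$" + stripped, "TL_" + stripped,
    };
    for (const std::string &candidate : candidates) {
        if (tr.get_message(candidate) == message) {
            return candidate;
        }
    }
    return missing;
}
```

With a translation holding the pair ("$$Password", "Password"), guess_key(tr, "Password") tries "Password" first, gets nothing back, then finds the match on "$$Password". A key that cannot be derived from its message, such as "GREETING_01", ends up as a missing-key placeholder.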
It sounds like the locale/fallback language may be set to something other than what they actually intended to be the default language. What language is the game in by default when you open it? I might want to look at the project to see if I can improve that.
the game I'm datamining has frequent updates and happens to ship the translation keys as a random language (yi_US) to help translators see where the keys are rendered in-game. so it's actually very helpful to extract just the strings (which happen to have the keys for this game)!
regarding your comment on the PR,
When you import a translation CSV, it gets stored as OptimizedTranslation files that only store the hashes of the keys, rather than the keys themselves. It's not possible to recover the keys from the hashes, and we can't programmatically get them from project resources since they can be any string value and stored in anything.
are you saying that the original strings are in the project resources and we just don't know which one it is? what if we just hash every string and look for matches?
I had thought about that, but for any project with a non-trivial amount of scripts and resources, that would be a huge number of strings and would be insanely slow. That might be justified if the objective is to recover the translation.csv, so it could be an optional thing, but there are a lot of modifications I would have to make to script/resource loading and parsing to make that happen. I'd have to load and parse every single resource and script and capture every string.
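For reference, the matching half of that idea is cheap; the expensive part is harvesting every string from the project. A minimal standalone sketch of the matching step, in plain C++ with no Godot types: the hash function and the bucket layout ([size, func, then 4 ints per element, hashed key first]) are written from memory of OptimizedTranslation and should be treated as assumptions, and tr_hash, has_key and find_keys are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Seeded multiplicative hash, as I recall it from OptimizedTranslation (assumption).
// Bytes above 0x7F depend on char signedness here, so stick to ASCII when comparing.
uint32_t tr_hash(uint32_t d, const std::string &s) {
    if (d == 0) {
        d = 0x1000193;
    }
    for (char c : s) {
        d = (d * 0x1000193) ^ static_cast<uint32_t>(c);
    }
    return d;
}

// True when `candidate` hashes to a key stored in the tables.
// Assumed bucket layout: [size, func, elem0(4 ints), elem1(4 ints), ...].
bool has_key(const std::vector<int32_t> &hash_table,
             const std::vector<int32_t> &bucket_table,
             const std::string &candidate) {
    if (hash_table.empty()) {
        return false;
    }
    const int32_t start = hash_table[tr_hash(0, candidate) % hash_table.size()];
    if (start < 0) {
        return false; // empty slot
    }
    const std::size_t pos = static_cast<std::size_t>(start);
    if (pos + 2 > bucket_table.size()) {
        return false;
    }
    const int32_t size = bucket_table[pos];
    if (size < 0 || pos + 2 + 4 * static_cast<std::size_t>(size) > bucket_table.size()) {
        return false;
    }
    // Each bucket re-hashes the string with its own seed ("func").
    const uint32_t h = tr_hash(static_cast<uint32_t>(bucket_table[pos + 1]), candidate);
    for (std::size_t i = 0; i < static_cast<std::size_t>(size); i++) {
        if (static_cast<uint32_t>(bucket_table[pos + 2 + 4 * i]) == h) {
            return true;
        }
    }
    return false;
}

// Filter a pool of strings harvested from scripts/resources down to probable keys.
std::vector<std::string> find_keys(const std::vector<int32_t> &hash_table,
                                   const std::vector<int32_t> &bucket_table,
                                   const std::vector<std::string> &pool) {
    std::vector<std::string> found;
    for (const std::string &s : pool) {
        if (has_key(hash_table, bucket_table, s)) {
            found.push_back(s);
        }
    }
    return found;
}
```

A hit only means the 32-bit hashes agree, so with a very large pool of strings some false positives are possible and the results would still need a sanity check.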