For one, the persisted data should be compatible between 32bit and 64bit platforms. Th


Crossplatform compatibility about eris HOT 9 CLOSED

fnuecke commented on May 24, 2024

Crossplatform compatibility


Comments (9)

SirVer commented on May 24, 2024

I use this bug report to comment on the project itself as well as on the bug. This is AWESOME! The Widelands project will greatly benefit from this work when we move away from Lua 5.1 to Lua 5.2. I also like the approach of just bundling Lua completely with Eris: a wise choice that will make a lot of stuff easier. We would probably just dump your whole repo into our own, provided you make Eris public domain as Pluto was (I do not think you state a license, do you?). You can see our use of Lua and the heavily modified version of Pluto that we use here: http://bazaar.launchpad.net/~widelands-dev/widelands/trunk/files/head:/src/scripting/.

One thing that I needed to change to get Pluto working for us was to add persistence that is system agnostic, so our pluto takes:

void pluto_persist(lua_State * L, Widelands::FileWrite & fw);

And FileWrite implements Signed8(), Unsigned8(), i.e. methods to write values in a system agnostic way (32/64 bit and endianness agnostic). An interface to hook up a C based writing engine is CRUCIAL to get wide adoption in embedded scenarios imho. Eris should implement a similar interface, i.e. by adding an eris_persist() method that takes a struct of function pointers that write literals (like fw), or a function pointer to a write_atom(const void* patom, some_enum type_of_atom, void* user_data) function that the user can implement the way she likes. The user can then decide the serializing format and if/how they implement endian independence and/or 32/64 bit compatibility.
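To make the write_atom idea above concrete, here is a minimal sketch of what such a callback could look like. Everything in it is hypothetical and exists in neither Eris nor Pluto: the `atom_type` enum, the extra `size` parameter (added so byte strings can be passed), and the `mem_sink` buffer are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical atom kinds; not part of Eris or Pluto. */
typedef enum { ATOM_U8, ATOM_U32, ATOM_BYTES } atom_type;

/* Hypothetical sink: a small fixed-size memory buffer. */
typedef struct {
    uint8_t data[64];
    size_t  len;
} mem_sink;

/* Example callback: the application decides how each atom is encoded.
   Here ATOM_U32 is always stored as 4 little-endian bytes.
   Returns 0 on success, 1 if the sink is full or the type is unknown. */
static int mem_write_atom(const void *patom, atom_type type,
                          size_t size, void *user_data) {
    mem_sink *sink = (mem_sink *)user_data;
    switch (type) {
    case ATOM_U8:
        if (sink->len + 1 > sizeof sink->data) return 1;
        sink->data[sink->len++] = *(const uint8_t *)patom;
        return 0;
    case ATOM_U32: {
        uint32_t v;
        if (sink->len + 4 > sizeof sink->data) return 1;
        memcpy(&v, patom, sizeof v);
        for (int i = 0; i < 4; i++)
            sink->data[sink->len++] = (uint8_t)(v >> (8 * i));
        return 0;
    }
    case ATOM_BYTES:
        if (size > sizeof sink->data - sink->len) return 1;
        memcpy(sink->data + sink->len, patom, size);
        sink->len += size;
        return 0;
    }
    return 1;
}
```

With a design like this the library would only report what kind of value it is writing, and the format (byte order, integer width) would be entirely up to the callback.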

For reference, here is our pluto implementation:
http://bazaar.launchpad.net/~widelands-dev/widelands/trunk/view/head:/src/scripting/pluto.h
http://bazaar.launchpad.net/~widelands-dev/widelands/trunk/view/head:/src/scripting/pluto.cc

This is used a lot in Widelands which is a pretty big project (~1e6 downloads so far) and it has served us great. We are very happy with our pluto version but we want to move to Lua 5.2 in the future and are super happy that you made Eris. What do you need it for btw?


fnuecke commented on May 24, 2024

Thanks for the positive feedback! I think I actually stumbled across Widelands in my search on Lua persistence at some point. Small world. Regarding the license, I just forgot to put a file in there; it's already in the header, though: Eris is MIT licensed, so that shouldn't be a problem.

I'd like Eris to be as standalone as possible, meaning for it to take care of endianness itself; at which point the writer/reader really shouldn't have to care about the raw data anymore. It already has eris_dump/eris_undump, which take a lua_Writer/lua_Reader, so in your case you could probably do something like this:

static int writer(lua_State *L, const void *p, size_t sz, void *ud) {
    Widelands::FileWrite *fw = (Widelands::FileWrite*)ud;
    (void)L; /* unused */
    fw->Data(p, sz);
    return 0; /* a lua_Writer returns 0 on success, non-zero to abort */
}

eris_dump(L, writer, &fw); /* Instead of pluto_persist(L, fw) */

And the equivalent for reading.
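For the reading side, a sketch along the same lines might look as follows. It has the shape of a lua_Reader (return a pointer to the next chunk and store its length in `*sz`; return NULL or a zero size to signal the end). The `mem_source` struct is a hypothetical stand-in for whatever the application reads from, and the forward declaration of `lua_State` is only there so the sketch compiles without the Lua headers; real code would include lua.h instead.

```c
#include <stddef.h>

/* Stand-in so this compiles without lua.h; real code includes lua.h. */
typedef struct lua_State lua_State;

/* Hypothetical source: the whole persisted blob already sits in memory. */
typedef struct {
    const char *data;
    size_t      size;
    int         consumed;
} mem_source;

/* Same shape as a lua_Reader: hands out the blob once, then signals
   end of input by returning NULL with *sz set to 0. */
static const char *mem_reader(lua_State *L, void *ud, size_t *sz) {
    mem_source *src = (mem_source *)ud;
    (void)L; /* unused */
    if (src->consumed || src->size == 0) {
        *sz = 0;
        return NULL;
    }
    src->consumed = 1;
    *sz = src->size;
    return src->data;
}

/* Assumed usage, mirroring the eris_dump call above:
   eris_undump(L, mem_reader, &src); */
```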

I'm just messing around with Minecraft modding a little, somewhat inspired by ComputerCraft; little blocks running Lua programs. Hence my desire for it to be standalone: it's pretty much the only C code in the project (aside from JNLua). Native libraries in Java are oh so much fun...


fnuecke commented on May 24, 2024

Eris will now persist all data in little endian (converting the values if the host is big endian) and save all potentially 64 bit values as 32 bit. This generalization has some limitations, obviously, but I think it will cover most cases, and all the special corner-cases would need some sort of adjustment anyway. I've added a section to the readme file detailing some such special cases I could think of off the top of my head.

I don't have a big endian machine anywhere to test this with, so I could only test it by forcing it to store the data as big endian instead of little endian, but that worked fine. If you run into any issues on big endian machines, please let me know.
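As an illustration of storing data in a fixed byte order, the sketch below writes and reads a 32-bit value as little endian using shifts. This is not necessarily how Eris does it (the comment above describes converting values when the host is big endian); the shift-based variant is shown because it produces the same bytes on any host without detecting the host's byte order. The function names are made up for this example.

```c
#include <stdint.h>

/* Store a 32-bit value as 4 little-endian bytes. The shifts operate on
   the value, not on memory, so the output is host-independent. */
static void store_u32_le(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Inverse of store_u32_le. */
static uint32_t load_u32_le(const uint8_t in[4]) {
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```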


SirVer commented on May 24, 2024

> and save all potentially 64 bit values as 32 bit

Does this mean that all 64 bit values are truncated to 32 bit? That seems very dangerous and wrong. Or do you mean that you persist 64 bit values as two 32 bit values?


fnuecke commented on May 24, 2024

It means 64 bit values are truncated, yes. I do ensure that there's no actual loss of information, though. If there would be, I generate an error, so it should be impossible to output invalid data.

Usually the only 64 bit values will be size_t - the checks for int and Instruction (which is "at least 32 bits") are there just in case, since from what I've read there are systems where int can be 64 bit (not the case for any of my machines, though). For the size_t to exceed the 32 bit boundary you either need an insanely high stack, really big userdata that you persist literally, or, and this may be the only realistically problematic one, a light userdatum using more than the first 32 bits.
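The "truncate, but fail if information would be lost" rule described here can be sketched like this. The function name and the return-code convention are assumptions for illustration; in Eris the failure path would presumably raise a Lua error rather than return -1.

```c
#include <stddef.h>
#include <stdint.h>

/* Narrow a size_t to 32 bits. Returns 0 and stores the result if the
   value fits; returns -1 without touching *out if it does not, so that
   invalid data is never written. */
static int narrow_size_to_u32(size_t value, uint32_t *out) {
    if (value > UINT32_MAX) {
        return -1; /* would lose information: caller reports an error */
    }
    *out = (uint32_t)value;
    return 0;
}
```

On a platform where size_t is itself 32 bits wide the comparison can never be true, so the check costs nothing there.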

I also updated the readme a bit regarding this topic.


SirVer commented on May 24, 2024

Just curious: why did you not opt to serialize 32 bit values as 64 bits instead? There would never be any loss then, and the only thing that would change is that the number of bytes written to the file would be slightly higher. Not so important imho.


fnuecke commented on May 24, 2024

Well, my reasoning was that in most cases it'd be used in 32-bit applications (even if running on a 64-bit machine), so this way there'd be the least overhead from truncation checks in the common case. Plus, as I said, I don't really expect these values to ever become that large. And it actually makes quite a difference size-wise, from what I can tell: for the testsuite, the persisted data is only 80% as large when saving all size_ts as uint32_t instead of uint64_t. That said, the testsuite is not a representative case.

However, I'm currently pondering whether it might be feasible to write the size of the "variable" types into the header and always write the currently "native" size to the file. This way there'd be (theoretically!) no truncation checking when persisting / unpersisting in the same environment, only when transferring data between different architectures, and the generated data would only ever be as large as it needs to be. But that's only if the compiler is clever and optimizes away things like

uint32_t pvalue =  read_uint32_t();
size_t value = (size_t)pvalue;
if ((uint32_t)value != pvalue) { error } else return value;

in cases where size_t == uint32_t. Plus there'd always be an if necessary to determine what int type to read. For each read. So I'm not sure how that would affect performance.
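A rough sketch of the "write the native width into the header" idea follows. The one-byte header field, the function names, and the little-endian byte layout are all assumptions made for this example and do not describe the actual Eris file format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: one byte holding sizeof(size_t) of the writer.
   Returns the number of bytes written. */
static size_t write_header(uint8_t *out) {
    out[0] = (uint8_t)sizeof(size_t);
    return 1;
}

/* Write a size_t at its native width, little endian. No truncation
   check is needed because nothing is narrowed. */
static size_t write_size(uint8_t *out, size_t v) {
    for (size_t i = 0; i < sizeof(size_t); i++)
        out[i] = (uint8_t)(v >> (8 * i));
    return sizeof(size_t);
}

/* Read a size_t that was stored with `width` bytes (taken from the
   header). Returns 0 on success, -1 if the width is unsupported or the
   value does not fit into the local size_t. */
static int read_size(const uint8_t *in, uint8_t width, size_t *out) {
    uint64_t v = 0;
    if (width > 8) return -1;
    for (uint8_t i = 0; i < width; i++)
        v |= (uint64_t)in[i] << (8 * i);
    if (v > SIZE_MAX) return -1; /* only possible if writer was wider */
    *out = (size_t)v;
    return 0;
}
```

When writer and reader run on the same architecture, the range check in read_size can never fail, which matches the goal of having no errors when persisting and unpersisting in the same environment.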

I'll see if I can build some (very basic) kind of benchmark-ish testcase and experiment with this.

Update: all right, that seems to work pretty well, and there was no measurable difference in performance in my pseudo-benchmark. So let's go with that, since it means no errors when persisting and unpersisting on the same machine. Which admittedly seems... a lot cleaner.

Update 2: oh, and thanks for the continued feedback! If not for that I wouldn't really have spent another thought on this, but I think this solution is a lot nicer now, so yeah. Thanks again.


SirVer commented on May 24, 2024

I doubt that the compiler is able to optimize things like this away, really. And performance should not be that much of an issue in a first implementation.

Also, I still think always writing the bigger nbits (like 64) is acceptable. If the user is really concerned about the size of the file, they can provide a writer that gzips or zippys the stream before writing it out (that is what Widelands does); then those zeros should compress nicely.


fnuecke commented on May 24, 2024

I just had a look at some fdump output of gcc, and it actually does optimize it away. The truncation check for the 'local' size_t, that is - the if that is always false. For writing it even optimizes the whole write_size_t function away. So that's not an issue and I'm still happy with how it is right now ;-)
