Giter Site home page Giter Site logo

Large file support about dumb HOT 12 CLOSED

kode54 avatar kode54 commented on August 16, 2024
Large file support

from dumb.

Comments (12)

SimonN avatar SimonN commented on August 16, 2024

I'd put this on the backburner because you'd fix an issue that will probably never hit.

Instead, I'd focus on defining and stabilizing the 2.0.0 API, and it would be perfectly fine for me to have all 2.y.z without 2-GB-module support.

But I can really appreciate this desire to fix obscure corner cases. This would likely get merged if you made a thoroughly-tested PR.

from dumb.

Rondom avatar Rondom commented on August 16, 2024

Indeed it is quite unlikely. But at least it should not crash. I have not checked whether it does, but right now the -1 that the ftell will probably return is saved into the filesize.
I am not sure where it is used, but it might create problems.

Maybe I will create a PR that -- as a first step -- extends the API, but still only supports 2GB files for the file-based-backend, i.e. will cleanly error. Otherwise we would need yet another ABI-bump (or write wrapper code for compatibility) to support the function-pointers with a different signature.

from dumb.

SimonN avatar SimonN commented on August 16, 2024

Sounds good to me -- have this in the 2.0 API where breakage okay, and treat small modules as before. Then add full support later.

from dumb.

kode54 avatar kode54 commented on August 16, 2024

Fixing the ftell so it handles errors is sort of important, but many things in the new interface will automatically fail if the position is greater than the size, and with a negative size, this is always the case. At least I think I made that happen. Not totally sure without actually looking at it again.

from dumb.

SimonN avatar SimonN commented on August 16, 2024

@Rondom, could you propose the new API within 2-3 days? This seems to be the final change to the 2.0 API.

I'd love to adapt my A5 pull request to any changes here, then get it merged. The A5 team plan a release in early-mid October, otherwise music support would wait another 3 months.

from dumb.

kode54 avatar kode54 commented on August 16, 2024

I don't mind changing certain APIs and variables to the declared LONG_LONG macro type. I just need to change all of the relevant loaders and parsers to use that variable type to receive file offsets.

Just note that we may only need to support rejecting files with offsets that large. It is still possible to use the current API to create a 32-bit or smaller window within a large file, and keep your own private base file offset for seeking operations, as well as a private size for the file size operation.

Note that quite a number of the formats are limited to 32 bit addressing anyway. At least Impulse Tracker modules contain 32 bit file offsets to most of the structures. And the three RIFF based formats are limited to RIFF 32 bit, and have never been seen in RF64 format.

from dumb.

SimonN avatar SimonN commented on August 16, 2024

I'm fine with either strategy:

  • Keep the 32-bit API and reject large files/ignore everything after 2 GB.
  • Make the API 64-bit and silently truncate for all 32-bit formats. (Should be equivalent to a value out-of-bounds.)

The publicly visible typedef LONG_LONG is not prefixed with DUMB_. Should it be renamed to DUMB_OFF_T? I don't have enough experience across many different C projects to judge whether there are other meaningful definitions than ours. We guard it with #ifndef, but then it still depends on header order. Expert advice welcome.

Do we even need a macro anymore? C99 defines long long in the C spec as a 64-bit-or-longer signed integer.


If you still prefer the 64-bit API, here's my take:

#define DUMB_OFF_T long long // assumes C99, that's OK. off_t would be signed, too
// remove the define for LONG_LONG or make it internal

typedef struct DUMBFILE_SYSTEM
{
    void *(*open)(const char *filename);
    int (*skip)(void *f, DUMB_OFF_T n);
    int (*getc)(void *f);
    DUMB_OFF_T (*getnc)(char *ptr, DUMB_OFF_T n, void *f);
    void (*close)(void *f);
    int (*seek)(void *f, DUMB_OFF_T n);
    DUMB_OFF_T (*get_size)(void *f);
}
DUMBFILE_SYSTEM;

DUMB_OFF_T dumbfile_pos(DUMBFILE *f);
int dumbfile_skip(DUMBFILE *f, DUMB_OFF_T n);
int dumbfile_seek(DUMBFILE *f, DUMB_OFF_T n, int origin);
DUMB_OFF_T dumbfile_get_size(DUMBFILE *f);
DUMB_OFF_T dumbfile_getnc(char *ptr, DUMB_OFF_T n, DUMBFILE *f);
DUMBFILE *dumbfile_open_memory(const char *data, DUMB_OFF_T size);

DUH *make_duh(DUMB_OFF_T length, /* ... */);
DUMB_OFF_T duh_get_length(DUH *duh);
void duh_set_length(DUH *duh, DUMB_OFF_T length);

from dumb.

Rondom avatar Rondom commented on August 16, 2024

I agree with everything you said. Defining LONG_LONG bothered me as well. I might have a look tonight or tomorrow.

from dumb.

SimonN avatar SimonN commented on August 16, 2024

@kode54: any comments on my suggested API? That's a release candidate of the 2.0 API, please scrutinize: Did I miss functions that need 64-bit offsets? Did I put 64-bit arguments in functions that don't need them?

If you're OK with it, and unless @Rondom has something within 12 hours, I'll implement the above API tomorrow, and make a PR in about 24 hours.

I'll watch the issues and PRs in case anything happens in-between, and I'll also sit in IRC for chat or questions: irc.freenode.net #allegro, or irc.quakenet.org #lix.

from dumb.

Rondom avatar Rondom commented on August 16, 2024

I posted a PR #59. Your suggested changes are fine except that getnc should not return and take off_t, but rather size_t. On a 32-bit system sizeof(off_t)>sizeof(size_t)

Please review carefully, because I agree with you, that it is easy to miss things or change too much.

from dumb.

SimonN avatar SimonN commented on August 16, 2024

Thanks for the work, most of it looked very good already. PR #60 is a smaller change for the remainder.

@kode54: What is a DUH, exactly? How exactly is its length used throughout the code? It's the length of what? Should DUH.length be:

  • size_t (length in memory), probably not because DUMB calls it with -1,
  • dumb_ssize_t (length in memory or error code), or
  • dumb_off_t (length inside a file on disk, negatives possible, e.g., to indicate error)?

PR #60 keeps it as dumb_off_t. My gut instinct is that's correct, it probably describes stuff on disk. But I'm really unsure. While DUH is private, make_duh is public and takes a parameter of length's type. We should get this right. :-)

from dumb.

Rondom avatar Rondom commented on August 16, 2024

Closed by PR #59 and PR #60

from dumb.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.