
Comments (6)

gen2brain commented on August 16, 2024

Close() should close and drop the document, context, and stream, i.e. it should free all the memory. How large are the PDFs? My guess is that this is related to the context being created with a hard-coded FZ_STORE_UNLIMITED. You didn't show all your code, but it looks like you are doing more things, such as extracting text. I know that ImageMagick, for example, can use a LOT of memory when dealing with PDFs; there it can be controlled with the -limit option, but here the value is hard-coded.

You can try changing that to FZ_STORE_DEFAULT (256 << 20, i.e. 256 MiB), or just set some other value. I should probably allow an optional parameter that falls back to the default value; unlimited is not a good idea.

from go-fitz.

conor-nsurely commented on August 16, 2024

@gen2brain
Hey,
Thanks for getting back to me.

The PDFs are about 2-3 MiB on average, the largest being 7-8 MiB.
The PDF pages are converted to images and then sent to the Google Vision API to perform OCR.

Once that is set, will it clean up the old memory once it reaches the limit, or what happens?

So in order to change FZ_STORE_UNLIMITED, I would have to build the package myself, right?


gen2brain commented on August 16, 2024

I have added the MaxStore global variable, so you can set the size you need; the default is 256 << 20.
You can check the header files for docs, e.g. https://github.com/gen2brain/go-fitz/blob/master/include/mupdf/fitz/context.h#L306.
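
For readers unsure what 256 << 20 means in bytes, here is a minimal sketch of the arithmetic. The helper name mib is made up for illustration, and the commented-out assignment assumes the variable is exported as fitz.MaxStore and is presumably read when a document is opened; neither detail is confirmed in this thread.

```go
package main

import "fmt"

// mib converts a size in mebibytes to bytes (n * 2^20).
// Hypothetical helper, not part of go-fitz.
func mib(n int) int {
	return n << 20
}

func main() {
	// 256 << 20 is the default mentioned above: 256 MiB expressed in bytes.
	fmt.Println(mib(256)) // 268435456

	// Assumed usage (not verified here): set the limit before opening
	// any document, for example to cap the store at 64 MiB:
	//   fitz.MaxStore = mib(64)
	fmt.Println(mib(64)) // 67108864
}
```

A smaller value trades some caching for a lower memory ceiling, so it may be worth trying a few sizes against typical PDFs.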


conor-nsurely commented on August 16, 2024

Hey @gen2brain, thanks for fixing this so quickly.

Do you know when this might be released?


gen2brain commented on August 16, 2024

When the new MuPDF libraries are built, which probably will not happen during the summer. Just use @latest for now.


conor-nsurely commented on August 16, 2024

Okay, will do.

Thanks

