friedrichfroebel / didjvu

This project forked from jwilk/didjvu

DjVu encoder with foreground/background separation (Python 3 fork)

Home Page: http://jwilk.net/software/didjvu

License: GNU General Public License v2.0

Python 98.75% Shell 0.53% Makefile 0.72%
djvu djvulibre

didjvu's People

Contributors

dependabot[bot], friedrichfroebel, jwilk

Forkers

rmast jsbien

didjvu's Issues

Replace `nose` for tests

We currently use nose to run the tests. Because this package is officially unmaintained, it has not been adapted to Python 3.10, where collections.Callable no longer exists, so our tests fail there:

python3 didjvu --test --verbose tests/
Traceback (most recent call last):
  File "/home/runner/work/didjvu/didjvu/didjvu", line 25, in <module>
    didjvu.main()
  File "/home/runner/work/didjvu/didjvu/lib/didjvu.py", line 203, in __init__
    parser.parse_args(actions=self)
  File "/home/runner/work/didjvu/didjvu/lib/cli.py", line 308, in parse_args
    o = argparse.ArgumentParser.parse_args(self)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/argparse.py", line 1825, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/argparse.py", line 1858, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/argparse.py", line 2067, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/argparse.py", line 2007, in consume_optional
    take_action(action, args, option_string)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "/home/runner/work/didjvu/didjvu/lib/cli.py", line 129, in __call__
    nose.main(argv=argv)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/core.py", line 118, in __init__
    unittest.TestProgram.__init__(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/unittest/main.py", line 100, in __init__
    self.parseArgs(argv)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/core.py", line 179, in parseArgs
    self.createTests()
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/core.py", line 193, in createTests
    self.test = self.testLoader.loadTestsFromNames(self.testNames)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/loader.py", line 481, in loadTestsFromNames
    return unittest.TestLoader.loadTestsFromNames(self, names, module)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/unittest/loader.py", line 220, in loadTestsFromNames
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/unittest/loader.py", line 220, in <listcomp>
    suites = [self.loadTestsFromName(name, module) for name in names]
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/loader.py", line 431, in loadTestsFromName
    return self.loadTestsFromModule(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/loader.py", line 354, in loadTestsFromModule
    tests.extend(self.loadTestsFromDir(module_path))
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/loader.py", line 183, in loadTestsFromDir
    yield self.loadTestsFromName(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/loader.py", line 431, in loadTestsFromName
    return self.loadTestsFromModule(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/loader.py", line 359, in loadTestsFromModule
    return self.suiteClass(ContextList(tests, context=module))
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/suite.py", line 428, in __call__
    return self.makeSuite(tests, context, **kw)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/suite.py", line 475, in makeSuite
    suite = self.suiteClass(
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/suite.py", line 159, in __init__
    super(ContextSuite, self).__init__(tests)
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/suite.py", line 53, in __init__
    super(LazySuite, self).__init__()
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/unittest/suite.py", line 22, in __init__
    self._tests = []
  File "/opt/hostedtoolcache/Python/3.10.2/x64/lib/python3.10/site-packages/nose/suite.py", line 106, in _set_tests
    if isinstance(tests, collections.Callable) and not is_suite:
AttributeError: module 'collections' has no attribute 'Callable'
make: *** [Makefile:53: test] Error 1

We should therefore look at the maintained alternatives (see also nose-devs/nose#1099). For now, we should probably be able to monkey-patch it using https://stackoverflow.com/a/70641487/.
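A minimal sketch of such a monkey-patch, assuming it runs before nose is imported (for example at the top of the module that calls nose.main). It only restores the alias that the traceback complains about; nose may rely on other removed aliases as well.

```python
import collections
import collections.abc

# Python 3.10 removed the ABC aliases from the top-level collections module.
# Restore the one nose looks up; this has to happen before nose builds its suites.
if not hasattr(collections, 'Callable'):
    collections.Callable = collections.abc.Callable

# Sanity check: the restored alias behaves like the ABC.
assert isinstance(len, collections.Callable)
```

This is a stopgap only; switching to a maintained test runner makes the patch unnecessary.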

XMP backends: Issue with GExiv2

Hi, I see you made two out of three xmp-backends work with the tests.

I'm now investigating the gexiv2/exiv2 issue. The closest match to what I found is this one:

msys2/MINGW-packages#7167

I think the solution would be to hack something in for Ubuntu at this spot:
https://gitlab.gnome.org/GNOME/gexiv2/-/commit/f72dc06ecfedabb6bbd75dc538fd342b59325b26

The issue seems solved for Windows with two different compile options: EXV_UNICODE_PATH or G_OS_WIN32.

I think there should be a compile option to get it working on Ubuntu as well. There are discussions about getting it in as a stream instead, but that seems to be a hard path to follow, so there is no easy road.

Use ipc.Subprocess as context manager

Running the tests results in some ResourceWarnings being reported:

python3 -m unittest discover --start-directory tests/
...................................................................../home/runner/work/didjvu/didjvu/tests/test_ipc.py:82: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
  self._test_signal(name)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/home/runner/work/didjvu/didjvu/tests/test_ipc.py:82: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
  self._test_signal(name)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
/home/runner/work/didjvu/didjvu/tests/test_ipc.py:82: ResourceWarning: unclosed file <_io.BufferedWriter name=4>
  self._test_signal(name)
ResourceWarning: Enable tracemalloc to get the object allocation traceback
.................sssssssssssssssssssss..
----------------------------------------------------------------------

The fix should be similar to the one in ocrodjvu, which had similar issues, that is, to use ipc.Subprocess as a context manager everywhere: FriedrichFroebel/ocrodjvu#8
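A rough sketch of the pattern, using subprocess.Popen from the standard library as a stand-in. It assumes that ipc.Subprocess behaves like a subprocess.Popen subclass, which is not shown in this issue; the child command is made up for the demonstration.

```python
import subprocess
import sys

# Hypothetical child process: reads stdin and echoes it back in upper case.
command = [
    sys.executable,
    '-c',
    'import sys; sys.stdout.write(sys.stdin.read().upper())',
]

# Leaving the with-block closes the pipe objects and waits for the child,
# so no BufferedWriter/BufferedReader is left open for the garbage collector.
with subprocess.Popen(
    command, stdin=subprocess.PIPE, stdout=subprocess.PIPE
) as child:
    output, _ = child.communicate(b'djvu')

print(output)
```

With this form the pipes are closed deterministically, which is what silences the ResourceWarning.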

Analyze memory write operations

Running didjvu bundle -o out.djvu *.tif on a directory which contains 413 images with a total size of 58 MB results in an output file of size 12 MB. The task monitor reports about 12.5 GB of write operations for the didjvu process at the end, which is more than 200 times the input size.

This is considerable overhead and deserves further analysis of which operations cause it and whether there is anything we can optimize here.

The above observations were made with an older commit of this repository running on Python 3.6.
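One possible starting point for such an analysis on Linux is to sample the kernel's per-process I/O counters before and after individual processing steps. The sketch below reads /proc/<pid>/io; the function names are hypothetical, and the counter may not match exactly what the task monitor reported.

```python
from pathlib import Path


def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, _, value = line.partition(':')
        if value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def write_bytes_of(pid='self'):
    """Return the bytes written to storage by a process, or None if unavailable."""
    try:
        return parse_proc_io(Path(f'/proc/{pid}/io').read_text())['write_bytes']
    except (OSError, KeyError):
        # Not on Linux, no permission, or the counter is missing.
        return None


print('write_bytes so far:', write_bytes_of())
```

Calling write_bytes_of() around each stage (image loading, separation, encoding, bundling) would show which stage accounts for most of the writes.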

Support for setup.py

The package uses some custom handling for generating Python packages, based upon a Makefile. It probably would be better to use the standard setup.py approach and define the current didjvu executable file as an entry point.

Deprecation warnings from GExiv2

Running the tests for the GExiv2 backend raises some deprecation warnings:

test_empty (test_xmp.MetadataTestCase) ... /home/runner/work/didjvu/didjvu/didjvu/xmp/gexiv2_backend.py:106: DeprecationWarning: GExiv2.Metadata.generate_xmp_packet is deprecated
  self._meta.generate_xmp_packet(
test_new (test_xmp.MetadataTestCase) ... /home/runner/work/didjvu/didjvu/didjvu/xmp/gexiv2_backend.py:73: DeprecationWarning: GExiv2.Metadata.get_tag_string is deprecated
  value = self._meta.get_tag_string(f'Xmp.{key}')
/home/runner/work/didjvu/didjvu/didjvu/xmp/gexiv2_backend.py:83: DeprecationWarning: GExiv2.Metadata.set_tag_string is deprecated
  self._meta.set_tag_string(f'Xmp.{key}', value)
/home/runner/work/didjvu/didjvu/didjvu/xmp/gexiv2_backend.py:101: DeprecationWarning: GExiv2.Metadata.set_xmp_tag_struct is deprecated
  self._meta.set_xmp_tag_struct('Xmp.xmpMM.History', GExiv2.StructureType.SEQ)

Corresponding upstream docs:

This will require at least GExiv2 0.12.2.

Additionally, it seems that https://gnome.pages.gitlab.gnome.org/gexiv2/docs/gexiv2-Functions-related-to-XMP-metadata.html#gexiv2-metadata-register-xmp-namespace has been deprecated since 0.14.0 as well and should probably be replaced. That version is the latest one, but it is already available in our build environment:

Get:19 http://azure.archive.ubuntu.com/ubuntu jammy/main amd64 libgexiv2-2 amd64 0.14.0-1build1 [81.4 kB]

However, this seems to break compatibility with Ubuntu 20.04, which is officially supported until April 2025, so we might want to guard the second change with a compatibility layer: https://packages.ubuntu.com/search?suite=default&section=all&arch=any&keywords=libgexiv&searchon=names For Ubuntu 18.04, I do not see this as an issue, as it will be EOL in three months anyway.
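One way such a compatibility layer could look is a small helper that prefers the newer method and falls back to the deprecated one. This is only a sketch: pick_method and the two stand-in classes are hypothetical, and the try_* method name is an assumption about the upstream replacement that should be checked against the GExiv2 documentation.

```python
def pick_method(obj, *names):
    """Return the first callable attribute of obj found among names."""
    for name in names:
        method = getattr(obj, name, None)
        if callable(method):
            return method
    raise AttributeError(f'{type(obj).__name__} has none of: {", ".join(names)}')


class OldMetadata:
    """Stand-in for an older GExiv2.Metadata that only has the deprecated call."""

    def get_tag_string(self, tag):
        return f'old:{tag}'


class NewMetadata(OldMetadata):
    """Stand-in for a newer GExiv2.Metadata offering the assumed replacement."""

    def try_get_tag_string(self, tag):
        return f'new:{tag}'


for meta in (OldMetadata(), NewMetadata()):
    getter = pick_method(meta, 'try_get_tag_string', 'get_tag_string')
    print(getter('Xmp.dc.title'))
```

Resolving the method once in the backend's constructor would keep the rest of the code free of version checks.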

Remove numpy fix

Friedrich,

In ancient times, @jwilk made the commit jwilk@78fbbcf, with this changelog entry:

didjvu (0.2.2) unstable; urgency=low

  • Prevent Numpy from being imported by Gamera (loading it takes noticeable
    amount of time, even though it's never needed).

It concerns this line in what is now called gamera_support.py:

lib/gamera_support.py:    sys.modules['numpy'] = None

I'm working on a PDF compressor for OCRmyPDF that makes use of some didjvu libraries, with quite good results; for the moment it points to your python3 branch.
I don't know whether the fix still matters for Python 2.7, but I don't see any speed difference when I comment it out in the Python 3 version. Can you remove that line?
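For context, a small sketch of what that line does: putting None into sys.modules makes any later import of that name fail immediately instead of loading the module. The example uses the standard-library module colorsys as a harmless stand-in for numpy and restores the previous state afterwards.

```python
import sys

NAME = 'colorsys'  # stand-in for 'numpy' in this demonstration

# Remember a possibly loaded module, then block the name.
saved = sys.modules.pop(NAME, None)
sys.modules[NAME] = None

try:
    import colorsys
    blocked = False
except ImportError:
    # Python refuses the import because the sys.modules entry is None.
    blocked = True
finally:
    # Undo the block so later imports work again.
    del sys.modules[NAME]
    if saved is not None:
        sys.modules[NAME] = saved

import colorsys  # works again after the entry has been removed

print('import was blocked:', blocked)
```

Any other code in the same process that needs numpy is blocked as well, which is why the line gets in the way when the library is reused elsewhere.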

A Debian problem

I wanted to give your version a try, but got
No module named PIL; please install the python-pil package
I'm on Debian, so I installed a package which seemed appropriate, i.e.
python3-pypillowfight, but it did not help. pip install python-pil and pip install python3-pil don't work either. (I'm not a programmer, so I work by trial and error.)
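A quick way to check whether the interpreter that runs didjvu can see the PIL module at all is sketched below. As far as I know, the module is provided by the Pillow project (Debian package python3-pil, PyPI name Pillow), so the package names tried above would not supply it; treat that as an assumption to verify on your system.

```python
import importlib.util


def pil_available():
    """Return True if the PIL module can be found by this Python interpreter."""
    return importlib.util.find_spec('PIL') is not None


print('PIL importable:', pil_available())
```

Run it with the same python3 that is used to start didjvu; a result of False means the package is missing for that interpreter.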
