dottedmag / archmage
CHM format converter
License: GNU General Public License v2.0
Hi, thank you for this wonderful work. I tried to convert a Chinese CHM file to PDF. The conversion runs, but the resulting PDF is full of unreadable characters. I suspect an encoding (codec) issue. Is it possible to convert Chinese CHM files, and if so, how? Thank you in advance.
A Chinese CHM file for testing:
chm in chinese.zip
Hi,
as user Finkregh commented on 14 Apr 2015
"the DHTML menu in the left frame does not properly encode special characters (e.g. umlauts: Einführung instead of Einführung)"
Example attached:
i_view32_deutsch.chm: the German-language IrfanView CHM file, attached as i_view32_deutsch.zip
The generated file arch_contents.html shows Einfόhrung instead of Einführung:
<html>
<head>
<title>i_view32_deutsch.chm</title>
<LINK rel="Stylesheet" type="text/css" href="arch_css.css">
</head>
<body onload="setInterval('getLoc()', 500);">
<script>
var lastDoc;
var contents =
[
["Einfόhrung", "", "1",
["Was ist IrfanView?", "hlp_irfanview.htm", "1"],
["Installation", "hlp_installation.htm", "1"],
["IrfanView deinstallieren", "hlp_uninstall.htm", "1"],
["Hδufig gestellte Fragen (FAQs)", "hlp_frequently_asked.htm", "1"],
["Unterstόtzte Formate", "hlp_supported_file.htm", "1"],
["PlugIns", "hlp_plugins.htm", "1"],
["IrfanView Shell Extension", "hlp_shellextension.htm", "1"],
["Hotkeys/Tastenkombinationen", "hlp_hotkeys.htm", "1"],
["Kommandozeilen-Optionen", "hlp_command_line.htm", "1"],
["Symbolleiste", "hlp_toolbar.htm", "1"]],
["Menό: Datei", "", "1",
Inside the HTML content the encoding works fine: Menü is displayed correctly.
Generated file hlp_open.htm:
`
Klicken Sie auf das Datei-Menü, dann auf Öffnen (oder benutzen Sie das Öffnen-Werkzeug auf der Symbolleiste). Ein Dialog ermöglicht Ihnen, eine Datei zum Gebrauch in IrfanView zu öffnen.
`
Could you have a look please?
Thanks,
ramon
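The substitutions in the contents listing above (ό for ü, δ for ä) are what one would get if Windows-1252 bytes were decoded with a Greek code page such as cp1253. That is an inference from the characters alone, not something confirmed in archmage's code. A minimal sketch that reproduces and reverses the effect, using the hypothetical helper `redecode`:

```python
def redecode(text: str, wrong: str, right: str) -> str:
    """Undo a wrong decode: turn text back into bytes using the codec
    that was wrongly applied, then decode with the intended codec."""
    return text.encode(wrong).decode(right)


original = "Einführung"
raw = original.encode("cp1252")   # assumed storage encoding of the German CHM
garbled = raw.decode("cp1253")    # assumed wrong guess: Greek code page

print(garbled)
print(redecode(garbled, wrong="cp1253", right="cp1252"))
```

If this guess is right, the fix belongs where the table of contents is decoded, by using the same code page as for the page bodies, which already render correctly.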
Hello,
with archmage version 0.3.1 you could create HTML files not only from a compiled CHM but also from its source.
For example, I used to build the HTML help from the following CHM source files:
https://salsa.debian.org/dotnet-team/keepass2/tree/master/Docs
With:
import archmod.CHM
archmod.CHM.CHMDir("Docs").process_templates("Docs/Chm")
This no longer appears to work in version 0.4.1:
import archmage.CHM;
archmage.CHM.CHMFile("Docs")
Traceback (most recent call last):
File "test.py", line 3, in <module>
archmage.CHM.CHMFile("Docs")
File "/usr/lib/python3/dist-packages/archmage/CHM.py", line 71, in __init__
self.topicstree = self.topics()
File "/usr/lib/python3/dist-packages/archmage/CHM.py", line 143, in topics
self.cache['topics'] = self._topics()
File "/usr/lib/python3/dist-packages/archmage/CHM.py", line 147, in _topics
for e in self.entries():
File "/usr/lib/python3/dist-packages/archmage/CHM.py", line 81, in entries
self.cache['entries'] = self._entries()
File "/usr/lib/python3/dist-packages/archmage/CHM.py", line 92, in _entries
if chmlib.chm_enumerate(self._chm, chmlib.CHM_ENUMERATE_ALL, get_name, out) == 0:
File "/usr/lib/python3/dist-packages/chm/chmlib.py", line 44, in chm_enumerate
return _chmlib.chm_enumerate(h, what, enumerator, context)
ValueError: Expected valid chmlib object
This is not mentioned in the NEWS file. Was this feature intentionally removed?
Hi @dottedmag
This package is a bit obsolete. Is there a way to replace or update it?
https://pypi.org/project/sgmllib3k
Cheers,
This package needs a new maintainer.
While the CHM format does not change, and few if any new CHM documents are created, any package dealing with untrusted input has to be maintained.
Archmage installs on Windows but does not work. For instance, running archmage -x afile.chm outputdir
(and other parameter combinations too) returns:
Traceback (most recent call last):
File "C:\Python27\Scripts\archmage-script.py", line 11, in <module>
load_entry_point('archmage==0.3.1.post6.dev89962024', 'console_scripts', 'archmage')()
File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 565, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2589, in load_entry_point
return ep.load()
File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2249, in load
return self.resolve()
File "C:\Python27\lib\site-packages\pkg_resources\__init__.py", line 2255, in resolve
module = __import__(self.module_name, fromlist=['__name__'], level=0)
File "build\bdist.win32\egg\archmod\cli.py", line 63, in <module>
File "build\bdist.win32\egg\archmod\CHM.py", line 43, in <module>
File "build\bdist.win32\egg\archmod\chmtotext.py", line 27, in <module>
AttributeError: 'module' object has no attribute 'SIGPIPE'
I built chmlib and pychm with this batch file and tested pychm with this Python script, which works, but Archmage doesn't.
I am on Ubuntu 16.04 and installed the archmage package from the default repository. I use archmage to serve a CHM file that contains non-ASCII characters, specifically Turkish letters such as ı, ğ, ç and ö. In the titles of the linked pages, these characters are replaced by strange sequences such as A%, but they render correctly in the body text of the pages. Is this related to the Python environment, or is it a failure in how the HTML titles are rendered, given that the page text itself is fine?
If a CHM file without a ToC is passed to Archmage, it produces an HTML page with empty ToC and content panes, so there is no way to navigate to the content of the CHM file.
If ToC is not found by Archmage, it should hide the ToC pane in the output and display a meaningful page in the content pane.
archmage -d does not include images and other non-HTML leaf files in the output.
They can be included using the data: URI scheme.
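A minimal sketch of the data: URI idea, assuming the image bytes and the entry's file name are already available. The helper name `to_data_uri` is hypothetical and not part of archmage:

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode a binary leaf file (e.g. an image) as a data: URI."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        # Unknown extension: fall back to a generic binary type.
        mime = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# The result can replace the src attribute of an <img> tag.
print(to_data_uri(b"GIF89a", "icons/1.gif"))
```

Base64 makes the payload about a third larger, so this suits small icons better than large images.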
python3 compatible?
If a file with a broken CHM directory file entry is passed to Archmage, it prints a traceback:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x88 in position 3: invalid start byte
These malformed files should be detected and ignored, and a warning presented to the user to notify that some links in the output will be broken.
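One way the "detect, ignore, and warn" behaviour could look, as a sketch only. It assumes entry names arrive as raw bytes and are meant to be UTF-8; `decode_entry_names` is a hypothetical helper, not archmage's API:

```python
import warnings
from typing import Iterable, List, Optional, Tuple


def decode_entry_name(raw: bytes) -> Optional[str]:
    """Return the decoded name, or None if the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        warnings.warn(f"skipping malformed CHM entry name {raw!r}: {exc}")
        return None


def decode_entry_names(raw_names: Iterable[bytes]) -> Tuple[List[str], int]:
    """Decode all names, skipping broken ones; also return the skip count."""
    names: List[str] = []
    skipped = 0
    for raw in raw_names:
        name = decode_entry_name(raw)
        if name is None:
            skipped += 1
        else:
            names.append(name)
    return names, skipped
```

The caller can use the skip count to tell the user that some links in the output will be broken.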
If a non-CHM file is passed to Archmage it outputs a traceback:
ValueError: Expected valid chmlib object
Archmage should catch the problem and print a meaningful error.
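A cheap pre-check is possible because CHM files begin with the 4-byte signature ITSF. The sketch below tests for it before anything is handed to chmlib; the function name `looks_like_chm` is hypothetical:

```python
CHM_MAGIC = b"ITSF"  # signature at the start of a CHM (ITSS) file


def looks_like_chm(path: str) -> bool:
    """Return True if the file at path starts with the CHM signature."""
    try:
        with open(path, "rb") as f:
            return f.read(len(CHM_MAGIC)) == CHM_MAGIC
    except OSError:
        # Missing file, directory, or permission problem: not a usable CHM.
        return False
```

A command-line entry point could call this first and exit with a one-line message instead of a traceback. A matching signature does not guarantee a well-formed file, so the ValueError should still be caught as well.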
can you add me as a collaborator
At least the following:
the DHTML menu in the left frame does not properly encode special characters (e.g. umlauts: Einführung instead of Einführung)
I'm getting a "ValueError: signal only works in main thread" in my error_log when attempting to view a .chm file with the archmage mod_python component in Apache. FYI, I am running 64-bit Gentoo and the following versions of software:
apache-2.2.14
mod_python-3.3.1
beautifulsoup-3.0.7
chmlib-0.40
pychm-0.8.4
archmage-0.2.4
This FAQ seems to have information on the problem, but not being a programmer I'm not sure how to fix it:
http://twistedmatrix.com/trac/wiki/FrequentlyAskedQuestions#Igetexceptions.ValueError:signalonlyworksinmainthreadwhenItrytorunmyTwistedprogramWhatswrong
here is the complete output in error_log:
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] mod_python (pid=2971, interpreter='localhost', phase='PythonHandler', handler='archmod.mod_chm'): Application error, referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] ServerName: 'localhost', referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] DocumentRoot: '/var/www/localhost/htdocs', referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] URI: '/reference/pages/prog/perl/The_Perl_CD_Bookshelf_4.0_oreilly_2004.chm/', referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] Location: None, referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] Directory: None, referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] Filename: '/nfs/media/reference/pages/prog/perl/The_Perl_CD_Bookshelf_4.0_oreilly_2004.chm', referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] PathInfo: '/', referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] Traceback (most recent call last):, referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] File "/usr/lib64/python2.6/site-packages/mod_python/importer.py", line 1537, in HandlerDispatch\n default=default_handler, arg=req, silent=hlist.silent), referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] File "/usr/lib64/python2.6/site-packages/mod_python/importer.py", line 1202, in _process_target\n module = import_module(module_name, path=path), referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] File "/usr/lib64/python2.6/site-packages/mod_python/importer.py", line 304, in import_module\n return __import__(module_name, {}, {}, ['*']), referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] File "/usr/lib64/python2.6/site-packages/archmod/mod_chm.py", line 5, in <module>\n from archmod.CHM import CHMFile, referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] File "/usr/lib64/python2.6/site-packages/archmod/CHM.py", line 26, in <module>\n from archmod.chmtotext import chmtotext, referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] File "/usr/lib64/python2.6/site-packages/archmod/chmtotext.py", line 11, in <module>\n signal.signal(signal.SIGPIPE, signal.SIG_DFL), referer: http://localhost/reference/gen/perl_programming.html
[Sun Jan 03 14:58:31 2010] [error] [client 127.0.0.1] ValueError: signal only works in main thread, referer: http://localhost/reference/gen/perl_programming.html
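The traceback ends at a module-level signal.signal(signal.SIGPIPE, signal.SIG_DFL) call in chmtotext.py. That call can fail in two ways: with ValueError when the module is imported outside the main thread (as under mod_python), and with AttributeError on Windows, where the signal module has no SIGPIPE. A hedged sketch of a guarded version follows; the function name `restore_default_sigpipe` is hypothetical:

```python
import signal


def restore_default_sigpipe() -> bool:
    """Try to restore default SIGPIPE handling; report whether it worked."""
    sigpipe = getattr(signal, "SIGPIPE", None)
    if sigpipe is None:
        # Platform without SIGPIPE (e.g. Windows): nothing to do.
        return False
    try:
        signal.signal(sigpipe, signal.SIG_DFL)
    except ValueError:
        # signal.signal() may only be called from the main thread.
        return False
    return True
```

Calling this from the command-line entry point, rather than at import time, would also keep the library importable from web server handlers.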
I have downloaded "Extended HTML Help" from http://www.php.net/download-docs.php
Then I started archmage:
archmage -p 1111 /mnt/data/marek/Internet/help/php_ext/php_manual_en.chm
After reloading http://localhost:1111/, archmage prints the following output:
localhost - - [26/Sep/2007 22:55:13] "GET / HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:14] "GET /arch_header.html HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:14] "GET /arch_frameset.html?page=_index.html HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:14] "GET /arch_contents.html HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:14] "GET /_index.html HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:14] "GET /arch_css.css HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:14] "GET /_script.js HTTP/1.1" 200 -
Exception happened during processing of request from ('127.0.0.1', 47811)
Traceback (most recent call last):
File "SocketServer.py", line 222, in handle_request
self.process_request(request, client_address)
File "SocketServer.py", line 241, in process_request
self.finish_request(request, client_address)
File "SocketServer.py", line 254, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "SocketServer.py", line 521, in __init__
self.handle()
File "BaseHTTPServer.py", line 316, in handle
self.handle_one_request()
File "BaseHTTPServer.py", line 310, in handle_one_request
method()
File "/usr/lib/python2.4/site-packages/archmod/CHM.py", line 414, in do_GET
self.wfile.write(self.server.CHM.get_entry_by_name(pagename))
File "/usr/lib/python2.4/site-packages/archmod/CHM.py", line 89, in get_entry_by_name
raise NameError, 'There is no %s' % name
localhost - - [26/Sep/2007 22:55:14] "GET /arch_css.css HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:17] "GET /icons/99.gif HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:17] "GET /icons/1.gif HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:17] "GET /icons/91.gif HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:17] "GET /icons/93.gif HTTP/1.1" 200 -
localhost - - [26/Sep/2007 22:55:17] "GET /icons/94.gif HTTP/1.1" 200 -
It looks like the HTML output is completely missing its CSS formatting. (php_manual_prefs.js exists and works; I tested it on Windows in the same directory.)
Should this be placed in a bug category?
First off, thanks for maintaining this project. Thanks to it, I've been able to convert a .chm to HTML, and it works perfectly!
So $ archmage -x my_file.chm works fine. But when I try to convert to PDF with $ archmage -c pdf my_file.chm, I get:
Traceback (most recent call last):
File "/home/kees/Projects/archmage/env/bin/archmage", line 33, in <module>
sys.exit(load_entry_point('archmage', 'console_scripts', 'archmage')())
File "/home/kees/Projects/archmage/archmage/cli.py", line 192, in main
source.htmldoc(options.output, options.mode)
File "/home/kees/Projects/archmage/archmage/CHM.py", line 402, in htmldoc
self.extract_entries(
File "/home/kees/Projects/archmage/archmage/CHM.py", line 352, in extract_entries
self.extract_entry(
File "/home/kees/Projects/archmage/archmage/CHM.py", line 327, in extract_entry
Entry(
File "/home/kees/Projects/archmage/archmage/CHM.py", line 496, in correct
data = re.sub("<div .*teamlib\\.gif.*\\/div>", "", data)
File "/usr/lib/python3.8/re.py", line 210, in sub
return _compile(pattern, flags).sub(repl, string, count)
TypeError: cannot use a string pattern on a bytes-like object
This looks like the kind of error that results from an incomplete conversion to Python 3. (I've also installed archmage in a Python 2 virtualenv, but that fails on a print statement: print(msg, file=outfp).)
I've installed archmage from source, commit 8b7f3cd. Python is 3.8.5. The install output was: Successfully installed archmage beautifulsoup4-4.9.3 pychm-0.8.6 sgmllib3k-1.0.0 soupsieve-2.1
Once again, thanks for your work maintaining this, and if I need to provide more info, let me know!
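The TypeError in the traceback comes from applying a str pattern to bytes data. One possible fix, shown here as a sketch rather than the project's actual patch, is to use a bytes pattern and a bytes replacement. The function name `strip_teamlib_div` is hypothetical; the pattern is the one from the traceback:

```python
import re

# Bytes version of the pattern from CHM.py line 496 in the traceback.
TEAMLIB_DIV = re.compile(rb"<div .*teamlib\.gif.*/div>")


def strip_teamlib_div(data: bytes) -> bytes:
    """Remove the teamlib.gif <div> from a page held as bytes."""
    return TEAMLIB_DIV.sub(b"", data)


page = b'<p>a</p><div class="x"><img src="teamlib.gif"></div><p>b</p>'
print(strip_teamlib_div(page))
```

The alternative is to decode the page to str before any regex runs, which requires knowing the page's encoding at that point.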