transpect / docx2tex
Converts Microsoft Word docx to LaTeX
License: BSD 2-Clause "Simplified" License
[low priority]
If I pull the docx2tex code from the repo now, the d2t script fails and I get the following errors in the .d2t.log
This does not happen if I download the release 1.1.
It is not a problem for me, but I thought I should report it.
ERROR: http://transpect.github.io/../index.html:1:107:Not a pipeline or library: html
ERROR: err:XS0044:Unexpected step name: tr:load-cascaded
ERROR: It is a static error if any element in the XProc namespace or any step has element children other than those specified for it by this specification. In particular, the presence of atomic steps for which there is no visible declaration may raise this error.
Citavi 6 seems to store its references as base64-encoded JSON in field codes. There has been a request to transform them into BibTeX.
We will do this if we receive at least € 960 of external funding for it. The user who requested the feature is currently considering sponsorship (that is, committing to the full amount).
I have a docx file that leads to the following pdflatex error (I can provide the file by private email).
Underfull \hbox (badness 10000) in paragraph at lines 36--36
\OT1/cmr/bx/n/10.95 mun-the-ra-pie
Underfull \hbox (badness 10000) in paragraph at lines 36--36
\OT1/cmr/bx/n/10.95 nicht ge-eig-net
Underfull \hbox (badness 10000) in paragraph at lines 36--36
[]|\OT1/cmr/bx/n/10.95 nicht
Underfull \hbox (badness 10000) in paragraph at lines 36--36
\OT1/cmr/bx/n/10.95 quan-ti-fi-
Underfull \hbox (badness 10000) in paragraph at lines 36--36
[]|\OT1/cmr/m/n/10.95 Idelalisib/Rituximab f[]uhrt zu ei-ner
Underfull \hbox (badness 3354) in paragraph at lines 36--36
\OT1/cmr/m/n/10.95 Verl[]angerung der pro-gres-si-ons-frei-en und des
Underfull \hbox (badness 3536) in paragraph at lines 36--36
\OT1/cmr/m/n/10.95 Ge-samt[]uberlebenszeit so-wie zu ei-ner Stei-ge-
Underfull \hbox (badness 10000) in alignment at lines 36--36
[][][]
! LaTeX Error: There's no line here to end.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.40 \newline
Dear Sirs/Madams,
I need to convert a number of docx files to LaTeX so I have downloaded your tool on my xubuntu 19.04 laptop. Regrettably, when I try to run your code an error message is displayed:
$ ./d2t ~/Documents/Introduction.docx
starting docx2tex
Errors encountered while running docx2tex. Please see /home/eidon/Documents/Introduction.d2t.log for details.
$ cat Introduction.d2t.log
./d2t: line 203: /home/eidon/packages/docx2tex-master/calabash/calabash.sh: No such file or directory
From this I understood that I needed to install calabash, which I did by running
$ java -jar xmlcalabash-1.1.27-99.jar
Despite this, the error is still there. Would you be so kind as to help me? Thank you very much!
Kind regards,
Eidon
Desiderata
Text that has font effect "hidden" is translated as normal text, even if it does not display in a pdf generated from the docx and it does not display in the document when the "formatting symbol" button (¶) is not active.
In the following example, the word "bbb" is "hidden".
Would it be possible to have it translated into something like \@gobble{bbb}?
https://medialab.sissa.it/owncloud/index.php/s/VFtUaKfo3chdV82
./d2t "6 Interferometrische Sensoren/160228 6_Interferometrische Sensoren.docx"
./d2t: line 87: [: too many arguments
./d2t: line 121: [: too many arguments
./d2t: line 143: $LOG: ambiguous redirect
./d2t: line 146: [: too many arguments
starting docx2tex
./d2t: line 167: $LOG: ambiguous redirect
Errors encountered while running docx2tex. Please see /Users/ajung/src/docx2tex/6 Interferometrische Sensoren/6
Interferometrische
160228
6_Interferometrische
Sensoren.docx
.docx.d2t.log for details.
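The "[: too many arguments" and "ambiguous redirect" messages above are what bash prints when an unquoted variable holding a path with spaces is split into several words. Until the script itself quotes its variables, one possible workaround is to convert a copy whose path contains no whitespace. The following is a minimal sketch of that idea; the function name, the staging directory and the underscore replacement are assumptions made for illustration and are not part of docx2tex.

```python
import re
import shutil
import tempfile
from pathlib import Path


def space_free_copy(src, workdir=None):
    """Copy src into workdir under a name without whitespace; return the new path."""
    src = Path(src)
    # Hypothetical staging directory; a fresh temporary directory is used by default.
    workdir = Path(workdir) if workdir else Path(tempfile.mkdtemp(prefix="d2t_"))
    workdir.mkdir(parents=True, exist_ok=True)
    # Replace every run of whitespace in the file name with a single underscore.
    safe_name = re.sub(r"\s+", "_", src.name)
    target = workdir / safe_name
    shutil.copy2(src, target)
    return target
```

The returned path can then be passed to ./d2t in place of the original file. The staging directory itself must also be free of spaces for this to help.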
I think that the file JINST_001P_1018.docx translates correctly in
1ebf3cf (Oct 17 change mapping of x131)
but fails in
74c5b4d (Nov 7 update mml2tex)
and following versions (but I think with different errors).
Any chance of adding MathType support?
How would I configure the conf.xml to produce an article instead of a book? Is there a basic conf.xml for articles?
I have a document where I used Endnote to manage the references. The file is the same as in #3.
The superscripted numbers in the main text that correspond to the references are all replaced by \href{}{}, so the resulting pdf shows nothing where the superscripted numbers should be.
hello,
When I run d2t, below error occurs:
cp: '../modelo-resumo-semana-conhecimento-2019.docx' and '/usr/src/modelo-resumo-semana-conhecimento-2019.docx' are the same file
INFO : xpl/docx2tex.xpl:197:38:No custom-font-maps loaded.
ERROR: xproc-util/load/xpl/load.xpl:0:load-error:Could not load file:/usr/src/docx2tex/conf/conf.csv (file:///usr/src/docx2tex/xproc-util/load/xpl/load.xpl) dtd-validate=false
ERROR: xproc-util/load/xpl/load.xpl:0:load-error:Could not load file:/usr/src/docx2tex/conf/conf.csv (file:///usr/src/docx2tex/xproc-util/load/xpl/load.xpl) dtd-validate=false
Message: Mode: insert-xpath
Message: Mode: docx2hub:preprocess-styles
Message: Mode: docx2hub:resolve-tblBorders
Message: Mode: docx2hub:add-props
Message: Mode: docx2hub:props2atts
Message: Mode: docx2hub:remove-redundant-run-atts
Message: Mode: docx2hub:join-instrText-runs
Message: Mode: docx2hub:field-functions
Message: Mode: wml-to-dbk
Message: Mode: docx2hub:join-runs
Message: Mode: hub:twipsify-lengths
Message: Mode: hub:split-at-tab
Message: Mode: hub:identifiers
Message: Mode: hub:tabs-to-indent
Message: Mode: hub:handle-indent
Message: Mode: hub:prepare-lists
Message: Mode: hub:lists
Message: Mode: hub:postprocess-lists
Message: Mode: docx2tex-preprocess
Message: Mode: docx2tex-postprocess
INFO : cascade/xpl/load-cascaded.xpl:43:59:load-cascaded: using file:/usr/src/docx2tex/xml2tex/xsl/xml2tex.xsl
INFO : cascade/xpl/load-cascaded.xpl:43:59:load-cascaded: using file:/usr/src/docx2tex/xml2tex/xsl/calstable2tabular.xsl
WARN : file:///usr/src/docx2tex/xslt-util/functx/xsl/functx.xsl:35:66:Stylesheet module http://transpect.io/xslt-util/functx/xsl/functx.xsl is included or imported more than once. This is permitted, but may lead to errors or unexpected behavior
INFO : cascade/xpl/load-cascaded.xpl:43:59:load-cascaded: using file:///usr/src/docx2tex/mml-normalize/xsl/mml-normalize.xsl
Message: Mode: mml2tex-grouping
Message: Mode: mml2tex-preprocess
INFO : cascade/xpl/load-cascaded.xpl:43:59:load-cascaded: using file:/usr/src/docx2tex/mml2tex/xsl/invoke-mml2tex.xsl
WARN : err:SXXP0005:The source document is in namespace http://docbook.org/ns/docbook, but none of the template rules match elements in this namespace (Use --suppressXsltNamespaceCheck:on to avoid this warning)
Message: Mode: escape-bad-chars
Message: Stylesheet compilation failed: 2 errors reported
Message: [FATAL ERROR]: XSLT mode 'escape-bad-chars' failed due to conversion errors.
ERROR: xproc-util/xslt-mode/xpl/xslt-mode.xpl:0:xslt-mode-escape-bad-chars:Stylesheet compilation failed: 2 errors reported
ERROR: xproc-util/xslt-mode/xpl/xslt-mode.xpl:0:xslt-mode-escape-bad-chars:Stylesheet compilation failed: 2 errors reported
ERROR: xproc-util/xslt-mode/xpl/xslt-mode.xpl:0:xslt-mode-escape-bad-chars:Stylesheet compilation failed: 2 errors reported
ERROR: Unknown error
Java version:
java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
Please note that spaces after comments are lost.
Comment created with "Review" -> "New Comment" (Word 2010).
MWE: https://medialab.sissa.it/owncloud/index.php/s/bjGGPuQemdt6maH
I get ! Argument of \align* has an extra } when I run d2t -d -p test.docx.
There is a Stackexchange question related to this.
I’d probably go with the {$\begin{aligned}…\end{aligned}$}
approach proposed there.
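A minimal sketch of that approach, assuming amsmath is loaded in the preamble. The two equations are made up for illustration and do not come from test.docx. Unlike align*, the aligned environment is designed to be used inside math mode, so it can be wrapped in $...$ within a group:

```latex
% Requires \usepackage{amsmath}; the equation content is hypothetical.
{$\begin{aligned}
  a &= b + c \\
  d &= e - f
\end{aligned}$}
```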
Improper TeX output is being generated for the attached DOCX file.
Overfull \hbox (18.6093pt too wide) in paragraph at lines 65--66
\OT1/cmr/m/n/10.95 ovial-sarkome. Die h^^?aufigsten We-ichteil-sarkome des Erwa
ch-se-nen sind in Tabelle 1 aufgef^^?uhrt.
4 [5] [6]
Chapter 3.
! Undefined control sequence.
l.74 ... 3-gradige Klassifikationsschema der {\grq
}French Federation of Canc...
We have DOCX files where the authors often embed PowerPoint files.
This case is not handled properly.
! LaTeX Error: Unknown graphics extension: .emf.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.429 ...16t125157.docx.tmp/word/media/image1.emf}
? x
Ideally, .emf files would be converted to proper SVGs or PNGs.
If this is not possible, they should be removed and not carried forward into the LaTeX output.
Perhaps a removed image could be replaced with a placeholder or a warning message.
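The placeholder idea can be sketched as a post-processing step on the generated .tex file. This is not something docx2tex does, only an illustration under stated assumptions: the function name and the placeholder wording are hypothetical, and the pattern only covers \includegraphics calls with at most one optional argument.

```python
import re
from pathlib import PurePosixPath

# Matches \includegraphics[...]{path/to/file.emf}; the optional argument may be absent.
EMF_GRAPHIC = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^{}]*\.emf)\}", re.IGNORECASE)


def replace_emf_graphics(tex):
    """Swap every .emf \\includegraphics call for a boxed placeholder text."""
    def placeholder(match):
        # Keep only the file name so the placeholder stays short.
        name = PurePosixPath(match.group(1)).name
        # \detokenize keeps characters such as "_" from breaking the LaTeX run.
        return r"\fbox{Missing image: \texttt{\detokenize{" + name + "}}}"
    return EMF_GRAPHIC.sub(placeholder, tex)
```

Images in other formats are left untouched, so the function can be applied to the whole file.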
(/opt/local/share/texmf-texlive/tex/latex/latexconfig/epstopdf-sys.cfg))
(/opt/local/share/texmf-texlive/tex/latex/hyperref/nameref.sty
(/opt/local/share/texmf-texlive/tex/generic/oberdiek/gettitlestring.sty))
Chapter 1.
! Undefined control sequence.
l.37 ...den Sie daf"{u}r die Formatvorlage {\glqq
}"{U}berschrift 1{\grqq}.
I can send you the related DOCX file by private email.
I tried to convert Word files into LaTeX files, but it failed.
The failure message is
"...FATAL: Failed to parse Saxon configuration file.
java.nio.file.InvalidPathException: Illegal char <*> at index 96: C:\Users\alpac\Documents\GitHub\docx2tex\calabash\extensions\transpect\javascript-extension\lib*..."
help me, please...OTL
Hi, I am looking for the source code that generates a LaTeX document from a Google Doc.
I wonder if this is the script used by docx2latex.com. With this script I am not getting the same result, so perhaps I am missing something.
Thanks in advance,
I would like certain characters in the output tex file to stay in UTF-8 (à) instead of being translated into LaTeX macros ('a). Is this possible?
I looked in the <charmap> of conf.xml, but the characters that I want are not there.
I also looked at fontmaps but, if I understand correctly, they do the opposite of what I want.
Thanks
aaa.docx
[][]
Underfull \hbox (badness 10000) in alignment at lines 90--90
[][][]
[1{/usr/local/texlive/2015/texmf-var/fonts/map/pdftex/updmap/pdftex.map}]
Overfull \hbox (1.19997pt too wide) in alignment at lines 102--102
[][]
Overfull \hbox (1.19997pt too wide) in alignment at lines 114--114
[][]
Overfull \hbox (1.19997pt too wide) in alignment at lines 128--128
[][]
Overfull \hbox (1.19997pt too wide) in alignment at lines 141--141
[][]
! Undefined control sequence.
<argument> \Micro
_{0}ɛ_{0}
l.154 \end{tabularx}
?
Hello,
After updating jvm, I can no longer use docx2tex to produce tex files from docx files with the ./d2t command. I use mac OS via terminal for conversions and currently I have the following version of java:
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode).
I would like to know whether the problem really lies with the Java version on my computer, whether anyone else has already encountered it and, if possible, how to fix it.
Thank you very much in advance.
The generated log follows.
Exception in thread "main" java.lang.NoClassDefFoundError: javax/activation/DataSource
at java.base/java.lang.Class.getDeclaredMethods0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredMethods(Class.java:3172)
at java.base/java.lang.Class.getMethodsRecursive(Class.java:3313)
at java.base/java.lang.Class.getMethod0(Class.java:3299)
at java.base/java.lang.Class.getMethod(Class.java:2112)
at com.xmlcalabash.core.XProcRuntime.initializeSteps(XProcRuntime.java:317)
at com.xmlcalabash.core.XProcRuntime.<init>(XProcRuntime.java:272)
at com.xmlcalabash.drivers.Main.run(Main.java:100)
at com.xmlcalabash.drivers.Main.main(Main.java:83)
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 9 more
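The frames prefixed with java.base/ only appear on Java 9 or later, where the javax.activation classes are no longer available by default (they were removed from the JDK in Java 11). That suggests the java found by d2t is newer than the 1.8.0_211 reported above. A minimal sketch for checking which major version a given java binary reports; the function name is hypothetical and the parsing only covers the usual version "..." banner.

```python
import re


def java_major(version_output):
    """Return the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Releases up to Java 8 report themselves as 1.x (for example 1.8.0_211).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first
```

Note that java -version writes its banner to standard error, so a caller would read it with something like subprocess.run(["java", "-version"], capture_output=True, text=True).stderr.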
The current version of the d2t.bat file doesn't work correctly; it has 2 issues:
The attached patch fixes both these issues
windows_fixes.diff.txt
Hi, I'm trying to install docx2tex on my Mac running El Capitan. Towards the end I get this:
Submodule path 'mml2tex': checked out '03430be79a70b283679cfc1cb1529da5a044f41f'
Cloning into 'schema/hub'...
The authenticity of host 'github.com (192.30.252.130)' can't be established.
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'github.com,192.30.252.130' (RSA) to the list of known hosts.
Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Clone of 'git@github.com:le-tex/Hub.git' into submodule path 'schema/hub' failed
I'm not sure if this is a problem on my end or not, as I'm something of a newbie to Github.
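"Permission denied (publickey)" indicates that the schema/hub submodule was being fetched over SSH, which only works with an SSH key registered at GitHub. The same repository can usually be reached anonymously over HTTPS. The sketch below shows the address rewrite involved; the function name is hypothetical, and whether the HTTPS address is accepted for this particular repository is an assumption.

```python
import re

# scp-like SSH address: git@host:owner/repo.git
SSH_URL = re.compile(r"^git@([^:/]+):(.+)$")


def ssh_to_https(url):
    """Rewrite git@host:path to https://host/path; leave other URLs unchanged."""
    match = SSH_URL.match(url)
    if match is None:
        return url
    host, path = match.groups()
    return f"https://{host}/{path}"
```

Git can apply the same rewrite itself through its url.<base>.insteadOf configuration setting, which avoids editing .gitmodules by hand.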
─diamon@diamon-ThinkPad-13 ~/projects/docx2tex ‹system› ‹master*›
╰─$ ./d2t test.docx
starting docx2tex
Errors encountered while running docx2tex. Please see /home/diamon/projects/docx2tex/test.d2t.log for details.
╭─diamon@diamon-ThinkPad-13 ~/projects/docx2tex ‹system› ‹master*›
╰─$ cat test.d2t.log 1 ↵
cp: 'test.docx' and '/home/diamon/projects/docx2tex/test.docx' are the same file
Exception in thread "main" java.lang.NoClassDefFoundError: javax/activation/DataSource
at java.base/java.lang.Class.getDeclaredMethods0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredMethods(Class.java:3119)
at java.base/java.lang.Class.getMethodsRecursive(Class.java:3260)
at java.base/java.lang.Class.getMethod0(Class.java:3246)
at java.base/java.lang.Class.getMethod(Class.java:2065)
at com.xmlcalabash.core.XProcRuntime.initializeSteps(XProcRuntime.java:317)
at com.xmlcalabash.core.XProcRuntime.<init>(XProcRuntime.java:272)
at com.xmlcalabash.drivers.Main.run(Main.java:100)
at com.xmlcalabash.drivers.Main.main(Main.java:83)
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:190)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:499)
... 9 more
Hi, I have some errors translating the following file:
https://medialab.sissa.it/owncloud/index.php/s/R3Fc2PXHdlK2LEy
I think I have the latest version of docx2tex.
I get the attached log
JINST_079P_1018.d2t.log
I am trying to convert a .docx file that is an article with equations, figures, and even references introduced with Endnote.
Using the master (since with the last pre-release (0.3), I was getting the same reported error that was solved recently), and running:
$ docx2tex-master/d2t -o test test.docx
I am getting the following errors:
ERROR: docx2tex-master/xproc-util/load/xpl/load.xpl:0:load-error:Could not load file:/usr/people/jmdamas/docx2tex-master/conf/conf.csv (file:///usr/people/jmdamas/docx2tex-master/xproc-util/load/xpl/load.xpl) dtd-validate=false
ERROR: file:///usr/people/jmdamas/docx2tex-master/mml2tex/xsl/mml2tex.xsl:339:err:XPTY0004:An empty sequence is not allowed as the third argument of replace()
ERROR: An empty sequence is not allowed as the third argument of replace()
ERROR: cause: file:///usr/people/jmdamas/docx2tex-master/mml2tex/xsl/mml2tex.xsl:339:err:XPTY0004:An empty sequence is not allowed as the third argument of replace()
ERROR: An empty sequence is not allowed as the third argument of replace()
ERROR: cause: file:///usr/people/jmdamas/docx2tex-master/mml2tex/xsl/mml2tex.xsl:339:err:XPTY0004:An empty sequence is not allowed as the third argument of replace()
ERROR: Pipeline failed: An empty sequence is not allowed as the third argument of replace()
ERROR: Underlying exception: An empty sequence is not allowed as the third argument of replace()
In the first ERROR, I don't understand why the file can't be loaded, since it is there.
The other errors are all related to a replace function, but I can't understand the origin.
I tried to run with a shorter version of the .docx (just the first 5 or 6 pages, with some equations), and I didn't get any errors. I tried to remove only the Endnote references (I thought they might be a problem) and tested it, and it gave me the errors again. I could go into trial-and-error mode, trying to identify which part of the document is causing the problem, but I don't think that's a solution.
Can you give me some tips on how to solve this?
Oh, I am running this on an Ubuntu 12.04 with JAVA 1.7.0_80.
Meanwhile, I am using an older (?) version of this software in codeplex (https://docx2tex.codeplex.com/releases/view/19618), that is working well on Windows.
Why is there a default value of conf.csv when the pipeline actually expects an XML file (that it then loads with tr:load)?
The translation of the attached file results in a lot of macros in the following form:
\TimesNewRoman{41 43...
As soon as I open the docx in word and save it, the problem disappears.
Might be related to issues/25
(please do not distribute the attached file)
wj.docx
Hi guys,
I am having this issue. Does this sound familiar to you?
./d2t -o tmpp ~/workspace/ets/phd/thesis/versions/v1.11.docx
starting docx2tex
Errors encountered while running docx2tex. Please see /Users/david/opt/docx2tex/tmpp/v1.11.d2t.log for details.
Log file:
Message: Mode: insert-xpath
ERROR: file:/Users/david/opt/docx2tex/docx2hub/xsl/insert-xpath.xsl:223:err:XTTE0780:An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: cause: file:/Users/david/opt/docx2tex/docx2hub/xsl/insert-xpath.xsl:223:err:XTTE0780:An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: cause: file:/Users/david/opt/docx2tex/docx2hub/xsl/insert-xpath.xsl:223:err:XTTE0780:An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: cause: file:/Users/david/opt/docx2tex/docx2hub/xsl/insert-xpath.xsl:223:err:XTTE0780:An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: cause: file:/Users/david/opt/docx2tex/docx2hub/xsl/insert-xpath.xsl:223:err:XTTE0780:An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: Pipeline failed: An empty sequence is not allowed as the result of function tr:theme-font()
ERROR: Underlying exception: An empty sequence is not allowed as the result of function tr:theme-font()
I am using the master branch.
In fact, I don't really care about the system font. I don't know if there is a way to ignore this error and continue?
Hi, it is possible that some error crept in with commit e219cf8
I have a working d2t until commit 53d293b, but with the next one I get some errors.
Example:
git checkout --recurse-submodules 53d293b
d2t Untitled.docx
docx2tex finished.
git checkout --recurse-submodules e219cf8
d2t Untitled.docx
Errors encountered while running docx2tex. Please see Untitled.d2t.log for details.
I want to use docx2tex to test whether it can convert MathType equations to LaTeX.
I use the most recent docx2tex release.
I got the error message below:
FATAL: Failed to parse Saxon configuration file.
java.nio.file.InvalidPathException: Illegal char <*> at index 67: C:\docx2tex\calabash\extensions\transpect\javascript-extension\lib*
This is my docx file.
equation.docx
Please note that, in a fragment similar to the following,
the spaces after "Instrum." and after "88" are lost
and the boldface of "88" is also lost (or applied to the whole hyperlink maybe)
...<w:t xml:space="preserve">Rev. Sci. Instrum. </w:t></w:r>
<w:r w:rsidR="00A51604" w:rsidRPr="002E01AB"><w:rPr><w:rStyle w:val="af1"/><w:b/><w:bCs/><w:color w:val="auto"/><w:lang w:val="en-US"/></w:rPr>
<w:t xml:space="preserve">88 </w:t>...
TeX becomes:
Rev. Sci. Instrum.88(2017) 033504
MWE here:
https://medialab.sissa.it/owncloud/index.php/s/qPdO4qMBWdU28RH
This is a minor issue, but when I run docx2tex with the -o option, the *.tmp folder, which contains things like the media folder with the images to be inserted, is placed outside the -o folder, so the path for the images is wrong and they are not loaded when the .tex is compiled.
Cheers
Following is the error I get when I run it in Windows 10 using this command
d2t.bat filename.docx
ERROR: There is a binding for the port 'conf' but the pipeline declares no such port.
Probably similar to issue #24
In the file below, the space after \overline{g} and \overline{n} is "lost".
(I think that docx processors should "trim" equations, so I'm not sure if this is a d2t error, but I'm reporting because the final appearance is different between docx and tex)
https://medialab.sissa.it/owncloud/index.php/s/t1f3dW0Vop7DZih
Please note that in the translation of the file given below, figure 4 and its caption become (line 89 in the tex file):
\等线{46 69 ...
There is something similar at line 330.
I tried to isolate the problematic part only, but I get a different output (below), so I'm providing the whole document, but please do not distribute it
\DengXian{46 69 ...
Problematic file and translation:
https://medialab.sissa.it/owncloud/index.php/s/4APWPgtO5slLkuA
[OT] I'm opening many issues, please feel free to stop me if I'm too pesky :-)
The reference file is the same as #3.
I am not sure how docx2tex should be dealing with these issues, but while symbols like the alpha or beta characters are converted to $\alpha$ or $\beta$, other symbols are not being recognized. Some examples include the minus or times signs, the apostrophe (for example, see «Kramers'» in the file), or tildes (which should be converted to \~{} or \textasciitilde{}).
Moreover, some accented characters, like in my name, are not being recognized, but comparing with the output from codeplex's docx2tex, I identified this problem to be the lack of \usepackage[utf8]{inputenc} in the preamble. Am I correct?
About the lone symbols, is there anything that can be done?
Also, can docx2tex recognize the differences between types of dashes (see here)?
Cheers
I've added
<!-- soft hyphen -->
<char character="­" string="\-"/>
in conf.xml on l.613
and now the translation seems good
Please find a MWE here:
https://medialab.sissa.it/owncloud/index.php/s/9v5oRQ48oXZnZv8
P.S. I've just started using docx2tex and it seems very very good. Thanks a lot :-)
docx2tex running on
http://public.zopyx.com/lungenkarzinom-nicht-kleinzellig-nsclc.docx
generates improper LaTeX. The DOCX structure is possibly improper; however, the converter should
perhaps not generate improper output but add a message to the log.
[Loading MPS to PDF converter (version 2006.09.02).]
) (/opt/local/share/texmf-texlive/tex/latex/oberdiek/epstopdf-base.sty
(/opt/local/share/texmf-texlive/tex/latex/oberdiek/grfext.sty)
(/opt/local/share/texmf-texlive/tex/latex/latexconfig/epstopdf-sys.cfg))
(/opt/local/share/texmf-texlive/tex/latex/hyperref/nameref.sty
(/opt/local/share/texmf-texlive/tex/generic/oberdiek/gettitlestring.sty))
[1{/opt/local/var/db/texmf/fonts/map/pdftex/updmap/pdftex.map}] [2]
Chapter 1.
! LaTeX Error: Lonely \item--perhaps a missing list environment.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.29 2.\item \chapter
{Grundlagen}
? c
Type <return> to proceed, S to scroll future error messages,
R to run without stopping, Q to run quietly,
I to insert something, E to edit your file,
1 or ... or 9 to ignore the next 1 to 9 tokens of input,
H for help, X to quit.
Message: docx2hub error on unzipping.
Zip file seems to be corrupted: /infektionen-bei-haematologischen-und-onkologischen-patienten-uebersicht.docx (No such file or directory)
ERROR: err:XD0001:Only whitespace text nodes can appear at the top level in a document
ERROR: err:XD0001:Only whitespace text nodes can appear at the top level in a document
ERROR: err:XD0001:Only whitespace text nodes can appear at the top level in a document
ERROR: It is a dynamic error if a non-XML resource is produced on a step output or arrives on a step input.
I can provide the sample file by email since Github does not support DOCX uploads.
The issue appears to be specific to MacOSX. Converting the same file on Linux works.
I am receiving the following error for a given DOCX document (sorry, I can not provide the source
due to non-disclosure reasons).
Package hyperref Warning: Suppressing link with empty target on input line 59.
Package hyperref Warning: Suppressing link with empty target on input line 61.
! LaTeX Error: Lonely \item--perhaps a missing list environment.
See the LaTeX manual or LaTeX Companion for explanation.
Type H <return> for immediate help.
...
l.61 \href{}{1.1}\item \href
{}{Besondere Darstellungen im Handbuch }
58
59 \href{}{1. Aufbau des Handbuchs }
60
61 \href{}{1.1}\item \href{}{Besondere Darstellungen im Handbuch }
62
63 \href{}{1.2}\item \href{}{Zielgruppe }
64
65 \href{}{1.3}\item \href{}{Die Themenabschnitte des Handbuchs im "{U}berblick }
Fresh installation using:
git clone https://github.com/transpect/docx2tex --recursive
Any conversion with d2t gives me the same error
cp: ‘/tmp/docx-samples/160229_Wolff_Sensor_Technologien/all/1_Einleitung.docx’ and ‘/tmp/docx-samples/160229_Wolff_Sensor_Technologien/all/1_Einleitung.docx’ are the same file
ERROR: xml2tex/xpl/xml2tex.xpl:71:65:err:XS0052:Cannot import: http://transpect.io/mml2tex/xpl/mml2tex.xpl
ERROR: cause: I/O error reported by XML parser processing http://transpect.io/mml2tex/xpl/mml2tex.xpl: http://transpect.github.io/mml2tex/xpl/mml2tex.xpl
ERROR: It is a static error if the URI of a p:import cannot be retrieved or if, once retrieved, it does not point to a p:library, p:declare-step, or p:pipeline.
ERROR: Underlying exception: I/O error reported by XML parser processing http://transpect.io/mml2tex/xpl/mml2tex.xpl: http://transpect.github.io/mml2tex/xpl/mml2tex.xpl
In the two documents from the link below, the translation loses most (all?) newlines:
https://medialab.sissa.it/owncloud/index.php/s/dCH1jidpubcSFQM
I tried a simple edit, save and translate again without success.
The reference file is the same as #3.
I am experiencing some issues with subscripts and superscripts in the reference file.
\textit{k}$_cat$/
instead of \textit{k}$_{cat}$/
.$^−1$
(maybe related with issue #5?)
C$^$\alpha$$
. This screws things up, and a lot of the non-equation text following it is pulled into equation mode.
Cheers
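Until the converter emits the braces itself, the first two cases could be patched in the generated .tex with a small post-processing step. This is a deliberately naive sketch under stated assumptions: the function name is hypothetical, it only touches a bare sub- or superscript that fills a whole $...$ span, and it does not attempt the nested C$^$\alpha$$ case.

```python
import re

# "$_cat$" or "$^−1$": a script marker followed by two or more plain characters.
BARE_SCRIPT = re.compile(r"\$([_^])([^\s${}\\]{2,})\$")


def brace_scripts(tex):
    """Wrap multi-character sub/superscript arguments in braces."""
    return BARE_SCRIPT.sub(r"$\1{\2}$", tex)
```

Single-character scripts and scripts that are already braced are left as they are.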
In the following docx file, the space between "40" and "MHz" is lost:
https://medialab.sissa.it/owncloud/index.php/s/zkxFGDvNAehVatl
I'm not sure if it is an error, but I'm reporting it because the appearance of the tex/pdf and docx file differ.
Bug Report:
My OS: Linux Gentoo Base System release 2.24.1.12 64 bit PC desktop
Java: 1.8.0_66
Shell: bash 4.3.42 (x86_64-pc-linux-gnu)
Install: cd /home/el/bin; git clone https://github.com/transpect/docx2tex --recursive
The input docx has a few unicode shenanigans, but nothing too out of band: http://www.filedropper.com/examplefail
Run your code: cd /home/el/bin/docx2tex; ./d2t ExampleFail.docx
Failure .log File: http://www.filedropper.com/examplefaild2t
What I expected: I expected some kind of output file ExampleFail.tex containing latex code.
Quarantining the bug, proving the bug isn't on my side:
Use libreoffice version 5.2.3.3 -writer to create a new empty .docx document containing the ASCII text asdf.
Save the above file as Untitled.docx using the Microsoft Word 2007-2013 XML (.docx) format.
Openoffice -writer produces this Untitled.docx: http://www.filedropper.com/untitled_22
Run the code: cd /home/el/bin/docx2tex; ./d2t Untitled.docx
docx2tex works as expected, the contents of Untitled.tex
render by pdflatex to a similar looking pdf:
The problem is in the table layouts.
The reference file is the same as #3.
docx2tex is recognizing this and using \underset{<limits>}{\sum} instead of the more common \sum\nolimits. In fact, this is also happening in the equations that are not in-text, but I am guessing that's because they are inside the tabular environment.
\substack (maybe \mathclap could also help here)
Cheers
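For comparison, the constructions named in this report can be sketched as follows, assuming amsmath is loaded. The summands and limits are made up for illustration and are not taken from the reference file.

```latex
% Requires \usepackage{amsmath}; summands and limits are hypothetical.
% Form described in the report: the limit is stacked with \underset.
$\underset{i}{\sum} x_i$
% Form the report calls more common: the limit is set beside the operator.
$\sum\nolimits_{i} x_i$
% \substack stacks several conditions under a single operator.
\[ \sum_{\substack{i \le n \\ j \ne i}} x_{ij} \]
```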
In the following file, a newline is lost between "...Fast ICA Algorithm" and "FastICA disintegrates...":
https://medialab.sissa.it/owncloud/index.php/s/PXm3ktw0LFYXVti
However, please note that in the original docx, there are a couple of spaces missing in "3.1Denoisingby" (probably a typo), but when I tried to add them (to get "3.1 Denoising by") and save the file, the missing newline magically appears in the TeX translation and I can see no error.
Hi, I have tons of docx files to transform, but this process is very time-consuming. I found that it generates some temporary files along the way, and I guess this may be the problem. I am not good at shell scripting; could you please offer a solution? Many thanks.
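For converting many files without shell scripting, a loop around the d2t script is one option. The following is a minimal sketch, not an official batch mode: the function names are hypothetical, the script location ./d2t is taken from the invocations shown in other reports, and it is an unconfirmed assumption that d2t signals failure through a non-zero exit status.

```python
import subprocess
from pathlib import Path


def plan_conversions(folder, d2t="./d2t"):
    """Return one argument list per .docx file in folder, sorted by name."""
    return [[d2t, str(path)] for path in sorted(Path(folder).glob("*.docx"))]


def convert_all(folder, d2t="./d2t"):
    """Run the planned commands one after another and collect the failures."""
    failed = []
    for command in plan_conversions(folder, d2t):
        # Passing a list avoids shell word splitting on file names with spaces.
        result = subprocess.run(command)
        if result.returncode != 0:
            failed.append(command[-1])
    return failed
```

Calling convert_all with the folder that holds the documents returns the paths that did not convert, which can then be inspected one by one through their .d2t.log files. The loop does not make a single conversion faster; it only removes the manual work between files.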
I get the following error when I try to translate this file:
INFO : file:..docx2tex/xpl/docx2tex.xpl:188:38:No custom-font-maps loaded.
ERROR: file:...docx2tex/xproc-util/load/xpl/load.xpl:0:load-error:Could not load file:...docx2tex/conf/conf.csv (file:...docx2tex/xproc-util/load/xpl/load.xpl) dtd-validate=false
...
https://medialab.sissa.it/owncloud/index.php/s/BZFHlref5mB3uAS
(I think my installation is ok, because I can translate other documents)