
texmath's Introduction

texmath


texmath is a Haskell library for converting between formats used to represent mathematics. Currently it provides functions to read and write TeX math, presentation MathML, and OMML (Office Math Markup Language, used in Microsoft Office), and to write GNU eqn, typst, and pandoc's native format (allowing conversion, using pandoc, to a variety of different markup formats). The TeX reader and writer support basic LaTeX and AMS extensions, and they can parse and apply LaTeX macros. The package also includes several utility modules which may be useful for anyone looking to manipulate either TeX math or MathML. For example, a copy of the MathML operator dictionary is included.

You can try it out online here.

By default, only the Haskell library is installed. To install a test program, texmath, use the executable Cabal flag:

cabal install -fexecutable

By default, the executable will be installed in ~/.cabal/bin.

Alternatively, texmath can be installed using stack. Install the stack binary somewhere in your path. Then, in the texmath repository,

stack setup
stack install --flag texmath:executable

The texmath binary will be put in ~/.local/bin.

Macro definitions may be included before a LaTeX formula.

Running texmath as a server

texmath will behave as a CGI script when called under the name texmath-cgi (e.g. through a symbolic link). The file cgi/texmath.html contains an example of how it can be used.

But it is also possible to compile a full webserver with a JSON API. To do this, set the server cabal flag, e.g.

stack install --flag texmath:server

To run the server on port 3000:

texmath-server -p 3000

Sample of use, with httpie:

% http --verbose localhost:3000/convert text='2^2' from=tex to=mathml display:=false Accept:'text/plain'
POST /convert HTTP/1.1
Accept: text/plain
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 64
Content-Type: application/json
Host: localhost:3000
User-Agent: HTTPie/3.1.0

{
    "display": false,
    "from": "tex",
    "text": "2^2",
    "to": "mathml"
}


HTTP/1.1 200 OK
Content-Type: text/plain;charset=utf-8
Date: Mon, 21 Mar 2022 18:29:26 GMT
Server: Warp/3.3.17
Transfer-Encoding: chunked

<math display="inline" xmlns="http://www.w3.org/1998/Math/MathML">
  <msup>
    <mn>2</mn>
    <mn>2</mn>
  </msup>
</math>

Possible values for from are tex, mathml, and omml. Possible values for to are tex, mathml, omml, eqn, and pandoc (JSON-encoded Pandoc).

Alternatively, you can use the convert-batch endpoint to pass in a JSON-encoded list of conversions and get back a JSON-encoded list of results.
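
As a rough illustration of what a convert-batch request body might look like, here is a minimal sketch that renders a list of conversions as JSON using only the base library. It assumes that each list item carries the same fields as the /convert example above (text, from, to, display); that shape is an assumption, not something stated here, so check it against the server source. The names Conversion, jsonString, conversionToJson and batchToJson are made up for this sketch.

```haskell
import Data.Char (ord)
import Data.List (intercalate)
import Numeric (showHex)

-- One conversion request, mirroring the fields of the /convert example.
-- (Hypothetical type, not part of texmath.)
data Conversion = Conversion
  { convText    :: String
  , convFrom    :: String
  , convTo      :: String
  , convDisplay :: Bool
  }

-- Render a Haskell string as a JSON string literal.
jsonString :: String -> String
jsonString s = "\"" ++ concatMap esc s ++ "\""
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc '\n' = "\\n"
    esc c
      | ord c < 0x20 = "\\u" ++ pad (showHex (ord c) "")
      | otherwise    = [c]
    pad h = replicate (4 - length h) '0' ++ h

-- Render a single conversion as a JSON object.
conversionToJson :: Conversion -> String
conversionToJson c = "{" ++ intercalate "," fields ++ "}"
  where
    fields =
      [ "\"text\":" ++ jsonString (convText c)
      , "\"from\":" ++ jsonString (convFrom c)
      , "\"to\":" ++ jsonString (convTo c)
      , "\"display\":" ++ (if convDisplay c then "true" else "false")
      ]

-- ASSUMPTION: the batch body is a JSON array of such objects.
batchToJson :: [Conversion] -> String
batchToJson cs = "[" ++ intercalate "," (map conversionToJson cs) ++ "]"
```

Backslashes in TeX input are the main thing to get right here: `\frac` must be sent as `\\frac` inside the JSON string, which is what jsonString does.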

Generating lookup tables

There are three main lookup tables which are built from externally compiled lists. This section contains information about how to modify and regenerate these tables.

In the lib directory there are sub-directories which contain the necessary files.

MMLDict.hs

The utility program xsltproc is required. You can find these files in lib/mmldict/.

  1. If desired, replace unicode.xml with an updated version (you can download a copy from here)
  2. xsltproc -o dictionary.xml operatorDictionary.xsl unicode.xml
  3. runghc generateMMLDict.hs
  4. Replace the operator table at the bottom of src/Text/TeXMath/Readers/MathML/MMLDict.hs with the contents of mmldict.hs

ToTeXMath.hs

You can find these files in lib/totexmath/

  1. If desired, replace unimathsymbols.txt with an updated version from here
  2. runghc unicodetotex.hs
  3. Replace the record table at the bottom of src/Text/TeXMath/Unicode/ToTeXMath.hs with the contents of UnicodeToLaTeX.hs

ToUnicode.hs

You can find these files in lib/tounicode/.

  1. If desired, replace UnicodeData.txt with an updated version from here.
  2. runghc mkUnicodeTable.hs
  3. Replace the table at the bottom of src/Text/TeXMath/Unicode/ToUnicode.hs with the output.

Editing the tables

It is not necessary to edit the source files to add records to the tables. To add to or modify a table, it is easier to modify either unicodetotex.hs or generateMMLDict.hs. This is easily achieved by adding an item to the corresponding updates list. After making the changes, follow the above steps to regenerate the table.

The test suite

To run the test suite, do cabal test or stack test.

In its standard mode, the test suite will run golden tests of the individual readers and writers. Reader tests can be found in test/reader/{mml,omml,tex}, and writer tests in test/writer/{eqn,mml,omml,tex}. Regression tests linked to specific issues are in test/regression.

Each test file consists of an input and an expected output. The input begins after a line <<< FORMAT and the output begins after a line >>> FORMAT.
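
To make the file layout concrete, here is a minimal sketch of a parser for it, using only the base library. It is not the test suite's own code; the GoldenTest type and parseGolden function are hypothetical names, and the sketch assumes that the `<<< FORMAT` marker is the first line of the file and that no line of the input itself starts with `>>> `.

```haskell
import Data.List (isPrefixOf)

-- Parsed contents of one golden test file. (Hypothetical type.)
data GoldenTest = GoldenTest
  { inputFormat  :: String
  , inputBody    :: String
  , outputFormat :: String
  , outputBody   :: String
  } deriving (Eq, Show)

-- Split a file into the part after "<<< FORMAT" and the part after
-- ">>> FORMAT". Returns Nothing if either marker line is missing.
parseGolden :: String -> Maybe GoldenTest
parseGolden contents =
  case break (">>> " `isPrefixOf`) (lines contents) of
    (inHeader : inLines, outHeader : outLines)
      | "<<< " `isPrefixOf` inHeader ->
          Just GoldenTest
            { inputFormat  = drop 4 inHeader
            , inputBody    = unlines inLines
            , outputFormat = drop 4 outHeader
            , outputBody   = unlines outLines
            }
    _ -> Nothing
```

For a file containing `<<< tex`, then `2^2`, then `>>> mathml`, then `<math/>`, this yields an input of format `tex` with body `2^2` and an expected output of format `mathml` with body `<math/>`.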

If many tests fail as a result of changes, but the test failures are all because of improvements in the output, you can pass --accept to the test suite (e.g., with --test-arguments=--accept on stack test), and the existing golden files will be overwritten. If you do this, inspect the outputs very carefully to make sure they are correct.

If you pass the --roundtrip option into the test suite (e.g., using --test-arguments=--roundtrip with stack test), round-trip tests will be run instead. Many of these will fail. In these tests, the native inputs in test/roundtrip/*.native will be converted to (respectively) mml, omml, or tex, then converted back, and the result will be compared with the starting point. Although we don't guarantee that this kind of round-trip transformation will be the identity, looking at cases where it fails can be a guide to improvements.
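
The round-trip check described above can be sketched generically as follows. This is an illustration of the idea only, not the test suite's implementation: the writer and reader are passed in as plain functions, and showInts and readInts are toy stand-ins for a real writer/reader pair.

```haskell
-- Outcome of writing a value out and reading it back in.
data RoundTrip a e
  = Identical        -- came back unchanged
  | Changed a        -- parsed, but differs from the starting point
  | ReadFailed e     -- the reader rejected the writer's output
  deriving (Eq, Show)

-- Write with the given writer, read back with the given reader,
-- and compare the result with the starting point.
roundTrip :: Eq a => (a -> b) -> (b -> Either e a) -> a -> RoundTrip a e
roundTrip write read' start =
  case read' (write start) of
    Left err -> ReadFailed err
    Right back
      | back == start -> Identical
      | otherwise     -> Changed back

-- Toy writer: a list of Ints rendered with show.
showInts :: [Int] -> String
showInts = show

-- Toy reader: the inverse of showInts.
readInts :: String -> Either String [Int]
readInts s = case reads s of
  [(xs, "")] -> Right xs
  _          -> Left ("cannot parse: " ++ s)
```

With these stand-ins, `roundTrip showInts readInts [1,2,3]` is `Identical`, while a lossy writer such as `showInts . map abs` applied to `[-1,2]` gives `Changed [1,2]`. The `Changed` cases are the ones worth inspecting for possible improvements.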

Authors

John MacFarlane wrote the original TeX reader, MathML writer, eqn writer, and OMML writer. Matthew Pickering contributed the MathML reader, the TeX writer, and many of the auxiliary modules. Jesse Rosenthal contributed the OMML reader. Thanks also to John Lenz for many contributions.

texmath's People

Contributors

0xflotus, cartazio, despresc, hagb, hvr, jgm, jkr, megakite, meimax, micket, minoki, mpickering, rekka, rwst, siemanko, sjakobi, tarleb, wilx, xou, zopa

texmath's Issues

Line break elements omitted converting mathml to latex

Hello there. I found an issue when converting MathML to LaTeX. When the MathML markup contains a line break like

<mspace linebreak="newline"/>

there is no corresponding line break in the LaTeX output, so the LaTeX does not break the line.
Will you fix it? Thanks!

Render amsmath symbols in mathml output

I use pandoc to write notes during lectures and make heavy use of inline TeX math. While I use kokoi for live preview during editing, which produces HTML5 files including MathML formulas, the end result is saved as PDF using XeLaTeX.
I found that math containing symbols from amsmath isn't rendered to MathML.

Example:
$5 \equiv 1 (\mod{2})$ where \mod{} is part of amsmath.

First, is it even possible to render amsmath symbols in MathML or doesn't MathML itself support it? Second, are there plans to support MathML output of amsmath in texmath?

Regards

`\rightarrow` should become `\to` in limits

ping @mpickering

ESymbol Rel \8594 is interpreted as \rightarrow in limits. But for the LaTeX to be typeset correctly, it should be translated to \to:

> let Right x = readTeXMath "\\lim_{x \\to \\infty} y"
> x
[EDown (EMathOperator "lim") (EGrouped [EIdentifier "x",ESymbol Rel "\8594",ESymbol Ord "\8734"]),EIdentifier "y"]
> toTeXMath DisplayInline x
"{\\lim}_{{x}\\rightarrow \\infty }{y}"

This probably only needs to be implemented for

EDown (EMathOperator "lim") ...

although there might be some uses for implementing it for

EDown (EMathOperator s) ... | s `elem` ["lim", "max", "min"]

(That would take care of the main things that should be typeset like that, even if off the top of my head I can't make any useful sense of "max as n approaches...")
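
The proposed rule could be sketched roughly like this. The Exp type below is a cut-down stand-in that copies only the constructors appearing in this issue, and render is a toy renderer rather than texmath's TeX writer; both are assumptions made for illustration.

```haskell
-- Cut-down stand-ins for the texmath types used in this issue.
data TeXSymbolType = Ord | Rel deriving (Eq, Show)

data Exp
  = EIdentifier String
  | EMathOperator String
  | ESymbol TeXSymbolType String
  | EGrouped [Exp]
  | EDown Exp Exp
  deriving (Eq, Show)

-- Operators under which U+2192 should be written as \to.
limitOperators :: [String]
limitOperators = ["lim", "max", "min"]

-- Toy renderer: the Bool tracks whether we are inside the
-- subscript of one of the limit operators.
render :: Exp -> String
render = go False
  where
    go _ (EIdentifier s)   = s
    go _ (EMathOperator s) = "\\" ++ s
    go inLimit (ESymbol _ "\8594") =
      if inLimit then "\\to " else "\\rightarrow "
    go _ (ESymbol _ "\8734") = "\\infty "
    go _ (ESymbol _ s)       = s
    go inLimit (EGrouped es) = concatMap (go inLimit) es
    go _ (EDown base sub) =
      go False base ++ "_{" ++ go (isLimit base) sub ++ "}"

    isLimit (EMathOperator s) = s `elem` limitOperators
    isLimit _                 = False
```

The arrow is only rewritten inside the subscript of a listed operator, so an arrow anywhere else still comes out as `\rightarrow`.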

I'd submit a pull request, but I'm not sure of the workflow here. @jgm -- would you prefer these things go through your pr queue, or to maintainer and then up to you, kernel style?

Handling decimal point

Summary

The decimal point isn't handled properly.

Steps To Reproduce

$ pandoc --mathml <<EOF
\$0.1\$
EOF

Actual Results

<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0</mn><mo>.</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">0.1</annotation></semantics></math></p>

Expected Results

<p><math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mn>0.1</mn></mrow><annotation encoding="application/x-tex">0.1</annotation></semantics></math></p>

Instead of <mn>0</mn><mo>.</mo><mn>1</mn> we should have <mn>0.1</mn>.

Environment Information

$ uname -a
Linux pupunha 3.19.3-3-ARCH #1 SMP PREEMPT Wed Apr 8 14:10:00 CEST 2015 x86_64 GNU/Linux
$ cd /path/to/texmath
$ git log -1
commit f0499d0652c9e68f0e2c8ea5601a217b88528de5
Author: John MacFarlane <[email protected]>
Date:   Sun Apr 12 20:35:34 2015 -0700

    Updated changelog.
$ cd /path/to/pandoc
$ git log -1
commit 4b2f469994143728c2ffe22aa37370b90a054e15
Merge: 55b7afc 7031748
Author: John MacFarlane <[email protected]>
Date:   Wed Apr 29 16:17:40 2015 -0700

    Merge pull request #2123 from hellofloat/master

    Added woff2 to MIME types

texmath tests fail with xml 1.3.12

Hi,

The texmath tests run fine with xml 1.3.10. With xml 1.3.12
the texmath tests fail, the results are included below (for
ghc 7.2.2, they also fail with ghc 7.4.1-rc1). This is with:

LANG=en_AU.UTF-8
LC_CTYPE=en_AU.UTF-8

  • Package: dev-haskell/texmath-0.5.0.4
  • Repository: gentoo-haskell
  • Maintainer: [email protected]
  • USE: amd64 consolekit doc elibc_glibc hoogle hscolour kernel_linux multilib policykit profile test userland_GNU
  • FEATURES: nostrip sandbox splitdebug test userpriv usersandbox

    Unpacking source...
    Unpacking texmath-0.5.0.4.tar.gz to /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work
    Source unpacked in /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work
    Preparing source in /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4 ...
    Source prepared.
    Configuring source in /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4 ...

  • Using cabal-1.11.1.20110721.
    /usr/bin/ghc -package Cabal-1.11.1.20110721 --make /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4/Setup.hs -dynamic -o setup
    [1 of 1] Compiling Main ( /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4/Setup.hs, /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4/Setup.o )

/var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4/Setup.hs:1:1:
Warning: In the use of `runTests'
(imported from Distribution.Simple, but defined in Distribution.Simple.UserHooks):
Deprecated: "Please use the new testing interface instead!"
Linking setup ...
./setup configure --ghc --prefix=/usr --with-compiler=/usr/bin/ghc --with-hc-pkg=/usr/bin/ghc-pkg --prefix=/usr --libdir=/usr/lib64 --libsubdir=texmath-0.5.0.4/ghc-7.2.2 --datadir=/usr/share/ --datasubdir=texmath-0.5.0.4/ghc-7.2.2 --with-haddock=/usr/bin/haddock --enable-library-profiling --ghc-option=-optl-Wl,-O1 --ghc-option=-optl-Wl,--as-needed --disable-executable-stripping --docdir=/usr/share/doc/texmath-0.5.0.4 --verbose --flags=-cgi --flags=test
Configuring texmath-0.5.0.4...
Flags chosen: test=True, cgi=False
Dependency base ==4.*: using base-4.4.1.0
Dependency containers -any: using containers-0.4.1.0
Dependency parsec >=2: using parsec-3.1.2
Dependency syb -any: using syb-0.3.6
Dependency xml -any: using xml-1.3.12
Using Cabal-1.11.1.20110721 compiled by ghc-7.2
Using compiler: ghc-7.2.2
Using install prefix: /usr
Binaries installed in: /usr/bin
Libraries installed in: /usr/lib64/texmath-0.5.0.4/ghc-7.2.2
Private binaries installed in: /usr/libexec
Data files installed in: /usr/share/texmath-0.5.0.4/ghc-7.2.2
Documentation installed in: /usr/share/doc/texmath-0.5.0.4
Using alex version 3.0.1 found on system at: /usr/bin/alex
Using ar found on system at: /usr/bin/ar
Using c2hs version 0.16.3 found on system at: /usr/bin/c2hs
Using cpphs version 1.12 found on system at: /usr/bin/cpphs
No ffihugs found
Using gcc version 4.5.3 found on system at: /usr/bin/gcc
Using ghc version 7.2.2 given by user at: /usr/bin/ghc
Using ghc-pkg version 7.2.2 given by user at: /usr/bin/ghc-pkg
No greencard found
Using haddock version 2.9.4 given by user at: /usr/bin/haddock
Using happy version 1.18.8 found on system at: /usr/bin/happy
No hmake found
Using hsc2hs version 0.67 found on system at: /usr/bin/hsc2hs
Using hscolour version 1.19 found on system at: /usr/bin/HsColour
No hugs found
No jhc found
Using ld found on system at: /usr/bin/ld
No lhc found
No lhc-pkg found
No nhc98 found
Using pkg-config version 0.26 found on system at: /usr/bin/pkg-config
Using ranlib found on system at: /usr/bin/ranlib
Using strip found on system at: /usr/bin/strip
Using tar found on system at: /bin/tar
Using texmath found on system at: dist/build/texmath/texmath
Using texmath-cgi found on system at: dist/build/texmath-cgi/texmath-cgi
No uhc found

Source configured.
Compiling source in /var/tmp/portage/dev-haskell/texmath-0.5.0.4/work/texmath-0.5.0.4 ...
./setup build
Building texmath-0.5.0.4...
Preprocessing executable 'texmath' for texmath-0.5.0.4...
[1 of 6] Compiling Text.TeXMath.Types ( Text/TeXMath/Types.hs, dist/build/texmath/texmath-tmp/Text/TeXMath/Types.o )
[2 of 6] Compiling Text.TeXMath.MathML ( Text/TeXMath/MathML.hs, dist/build/texmath/texmath-tmp/Text/TeXMath/MathML.o )
[3 of 6] Compiling Text.TeXMath.Parser ( Text/TeXMath/Parser.hs, dist/build/texmath/texmath-tmp/Text/TeXMath/Parser.o )
[4 of 6] Compiling Text.TeXMath.Macros ( Text/TeXMath/Macros.hs, dist/build/texmath/texmath-tmp/Text/TeXMath/Macros.o )
[5 of 6] Compiling Text.TeXMath ( Text/TeXMath.hs, dist/build/texmath/texmath-tmp/Text/TeXMath.o )
[6 of 6] Compiling Main ( texmath.hs, dist/build/texmath/texmath-tmp/Main.o )
Linking dist/build/texmath/texmath ...
Preprocessing library texmath-0.5.0.4...
[1 of 5] Compiling Text.TeXMath.Macros ( Text/TeXMath/Macros.hs, dist/build/Text/TeXMath/Macros.o )
[2 of 5] Compiling Text.TeXMath.Types ( Text/TeXMath/Types.hs, dist/build/Text/TeXMath/Types.o )
[3 of 5] Compiling Text.TeXMath.MathML ( Text/TeXMath/MathML.hs, dist/build/Text/TeXMath/MathML.o )
[4 of 5] Compiling Text.TeXMath.Parser ( Text/TeXMath/Parser.hs, dist/build/Text/TeXMath/Parser.o )
[5 of 5] Compiling Text.TeXMath ( Text/TeXMath.hs, dist/build/Text/TeXMath.o )
[1 of 5] Compiling Text.TeXMath.Macros ( Text/TeXMath/Macros.hs, dist/build/Text/TeXMath/Macros.p_o )
[2 of 5] Compiling Text.TeXMath.Types ( Text/TeXMath/Types.hs, dist/build/Text/TeXMath/Types.p_o )
[3 of 5] Compiling Text.TeXMath.MathML ( Text/TeXMath/MathML.hs, dist/build/Text/TeXMath/MathML.p_o )
[4 of 5] Compiling Text.TeXMath.Parser ( Text/TeXMath/Parser.hs, dist/build/Text/TeXMath/Parser.p_o )
[5 of 5] Compiling Text.TeXMath ( Text/TeXMath.hs, dist/build/Text/TeXMath.p_o )
Registering texmath-0.5.0.4...
./setup haddock --hyperlink-source
Running Haddock for texmath-0.5.0.4...
Running hscolour for texmath-0.5.0.4...
Preprocessing executable 'texmath' for texmath-0.5.0.4...
Preprocessing library texmath-0.5.0.4...
Warning: The documentation for the following packages are not installed. No
links will be generated to these packages: ffi-1.0, rts-1.0, syb-0.3.6
Preprocessing executable 'texmath' for texmath-0.5.0.4...
Preprocessing library texmath-0.5.0.4...
Haddock coverage:
75% ( 3 / 4) in 'Text.TeXMath.Macros'
20% ( 1 / 5) in 'Text.TeXMath.Types'
25% ( 1 / 4) in 'Text.TeXMath.MathML'
100% ( 2 / 2) in 'Text.TeXMath.Parser'
33% ( 1 / 3) in 'Text.TeXMath'
Documentation created: dist/doc/html/texmath/index.html
./setup haddock --hoogle
Running Haddock for texmath-0.5.0.4...
Warning: The documentation for the following packages are not installed. No
links will be generated to these packages: ffi-1.0, rts-1.0, syb-0.3.6
Preprocessing executable 'texmath' for texmath-0.5.0.4...
Preprocessing library texmath-0.5.0.4...
Haddock coverage:
75% ( 3 / 4) in 'Text.TeXMath.Macros'
20% ( 1 / 5) in 'Text.TeXMath.Types'
25% ( 1 / 4) in 'Text.TeXMath.MathML'
100% ( 2 / 2) in 'Text.TeXMath.Parser'
33% ( 1 / 3) in 'Text.TeXMath'
Documentation created: dist/doc/html/texmath/texmath.txt
Source compiled.

  • Test phase [cabal test]: dev-haskell/texmath-0.5.0.4
    ./setup test
    Test 01 FAILED (< expected, > actual):
    13c13
    < −


        <mo>−</mo>
    

    15c15
    < ±

        <mo>±</mo>
    

    22c22
    < −

            <mo>−</mo>
    

    Test 02 FAILED (< expected, > actual):
    18c18
    < −

            <mo>−</mo>
    

    22c22
    < ×

          <mo>×</mo>
    

    27c27
    < −

          <mo>−</mo>
    

    Test 03 PASSED
    Test 04 FAILED (< expected, > actual):
    18c18
    < −

    <mo>−</mo>
    

    24c24
    < −

          <mo>−</mo>
    

    Test 05 FAILED (< expected, > actual):
    10c10
    < ∫

      <mo>∫</mo>
    

    18c18
    < ∫

      <mo>∫</mo>
    

    34c34
    < ∫

      <mo>∫</mo>
    

    44c44
    < −

    <mo>−</mo>
    

    Test 06 FAILED (< expected, > actual):
    10c10
    < ∑

      <mo>∑</mo>
    

    16c16
    < ∞

      <mo>∞</mo>
    

    19c19
    < ∑

      <mo>∑</mo>
    

    25c25
    < ∞

      <mo>∞</mo>
    

    Test 07 FAILED (< expected, > actual):
    10c10
    < ʺ

    <mo>ʺ</mo>
    

    17c17
    < ʹ

    <mo>ʹ</mo>
    

    Test 08 FAILED (< expected, > actual):
    9c9
    < ∣

    <mo stretchy="false">∣</mo>
    

    12c12
    < ‾

      <mo accent="true">‾</mo>
    

    14c14
    < ∣

    <mo stretchy="false">∣</mo>
    

    16c16
    < ∣

    <mo stretchy="false">∣</mo>
    

    18c18
    < ∣

    <mo stretchy="false">∣</mo>
    

    20c20
    < ∣

    <mo stretchy="false">∣</mo>
    

    24c24
    < ‾

      <mo accent="true">‾</mo>
    

    30c30
    < ∣

    <mo stretchy="false">∣</mo>
    

    32c32
    < ∣

    <mo stretchy="false">∣</mo>
    

    35c35
    < ∣

      <mo stretchy="false">∣</mo>
    

    Test 09 FAILED (< expected, > actual):
    13c13
    < →

        <mo>→</mo>
    

    Test 10 FAILED (< expected, > actual):
    10c10
    < φ

      <mi>φ</mi>
    
    14c14

Matrix/Array should be substack when in EUnder

At the moment, when there is an array underneath an operator in an equation (say, a sum over two variables), it is written as an array or matrix, e.g.

\[\sum_{\begin{matrix}
{0\leq\ i\ \leq\ m} \\
{0<j<n\ } \\
\end{matrix}}^{}{P\left( {i,j} \right)}\]

That produces the ranges in full-size. amsmath provides \substack for this, which makes the output a lot nicer.
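
For comparison, the \substack form of the same ranges could be produced by something like the following sketch. The substack function is a hypothetical helper that works on already-rendered TeX strings for each row; it is not part of texmath.

```haskell
import Data.List (intercalate)

-- Join already-rendered rows into an amsmath \substack argument.
-- (Hypothetical helper working on plain strings.)
substack :: [String] -> String
substack rows = "\\substack{" ++ intercalate " \\\\ " rows ++ "}"

-- Attach the stacked ranges as the lower limit of a sum.
sumOver :: [String] -> String -> String
sumOver ranges body = "\\sum_{" ++ substack ranges ++ "}" ++ body
```

Here `sumOver ["0 \\leq i \\leq m", "0 < j < n"] "P(i,j)"` gives `\sum_{\substack{0 \leq i \leq m \\ 0 < j < n}}P(i,j)`.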

Integer values for EScaled and ESpace

It would be nice (esp. for reliable round-tripping) not to have the Doubles. Perhaps ESpace should have its widths specified in mus (1 mu or "math unit" from tex = 1/18 em). EScaled could be a percentage, with 100 = actual size. Both Doubles could become Ints.

I'm not positive about this change. In any case it would be an API change, so I put it here to be done (if at all) with the addition of a border box element.
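
The arithmetic behind the proposal can be sketched as follows, taking 1 em = 18 mu as stated above. The function names are hypothetical, and exact Rationals are used in place of Doubles so that the conversions can be checked exactly.

```haskell
import Data.Ratio ((%))

-- 1 em = 18 mu, as stated above.
musPerEm :: Integer
musPerEm = 18

-- Convert a width in em to the nearest whole number of mu.
emToMu :: Rational -> Integer
emToMu em = round (em * fromIntegral musPerEm)

-- Convert a whole number of mu back to an exact width in em.
muToEm :: Integer -> Rational
muToEm mu = mu % musPerEm

-- Convert a scaling factor to a percentage, with 100 = actual size.
scaleToPercent :: Rational -> Integer
scaleToPercent factor = round (factor * 100)
```

For example, a width of 0.333 em is 5.994 mu, which rounds to 6 mu, and 6 mu is exactly 1/3 em. After that first rounding step the integer value round-trips without further drift, which is the motivation given above.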

OMML Math: Incorrect accent parsing

The following input is rendered as \acute{F}, although Word renders it with a hat (i.e. \hat{F}).

(Personal line reference: 117148)

@jkr

<m:oMath>
<m:acc>
<m:accPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:i/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
</m:ctrlPr>
</m:accPr>
<m:e>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t>
F</m:t>
</m:r>
</m:e>
</m:acc>
</m:oMath>

heads up: building texmath when ghc is configured to use Clang rather than GCC can cause HIGH memory usage

somehow on certain invocations of the build process for the texmath package, if GHC is configured to use clang rather than GCC, clang will consume >= 1gb of memory on assembling/compiling certain modules, whereas GCC seems to max out at needing ~ 1/2gb.

I realize this is probably an upstream problem, but making you aware of it none the less, because it will likely impact some of your users

MathML: cannot produce multi-letter math identifier

This is not exactly a bug, but I suspect there may be no workaround, and is related to #83 and #85.

I'm using the MathML writer, and I have need for mathematical constants such as id. I need this to render as a constant, not an operator, since it can serve as an argument. If I simply write $id$, then the result is <mi>i</mi><mi>d</mi>, which is as it should be. What I need is something that translates to <mi>id</mi>.

I have been using \DeclareMathOperator or \operatorname for this, since it does indeed (wrongly) produce <mi>id</mi>, but I think because of other bugs I've reported this will be changed to produce <mo>id</mo> in the future...? Which leaves me with the problem of how to produce <mi>id</mi>.

In LaTeX I get this effect by doing something like

\newcommand{\id}{\mbox{id}}

but this produces

<mstyle mathvariant="normal">
    <mi>i</mi>
    <mi>d</mi>
</mstyle>

in Pandoc.

Is there a way to do this, or is it a genuine issue?

greek math gets output as operator not identifier

The input in pandoc markdown

text $a+\alpha$

when fed to the command

pandoc -f markdown t --mathml -o t.xml

results in

<p>text <math display="inline" xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>a</mi><mo>+</mo><mo>&#945;</mo></mrow></math></p>

i.e., the alpha is presented as operator while the MathML 3.0 spec says in 3.2.3.3:

The names of symbolic constants should be represented as mi elements:

<mi> &#x3C0;<!--GREEK SMALL LETTER PI--> </mi>
<mi> &#x2148;<!--DOUBLE-STRUCK ITALIC SMALL I--> </mi>
<mi> &#x2147;<!--DOUBLE-STRUCK ITALIC SMALL E--> </mi>

MathML: \text strips initial whitespace

In the MathML writer,

$$a\text{ and }b$$

renders like

aand b

The MathJax writer correctly renders it like:

a and b

The MML output is:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <semantics>
    <mrow>
      <mi>a</mi>
      <mrow>
        <mtext mathvariant="normal">and</mtext>    
        <mspace width="0.333em"/>
      </mrow>
      <mi>b</mi>
    </mrow>
    <annotation encoding="application/x-tex">a\text{ and }b</annotation>
  </semantics>
</math>

It should have another mspace before the mtext.
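
The expected behaviour could be sketched like this. The Node type is a simplified stand-in for the MathML elements involved, and textToNodes is a hypothetical function, not the writer's actual code. The sketch emits a single space node per side regardless of how many whitespace characters there are.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Simplified stand-ins for <mtext> and <mspace/>.
data Node = MText String | MSpace deriving (Eq, Show)

-- Turn the argument of \text{...} into nodes, keeping a space
-- node for leading as well as trailing whitespace.
textToNodes :: String -> [Node]
textToNodes s = leading ++ body ++ trailing
  where
    core     = dropWhileEnd isSpace (dropWhile isSpace s)
    leading  = [MSpace | startsWithSpace s]
    body     = [MText core | not (null core)]
    trailing = [MSpace | not (null core), startsWithSpace (reverse s)]

    startsWithSpace (c : _) = isSpace c
    startsWithSpace []      = False
```

With this, `textToNodes " and "` yields a space, the text `and`, and another space, rather than only the text followed by one space.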

Operator problem with docx

Using pandoc to convert

$$\left( \sum_i x_i \right) \sum_i x_i$$

to docx gives

screen shot 2016-07-24 at 7 20 31 pm

Entering the same expression into Word manually gives

screen shot 2016-07-24 at 7 20 24 pm

Texmath 0.6.5.2 not in repo

Not sure if it simply isn't ready to be committed to the remote, but the head of pandoc won't build because this newest version is not in the public repo.

support for some symbols

Please add support for the following symbols which are available without an extra usepackage in LaTeX:

Symbol  Code    LaTeX
↖       U+2196  \nwarrow
↗       U+2197  \nearrow
↘       U+2198  \searrow
↙       U+2199  \swarrow

jgm/pandoc/issues/2815

MathML rendering of align, etc. environments only aligns one equation per line

texmath's MathML output will only align the first two columns in something like this:

$$\begin{align}
    a &= b & c &= d
\end{align}$$

That is, it will only align the "a" and "b" columns. It ought to align "c" and "d" as well.

Expected behavior: align & friends allow multiple equations per line, and align each column around alternate ampersands, like AMSLaTeX does.

Here is the MML output (produced by Pandoc) of the snippet above:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <semantics>
    <mtable>
      <mtr>
        <mtd columnalign="right">
          <mi>a</mi>
        </mtd>
        <mtd columnalign="left">
          <mo>=</mo>
          <mi>b</mi>
        </mtd>
        <mtd>
          <mi>c</mi>
        </mtd>
        <mtd>
          <mo>=</mo>
          <mi>d</mi>
        </mtd>
      </mtr>
    </mtable>
    <annotation encoding="application/x-tex">\begin{align}     a &amp;= b &amp; c &amp;= d \end{align}</annotation>
  </semantics>
</math>

I think the second two mtd's (and any others) ought also to have columnalign attributes set.
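
One way to express the suggested fix is to cycle the alignment over every cell in the row, as in this sketch. It builds the markup as plain strings purely for illustration; alignColumns and renderRow are hypothetical names and do not reflect how the MathML writer is structured.

```haskell
-- Alignments for the first n columns of an align-style row:
-- right, left, right, left, ...
alignColumns :: Int -> [String]
alignColumns n = take n (cycle ["right", "left"])

-- Wrap each cell in an <mtd> carrying its alternating alignment.
renderRow :: [String] -> String
renderRow cells =
  "<mtr>" ++ concat (zipWith cell (alignColumns (length cells)) cells) ++ "</mtr>"
  where
    cell align content =
      "<mtd columnalign=\"" ++ align ++ "\">" ++ content ++ "</mtd>"
```

For the four-cell row in the example, the alignments come out as right, left, right, left, so the "c" and "= d" cells get the same treatment as "a" and "= b".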

Reference: https://groups.google.com/forum/#!topic/pandoc-discuss/faM55dIJDI0

texmath-0.8.3 fails to install via cabal

Hi, I'm trying to install pandoc via cabal and the installation fails when trying to install texmath. Some possibly relevant version numbers:

$ cabal --version
cabal-install version 1.16.0.2
using version 1.16.0 of the Cabal library
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.6.3

I'm working on Ubuntu 14.04.

Here's what I get:

$ cabal install pandoc
Resolving dependencies...
Configuring texmath-0.8.3...
Building texmath-0.8.3...
Preprocessing library texmath-0.8.3...
[ 1 of 18] Compiling Text.TeXMath.Unicode.ToASCII ( src/Text/TeXMath/Unicode/ToASCII.hs, dist/build/Text/TeXMath/Unicode/ToASCII.o )
[ 2 of 18] Compiling Text.TeXMath.TeX ( src/Text/TeXMath/TeX.hs, dist/build/Text/TeXMath/TeX.o )
[ 3 of 18] Compiling Text.TeXMath.Compat ( src/Text/TeXMath/Compat.hs, dist/build/Text/TeXMath/Compat.o )
[ 4 of 18] Compiling Text.TeXMath.Readers.MathML.EntityMap ( src/Text/TeXMath/Readers/MathML/EntityMap.hs, dist/build/Text/TeXMath/Readers/MathML/EntityMap.o )
Failed to install texmath-0.8.3
cabal: Error: some packages failed to install:
pandoc-1.15.0.6 depends on texmath-0.8.3 which failed to install.
texmath-0.8.3 failed during the building phase. The exception was:
ExitFailure 139

Does anyone have some advice? Thanks for any light you can shed on this problem.

best,

Robert Dodier

Overbrace does not appear in docx output

The following LaTeX math does not render correctly when converted through pandoc to docx:

\begin{equation} \label{eq:num} \frac{\delta^2 u}{\delta D \delta t} = \overbrace{\overbrace{u_R(\cdot)}^{+}\overbrace{R_{Dt}(\cdot)}^{?}}^{A}+ \overbrace{\overbrace{R_t(\cdot)}^{+}(\overbrace{\overbrace{u_{RR}(\cdot)}^{-}\overbrace{R_D(\cdot)}^{-}}^{+}+ \overbrace{\overbrace{K'(D)}^{+}\overbrace{u_{KR}(\cdot)}^{+}}^{+})}^{B} \end{equation}

The overbraces are missing, but the +/- signs do appear in the proper positions. I was asked in the Google Groups to submit this issue here.

LaTeX parser fails on nested math environments when $$ is used

This MWE compiles fine in latex.

\documentclass{article}

\begin{document}
$$
{\mbox{\boldmath $\theta $}}_2||^2
$$
\end{document}

However, pandoc fails when reading it:

$ pandoc -f latex problem.tex -o test.xml
pandoc: 
Error at "source" (line 5, column 33):
unexpected "^"
expecting "%", lf new-line, "\\", "{", "-", "``", "\8220", "\"`", "\"", "`", "\8216", "''", "\8221", "'", "\8217", "~", "$$", "$", "^^", "&" or \end{document}
{\mbox{\boldmath $\theta $}}_2||^2

If, however, the $$ delimiters are changed to \begin{displaymath}, then it works fine.

(from jgm/pandoc#3164)

\boldsymbol doesn't work

The following, which requires \usepackage{amssymb} in LaTeX itself, doesn't seem to work in texmath:

$\boldsymbol{\alpha}$

\hline in math matrix can not be convert to docx equation

$$\mathbf{V}_1 \times \mathbf{V}_2 =
   \begin{vmatrix}
    \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \hline
    \frac{\partial X}{\partial u} & \frac{\partial Y}{\partial u} & 0 \\
    \frac{\partial X}{\partial v} & \frac{\partial Y}{\partial v} & 0 \\
   \end{vmatrix}$$

This matrix can be correctly displayed in latex and mathjax. But when converting to docx from markdown, it fails.

Without the \hline is OK.

OMML process ```\mathbfit``` as italic, not bold-italic

Here's the markdown clip for testing:

# Math Tests 

## Test 1


### Without any ```LaTeX``` commands converted to Unicode symbols 

$$ e = \int_\mathbb{R} f(x | \theta) \circ g(\mathbfit{z} |\mathbfit{\eta}) dx $$

## With ```\mathbfit{*}``` commands converted to Unicode symbols before fed to ```Pandoc``` 

$$ e = \int_\mathbb{R} f(x | \theta) \circ g(𝒛|𝜼) dx $$ 

## Test 2

### Without any ```LaTeX``` commands converted to Unicode symbols 

$$ \mathsf{e} = \mathbffrak{z} $$

## With ```\mathbffrak{z}``` command converted to Unicode symbols before fed to ```Pandoc``` 

$$ \mathsf{e} = 𝖟 $$

The screenshot below is the output generated with the -s -S -t docx options (the LaTeX code is partially processed by Pandoc and TeXMath):
pandoc_docx_screenshot
In Test 1's first case, the \mathbfit command is simply ignored (or has no effect): "z" and "η" are not in boldface. In Test 2, both cases are displayed correctly, as \mathbffrak is handled well by TeXMath's renderStr function.

It seems that OMML ignores the "bf" part of \mathbfit.

add \choose

It was recently pointed out to me that my combinatorial LaTeX in http://www.gwern.net/sicp/Chapter%201.1#section-2 was apparently broken and readers were seeing raw LaTeX. The first formula goes

{3 \choose 2} = \frac{3!}{2! \times (3-2)!} = \frac{3!}{2! \times 1!} = \frac{3!}{2 \times 1} = \frac{3!}{2} = \frac{6}{2} = 3

I had checked in Texify that this worked but after some playing around in Pandoc, it seems that the problem is the \choose operator. I don't see any use of \choose in cgi/texmath.html, and an attempt to use it in texmath.hs fails:

$ runhaskell texmath.hs
3 \choose 2
"formula" (line 1, column 11):
unexpected "2"

It would be nice if this could be supported.

TeXMath adds space after decimal point

pandoc -f markdown -t html <<<'The probability is $p=0.7$.'
<p>The probability is <span class="math"><em>p</em> = 0. 7</span>.</p>

TeXMath always considers a dot to be punctuation (ESymbol Pun ".") even if it's inside a decimal number.

Speculation: I don't know what exactly LaTeX is doing -- if it simply never adds any space after a dot, the fix should be easy. If there are cases where it adds a space and cases where it doesn't, TeXMath might need to reproduce its algorithm.

Related: the space added after ESymbol Op also seems a bit fishy, though I'm not sure what the intended behavior is. For example, in $\sum_{k=0}^n 2k$ I would expect a space before the 2, not immediately after the ∑. But I guess this is a separate issue from the decimal point problem, and depending on how complicated LaTeX's behavior is, it might not be worth emulating.
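
One possible approach to the decimal point problem is a pass that merges a number, a dot and a following number into a single number token before any spacing is decided. The sketch below uses a simplified Token type invented for illustration; it is not TeXMath's representation and makes no claim about what LaTeX itself does.

```haskell
-- Simplified token stream. (Hypothetical type.)
data Token
  = TNumber String
  | TPunct Char
  | TOther String
  deriving (Eq, Show)

-- Merge "digits . digits" into one number token; a dot in any
-- other position stays punctuation.
mergeDecimals :: [Token] -> [Token]
mergeDecimals (TNumber a : TPunct '.' : TNumber b : rest) =
  TNumber (a ++ "." ++ b) : mergeDecimals rest
mergeDecimals (t : rest) = t : mergeDecimals rest
mergeDecimals []         = []
```

Applied to the tokens for `p=0.7`, this leaves `p` and `=` alone and produces a single number `0.7`, so no space can be inserted after the dot. A dot that is not followed by a number, such as the one ending a sentence, is untouched.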

Text in math environment not parsed correctly

In the following example <w:sym w:font="Symbol" w:char="F0CE"/> is ignored. I think this is the same issue that I patched in pandoc's docx reader.

@jkr


<m:oMath>
<m:nary>
<m:naryPr>
<m:chr m:val="∑"/>
<m:limLoc m:val="undOvr"/>
<m:supHide m:val="1"/>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:bCs/>
<w:i/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
</m:ctrlPr>
</m:naryPr>
<m:sub>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t>
n</m:t>
</m:r>
<m:r>
<m:rPr>
<m:sty m:val="p"/>
</m:rPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:bCs/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<w:sym w:font="Symbol" w:char="F0CE"/>
</m:r>
<m:r>
<m:rPr>
<m:scr m:val="double-struck"/>
<m:sty m:val="p"/>
</m:rPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math" w:cs="Lucida Sans Unicode"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t>
N</m:t>
</m:r>
</m:sub>
<m:sup/>
<m:e>
<m:f>
<m:fPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:bCs/>
<w:i/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
</m:ctrlPr>
</m:fPr>
<m:num>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t xml:space="preserve">
 </m:t>
</m:r>
<m:sSup>
<m:sSupPr>
<m:ctrlPr>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:bCs/>
<w:i/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
</m:ctrlPr>
</m:sSupPr>
<m:e>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t>
x</m:t>
</m:r>
</m:e>
<m:sup>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t>
n</m:t>
</m:r>
</m:sup>
</m:sSup>
</m:num>
<m:den>
<m:r>
<w:rPr>
<w:rFonts w:ascii="Cambria Math" w:hAnsi="Cambria Math"/>
<w:sz w:val="24"/>
<w:szCs w:val="24"/>
</w:rPr>
<m:t>
n!</m:t>
</m:r>
</m:den>
</m:f>
</m:e>
</m:nary>
</m:oMath>
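
To make the ignored element above concrete: Word stores Symbol-font characters in the private-use range, so w:char="F0CE" is code point 0xCE of the Symbol font shifted up by 0xF000, and 0xCE in the Symbol encoding is the element-of sign ∈. A reader could decode it roughly as sketched here. The function name decodeSym and the two-entry table are assumptions for illustration; a real table would cover the whole font.

```haskell
import Numeric (readHex)

-- Tiny excerpt of the Symbol font encoding (code point -> Unicode).
-- Only two entries are listed; this is not a complete table.
symbolTable :: [(Int, Char)]
symbolTable =
  [ (0xCE, '\x2208')  -- element of
  , (0xCF, '\x2209')  -- not an element of
  ]

-- Decode the w:char attribute of a <w:sym w:font="Symbol" .../> element.
-- Values at or above 0xF000 are shifted back down before the lookup.
decodeSym :: String -> Maybe Char
decodeSym hex = case readHex hex of
  [(n, "")] -> lookup (if n >= 0xF000 then n - 0xF000 else n) symbolTable
  _         -> Nothing
```

With this table, decodeSym "F0CE" returns Just '∈', which is the character missing between n and N in the subscript.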

DeclareMathOperator with MathML does not work with styled text

Using the MathML writer, this works:

\DeclareMathOperator{\den}{den}

but this:

\DeclareMathOperator{\den}{\mathsf{den}}

produces visible occurrences of \operatorname{\mathsf{den}} all over the text. I thought this might be because the parser for \DeclareMathOperator wrongly expects only a string in the second brackets, but it works as expected with the MathJax writer: the operator gets rendered in sans-serif.

I also tested this in LaTeX to make sure \DeclareMathOperator really accepts something besides simple strings for the operator name, and it does. Here is proof:

\documentclass{article}
\usepackage{amsmath}
\DeclareMathOperator{\den}{\mathsf{den}}
\begin{document}    
$\den t = \den t'$    
\end{document}

\top should have mathclass ord(inary)

When you render

$\top \otimes t = t$

to MathML, \top outputs as <mo>⊤</mo> (an operator) rather than <mi>⊤</mi> (an identifier). According to Knuth's TeXbook, Appendix F, Section 4 (page 435 in my edition), \top has type Ord(inary), so (according to my sketchy understanding of MathML) it should output as mi.

You might want to double-check the rest of the macros/symbols in the table on that page as well, such as \bot. In particular, I also noticed that \backslash typesets differently than /, which might be due to the same issue; perhaps it is intended, but MathJax typesets both slashes in the same way. (n.b. \backslash is different from \setminus.)

John's comments:

This is all done by the texmath library (jgm/texmath on
github). I see the following in the table for control
sequences:

       , ("\\top", ESymbol Ord "\x22A4") 

An ESymbol Ord is rendered as an mo element.
We could change this to EIdentifier "\x22A4", and it would
always be an mi. This probably makes sense. You can submit
a bug report on jgm/texmath, so I don't lose track of the
issue.

(There is probably a reason it was ESymbol Ord, but I don't
recall what it was.)

You can submit another issue to support \mathord.

Both of these should be fairly simple changes.
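
The effect of the proposed table change can be illustrated with a cut-down model of the expression type and the MathML writer. The types below only mimic the constructors named in the comment above; the toMathML function is a stand-in written for this sketch, not texmath's actual writer.

```haskell
-- Cut-down symbol classes and expression type, for illustration only.
data SymType = Ord | Op | Bin | Rel
  deriving (Eq, Show)

data Exp
  = ESymbol SymType String
  | EIdentifier String
  deriving (Eq, Show)

-- Stand-in writer: every ESymbol becomes <mo>, every EIdentifier <mi>.
toMathML :: Exp -> String
toMathML (ESymbol _ s)   = "<mo>" ++ s ++ "</mo>"
toMathML (EIdentifier s) = "<mi>" ++ s ++ "</mi>"

-- The table entry as it is now, and as it would be after the change.
topBefore, topAfter :: Exp
topBefore = ESymbol Ord "\x22A4"
topAfter  = EIdentifier "\x22A4"
```

In this model toMathML topBefore gives <mo>⊤</mo> and toMathML topAfter gives <mi>⊤</mi>, which is the difference the report asks for.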

Reference: https://groups.google.com/forum/#!topic/pandoc-discuss/faM55dIJDI0

OMML borderbox not implemented

The one OMML entity that can't be translated is <m:borderBox>, which draws a box around an expression. It's usually used for highlighting equations in textbooks and the like.

This has representations in both LaTeX and MathML. The standard LaTeX version would be \boxed{...}, available in amsmath. In MathML, the notation is <menclose notation="box">...</menclose>.

For the sake of completeness (so close) and because the translation is fairly straightforward, I'd love to see this in. (NB: the OMML and MathML versions can both show only some edges of the box. I'm not sure of the best way to do this in LaTeX, or whether implementing that is worth the trouble.)
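
A rough sketch of the translation described above, assuming a boxed-expression constructor in the intermediate representation. The Exp type and both writer functions are simplified stand-ins written for this example; only the full box is handled, not partial edges.

```haskell
-- Simplified expression type with a constructor for a boxed expression.
data Exp
  = EIdentifier String
  | EBoxed Exp
  deriving (Eq, Show)

-- LaTeX output: amsmath's \boxed{...}.
toTeX :: Exp -> String
toTeX (EIdentifier s) = s
toTeX (EBoxed e)      = "\\boxed{" ++ toTeX e ++ "}"

-- MathML output: <menclose notation="box">...</menclose>.
toMathML :: Exp -> String
toMathML (EIdentifier s) = "<mi>" ++ s ++ "</mi>"
toMathML (EBoxed e) =
  "<menclose notation=\"box\">" ++ toMathML e ++ "</menclose>"
```

For EBoxed (EIdentifier "x") this produces \boxed{x} and <menclose notation="box"><mi>x</mi></menclose>. An OMML reader would only need to map <m:borderBox> onto the boxed constructor.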

Writers/OMML: Short nary is loaded incorrectly in LibreOffice

First off, apologies since this seems to be a bug in LibreOffice, but it's triggered by texmath's output format and has a relatively easy fix here.

The formula

\sum_{i=1}^{100} x = \frac{100*101}{2}

is rendered incorrectly in LibreOffice:
Screenshot

Using x + 1 (in parentheses) renders to

Screenshot

The formula is rendered correctly when opened in MS Office.

TexMath generates the following XML:

    <m:nary>
      <m:naryPr>
        <m:chr m:val="" />
        <m:limLoc m:val="undOvr" />
        <m:subHide m:val="off" />
        <m:supHide m:val="off" />
      </m:naryPr>
      <m:e>
        <m:r>
          <m:rPr />
          <m:t>x</m:t>
        </m:r>
      </m:e>
      <m:sub>
(i = 1)
      </m:sub>
      <m:sup>
(100)
      </m:sup>
    </m:nary>

LibreOffice seems to expect the <m:e> part to appear after the <m:sub> and <m:sup> parts. Changing the order makes the formula render correctly:

    <m:nary>
      <m:naryPr>
        <m:chr m:val="" />
...
      </m:naryPr>
      <m:sub>
(i = 1)
      </m:sub>
      <m:sup>
(100)
      </m:sup>
      <m:e>
        <m:r>
          <m:rPr />
          <m:t>x</m:t>
        </m:r>
      </m:e>
    </m:nary>

Screenshot

Again, this seems to be a bug in LibreOffice, but the change required in texmath is relatively straightforward:

--- a/src/Text/TeXMath/Writers/OMML.hs
+++ b/src/Text/TeXMath/Writers/OMML.hs
@@ -223,7 +223,7 @@ makeNary props t s y z w =
                  , mnodeA "supHide"
                     (if z == EGrouped [] then "on" else "off") ()
                  ]
-               , mnode "e" $ showExp props w
                , mnode "sub" $ showExp props y
-               , mnode "sup" $ showExp props z ]
+               , mnode "sup" $ showExp props z
+               , mnode "e" $ showExp props w]

This breaks the testcases of course, but I can have a look at them & submit a pull request if you like.

Cheers,

Niko

binary distribution

Hey, can you post a binary distribution? Installing all the dependencies is too much.

Arguments flipped in OMML reader

I think the arguments are the wrong way round somewhere in the OMML reader.

pandoc -f tex -t omml | pandoc -f omml -t tex
In:
\overset{r}{\rightarrow}
Out:
\overset{\rightarrow}{r}
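
For clarity about which argument is which: in \overset{a}{b}, the first argument a is the material placed on top and the second argument b is the base. The sketch below models that with a toy constructor that stores the base first, so the writer has to swap the order when it emits TeX. The Exp type and toTeX are illustrative only and do not reproduce texmath's reader or writer.

```haskell
-- Toy expression type: EOver takes the base first, then what goes on top.
data Exp
  = ESym String
  | EOver Exp Exp
  deriving (Eq, Show)

-- \overset takes the top material first and the base second,
-- so the two fields are emitted in the opposite order.
toTeX :: Exp -> String
toTeX (ESym s) = s
toTeX (EOver base over) =
  "\\overset{" ++ toTeX over ++ "}{" ++ toTeX base ++ "}"

-- An arrow with the letter r on top, as in the input above.
example :: Exp
example = EOver (ESym "\\rightarrow") (ESym "r")
```

Here toTeX example gives \overset{r}{\rightarrow}, matching the input. Passing the two fields in the wrong order at any one step would produce exactly the flipped output shown.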
