harvardnlp / im2markup

Neural model for converting Image-to-Markup (by Yuntian Deng yuntiandeng.com)

Home Page: https://im2markup.yuntiandeng.com

License: MIT License

Python 30.30% JavaScript 6.78% Lua 62.92%

im2markup's People

codeaudit qiugen benjamesbabala chenglongchen eriche2016 shubhamjain0594 ericustc dmartinalbo ssampang da03 edwardchu-studio chagge fengyangzhang zhongxingpeng linjm clear-datacenter esalesky aby0 ml-lab michael-r-zhang ntuanhung vyraun miradel51 robustfengbin fireae anton-4 widemeadows zimuw viet-nguyen davidtranno1 tanganyao shubaozhang zhangxd12 theotheo agilab yingning fresty himanshurepo evitself lngao aartibagul whrenstone inkimage dbanda wenyafei4 can-keklik ccywch yufish fendaq maggione mingchen62 stevenlol cosmoshua yunxinan zjiang4 gauravjain21 ikishorek longzaitianguo xlicz iywo femto futurewarning yu45020 w32zhong bygreencn handsomeboy tangyuan-liu cooravi channgo2203 lihongqiang amandalmia14 zoujuny bbsmp milescranmer crelon loovelj chloehq billyzju araleqiu shahid1993 10183308 mahshad92 diamondgloves gitbruce mbrukman gloddream diesel790529 sreenivasmr chaitan3 wslukaa ktj4820 beneo aisaturday todun zengpeizhi darmawanalbert mtrstudio ducanh841988 chuang0310 lijuny

im2markup's Issues

Error while running python scripts/evaluation/evaluate_bleu.py --result-path results/results.txt --data-path data/sample/test_filter.lst --label-path data/sample/formulas.norm.lst

(base) [centos@datascience-gpudev-1 im2markup]$ python scripts/evaluation/evaluate_bleu.py --result-path results/results.txt --data-path data/sample/test_filter.lst --label-path data/sample/formulas.norm.lst
2019-12-15 02:03:15,469 root INFO Script being executed: scripts/evaluation/evaluate_bleu.py
Traceback (most recent call last):
File "scripts/evaluation/evaluate_bleu.py", line 87, in <module>
main(sys.argv[1:])
File "scripts/evaluation/evaluate_bleu.py", line 60, in main
labels[img_path] = labels_tmp[int(idx)]
KeyError: 32771
(base) [centos@datascience-gpudev-1 im2mark
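The KeyError is raised where the script looks up a label by the index stored in the data list, so one thing worth checking is whether every index in the list actually exists in the label file. Below is a minimal sketch of such a check. It assumes each data-list line has the form "image_name index" and that indices are zero-based line numbers into the label file; the function name is hypothetical and is not part of the repository.

```python
def find_missing_label_indices(data_lines, num_labels):
    """Return (image_name, index) pairs whose index has no label line.

    Assumes each data line looks like "image_name index" and that indices
    are zero-based positions in the label file (an assumption).
    """
    missing = []
    for line in data_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        image_name, idx = parts[0], int(parts[1])
        if not 0 <= idx < num_labels:
            missing.append((image_name, idx))
    return missing


# Toy data: the label file has only 3 lines, so index 32771 cannot be resolved.
data_lines = ["a.png 0", "b.png 2", "c.png 32771"]
print(find_missing_label_indices(data_lines, num_labels=3))
```

If this reports entries for the real files, the label file passed via --label-path is probably shorter than the data list expects.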

Can you explain src/model/cnn.lua?

I am looking at the cnn.lua code and I have a few questions.
What are

model:add(nn.AddConstant(-128.0))
model:add(nn.MulConstant(1.0 / 128))

and

model:add(nn.Transpose({2, 3}, {3,4})) -- (batch_size, H, W, 512)
model:add(nn.SplitTable(1, 3)) -- #H list of (batch_size, W, 512)

about?
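One reading of these two snippets, sketched in plain Python rather than Torch: the first pair shifts and scales each pixel, and the second pair reorders the CNN output and splits it into one tensor per image row, as the comments in the quoted Lua lines state. The sketch assumes pixel values in the range 0 to 255 and a CNN output laid out as (batch, channels, H, W); both helper names are hypothetical.

```python
def normalize_pixel(value):
    # Equivalent of AddConstant(-128.0) followed by MulConstant(1.0 / 128):
    # maps 0..255 to roughly [-1, 1).
    return (value - 128.0) * (1.0 / 128)


def split_rows(features):
    """Reorder features[b][c][h][w] into a list over h of rows[b][w][c].

    This mirrors Transpose({2, 3}, {3, 4}), which turns
    (batch, C, H, W) into (batch, H, W, C), followed by SplitTable(1, 3),
    which yields H tensors of shape (batch, W, C).
    """
    batch = len(features)
    channels = len(features[0])
    height = len(features[0][0])
    width = len(features[0][0][0])
    return [
        [
            [[features[b][c][h][w] for c in range(channels)] for w in range(width)]
            for b in range(batch)
        ]
        for h in range(height)
    ]


print(normalize_pixel(0), normalize_pixel(128), normalize_pixel(255))
```

Each element of the resulting list is one row of the feature grid, which can then be fed to a row encoder one row at a time.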

Want to use pretrained CNN weights

Hi Authors,

The model you trained is in Torch. I want to test my code on the pretrained weights of your CNN model. I am working in a TensorFlow and Keras environment. Can you help me convert the Torch model to HDF5 format? I have not been able to do it myself.

When testing, it requires ground truth labels to be specified to get reasonable results

Hi,

I tried to use the Math-to-LaTeX Toy Example pre-trained model and test on my own equation. I use the following commands to perform testing:
th src/train.lua -phase test -gpu_id 1 -load_model -model_dir model/latex -visualize
-data_base_dir data/sample/images_processed/
-data_path data/sample/test_filter.lst
-label_path data/sample/formulas.norm.lst
-output_dir results
-max_num_tokens 500 -max_image_width 800 -max_image_height 800
-batch_size 5 -beam_size 5

When I follow your provided steps and test on your test data, everything is fine. But when I change "-data_base_dir" and "-data_path" to point to my own cropped equations (such as 9+9+8=26, all in printed font, none handwritten) and keep "-label_path" unchanged, the test output "results.txt" is still nearly the same as the ground truth labels in your "formulas.norm.lst". Even if I change my equations, the test output stays the same as long as "formulas.norm.lst" is not changed.
But once I change "formulas.norm.lst" to contain the correct LaTeX expressions of my equations, the test output starts to make sense. Why is this the case? I assume the model should predict labels without the assistance of ground truth labels, right? The labels should only be used to calculate the loss, the distance, etc.

Offset in dataset

It looks like there is an offset in the dataset provided:

In im2latex_train.lst, the first line is:

1 60ee748793 basic

Which corresponds to this equation:
\int_{-\epsilon}^\infty dl\: {\rm e}^{-l\zeta} \int_{-\epsilon}^\infty dl' {\rm e}^{-l'\zeta} ll'{l'-l \over l+l'} \{3\,\delta''(l) - {3 \over 4}t\,\delta(l) \} =0. \label{eq21}

But the image 60ee748793 doesn't match. This image matches with the equation of the next line:

2 66667cee5b basic

Which is:

ds^{2} = (1 - {qcos\theta\over r})^{2\over 1 + \alpha^{2}}\lbrace dr^2+r^2d\theta^2+r^2sin^2\theta d\varphi^2\rbrace -{dt^2\over (1 - {qcos\theta\over r})^{2\over 1 + \alpha^{2}}}\, .\label{eq:sps1}

I see that the sample data in the repo starts at line index 0, which suggests that you treat the first line as line 0 and would explain the "offset".
However, it would mean that the first equation doesn't have a matching image.

Did I miss something or is there really an issue with the dataset ?

Thanks for your help !
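To make the suspected convention concrete, here is a minimal sketch of a zero-based lookup. It assumes each list line has the form "formula_index image_name render_type" and that the index counts formula lines from 0; the helper name and the toy formulas are hypothetical.

```python
def formula_for(list_line, formulas):
    """Return (image_name, formula) for one line of the list file.

    Assumes the index is a zero-based line number into the formulas file.
    """
    idx_str, image_name, _render_type = list_line.split()
    return image_name, formulas[int(idx_str)]


# Toy formulas file with three lines.
formulas = ["first formula", "second formula", "third formula"]

# Under zero-based indexing, index 1 points at the SECOND line of the file.
print(formula_for("1 60ee748793 basic", formulas))
```

Under this reading, the first line of the formulas file (index 0) is simply not referenced by the training list, which would match the observation above.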

Found some data label inconsistencies

The line "51238 1a00a76d4e basic" is in im2latex_train.lst.
The formulas around line 51238 in im2latex_formulas.lst do not match the LaTeX content of image 1a00a76d4e.
1a00a76d4e should point to line 51729 in im2latex_formulas.lst.
I have found several such cases, but I am not sure how many there are.
I downloaded the data from https://zenodo.org/record/56198#.XZ7yK_n_yHt.
Is anything wrong?

Something wrong with my PyTorch implementation

Hi,

I tried to implement this model in PyTorch, but I ran into some problems. As mentioned in your paper (http://arxiv.org/pdf/1609.04938v1.pdf), the experiments were run on a 12GB Nvidia Titan X GPU, the model was trained for 12 epochs, and the validation perplexity was used to choose the best model. But in my experiment, the accuracy was still 0 after 20 epochs. After that, I tried training on a small number of samples (10 samples) to check the correctness of the code. I found that the loss would only converge after about 500 epochs. My setup is 2 x GTX 1080Ti; with batch_size = 16, an epoch takes about 30 minutes (batch_size = 20 runs out of memory). Even if 500 epochs could complete the training, that is far too long. I checked my code repeatedly but did not find any other errors. I am curious what the problem is, and I don't understand why there is such a big difference. Can you provide a PyTorch version of the code to learn from? Or what mistakes do you think might cause this problem?

Looking forward to your reply.

Getting low accuracy using customized images for test.

Hello Authors:

We modified and trained your model on our PCs and got a pretty high BLEU score on the test dataset. We use a Transformer instead of an RNN or LSTM. But when we try to use the trained model to predict some local images (for example, a screenshot of a LaTeX formula), the result is not so good. We did some data augmentation, such as a random downsample ratio or random Gaussian blur, but the test on local images still gets low accuracy. Would you share any thoughts about that? I would really appreciate any advice. Thanks!

Error when testing on 100k dataset

Hi,

I am trying to replicate your results. I have no issues loading the trained model and testing it on the toy test samples (100). However, when I use the same model to get the accuracy on all test samples in the 100K dataset (10355), the test accuracy becomes NaN after some time and I get an error after 2000 samples. I do not understand this behavior. I changed the token length to get rid of warnings, but that did not help. Please let me know if you faced the same issue.
log.txt

[01/27/19 17:43:52] 1.046239
[01/27/19 17:43:52] Number of samples 2000 - Accuracy = nan
[01/27/19 17:43:54] 1.082996
[01/27/19 17:43:58] 1.228099
[01/27/19 17:44:00] 1.140648
[01/27/19 17:44:03] 1.131666
[01/27/19 17:44:06] 1.043551
[01/27/19 17:44:09] 1.162436
[01/27/19 17:44:11] 1.087319
[01/27/19 17:44:14] 1.575318
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-2331/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu line=163 error=59 : device-side assert triggered
/home/mxm7832/torch/install/bin/luajit: /home/mxm7832/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (59) : device-side assert triggered at /tmp/luarocks_cutorch-scm-1-2331/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:163
stack traceback:
[C]: in function 'v'
/home/mxm7832/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'Sigmoid_updateOutput'
/home/mxm7832/torch/install/share/lua/5.1/nn/Sigmoid.lua:4: in function 'func'
.../mxm7832/torch/install/share/lua/5.1/nngraph/gmodule.lua:345: in function 'neteval'
.../mxm7832/torch/install/share/lua/5.1/nngraph/gmodule.lua:380: in function 'forward'
src/model/model.lua:360: in function 'feval'
src/model/model.lua:885: in function 'step'
src/train.lua:111: in function 'train'
src/train.lua:289: in function 'main'
src/train.lua:295: in main chunk
[C]: in function 'dofile'
...7832/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk

There is a bug in preprocess_latex.js

When I used your script to normalize the formula, I found that there was an illegal LaTeX symbol 'ule' generated. After checking, I found that the symbol was originally '\rule', but the program mistook the '\r' in front of the symbol for a carriage return. This problem was solved when I changed the "\ rule {" to "\ \rule {" in line 344 of the preprocess_latex.js.
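The same pitfall exists in most languages with C-style string escapes. The following sketch shows it in Python rather than JavaScript, purely as an illustration of the mechanism described above; it is not a reproduction of the code in preprocess_latex.js.

```python
# In an ordinary string literal, "\r" is parsed as a carriage return,
# so "\rule" becomes the control character followed by "ule".
plain = "\rule"

# Escaping the backslash keeps it as a literal character.
escaped = "\\rule"

print(repr(plain))
print(repr(escaped))
print(len(plain), len(escaped))
```

The first string has four characters and no backslash at all, which is how a stray "ule" token can end up in the normalized output.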

Unable to train after data augmentation

I have taken formulas file from original dataset, and created 7 different renderings for each formula, now for training my files are as follows

vocab file : same as dataset
formulas and formulas.norm files generated by https://github.com/Miffyli/im2latex-dataset
images folder, with all images preprocessed , again using above repo
and also train_filter.lst, validation_filter.lst, and test_filter.lst

When I run training with th src/train.lua -phase train -gpu_id 1 -model_dir model -input_feed -prealloc -data_base_dir input_files/img5 -data_path input_files/train.lst -val_data_path input_files/validation.lst -label_path input_files/formulas.norm.lst -vocab_file input_files/latex_vocab.txt -max_num_tokens 150 -max_image_width 500 -max_image_height 160 -batch_size 10 -beam_size 1 , it throws the following error:

[06/28/18 11:52:50]  Command Line Arguments:	
[06/28/18 11:52:50]  -phase train -gpu_id 1 -model_dir model -input_feed -prealloc -data_base_dir input_files/img5 -data_path input_files/train.lst -val_data_path input_files/validation.lst -label_path input_files/formulas.norm.lst -vocab_file input_files/latex_vocab.txt -max_num_tokens 150 -max_image_width 500 -max_image_height 160 -batch_size 10 -beam_size 1	
[06/28/18 11:52:50]  End Command Line Arguments	
[06/28/18 11:52:50]  Using CUDA on GPU 1	
[06/28/18 11:52:50]  Building model	
[06/28/18 11:52:50]  Creating model with fresh parameters	
[06/28/18 11:52:50]  Loading vocab from input_files/latex_vocab.txt	
[06/28/18 11:52:50]  Switching on memory preallocation	
[06/28/18 11:52:50]  cnn_featuer_size: 512	
[06/28/18 11:52:50]  dropout: 0.000000	
[06/28/18 11:52:50]  encoder_num_hidden: 256	
[06/28/18 11:52:50]  encoder_num_layers: 1	
[06/28/18 11:52:50]  decoder_num_hidden: 512	
[06/28/18 11:52:50]  decoder_num_layers: 1	
[06/28/18 11:52:50]  target_vocab_size: 175	
[06/28/18 11:52:50]  target_embedding_size: 80	
[06/28/18 11:52:50]  max_encoder_l_w: 62	
[06/28/18 11:52:50]  max_encoder_l_h: 20	
[06/28/18 11:52:50]  max_decoder_l: 151	
[06/28/18 11:52:50]  input_feed: true	
[06/28/18 11:52:50]  batch_size: 10	
[06/28/18 11:52:50]  prealloc: true	
[06/28/18 11:52:50]  Number of parameters: 9255007	
[06/28/18 11:52:58]  Data base dir input_files/img5	
[06/28/18 11:52:58]  Load training data from input_files/train.lst	
[06/28/18 11:52:58]  Training data loaded from input_files/train.lst	
[06/28/18 11:52:58]  Load validation data from input_files/validation.lst	
[06/28/18 11:52:58]  Validation data loaded from input_files/validation.lst	
[06/28/18 11:52:58]  Lr: 0.100000	
/home/saurabh/torch/install/bin/luajit: bad argument #2 to '?' (out of range at /home/saurabh/torch/pkg/torch/generic/Tensor.c:913)
stack traceback:
	[C]: at 0x7f9cbfbef590
	[C]: in function '__index'
	/home/saurabh/torch/install/share/lua/5.1/image/init.lua:1840: in function 'rgb2y'
	src/data/data_gen.lua:78: in function 'nextBatch'
	src/train.lua:106: in function 'train'
	src/train.lua:289: in function 'main'
	src/train.lua:295: in main chunk
	[C]: in function 'dofile'
	...rabh/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
	[C]: at 0x00405d50

Can anyone tell me what might be causing this? Training with the original dataset works fine.
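The traceback ends in image.rgb2y, which converts a three-channel RGB image to luminance. One possible cause, and this is only an assumption that the thread does not confirm, is that some of the re-rendered images are stored with a different channel count (for example grayscale or RGBA). A minimal sketch for checking the stored channel count of a PNG from its header, using only the standard library, could look like this; the function name is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Samples per pixel as stored, keyed by the PNG color type field.
CHANNELS_BY_COLOR_TYPE = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}


def png_channels(header: bytes) -> int:
    """Return the stored samples per pixel from the first 26 bytes of a PNG."""
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # IHDR data: width (4 bytes), height (4 bytes), bit depth (1), color type (1).
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return CHANNELS_BY_COLOR_TYPE[color_type]


# Synthetic header for a 500x160 8-bit grayscale image (color type 0).
gray_header = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)
    + b"IHDR"
    + struct.pack(">IIBB", 500, 160, 8, 0)
)
print(png_channels(gray_header))
```

For real files, reading the first 26 bytes with open(path, "rb").read(26) and passing them to the function would show whether the new images differ from the original dataset images.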

Recognition fails on images outside the dataset

Hi, I have a question about recognizing images that are not from the dataset. I cropped images from scientific papers, with sizes similar to the images in the dataset. However, the output fails completely, even for simple formulas. For debugging, I also took a screenshot of images from the dataset, but the output for the screenshot fails even though the original images succeed. The screenshot and the original image are almost the same, which really confuses me. Am I missing some preprocessing steps, or do I need to re-train the model with different image sizes? I hope this question doesn't sound too naive, but I really need some help. Thank you very much!

Changed nothing, but the BLEU value increased by 3%

Hi! I recently experimented with your model and found that the accuracy is more than 3% higher than what you report in the paper. I would like to ask whether you have modified the model, or whether you think there may be a problem.
2018-03-27 10:40:10,218 root INFO BLEU = 91.20, 96.9/94.0/91.3/88.6 (BP=0.984, ratio=0.985, hyp_len=537287, ref_len=545740)
2018-03-27 12:02:39,761 root INFO Accuracy (w spaces): 0.821012
2018-03-27 12:02:39,761 root INFO Accuracy (w/o spaces): 0.846854
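As a sanity check on the logged numbers, the standard BLEU definition (brevity penalty times the geometric mean of the n-gram precisions) can be recomputed from the values in the log line. This is a sketch of the textbook formula, not the evaluation script itself, and the function name is hypothetical.

```python
import math


def bleu_from_log(precisions_pct, hyp_len, ref_len):
    """Recompute BLEU (in percent) and the brevity penalty from logged values."""
    # Brevity penalty: 1 if the hypothesis is at least as long as the reference,
    # otherwise exp(1 - ref_len / hyp_len).
    bp = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    # Geometric mean of the n-gram precisions via the mean of their logs.
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * bp * math.exp(log_mean), bp


bleu, bp = bleu_from_log([96.9, 94.0, 91.3, 88.6], hyp_len=537287, ref_len=545740)
print(f"BLEU = {bleu:.2f}, BP = {bp:.3f}")
```

The recomputed values agree with the logged BLEU and BP up to the rounding of the printed precisions, so the log line is at least internally consistent.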

What is the difference between this model and crnn with attn?

issue running preprocess_formulas

when I run

python scripts/preprocessing/preprocess_formulas.py --mode normalize --input-file data/sample/formulas.lst --output-file data/sample/formulas.norm.lst

I get

2016-10-05 05:29:51,614 root INFO Script being executed: scripts/preprocessing/preprocess_formulas.py
D(T)=\left(\begin{array}{cc}a(T)&0\0&a(T)^{-1}\end{array}\right) \ |a(T)|>1

{ [ParseError: KaTeX parse error: Expected 'EOF', got '' at position 68: rray}\right) \̲ |a(T)|>1 ] name: 'ParseError', position: 68 }
A_{ab} \stackrel\mathrm{ def}{\equiv} \frac{\partial ^2L_\mathrm{ q}}{\partial\dot{q}_a^{n_a}\partial \dot{q}_b^{n_b}}.
A _ { a b } \stackrel
[TypeError: Cannot read property 'type' of undefined]
2016-10-05 05:29:52,441 root INFO Jobs finished

I am running it on an ubuntu machine.

Let me know how to fix this.

Code for generating dataset from LaTeX sources

Hi,

Fabulous work here. I am trying to create a dataset for mathematical logic (set / proof / model theory etc). Is it possible to obtain the code you used to create the image dataset from the LaTeX sources?

Best,
Andrew

bad argument #1 to 'load' (string expected, got nil)

I need your help please!

env.lua

Error when testing on the big dataset

When I use the model to test on the large dataset, I get an error. How can I deal with it?
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/cutorch/Tensor.lua:14: cuda runtime error (59) : device-side assert triggered at /root/torch/extra/cutorch/lib/THC/generic/THCTensorCopy.c:18
stack traceback:
[C]: in function 'copy'
/root/torch/install/share/lua/5.1/cutorch/Tensor.lua:14: in function 'localize'
src/model/model.lua:598: in function 'feval'
src/model/model.lua:885: in function 'step'
src/train.lua:116: in function 'train'
src/train.lua:300: in function 'main'
src/train.lua:306: in main chunk
[C]: in function 'dofile'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

High generalization error

I get a high generalization error when predicting on real-world pictures of LaTeX formulas. For example, below is the prediction for one formula picture:
\begin{array} { c c } { { { { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } & { } &

And this is my training result:
EM 14.03 - BLEU-4 74.61 - perplexity -1.42 - Edit 78.67

Has anyone been stuck in the same situation?

Want to train and test on CPU

I wish to test the model on a CPU. How can I achieve that? I tried setting gpu_id to 0, but it returns an error stating that spatialconvolution requires CUDA. Therefore, I added the line cudnn.convert(model[1], nn) in model.lua before line 55 to overcome the error. But now I'm getting another error:

 bad argument #2 to '?' (number expected, got userdata)
stack traceback:
        [C]: in ?
        [C]: in function '__add'
        ../im2markup/src/model/model.lua:593: in function 'feval'
        ../im2markup/src/model/model.lua:887: in function 'step'
        ../im2markup/src/train.lua:111: in function 'train'
        ../im2markup/src/train.lua:291: in function 'main'
        ../im2markup/src/train.lua:297: in main chunk

Any help is appreciated! Thanks

Error importing cudnn

When launching:

th src/train.lua -phase train -gpu_id 1
-model_dir model
-input_feed -prealloc
-data_base_dir data/sample/images_processed/
-data_path data/sample/train_filter.lst
-val_data_path data/sample/validate_filter.lst
-label_path data/sample/formulas.norm.lst
-vocab_file data/sample/latex_vocab.txt
-max_num_tokens 150 -max_image_width 500 -max_image_height 160
-batch_size 20 -beam_size 1

cudnn is not found.
I tried "luarocks install cudnn",
but it still doesn't work.

How to remove the KaTeX parser error

How do I remove this error in KaTeX parsing?

Why downsample by 2 in preprocessing?

Hi, I'm trying to train this model on another real dataset of mathematical expressions.
While doing that, I have a question: why do you downsample images in preprocess_images.py? Can I change downsample_ratio to 1.0 for better performance?
Thanks.

Can you provide a vocab dictionary?

My generated vocab dictionary has 556 tokens, would like to know how many you have? Can you provide your vocab dictionary?
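Vocabulary size depends on which formulas are counted and on any minimum-frequency cutoff, so two runs can easily disagree. The sketch below shows one simple way to build a vocabulary from whitespace-tokenized formulas with such a cutoff. The function name and the min_count parameter are hypothetical and are not taken from the repository's own vocabulary script.

```python
from collections import Counter


def build_vocab(formulas, min_count=1):
    """Return the sorted tokens that occur at least min_count times.

    Assumes the formulas are already normalized so that tokens are
    separated by whitespace.
    """
    counts = Counter(token for formula in formulas for token in formula.split())
    return sorted(token for token, n in counts.items() if n >= min_count)


formulas = [r"\frac { a } { b }", "a + b"]
print(len(build_vocab(formulas, min_count=1)))
print(build_vocab(formulas, min_count=2))
```

Comparing the token lists produced with different cutoffs, or with different subsets of formulas, is one way to see where a size difference comes from.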

The python version of the dataset resource is not working

I want to download the dataset from http://lstm.seas.harvard.edu/latex/im2text_small.tgz, but the URL redirects to https://lstmvis.vizhub.ai/. Can you please tell me the new dataset URL? Thank you very much.

python package

If possible can you make a python package for this?
Will be very grateful!!

Typo in paper?

On page two:

produces a feature grid V of size D × H' × W', where c denotes the number of channels and H' and W' are the reduced sizes from pooling.

I think it should be "D denotes the number of channels"

[regarding real dataset] Please respond

Hello,

I understand that the model cannot generalize without real images of different types and their OCR labels. We can provide such a dataset, with the aim of reaching accuracy comparable to Mathpix. I don't have the hardware to train, so I need a little help with that. Could you share your email address, if possible?

I got poor results on my own screenshots

I took screenshots on my computer to capture formulas from papers, but none of them could be recognized. Are there any special preprocessing methods for such data?

Why using lua instead of python?

Hi, I'm testing this model in two different programming languages, Python and Lua.
But whenever I test it in Python, the result is noticeably worse than the Lua one (by almost 8 percent in Edit Distance Accuracy).
Can you explain why you implemented it in Lua?

'perl' and 'cat' is not recognized

I wanted to run the code on Windows 10, but I got this error and it produces an empty file. I installed Perl and it still doesn't work for me.
python scripts/preprocessing/preprocess_filter.py --filter --image-dir data/sample/images_processed --label-path data/sample/formulas.norm.lst --data-path data/sample/validate.lst --output-path data/sample/validate_filter.lst

2022-05-12 07:39:23,971 root INFO Script being executed: scripts/preprocessing/preprocess_formulas.py
'perl' is not recognized as an internal or external command,
operable program or batch file.
2022-05-12 07:39:23,984 root ERROR FAILED: perl -pe 's|hskip(.*?)(cm|in|pt|mm|em)|hspace{\1\2}|g' ../dataset/im2latex_formulas.lst > ../dataset-preprocess/im2latex_formulas.norm.lst
'cat' is not recognized as an internal or external command,
operable program or batch file.
2022-05-12 07:39:23,997 root ERROR FAILED: cat ../dataset-preprocess/im2latex_formulas.norm.lst.tmp | node scripts/preprocessing/preprocess_latex.js normalize > ../dataset-preprocess/im2latex_formulas.norm.lst
2022-05-12 07:39:23,998 root INFO Jobs finished
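On a machine without perl, the substitution that the logged perl command appears intended to perform can be done with Python's re module instead. This is a sketch of that single step only, under the assumption that the pattern and replacement are read as shown in the log; the function name is hypothetical and the rest of the pipeline (the node script) is not covered.

```python
import re

# Pattern and replacement as they appear in the logged perl command.
HSKIP_PATTERN = re.compile(r"hskip(.*?)(cm|in|pt|mm|em)")


def normalize_hskip(line: str) -> str:
    """Rewrite "hskip<amount><unit>" as "hspace{<amount><unit>}"."""
    return HSKIP_PATTERN.sub(r"hspace{\1\2}", line)


print(normalize_hskip(r"\hskip 0.5 cm"))
```

Applying this to every line of the formulas file and writing the result out would replace the perl call; the cat step can likewise be replaced by opening the file in Python and piping its contents to the node process.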

-

THNN.lua:110: cuda runtime error (59)

Gpu memory usage keep increasing

When I trained the model on my own dataset with 300,000 images, the GPU memory usage kept increasing until it was used up, which killed the training process.
I am new to Torch and need your help to figure out this problem. @da03

Could you upload the code to generate handwriting formulas?

Hi, I want to generate some handwriting images. Can you share your code?

I am getting None with intermediate weights

target vocab size

I found that the provided model has a vocabulary size 525, however, following the preprocessing, I got a vocabulary with size 496.

Model tuning duplicates generated in decoder

After training, I started to evaluate and found the predictions interesting.
The trained model made good predictions on some more complicated LaTeX, such as fractions or square roots, but it failed on some simpler formulas.
For example,
ground truth is "y=x^+2x +1" but the prediction is "y=x^2+2x +2x + 1".
ground truth is "270" but the prediction is "2700".
The decoder duplicates last symbol(s).
Any hint on how to tune the model to alleviate the issue?

My training results look reasonable:
Epoch: 11 Step 43142 - Val Accuracy = 0.923066 Perp = 1.137150
Epoch: 12 Step 47064 - Val Accuracy = nan Perp = 1.138024

How to make code show predicted mathematical expression in latex format

Hello author,
How can I edit the code so that it is suitable for testing on another dataset?
How can I edit the code to show the predicted mathematical expression for new test data?
Thanks.

[Please Respond] Can you help me train the model to recognize images outside the given dataset?

Can you please help with the steps, such as generating data through augmentation and preparing the data for training?

Experimental data set

I found that some of the experiments in your paper were tested on the CROHME dataset, but the CROHME dataset is not in an image format. How do you deal with it? Thank you.

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte

Hi, guys,
I am trying to use the scripts in this repo to preprocess the im2latex dataset, but I got this error:

2020-08-26 17:16:23,199 root INFO Script being executed: scripts/preprocessing/preprocess_formulas.py
Traceback (most recent call last):
File "scripts/preprocessing/preprocess_formulas.py", line 87, in <module>
main(sys.argv[1:])
File "scripts/preprocessing/preprocess_formulas.py", line 65, in main
for line in fin:
File "/home/songyuc/software/python/anaconda/anaconda3/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte

So, how can I solve this?
Any answer or idea will be appreciated!
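The error means the formulas file contains bytes that are not valid UTF-8. A common workaround is to read the file in binary mode and decode each line with a fallback encoding. The sketch below assumes Latin-1 as the fallback, which is a guess about the file's encoding and not something stated in the repository; the function name is hypothetical.

```python
def decode_line(raw: bytes) -> str:
    """Decode a line as UTF-8, falling back to Latin-1 for invalid bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never fails.
        # Whether the result is the intended text is an assumption.
        return raw.decode("latin-1")


# 0xe7 followed by an ASCII letter is invalid UTF-8 but valid Latin-1.
print(repr(decode_line(b"\xe7a")))
```

In practice this would be applied while iterating over open(path, "rb"), or the file could be opened directly with encoding="latin-1" if all lines are known to use it.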

Can you explain the value 'Accuracy'?

Can you explain the value 'Accuracy'?
Thanks.

Can anyone share a trained model file that generalizes to any type of image, like Mathpix?

The model is not working for the types of images below (other than the ones you provide). I think the images need to be in a particular format.

I tested on the types of images below (one determinant and one matrix) and it gives results like
\hspace { 0 . 5 c m }
\hspace { 5 m m }

Training with print frac data

I reproduced the results for printed and handwritten equations. Nice results.
In addition, I generated 2k printed 320x80 fraction equations (e.g., \frac{1}{2}+\frac{1}{2} = 1) as training and validation data.
The training step seems fine, but the test result is the same for any input (e.g., \frac{1}{3}+\frac{1}{3} = \frac{2}{3}).
In this case I set -max_num_tokens 50.
I am wondering whether there are any restrictions on the image format (or shape). Thank you.

Do you have Python code?

To be honest, I'm not very familiar with Lua code, so I wish you had a Python implementation. It's very important for me. Thank you very much.