The tex-to-md's intro from neural-loop

tex-to-md's Introduction

latex2text from pylatexenc (https://pylatexenc.readthedocs.io/en/latest/latex2text/) does the heavy lifting. You can see its raw output by running latex2text on a .tex file.
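For a quick look at what latex2text produces from Python rather than the command line, a minimal sketch using pylatexenc's LatexNodes2Text class follows. The helper name tex_to_text is hypothetical and is not taken from this repository; it only wraps the library call.

```python
def tex_to_text(tex_source: str) -> str:
    """Convert a LaTeX string to plain text with pylatexenc (illustrative helper)."""
    # Imported lazily so this module still loads when pylatexenc is not installed.
    from pylatexenc.latex2text import LatexNodes2Text

    return LatexNodes2Text().latex_to_text(tex_source)


if __name__ == "__main__":
    # Prints the plain-text rendering of a small LaTeX fragment.
    print(tex_to_text(r"\textbf{Hello} world, $x^2$"))
```

This shows only the conversion step; the preprocessing in this project runs before it.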

The pipeline adds preprocessing that runs before the latex2text step (tex_to_md.py) to prepare and update the content.

The most complicated of the handling scripts are:

tex_figure_prep (converts images to png)
tex_figures (converts tex into Hugo shortcodes)
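To illustrate the kind of rewrite tex_figures performs, here is a minimal sketch that turns a LaTeX figure environment into a Hugo shortcode. It assumes Hugo's built-in figure shortcode and assumes the image has already been converted to png; the real script's output format and function names are not shown in this README, so convert_figures and figure_to_shortcode are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumed patterns: a figure environment containing \includegraphics and an
# optional \caption without nested braces. Real papers are messier.
FIGURE_RE = re.compile(r"\\begin\{figure\*?\}(.*?)\\end\{figure\*?\}", re.DOTALL)
GRAPHICS_RE = re.compile(r"\\includegraphics(?:\[[^\]]*\])?\{([^}]+)\}")
CAPTION_RE = re.compile(r"\\caption\{([^{}]*)\}")


def figure_to_shortcode(match: re.Match) -> str:
    """Build a Hugo figure shortcode from one matched figure environment."""
    body = match.group(1)
    graphics = GRAPHICS_RE.search(body)
    if graphics is None:
        return match.group(0)  # no image found: leave the LaTeX untouched
    # Point at the png that the image-conversion step is assumed to have written.
    src = PurePosixPath(graphics.group(1)).with_suffix(".png")
    caption = CAPTION_RE.search(body)
    text = caption.group(1).strip().replace('"', "&quot;") if caption else ""
    return '{{< figure src="%s" caption="%s" >}}' % (src, text)


def convert_figures(tex: str) -> str:
    """Replace every figure environment in the document with a shortcode."""
    return FIGURE_RE.sub(figure_to_shortcode, tex)
```

Running the figure rewrite before latex2text matters because latex2text would otherwise flatten the figure environment to plain text.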

Use:

arxiv-search.py downloads papers into /inputs. (I committed some papers, so it's not needed to run this.)
pipeline.py processes the papers and moves them into /failed, or into /inputs-processed and /markdown when complete.

Workflow:

1: Read the directory of papers 'inputs' and copy to 'inputs-processed'
2: Run various scripts to process the tex files
3: Run tex_to_md.py on the tex
4: Run postprocessing on the md
5: Move the md to the output 'markdown' directory

If you get this far, you can start checking the output to find issues. If you go further, the aimodels site runs on Hugo, which is how I render the pages to be visible.
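The workflow can be sketched roughly as follows. This is not the actual pipeline.py: the function name run_pipeline, the callable parameters, and the assumption that each paper is a directory of .tex files are all hypothetical, chosen only to show the copy, process, convert, postprocess, and move steps together with the /failed fallback.

```python
import shutil
from pathlib import Path


def run_pipeline(root, preprocess, convert, postprocess):
    """Process every paper directory under root/inputs (illustrative sketch)."""
    root = Path(root)
    processed = root / "inputs-processed"
    failed = root / "failed"
    markdown = root / "markdown"
    for directory in (processed, failed, markdown):
        directory.mkdir(parents=True, exist_ok=True)

    outcome = {}
    for paper in sorted(p for p in (root / "inputs").iterdir() if p.is_dir()):
        # Step 1: work on a copy so the original download stays untouched.
        work = processed / paper.name
        shutil.copytree(paper, work, dirs_exist_ok=True)
        out_dir = markdown / paper.name
        try:
            out_dir.mkdir(parents=True, exist_ok=True)
            for tex_file in sorted(work.glob("*.tex")):
                text = tex_file.read_text(encoding="utf-8")
                # Steps 2-4: preprocess the tex, convert it, postprocess the md.
                md = postprocess(convert(preprocess(text)))
                # Step 5: write the result into the markdown output directory.
                (out_dir / (tex_file.stem + ".md")).write_text(md, encoding="utf-8")
            outcome[paper.name] = "markdown"
        except Exception:
            # Any error sends the working copy to /failed for inspection.
            shutil.rmtree(out_dir, ignore_errors=True)
            shutil.move(str(work), str(failed / paper.name))
            outcome[paper.name] = "failed"
    return outcome
```

Passing the three stages in as callables keeps the sketch independent of the real scripts, which can be plugged in where preprocess, convert, and postprocess appear.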
