A tool to take in a list of markdown files and merge them with optional HTML/PDF output
So far, use cases have been tested for merging many modules together in the same guide folder with the same common asset folder. The next generation will allow for a merge file with multiple guide folders and multiple asset folders.
This will allow for custom options.
Rather than
"lorem-module.md": ["noYAML","TOC"]
use
"lorem-module.md": {"noYAML":true,"TOC":"#### Module Contents","timestamp":"06/20/2021"}
merge-markdown requires wkhtmltopdf to be installed first: http://wkhtmltopdf.org/downloads.html
C:\Users\prpurush\AppData\Roaming\npm\node_modules\@knennigtri\merge-markdown\node_modules\wkhtmltopdf\index.js:180
throw new Error(err); // critical error
^
Error: Error: spawn wkhtmltopdf ENOENT
at ChildProcess.<anonymous> (C:\Users\prpurush\AppData\Roaming\npm\node_modules\@knennigtri\merge-markdown\node_modules\wkhtmltopdf\index.js:180:11)
at Object.onceWrapper (events.js:520:26)
at ChildProcess.emit (events.js:400:28)
at Process.ChildProcess._handle.onexit (internal/child_process.js:275:12)
at onErrorNT (internal/child_process.js:467:16)
at processTicksAndRejections (internal/process/task_queues.js:82:21)
{
"input": {
"lorem-frontmatter.md": "",
"lorem-module.md": {"noYAML":true,"TOC":true}
},
"output": "+moduleGuide.md",
"replace":{
"timestamp":"06/01/2021",
"returnToMainTOC": "[...back to main TOC](#course-contents)"
}
}
I tried using
replace:
(^#): "##"
to change my header hierarchy.
My files looked like:
# header 1
## header 2
### header 3
With the regex I expected this result:
## header 1
### header 2
#### header 3
Instead I received:
## header 1
## header 2
### header 3
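The output received matches what JavaScript produces when a pattern is compiled without the `g` and `m` flags. Whether merge-markdown compiles `replace` keys this way is an assumption here, not something confirmed from its source. A minimal sketch of the difference:

```javascript
// The three-line sample from the report above.
const source = "# header 1\n## header 2\n### header 3";

// Without flags, ^ matches only at the very start of the input,
// so exactly one "#" is replaced.
const firstOnly = source.replace(new RegExp("(^#)"), "##");

// With g (all matches) and m (^ matches at every line start),
// each header gains one level.
const everyLine = source.replace(new RegExp("(^#)", "gm"), "##");

console.log(firstOnly);
console.log(everyLine);
```

If that assumption holds, the expected result would require the replacement to run with both flags set.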
This should be allowed
---
input:
  frontmatter.md: ""
  lorem-module/: ""
  ipsum-module/: ""
replace:
  <!--{timestamp}-->: 05/25/2021
  <!--{returnToMainTOC}-->: "[...back to main TOC](#course-contents)"
  <!--{courseTitle}-->: My Course Title
  <!--{author}-->: Chuck Grant
  <!--#-->: ""
  ({#(.*?)}): ""
---
Currently there is support for module/chapter/file-specific TOCs, but there is no option for a global TOC of the final merged document. Previously this was handled by using Typora, but to make this tool editor-agnostic, there needs to be an option to add a final global TOC through the manifest.
Using DocToc, you can specify where the final global TOC should be located:
<!-- START doctoc -->
<!-- END doctoc -->
In the manifest, I think it would be an optional global param, either an array or a comma-separated list of the available configurations.
globalToc: "mode, maxHeaderLevel, title, notitle, entryPrefix, processAll,"
From DocToc:
transform(content, mode, maxHeaderLevel, title, notitle, entryPrefix, processAll, updateOnly)
When a non-typical location is referenced for a markdown file, the temp file is not removed properly.
It seems that right now, when your project is in a path containing a space, it breaks at the PDF creation step, so I need to double-check how pandoc and wkhtmltopdf handle paths.
Potentially inject Module # into the header #
Currently the features customizable to these tools are limited, but I would like to make them as customizable as possible. Based on their projects, these are the features that could be customized:
node-pandoc
If there is a pandoc configuration, the value will automatically be read into the pandoc args of node-pandoc
{
"css": "-c main.css",
"template": "--template template.latex",
"...": "any other commands of pandoc"
}
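As an illustration of reading every value of a pandoc configuration into an argument list, the sketch below flattens the object into an array. `buildPandocArgs` is a hypothetical helper, not part of merge-markdown or node-pandoc, and splitting on whitespace is a simplification that would break arguments containing spaces.

```javascript
// Hypothetical helper: turn { css: "-c main.css", ... } into
// ["-c", "main.css", ...]. The keys are only labels; the values
// carry the actual pandoc arguments.
function buildPandocArgs(pandocConfig = {}) {
  return Object.values(pandocConfig)
    // Ignore anything that is not a non-empty string.
    .filter((value) => typeof value === "string" && value.trim() !== "")
    // Simplification: split each value on whitespace.
    .flatMap((value) => value.trim().split(/\s+/));
}

console.log(buildPandocArgs({
  css: "-c main.css",
  template: "--template template.latex",
}));
```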
wkhtmltopdf
Below is an example, but effectively any command from https://wkhtmltopdf.org/usage/wkhtmltopdf.txt can be used.
{
"B": "1in",
"T": "1in",
"L": ".7in",
"R": ".7in",
"s": "Letter",
"footerLine": true,
"footerCenter": "Page [page]"
}
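To make the relationship between such an options object and the wkhtmltopdf command line concrete, here is a rough sketch of one possible mapping. It is not taken from the wkhtmltopdf wrapper's source; `toWkhtmltopdfArgs` is a hypothetical name, and the rule used (one-letter keys become short flags, camelCase keys become dashed long flags) is an assumption for illustration.

```javascript
// Hypothetical mapping from an options object to CLI arguments.
function toWkhtmltopdfArgs(options = {}) {
  const args = [];
  for (const [key, value] of Object.entries(options)) {
    // Skip options that are switched off or unset.
    if (value === false || value == null) continue;
    const flag = key.length === 1
      ? `-${key}` // "B" -> "-B"
      : `--${key.replace(/[A-Z]/g, (c) => `-${c.toLowerCase()}`)}`; // "footerLine" -> "--footer-line"
    args.push(flag);
    // Boolean true means a bare switch; anything else is the flag's value.
    if (value !== true) args.push(String(value));
  }
  return args;
}

console.log(toWkhtmltopdfArgs({ B: "1in", s: "Letter", footerLine: true, footerCenter: "Page [page]" }));
```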
Currently it's assumed that the assets folder will be at the root location so that all markdown files share a common location for assets. This allows for the final markdown file to be created with correct relative links to the assets. Unfortunately, this requires the content creator to know this and architect the project accordingly.
The goal is to learn the relative link from each source md to its assets and then create an updated relative link from the final md to those assets.
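The goal above can be sketched with Node's `path` module: resolve each link against the source file's folder, then express it relative to the output file's folder. `rebaseLink` and `rebaseMarkdownLinks` are hypothetical names, and the link-matching regex is deliberately naive (inline `](target)` links only).

```javascript
const path = require("node:path");

// Re-express a relative link found in sourceFile so that it still
// points at the same target when it appears in outputFile.
function rebaseLink(link, sourceFile, outputFile) {
  // Leave URLs with a scheme, in-page anchors and absolute paths untouched.
  if (/^([a-z][a-z0-9+.-]*:|#|\/)/i.test(link)) return link;
  const absolute = path.resolve(path.dirname(sourceFile), link);
  const rebased = path.relative(path.dirname(outputFile), absolute);
  // Markdown links use forward slashes on every platform.
  return rebased.split(path.sep).join("/");
}

// Apply rebaseLink to every inline link or image target in a document.
function rebaseMarkdownLinks(markdown, sourceFile, outputFile) {
  return markdown.replace(/(\]\()([^)\s]+)/g, (match, open, link) =>
    open + rebaseLink(link, sourceFile, outputFile));
}

console.log(rebaseLink("../assets/img.png", "guide/module1/lorem.md", "merged/out.md"));
```

With a sketch like this, the assets folder would no longer need to sit at the project root, since each link is recomputed from where the source file actually lives.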
It would be great if there was an option to disable the "auto-resolution" feature.
I'm currently using merge-markdown to combine Marp documents, and Marp requires relative paths.
Allows for
Hello there! Thanks for creating this. May I know how I can merge all md files in a directory, including those in sub-directories?
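Independently of what merge-markdown itself supports, the list of files could be collected with a short script and then written into a manifest. The sketch below is one way to walk a folder recursively; `listMarkdownFiles` is a hypothetical helper and the alphabetical ordering is an assumption about the desired merge order.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Return every .md file under dir, including sub-directories,
// in alphabetical order within each folder.
function listMarkdownFiles(dir) {
  const entries = fs
    .readdirSync(dir, { withFileTypes: true })
    .sort((a, b) => a.name.localeCompare(b.name));
  const files = [];
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into the sub-directory and keep its files in place.
      files.push(...listMarkdownFiles(fullPath));
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith(".md")) {
      files.push(fullPath);
    }
  }
  return files;
}
```

Calling `listMarkdownFiles("docs")` would return paths such as `docs/intro.md` and `docs/sub/detail.md`, which can then be listed under `input` in a manifest.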
error message:
`merge-markdown -m SUMMARY.md
Found manifest to use: SUMMARY.md
Manifest file does not contain valid YAML or JSON content.
"... is not valid JSONd token '#', "# Summary
at JSON.parse ()
at getManifestJSON (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:174:22)
at Object.init [as run] (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:116:24)
at Object. (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\bin\global.js:5:15)
at Module._compile (node:internal/modules/cjs/loader:1376:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
at Module.load (node:internal/modules/cjs/loader:1207:32)
at Module._load (node:internal/modules/cjs/loader:1023:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:135:12)
at node:internal/main/run_main_module:28:49
Manifest does not exist or has incorrect syntax. Choose a valid folder or file.
Usage: merge-markdown [ARGS]
Arguments:
-m Path to input folder, yaml, or json manifest
-v, --version Displays version of this package
--qa QA mode.
--nolinkcheck Skips linkchecking
--pdf Output to PDF. wkhtmltopdf must be installed http://wkhtmltopdf.org/downloads.html
--html Output to HTML
-h, --help Displays this screen
-h [manifest|options|outputOptions|qa] See examples
Default manifest: manifest.[md|yaml|yml|json] unless specified in -m.
C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:124
throw err;
^
"... is not valid JSONd token '#', "# Summary
at JSON.parse ()
at getManifestJSON (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:174:22)
at Object.init [as run] (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:116:24)
at Object. (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\bin\global.js:5:15)
at Module._compile (node:internal/modules/cjs/loader:1376:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
at Module.load (node:internal/modules/cjs/loader:1207:32)
at Module._load (node:internal/modules/cjs/loader:1023:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:135:12)
at node:internal/main/run_main_module:28:49
Node.js v20.10.0`
content of manifest file:
`# Summary
-m can take in:
If it's a folder, check for a manifest.json file. If there is no manifest file, automatically merge all files in the folder together with the default manifest options. Name the output the same as the input folder.
If --qa is set and manifest.qa.exclude does not exist, the value defaults to frontmatter.
If -m is not specified, look for a manifest.json at the root. If there is no manifest file, automerge the files in the root together. The output file will be generically called merge.md.
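The folder and no-argument rules above could be sketched as a small decision function. This is only an outline of the proposal, not the tool's current behavior; `planMerge`, the returned object shape, and the `.md` suffix on the folder-named output are assumptions.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Decide what to do for a given -m value (undefined when -m is omitted).
// Throws if the resolved path does not exist.
function planMerge(mArg, cwd = process.cwd()) {
  const target = path.resolve(cwd, mArg ?? ".");
  if (fs.statSync(target).isDirectory()) {
    const manifest = path.join(target, "manifest.json");
    if (fs.existsSync(manifest)) return { mode: "manifest", manifest };
    // No manifest: merge everything in the folder with default options.
    const output = mArg == null ? "merge.md" : `${path.basename(target)}.md`;
    return { mode: "automerge", folder: target, output };
  }
  // Anything that is not a folder is treated as the manifest file itself.
  return { mode: "manifest", manifest: target };
}
```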
Currently the markdown output defaults to the input folder. It should be updated to write the output files to the current working directory.
Users should be able to use a toc or doctoc option within the manifest. This will be important once #47 is implemented so users understand where the different customizable options for TOC come from.
Currently the features customizable to these tools are limited, but I would like to make them as customizable as possible. Based on their projects, these are the features that could be customized:
doctoc
Based on the method transform(files, mode, maxHeaderLevel, title, notitle, entryPrefix, processAll, stdOut, updateOnly), these will be customizable. Below are the defaults:
{
"mode": "github.com",
"maxlevel": 3,
"title": "",
"notitle": false,
"all": true,
"update-only": true
}
A manifest should also be able to take in other manifests and rerun the process. Here's how it would work:
Input:
The tool would recognize it's a manifest file and recursively call the merge.js script.
merge.js produces a moduleFinal.md file that is now taken in as the inputFile.
Currently output options are mixed in with global options. Ideally the output needs to read something like:
name, else look for manifest.output.name
Ideally they should be nested under output:
---
input:
  ../../frontmatter.md: ''
  mymodule.md: {noYAML: true, TOC: true, replace: {<!--#-->: "Module 1:"}}
output:
  name: "merged/mymerged.md"
  mergedTOC: true
  pandoc:
    css: -c main.css
    latexTemplate: --template template.latex
    title: -M title:Example
  wkhtmltopdf:
    marginBottom: 1in
    marginTop: 1in
    marginLeft: .7in
    marginRight: .7in
    pageSize: Letter
    footerLine: true
    footerCenter: Page [page]
qa: {exclude: "(frontmatter)"}
replace:
  <!--{copyrightYear}-->: 2022
  <!--{timestamp}-->: 10/22/2022
  <!--{returnToMainTOC}-->: "[Return to Course Contents](#course-contents)"
  <!--{courseType}-->: Activity Guide
  <!--{courseTitle}-->: My Course Title
  <!--{courseCreator}-->: The Merge Company
  <!--{author}-->: Ronan Boxer
---
As a user, I want to be able to add regex values to the input param so that I don't have to manually write out every single file in the manifest. Examples:
module-
input:
  frontmatter: ""
  module-*: {noYAML: true}
input:
  frontmatter: ""
  ^[0-9]: {noYAML: true}
word
input:
  frontmatter: ""
  word$: {noYAML: true}
module/ ending in -chapter.md
input:
  /(?<=module/).*(?=-chapter.md)/g: ""
  module-*: {noYAML: true}
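One way the requested behavior might work is to expand each `input` key against the list of available files before merging. The sketch below treats every key that is not an exact filename as a JavaScript regular expression; `expandInput` is a hypothetical helper, and the rule that the first matching key wins is an assumption.

```javascript
// Expand regex keys in a manifest's input section into concrete files.
function expandInput(input, availableFiles) {
  const expanded = {};
  for (const [key, options] of Object.entries(input)) {
    // An exact filename is used as-is.
    if (availableFiles.includes(key)) {
      if (!(key in expanded)) expanded[key] = options;
      continue;
    }
    let pattern;
    try {
      pattern = new RegExp(key);
    } catch {
      continue; // Not a valid regex: skip the key in this sketch.
    }
    for (const file of availableFiles) {
      // First matching key wins; later keys do not override it.
      if (pattern.test(file) && !(file in expanded)) expanded[file] = options;
    }
  }
  return expanded;
}

const files = ["frontmatter.md", "01-intro.md", "02-body.md", "appendix.md"];
console.log(expandInput({ frontmatter: "", "^[0-9]": { noYAML: true } }, files));
```

Here `frontmatter` matches `frontmatter.md` and `^[0-9]` picks up the two numbered files, while `appendix.md` is left out because no key matches it.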
This would allow for course and modules to use 1 frontmatter for the same process
hello!
Is it possible to merge all files in a folder without the YAML front matter?
I could only merge files without YAML front matter by using the myManifest.md file and the option for each file e.g.
---
input:
  00-1-title-pages.md: {noYAML: true}
  00-2-toc.md: {noYAML: true}
  00-3-figures.md: {noYAML: true}
  00-4-tables.md: {noYAML: true}
  00-5-acronyms-and-definitions.md: {noYAML: true}
  01-introduction.md: {noYAML: true}
  02-0-methodology.md: {noYAML: true}
  02-1-method-litrev.md: {noYAML: true}
  03--litrev-overview.md: {noYAML: true}
  03-1-litrev-bci.md: {noYAML: true}
  03-2-litrev-nf.md: {noYAML: true}
  03-3-litrev-meditation.md: {noYAML: true}
  03-4-litrev-neuromeditation.md: {noYAML: true}
  03-5-litrev-bcmi.md: {noYAML: true}
  03-6-litrev-syncronisation-entrainment.md: {noYAML: true}
output: myOutput.md
---
But my list of files is actually longer and sometimes the filenames change.
Thanks! k
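Until there is a built-in option, one workaround for a long or changing file list might be to generate the manifest text with a script. The sketch below builds the same structure as the manifest above from an array of filenames; `buildManifest` is a hypothetical helper, and it assumes simple filenames that need no YAML quoting.

```javascript
// Build a manifest that applies {noYAML: true} to every file.
function buildManifest(files, outputName) {
  const lines = ["---", "input:"];
  // Sort a copy so the caller's array is left untouched.
  for (const file of [...files].sort()) {
    lines.push(`  ${file}: {noYAML: true}`);
  }
  lines.push(`output: ${outputName}`, "---");
  return lines.join("\n") + "\n";
}

console.log(buildManifest(["01-introduction.md", "00-1-title-pages.md"], "myOutput.md"));
```

The returned string could be written to a manifest file and passed to `-m`, so the list is regenerated whenever filenames change.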
I'd like to open a few methods up (and make sure they are up to date) to allow other applications to depend on mine. Things I want to open up:
When I have an external link in my file that contains a colon, the execution of merge-markdown becomes very slow (it scales with the number of external links I have).
For example, if I have a line
See the google doc file: [googledoc](https://docs.google.com/)
Then it will take very long to finish executing. Please help.
Currently it exists in the copyright with
Fallback options should go:
Local then Global then Default
This applies to:
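The local, then global, then default order could be expressed as a lookup over three option objects. This is a sketch of the proposed rule only; `resolveOption` and the three-object layout are hypothetical.

```javascript
// Return the first defined value for name, checking the file-level
// (local) options, then the manifest-level (global) options, then defaults.
function resolveOption(name, localOptions = {}, globalOptions = {}, defaults = {}) {
  for (const scope of [localOptions, globalOptions, defaults]) {
    if (
      scope != null &&
      Object.prototype.hasOwnProperty.call(scope, name) &&
      scope[name] !== undefined
    ) {
      // An explicit false counts as a value, so it is not skipped.
      return scope[name];
    }
  }
  return undefined;
}

console.log(resolveOption("TOC", { TOC: false }, { TOC: true }, { TOC: true }));
```

Checking for `undefined` rather than truthiness matters here, because a file-level `TOC: false` should override a global `TOC: true`.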
Hello,
It appears that there is some kind of race condition when trying to merge files.
This does not happen every time but often enough.
You can reproduce the issue with this manifest (all the markdown files are empty):
---
input:
1.md: {noYAML: true}
2.md: {noYAML: true}
3.md: {noYAML: true}
4.md: {noYAML: true}
5.md: {noYAML: true}
6.md: {noYAML: true}
7.md: {noYAML: true}
8.md: {noYAML: true}
9.md: {noYAML: true}
10.md: {noYAML: true}
TOC: false
output: bug.md
---
output:
❯ merge-markdown
No -m argument given. Using default: manifest.[md|yaml|yml|json]
Found manifest to use: manifest.yml
*********.//1.md*********
1.md.temp added to merge list
*********.//2.md*********
2.md.temp added to merge list
*********.//3.md*********
3.md.temp added to merge list
*********.//4.md*********
4.md.temp added to merge list
*********.//5.md*********
5.md.temp added to merge list
*********.//6.md*********
6.md.temp added to merge list
*********.//7.md*********
7.md.temp added to merge list
*********.//8.md*********
8.md.temp added to merge list
*********.//9.md*********
9.md.temp added to merge list
*********.//10.md*********
10.md.temp added to merge list
++++++++++++++++++++
List of files to merge:
.//1.md.temp
.//2.md.temp
.//3.md.temp
.//4.md.temp
.//5.md.temp
.//6.md.temp
.//7.md.temp
.//8.md.temp
.//9.md.temp
.//10.md.temp
node:internal/process/promises:288
triggerUncaughtException(err, true /* fromPromise */);
^
[Error: ENOENT: no such file or directory, open '/private/tmp/poc/5.md.temp'] {
errno: -2,
code: 'ENOENT',
syscall: 'open',
path: '/private/tmp/poc/5.md.temp'
}
Node.js v18.8.0
Thanks for the tool!