knennigtri / merge-markdown

A tool to take in a list of markdown files and merge them with optional HTML/PDF output

JavaScript 96.64% Dockerfile 3.36%
doctoc html markdown markdown-link-check merge noyaml pandoc pdf presentation wkhtmltopdf

merge-markdown's People

Contributors

knennigtri, xabiergoros

merge-markdown's Issues

multi root merge file

Currently, the tested use cases merge many modules within a single guide folder that shares one common assets folder. The next generation will allow a merge file that spans multiple guide folders and multiple asset folders:

  • GuideA folder
    • assets folder
    • moduleA1 folder
      • moduleMergedfile.md << covered
    • moduleA2 folder
    • guideMergedfile.md << covered
  • GuideB folder
    • assets folder
    • moduleB1 folder
    • moduleB2 folder
  • multiGuideMergedFile.md << not covered

update options to be JSON rather than array

This will allow for custom options.
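One way to accept both the older array form and the object form proposed in this issue is to normalize them into a single shape before use. The sketch below is only an illustration; `normalizeOptions` is a hypothetical helper name, not part of merge-markdown.

```javascript
// Hypothetical helper (not from merge-markdown): accepts the legacy array
// form or the proposed object form and always returns a plain object.
function normalizeOptions(options) {
  if (Array.isArray(options)) {
    // ["noYAML", "TOC"] becomes { noYAML: true, TOC: true }
    return Object.fromEntries(options.map((name) => [name, true]));
  }
  // An empty string or a missing value means "no options".
  if (options === "" || options == null) {
    return {};
  }
  // Already an object: copy it so callers cannot mutate the manifest.
  return { ...options };
}

console.log(normalizeOptions(["noYAML", "TOC"]));
console.log(normalizeOptions({ noYAML: true, TOC: "#### Module Contents" }));
```

Normalizing once at the edge keeps the rest of the code free of "array or object?" checks and leaves room for custom option values.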

Rather than
"lorem-module.md": ["noYAML","TOC"]
use
"lorem-module.md": {"noYAML":true,"TOC":"#### Module Contents","timestamp":"06/20/2021"}

Error out better when a user doesn't have wkhtmltopdf installed

merge-markdown requires wkhtmltopdf installed first: http://wkhtmltopdf.org/downloads.html

C:\Users\prpurush\AppData\Roaming\npm\node_modules\@knennigtri\merge-markdown\node_modules\wkhtmltopdf\index.js:180
    throw new Error(err); // critical error
    ^
Error: Error: spawn wkhtmltopdf ENOENT
    at ChildProcess.<anonymous> (C:\Users\prpurush\AppData\Roaming\npm\node_modules\@knennigtri\merge-markdown\node_modules\wkhtmltopdf\index.js:180:11)
    at Object.onceWrapper (events.js:520:26)
    at ChildProcess.emit (events.js:400:28)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:275:12)
    at onErrorNT (internal/child_process.js:467:16)
    at processTicksAndRejections (internal/process/task_queues.js:82:21)
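A friendlier failure could come from checking for the executable before attempting the PDF step. The following is a minimal sketch under that assumption; `isInstalled` and `assertWkhtmltopdf` are hypothetical names, and the check relies only on Node's `child_process.spawnSync`, which reports a missing executable through `result.error` with the code `ENOENT`.

```javascript
const { spawnSync } = require("node:child_process");

// Returns true when `command` can be started, false when the OS cannot find it.
function isInstalled(command) {
  const result = spawnSync(command, ["--version"], { stdio: "ignore" });
  // A missing executable surfaces as result.error with code "ENOENT".
  return !(result.error && result.error.code === "ENOENT");
}

// Hypothetical pre-flight check to run before any PDF generation.
function assertWkhtmltopdf() {
  if (!isInstalled("wkhtmltopdf")) {
    throw new Error(
      "wkhtmltopdf was not found on the PATH. " +
        "Install it from http://wkhtmltopdf.org/downloads.html and try again."
    );
  }
}
```

Calling `assertWkhtmltopdf()` up front would replace the raw `spawn wkhtmltopdf ENOENT` stack trace with a single actionable message.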

Allow global replacements

{
	"input": {
		"lorem-frontmatter.md": "",
		"lorem-module.md": {"noYAML":true,"TOC":true}
	},
	"output": "+moduleGuide.md",
	"replace":{
		"timestamp":"06/01/2021",
		"returnToMainTOC": "[...back to main TOC](#course-contents)"
	}
}
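A global `replace` map like the one above could be applied to the merged text in a single pass. This sketch assumes each key is treated as a literal string to search for; how merge-markdown actually interprets keys is not stated here. `applyReplacements` is a hypothetical name, and the placeholder syntax in the usage line is borrowed from other issues on this page.

```javascript
// Hypothetical helper: replaces every literal occurrence of each key
// in `text` with the corresponding value from the `replace` map.
function applyReplacements(text, replace) {
  let result = text;
  for (const [key, value] of Object.entries(replace)) {
    // split/join replaces all occurrences without any regex escaping.
    result = result.split(key).join(String(value));
  }
  return result;
}

const merged = "Updated <!--{timestamp}-->\n<!--{returnToMainTOC}-->";
console.log(
  applyReplacements(merged, {
    "<!--{timestamp}-->": "06/01/2021",
    "<!--{returnToMainTOC}-->": "[...back to main TOC](#course-contents)",
  })
);
```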

replace by regex doesn't match all cases

I tried using

replace:
(^#): "##"

to change my headers hierarchy.

My files looked like:

# header 1
## header 2
### header 3

With this regex I expected the result:

## header 1
### header 2
#### header 3

Instead I received:

## header 1
## header 2
### header 3
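One plausible explanation for this result, assuming the pattern is compiled without the multiline flag (the tool's internals are not shown here), is that `^` then anchors only at the start of the whole text, so only the first header is changed. The sketch below reproduces both behaviors in plain JavaScript.

```javascript
const source = "# header 1\n## header 2\n### header 3";

// Without the m flag, ^ matches only at the start of the whole string,
// so only the first header gains an extra "#".
const firstOnly = source.replace(new RegExp("^#", "g"), "##");

// With the m flag, ^ matches at the start of every line,
// so each header is pushed down one level.
const everyLine = source.replace(new RegExp("^#", "gm"), "##");

console.log(firstOnly);
console.log(everyLine);
```

The first result matches the output reported above and the second matches the expected output, which suggests the `m` flag is the piece to check.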

folders in input

This should be allowed

---
input:
 frontmatter.md: ""
 lorem-module/: ""
 ipsum-module/: ""
replace:
 <!--{timestamp}-->: 05/25/2021
 <!--{returnToMainTOC}-->: "[...back to main TOC](#course-contents)"
 <!--{courseTitle}-->: My Course Title
 <!--{author}-->: Chuck Grant
 <!--#-->: ""
 ({#(.*?)}): ""
---

Allow option for global TOC

Currently there is support for module-, chapter-, or file-specific TOCs, but there is no option for a global TOC of the final merged document. Previously this was handled with Typora, but to keep this tool editor-agnostic, there needs to be an option to add a final global TOC through the manifest.

Using DocToc, you can specify where the final global TOC should be located:

<!-- START doctoc -->
<!-- END doctoc -->

In the manifest, I think it would be an optional global parameter: either an array or a comma-separated list of the available configurations.

globalToc: "mode, maxHeaderLevel, title, notitle, entryPrefix, processAll,"

From Doctoc

transform(content, mode, maxHeaderLevel, title, notitle, entryPrefix, processAll, updateOnly)

Cover for spaces in the paths

It seems that when the project lives in a path containing a space, the PDF creation step breaks, so I need to double-check how pandoc and wkhtmltopdf handle paths.
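A common source of this kind of breakage is building a single command string that a shell then splits on whitespace. Whether that is the cause here is an assumption. The sketch below shows the usual remedy: pass each argument as a separate array element so no shell parsing is involved. It uses Node itself as a stand-in child process, since pandoc and wkhtmltopdf may not be installed; `echoArgument` is a hypothetical name.

```javascript
const { spawnSync } = require("node:child_process");

// Starts a child process with `arg` as one argument and returns what the
// child received. Arguments passed as an array are not split on spaces.
function echoArgument(arg) {
  const result = spawnSync(
    process.execPath, // stand-in for pandoc or wkhtmltopdf
    ["-e", "process.stdout.write(process.argv[1])", arg],
    { encoding: "utf8" }
  );
  return result.stdout;
}

console.log(echoArgument("my guide/merged file.md"));
```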

Optional full configuration of pandoc and wkhtmltopdf

Currently the customizable features of these tools are limited, but I would like to make them as customizable as possible. Based on their projects, these are the features that could be customized:

node-pandoc
If there is a pandoc configuration, the value will automatically be read into the pandoc args of node-pandoc

{
"css": "-c main.css",
"template": "--template template.latex",
"...": "any other commands of pandoc"
}

wkhtmltopdf
Below is an example, but effectively any command from https://wkhtmltopdf.org/usage/wkhtmltopdf.txt can be used.

{
 "B": "1in",
 "T": "1in",
 "L": ".7in",
 "R": ".7in",
 "s": "Letter",
 "footerLine": true,
 "footerCenter": "Page [page]"
}

Allow for assets in any project location

Currently it's assumed that the assets folder will be at the root location so that all markdown files share a common location for assets. This allows for the final markdown file to be created with correct relative links to the assets. Unfortunately this requires the content creator to know this and architect the project accordingly.

The goal is to learn the relative link from source md > assets and then create an updated relative link from final md > assets.
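The rewrite described above can be sketched with Node's `path` module: first anchor the link to the source file's folder, then express it relative to the output file's folder. `rebaseLink` and the file names in the usage lines are hypothetical, and POSIX-style separators are assumed because markdown links use forward slashes.

```javascript
const path = require("node:path").posix;

// Rewrites `link` (relative to sourceFile) so it is relative to outputFile.
function rebaseLink(link, sourceFile, outputFile) {
  // Where the asset really lives, relative to the project root.
  const fromRoot = path.join(path.dirname(sourceFile), link);
  // The same asset, as seen from the folder of the merged output file.
  return path.relative(path.dirname(outputFile), fromRoot);
}

console.log(
  rebaseLink("../assets/img.png", "GuideA/moduleA1/module.md", "GuideA/guideMerged.md")
);
console.log(rebaseLink("assets/a.png", "GuideA/module.md", "merged/out.md"));
```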

Option to disable "auto-resolution"

It would be great if there was an option to disable the "auto-resolution" feature.

I'm currently using merge-markdown to combine Marp documents, and Marp requires relative paths.

merging nested md files

Hello there! Thanks for creating this. How can I merge all md files in a directory, including those in subdirectories?
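As a workaround sketch, the list of files could be gathered with a small recursive walk and then written into a manifest. This is not a feature of merge-markdown as described here; `listMarkdownFiles` is a hypothetical helper that uses only Node's `fs` and `path` modules.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Returns every .md file under `dir`, including subdirectories,
// visiting entries in name order so the result is stable.
function listMarkdownFiles(dir) {
  const found = [];
  const entries = fs
    .readdirSync(dir, { withFileTypes: true })
    .sort((a, b) => a.name.localeCompare(b.name));
  for (const entry of entries) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...listMarkdownFiles(fullPath));
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith(".md")) {
      found.push(fullPath);
    }
  }
  return found;
}
```

Each returned path could then become one key under `input:` in a generated manifest.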

error when merging manifest file

error message:
`merge-markdown -m SUMMARY.md
Found manifest to use: SUMMARY.md
Manifest file does not contain valid YAML or JSON content.
"... is not valid JSONd token '#', "# Summary
at JSON.parse ()
at getManifestJSON (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:174:22)
at Object.init [as run] (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:116:24)
at Object. (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\bin\global.js:5:15)
at Module._compile (node:internal/modules/cjs/loader:1376:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
at Module.load (node:internal/modules/cjs/loader:1207:32)
at Module._load (node:internal/modules/cjs/loader:1023:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:135:12)
at node:internal/main/run_main_module:28:49
Manifest does not exist or has incorrect syntax. Choose a valid folder or file.
Usage: merge-markdown [ARGS]
Arguments:
-m Path to input folder, yaml, or json manifest
-v, --version Displays version of this package
--qa QA mode.
--nolinkcheck Skips linkchecking
--pdf Output to PDF. wkhtmltopdf must be installed http://wkhtmltopdf.org/downloads.html
--html Output to HTML
-h, --help Displays this screen
-h [manifest|options|outputOptions|qa] See examples
Default manifest: manifest.[md|yaml|yml|json] unless specified in -m.

C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:124
throw err;
^

"... is not valid JSONd token '#', "# Summary
at JSON.parse ()
at getManifestJSON (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:174:22)
at Object.init [as run] (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\index.js:116:24)
at Object. (C:\Program Files (x86)\Nodist\bin\node_modules@knennigtri\merge-markdown\bin\global.js:5:15)
at Module._compile (node:internal/modules/cjs/loader:1376:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
at Module.load (node:internal/modules/cjs/loader:1207:32)
at Module._load (node:internal/modules/cjs/loader:1023:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:135:12)
at node:internal/main/run_main_module:28:49

Node.js v20.10.0`

content of manifest file:
`# Summary

Make -m simpler

-m can take in:

  • folder
  • manifest file

If it's a folder, check for a manifest.json file. If there is no manifest file, automatically merge all files in the folder using the default manifest options, and name the output after the input folder.

If --qa is set and manifest.qa.exclude does not exist, the value defaults to frontmatter.

If -m is not specified, look for a manifest.json at the root. If there is no manifest file, automatically merge the files in the root. The output file will be given the generic name merge.md.

Add simple example in README

I have the following folder:
image

How do I generate a single md? I couldn't figure it out from the README.
Maybe add the trivial case first in the README, before discussing variations.

update -m folder use case

Currently the markdown output defaults to the input folder. It should be updated to write the output files to the current working directory.

rename toc to doctoc

Users should be able to use either the toc or the doctoc option within the manifest. This will be important once #47 is implemented so users understand where the different customizable options for TOC come from.

Customizable Doctoc options

Currently the customizable features of these tools are limited, but I would like to make them as customizable as possible. Based on their projects, these are the features that could be customized:

doctoc
based on this method transform(files, mode, maxHeaderLevel, title, notitle, entryPrefix, processAll, stdOut, updateOnly)
these will be customizable and below are the defaults:

{
"mode": "github.com",
"maxlevel": 3,
"title": "",
"notitle": false,
"all": true,
"update-only": true
}
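Customizable options with defaults are often implemented by spreading the user's values over a defaults object. The sketch below uses the default values listed above; `resolveDoctocOptions` is a hypothetical name, and how the result would be mapped onto doctoc's `transform` arguments is left open.

```javascript
// Defaults taken from the list in this issue.
const DOCTOC_DEFAULTS = {
  mode: "github.com",
  maxlevel: 3,
  title: "",
  notitle: false,
  all: true,
  "update-only": true,
};

// Hypothetical helper: user-supplied values override the defaults.
function resolveDoctocOptions(userOptions = {}) {
  return { ...DOCTOC_DEFAULTS, ...userOptions };
}

console.log(resolveDoctocOptions({ maxlevel: 2, title: "#### Module Contents" }));
```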

manifest of manifests

A manifest should also be able to take in other manifests and rerun the process. Here's how it would work:
Input:

  • manifest1.json
  • manifest.json

Output:

  • final.md

The tool would recognize that an input is a manifest file and recursively call the merge.js script.
merge.js produces a moduleFinal.md file that is then taken in as the input file.

Add output options to output

Currently output options are mixed in with global options. Ideally the output needs to read something like:

  • if manifest.output===string then manifest.output = name, else look for manifest.output.name
  • pandoc options = manifest.output.pandoc || manifest.pandoc
  • wkhtmltopdf options = manifest.output.wkhtmltopdf || manifest.wkhtmltopdf
  • mergedTOC = manifest.output.mergedTOC || manifest.TOC

Ideally they should be nested under output:

---
input:
 ../../frontmatter.md: ''
 mymodule.md: {noYAML: true, TOC: true, replace: {<!--#-->: "Module 1:"}}
output: 
 name: "merged/mymerged.md"
 mergedTOC: true
 pandoc:
  css: -c main.css
  latexTemplate: --template template.latex
  title: -M title:Example
 wkhtmltopdf:
  marginBottom: 1in
  marginTop: 1in
  marginLeft: .7in
  marginRight: .7in
  pageSize: Letter
  footerLine: true
  footerCenter: Page [page]
qa: {exclude: "(frontmatter)"}
replace:
 <!--{copyrightYear}-->: 2022
 <!--{timestamp}-->: 10/22/2022
 <!--{returnToMainTOC}-->: "[Return to Course Contents](#course-contents)"
 <!--{courseType}-->: Activity Guide
 <!--{courseTitle}-->: My Course Title
 <!--{courseCreator}-->: The Merge Company
 <!--{author}-->: Ronan Boxer
---
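The fallback rules listed in this issue translate fairly directly into code. The following is a sketch of those rules only; `resolveOutput` is a hypothetical name and the manifest is assumed to be already parsed into a plain object.

```javascript
// Hypothetical helper implementing the fallback rules from this issue:
// nested values under `output` win, top-level values are the fallback.
function resolveOutput(manifest) {
  const output =
    typeof manifest.output === "string"
      ? { name: manifest.output } // output given as a bare file name
      : { ...(manifest.output || {}) }; // output given as an object

  return {
    name: output.name,
    pandoc: output.pandoc || manifest.pandoc,
    wkhtmltopdf: output.wkhtmltopdf || manifest.wkhtmltopdf,
    mergedTOC: output.mergedTOC || manifest.TOC,
  };
}

console.log(resolveOutput({ output: "merged.md", TOC: true }));
console.log(resolveOutput({ output: { name: "merged/mymerged.md", mergedTOC: true } }));
```

Because the rules use `||`, an explicit `mergedTOC: false` under `output` still falls back to the top-level `TOC` value; `??` would be the choice if an explicit `false` should win.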

Regex for input

As a user, I want to be able to add regex values to the input param so that I don't have to manually write out every single file in the manifest. Examples:

Any file in the current directory starting with module-

input:
  frontmatter: ""
  module-*:  {noYAML: true}

Any file in the current directory starting with a number

input:
  frontmatter: ""
  ^[0-9]:  {noYAML: true}

Any file in the current directory ending with a word

input:
  frontmatter: ""
  word$:  {noYAML: true}

Any file in module/ ending in -chapter.md

input:
  /(?<=module/).*(?=-chapter.md)/g: ""
  module-*: {noYAML: true}
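The matching itself could be as small as a filter over the file names in a directory. The sketch below assumes each input key is compiled with `new RegExp`; `selectInputs` and the file names are hypothetical. Note that `module-*` read as a regex means "module" followed by zero or more hyphens, so it behaves like a glob only by accident; `^module-` states the intent more precisely.

```javascript
// Hypothetical helper: returns the file names matched by a regex pattern,
// sorted so the merge order is predictable.
function selectInputs(fileNames, pattern) {
  const matcher = new RegExp(pattern);
  return fileNames.filter((name) => matcher.test(name)).sort();
}

const files = ["frontmatter.md", "module-2.md", "01-intro.md", "module-1.md"];
console.log(selectInputs(files, "^module-"));
console.log(selectInputs(files, "^[0-9]"));
```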

merge-markdown -m path/to/files without YAML

hello!

Is it possible to merge all files in a folder without the YAML front matter?

I could only merge files without YAML front matter by using the myManifest.md file and the option for each file e.g.

---
input:
  00-1-title-pages.md: {noYAML: true}
  00-2-toc.md: {noYAML: true}
  00-3-figures.md: {noYAML: true}
  00-4-tables.md: {noYAML: true}
  00-5-acronyms-and-definitions.md: {noYAML: true}
  01-introduction.md: {noYAML: true}
  02-0-methodology.md: {noYAML: true}
  02-1-method-litrev.md: {noYAML: true}
  03--litrev-overview.md: {noYAML: true}
  03-1-litrev-bci.md: {noYAML: true}
  03-2-litrev-nf.md: {noYAML: true}
  03-3-litrev-meditation.md: {noYAML: true}
  03-4-litrev-neuromeditation.md: {noYAML: true}
  03-5-litrev-bcmi.md: {noYAML: true}
  03-6-litrev-syncronisation-entrainment.md: {noYAML: true}
output: myOutput.md
---

But my list of files is actually longer and sometimes the filenames change.

Thanks! k
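For reference, removing front matter from a single file amounts to dropping a leading block delimited by `---` lines. This is a sketch of that idea, not the tool's own code; `stripFrontMatter` is a hypothetical name, and it would also strip any other leading block fenced by `---` lines.

```javascript
// Hypothetical helper: removes a YAML front matter block, if present,
// from the very start of a markdown string.
function stripFrontMatter(markdown) {
  // "---" line, optional body lines, closing "---" line, all at the start.
  return markdown.replace(/^---\r?\n([\s\S]*?\r?\n)?---\r?\n?/, "");
}

console.log(stripFrontMatter("---\ntitle: Introduction\n---\n# Introduction\n"));
```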

Make more accessible using node.js

I'd like to open a few methods up (and make sure they are up to date) to allow other applications to depend on mine. Things I want to open up:

  • getManifest
  • merge files
  • add presentation

When I have an external link in my markdown file then the execution gets very slow

When I have an external link in my file that contains a colon, the execution of merge-markdown becomes very slow (it scales with the number of external links I have).

For example, if I have a line

See the google doc file: [googledoc](https://docs.google.com/)

Then it will take very long to finish executing. Please help.

Race condition ? - no such file or directory

Hello,

It appears that there is some kind of race condition when trying to merge files.

This does not happen every time but often enough.

You can reproduce the issue with this manifest (all the markdown files are empty):

---
input:
  1.md: {noYAML: true}
  2.md: {noYAML: true}
  3.md: {noYAML: true}
  4.md: {noYAML: true}
  5.md: {noYAML: true}
  6.md: {noYAML: true}
  7.md: {noYAML: true}
  8.md: {noYAML: true}
  9.md: {noYAML: true}
  10.md: {noYAML: true}
  
TOC: false
output: bug.md
---

output:

❯ merge-markdown
No -m argument given. Using default: manifest.[md|yaml|yml|json]
Found manifest to use: manifest.yml
*********.//1.md*********
1.md.temp added to merge list
*********.//2.md*********
2.md.temp added to merge list
*********.//3.md*********
3.md.temp added to merge list
*********.//4.md*********
4.md.temp added to merge list
*********.//5.md*********
5.md.temp added to merge list
*********.//6.md*********
6.md.temp added to merge list
*********.//7.md*********
7.md.temp added to merge list
*********.//8.md*********
8.md.temp added to merge list
*********.//9.md*********
9.md.temp added to merge list
*********.//10.md*********
10.md.temp added to merge list
++++++++++++++++++++
List of files to merge:
    .//1.md.temp
    .//2.md.temp
    .//3.md.temp
    .//4.md.temp
    .//5.md.temp
    .//6.md.temp
    .//7.md.temp
    .//8.md.temp
    .//9.md.temp
    .//10.md.temp
node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: ENOENT: no such file or directory, open '/private/tmp/poc/5.md.temp'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/private/tmp/poc/5.md.temp'
}

Node.js v18.8.0

Thanks for the tool !
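An intermittent `ENOENT` on a `.temp` file is consistent with a temp file being read or deleted before every write has finished, although the actual cause is not confirmed here. The sketch below shows the general remedy under that assumption: await all writes before reading, and clean up only after all reads. `mergeViaTempFiles` is a hypothetical function, not the tool's code.

```javascript
const fs = require("node:fs/promises");
const path = require("node:path");

// Writes each source to "<n>.md.temp" in `dir`, merges them, then cleans up.
async function mergeViaTempFiles(dir, sources) {
  const tempPaths = sources.map((_, i) => path.join(dir, `${i + 1}.md.temp`));

  // Wait for every temp file to be fully written before reading any of them.
  await Promise.all(sources.map((text, i) => fs.writeFile(tempPaths[i], text)));

  // Read in order so the merged output keeps the manifest order.
  const parts = [];
  for (const tempPath of tempPaths) {
    parts.push(await fs.readFile(tempPath, "utf8"));
  }

  // Delete temp files only after all reads have completed.
  await Promise.all(tempPaths.map((tempPath) => fs.unlink(tempPath)));
  return parts.join("\n");
}
```

The key point is that no step starts until the promise for the previous step has settled, so a fire-and-forget write or delete cannot overlap with a read.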
