souvikinator / notion-to-md

Convert notion pages, blocks and list of blocks to markdown (supports nesting and custom parsing)

Home Page: https://www.npmjs.com/package/notion-to-md

License: MIT License

TypeScript 99.56% JavaScript 0.44%
notion notion-api notion-database md markdown notion-to-md notion2md notion-client notion-markdown nodejs

notion-to-md's Introduction


Notion-to-MD
(Notion to Markdown)

Notion-to-MD is a Node.js package that allows you to convert Notion pages to Markdown format.

Convert notion pages, blocks and list of blocks to markdown (supports nesting) using notion-sdk-js

Install

npm install notion-to-md

Usage

⚠️ Note: Before getting started, create an integration and find the token. Details on methods can be found in API section

⚠️ Note: Starting from v2.7.0, toMarkdownString no longer automatically saves child pages. Now it provides an object containing the markdown content of child pages.

converting markdown objects to markdown string

This is how the notion page looks for this example:

const { Client } = require("@notionhq/client");
const { NotionToMarkdown } = require("notion-to-md");
const fs = require('fs');
// or
// import {NotionToMarkdown} from "notion-to-md";

const notion = new Client({
  auth: "your integration token",
});

// passing notion client to the option
const n2m = new NotionToMarkdown({ notionClient: notion });

(async () => {
  const mdblocks = await n2m.pageToMarkdown("target_page_id");
  const mdString = n2m.toMarkdownString(mdblocks);
  console.log(mdString.parent);
})();

Separate child page content

parent page content:

child page content:

NotionToMarkdown takes second argument, config

const { Client } = require("@notionhq/client");
const { NotionToMarkdown } = require("notion-to-md");
const fs = require('fs');
// or
// import {NotionToMarkdown} from "notion-to-md";

const notion = new Client({
  auth: "your integration token",
});

// passing notion client to the option
const n2m = new NotionToMarkdown({ 
  notionClient: notion,
    config:{
     separateChildPage:true, // default: false
  }
 });

(async () => {
  const mdblocks = await n2m.pageToMarkdown("target_page_id");
  const mdString = n2m.toMarkdownString(mdblocks);
  
  console.log(mdString);
})();

Output:

toMarkdownString returns an object. The content of the target page is stored under the parent property, and the content of each child page, if any, is included in the same object.

This lets you save the content of each page separately.
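
A minimal sketch of saving each entry to its own file. It assumes the returned object maps parent, plus one key per child page, to a Markdown string. saveMarkdownObject and toFileName are hypothetical helpers written for this example, not part of notion-to-md.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: turn an object key (for example a child page title) into a file name.
function toFileName(key) {
  const slug = key
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `${slug || "untitled"}.md`;
}

// Write every entry of the object returned by toMarkdownString to its own file.
// Assumption: each value is a Markdown string.
function saveMarkdownObject(mdObject, outDir) {
  fs.mkdirSync(outDir, { recursive: true });
  const written = [];
  for (const [key, content] of Object.entries(mdObject)) {
    const file = path.join(outDir, toFileName(key));
    fs.writeFileSync(file, content, "utf8");
    written.push(file);
  }
  return written;
}
```

It could be called as saveMarkdownObject(n2m.toMarkdownString(mdblocks), "./out"). Two keys that reduce to the same slug would overwrite each other, so a real script may want to add a suffix.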

Disable child page parsing

...

const n2m = new NotionToMarkdown({ 
  notionClient: notion,
    config:{
     parseChildPages:false, // default: true
  }
 });

...

converting page to markdown object

Example notion page:

const { Client } = require("@notionhq/client");
const { NotionToMarkdown } = require("notion-to-md");

const notion = new Client({
  auth: "your integration token",
});

// passing notion client to the option
const n2m = new NotionToMarkdown({ notionClient: notion });

(async () => {
  // notice second argument, totalPage.
  const x = await n2m.pageToMarkdown("target_page_id", 2);
  console.log(x);
})();

Output:

[
  {
    "parent": "# heading 1",
    "children": []
  },
  {
    "parent": "- bullet 1",
    "children": [
      {
        "parent": "- bullet 1.1",
        "children": []
      },
      {
        "parent": "- bullet 1.2",
        "children": []
      }
    ]
  },
  {
    "parent": "- bullet 2",
    "children": []
  },
  {
    "parent": "- [ ] check box 1",
    "children": [
      {
        "parent": "- [x] check box 1.2",
        "children": []
      },
      {
        "parent": "- [ ] check box 1.3",
        "children": []
      }
    ]
  },
  {
    "parent": "- [ ] checkbox 2",
    "children": []
  }
]

converting list of blocks to markdown object

const { Client } = require("@notionhq/client");
const { NotionToMarkdown } = require("notion-to-md");

const notion = new Client({
  auth: "your integration token",
});

// passing notion client to the option
const n2m = new NotionToMarkdown({ notionClient: notion });

(async () => {
  // get all blocks in the page ("target_block_id" is a placeholder)
  const block_id = "target_block_id";
  const { results } = await notion.blocks.children.list({
    block_id,
  });

  //convert to markdown
  const x = await n2m.blocksToMarkdown(results);
  console.log(x);
})();

Output: same as before

Converting a single block to markdown string

  • only takes a single notion block and returns corresponding markdown string
  • nesting is ignored
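  • depends on @notionhq/client

const { NotionToMarkdown } = require("notion-to-md");

// passing notion client to the option
// (`notion` is a Client instance, created as in the examples above)
const n2m = new NotionToMarkdown({ notionClient: notion });

(async () => {
  // blockToMarkdown returns a promise, so await it
  const result = await n2m.blockToMarkdown(block);
  console.log(result);
})();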

result:

![image](https://media.giphy.com/media/Ju7l5y9osyymQ/giphy.gif)

Custom Transformers

You can define your own custom transformer for a Notion block type, to parse it and return your own string. setCustomTransformer(type, func) overrides the parsing for the given type.

const { NotionToMarkdown } = require("notion-to-md");
const n2m = new NotionToMarkdown({ notionClient: notion });
n2m.setCustomTransformer("embed", async (block) => {
  const { embed } = block as any;
  if (!embed?.url) return "";
  return `<figure>
  <iframe src="${embed?.url}"></iframe>
  <figcaption>${await n2m.blockToMarkdown(embed?.caption)}</figcaption>
</figure>`;
});
const result = await n2m.blockToMarkdown(block);
// Result will now parse the `embed` type with your custom function.

Note: Be aware that setCustomTransformer keeps only the last function set for a given type. You can't set two different transformers for the same type.

You can also use the default parsing by returning false in your custom transformer.

// ...
n2m.setCustomTransformer("embed", async (block) => {
  const { embed } = block as any;
  if (embed?.url?.includes("myspecialurl.com")) {
    return `...`; // some special rendering
  }
  return false; // use default behavior
});
const result = await n2m.blockToMarkdown(block);
// Result will now only use custom parser if the embed url matches a specific url
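
The two rules above (the last transformer set for a type wins, and returning false falls back to default parsing) can be sketched with a small stand-alone registry. This is an illustration, not the library's implementation: TransformerRegistry and defaultRender are hypothetical names.

```javascript
// Hypothetical stand-in for the library's built-in rendering of a block.
function defaultRender(block) {
  return `[default ${block.type}]`;
}

class TransformerRegistry {
  constructor() {
    this.custom = new Map();
  }

  // A Map keeps one entry per key, so a second call for the same type replaces the first.
  setCustomTransformer(type, fn) {
    this.custom.set(type, fn);
    return this;
  }

  async render(block) {
    const fn = this.custom.get(block.type);
    if (fn) {
      const out = await fn(block);
      // In this sketch only the literal value false triggers the fallback; "" is kept as a result.
      if (out !== false) return out;
    }
    return defaultRender(block);
  }
}
```

Checking for the literal false, rather than any falsy value, lets a transformer return an empty string on purpose, as the first embed example does when there is no url.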

Contribution

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.

License

MIT

notion-to-md's People

Contributors

alvinometric, amnano, dantehemerson, dharshatharan, doradx, dyllan-to-you, emoriarty, gregonarash, hatton, kungpaogao, marviel, miaogaolin, raphtlw, scopsy, smor, souvikinator, souvikns, that-ambuj, ugross, vildar, zirkelc

notion-to-md's Issues

Incorrect syntax for the ordered lists.

Hey,
For the ordered lists (numbered_list_item) , the incorrect syntax for unordered lists (bulleted_list_item) is rendered.

Expected:

1. Hey
2. There

Actual:

- Hey
- There

I have tried an approach that might be able to solve this issue #19. Happy to discuss on it.
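
For illustration, one way to number consecutive items is sketched below. It uses a simplified block shape ({ type, text }) rather than the real Notion block object, and renderListItems is a hypothetical name. It is not necessarily the approach taken in #19.

```javascript
// Simplified block shape for illustration: { type, text }.
function renderListItems(blocks) {
  let counter = 0;
  return blocks.map((block) => {
    if (block.type === "numbered_list_item") {
      counter += 1;
      return `${counter}. ${block.text}`;
    }
    // Any other block type ends the current numbered run.
    counter = 0;
    if (block.type === "bulleted_list_item") return `- ${block.text}`;
    return block.text;
  });
}
```

Resetting the counter on any non-numbered block means a new ordered list after a paragraph starts again at 1.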

"Could not find block" error

I'm using the first example from the readme to get the markdown from a page. I have added the integration to the relevant page and copied the page ID (correctly) from the page. However, when I run the script, I get a 404 "Could not find block" error for a different page ID. I have no idea why this is happening. Am I missing something? What can I do to debug this?

Code:

const { Client } = require("@notionhq/client");
const { NotionToMarkdown } = require("notion-to-md");
const fs = require('node:fs/promises');

const TOKEN = process.env.TOKEN;

const notion = new Client({
  auth: TOKEN,
});

// passing notion client to the option
const n2m = new NotionToMarkdown({ notionClient: notion });

(async () => {
  const mdblocks = await n2m.pageToMarkdown("d82b628c-e95a-4723-82fc-67aaec962db9");
  const mdString = n2m.toMarkdownString(mdblocks);

  //writing to file
  fs.writeFile("test.md", mdString, (err) => {
    console.log(err);
  });
})();

Error:

@notionhq/client warn: request fail {
  code: 'object_not_found',
  message: 'Could not find block with ID: dfc6d411-0a31-45d6-ba51-be5ea4be0cf5. Make sure the relevant pages and databases are shared with your integration.'
}

Note that the block ID in the error is different than what I am supplying in my script.

Suggestion: Add default return value in setCustomTransformer callback function

Hi.
When using setCustomTransformer, we may want to transform the output only in some situations. It would be great to be able to return the default n2m output when we need to.

Example of what it may look like:

n2m.setCustomTransformer("paragraph", async (block, defaultOutput) => {
  if (hasSomeProperty(block)) return someOutput(block);
  return defaultOutput(block);
});

Additionally, we may want to only slightly transform the default output and it may make things easier to generate that output string then transform it.

@notionhq/client has changed `text` property to `rich_text`

From version 1.0.1, @notionhq/client has renamed the text property to rich_text. Obviously, this change breaks the expected output.
Screen Shot 2022-03-05 at 07 12 44

The following notion blocks:
Screen Shot 2022-03-05 at 07 07 54

Are converted to:

#

##

###

As can be seen, the text content is lost.

I don't know what impact the other changes bring, but this is the most noticeable 😅.

Until a fix is available, use a previous version of @notionhq/client.
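
A small compatibility helper could read whichever property is present. This is only a sketch: getRichText and toPlainText are hypothetical names, and it assumes each item carries a plain_text field.

```javascript
// Read the rich text array from a block, whichever property name the client version uses.
function getRichText(block) {
  const payload = block[block.type] || {};
  return payload.rich_text || payload.text || [];
}

// Join the plain text of all items into one string.
function toPlainText(block) {
  return getRichText(block)
    .map((item) => item.plain_text)
    .join("");
}
```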

New document for each page?

I finally got this working (thanks for your help!) but now I've discovered that it tries to put everything into a single markdown file. Is there a way to get it to create a new markdown file for each child page?

Feature: Typescript support

It would be great to have typescript support, either by adding type files or converting the project to typescript.

Support of Table of Contents

Is there a possibility to make a Table of Contents from notion work?

I am using notion-to-md for my blog and wanted to see what can be done

Feature Request: A way to add markup before and after children blocks

Hi.
It would be really useful to be able to add markup before and after children blocks. It would allow us to inject html tags more easily. The "outline structure" of Notion blocks could be used to specify children of a tag in a natural way. Since most (all?) markdown parsers accept html, it makes sense to me to easily allow replacing blocks with html tags.

One obvious candidate for such feature is the toggle block. I saw that you currently need to re-fetch children blocks inside the toggle transform function. Wouldn't it be more consistent if the parent toggle block only outputs <details><summary>${summary}</summary> as 'opening output' and </details> as 'closing output'? Then the children's outputs would be injected in between the opening and closing strings of the parent.

Of course, there would be some considerations to keep in mind. For example, children would need to be aware of all their parents' types because we shouldn't mistakenly create <p> elements inside <p> elements for example.
That is not really up to this library to make this check but simply allow one to implement her own logic with the setCustomTransformer function.

Example

n2m.setCustomTransformer("toggle", async (block, ancestors) => {
  const { has_children, toggle } = block;
  let toggle_rich_md = "";
  let toggle_plain_text = "";
  toggle.rich_text.forEach((rich_text) => {
    toggle_rich_md += n2m.annotatePlainText(
      rich_text.plain_text,
      rich_text.annotations
    );
    toggle_plain_text += rich_text.plain_text
  });

  // if a string is returned, it is the 'opening output' and there is no 'closing output'
  // (so it stays consistent with the current API)
  if (!has_children) return `<p>${toggle_rich_md}</p>`

  // if an object is returned, children of this block will be written between the 'open' string and the 'close' string
  return { open: `<details><summary>${toggle_plain_text}</summary>`, close: `</details>`};
});

Invalid RegEx on Safari

Works fine on Chrome and Firefox, but on Safari I receive the following unhandled runtime error when accessing the project through localhost:

'SyntaxError: Invalid regular expression: invalid group specifier name'

File with issue: ./node_modules/notion-to-md/build/utils/md.js

notion-to-md v2.5.5

Improvement: Use string union for Block Type instead of string

Hi, I noticed that setCustomTransformer takes the block type (its first argument) as a plain string, so any arbitrary value is accepted and users of the library can easily misspell a type. To leverage TypeScript's type safety and tsserver's IntelliSense, I suggest using a string union for the first argument of setCustomTransformer so that users get good autocompletion.

The code could look like this:

type BlockType = "image" | "video" | "file" | "pdf" | .... | string

(I've added the last union as a string so that there is flexibility to use any arbitrary string at the risk of the user of this library while also providing intellisense to the users of the library who would otherwise have to guess what string to put here.)

and we could change the type definition for setCustomTransformer to:

setCustomTransformer(type: BlockType, ...): ...

Missing tab spaces for text with new line

Nested text containing newlines gets a tab only at the beginning of the text, not at the start of each line.

Text: lorem ipsum\nlorem ipsum
Expected Result:

    lorem ipsum
    lorem ipsum

Outcome:

    lorem ipsum
lorem ipsum
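
A minimal sketch of the expected behaviour, indenting every line instead of only the first. indentEveryLine is a hypothetical helper, and the four-space unit is an assumption taken from the example above.

```javascript
// Prefix every line of the text with the indentation for the given nesting level.
function indentEveryLine(text, level, unit = "    ") {
  const prefix = unit.repeat(level);
  return text
    .split("\n")
    .map((line) => prefix + line)
    .join("\n");
}
```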

Adding a custom transformer for synced blocks produces double output

I'm using docu-notion which consumes this module.

I created a custom transformer for synced blocks since they were getting ignored.

However, the markdown output is doubled. Somehow the transform function is getting called twice.

The code for handling synced blocks is a bit unclear.

I suspect this is complicated by the fact that, for synced blocks, you need to dig into the children and then follow the synced blocks there. But even if my custom transformer just returns a hard-coded string, that output is also doubled, so the doubling is not caused by the custom transformer; the problem seems to be upstream.

Relevant notion doc:

image

Then the relevant output is:

image

The transformer to test is:

{
  type: "synced_block",
  getStringFromBlock: async (
    context: IDocuNotionContext,
    block: NotionBlock
  ) => {
    return `test output ${Math.random()}`;
  }
}

Missing tests

In the context of testing I would like to talk about two things

  1. Hitting the Notion API to fetch block contents. To run these tests successfully we would depend on the Notion API and need an API key, and the database would have to follow a fixed pattern, which is hard to reproduce locally every time someone clones the repo.
  2. Parsing the block JSON object and converting it into a Markdown-flavored string. I think this is something we can test effectively, and we should mock the block data structure to do so.

Keep errors uncaught

Hello!
First of all just wanna say thank you for this library. It's very helpful in dealing with getting the content of a Notion page as markdown.

But I believe it is better if error handling would be delegated instead to the user of the library such that we're able to handle the errors ourselves. I understand that returning an empty array would ultimately be fine but I think it's better for the users to be able to handle the errors from the Notion API as they see fit. Some may not even want to have the error logged out.

Code in question below:

} catch (e) {
console.log(e);
return [];
}

I myself was surprised that enclosing await n2m.pageToMarkdown(pageId) in a try-catch block did not throw any errors when the pageId I supplied is invalid, hoping that I'll get the error and handle it on my end. But apparently the library was already doing that so I might have no way to catch that error and do my own handling.

If I'm not mistaken, just removing the try-catch block inside getBlockChildren() should be enough since if an error is thrown there, it would escalate all the way up to which function of the library we called.

Would like to know your thoughts on this.
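
To make the difference concrete, here is a sketch comparing the two behaviours. fetchChildren is a hypothetical function standing in for the Notion API call, and none of these names come from the library.

```javascript
// Mirrors the pattern quoted above (minus the console.log): errors become an empty list.
async function getChildrenSwallowing(fetchChildren) {
  try {
    return await fetchChildren();
  } catch (e) {
    return [];
  }
}

// Alternative: no try/catch, so a rejection reaches the caller.
async function getChildrenPropagating(fetchChildren) {
  return fetchChildren();
}

// Caller-side handling only becomes possible with the propagating version.
async function loadOrReport(fetchChildren) {
  try {
    return { ok: true, blocks: await getChildrenPropagating(fetchChildren) };
  } catch (e) {
    return { ok: false, reason: e.message };
  }
}
```

With the swallowing version, a caller cannot tell an empty page from a failed request, since both come back as an empty array.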

Callout block support feature request

Why?

We use callouts in our projects to highlight some blocks, and we would love to add callout support to notion-to-md.
Currently, the library only generates a normal code block in that case.

How?

We can convert callouts to a quote block:

💡 Text content of the quote.
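
A sketch of that conversion, assuming the callout's icon and text are already available as strings. calloutToQuote is a hypothetical helper. Every line gets the quote marker so that multi-line callouts stay inside the quote.

```javascript
// Render a callout as a Markdown block quote, prefixing every line with ">".
function calloutToQuote(icon, text) {
  const body = icon ? `${icon} ${text}` : text;
  return body
    .split("\n")
    .map((line) => (line ? `> ${line}` : ">"))
    .join("\n");
}
```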

Toggle summary may be truncated

Hello,

I'm trying to convert a Notion /toggle's summary properly, but it is truncated.

This :
image
is converted as :

<summary>Les principes de base du fonctionnement des Fablabs ont été définis par la&nbsp;</summary>

This is due to https://github.com/souvikinator/notion-to-md/blob/master/src/notion-to-md.ts#L305 handling rich_text[0] only.

Directly producing HTML is not this package's purpose and I'd rather output the full summary as Markdown, but outputting <details> and <summary> is probably the only way this can be done.

Any idea on how to deal with this properly ?
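
For the truncation itself, a possible sketch is to join every rich_text item instead of reading only the first one. summaryFromRichText is a hypothetical helper, and it assumes each item has a plain_text field.

```javascript
// Build the summary from all rich text items, not just rich_text[0].
function summaryFromRichText(richText) {
  return richText
    .map((item) => item.plain_text)
    .join("")
    .trim(); // also drops a trailing non-breaking space
}
```

This returns plain text only. Annotations such as bold or links would need to be applied per item before joining.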

page in markdown

Hi, I'd like to ask: can I get the page_id from the page block itself rather than from the link_to_page block?

Improving the complexity and flow of the package

After working on a few issues I realized that there is a need to work on the whole flow of notion-to-md and how it handles different blocks.
Also, I'm not sure but it feels like the complexity of the whole package can be improved however we have to figure that out.

Here you can track the progress of the refactoring and any contribution is appreciated.

Detailed walkthrough?

Is there a walkthrough or tutorial somewhere that offers step-by-step instructions on how to use this? Thanks!

Unified way to handle links

I'm struggling with inline links to other pages in Notion content. Contrary to what one might expect, those aren't reported as link_to_page, but will silently be passed to md.link():

if (content["href"])
  plain_text = md.link(plain_text, content["href"]);

parsedData += plain_text;

See /src/notion-to-md.ts#L398-L401

In this case, href might be /fae18db43ae240bbb05771a6e531b494, which makes it hard to detect valid links and replace them with references to the local pages.

A good solution would be to pass those to some kind of (new?) link transformer, if available.
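
As a sketch of what such a link transformer might do: detect hrefs that look like internal page references and map them to local files. resolveHref and the localPages lookup are hypothetical, and the notion.so fallback URL is an assumption.

```javascript
// An internal href is a slash followed by a 32-character hex page id.
const INTERNAL_HREF = /^\/([0-9a-f]{32})(?:[?#].*)?$/i;

// Map an internal href to a local page if known; leave external links untouched.
function resolveHref(href, localPages) {
  const match = INTERNAL_HREF.exec(href);
  if (!match) return href;
  const id = match[1].toLowerCase();
  // Assumption: unknown pages fall back to a link on notion.so.
  return localPages[id] || `https://www.notion.so/${id}`;
}
```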

Problem with start (Object.fromEntries is not a function)

Hello! My code is:

import * as fs from 'fs';
import * as path from 'path';
const { Client } = require("@notionhq/client");
//const { NotionToMarkdown } = require("notion-to-md");
// or
 import {NotionToMarkdown} from "notion-to-md";

const notion = new Client({
  auth: "secret_M*************Z",
});

// passing notion client to the option
const n2m = new NotionToMarkdown({ notionClient: notion });

(async () => {
  const mdblocks = await n2m.pageToMarkdown("ed937efd320c43a7bc32227728bcbc1e");
  const mdString = n2m.toMarkdownString(mdblocks);

  //writing to file
  fs.writeFile("test.md", mdString, (err) => {
    console.log(err);
  });
})();

then i start with command:

npx ts-node 1.ts

And get error:

npx: installed 15 in 2.176s
TypeError: Object.fromEntries is not a function
    at pick (/node_modules/@notionhq/client/src/helpers.ts:19:17)
    at Object.list (/node_modules/@notionhq/client/src/Client.ts:280:22)
    at getBlockChildren (/node_modules/notion-to-md/src/utils/notion.ts:15:60)
    at NotionToMarkdown.pageToMarkdown (/node_modules/notion-to-md/src/notion-to-md.ts:60:42)
    at /1.ts:16:30
    at Object.<anonymous> (/1.ts:23:3)
    at Module._compile (internal/modules/cjs/loader.js:778:30)
    at Module.m._compile (/root/.npm/_npx/8486/lib/node_modules/ts-node/src/index.ts:1455:23)
    at Module._extensions..js (internal/modules/cjs/loader.js:789:10)
    at Object.require.extensions.(anonymous function) [as .ts] (/root/.npm/_npx/8486/lib/node_modules/ts-node/src/index.ts:1458:12)
null

Please help
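
Object.fromEntries is available from Node.js 12 onwards, so this error points to an older runtime, and upgrading Node.js is the more robust fix. If that is not possible, a fallback along these lines could be loaded before requiring @notionhq/client. This is a sketch; fromEntriesFallback is a hypothetical name.

```javascript
// Build an object from an iterable of [key, value] pairs.
function fromEntriesFallback(entries) {
  const result = {};
  for (const [key, value] of entries) {
    result[key] = value;
  }
  return result;
}

// Install the fallback only when the runtime lacks the built-in.
if (typeof Object.fromEntries !== "function") {
  Object.fromEntries = fromEntriesFallback;
}
```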

Broken blockquotes

This

Screenshot 2022-06-19 at 4 52 26 PM

When converted to markdown using this package, looks like:

> 

        Your mind is for having ideas, not holding them.

When it should have been rendered as:

> Your mind is for having ideas, not holding them.

On the website, it gets rendered as a code block, which is not what I expected.

Screenshot 2022-06-19 at 4 54 56 PM

Will try my best to fork and create a pull request to resolve this issue. If you've got the time, please offer guidance on how this issue can be fixed, thank you.

A "raw" link to another notion page is dropped

A raw link to another website, e.g. This is a bare to google https://www.google.com/ is emitted as
This is a bare to google [https://www.google.com/](https://www.google.com/)

A bit of text with an underlying link to another page
image

is emitted as
This is an inline link another page [Apples](/c62dbc3fede94264a1d5e0245fa4be73).

However, if you have a link without a text label, then notion-to-md just drops it. This
image
is emitted as
This is a raw link to the intro:

In the content returned by the API, this last case comes as this kind of block

 {
      "object": "block",
      "id": "5f4fdb6f-8adb-4cff-9137-2168530b78bf",
      <snip>
      "type": "link_to_page",
      "link_to_page": {
        "type": "page_id",
        "page_id": "8b16a919-6add-4e37-b1a4-4103feb24d5d"
      }
}
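
Until this is handled by default, a custom transformer could emit a link for such blocks. The sketch below is hypothetical: linkToPageToMarkdown and the titles lookup are made up for illustration, and the notion.so URL shape is an assumption.

```javascript
// Turn a link_to_page block into a Markdown link; return false for unsupported targets.
function linkToPageToMarkdown(block, titles = {}) {
  const target = block.link_to_page;
  if (!target || target.type !== "page_id") return false;
  const id = target.page_id;
  // Use a known title when available, otherwise show the raw id.
  const label = titles[id] || id;
  return `[${label}](https://www.notion.so/${id.replace(/-/g, "")})`;
}

// Possible wiring:
// n2m.setCustomTransformer("link_to_page", async (block) => linkToPageToMarkdown(block));
```

Returning false for unsupported targets keeps the default behaviour, in line with the custom transformer rules in the README.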

Reference Synced Blocks are not supported

Please add support for reference synced blocks as well. Currently, these are silently ignored in the output. Original synced blocks work fine, though.

block: {
  "object": "block",
  "id": "77b645b9-2edee724eb",
  "parent": {
    "type": "page_id",
    "page_id": "77e1abf5cd665865"
  },
  "created_time": "2022-10-14T06:08:00.000Z",
  "last_edited_time": "2022-10-14T06:08:00.000Z",
  "created_by": {
    "object": "user",
    "id": "ecc52c68-e563-491698"
  },
  "last_edited_by": {
    "object": "user",
    "id": "ecc52c68-e563-4f65-9561-1e"
  },
  "has_children": true,
  "archived": false,
  "type": "synced_block",
  "synced_block": {
    "synced_from": {
      "type": "block_id",
      "block_id": "52edd340-0926a7c9e2"
    }
  }
}

Synced blocks wrongly produce indented Markdown output

Hello,

Thanks for the good work, this package is very useful to me !

General context

I'm trying to build a Notion to Hugo process using notion-to-md, which I simply called notion-to-hugo. I want to use it to build online courses in the most generic way I can, allowing content creators to leverage the full potential of Notion while ensuring that the final website works in our course setup.

I want to be able to customize the rendering process to be able to adapt the generated content for the particular Hugo settings that I use, including shortcodes for instance.

I use Notion Synced Block to reuse content across pages.

I started to work on a generic pre/post-processing pipeline here. I was very happy to see setCustomTransformer emerging, as I like the direction it's heading !

The issue

You can see that in this Notion page there is a Notion synced block. When processed with notion-to-md, the Markdown MdBlock output is the following :

	This block is shared with other pages and editing one changes all other instances.

	> 💡 This is a callout which turns into a `<div class="notices note"></div>` block.

As you can see, the content is indented, leading to the paragraph being converted as a code block by Hugo, which you can see here on the generated HTML page produced by notion-to-hugo through notion-to-md.

Analysis

The block structure produced by notion-to-md is the following :

  {
    "type": "synced_block",
    "parent": "",
    "children": [
      {
        "type": "paragraph",
        "parent": "This block is shared with other pages and editing one changes all other instances.",
        "children": []
      },
      {
        "type": "callout",
        "parent": "> 💡 This is a callout which turns into a `<div class=\"notices note\"></div>` block.",
        "children": []
      }
    ]
  }

We can see that the actual content blocks are children of the synced block. The synced block is only a container for actual content, and I don't see any reason why we should treat its children as real children.

The indent is made here :

      if (mdBlocks.parent) {
        if (
          mdBlocks.type !== "to_do" &&
          mdBlocks.type !== "bulleted_list_item" &&
          mdBlocks.type !== "numbered_list_item"
        ) {
          // add extra line breaks non list blocks
          mdString += `\n${md.addTabSpace(mdBlocks.parent, nestingLevel)}\n\n`;
        } else {
          mdString += `${md.addTabSpace(mdBlocks.parent, nestingLevel)}\n`;
        }
      }

My guess is that we should not call addTabSpace for children of blocks whose type is synced_block.
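
To illustrate the idea, here is a simplified stand-alone renderer over the MdBlock shape shown above ({ type, parent, children }). It is not the library's toMarkdownString, and renderMdBlocks is a hypothetical name. The only point is that a synced_block does not increase the nesting level of its children.

```javascript
// Render a tree of md blocks, indenting children by one tab per nesting level.
function renderMdBlocks(blocks, level = 0) {
  let out = "";
  for (const block of blocks) {
    if (block.parent) {
      out += `${"\t".repeat(level)}${block.parent}\n`;
    }
    // A synced_block is only a container, so its children stay at the same level.
    const childLevel = block.type === "synced_block" ? level : level + 1;
    out += renderMdBlocks(block.children || [], childLevel);
  }
  return out;
}
```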

Attempts at solving the issue

I tried the following things :

  • remove the leading spaces in the generated md output with one of my own post-processor : I seem to lack information about the blocks if I plug it after generating the Markdown output ;
  • use setCustomTransformer to deal with the particular case : I did not find a proper way to use it for my case ;
  • fiddle with toMarkdownString to deal with synced_block : I can't get it to work as I could find no way of determining the parent's type here.

I'm feeling stuck, probably by my lack of understanding of some of the processing flows... Would you be able to help me sort it out ? I hope I provided enough information for that.

Thanks a lot !

Does this recursively render nested pages?

I have a book written in Notion where each chapter is its own Page, all collected in one master Page. Does this library recursively render those child pages (and any child pages they may have) into one single concatenated Markdown string?

Thanks!

Behaviour for uploaded images?

I noticed that images added to Notion via a public URL are exported as is, with the same URL in the Markdown. For images actually uploaded to Notion (not publicly hosted; added through the Notion file picker or drag and drop), how does the export work?

Does it extract the image as base64 in markdown? Or how does it work? @souvikinator

feature: handle properties as frontmatter

Hey everyone!

I thought it would be an interesting feature to allow notion-to-md to take properties of a page and convert them to YAML-style markdown frontmatter. For example:

A notion page with the following properties:

image

would get converted to the following markdown:

---
title: Second Post
summary: My second blog post, its really good and cool
slug: second-post
author: Parker Rowe
date: 2022-03-11
tags:
 - TypeScript
 - SvelteKit
 - Markdown
---

# My Second Post

Yea baby

...

I've started working on a similar parser for my personal site using Notion as my blog CMS (example WIP code here), but I thought it could be useful to have it baked into notion-to-md.
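
A rough sketch of the serialisation step, assuming the page properties have already been reduced to a flat object of strings and string arrays. toFrontmatter and yamlScalar are hypothetical helpers. The quoting rule is deliberately conservative and does not cover every YAML edge case, for example strings such as true or null.

```javascript
// Emit a value as-is when it is plainly safe, otherwise as a double-quoted string.
function yamlScalar(value) {
  const text = String(value);
  const safe = /^[A-Za-z0-9](?:[A-Za-z0-9 ,._-]*[A-Za-z0-9._-])?$/.test(text);
  // JSON string syntax is accepted by YAML 1.2 as a double-quoted scalar.
  return safe ? text : JSON.stringify(text);
}

// Build a YAML frontmatter block from a flat object of strings and string arrays.
function toFrontmatter(props) {
  const lines = ["---"];
  for (const [key, value] of Object.entries(props)) {
    if (Array.isArray(value)) {
      lines.push(`${key}:`);
      for (const item of value) lines.push(` - ${yamlScalar(item)}`);
    } else {
      lines.push(`${key}: ${yamlScalar(value)}`);
    }
  }
  lines.push("---");
  return lines.join("\n");
}
```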

Feel free to discuss! 🍻

Feature: Add language to code blocks

I personally use syntax highlighting for code blocks, and the language specified on the code block is how the syntax highlighting library decides which colours to use. The language is available in the Notion response; it just has to be added to the markdown.
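
As a sketch, the language reported by Notion (block.code.language) can be placed right after the opening fence. codeBlock and notionCodeToMarkdown are hypothetical helpers, and the rich_text handling is simplified to plain text.

```javascript
// Three backticks, built programmatically to keep this example readable.
const FENCE = "`".repeat(3);

// Wrap text in a fenced code block, with an optional language after the opening fence.
function codeBlock(text, language = "") {
  return `${FENCE}${language}\n${text}\n${FENCE}`;
}

// Read text and language from a Notion code block object.
function notionCodeToMarkdown(block) {
  const text = (block.code.rich_text || [])
    .map((item) => item.plain_text)
    .join("");
  return codeBlock(text, block.code.language || "");
}
```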

Retrieve all block children

Hi! I noticed that you currently support passing totalPage to specify the number of pages/blocks that it requests from the API (thanks to #9 )

What about supporting totalPage = null, which would automatically attempt to retrieve all block children instead of a fixed number of pages? It would simplify use cases where you don't know beforehand how many blocks are on the page but always want all of them.

If you're open to such a PR I could write and submit one!
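
A sketch of what "retrieve everything" could look like, following the cursor until has_more is false. listFn is an injected function with the same shape as notion.blocks.children.list (block_id, start_cursor and page_size in; results, has_more and next_cursor out). listAllChildren is a hypothetical name, not an existing API of this package.

```javascript
// Collect all children of a block by following pagination cursors.
async function listAllChildren(listFn, blockId, pageSize = 100) {
  const all = [];
  let cursor;
  do {
    const page = await listFn({
      block_id: blockId,
      start_cursor: cursor,
      page_size: pageSize,
    });
    all.push(...page.results);
    // Continue only while the API reports more results.
    cursor = page.has_more ? page.next_cursor : undefined;
  } while (cursor);
  return all;
}
```

It could be used as listAllChildren((args) => notion.blocks.children.list(args), pageId). Injecting listFn keeps the loop testable without hitting the API.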

fs.writeFile failing

I am trying the code from the README. I added the API code and the page ID, saved it as notion-to-md.js and ran it using node. But I get the following error:

node notion-to-md.js
/Users/[user folder]/Downloads/notion-to-md.js:18
  fs.writeFile("test.md", mdString, (err) => {
  ^

ReferenceError: fs is not defined
    at /Users/[user folder]/Downloads/notion-to-md.js:18:3
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Node.js v19.9.0

I tried manually creating a file called test.md, but that didn't help - same error.

Use [[page title]] instead of [page title](ID#)?

I'm interested in migrating from Notion to Logseq. For that I don't just want to have each page as a separate Markdown document, but also to use the following method of linking to pages.

Instead of [page title](ID#), each page link should be written as [[page title]].

This way users of Obsidian or Logseq (as well as a dozen other apps that now use this standard), will be able to import the MD files and jump from document to document rather than linking back to Notion.
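
As a post-processing sketch, links whose target is a bare page id could be rewritten with a regular expression. toWikiLinks is a hypothetical helper. It assumes internal links are emitted as a slash followed by a 32-character hex id, and it uses the link text as the page title.

```javascript
// Matches [label](/<32 hex chars>) links that point to another Notion page.
const NOTION_PAGE_LINK = /\[([^\]]+)\]\(\/[0-9a-f]{32}\)/gi;

// Rewrite internal page links to [[label]] and leave all other links untouched.
function toWikiLinks(markdown) {
  return markdown.replace(NOTION_PAGE_LINK, "[[$1]]");
}
```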

Caption for images

It's quite easy to implement caption for images:

n2m.setCustomTransformer('image', async (block) => {
  // ImageBlockObjectResponse is a type from the @notionhq/client typings
  const { image } = block as ImageBlockObjectResponse;
  const src = image.type === 'external' ? image.external.url : image.file.url;
  // fall back to an empty string when there is no caption
  const caption = image.caption?.[0]?.plain_text ?? '';

  return `
  <figure>
    <img src="${src}" alt="${caption}" />
    <figcaption>${caption}</figcaption>
  </figure>`;
});

I was wondering why that's not the default behaviour?

Add parent element to blocks?

Hey, I love your work!

Is there any way we could add a parent for a specific block? My use case is I want to add a wrapper div to all tables. Is that possible?

Thanks

here's how to handle column support

I needed to show 2 images in one row (2 columns), but it isn't directly possible. Currently, content in columns is rendered as individual rows. I figured out a way to show it in columns as how it is shown in Notion.

n2m.setCustomTransformer("column_list", async (block) => {
    const mdBlocks = await n2m.pageToMarkdown(block.id);
    // note: from v2.7.0, toMarkdownString returns an object (see mdString.parent)
    const mdString = n2m.toMarkdownString(mdBlocks);
    // CSS values must not be quoted inside the style attribute
    return `
    <div style="display:flex;column-gap:10px">
   ${mdString}
    </div>
    `;
  });
n2m.setCustomTransformer("column", (block) => {
    // avoid rendering it twice
    return "";
  });

I thought of sharing it in case others are looking for same.

Should a notion link parse to normal hyper link?

image

I expect it to be notion.so/{pageid}

case "link_preview":
  {
    let blockContent;
    if (type === "bookmark") blockContent = block.bookmark;
    if (type === "embed") blockContent = block.embed;
    if (type === "link_preview") blockContent = block.link_preview;
    if (blockContent) return md.link(type, blockContent.url);
  }
  break;

Maybe this case could be handled as follows: if the url starts with notion, transform it to notion.so/{pageid}.

A small conversion issue in `callout` blocks

Hi :)
I found another small issue
Let me show you some screenshots

Notion:
Screenshot_20220802_232634

Result:
Screenshot_20220802_231555
code:
Screenshot_20220802_230846
Right here: the four spaces before the image are not preceded by >


Expectation:
Screenshot_20220802_231604
manually edited code:
Screenshot_20220802_233409

Same on GitHub (Markdown), so it's not a Jekyll-specific problem
Screenshot_20220802_234412

thanks in advance
