asnunes / notion-page-to-html

NodeJS tool to convert public Notion pages to HTML from page ID

License: MIT License

JavaScript 0.48% TypeScript 99.32% Makefile 0.14% Shell 0.06%
notion-pages html html5 equation

notion-page-to-html's Issues

Bug with uploaded images

Hi
I seem to be having an issue with uploaded images.

This is the URL I am trying to use with the module https://www.notion.so/dhavalsoneji/2c5dd1f8b26840d7ba882d1490a4a917

I get this error:

error - uncaughtException: SyntaxError: Unexpected token P in JSON at position 0
    at JSON.parse (<anonymous>)
    at IncomingMessage.<anonymous> (/Users/dhaval/git-clones/portfolio/node_modules/notion-page-to-html/dist/utils/usecases/http-get/node-http-get.js:72:44)
    at IncomingMessage.emit (node:events:406:35)
    at IncomingMessage.emit (node:domain:475:12)
    at endReadableNT (node:internal/streams/readable:1343:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)

This happens because stringData is passed to JSON.parse, but its value is:

PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPEVycm9yPjxDb2RlPkFjY2Vzc0RlbmllZDwvQ29kZT48TWVzc2FnZT5BY2Nlc3MgRGVuaWVkPC9NZXNzYWdlPjxSZXF1ZXN0SWQ+WENKU0tNQzM5VEhQMTU5RjwvUmVxdWVzdElkPjxIb3N0SWQ+bjV2T3I4NEpURk5CWE5Ed1RKeWN4U0FocU5pUDJqSkd2U2dEcFl6ckcwQU5uQ1B6cUNUWHZLODJxZndFOE1OakwxOFFuNlpxSkU4PTwvSG9zdElkPjwvRXJyb3I+

Which base64-decodes to:

<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>AccessDenied</Code>
  <Message>Access Denied</Message>
  <RequestId>XCJSKMC39THP159F</RequestId>
  <HostId>n5vOr84JTFNBXNDwTJycxSAhqNiP2jJGvSgDpYzrG0ANnCPzqCTXvK82qfwE8MNjL18Qn6ZqJE8=</HostId>
</Error>
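
The diagnosis above can be reproduced with a small guard around JSON.parse. This is only a sketch: parseJsonOrExplain is a hypothetical helper, not part of notion-page-to-html, and it assumes the failing body is base64 text as in this report. The sample input is just the first eight characters of the value shown above.

```javascript
// Hypothetical helper (not part of notion-page-to-html): try to parse a
// response body as JSON and, on failure, return the base64-decoded text so
// the real upstream error (here an S3 AccessDenied XML) is visible.
function parseJsonOrExplain(stringData) {
  try {
    return { ok: true, value: JSON.parse(stringData) };
  } catch (err) {
    // Assumption: the body is base64, as in this issue.
    const decoded = Buffer.from(stringData, "base64").toString("utf8");
    return { ok: false, error: err.message, decoded };
  }
}

// First eight characters of the stringData value from this issue.
const result = parseJsonOrExplain("PD94bWwg");
console.log(result.ok, JSON.stringify(result.decoded));
```

Returning the decoded text instead of throwing would let the caller report "Access Denied" rather than an opaque SyntaxError.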

I believe this is happening because of an image I uploaded to Notion as the page cover. By default, the API doesn't give us a usable URL for the image.
It gives something like:
https://s3-us-west-2.amazonaws.com/secure.notion-static.com/40b79211-1ae6-427f-8b3f-85216732792a/Untitled.png
which is inaccessible.

I think a solution is to check whether the image URL points to Notion's AWS bucket and, if so, go through Notion's image endpoint instead:

if (image.includes("amazonaws.com") && image.includes("secure.notion-static.com")) {
    image = "https://www.notion.so/image/" + encodeURIComponent(image) + "?table=block&id=" + id
}

Where id is the ID of the page.

This should give something like:
https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F40b79211-1ae6-427f-8b3f-85216732792a%2FUntitled.png?table=block&id=2c5dd1f8-b268-40d7-ba88-2d1490a4a917

Which is properly accessible
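
For reference, the same check can be wrapped in a self-contained function. This is a sketch of the workaround proposed in this issue, not the library's actual code; toAccessibleImageUrl is a hypothetical name, and the id is assumed to be the dashed page ID.

```javascript
// Hypothetical helper: rewrite Notion's private S3 image URLs to go through
// the www.notion.so/image endpoint, as proposed in this issue.
function toAccessibleImageUrl(image, id) {
  const isNotionS3 =
    image.includes("amazonaws.com") &&
    image.includes("secure.notion-static.com");
  if (!isNotionS3) {
    return image; // External images (e.g. Unsplash covers) are left untouched.
  }
  return (
    "https://www.notion.so/image/" +
    encodeURIComponent(image) +
    "?table=block&id=" +
    id
  );
}

const proxied = toAccessibleImageUrl(
  "https://s3-us-west-2.amazonaws.com/secure.notion-static.com/40b79211-1ae6-427f-8b3f-85216732792a/Untitled.png",
  "2c5dd1f8-b268-40d7-ba88-2d1490a4a917"
);
console.log(proxied);
```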

Databases and table support

Hello there! Outstanding job with this! 🚀
Is there any plan to introduce support for databases and/or tables?

Example?

Two ideas to make this project more approachable

  1. A public example of input Notion and output/resulting hosted HTML file somewhere

  2. (If I'm understanding this project correctly) Ideally there'd be a way to run this without writing any code, via npx: a single command that takes a public Notion URL and converts it to static HTML that can be hosted anywhere. (I'm also looking at https://github.com/leoncvlt/loconotion as something similar)

Apologies if I'm misunderstanding what this is intended for though!

Subpages Support

Works well; however, it does not support sub-pages. Here is some demo code:

const NotionPageToHtml = require('notion-page-to-html');
const fs = require('fs');

async function getPage() {
  const { title, icon, cover, html } = await NotionPageToHtml.convert("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX");
  console.log(title);

  fs.writeFile("./out.html", html, function(err) {
    if(err) {
        return console.log(err);
    }
    console.log("The file was saved!");
  });
}

getPage();

Subpage support would make this much more useful, especially for generating static pages with GitHub Actions.

404 on every page I've tried so far

After updating from 1.1.2 to 1.1.3, every page I try to read throws "Can not find Notion Page of id [id]. Is the url correct? It is the original page or a redirect page (not supported)?"

I looked through this commit, but I'm still not sure why it is being triggered:
e190430

Here are some example page ids that work with 1.1.2 but not 1.1.3:
22425fea05234ab282bb79f5b81881c4
640f9f67-56c6-4f6b-a371-81de2b2b16fd
71f15f9d-3da7-457d-b32e-b5363a402c2e
736b903b-3e9b-4dcd-973c-9262bb16f545

resume releases?

Hello! Amazing project, thanks to everyone involved and especially @asnunes.

I saw that there has been no release for 5 months now. Can we do a new release on npm? I need the fix in c03060a so we can use it in a project.

For now I'll clone the project and publish it in a private npm repository.

Code formatting in Notion, not always well rendered.

Hello,

I think, but I'm not sure, that I've found a tiny bug in your API and it would make me so happy if you could take a minute to look at it, that would be great.
Let me give you some background.

I make my articles in Notion and then I use a process with n8n to send it to WordPress.
To do this, I use your API, which transforms the Notion page into HTML. It's a crazy thing: within seconds it arrives as a WordPress post and it's done. All I have to do is add an image made by ChatGPT. It's so easy, thank you!

Everything works fine as long as there's no code in the text.

When there is code, the code is present but not quite correctly formatted.
To help you understand better, I'll give you an example:

Here's the Notion page: https://lmvi.notion.site/Test-e3e31ae11fc94ab38d6c12ed98a226ae
Here's the page created by your wonderful API: https://notion-page-to-html-api.vercel.app/html?id=e3e31ae11fc94ab38d6c12ed98a226ae

If you had a couple of minutes to correct it, it would be so great and so wonderful.
1000 thanks for your help and your tool.
Sincerely
Jean-Marc Henry

Error when inputting a page that has an uploaded cover image

Whenever I try to get the HTML of a page with no cover or cover from Unsplash, everything works correctly. The moment I upload my own cover, I get

A server error has occurred
FUNCTION_INVOCATION_FAILED

When I look into the Vercel log, I get:

[GET] /?id=Page-with-a-custom-cover-65f2a5a62d7344768e95cfa91b423618
10:44:58:94
2022-01-29T09:45:01.776Z	1e9b4de1-f8c6-409f-a7f1-09860b5d1548	ERROR	Uncaught Exception 	{"errorType":"SyntaxError","errorMessage":"Unexpected token P in JSON at position 0","stack":["SyntaxError: Unexpected token P in JSON at position 0","    at JSON.parse (<anonymous>)","    at IncomingMessage.<anonymous> (/var/task/node_modules/notion-page-to-html/dist/utils/usecases/http-get/node-http-get.js:71:44)","    at IncomingMessage.emit (events.js:412:35)","    at endReadableNT (internal/streams/readable.js:1334:12)","    at processTicksAndRejections (internal/process/task_queues.js:82:21)"]}
Unknown application error occurred

I've deployed this app at https://notion-to-html.vercel.app/ and have been testing it with this Notion database: the no-cover page and the Unsplash cover page work, but the custom cover page does not. Thank you for this amazing extension. I hope we can find a solution, and I'll gladly follow up with further information if needed.

Private pages?

Hi, this does exactly what I need. For my use case, all the pages I want to convert are private. I'm willing to help submit a PR because it doesn't look like any other library can parse Notion blocks into HTML.

Alternatively, I'm already using the nishan library to talk to the API, and I really just need an exposed method that allows me to pass blocks into it.

I'm not sure which is easiest, but again happy to submit a PR.

Unable to find any public page NPM latest

NotionPageToHtml.convert("https://www.notion.so/asnunes/Simple-Page-Text-2-4d64bbc0634d4758befa85c5a3a6c22f").then((page) => console.log(page));

Error: Can not find Notion Page of id 4d64bbc0-634d-4758-befa-85c5a3a6c22f. Is the url correct? It is the original page or a redirect page (not supported)?

My setup:
npm version 8.3.1
node v16.14.0

Same error on:
https://npm.runkit.com/notion-page-to-html
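
As a side note on the ID in the error message: it is the 32-character ID at the end of the URL, rewritten in dashed 8-4-4-4-12 form. The sketch below shows that conversion so the reported ID can be checked against the URL. toDashedPageId is a hypothetical helper, and the library's own parsing may differ.

```javascript
// Hypothetical helper: extract the trailing 32-hex-character page ID from a
// Notion URL (or a bare ID, dashed or not) and return it in dashed form.
function toDashedPageId(urlOrId) {
  // Drop any query string or fragment, then remove dashes so that dashed and
  // undashed IDs are handled the same way.
  const compact = urlOrId.split(/[?#]/)[0].replace(/-/g, "");
  const match = compact.match(/[0-9a-f]{32}$/i);
  if (!match) {
    throw new Error("No 32-character page ID found in: " + urlOrId);
  }
  const id = match[0].toLowerCase();
  return [
    id.slice(0, 8),
    id.slice(8, 12),
    id.slice(12, 16),
    id.slice(16, 20),
    id.slice(20),
  ].join("-");
}

console.log(
  toDashedPageId(
    "https://www.notion.so/asnunes/Simple-Page-Text-2-4d64bbc0634d4758befa85c5a3a6c22f"
  )
);
```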

Some pages have incorrectly rendered images

Demo page and code:

import NotionPageToHtml from "notion-page-to-html";
import fs from "fs";

async function getPage() {
  const { html } = await NotionPageToHtml.convert(
    "https://jmlecoach.notion.site/Travis-Scott-x-Jordan-1-Low-OG-Olive-2ed7e492a0ed4588970efdf9a69ce954"
  );

  fs.writeFile("./out.html", html, function (err) {
    if (err) {
      return console.log(err);
    }
    console.log("The file was saved!");
  });
}

getPage();

The output ends up hiding some images and subtitles that are present in the original page.

MathJax Rendering

The MathJax rendering is erroneous. For example, it doesn't recognize \\ as a marker for the equation to be rendered on the next line.
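
A minimal illustration of the case, with made-up symbols. It assumes Notion's own renderer accepts a top-level \\ as a line break while the MathJax configuration used for the HTML output does not; neither assumption has been checked against the library's source. Wrapping the rows in an aligned environment, where \\ is a row separator, is one possible workaround.

```latex
% Top-level line break: expected to render on two lines in Notion,
% but reported here as not recognized in the converted HTML.
a = b + c \\
d = e + f

% Possible workaround: the same rows inside an environment
% in which \\ separates rows.
\begin{aligned}
a &= b + c \\
d &= e + f
\end{aligned}
```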

Outputs the contents of nested pages too

When you try to get the content of a page, it also adds the HTML content of any nested page.

Take this page as an example:
https://therohitdas.notion.site/Rohit-Das-99163302a0c44cd086292fd8f7637c5e
I have added a page, "Testing something don't mind", at the very bottom.

The output:
https://notion-page-to-html-api.vercel.app/html?id=99163302a0c44cd086292fd8f7637c5e
When you visit this link, you can see the content of the nested page at the very bottom.

What would be nice?

  • Only the top-level page is fetched and rendered as HTML.
  • Each nested page is rendered only as a link of the form https://notion-page-to-html-api.vercel.app/html?id=[ID FOR NESTED PAGE], so clicking it fetches that nested page. This cycle can repeat for deeper levels.
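
The second point could look roughly like the sketch below. nestedPageLink and API_BASE are hypothetical names used only for illustration; the base URL is the endpoint mentioned in this issue, and how the converter would actually emit the link is an open question.

```javascript
// Assumption: nested pages are served by the same /html endpoint as the parent.
const API_BASE = "https://notion-page-to-html-api.vercel.app/html";

// Hypothetical helper: build the link that would replace the inlined
// content of a nested page.
function nestedPageLink(nestedPageId) {
  return API_BASE + "?id=" + encodeURIComponent(nestedPageId);
}

console.log(nestedPageLink("99163302a0c44cd086292fd8f7637c5e"));
```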

FYI - I also hosted the API on Vercel and this is my endpoint: https://notion-page-to-html-api.vercel.app Don't use it for testing, as it points to a fork of this repo.

Unexpected token e in JSON at position 0

I'm getting the following error:

SyntaxError: Unexpected token e in JSON at position 0
    at JSON.parse (<anonymous>)
    at IncomingMessage.<anonymous> (/node_modules/notion-page-to-html/dist/utils/usecases/http-get/node-http-get.js:74:44)
    at IncomingMessage.emit (node:events:406:35)
    at endReadableNT (node:internal/streams/readable:1329:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)

When trying to use it with this page:

https://www.notion.so/smarterlabs/Setting-Up-a-New-Vercel-Website-584a1c27a10642d8869473588a5c1b45

I can get it to work just fine with the example page you have in the readme. So I'm assuming it's something on that page the notion-page-to-html module doesn't support.
