asnunes / notion-page-to-html
NodeJS tool to convert public Notion pages to HTML from page ID
License: MIT License
Hi! Thanks for this package ;)
My Notion URL is on notion.site rather than notion.so, so I think the package needs to accept other domain names.
Can I open a PR for that, please? ;)
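A minimal sketch of a check that accepts both domains. isNotionPageUrl is a hypothetical name, not the package's own validator, and the sketch assumes the page ID is the trailing 32 hex characters of the URL path, as in the URLs quoted elsewhere on this page.

```javascript
// Hypothetical helper, not part of notion-page-to-html.
// Accepts notion.so, *.notion.so and *.notion.site hosts.
function isNotionPageUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // not a parseable URL at all
  }
  const host = url.hostname.toLowerCase();
  const hostOk =
    host === "notion.so" ||
    host.endsWith(".notion.so") ||
    host.endsWith(".notion.site");
  // Assumption: the page ID is the last 32 hex characters of the path.
  const idOk = /[0-9a-f]{32}$/i.test(url.pathname);
  return url.protocol === "https:" && hostOk && idOk;
}
```

Matching on the host suffix, including the leading dot, keeps look-alike hosts such as evilnotion.so from passing.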
Hi
I seem to be having an issue with uploaded images.
This is the URL I am trying to use with the module https://www.notion.so/dhavalsoneji/2c5dd1f8b26840d7ba882d1490a4a917
I get this error:
error - uncaughtException: SyntaxError: Unexpected token P in JSON at position 0
at JSON.parse (<anonymous>)
at IncomingMessage.<anonymous> (/Users/dhaval/git-clones/portfolio/node_modules/notion-page-to-html/dist/utils/usecases/http-get/node-http-get.js:72:44)
at IncomingMessage.emit (node:events:406:35)
at IncomingMessage.emit (node:domain:475:12)
at endReadableNT (node:internal/streams/readable:1343:12)
at processTicksAndRejections (node:internal/process/task_queues:83:21)
Which is happening because stringData is being passed to JSON.parse, but its value is:
PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPEVycm9yPjxDb2RlPkFjY2Vzc0RlbmllZDwvQ29kZT48TWVzc2FnZT5BY2Nlc3MgRGVuaWVkPC9NZXNzYWdlPjxSZXF1ZXN0SWQ+WENKU0tNQzM5VEhQMTU5RjwvUmVxdWVzdElkPjxIb3N0SWQ+bjV2T3I4NEpURk5CWE5Ed1RKeWN4U0FocU5pUDJqSkd2U2dEcFl6ckcwQU5uQ1B6cUNUWHZLODJxZndFOE1OakwxOFFuNlpxSkU4PTwvSG9zdElkPjwvRXJyb3I+
Which base64-decodes to:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>XCJSKMC39THP159F</RequestId>
<HostId>n5vOr84JTFNBXNDwTJycxSAhqNiP2jJGvSgDpYzrG0ANnCPzqCTXvK82qfwE8MNjL18Qn6ZqJE8=</HostId>
</Error>
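The decoding step above can be reproduced with Node's built-in Buffer. This is only a sketch: decodeBase64Body is a hypothetical name, and only the first 52 characters of the reported value are used to keep the example short.

```javascript
// Hypothetical helper: decodes a base64 string to UTF-8 text using Node's Buffer.
function decodeBase64Body(body) {
  return Buffer.from(body, "base64").toString("utf8");
}

// First 52 characters of the value reported above.
const prefix = "PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4K";
console.log(decodeBase64Body(prefix)); // prints the XML declaration line
```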
I believe this is happening due to an image I uploaded to notion for the page cover. The API by default doesn't give us a useful URL to get the image.
It will give something like:
https://s3-us-west-2.amazonaws.com/secure.notion-static.com/40b79211-1ae6-427f-8b3f-85216732792a/Untitled.png
Which is inaccessible
I think a solution is to check whether the image URL points to Notion's AWS storage and, if so, use Notion's image endpoint instead:
if (image.includes("amazonaws.com") && image.includes("secure.notion-static.com")) {
  image = "https://www.notion.so/image/" + encodeURIComponent(image) + "?table=block&id=" + id;
}
where id is the ID of the page.
This should give something like:
https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F40b79211-1ae6-427f-8b3f-85216732792a%2FUntitled.png?table=block&id=2c5dd1f8-b268-40d7-ba88-2d1490a4a917
Which is properly accessible
Hello there! Outstanding job with this!
Is there any plan to introduce support for databases and/or tables?
Two ideas to make this project more approachable:
1. A public example of an input Notion page and the resulting HTML file, hosted somewhere.
2. (If I'm understanding this project correctly) Ideally there'd be a way to run this without writing any code, via npx, so that a single command takes a public Notion URL and converts it to static HTML that can be hosted anywhere. (I'm also looking at https://github.com/leoncvlt/loconotion as something similar.)
Apologies if I'm misunderstanding what this is intended for, though!
Works well, but it does not support sub-pages. Here is some demo code:
const NotionPageToHtml = require('notion-page-to-html');
const fs = require('fs');

async function getPage() {
  // Replace the placeholder with a real page ID.
  const { title, icon, cover, html } = await NotionPageToHtml.convert("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX");
  console.log(title);
  // Write the converted HTML to disk.
  fs.writeFile("./out.html", html, function (err) {
    if (err) {
      return console.log(err);
    }
    console.log("The file was saved!");
  });
}

getPage();
Subpages would be much more useful, especially for generating static pages with github actions
Uncaught (in promise) InvalidPageUrlError: Url "https://vladkrutenyuk.notion.site/Blog-Post-9173b6a515c54ad6bc9738f0001cd9af" is not a valid notion page.
After updating from 1.1.2 to 1.1.3, every page I try to read throws "Can not find Notion Page of id [id]. Is the url correct? It is the original page or a redirect page (not supported)?"
I looked through this commit, but I'm still not sure why the error is being triggered:
e190430
Here are some example page ids that work with 1.1.2 but not 1.1.3:
22425fea05234ab282bb79f5b81881c4
640f9f67-56c6-4f6b-a371-81de2b2b16fd
71f15f9d-3da7-457d-b32e-b5363a402c2e
736b903b-3e9b-4dcd-973c-9262bb16f545
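The IDs listed above come in both undashed and dashed form. As a sketch of how the two forms relate (not a diagnosis of the 1.1.3 regression), the hypothetical helper below converts either form to the dashed 8-4-4-4-12 layout that appears in the package's error messages.

```javascript
// Hypothetical helper, not part of notion-page-to-html.
// Converts a 32-character hex page ID to the dashed 8-4-4-4-12 form;
// IDs that are already dashed come back unchanged (lowercased).
function toDashedId(id) {
  const hex = id.replace(/-/g, "").toLowerCase();
  if (!/^[0-9a-f]{32}$/.test(hex)) {
    throw new Error(`Not a 32-character hex ID: ${id}`);
  }
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

console.log(toDashedId("22425fea05234ab282bb79f5b81881c4"));
```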
Hello,
I think, but I'm not sure, that I've found a tiny bug in your API, and it would make me very happy if you could take a minute to look at it.
Let me give you some background.
I write my articles in Notion and then use an n8n workflow to send them to WordPress.
To do this, I use your API, which transforms the Notion page into HTML. It's amazing: within seconds the article arrives as a WordPress post, and all I have to do is add an image made by ChatGPT. It's so easy, thank you!
Everything works fine as long as there's no code in the text.
When there is code, the code is present but not quite correctly formatted.
To help you understand better, I'll give you an example:
Here's the Notion page: https://lmvi.notion.site/Test-e3e31ae11fc94ab38d6c12ed98a226ae
Here's the page created by your wonderful API: https://notion-page-to-html-api.vercel.app/html?id=e3e31ae11fc94ab38d6c12ed98a226ae
If you had a couple of minutes to fix it, that would be wonderful.
1000 thanks for your help and your tool.
Sincerely
Jean-Marc Henry
Whenever I try to get the HTML of a page with no cover or with a cover from Unsplash, everything works correctly. The moment I upload my own cover, I get:
A server error has occurred
FUNCTION_INVOCATION_FAILED
When I look into the Vercel log, I get:
[GET] /?id=Page-with-a-custom-cover-65f2a5a62d7344768e95cfa91b423618
10:44:58:94
2022-01-29T09:45:01.776Z 1e9b4de1-f8c6-409f-a7f1-09860b5d1548 ERROR Uncaught Exception {"errorType":"SyntaxError","errorMessage":"Unexpected token P in JSON at position 0","stack":["SyntaxError: Unexpected token P in JSON at position 0"," at JSON.parse (<anonymous>)"," at IncomingMessage.<anonymous> (/var/task/node_modules/notion-page-to-html/dist/utils/usecases/http-get/node-http-get.js:71:44)"," at IncomingMessage.emit (events.js:412:35)"," at endReadableNT (internal/streams/readable.js:1334:12)"," at processTicksAndRejections (internal/process/task_queues.js:82:21)"]}
Unknown application error occurred
I've deployed this app on https://notion-to-html.vercel.app/ and have been testing it with this Notion database: the no-cover page and the Unsplash-cover page work, but the custom-cover page does not. Thank you for this amazing extension. I hope we can find a solution, and I'll gladly follow up with further information if needed.
Hi, I'm not really looking for complete Notion page conversion. Could you please expose an API that converts just a Notion RichText block to HTML? Basically the opposite of what https://github.com/instantish/martian does.
Hi, this does exactly what I need. For my use case, all the pages I want to convert are private. I'm willing to help submit a PR because it doesn't look like any other library can parse Notion blocks into HTML.
Alternatively, I'm already using the nishan library to talk to the API, and I really just need an exposed method that allows me to pass blocks into it.
I'm not sure which is easiest, but again happy to submit a PR.
NotionPageToHtml.convert("https://www.notion.so/asnunes/Simple-Page-Text-2-4d64bbc0634d4758befa85c5a3a6c22f").then((page) => console.log(page));
Error: Can not find Notion Page of id 4d64bbc0-634d-4758-befa-85c5a3a6c22f. Is the url correct? It is the original page or a redirect page (not supported)?
My setup:
npm version 8.3.1
node v16.14.0
Same error on:
https://npm.runkit.com/notion-page-to-html
Demo page and code:
import NotionPageToHtml from "notion-page-to-html";
import fs from "fs";

async function getPage() {
  const { html } = await NotionPageToHtml.convert(
    "https://jmlecoach.notion.site/Travis-Scott-x-Jordan-1-Low-OG-Olive-2ed7e492a0ed4588970efdf9a69ce954"
  );
  fs.writeFile("./out.html", html, function (err) {
    if (err) {
      return console.log(err);
    }
    console.log("The file was saved!");
  });
}

getPage();
The output ends up hiding some images and subtitles that are present in the original page.
The MathJax rendering is erroneous. For example, it doesn't recognize \\ as the marker for continuing the equation on the next line.
When you try to get the content of a page, it also adds the HTML content of any nested page.
Take this page as an example:
https://therohitdas.notion.site/Rohit-Das-99163302a0c44cd086292fd8f7637c5e
I have added a page titled "Testing something don't mind" at the very bottom.
The output:
https://notion-page-to-html-api.vercel.app/html?id=99163302a0c44cd086292fd8f7637c5e
When you visit this link, you can see the content of the nested page at the very bottom.
What would be nice?
Rendering each nested page as a link of this form instead:
https://notion-page-to-html-api.vercel.app/html?id=[ID FOR NESTED PAGE]
So when you click one of those links, you actually fetch the nested page, and this cycle can keep going.
FYI: I also host the API on Vercel, and this is my endpoint: https://notion-page-to-html-api.vercel.app
Don't use it for testing, as it points to a fork of this repo.
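A minimal sketch of how such a link could be built for a nested page. nestedPageLink is a hypothetical name; the /html path and the id query parameter are taken from the example link in this report, and nothing here reflects how the package renders nested pages today.

```javascript
// Hypothetical helper: builds the link a nested page could point to.
// apiBase is the base URL of the API deployment.
function nestedPageLink(apiBase, nestedPageId) {
  const url = new URL("/html", apiBase);
  // Dashes are stripped so the ID matches the undashed form used in the example link.
  url.searchParams.set("id", nestedPageId.replace(/-/g, ""));
  return url.toString();
}

console.log(
  nestedPageLink(
    "https://notion-page-to-html-api.vercel.app",
    "99163302a0c44cd086292fd8f7637c5e"
  )
);
```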
I'm getting the following error:
SyntaxError: Unexpected token e in JSON at position 0
at JSON.parse (<anonymous>)
at IncomingMessage.<anonymous> (/node_modules/notion-page-to-html/dist/utils/usecases/http-get/node-http-get.js:74:44)
at IncomingMessage.emit (node:events:406:35)
at endReadableNT (node:internal/streams/readable:1329:12)
at processTicksAndRejections (node:internal/process/task_queues:83:21)
When trying to use it with this page:
https://www.notion.so/smarterlabs/Setting-Up-a-New-Vercel-Website-584a1c27a10642d8869473588a5c1b45
I can get it to work just fine with the example page you have in the readme. So I'm assuming it's something on that page the notion-page-to-html module doesn't support.
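The stack trace shows JSON.parse failing on a response body that is not JSON. As a sketch of a more informative failure mode, the hypothetical helper below (not the package's actual http-get code) reports the start of the body instead of a bare SyntaxError, which would make it easier to see what the server returned.

```javascript
// Hypothetical helper, not the package's actual http-get code.
// Parses a response body as JSON and, on failure, throws an error that
// includes the first 80 characters of the body.
function parseJsonBody(raw) {
  try {
    return JSON.parse(raw);
  } catch (err) {
    const preview = String(raw).slice(0, 80);
    throw new Error(`Expected a JSON response but got: ${preview}`);
  }
}
```

The caller can then log or surface the message rather than crashing with an uncaught exception.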
I've tried it with "https://separated-dance-8ef.notion.site/Elettrotecnica-2853f50e539a4b03a9ee26fca4bbd3ba", but I only get the titles of the toggles and cannot open them.