
leoncvlt / loconotion


📄 Python tool to turn Notion.so pages into lightweight, customizable static websites

CSS 9.15% JavaScript 7.33% Python 80.98% Dockerfile 2.54%
notion pyhton scraping static-site-generator

loconotion's People

Contributors

2m, aahnik, aehernandez, akash-sharma-1, bryanhpchiang, dosoft, douglasjarquin, ecolabardini, joonatanjak, kevindaffaarr, leoncvlt, leshchenko1979, nanobjorn, nijatismayilzada, noahsaso, safaorhan, sathesh95, sunz1e, vincent-maladiere, web3gurung, zvedenyuk


loconotion's Issues

A page is not being parsed ...

I have my configuration like this:

name = "mysite"

page = "https://www.notion.so/top-level-page"

[site]

  [[site.meta]]
  name = "title"
  content = "Profile"
  [[site.meta]]
  name = "description"
  content = "Home page"
  

[pages]
  [pages.subpage]
    slug = "Support"
    
    [[pages.subpage.meta]]
    name = "title"
    content = "Support"
    [[pages.subpage.meta]]
    name = "robots"
    content = "noindex"
    

The URLs are replaced with top-level-page and subpage here for demonstration purposes.
The page configured under [pages.3e0fdde4ea6b09db986d7fbcb4a] is not being parsed.

Is it because it is not a sub-page of the main top-level page in Notion?
The top-level page (index.html) does not link to the subpage.

But I still want that subpage to be available at that slug.

How can I do this?

Add command option to change where the website is saved

I think letting the user choose where the website is saved would be better than always creating a dist folder in the working directory.

The option could be
-d /directory

where d stands for destination.
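
A minimal sketch of how such an option could be parsed with argparse. The -d/--destination flag and the parser name are hypothetical; they illustrate the request above and are not an existing loconotion option.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a small CLI parser with a destination-folder option."""
    parser = argparse.ArgumentParser(prog="loconotion-sketch")
    parser.add_argument("target", help="Notion page URL or path to a config file")
    # Hypothetical flag: illustrates the proposal, not an existing option.
    parser.add_argument(
        "-d",
        "--destination",
        type=Path,
        default=Path("dist"),
        help="folder the generated site is written to (default: ./dist)",
    )
    return parser


# Example invocation with an explicit destination folder.
args = build_parser().parse_args(
    ["https://www.notion.so/top-level-page", "-d", "/tmp/mysite"]
)
print(args.destination)
```

Keeping dist as the default would leave the current behaviour unchanged for anyone who does not pass the flag.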

Enable Custom Favicons

It's a minor thing, but being able to add a custom favicon would be a nice-to-have for personalizing these sites. Other cosmetic adjustments to pages could be explored in the future, but this would be a relatively straightforward change: it just requires ensuring the metadata is set correctly if the person provides a file path. It probably makes sense to handle this in the site config file.

https://favicon.io/tutorials/what-is-a-favicon/

https://stackoverflow.com/questions/11893478/add-favicon-to-website

Support for new Notion URL format

Not 100% sure, but I believe the URL format for Notion shared pages recently changed.

It's now notion.site instead of notion.so:

Editing view: https://www.notion.so/bryanchiang/Bryan-Chiang-fc01c67a1ed9402e83eb8efd5c99a216
Shared view: https://bryanchiang.notion.site/Bryan-Chiang-fc01c67a1ed9402e83eb8efd5c99a216

I get a parser error with the second one.

Ito-MacBook:loconotion bryanhpchiang$ python3 loconotion https://www.notion.so/bryanchiang/fc01c67a1ed9402e83eb8efd5c99a216
[23:09:54] INFO Initialising parser with simple page url
[23:09:54] INFO Setting output path to 'dist/bryanchiang/fc01c67a1ed9402e83eb8efd5c99a216'
[23:09:54] INFO Initialising chromedriver at /usr/local/lib/python3.9/site-packages/chromedriver_autoinstaller/91/chromedriver
[23:09:56] INFO Parsing page 'https://www.notion.so/bryanchiang/fc01c67a1ed9402e83eb8efd5c99a216'
[23:10:57] CRITICAL Timeout waiting for page content to load, or no content found. Are you sure the page is set to public?
Traceback (most recent call last):
  File "/usr/local/Cellar/python@3.9/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/Cellar/python@3.9/3.9.4/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/bryanhpchiang/Documents/Workspace/loconotion/loconotion/__main__.py", line 144, in <module>
    main()
  File "/Users/bryanhpchiang/Documents/Workspace/loconotion/loconotion/__main__.py", line 123, in main
    Parser(config=config, args=vars(args))
  File "/Users/bryanhpchiang/Documents/Workspace/loconotion/loconotion/notionparser.py", line 85, in __init__
    self.run(url)
  File "/Users/bryanhpchiang/Documents/Workspace/loconotion/loconotion/notionparser.py", line 667, in run
    f"Finished!\n\nProcessed {len(tot_processed_pages)} pages in {formatted_time}"
TypeError: object of type 'NoneType' has no len()

I'll try modifying the check for a valid notion.so URL.
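
One way such a check could accept both URL formats is to compare the hostname against a list of known suffixes. This is only a sketch: the function name and the suffix list are assumptions, and loconotion's actual validation may look different.

```python
from urllib.parse import urlparse

# Assumption: these two host suffixes cover the formats mentioned above.
NOTION_HOST_SUFFIXES = ("notion.so", "notion.site")


def is_notion_url(url: str) -> bool:
    """Return True if the URL's host is notion.so, notion.site, or a subdomain."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in NOTION_HOST_SUFFIXES
    )
```

Matching on the parsed hostname rather than on the raw string avoids accepting URLs that merely contain "notion.so" somewhere in their path.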

Trouble with installing chromedriver-autoinstaller

Hi,

I keep running into the following error when I try to pip install -r requirements.txt

ERROR: Could not find a version that satisfies the requirement chromedriver-autoinstaller==0.2.0 (from -r requirements.txt (line 11)) (from versions: none)
ERROR: No matching distribution found for chromedriver-autoinstaller==0.2.0 (from -r requirements.txt (line 11))

I've also tried manually installing chromedriver and even placed it in /usr/local/bin/, but to no avail. Please advise if I'm making some sort of noob error here. Any and all help appreciated.

Thanks! :)

Timeout when trying to load pages with embeds at the bottom

I have two pages that constantly fail to be loaded by loconotion:

While trying to download them, I get the error message:

CRITICAL Timeout waiting for page content to load, or no content found. Are you sure the page is set to public?

I suspect the reason is that they are the only pages in my site that contain an embedded form at the end.

The workaround is to run chromedriver in non-headless mode (loconotion switch --non-headless), wait for the page to open, and then manually scroll to the bottom so that the form loads.

Judging from the code, the script waits for all elements to be loaded, but the embeds seem to start loading properly only after you scroll to them.
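
The manual scrolling could in principle be automated. Below is a rough sketch that scrolls until the page height stops growing. It assumes `driver` exposes Selenium's `execute_script` method and that scrolling the window is enough; Notion may scroll an inner container instead, in which case the script strings would need adjusting. The function name is made up for illustration.

```python
import time


def scroll_until_stable(driver, pause: float = 0.5, max_rounds: int = 20) -> int:
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is any object exposing Selenium's `execute_script(script)` method.
    Returns the number of scroll rounds performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for round_no in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy-loaded embeds time to request their content
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return round_no
        last_height = new_height
    return max_rounds
```

Calling this before the wait for page content would give embeds at the bottom a chance to start loading in headless mode as well.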

Links on subpages don't work

Hello,
When I click a link in a Notion subpage that refers to another Notion page, nothing happens. Right-clicking and choosing "Open in a new tab" works.
The links work once I remove contenteditable="true" data-content-editable-leaf="true" next to each link in the source of the generated HTML pages.

Getting parsing errors on the Example Notion page

Hi,

I was trying to build files for the example Notion page given in the README itself, and I received some parsing errors on it.

Here's the error stack trace (screenshot):

I suspect it's related to the parsing of toggle buttons from the toggling list existing on the page.

Chromedriver Crashes when Program is run outside of loconotion folder

Traceback (most recent call last):
  File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/loconotion/loconotion/__main__.py", line 144, in <module>
    main()
  File "/home/ubuntu/loconotion/loconotion/__main__.py", line 134, in main
    Parser(config=parsed_config, args=vars(args))
  File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 84, in __init__
    self.driver = self.init_chromedriver()
  File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 240, in init_chromedriver
    return webdriver.Chrome(
  File "/home/ubuntu/.local/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
    self.service.start()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 98, in start
    self.assert_process_still_running()
  File "/home/ubuntu/.local/lib/python3.9/site-packages/selenium/webdriver/common/service.py", line 109, in assert_process_still_running
    raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: Service /usr/bin/chromedriver unexpectedly exited. Status code was: 1

I ran my script from the home folder, and I got this error. When I run my autoupload-to-ghpages script (that exists in the home folder) inside the loconotion directory, it runs fine. What's going on?

For now, I'll specify to cd to the loconotion directory inside the script, but I just thought that this was weird.

Blog post properties view distorted on iOS Safari

The properties section under the title and before the blog post content is distorted with weird padding all over the place on Safari on iOS. I'll try and get screenshots.

Has anyone else encountered this issue?

Issues on iOS mobile

I found this from your comment on HN. Great work. I am looking forward to using this.

Based on a reply to that comment, I tried https://loconotion-example.netlify.app/ on an iPhone, and I can't seem to scroll left-right. This happens in both landscape and portrait mode.

Details of my phone

iPhone 8 Plus with a 5.5-inch display

Please let me know if I can provide any details.

Again, great work! Thanks.

CRITICAL FileNotFoundError: [Errno 2] No such file or directory: 'dist/Notion Test Site/loconotion-example-page-03c403f4fdc94cc1b315b9469a8950efnotion.site/loconotion-example.html'

First of all, thank you for a great tool!

Getting this error while trying to render example_site.toml:

CRITICAL FileNotFoundError: [Errno 2] No such file or directory: 'dist/Notion Test Site/loconotion-example-page-03c403f4fdc94cc1b315b9469a8950efnotion.site/loconotion-example.html'

Probably the same problem as in #50

Full log:

$ python3 loconotion example/example_site.toml
[04:00:37] INFO Initialising parser with configuration file
[04:00:37] INFO Setting output path to 'dist/Notion Test Site'
[04:00:37] INFO Initialising chromedriver at /usr/local/lib/python3.7/site-packages/chromedriver_autoinstaller/93/chromedriver
[04:00:39] INFO Parsing page 'https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950ef'
[04:00:50] INFO Got the domain as https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#a0247055e2754381a8472aa985e43e44
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#ffa3779739fd4ba286b7f8462f9e8e60
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#21a306a517ac44549694b7d7d267152f
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#78fec4dc40814c12ac98b153763da356
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#1d2ac510eec646fe879145360a6d886d
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#6aeb70d2a0244356b83fd91a04ad751c
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Media-be1a5c3e1c9640a0ab9ba0ba9b67e6a5
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Code-1a6eb2daf2eb4508847433d6560a47bf
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Embeds-7228df8683b6471ea958e1e2249aceaf
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Inline-Databases-d77dafb46a22444899e7ab0525a318dc
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/d2fa06f244e64f66880bb0491f58223d?v=96639eede8a14a808fa2c9a075f6bb0a
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/2483a3a5c3fd445980c1adc8e550b552?v=e6f8ca9db37444f792ed7db9511a21fc
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/2604ce45890645c79f67d92833083fee?v=e138f6fdcea24f87b442577732b2052d
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/a28dba2e7a67448da52f2cd2c641407b?v=9d90e33318404691bec3b62c8b4d87a5
[04:00:50] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/9e5a383774ae4d99a42f8632776f96bd?v=d8bfa6070e344d2fb877c25ecd81127f
[04:00:50] INFO Exporting page 'https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950ef' as 'index.html'
[04:00:50] INFO Parsing page 'https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef'
[04:00:59] INFO Got the domain as https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#a0247055e2754381a8472aa985e43e44
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#ffa3779739fd4ba286b7f8462f9e8e60
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#21a306a517ac44549694b7d7d267152f
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#78fec4dc40814c12ac98b153763da356
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#1d2ac510eec646fe879145360a6d886d
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef#6aeb70d2a0244356b83fd91a04ad751c
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Media-be1a5c3e1c9640a0ab9ba0ba9b67e6a5
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Code-1a6eb2daf2eb4508847433d6560a47bf
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Embeds-7228df8683b6471ea958e1e2249aceaf
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Inline-Databases-d77dafb46a22444899e7ab0525a318dc
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/d2fa06f244e64f66880bb0491f58223d?v=96639eede8a14a808fa2c9a075f6bb0a
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/2483a3a5c3fd445980c1adc8e550b552?v=e6f8ca9db37444f792ed7db9511a21fc
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/2604ce45890645c79f67d92833083fee?v=e138f6fdcea24f87b442577732b2052d
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/a28dba2e7a67448da52f2cd2c641407b?v=9d90e33318404691bec3b62c8b4d87a5
[04:00:59] INFO Got this as href https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/9e5a383774ae4d99a42f8632776f96bd?v=d8bfa6070e344d2fb877c25ecd81127f
[04:00:59] INFO Exporting page 'https://www.notion.so/Loconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950efnotion.site/Loconotion-Example-03c403f4fdc94cc1b315b9469a8950ef' as 'loconotion-example-page-03c403f4fdc94cc1b315b9469a8950efnotion.site/loconotion-example.html'
[04:00:59] CRITICAL FileNotFoundError: [Errno 2] No such file or directory: 'dist/Notion Test Site/loconotion-example-page-03c403f4fdc94cc1b315b9469a8950efnotion.site/loconotion-example.html'
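
In the log above, the "domain" is the full page URL with "notion.site" appended, which then ends up in the export path. A sketch of deriving the site root from the parsed URL instead is shown here. The helper names are made up for illustration, and this is not necessarily how the parser is structured.

```python
from urllib.parse import urljoin, urlparse


def site_root(url: str) -> str:
    """Return only the scheme and host of a URL, e.g. 'https://www.notion.so'."""
    parts = urlparse(url)
    return f"{parts.scheme}://{parts.netloc}"


def absolute_link(page_url: str, href: str) -> str:
    """Resolve a root-relative href against the site root of the current page."""
    return urljoin(site_root(page_url) + "/", href)
```

With the page URL from the log, `site_root` yields https://www.notion.so regardless of whether the page lives on notion.so or notion.site, so no suffix has to be appended by hand.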

[feature request] GitHub Action that will deploy to github pages.

GitHub Pages can be used to host static content easily.

It would be great if there was a GitHub Action that could do the job. It would rebuild the pages periodically or when triggered by the user. The GitHub repo would contain the HTML files to serve.

So the user experience would be something like this:

  1. Make your desired Notion pages public
  2. Create a GitHub repo from a template (the template would contain the GitHub Action workflow script and an empty config file) to host the static content
  3. Edit the config file to set the pages and other things
  4. The script will be triggered on a push to the main branch
  5. The script can be run manually at times, or will run periodically (cron expression set by the user)

GitHub Pages URLs have the form user.github.io/repo/page_slug

Users could also use a custom domain (configure the GitHub Pages URL in the repo settings)
thus: custom.com/page_slug

Add support to Google Analytics

Thanks for creating such a great library.

I tried to hack your TOML file without success.

I would like to add the following code (from Google Analytics) to each page's header:

<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-xxxxxxxxxxx"></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());

  gtag('config', 'G-xxxxxxxxx');
</script>

Any possible way?

This would be great for tracking how users navigate through the site.
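
Until something like this is supported in the config, a post-processing step over the generated HTML files could insert the snippet. The sketch below is a hypothetical helper, not part of loconotion; it places an arbitrary snippet just before the closing head tag.

```python
import re


def inject_into_head(html: str, snippet: str) -> str:
    """Insert `snippet` immediately before the first closing </head> tag."""
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no </head> tag found")
    return html[: match.start()] + snippet + "\n" + html[match.start():]
```

Running it over every .html file in the output folder with the Google Analytics tag as the snippet would add tracking to each page.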

Suddenly changing to Notion's support page

My page is https://www.notion.so/jamesdeluk/James-IT-Notes-9969909992c04b5ba3a734cdf0a74530

Output basically explains what's going on. It stops parsing my site, changes to a random Notion official one, then errors out.

# does most of my site, and then:
# my page
[12:35:53] [INFO] Parsing page 'https://www.notion.so/Web-Exploitation-65246df297494ae9a22e5d0f866e973b'
[12:36:00] [INFO] Downloading 'https://notion-emojis.s3-us-west-2.amazonaws.com/v0/svg-twitter/1f97d.svg'
[12:36:01] [INFO] Exporting page 'https://www.notion.so/Web-Exploitation-65246df297494ae9a22e5d0f866e973b' as 'web-exploitation.html'

# my page
[12:36:01] [INFO] Parsing page 'https://www.notion.so/Burp-Suite-65adbb35c50349719abf8e07ee71f600'
[12:36:08] [INFO] Downloading 'https://notion-emojis.s3-us-west-2.amazonaws.com/v0/svg-twitter/1f34a.svg'
[12:36:09] [INFO] Exporting page 'https://www.notion.so/Burp-Suite-65adbb35c50349719abf8e07ee71f600' as 'burp-suite.html'

# notion page?
[12:36:09] [INFO] Parsing page 'https://www.notion.so/50de58f824bf4100a4a9fa5b6f783c2b'
[12:36:18] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fff307fed-4664-46b0-8de1-6fdcc1e937b2%2FNotion-logo.png?table=block&id=83715d77-03ee-4b86-99b5-e659a4712dd8&width=40&userId=&cache=v2'
[12:36:20] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F6de57ff9-a9eb-41d6-914c-597fe52a7294%2Fnotion-logo-no-background.png?table=block&id=e040febf-70a9-4950-b862-0e6f00005004&width=40&userId=&cache=v2'
[12:36:20] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fdbe4d424-b449-43c3-917a-4dc7b0fd6eca%2FTable_of_Contents.png?table=block&id=261da5f7-cc88-462a-a754-cf6ddcfd8145&width=2990&userId=&cache=v2'
[12:36:22] [INFO] Downloading 'https://s3.us-west-2.amazonaws.com/secure.notion-static.com/8d0a144c-6218-49ed-ae78-397a42a12de7/TOC_-_jump.gif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20210418%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210418T113611Z&X-Amz-Expires=86400&X-Amz-Signature=682eadf21e2398c2ef20059255d80ea15c37a7d5d67174c071e5fe8932914f7e&X-Amz-SignedHeaders=host'
[12:36:23] [INFO] Downloading 'https://s3.us-west-2.amazonaws.com/secure.notion-static.com/2fa141d8-953e-4d34-9244-25abf7b99afe/Table_of_Contents.gif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20210418%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210418T113611Z&X-Amz-Expires=86400&X-Amz-Signature=b38b8aa5999d210f25a1f8e1704cd08ee3ac890c33a747b77bd211a83568e720&X-Amz-SignedHeaders=host'
[12:36:25] [INFO] Downloading 'https://s3.us-west-2.amazonaws.com/secure.notion-static.com/5a399489-e3aa-41f8-b6e0-0dc4ab508068/TOC_-_color.gif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20210418%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210418T113611Z&X-Amz-Expires=86400&X-Amz-Signature=f8291d154de4affd67885b606c6071fa96c3ecde76c39c4748d9e5cba1d88fd9&X-Amz-SignedHeaders=host'
[12:36:27] [INFO] Downloading 'https://s3.us-west-2.amazonaws.com/secure.notion-static.com/59fa2459-57ee-45d1-ac37-55d0eea35e4e/Drag_and_drop_TOC.gif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20210418%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210418T113611Z&X-Amz-Expires=86400&X-Amz-Signature=5502e27f634e03694e717ec055278c5434178d0dab76eb5ee312f3ce3d53c37b&X-Amz-SignedHeaders=host'
[12:36:29] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fb6491bb2-ef11-407f-8a01-373fb32c8868%2FDelete_and_Duplicate_TOC.png?table=block&id=d8147075-76f4-421a-8722-464f4b8d74d0&width=2990&userId=&cache=v2'
[12:36:31] [INFO] Exporting page 'https://www.notion.so/50de58f824bf4100a4a9fa5b6f783c2b' as '50de58f824bf4100a4a9fa5b6f783c2b.html'

# notion page
[12:36:31] [INFO] Parsing page 'https://www.notion.so/Notion-Official-83715d7703ee4b8699b5e659a4712dd8'
[12:36:39] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fff307fed-4664-46b0-8de1-6fdcc1e937b2%2FNotion-logo.png?table=block&id=83715d77-03ee-4b86-99b5-e659a4712dd8&width=250&userId=&cache=v2'
[12:36:40] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fc0bd23d0-b4d9-4f2c-9f31-24745bfab4bd%2Fnotion-logo-no-background.png?table=block&id=181e961a-eb5c-4ee6-9153-07c0dfd5156d&width=40&userId=&cache=v2'
[12:36:40] [INFO] Downloading 'https://s3.us-west-2.amazonaws.com/secure.notion-static.com/6425a4d2-6a96-4fa7-83d8-44389e9337f7/notion-logo.svg?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20210418%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210418T113633Z&X-Amz-Expires=86400&X-Amz-Signature=e2223a99b36d124cdb27bf780661208082483c7e9ebc28c60074177eab4ba8fd&X-Amz-SignedHeaders=host'
[12:36:41] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F44fe1c94-d71a-427d-82ca-ed142b2fc323%2Fnotion-logo-no-background.png?table=block&id=04f306fb-f59a-413f-ae15-f42e2a1ab029&width=40&userId=&cache=v2'
[12:36:42] [INFO] Exporting page 'https://www.notion.so/Notion-Official-83715d7703ee4b8699b5e659a4712dd8' as 'notion-official.html'

# notion page
[12:36:42] [INFO] Parsing page 'https://www.notion.so/What-s-New-157765353f2c4705bd45474e5ba8b46c'
[12:37:44] [CRITICAL] Timeout waiting for page content to load, or no content found. Are you sure the page is set to public?
[12:37:44] [INFO] Parsing page 'https://www.notion.so/Notion-Template-Gallery-181e961aeb5c4ee6915307c0dfd5156d'
[12:38:14] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F0bcaa8ea-b972-4c03-972a-d1dc539523dd%2FTemplate_gallery_Jan_2021.png?table=block&id=181e961a-eb5c-4ee6-9153-07c0dfd5156d&width=3840&userId=&cache=v2'
[12:38:15] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fc0bd23d0-b4d9-4f2c-9f31-24745bfab4bd%2Fnotion-logo-no-background.png?table=block&id=181e961a-eb5c-4ee6-9153-07c0dfd5156d&width=250&userId=&cache=v2'
# about 400 of those then
[12:44:25] [INFO] Exporting page 'https://www.notion.so/Notion-Template-Gallery-181e961aeb5c4ee6915307c0dfd5156d' as 'notion-template-gallery.html'
[12:44:25] [INFO] Parsing page 'https://www.notion.so/notion/Duplicate-public-pages-d8a461baeeb54d91b156ff5559192321'
[12:44:43] [INFO] Downloading 'https://s3.us-west-2.amazonaws.com/secure.notion-static.com/c882bb85-6a40-4c1f-8c3d-8b3267078904/ScreenFlow.gif?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAT73L2G45O3KS52Y5%2F20210418%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20210418T114426Z&X-Amz-Expires=86400&X-Amz-Signature=b3d877b882dc3b2042eab90497a3a74884d16a7e5a3c6f55cd7fe18098c242e6&X-Amz-SignedHeaders=host'
[12:44:45] [INFO] Downloading 'https://www.notion.so/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2Fe99a75c6-ef1d-42b4-8f15-b9bc757e74ec%2FDuplicate_as_Template.png?table=block&id=76feeca7-abfa-4044-b577-976c2ff201e8&width=3070&userId=&cache=v2'
[12:44:47] [INFO] Exporting page 'https://www.notion.so/notion/Duplicate-public-pages-d8a461baeeb54d91b156ff5559192321' as 'notion/duplicate-public-pages.html'
[12:44:47] [CRITICAL] FileNotFoundError: [Errno 2] No such file or directory: 'dist\\jamesdeluk\\james-it-notes\\notion\\duplicate-public-pages.html'
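
The final FileNotFoundError is what Python raises when a file is opened for writing inside a folder that does not exist yet (here the notion subfolder). A sketch of an export step that creates missing parent folders first is shown below. The function name is hypothetical, and whether these Notion-owned pages should be crawled at all is a separate question.

```python
import tempfile
from pathlib import Path


def export_page(dist: Path, relative_name: str, html: str) -> Path:
    """Write a page below `dist`, creating any missing parent folders."""
    target = dist / relative_name
    # Create 'notion/' (or any other missing parent folder) before writing.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target


# Demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as tmp:
    written = export_page(
        Path(tmp), "notion/duplicate-public-pages.html", "<html></html>"
    )
    print(written.exists())
```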

Improved user experience

Hi @leoncvlt,

  1. Why don't you upload loconotion to PyPI, so that users can directly pip install loconotion? (You can use poetry publish, and also remove the requirements.txt.)
  2. Why does loconotion not use the chromedriver in PATH? I have chromedriver properly installed and in PATH.

If I run chromedriver --version it shows the correct output, but loconotion does not detect chromedriver by default; I need to pass the --chromedriver argument. (This can be solved by not passing any chromedriver path to Selenium; it will autodetect.)

  3. The end user experience can be something like this:
pip install loconotion
mkdir mysite && cd mysite
loconotion https://notion.so/path-to-page

Then running ls gives:

index.html
assets

the assets folder will contain all images/fonts/css/js
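
The PATH lookup from point 2 could be sketched with the standard library alone. The function name is hypothetical, and the explicit path stands in for whatever is passed via the --chromedriver argument.

```python
import shutil
from typing import Optional


def find_chromedriver(explicit_path: Optional[str] = None) -> Optional[str]:
    """Return the explicit path if given, else the first chromedriver on PATH."""
    if explicit_path:
        return explicit_path
    # shutil.which returns None when no executable of that name is on PATH.
    return shutil.which("chromedriver")
```

A None result would then be the point at which to fall back to an auto-installer or to report a clear error.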

[bug] Cached font files are not used

I can see that it downloads font files and caches them under hashed filenames, but it doesn't persist the changed stylesheet file.

It iterates over the CSS rules, finds each font-face rule, downloads the web font file, and changes the style object.

for rule in stylesheet.cssRules:
    if rule.type == cssutils.css.CSSRule.FONT_FACE_RULE:
        # if any are found, download the font file
        font_file = (
            rule.style["src"].split("url(/")[-1].split(") format")[0]
        )
        cached_font_file = self.cache_file(
            f"https://www.notion.so/{font_file}"
        )
        rule.style["src"] = f"url({str(cached_font_file)})"

But it doesn't persist the changed stylesheet object. As a result, the page requests non-existent files, because it still refers to the original filename, like /lyon-text-semibold-acb7f110189034ff6a1afa4b730be0ed.woff, rather than 25bf88e58579843c4aef766a6afe75998af40282.woff

Please take a look at this. I hope to see it fixed soon!
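
To illustrate what "persisting" would mean here, the sketch below rewrites the url(...) references in a stylesheet's text and writes the result back to disk. It deliberately uses a plain regular expression instead of cssutils to stay self-contained; the function names and the mapping argument are assumptions, not loconotion's code.

```python
import re
from pathlib import Path


def rewrite_font_urls(css_text: str, cached: dict) -> str:
    """Replace each url(...) target that is a key of `cached` with its cached name."""

    def swap(match: "re.Match") -> str:
        original = match.group(2)
        return f"url({cached.get(original, original)})"

    # Matches url(x), url('x') and url("x"); group 2 is the target itself.
    return re.sub(r"url\((['\"]?)([^)'\"]+)\1\)", swap, css_text)


def persist_stylesheet(path: Path, cached: dict) -> None:
    """Rewrite the stylesheet at `path` in place so the change survives."""
    text = path.read_text(encoding="utf-8")
    path.write_text(rewrite_font_urls(text, cached), encoding="utf-8")
```

The important part is the final write: changing the in-memory rule alone has no effect on the file the browser later loads.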

Thank you for making this great program.

Rendering issues on Chromium on Linux

Hi, after looking through a couple of the other issues here, it seems that rendering issues have been brought up before, so I expect that you might ask me to continue this thread under issue #1.

However, I thought I might provide a bit more detail about what I'm seeing right now, just to see if there were any more specific fixes.

Glitch 1:

[screenshot: glitch1]
As you can see here, the properties for this list view are distorted. That's an issue that I think was brought up previously, but I wanted to add my example here.

Glitch 2:

[screenshot: glitch2]
The titles of the columns on this table view are missing.

Glitch 3:

[screenshot: Screenshot from 2021-01-25 10-56-11]
The preview on this gallery view is not visible.

For reference, here is the site generated by loconotion and here is what it's supposed to look like.

Here is my config file.

## Loconotion Site Configuration File ##
# This Config is for my notion-powered personal site

# name of the folder that the site will be generated in
name = "personal-website"

# the notion.so page to begin parsing from. This page will become the index.html
# of the generated site, and loconotion will parse all sub-pages present on the page
page = "https://www.notion.so/sobelj/Joshua-Sobel-64e7f9095d7245c6987ec6687fcf4c39"

# optionally apply notion's dark mode, remove the line below to use the default light mode
theme = "dark"

Any help would be much appreciated.

Edit: I also just noticed that the database search features are also broken.

Docker

This is a great project, thank you! I'm using it for https://github.com/philfreo/sjc.vote.

I tried getting this to run inside docker but I had trouble with the chrome driver stuff. It'd be nice to have a more portable way to run this (perhaps even inside GitHub Actions) without having to install all the dependencies. Perhaps that would even allow some sort of integration that auto-builds upon a change in Notion.

Stripe code injection?

I just ran a speedtest and noticed that there is some stripe code element injected into the code which does not appear to be present on the original site. This is actually new and didn't happen as of earlier this week. I thought it was in my pipeline at first, but the demo page has the same code injected:

<iframe allowpaymentrequest="true" allowtransparency="true" aria-hidden="true" frameborder="0" name="__privateStripeMetricsController3360" scrolling="no" src="https://js.stripe.com/v3/m-outer-d6c2bdb836ab7d041671a72774049a01.html#url=https%3A%2F%2Fwww.notion.so%2FLoconotion-Example-Page-03c403f4fdc94cc1b315b9469a8950ef&amp;title=Notion%20%E2%80%93%20The%20all-in-one%20workspace%20for%20your%20notes%2C%20tasks%2C%20wikis%2C%20and%20databases.&amp;referrer=&amp;muid=NA&amp;sid=NA&amp;version=6&amp;preview=false" style="border: none !important; margin: 0px !important; padding: 0px !important; width: 1px !important; min-width: 100% !important; overflow: hidden !important; display: block !important; visibility: hidden !important; position: fixed !important; height: 1px !important; pointer-events: none !important; user-select: none !important;" tabindex="-1"></iframe>

Any idea where this is coming from? I checked the html page after running loconotion, so it's not the frontend. It's not on the original notion page and I cannot find it in the code. Given that this is happening on both your sample and my page it doesn't appear to be an injection along the transmission path and most likely Chromedriver. However, I haven't changed the chromedriver version, and it seems unlikely that the chromium hosted version has changed.

I'm not sure the code actually does anything, but it's a bit unnerving to have a payment processor's code injected.

Any ideas?
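
Until the source is found, the iframe could be stripped from the exported HTML as a post-processing step. This is only a sketch under assumptions: strip_stripe_iframes is a hypothetical helper, not part of loconotion, and it relies on the iframe having a src that starts with https://js.stripe.com/ as in the snippet above.

```python
import re

# Matches an <iframe> whose src points at js.stripe.com, plus its closing tag.
# Assumption: the iframe has no nested content, as in the reported snippet.
STRIPE_IFRAME = re.compile(
    r'<iframe\b[^>]*\bsrc="https://js\.stripe\.com/[^"]*"[^>]*>\s*</iframe>',
    re.IGNORECASE,
)


def strip_stripe_iframes(html):
    """Return html with any Stripe iframes removed; other iframes are kept."""
    return STRIPE_IFRAME.sub("", html)
```

It would be applied to each exported .html file before deployment; embeds from other hosts are left untouched.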

Logging Errors

Amazing piece of code! So incredibly helpful.

I just parsed my Notebook and it worked beautifully, although (at least) one page didn't work. I ran it again and saw a couple of other [ERROR] and [CRITICAL] messages, such as pages with the same name being overwritten.

Yet when I checked webdrive.log, there is no mention of these issues. Only the successful actions!

Is there a way to see logs of errors etc?
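
For reference, Python's standard logging module can send only warnings and errors to a separate file. The sketch below is generic: add_error_log and the file name are hypothetical, and which logger loconotion actually uses is an assumption that would need checking in its source.

```python
import logging


def add_error_log(logger, path):
    """Attach a file handler that records WARNING and above to path."""
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setLevel(logging.WARNING)  # skips INFO and DEBUG records
    handler.setFormatter(
        logging.Formatter("[%(asctime)s] %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return handler
```

Calling add_error_log(logging.getLogger("some-logger-name"), "errors.log") once at startup would leave a file containing only the WARNING, ERROR and CRITICAL lines.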

Links to missing files

There are some links in <head> to files in an images folder that does not exist in dist:

<html class="notion-html">
<head lang="en">
...
    <link href="/image/https%3A%2F%2Fs3-us-west-2.amazonaws.com%2Fsecure.notion-static.com%2F17843fb5-7467-41c7-b96b-db638f903c3a%2FEugene_Zvedenyuk_1.jpg?table=block&amp;id=37b946f0-1829-42f4-965f-58e2076bc57b&amp;spaceId=4fcefc68-f9c0-4427-b186-180d28820548&amp;width=200&amp;userId=&amp;cache=v2" rel="shortcut icon" type="image/x-icon"/>
    <link href="/images/logo-ios.png" rel="apple-touch-icon"/>
...

And while we're here, it would be great to place all assets (images, scripts, stylesheets) into a separate folder. It would be even better to place each internal page into its own folder to get rid of .html in the URL.

Something like this:

dist/project1/
  index.html
  assets/
    ...
  page1/
    index.html
  page2/
    index.html

make "Name column on tables become a link" optional

It's a great feature, but sometimes we just need tables as a way to format information and don't need a conversion of the name column to the link.

Note that if the table has many rows (say 50), this conversion takes a LOT of time.

Crash when starting to parse other Notion links

I'm trying to load

https://www.notion.so/Getting-Started-42e612b783b24c0699ab3da248a69f1a

and got:

[03:55:29] INFO Parsing page 'https://www.notion.so/notion/Keyboard-Shortcuts-66e28cec810548c3a4061513126766b0'
[03:55:41] INFO Exporting page 'https://www.notion.so/notion/Keyboard-Shortcuts-66e28cec810548c3a4061513126766b0' as 'notion/keyboard-shortcuts.html'
[03:55:41] CRITICAL FileNotFoundError: [Errno 2] No such file or directory: 'dist/getting-started/notion/keyboard-shortcuts.html'
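
Judging only from the log above, one possible cause is that the export path contains a subfolder (notion/) that does not exist yet under dist/getting-started/. A minimal sketch of creating parent folders before writing follows; write_page and its parameters are hypothetical and not loconotion's actual code.

```python
from pathlib import Path


def write_page(dist_dir, relative_path, html):
    """Write html to dist_dir/relative_path, creating missing parent folders."""
    target = Path(dist_dir) / relative_path
    # parents=True builds every missing folder along the way;
    # exist_ok=True makes the call safe when the folder is already there.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target
```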

Pretty URLs instead of Ugly URLs

Currently the script generates pages with the slug as the filename.
Ex: blog.html
This creates an ugly URL like /blog.html, with the .html extension.
It would be great to have an option to generate each page as a folder with an index.html file in it.
Ex: /blog/index.html
As a result, the URL path would look like /blog, without the .html extension.

You can see an example on the Hugo site: https://gohugo.io/content-management/urls/#pretty-urls
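
The mapping requested here can be sketched as a small path transformation. pretty_path is a hypothetical helper name; how it would be wired into loconotion's export step (and how internal links would be rewritten to match) is left open.

```python
from pathlib import PurePosixPath


def pretty_path(filename):
    """Map 'blog.html' to 'blog/index.html'; leave other files unchanged."""
    path = PurePosixPath(filename)
    # Keep an existing index.html and any non-HTML asset where it is.
    if path.name == "index.html" or path.suffix != ".html":
        return str(path)
    # Drop the .html suffix and use the slug as a folder name instead.
    return str(path.with_suffix("") / "index.html")
```

With this layout a static host serves /blog/index.html when /blog is requested, so the extension disappears from the URL.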

Installation error command and driver

After the installation the driver was not found, so I downloaded it and placed it inside the loconotion folder,
but I can't run the command to add it:
nicolamac loconotion % python loconotion --chromedriver ./chromedriver
Traceback (most recent call last):
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 165, in _run_module_as_main
    mod_name, loader, code, fname = _get_main_module_details(_Error)
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 133, in _get_main_module_details
    return _get_module_details(main_name)
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/runpy.py", line 119, in _get_module_details
    code = loader.get_code(mod_name)
  File "/usr/local/Cellar/python@2/2.7.16/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pkgutil.py", line 281, in get_code
    self.code = compile(source, self.filename, 'exec')
  File "/Users/nicolatoledo/Personale/loconotion/loconotion/__main__.py", line 14
    log.critical(f"ModuleNotFoundError: {error}. have your installed the requirements?")
    ^
SyntaxError: invalid syntax
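
The traceback paths show the command running under Python 2.7, and the line it fails on is an f-string, which needs Python 3.6 or newer. A small sketch of checking the interpreter version; supports_fstrings is a hypothetical helper written without any Python 3 only syntax so that it also runs on old interpreters.

```python
import sys


def supports_fstrings(version_info=sys.version_info):
    """Return True if the given interpreter version can parse f-strings."""
    # f-strings were added in Python 3.6.
    return tuple(version_info[:2]) >= (3, 6)


if __name__ == "__main__":
    if not supports_fstrings():
        sys.exit("This interpreter is too old for f-strings; try python3.")
```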

Comments

Is it possible to add Disqus comments below some pages?

Rendering bug with inline references

Using inline references causes an HTML rendering bug due to an oversized svg icon.
Expected result (as seen on Notion.so):
image

Actual result:
image

Steps to reproduce:
Download the following Notion page with default settings: https://www.notion.so/cc0textures/Loconotion-Rendering-bug-demonstration-1f1cda00e3ff4d5b9e70c8232210b7d7

python loconotion https://www.notion.so/cc0textures/Loconotion-Rendering-bug-demonstration-1f1cda00e3ff4d5b9e70c8232210b7d7

I am using Loconotion master, with this commit being the latest: f580af4

Windows 10 20H2
Chrome 88.0

Any tips for temporary workarounds would be greatly appreciated (for example a way to just hide the small icon which is causing the issue).

Parser crashes with unexpected behavior of scrollIntoView() on table block

Greetings from Korea, the first non-English language environment that Notion officially supports :D

Issue

notionparser.py#L431 seems to expect that javascript scrollIntoView() will make table_row_hover_target visible.
But instead...

image

table_row_hover_target is still invisible, behind the table header.

Environment

  • MacOS 10.15.5 Catalina
  • (Session info: chrome=84.0.4147.105)
  • loconotion is on current master(73c21cc)

Workaround (if this issue affects everyone)

I managed to resolve this issue, quick and (so) dirty, by injecting some scroll-margin-top on div[data-block-id='{table_row_block_id}'] and then calling scrollIntoView() on it.

                # for each row, hover the mouse over it to make the open button appear,
                # then grab its href and wrap the table row's name into a link
                table_row_block_id = table_row["data-block-id"]
                table_row_scroll_target = self.driver.find_element_by_css_selector(
                    f"div[data-block-id='{table_row_block_id}']"
                )
                table_row_hover_target = self.driver.find_element_by_css_selector(
                    f"div[data-block-id='{table_row_block_id}'] > div > div"
                )
                # need to scroll the row into view or else
                # the open button won't be visible to selenium
                self.driver.execute_script(
                    "arguments[0].style.scrollMarginTop = '100px'; arguments[0].scrollIntoView();", table_row_scroll_target
                )

I think it would be possible to check the heights of div.notion-table-view-header-cell and div.notion-topbar to improve this, but since I don't have any experience with Selenium it would take too much time for me to do it alone.

Twitter Embeds

So I had some tweet embeds in some of my Notion pages that aren't being rendered properly.

You can use the Twitter embed API to get the HTML for the tweet and replace the tag that's currently on the page.

Currently, I'm using Loconotion to download all the pages and then doing some clean-up.

This is how I'm adding in the tweet embeds:

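import requests
from bs4 import BeautifulSoup

# soup is assumed to be the BeautifulSoup document of a page exported by Loconotion
tweets = soup.find_all("twitter-widget")

for tweet in tweets:
    # Build the tweet URL from the widget's id; "name" is a placeholder account name
    tweet_url = "https://twitter.com/name/status/" + tweet.attrs['data-tweet-id']
    twitter_api = "https://publish.twitter.com/oembed?url=" + tweet_url
    # The oEmbed response is JSON; its 'html' field holds the embed markup
    tweet_embed_html = requests.get(twitter_api).json()['html'].replace("\n", "")
    tweet_embed_html = "<div style='width: 300px'>" + tweet_embed_html + "</div>"
    tweet.append(BeautifulSoup(tweet_embed_html, 'html.parser'))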

Can I add this to the repo?

Crash on tables export

Upon trying to export the following page

https://www.notion.so/ceecnyc/Group-Members-c2858b83c7194a49ab527dfa4191b749

I'm getting this error

[22:35:46] ERROR Timeout waiting for the 'open' button to appear for row in table with block id 6bc9d731-73e1-4ef3-bb26-0918b4b8647b
Traceback (most recent call last):
  File "/Users/dan/.pyenv/versions/3.8.2/lib/python3.8/runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/dan/.pyenv/versions/3.8.2/lib/python3.8/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "loconotion/__main__.py", line 138, in <module>
    main()
  File "loconotion/__main__.py", line 117, in main
    Parser(config=config, args=vars(args))
  File "loconotion/notionparser.py", line 85, in __init__
    self.run(url)
  File "loconotion/notionparser.py", line 638, in run
    tot_processed_pages = self.parse_page(url)
  File "loconotion/notionparser.py", line 629, in parse_page
    self.parse_page(
  File "loconotion/notionparser.py", line 486, in parse_page
    table_row_href = self.driver.find_element_by_css_selector(
  File "/Users/dan/.pyenv/versions/3.8.2/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 598, in find_element_by_css_selector
    return self.find_element(by=By.CSS_SELECTOR, value=css_selector)
  File "/Users/dan/.pyenv/versions/3.8.2/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 976, in find_element
    return self.execute(Command.FIND_ELEMENT, {
  File "/Users/dan/.pyenv/versions/3.8.2/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/Users/dan/.pyenv/versions/3.8.2/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"div[data-block-id='6bc9d731-73e1-4ef3-bb26-0918b4b8647b'] > div > a"}
  (Session info: headless chrome=85.0.4183.102)

I thought it was because the table entry pages were empty, but even when I put text in the table pages it still throws this error.

(in any case thank you for a wonderful project!)

atom or rss feed support

It would be nice to have atom or rss support.

I think it would be better to have feeds for every database that has a single datetime field. If there are multiple datetime fields, the default behavior would be to pick the first one, with an option to configure it in the toml.
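
The field selection rule described above could look roughly like this. It is a sketch only: the row shape (a dict with "title", "url" and one or more datetime values), the function names and the date_field override are all assumptions for illustration, not loconotion features.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def first_datetime_field(row, preferred=None):
    """Return the name of the datetime field to use for a row, or None."""
    if preferred is not None:
        # A configured field wins, but only if the row really has a datetime there.
        return preferred if isinstance(row.get(preferred), datetime) else None
    for name, value in row.items():
        if isinstance(value, datetime):
            return name  # default: the first datetime field found
    return None


def build_atom_feed(title, site_url, rows, date_field=None):
    """Build an Atom feed string from rows, newest entry first."""
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element("{%s}feed" % ATOM_NS)
    ET.SubElement(feed, "{%s}title" % ATOM_NS).text = title
    ET.SubElement(feed, "{%s}id" % ATOM_NS).text = site_url

    dated = []
    for row in rows:
        field = first_datetime_field(row, date_field)
        if field is not None:  # rows without a usable date are skipped
            dated.append((row[field], row))
    dated.sort(key=lambda pair: pair[0], reverse=True)

    newest = dated[0][0] if dated else datetime.now(timezone.utc)
    ET.SubElement(feed, "{%s}updated" % ATOM_NS).text = newest.isoformat()

    for when, row in dated:
        entry = ET.SubElement(feed, "{%s}entry" % ATOM_NS)
        ET.SubElement(entry, "{%s}title" % ATOM_NS).text = row["title"]
        ET.SubElement(entry, "{%s}id" % ATOM_NS).text = row["url"]
        ET.SubElement(entry, "{%s}link" % ATOM_NS, href=row["url"])
        ET.SubElement(entry, "{%s}updated" % ATOM_NS).text = when.isoformat()
    return ET.tostring(feed, encoding="unicode")
```

Passing date_field="edited" (or whatever the toml option would name) switches every row to that column instead of the first datetime found.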

Heuristically extract metadata for pages if none exists in the config file.

Problem

Loconotion doesn't set the meta tags for the page if metadata for it is not supplied in the config.

Proposal

Extract metadata from the page and set the correct meta tags for the page if none is supplied in the config.
If metadata cannot be extracted and there is none supplied in the config, loconotion can use the global metadata for adding meta tags.

This could be enabled by an additional option in the config.

I can implement this and open a PR if you like, given some direction.
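
As a rough illustration of the proposal, the fallback order (page content first, then global metadata) might be sketched like this. The heuristic used here, the <title> tag plus the first non-empty <p>, is an assumption chosen for simplicity; exported Notion pages may need different selectors, and extract_metadata is a hypothetical name.

```python
from html.parser import HTMLParser


class _MetaExtractor(HTMLParser):
    """Collects the <title> text and the text of the first non-empty <p>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraph = ""
        self._in_title = False
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "p" and not self.paragraph:
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_p:
            self.paragraph += data


def extract_metadata(html, fallback=None, max_length=160):
    """Guess title and description from html, else use the fallback dict."""
    fallback = fallback or {}
    parser = _MetaExtractor()
    parser.feed(html)
    title = parser.title.strip() or fallback.get("title", "")
    description = " ".join(parser.paragraph.split())[:max_length]
    return {
        "title": title,
        "description": description or fallback.get("description", ""),
    }
```

Here fallback stands in for the global [site] metadata from the config, used only when nothing can be extracted from the page.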

Open collection items in a modal (popup)

I have collections of items coming from a database that I would like to present as-is. With the exported site, if a user clicks on a collection item, it goes to the item's dedicated page. This is functional, but not very user-friendly.

@leoncvlt What would be an easy way to handle this issue? I am eager to help with a PR, if you could give me a couple of hints.

Setting a property of type "URL" to be visible in a Gallery View causes layout issues

Create a Gallery view where some cards have a property of type "URL". Set this property to be visible on the individual cards. Just like this example page I made: https://www.notion.so/cc0textures/79385cd91fd74099bfbd06b283ff8901?v=d78006c6fe794ac39329355654ae457b

Expected result:
image
All cards are displayed properly. Page 1 and 5 have a visible and clickable URL.

Result after downloading with Loconotion:
image
The cards for pages 1 and 5 are severely distorted and overlap with the cards below them.

The problem does not occur after downloading with Loconotion without setting the URL to be visible:
image

Crash when parsing toggle blocks

Hello, awesome project you've got here :)

We are having issues with what I assume are only specific toggle blocks. I already tried increasing the timeout, however it keeps crashing at the exact same page.

After a few timeout messages:

[16:51:44] INFO Parsing page 'https://www.notion.so/What-s-New-157765353f2c4705bd45474e5ba8b46c'
[16:52:31] WARNING Timeout waiting for toggle block to open. Likely it's already open, but doesn't hurt to check.
[16:52:41] WARNING Timeout waiting for toggle block to open. Likely it's already open, but doesn't hurt to check.
[16:53:00] WARNING Timeout waiting for toggle block to open. Likely it's already open, but doesn't hurt to check.
[16:53:58] WARNING Timeout waiting for toggle block to open. Likely it's already open, but doesn't hurt to check.
[16:54:23] WARNING Timeout waiting for toggle block to open. Likely it's already open, but doesn't hurt to check.
[16:55:41] WARNING Timeout waiting for toggle block to open. Likely it's already open, but doesn't hurt to check.

I finally get:

Traceback (most recent call last):
File "/home/ubuntu/.pyenv/versions/3.9.2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/ubuntu/.pyenv/versions/3.9.2/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/ubuntu/loconotion/loconotion/__main__.py", line 144, in <module>
main()
File "/home/ubuntu/loconotion/loconotion/__main__.py", line 123, in main
Parser(config=config, args=vars(args))
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 85, in __init__
self.run(url)
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 658, in run
tot_processed_pages = self.parse_page(url)
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 645, in parse_page
self.parse_page(
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 645, in parse_page
self.parse_page(
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 645, in parse_page
self.parse_page(
[Previous line repeated 7 more times]
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 336, in parse_page
open_toggle_blocks(self.args["timeout"])
File "/home/ubuntu/loconotion/loconotion/notionparser.py", line 301, in open_toggle_blocks
toggle_button = toggle_block.find_element_by_css_selector(
File "/home/ubuntu/.cache/pypoetry/virtualenvs/loconotion-IMDcpsi3-py3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webelement.py", line 430, in find_element_by_css_selector
return self.find_element(by=By.CSS_SELECTOR, value=css_selector)
File "/home/ubuntu/.cache/pypoetry/virtualenvs/loconotion-IMDcpsi3-py3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webelement.py", line 658, in find_element
return self._execute(Command.FIND_CHILD_ELEMENT,
File "/home/ubuntu/.cache/pypoetry/virtualenvs/loconotion-IMDcpsi3-py3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
return self._parent.execute(command, params)
File "/home/ubuntu/.cache/pypoetry/virtualenvs/loconotion-IMDcpsi3-py3.9/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
self.error_handler.check_response(response)
File "/home/ubuntu/.cache/pypoetry/virtualenvs/loconotion-IMDcpsi3-py3.9/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout: Timed out receiving message from renderer: 300.000
(Session info: headless chrome=89.0.4389.90)

Never loads YouTube video

If you have a long page with a YouTube video somewhere near the end, the video never loads and a timeout error is thrown:

CRITICAL Timeout waiting for page content to load, or no content found. Are you sure the page is set to public?

A workaround that I've found is to use the --non-headless flag and, when the browser window opens, manually scroll to every video and wait until the YouTube player lazy-loads (you don't have to click anywhere, just scroll).

Search & Notion links in the Navigation menu

It seems they just added these items (Search and Notion links) to the navigation menu (top row).
These items render using the desktop layout and therefore cause issues on mobile.

Links do not work

Hi!
I'm trying to create a landing page for bashair.ru,
but links on the page do not work.
Example:

<div class="notranslate" data-root="true" placeholder="List" spellcheck="false" style="max-width: 100%; width: 100%; white-space: pre-wrap; word-break: break-word; caret-color: transparent; padding: 3px 2px; text-align: left;" contenteditable="true">
<a class="notion-link-token notion-enable-hover" data-token-index="0" href="https://vk.com/vozduh_str" rel="noopener noreferrer" style="cursor:pointer;color:inherit;word-wrap:break-word;text-decoration:inherit" target="_blank">
<span style="border-bottom:0.05em solid;border-color:rgba(55,53,47,0.4);opacity:0.7">https://vk.com/vozduh_str</span></a></div></a></div>

If I delete the 'contenteditable' attribute from the div, then it works!
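
Removing the attribute can be scripted over the exported HTML. This is a minimal sketch, not loconotion's behavior: strip_contenteditable is a hypothetical helper, and because it uses a plain regular expression it would also match the word contenteditable if it appeared in visible text after whitespace.

```python
import re

# Matches ' contenteditable', optionally followed by ="..." or ='...'.
CONTENTEDITABLE = re.compile(
    r"""\s+contenteditable(?:\s*=\s*(?:"[^"]*"|'[^']*'))?""",
    re.IGNORECASE,
)


def strip_contenteditable(html):
    """Return html with every contenteditable attribute removed."""
    return CONTENTEDITABLE.sub("", html)
```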
