Comments (6)
Hi! Really appreciate this package!
I was also using my Kindle in Portuguese and not getting the location and date. I changed my device to English and it is all good now.
However, non-English letters are missing. I think it is the same problem as @asyr01's.
"Transformação" -> "Transformao"
"Mudança" -> "Mudana"
"Está" -> "est"
"Você" -> "voc"
I saw that the last commit was regarding this issue:
raw_clippings_text = raw_clippings_text.encode("ascii", errors="xmlcharrefreplace").decode()
If it only used utf-8-sig, I think it would read the non-English letters (I tried manually running the function "read_raw_clippings" on my "My Clippings.txt"), but I don't know what would happen in other parts of the code.
Thank you!
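To make the difference concrete, here is a minimal standard-library sketch. The helper name read_clippings_utf8 and the temporary sample file are assumptions for the demo and not part of kindle2notion. The errors="ignore" line is shown only for comparison, as one way accented letters can vanish; it is not a claim about what the package does internally.

```python
import os
import tempfile


def read_clippings_utf8(path: str) -> str:
    """Hypothetical helper: read a file as UTF-8, dropping a leading BOM if present."""
    with open(path, "r", encoding="utf-8-sig") as handle:
        return handle.read()


# Write a small sample file that starts with a BOM.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8-sig", suffix=".txt", delete=False
) as sample:
    sample.write("Transformação")
    sample_path = sample.name

text = read_clippings_utf8(sample_path)
os.remove(sample_path)

# Re-encoding to ASCII replaces each accented letter with a numeric character reference.
as_char_refs = text.encode("ascii", errors="xmlcharrefreplace").decode()
# Dropping non-ASCII characters entirely reproduces the symptom reported above.
stripped = text.encode("ascii", errors="ignore").decode()

print(text)          # Transformação
print(as_char_refs)  # Transforma&#231;&#227;o
print(stripped)      # Transformao
```

If the text is kept as a normal Python string after the utf-8-sig read, the accented letters survive; whether later steps handle them correctly would still need to be checked.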
from kindle2notion.
Placing #46 here for reference. Thanks for contributing again @asyr01!
Regarding your second question, the current package is already capable of doing that. It can be optimized with a JSON structure to track clippings instead of the current method.
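As a rough idea of what such JSON tracking could look like, here is a minimal sketch. All names (clipping_id, load_seen, save_seen, filter_new) and the tracking file are hypothetical and do not describe how the package currently works; it simply hashes each clipping and remembers the hashes between runs.

```python
import hashlib
import json
from pathlib import Path
from typing import List, Set, Tuple


def clipping_id(book_title: str, clipping_text: str) -> str:
    """Build a stable identifier from the book title and the clipping text."""
    payload = f"{book_title}\n{clipping_text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def load_seen(path: Path) -> Set[str]:
    """Load the identifiers recorded by earlier runs (empty set on the first run)."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_seen(path: Path, seen: Set[str]) -> None:
    """Persist the identifiers as a sorted JSON list."""
    path.write_text(json.dumps(sorted(seen)), encoding="utf-8")


def filter_new(clippings: List[Tuple[str, str]], seen: Set[str]) -> List[Tuple[str, str]]:
    """Return only clippings not seen before, and record them in `seen`."""
    new_clippings = []
    for title, text in clippings:
        identifier = clipping_id(title, text)
        if identifier not in seen:
            seen.add(identifier)
            new_clippings.append((title, text))
    return new_clippings
```

On each run one would load the set, pass the parsed clippings through filter_new, upload only the returned ones, and then save the set again.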
from kindle2notion.
For reference, this is the snippet of code that scrapes out the location, page and date information from the text file.
See lines 49-53 in /kindle2notion/parsing.py
The function that addresses this is pasted below:
def _parse_page_location_and_date(raw_clipping_list: List) -> Tuple[str, str, str]:
    second_line = raw_clipping_list[1]
    second_line_as_list = second_line.strip().split(' | ')
    page = location = date = ''

    for element in second_line_as_list:
        element = element.lower()
        if 'page' in element:
            page = element[element.find('page'):].replace('page', '').strip()
        if 'location' in element:
            location = element[element.find('location'):].replace('location', '').strip()
        if 'added on' in element:
            date = parse(element[element.find('added on'):].replace('added on', '').strip())
            date = date.strftime('%A, %d %B %Y %I:%M:%S %p')

    return page, location, date
One would need to replace 'page', 'location' and 'added on' in lines 49, 51, 53 with their language-equivalent terms as used in the respective My Clippings.txt file to get the relevant result.
In your case, from my limited understanding, it would be 'destaque na página', 'destaque ou posição', and 'Adicionado:'.
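A minimal sketch of that replacement, with the keywords moved into a lookup table. The English terms come from the function above; the Portuguese terms are the unverified guesses from this comment, and the names KEYWORDS, _text_after and parse_second_line are hypothetical. The date is returned as raw text rather than parsed, because month and weekday names in other languages may need separate handling.

```python
from typing import Dict, Tuple

# Keywords per language. "pt" uses the unverified terms suggested in this thread.
KEYWORDS: Dict[str, Dict[str, str]] = {
    "en": {"page": "page", "location": "location", "date": "added on"},
    "pt": {
        "page": "destaque na página",
        "location": "destaque ou posição",
        "date": "adicionado:",
    },
}


def _text_after(element: str, keyword: str) -> str:
    """Return the lowercased text that follows `keyword`, or '' if it is absent."""
    lowered = element.lower()
    index = lowered.find(keyword)
    if index == -1:
        return ""
    return lowered[index + len(keyword):].strip()


def parse_second_line(second_line: str, language: str = "en") -> Tuple[str, str, str]:
    """Extract page, location and raw date text from a clipping's metadata line."""
    keywords = KEYWORDS[language]
    page = location = date = ""
    for element in second_line.strip().split(" | "):
        page = _text_after(element, keywords["page"]) or page
        location = _text_after(element, keywords["location"]) or location
        date = _text_after(element, keywords["date"]) or date
    return page, location, date


# Example with an English metadata line:
print(parse_second_line(
    "- Your Highlight on page 12 | Location 174-175 | Added on Monday, 3 May 2021 21:15:04"
))
# ('12', '174-175', 'monday, 3 may 2021 21:15:04')
```

Adding another language would then only mean adding one more entry to the table, once the real wording in that language's My Clippings.txt has been confirmed.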
Leaving this issue open because I'm unsure of how to incorporate this feature within the structure of the package. I'm open to hearing input from the GH community on this one. A working solution may be to identify the language when scraping the first clipping and adapt the relevant keywords accordingly. I can change the language on my Kindle and make some test clippings so that they get saved in that language in the My Clippings file, and code from there.
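The language-detection idea could be sketched roughly as follows. It assumes the second line of the file is the metadata line of the first clipping and that each language has a distinctive "added on" marker; the marker table and the name detect_language are hypothetical, and the Portuguese marker is the unverified one mentioned above.

```python
from typing import Dict

# Hypothetical markers that identify the device language from the metadata line.
DATE_MARKERS: Dict[str, str] = {
    "en": "added on",
    "pt": "adicionado:",
}


def detect_language(raw_clippings_text: str, default: str = "en") -> str:
    """Guess the clippings language from the first clipping's metadata line."""
    lines = raw_clippings_text.splitlines()
    if len(lines) < 2:
        return default
    metadata_line = lines[1].lower()
    for language, marker in DATE_MARKERS.items():
        if marker in metadata_line:
            return language
    return default
```

A file that mixes clippings made under different device languages would need this check per clipping rather than once per file.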
from kindle2notion.
Really appreciate the hard work you put in.
There is no problem with English. However, when it comes to my Turkish books, words that include special Turkish letters are unfortunately missing in Notion.
For example, non-English letters such as "i, ç, ü, ö" are missing.
Maybe we could find some way to handle it.
Also, when we run the script a second time and the clippings are all the same, could it skip the existing ones and only append the new ones? Is that possible?
Thanks, Have a good one.
from kindle2notion.
Hi! Really appreciate this package!
I was also using my Kindle in Portuguese and not getting the location and date. I changed my device to English and it is all good now.
However, non-English letters are missing. I think it is the same problem as @asyr01's.
"Transformação" -> "Transformao"
"Mudança" -> "Mudana"
"Está" -> "est"
"Você" -> "voc"
I saw that the last commit was regarding this issue:
raw_clippings_text = raw_clippings_text.encode("ascii", errors="xmlcharrefreplace").decode()
If it only used utf-8-sig, I think it would read the non-English letters (I tried manually running the function "read_raw_clippings" on my "My Clippings.txt"), but I don't know what would happen in other parts of the code.
raw_clippings_text = open(clippings_file_path, "r", encoding="utf-8-sig").read()
Thank you!
Thanks for the tip @mefonseca! Implemented your request in the latest release.
@asyr01 please update the package and try running it on your system. It should account for those letters now.
@lfschafaschek Will implement custom Portuguese support soon!
Thank you all for your patience and goodwill. Hope this fix addresses your issues here.
from kindle2notion.
Hi, I'm running the latest version and I have the same issue as above but with the Czech characters like these: ěščřžňů. Can the Czech language be also supported? Thank you!
from kindle2notion.