Comments (7)
Hi!
Thanks for the good questions! I guess the ultimate goal is to have a fast, well-tested, and stable Python org-mode parser.
Another thing I'd like it to keep is the BSD license the original author used.
That means working only from documentation and reverse engineering for the parsing, regexes, etc., but I think ultimately it's better for the format and the Org ecosystem not to be restricted by a copyleft license.
I use Org-mode as my primary means of organising information and logging, and I don't see a good alternative to the format out there, so I'm planning to support the parser for the foreseeable future.
That said, for my purposes the parser more or less satisfies me now, since I mainly use outlines, timestamps and tags. However, there are a couple of projects where I'm extracting links and tables as well (in a somewhat ad hoc way), and I'd like to integrate that too (#8).
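To make the "ad hoc" link extraction concrete, here is a minimal sketch of what it might look like on a plain body string. The regex and the `extract_links` name are assumptions for illustration and are not part of orgparse; the pattern only covers bracketed links and ignores plain URLs and escaped brackets.

```python
import re

# Matches [[target][description]] and [[target]].
# Illustrative assumption: no escaped brackets, no bare URLs.
ORG_LINK_RE = re.compile(r"\[\[([^\[\]]+)\](?:\[([^\[\]]+)\])?\]")


def extract_links(text):
    """Return (target, description) pairs; description is None if absent."""
    return [(m.group(1), m.group(2)) for m in ORG_LINK_RE.finditer(text)]


sample_body = (
    "See [[https://orgmode.org][the Org site]] and [[file:notes.org]] "
    "for details."
)
links = extract_links(sample_body)
print(links)
```

A function like this could be applied to whatever string a node's body accessor returns, which is roughly why integrating it into the parser itself would be more convenient.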
My personal blogging setup is a bit different: I use Emacs in batch mode to render my org-mode files and then do a bit of post-processing on top of it (mainly to fix things `org-export` does that I find stupid).
However, I also want to do some pre-processing (e.g. to support some dynamic things/content filtering), and for that I'm definitely not going to use Elisp :) So I'm motivated to support proper structure parsing too.
I could spare some time to implement other syntax that I'm not using as well. I guess it would be interesting to hear what the highest priority is for people.
If you could give a sample org-mode file with the bits of syntax that matter most to you (and perhaps the Python interface you'd expect for them), that'd help me prioritize :)
In terms of implementing it, I'd like to keep the interface returning strings wherever possible (e.g. `.heading`/`.body`) for the sake of simplicity, so perhaps a separate method returning 'rich' content would serve people's needs for more structure well.
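One possible shape for that idea, as a rough sketch only: the string attributes stay as they are, and a separate method returns structured pieces. Every name here (`Node`, `body_rich`, `Text`, `Link`) is hypothetical and does not describe orgparse's actual API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical link pattern: [[target][description]] or [[target]].
LINK_RE = re.compile(r"\[\[([^\[\]]+)\](?:\[([^\[\]]+)\])?\]")


@dataclass
class Text:
    value: str


@dataclass
class Link:
    target: str
    description: Optional[str] = None


@dataclass
class Node:
    heading: str
    body: str  # stays a plain string, as in the simple interface

    def body_rich(self):
        """Split the body into Text and Link pieces, in document order."""
        parts = []
        pos = 0
        for m in LINK_RE.finditer(self.body):
            if m.start() > pos:
                parts.append(Text(self.body[pos:m.start()]))
            parts.append(Link(m.group(1), m.group(2)))
            pos = m.end()
        if pos < len(self.body):
            parts.append(Text(self.body[pos:]))
        return parts


node = Node("Reading list", "Start with [[https://orgmode.org][Org]] today.")
rich = node.body_rich()
print(rich)
```

Callers who only need strings never touch `body_rich()`, so the simple interface is unaffected by how much structure the rich one grows later.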
I'll give it a think, experiment with how to implement it, and share. If you have any other libraries in mind for interface inspiration, please let me know!
P.S. Good point about the README section; I'll add it when I've shaped my thoughts a bit more clearly.
from orgparse.
https://github.com/novoid/Memacs/blob/master/memacs/lib/orgformat.py could be a promising connection point between our projects. Maybe we could develop a merged version from both projects as a standard library for formatting strings into Org mode elements.
My code may not be 100% clean, since it was developed "on demand" whenever something was needed. I once added some unit tests, which do not provide full coverage. However, the tests show some basic examples: https://github.com/novoid/Memacs/blob/master/memacs/lib/tests/orgformat_test.py
from orgparse.
Hi,
I'm not saying that my decisions are perfect. However, https://github.com/novoid/lazyblorg/wiki/Orgmode-Elements lists the Org mode elements my naïve parser supports (and which I am using for my own blog, http://karl-voit.at/), and https://github.com/novoid/lazyblorg/wiki/Data-Structures gives a short intro to some of the data structures.
My choice so far has been to return a list of Org mode elements, each of which is itself a list. You can see examples at https://github.com/novoid/lazyblorg/wiki/Data-Structures#representation-of-blog-data under the key "content".
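As a rough illustration of the "list of lists" idea (the element names and payloads below are made up for this sketch and are not necessarily what lazyblorg uses; the linked wiki page has the real representation):

```python
# Hypothetical element names and payloads, for illustration only.
content = [
    ["heading", {"level": 1, "title": "Example post"}],
    ["par", "First paragraph."],
    ["list", ["item one", "item two"]],
]


def plain_text(elements):
    """Flatten the element list back into plain Org-like text lines."""
    lines = []
    for kind, payload in elements:
        if kind == "heading":
            lines.append("*" * payload["level"] + " " + payload["title"])
        elif kind == "list":
            lines.extend("- " + item for item in payload)
        else:
            # Anything else is treated as a plain string payload.
            lines.append(payload)
    return lines


lines = plain_text(content)
print("\n".join(lines))
```

Each element carries its type as the first item, so a consumer can dispatch on it without needing a class hierarchy.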
from orgparse.
This issue is quite old, but I just want to add my thoughts too.
I am using org-roam v2, which uses IDs to link notes together. There are some "solutions" around for creating HTML content out of it, but none of them work well for me. I do not need a blog or website, just HTML files for offline use.
Also, org-mode itself has problems exporting content and taking the ID links into account. A side problem is that this is not reproducible, although it occurs often, and I am not the only person reporting such problems. I have invested too much time in finding a solution, understanding bugs, etc.
Now I have decided to write my own "org-to-html-thing". I do not want to learn Lisp. I come from Python, so this is my choice.
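To illustrate the ID-link part of that problem, here is a minimal standard-library sketch of one way to resolve `id:` links to HTML file names. It is not the prototype discussed in this thread; the function names, the one-note-per-file layout, and the `.org` to `.html` naming are all assumptions.

```python
import html
import re

# :ID: lines inside property drawers, and [[id:...][description]] links.
ID_PROP_RE = re.compile(r"^\s*:ID:\s*(\S+)\s*$", re.MULTILINE)
ID_LINK_RE = re.compile(r"\[\[id:([^\[\]]+)\](?:\[([^\[\]]+)\])?\]")


def build_id_index(files):
    """Map every :ID: value to the HTML file name of the note containing it."""
    index = {}
    for name, text in files.items():
        html_name = name.rsplit(".", 1)[0] + ".html"
        for note_id in ID_PROP_RE.findall(text):
            index[note_id] = html_name
    return index


def convert_id_links(text, index):
    """Replace id: links with <a> tags; unknown IDs keep only their label."""

    def replace(match):
        note_id, description = match.group(1), match.group(2)
        label = html.escape(description or note_id)
        target = index.get(note_id)
        if target is None:
            return label
        return '<a href="{}">{}</a>'.format(html.escape(target), label)

    return ID_LINK_RE.sub(replace, text)


files = {
    "a.org": ":PROPERTIES:\n:ID: 1111-aaaa\n:END:\nSee [[id:2222-bbbb][note B]].\n",
    "b.org": ":PROPERTIES:\n:ID: 2222-bbbb\n:END:\nNote B text.\n",
}
index = build_id_index(files)
converted = convert_id_links("See [[id:2222-bbbb][note B]].", index)
print(converted)
```

Building the index in a first pass over all files is what makes the result reproducible: every link is resolved against the same complete map, regardless of the order in which files are converted.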
I was really glad to see that something like `orgparse` exists. This saves half of the work for me.
Since yesterday I have had a working prototype. I will polish it up a bit and publish it on my main account on Codeberg in the next few days.
Btw: I am open to naming suggestions. I am not good with things like that. :D
from orgparse.
> Now I have decided to write my own "org-to-html-thing". I do not want to learn Lisp. I come from Python, so this is my choice.

Are you aware of https://pypi.org/project/pypandoc/?
I use it as a fallback for single Orgdown elements in lazyblorg and it's doing great so far.
from orgparse.
> I could spare some time to implement other syntax that I'm not using as well. I guess it would be interesting to hear what the highest priority is for people.
> If you could give a sample org-mode file with the bits of syntax that matter most to you (and perhaps the Python interface you'd expect for them), that'd help me prioritize :)

Meanwhile, there is https://gitlab.com/publicvoit/orgdown which defines an initial level of syntax elements I'd expect to be included in the supported elements of any Org-mode-syntax parser. Maybe this is a selection of elements that makes sense to more people.
from orgparse.
> Are you aware of https://pypi.org/project/pypandoc/?

Not until now. It seems it can handle some basic Org constructs and convert them to HTML, but I would need a pandoc binary in the background.
I will keep this in mind when I come to some more complex Org constructs. But currently I only deal with text, links, headings, lists and source blocks.
from orgparse.