laws-africa / bluebell
Bluebell is a generic Akoma Ntoso 3 parser.
Home Page: https://laws.africa/open-law-technology
License: GNU General Public License v3.0
perhaps, to mirror the BULLETS markup, something like
NUMBERS
1. sdfsdf
2. dfsfs
3. sdfsdf
where the space denotes the end of the num portion (although I realise that li elements don't have a num)
This is the code that handles escaping both block elements and inlines when unparsing:
<!-- first text nodes of these elems must be escaped if they have special chars -->
<xsl:template match="a:*[self::a:p or self::a:listIntroduction or self::a:listWrapUp]/text()[not(preceding-sibling::*)]">
<xsl:call-template name="escape">
<xsl:with-param name="text" select="." />
</xsl:call-template>
</xsl:template>
However, I don't think this is correct. Surely we should be escaping inlines for everything, not just p, listIntroduction and listWrapUp? What about headings and crossheadings?
Additionally, I don't think it's possible to have a text() tag that does have a preceding sibling! I think this is a hangover from slaw where we only escaped the first paragraphs of some things.
eg. {{FN 1}}
Nested attachments are now supported.
eg.:
1. The text of the paragraph
should be recognised as a paragraph.
Debate is a top-level element. It has a different structure to the normal hierarchy.
rather be explicit.
BLOCKLIST and ITEM
eg.
QUOTE{startQuote "|endQuote >>}
PARAGRAPH (a)
Some text
eg: ****Notice**s** becomes *<b>*Notice</b>s** when it should be <b><b>Notice</b>s</b>
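To make the expected nesting concrete, here is a minimal Python sketch of one way to pair up `**` markers. It is not bluebell's grammar: the function name `render_bold` and the rule it uses (a marker opens when nothing is open yet or when it directly follows another opener, otherwise it closes) are assumptions for illustration only.

```python
import re


def render_bold(text: str) -> str:
    """Turn ** markers into nested <b> tags (hypothetical helper)."""
    out = []
    depth = 0                 # number of <b> tags currently open
    prev_was_opener = False   # True if the previous token opened a <b>
    for token in re.split(r"(\*\*)", text):
        if token == "":
            continue
        if token == "**":
            # Assumed rule: open at depth 0 or directly after another opener.
            if depth == 0 or prev_was_opener:
                out.append("<b>")
                depth += 1
                prev_was_opener = True
            else:
                out.append("</b>")
                depth -= 1
                prev_was_opener = False
        else:
            out.append(token)
            prev_was_opener = False
    # Assumption: close anything left open so the output stays well-formed.
    out.append("</b>" * depth)
    return "".join(out)


print(render_bold("****Notice**s**"))
```

The sketch does not escape XML special characters in the text, and nested bold that opens after plain text (rather than directly after another opener) would still be ambiguous under this rule.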
For block elements on which the grammar allows arbitrary attributes, drop any attributes that would create invalid AKN.
eg. <heading/> and <subheading/> shouldn't result in errors on a round trip. They don't need to be preserved, necessarily, but we should be able to handle unparsing and re-parsing them.
eg. this doesn't work, but it should: SEC 2. -
Use Statement.required_children and override (part of) DocumentRoot.to_dict() to append mainBody with children = [empty_p()] if main_body has no text?
There are subtle differences, such as bluebell being more specific about which elements don't need an eId (particularly inline elements).
We should make the code match as much as possible, so we can maintain the two more easily.
Ensure that the grammar and the XSLT fully support attributes and PARA.foo style class names.
Some of the elements do (eg. tables and table cells), and some of the grammar does when parsing, but not when unparsing.
It should also only allow valid attribute names, and drop invalid ones.
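As a rough illustration of dropping invalid attribute names, the Python sketch below filters a dict of attributes against a simplified name pattern. The helper `drop_invalid_attrs` and the ASCII-only pattern (which also excludes colons) are assumptions, not bluebell's actual rules, and this only checks that a name is well-formed, not that the AKN schema allows it on a given element.

```python
import re

# Simplified, ASCII-only approximation of an XML attribute name (assumption):
# starts with a letter or underscore, then letters, digits, "_", "." or "-".
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def drop_invalid_attrs(attrs: dict) -> dict:
    """Return only the attributes whose names match NAME_RE (hypothetical helper)."""
    return {
        name: value
        for name, value in attrs.items()
        if NAME_RE.fullmatch(name)
    }


attrs = {"class": "foo", "1bad": "x", "data-x": "y", "has space": "z"}
print(drop_invalid_attrs(attrs))
```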
eg. <i>foo/</i> should be stable, not <i>foo</i>/ (may be caused by having another / in the string, like <i>the type/class of/</i>)
eg. http://kenyalaw.org/caselaw/cases/view/196612/ quotes a judgment which quotes an act.
Eg. parse:
1. some text at the start of a paragraph,
that also wraps
as
PARAGRAPH 1.
some text at the start of a paragraph,
that also wraps
What happens with indents? I suspect we should enforce consistent indents
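A minimal Python sketch of the rewrite described above, assuming a pre-processing step outside the grammar. The function name `to_paragraph_block`, the digits-plus-dot number pattern, and the two-space indent are all assumptions for illustration; continuation lines are simply stripped and re-indented consistently.

```python
import re

# Assumed pattern: a run of digits followed by ".", then the paragraph text.
NUM_RE = re.compile(r"^(\d+\.)\s+(.*)$")


def to_paragraph_block(lines: list, indent: str = "  ") -> list:
    """Rewrite '1. text' plus wrapped lines as a PARAGRAPH block (hypothetical)."""
    if not lines:
        return []
    match = NUM_RE.match(lines[0])
    if not match:
        # Not a numbered paragraph: leave the lines untouched.
        return list(lines)
    num, first_text = match.groups()
    # Normalise wrapped lines so every body line gets the same indent.
    body = [first_text] + [line.strip() for line in lines[1:]]
    return [f"PARAGRAPH {num}"] + [indent + line for line in body]


source = ["1. some text at the start of a paragraph,", "that also wraps"]
print("\n".join(to_paragraph_block(source)))
```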
eg.
SEC 1
SUBSEC (a)
some text
[[this remark
spans multiple
lines and
indents]]
eg. PARA for PARAGRAPH
Possibly may also be related to multiple footnotes in one paragraph disappearing when unparsing.
eg: <num>1-</num>
and SEC 2- -
and SEC 2- - heading
and SEC 2-
In particular, ref.
num if there is one
e.g.
PARA
Intro
PARA 1.
First para
PARA 1A.
Added in later
PARA
Unnumbered
PARA 2.
Second (actually third/fourth/fifth, depending on who's counting) para.
PARA 2.
Another para with the num 2.
PARA 2.3-4.5.
Another para with the num 2.
PARA 2.3-4.5.
Another para with the num 2.
PARA 2.3-4.5_1
Another para with the num 2.
should have the following as their eIds:
(Not finalised)
The following markup should work in theory but doesn't:
BULLETS
* Level 1
BULLETS
* Level 2
e.g. in https://edit.laws.africa/documents/4742/ instrument\instruments: the \ was deleted (I had to use \\)
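The behaviour reported here is consistent with the backslash being treated as an escape character on parse. The Python sketch below is an assumption about the mechanism, not bluebell's code: `unescape` consumes a backslash and keeps the character after it, and `escape_backslashes` is the doubling step that would have to run on unparse for the text to survive a round trip.

```python
import re


def escape_backslashes(text: str) -> str:
    """Double every backslash so it survives a later unescape (hypothetical)."""
    return text.replace("\\", "\\\\")


def unescape(text: str) -> str:
    """Treat a backslash as an escape: drop it and keep the next character."""
    return re.sub(r"\\(.)", r"\1", text)


original = r"instrument\instruments"
print(unescape(original))                      # backslash is lost
print(unescape(escape_backslashes(original)))  # backslash survives
```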
Complicated legislation like Income Tax Acts can have provisos to provisos, and provisos containing deeply nested elements. Delineating where precisely the proviso ends will help readers as well as future drafters understand the structure.
e.g. paragraph 1.1.1. should have an eId ending in para_1-1-1, not para_111. See Annex of https://edit.laws.africa/documents/4608/ for an example
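One possible way to get that result, sketched in Python. The helper names `num_to_eid_part` and `para_eid` are hypothetical, and the rule (strip leading and trailing punctuation, then turn the remaining dots into hyphens) is an assumption that only covers the dotted case described here, not every num format.

```python
import re


def num_to_eid_part(num: str) -> str:
    """Convert a num like '1.1.1.' into an eId-safe part (hypothetical rule)."""
    # Drop leading/trailing punctuation such as a final "." or brackets.
    cleaned = re.sub(r"^\W+|\W+$", "", num)
    # Keep the components distinct by turning the remaining dots into hyphens.
    return cleaned.replace(".", "-")


def para_eid(num: str) -> str:
    """Build the last segment of a paragraph eId from its num."""
    return "para_" + num_to_eid_part(num)


print(para_eid("1.1.1."))
```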