
metanorma / asciidoctor-rfc


AsciiRFC: an AsciiDoc/asciidoctor backend to produce RFC XML v3 (RFC 7991) and v2 (RFC 7749)

License: BSD 2-Clause "Simplified" License

Ruby 99.97% Shell 0.03%
ietf ietf-rfcs rfc internet-draft rfc-process asciidoc asciidoctor ribose-open

asciidoctor-rfc's Introduction

Metanorma: the standard for standards

Metanorma is dedicated to harmonizing standard documents produced by different standard-setting bodies in a manner that maintains correct semantics while allowing each standard publisher to define appropriate semantic extensions.

Simply put, it allows standards bodies or any other organization to create their own standard or specification document in a best practices manner.

Metanorma is composed of a number of specifications and software implementations. The Metanorma document model is based on the SecureDoc document model.

For more on Metanorma and who uses it, refer to https://www.metanorma.org

Installation on supported platforms

Installing individual components

The Metanorma workflow can be utilized via the metanorma-cli Ruby gem.

gem install metanorma-cli

Usage

Threaded execution

Metanorma has threaded execution, to generate output documents from the same Presentation XML input more quickly. Similar to relaton, the METANORMA_PARALLEL environment variable can be used to override the default number of parallel fetches used.

Origin of name

Meta- is a prefix of Greek origin ("μετα") meaning “with” or “after”. In English, it has ended up meaning "about (its own category)"; e.g. meta-discussion (a discussion about discussion). (For the roundabout way it ended up with that meaning, see https://en.wikipedia.org/wiki/Meta#Etymology.)

Norma is Latin for “rule” and “standard”; hence English norm, but also German Norm "standard".

The Metanorma project is for setting a standard for standard documents created by standards-setting organizations (which is a meta thing to do); hence this name.

Metanorma seeks to embrace all standards for standards documents, but not possess any: it can give rise to many "standard" standards, but not limit the extension of any of those standards.

The motto of the project is Aequitate verum, "Truth through equity". Dealing with all standards fairly (aequitate), we seek not an abstract virtue (veritas), but a practical reality on the ground (verum), that can be used by stakeholders of multiple standards.

asciidoctor-rfc's People

Contributors

abunashir, camobap, opoudjis, paolobrasolin, ronaldtse, strogonoff

asciidoctor-rfc's Issues

BCP14 rendering output in v3 vs v2

In the README it says v3: <bcp14>MUST NOT</bcp14>. Not supported in v2.

Should be v3: <bcp14>MUST NOT</bcp14>. Rendered as inline text "MUST NOT" in v2.

@opoudjis could you confirm that v2 does render them? Thanks!

:name: attribute conversion to `:name` (v3) / `{:docName, :number}` (v2) issues

Currently we deal with :name / {:docName, :number} separately, but I'd like to see them align to the usage shown in the README.

So setting the document attribute :name (or :rfc-name if it is more clear) should automatically set:

I-D

  • v3: the front/seriesInfo@value will be set to this value.
  • v2: the rfc@docName will be set to this value.

RFC

  • v3: the front/seriesInfo@value will be set to this value.
  • v2: the rfc@number will be set to this value.
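
The mapping proposed above could be sketched roughly as follows. This is only an illustration, not the gem's code: `v2_rfc_name_attribute` and `v3_series_info_value` are hypothetical helper names, and stripping surrounding whitespace from the value is an assumption.

```ruby
# Hypothetical helper: maps :doctype: and :name: to the v2 <rfc> attribute
# described above (rfc@docName for an I-D, rfc@number for an RFC).
def v2_rfc_name_attribute(doctype, name)
  value = name.to_s.strip # assumption: surrounding whitespace is not significant
  case doctype
  when "internet-draft" then { "docName" => value }
  when "rfc"            then { "number" => value }
  else
    raise ArgumentError, "unsupported doctype: #{doctype.inspect}"
  end
end

# Hypothetical helper: v3 uses front/seriesInfo@value for both document types.
def v3_series_info_value(name)
  name.to_s.strip
end
```

With this shape, a single :name: attribute drives both outputs, and only the v2 side needs to know about the doctype.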

Internet-Draft

= Document title
Author
:doctype: internet-draft
:name: draft-hello-00
:intended-status: informational
:submission-type: irtf

v3

<?xml version="1.0" encoding="UTF-8"?>
<rfc preptime="1970-01-01T00:00:00Z" version="3" submissionType="IRTF">
<front>
<title>Document title</title>
<seriesInfo name="Internet-Draft" status="informational" stream="IRTF" value="draft-hello-00"/>
<author fullname="Author" />
</front>
<middle></middle>
</rfc>

=> v2

<?xml version="1.0" encoding="UTF-8"?>
<rfc version="2" submissionType="IRTF" docName="draft-hello-00">
<front>
<title>Document title</title>
<seriesInfo name="Internet-Draft" value="draft-hello-00"/>
<author fullname="Author" />
</front>
<middle></middle>
</rfc>

RFC

= Document title
Author
:doctype: rfc
:name: 1234
:intended-status: informational
:submission-type: irtf

v3

<?xml version="1.0" encoding="UTF-8"?>
<rfc preptime="1970-01-01T00:00:00Z" version="3" submissionType="IRTF">
<front>
<title>Document title</title>
<seriesInfo name="RFC" status="informational" stream="IRTF" value="1234"/>
<author fullname="Author" />
</front>
<middle></middle>
</rfc>

=> v2

<?xml version="1.0" encoding="UTF-8"?>
<rfc version="2" submissionType="IRTF" number="1234">
<front>
<title>Document title</title>
<seriesInfo name="RFC" value="1234"/>
<author fullname="Author" />
</front>
<middle></middle>
</rfc>

asciidoctor doctype should be "rfc"

asciidoctor supports multiple doctypes, including:

  • article
  • book
  • inline
  • manpage

And for our gem, the doctype should definitely be rfc.

Agree?

Prettify README (and move to AsciiDoc)

We probably should use tables for indicating v3/v2 asciidoc syntax equivalence to XMLRFC.

For syntax description, maybe we should mix v3/v2 and separate them according to purpose. For example, a separate section about "Front matter", "Abstract" so that duplicated syntax between v2 and v3 will just be shown once.

And README.md should really be README.adoc

Entities in Asciidoc

I admit to being defeated by this. Asciidoc can have HTML entities like &nbsp; , as can RFC XML. (Yes, they are not XML entities, but they happen anyway, and IETF has them in its canonical examples of RFC XML.) Nokogiri presumably can deal with them via the NOENT option; but the DocumentFragment used to process Nokogiri in noko() does not; and as a result a line like

&lt;&nbsp;(&amp;lt;)

gets mangled into

&lt;(lt;)

I can't work out how to get ParseOptions into DocumentFragment. @paolobrasolin, help!

AsciiDoc type header author syntax

We currently support author attributes syntax like this:

= Title
Author One <[email protected]>, Author Two <[email protected]>
:organization: Author1 Org
:email: [email protected]
:phone: (123) 456-7890
:street: 1, 1 Street
:city: One City
:region: OC
:code: 12345
:country: United States of America
:link: http://author.com/~author1
:organization_2: Author2 Org
:email_2: [email protected]
...

But obviously this is very messy.

Of course, AsciiDoc (and asciidoctor) does not natively support Author Attributes, but we have to create one for XMLRFC.

The corresponding markup in MMark is this:

[[author]]
initials = "A."
surname = "One"
fullname = "Author 1 One"
organization = "Author1 Org"
  [author.address]
  email = "[email protected]"
  uri = "http://author.com/~author1"
    [author.address.postal]
    street = "1, 1 Street"
    city = "One City"
    region = "OC"
    code = "12345"
    country = "United States of America"

Could we use an asciidoctor-like Preprocessor Directive syntax block, similar to ifdef::/endif::, like this?

rfc-author::begin[]
  :initials: A. O.
  :surname: One
  :fullname: Author One
  :organization: Author1 Org
  :address.email: [email protected]
  :address.uri: http://author.com/~author1
  :address.phone: (123) 456-7890
  :address.postal.street: "1, 1 Street"
  :address.postal.city: "One City"
  :address.postal.region: "OC"
  :address.postal.code: "12345"
  :address.postal.country: "United States of America"  
rfc-author::end[]

RFC style bibliography references

@paolobrasolin XMLRFC supports bibliographic references like this (defined in XMLRFCv2 and XMLRFCv3). Could we adopt this into a format that asciidoctor-bibliography supports?

<reference anchor='NIST.SP.800-56Ar2' target='http://dx.doi.org/10.6028/NIST.SP.800-56Ar2'>
  <front>
    <title>SP 800-56Ar2 Recommendation for Pair-Wise Key Establishment Schemes Using Discrete Logarithm Cryptography</title>
    <author initials="B." surname="Barker" fullname="Elaine B. Barker">
      <organization>National Institute of Standards and Technology</organization>
      <address>
        <postal>
          <street>100 Bureau Drive</street>
          <city>Gaithersburg</city>
          <region>MD</region>
          <code>20899</code>
          <country>United States</country>
        </postal>
        <uri>http://www.nist.gov/</uri>
      </address>
    </author>
    <author initials="L." surname="Chen" fullname="Lily Chen">
      <organization>National Institute of Standards and Technology</organization>
      <address>
        <postal>
          <street>100 Bureau Drive</street>
          <city>Gaithersburg</city>
          <region>MD</region>
          <code>20899</code>
          <country>United States</country>
        </postal>
        <uri>http://www.nist.gov/</uri>
      </address>
    </author>
    <author initials="A." surname="Roginsky" fullname="Allen Roginsky">
      <organization>National Institute of Standards and Technology</organization>
      <address>
        <postal>
          <street>100 Bureau Drive</street>
          <city>Gaithersburg</city>
          <region>MD</region>
          <code>20899</code>
          <country>United States</country>
        </postal>
        <uri>http://www.nist.gov/</uri>
      </address>
    </author>
    <author initials="M." surname="Smid" fullname="Miles Smid">
      <organization>Orion Security Solutions, Inc.</organization>
      <address>
        <postal>
          <street>1489 Chain Bridge Road</street>
          <street>Suite 300</street>
          <city>McLean</city>
          <region>VA</region>
          <code>22101</code>
          <country>United States</country>
        </postal>
        <uri>http://www.orionsecuritysolutions.com</uri>
      </address>
    </author>
    <date month='May' year='2013'/>
  </front>
</reference>

Port over RFC samples from IETF and competitive solutions (MMark, Kramdown)

There is plenty of value in porting a number of RFC samples from other example repositories.

The porting should also include:

  • implement a check against default XML / MMark / Kramdown for each document so that our output is directly comparable (and hopefully close to identical) for the corresponding RFC text source (.adoc vs .mkd)

With this we can compare and iteratively streamline our syntax to make it better than what MMark/Kramdown does.

Supporting BCP14 (RFC 2119) keywords

Some keywords are special in Internet Drafts / RFCs: "MUST", "MUST NOT", "SHOULD", etc.

These words are BCP14 keywords and must always be represented as uppercase "MUST". Our gem should detect these and ensure they are properly encoded in the XML within bcp14 tags.
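
A minimal sketch of the detection step, assuming it runs over plain text before XML serialisation. `tag_bcp14` and the constants are hypothetical names; the keyword list is the RFC 2119 set plus "NOT RECOMMENDED" from RFC 8174. Multi-word keywords are listed first so that "MUST NOT" is not tagged as "MUST" followed by plain text.

```ruby
# Keywords from RFC 2119 / RFC 8174, longest phrases first.
BCP14_KEYWORDS = [
  "MUST NOT", "SHALL NOT", "SHOULD NOT", "NOT RECOMMENDED",
  "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL"
].freeze

# Case-sensitive on purpose: lowercase "must" is ordinary prose.
BCP14_PATTERN = /\b(?:#{BCP14_KEYWORDS.map { |k| Regexp.escape(k) }.join("|")})\b/

# Hypothetical helper: wraps each uppercase keyword in a bcp14 element (v3 style).
def tag_bcp14(text)
  text.gsub(BCP14_PATTERN) { |keyword| "<bcp14>#{keyword}</bcp14>" }
end

puts tag_bcp14("Clients MUST NOT retry; servers MAY log.")
```

For v2 output, the same pattern could simply be skipped so the keywords stay as inline text.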

Shift cref to child of preceding paragraph

As now noted in the Readme, the Asciidoc and RFC XML document models are different: paragraphs in Asciidoc do not straightforwardly contain other blocks the way they can in RFC XML. In particular, admonitions are separate blocks, whereas cref elements are assumed to belong to a paragraph.

For that reason, I propose to do postprocessing on the nokogiri, take any cref that is the child of a section (which is what happens now, and which is invalid for RFC XML), and make it the child of its preceding paragraph (where present). If no paragraph is present, it will be the child of an empty paragraph.

Objections or comments?

Emend encoding of intended status

Per @ronaldtse's edits to README:

Set the intended-series attribute to set the intended series of this
document.

The following values are allowed: standard, informational, experimental, bcp, fyi, full-standard.

When doctype is set to:

  • internet-draft: this value can be one of standard, full-standard, bcp, fyi, informational,
    experimental, or historic to indicate the intended series once the document is published as an RFC.

** In v3, this sets a second front/seriesInfo element with @status as one of those values, and an empty @name, to indicate this.
** In v2, this sets the front@category value to one of std, bcp, info, exp, historic.

  • rfc: this value can be one of full-standard, bcp or fyi to indicate the current status of this document.

** In v3, this sets a second front/seriesInfo element with @status as one of those values, and an empty @name, to indicate this.
** In v2, this sets the front@category value to one of std, bcp, info. (While in https://tools.ietf.org/html/rfc7749#appendix-A.1[v2], the values exp, historic are also possible for a RFC, our gem does not support it.)

Proper rendering of example / code blocks

Code blocks in MMark are always centered unless given an attribute not to be.

However, if you have multiple code blocks (e.g., in ABNF) to show in a series, the centering looks very awkward, with every block placed in the middle. This feels strange because code should always start from the left side with some fixed indent.

By default, our gem should left-align code blocks with a small, configurable indent. Centering should apply only to mathematical formulas.

Incorporating mathematical formatting in RFC XML output

There's lots of mathematical formatting in MMark, delimited with $$, which I'm putting into Asciidoctor with the math[] macro. Maths is not natively supported by RFC XML. Should we presuppose an external XML schema for these, or leave them unrendered?

At a minimum, I could render it as italics, although I would have to selectively italicise only the letters and not the numbers or symbols in any such expression.
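
The selective italics fallback might look something like this. It is a sketch only: `italicise_letters` is a hypothetical name, the `<em>` element assumes v3 output (v2 would need `<spanx style="emph">` instead), and treating every run of ASCII letters as a variable is a simplification.

```ruby
require "cgi"

# Hypothetical helper: italicise runs of letters, leave digits and operators
# alone, and escape XML-significant characters along the way.
def italicise_letters(expression)
  expression.gsub(/[A-Za-z]+|[<>&]/) do |token|
    token.match?(/[A-Za-z]/) ? "<em>#{token}</em>" : CGI.escapeHTML(token)
  end
end

puts italicise_letters("x^2 + 3y < z")
```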

Smart apostrophes

Asciidoctor natively converts apostrophes (as opposed to single quotes) to smart apostrophes; e.g. don't is rendered as don’t. This is not necessarily a good thing for XML output. Nor should we be forcing users to escape apostrophes as don\'t.

@ronaldtse, should I undo smart apostrophes in the RFC XML output?

Reimplement displayreferences in v3

Now that I have got rid of ulist reference conversion, I still need to process a preamble to the raw XML of RFC XML bibliography, in order to extract displayreferences:

[[[xxx,1]]][[[gof,2]]]
++++
RFC XML Bibliography
++++

needs to process [[[xxx,1]]][[[gof,2]]], and convert them into:

      <displayreference target="xxx" to="1"/>
      <displayreference target="gof" to="2"/>
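
As a rough sketch of that extraction (the helper name `display_references` is hypothetical, and the anchor text is assumed not to need XML escaping):

```ruby
# Matches one [[[target,label]]] anchor in the preamble.
DISPLAY_REFERENCE = /\[\[\[([^,\]]+),\s*([^\]]+)\]\]\]/

# Hypothetical helper: returns one displayreference element per anchor found.
def display_references(preamble)
  preamble.scan(DISPLAY_REFERENCE).map do |target, to|
    %(<displayreference target="#{target}" to="#{to}"/>)
  end
end

puts display_references("[[[xxx,1]]][[[gof,2]]]")
```

Anchors without a label, such as [[[xxx]]], do not match the pattern and are ignored.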

Output DTD is missing

Child of #59

WARNING: No DTD given, defaulting to /usr/local/lib/python3.6/site-packages/xml2rfc/templates/rfc2629.dtd

Do the sample documents from IETF have DTD link?

Note that RFC 2629 is v1, but the DTD has been updated to v2. No idea if it has been updated to v3; xml2rfc is not working for v3 yet.

Support normative vs non-normative citations

Just like MMark, we have to separate normative vs non-normative citations.
https://github.com/miekg/mmark/wiki/Syntax#citations

  • Informative (default) [@RFC1234]
  • Informative [@?RFC1234]
  • Normative [@!RFC1234]
  • Add to bib but do not show cite [-@RFC1234]

Using a higher modifier such as first citing as informative then 'upgrading' to normative would treat the reference as normative in the bibliography.
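
To make the "upgrade" rule concrete, here is a small sketch that scans MMark-style citations and keeps the strongest classification per reference. The names `CITATION`, `RANK` and `classify_citations` are hypothetical, and the key pattern (letters, digits, dots, underscores, hyphens) is an assumption.

```ruby
# [-@KEY], [@KEY], [@?KEY], [@!KEY]
CITATION = /\[(-)?@([?!])?([A-Za-z0-9._-]+)\]/

RANK = { informative: 0, normative: 1 }.freeze

# Hypothetical helper: returns { "RFC1234" => :normative, ... }.
# A reference cited both ways ends up normative, whatever the order.
def classify_citations(text)
  text.scan(CITATION).each_with_object({}) do |(_suppress, modifier, key), refs|
    kind = modifier == "!" ? :normative : :informative
    current = refs[key]
    refs[key] = kind if current.nil? || RANK[kind] > RANK[current]
  end
end

p classify_citations("See [@RFC1234] and later [@!RFC1234].")
```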

cc: @paolobrasolin would this be for asciidoctor-bibliography?

Rendering of table blocks

MMark's handling of table blocks doesn't always work -- most of the time it is "supposed to fit" when seen in text, but then breaks out of the 72 char limit, eventually leaving most tables as hand-formatted ascii-tables.

We should make this work in our gem.

Design choices for Asciidoc

These are the design choices I've made for rendering RFC XML features in Asciidoc which I think are open for discussion with @paolobrasolin, as more comfortable with Asciidoc. I have to say, I'm more or less comfortable with where I have got to with these, but let's run through them.

The issue of whether normative and informative crossreferences are to be differentiated (as they are in MMark) is discussed in separate tickets. Ditto added markup on authors. (Myself, I am wary of adding machinery not built in to asciidoctor, in order to make the gem as reusable as possible, by minimising dependencies. I also don't see a strong reason why RFC Asciidoctor needs to look like MMark. But the customer is always right. :-)

  • Comments:
    ** Are being rendered as admonitions rather than comments, since (as far as I can tell!) the asciidoctor preprocessor actually ignores native asciidoc comments. There's no option in v2 to signal that comments should be removed from a published draft, as there is in v3.

  • Abstract
    ** Is being naively extracted as the first paragraphs after the document header and before the first section or admonitions (in the asciidoc preamble), since that is where the RFC XML schema constrains it to be. This happens whether or not there is an [abstract] style on the paragraph.

  • Boilerplate
    ** I have not rendered v3 boilerplate in Asciidoc at all.

  • Cross-reference formatting attribute
    ** Options of counter, title, or derived text (e.g. Table 1). I am currently ignoring these options, because xrefstyle did not seem to offer what RFC XML needed.

  • iref primary attribute
    ** This is not whether the index term is a primary term, but whether it should be emphasised in the index as a more important reference (as boldface or italics in some book indexes). Does not seem to be supported in Asciidoc.

  • author initials
    ** There is a clash between the Asciidoc :authorinitials attribute, which includes the initial of the surname, and the RFC XML :initials attribute, which leaves it out. For Homer J Simpson, asciidoc would assume the authorinitials are HJS, RFC XML as HJ. Currently I take only the initial letter of the first name as the initial. If the user can specify RFC XML initials for authors, they need to be given a name other than :authorinitials, to avoid confusion.

  • referencegroups
    ** I have delimited groupings of references in the ulist format of references by using nested lists, but that disrupts the rendering of anchors: the default asciidoc processor clearly expects any ulist under bibliography to be flat.

  • date
    ** Asciidoc expects an ISO date under :revdate. I currently process the date and extract out the year, month, date attributes expected by RFC XML.

  • arrays in header
    ** MMark puts array values of header attributes in brackets, e.g. obsoletes: ["7994", "8981"]. I just leave the value as a comma-delimited list, without quotes or brackets, though that raises the possibility of ambiguity for keywords that contain commas.

  • superscript, subscript
    ** Not natively supported at all in RFC XML v2, though they are in v3. I leave them as Latex-style ^ and _. These often end up in attribute values in v2 XML, so rendering them in markup will not always be feasible.

  • table column width
    ** This is more a bug, but v2 has an optional table column width attribute, which I'm reading from the colpcwidth attribute. Following the HTML5 converter, that attribute is not meant to be added if the table is autowidth, but it's being added anyway.

  • Table preamble and postamble
    ** Not supported; any prefatory or footer free text in a table is going to be indistinguishable from a separate paragraph in Asciidoc, unless the table is embedded in an example, say. The elements are desultorily supported in figure (example), but they are quite fragile, and the elements were justly deprecated in v3.

  • figure
    ** Rendered as example. Note that artwork, images, and sourcecode are all meant to be embedded within figures; the converter supplies the figure node if it is not in the Asciidoc markup.
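
For the date item in the list above, the split into year, month and day could be sketched like this. `rfc_date_attributes` is a hypothetical name, and emitting the month as an English name (as in `<date month='May' year='2013'/>`) is an assumption about the desired output.

```ruby
require "date"

# Hypothetical helper: turns an ISO 8601 :revdate: into the three
# attribute values of the RFC XML date element.
def rfc_date_attributes(revdate)
  date = Date.iso8601(revdate) # raises on anything that is not ISO 8601
  {
    year: date.year.to_s,
    month: Date::MONTHNAMES[date.month],
    day: date.day.to_s
  }
end

p rfc_date_attributes("2013-05-01")
```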

@paolobrasolin , I suggest you have a look at the RFC Asciidoc I've created at https://github.com/riboseinc/rfc-openpgp-oscca-adoc , and let me know if you consider it idiomatic.

Automatic (normative/informative) references in AsciiDoc, should support output in both v2 and v3 formats

It is understandable that some syntax is only acceptable by v2 and some only acceptable by v3.

However, the basic idea of the AsciiDoc is that it should be able to output in both formats regardless of the syntax used.

For example, in References there is a difference between AsciiDoc "v2 references" vs the "v3 references". Both should actually come from the same syntax. We can require the asciidoctor-bibliography gem to support this.

For RFC references like RFC1234 (or just the ones supported by xml2rfc as bibxml, see "Citation Libraries" on the xml2rfc tool page), they do not need to be resolved -- they can be entered directly into the XMLRFC and xml2rfc will resolve them.

See "Helpful Hints" on the same xml2rfc tool page:

<?xml version='1.0'?>
<!DOCTYPE rfc SYSTEM 'rfc2629.dtd' [
<!ENTITY rfc2629 PUBLIC '' 'http://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2629.xml'>
]>
...
<t>This document was constructed using the <xref target="RFC2629" /> DTD.</t>
...
<references>
...
&rfc2629;
...
</references>

There should be some automatic bibliography generation like this:

[[rfc-bibliography]]

Entities in Document header

Child of #56

Any use of entities in document header attributes is being mangled:

:area: Operations & Management Area

is being converted through

      def area(node, xml)
        node.attr("area")&.split(/, ?/)&.each do |ar|
          xml.area ar
        end
      end

into

<area>Operations &amp;amp; Management Area</area>
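
One possible workaround, sketched on the assumption that the attribute value reaches the converter already escaped (& as &amp;amp;) and that the XML builder then escapes it a second time: decode the value before handing it to the builder. `area_values` is a hypothetical helper, not the gem's method. CGI.unescapeHTML only decodes a small set of named entities plus numeric references, so something like &amp;nbsp; would still need separate handling.

```ruby
require "cgi"

# Hypothetical helper: split the comma-delimited attribute and undo the
# first round of escaping, so the builder's own escaping yields a single &amp;.
def area_values(raw)
  return [] if raw.nil?

  raw.split(/, ?/).map { |value| CGI.unescapeHTML(value) }
end

p area_values("Operations &amp; Management Area")
```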

Support "section" references like MMark

In normal AsciiDoc a section reference will produce the title.

In RFCs, a section reference should produce something like "Section 5.4.3".

In MMark, references like:

See Section (#header)

Will produce See Section 5.4.3.

Attribute naming/retrieval

The current attribute retrieval in the converter cannot work.
See http://asciidoctor.org/docs/user-manual/#attribute-restrictions

TL;DR: attributes are stored as lowercased.

We're retrieving them using `camelCase` (i.e. their corresponding tag name). Therefore, we're retrieving `nil` on all attributes having a composite name.

I suggest we convert them to `lowercase-dashed-case` (attribute names allow only letters and dashes, with some exceptions for the first character, so this is standard practice for asciidoctor).

I'll do this on the way while I'm writing specs. Sounds good, @opoudjis?
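
The conversion suggested above is small enough to sketch; `attribute_name_for` is a hypothetical helper name.

```ruby
# Hypothetical helper: derive the asciidoctor attribute name from the
# camelCase XML name, e.g. submissionType -> submission-type.
def attribute_name_for(tag_name)
  tag_name.gsub(/([a-z0-9])([A-Z])/, '\1-\2').downcase
end

puts attribute_name_for("submissionType")
```

The converter would then look up `node.attr(attribute_name_for("submissionType"))` rather than the camelCase name.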

Table formatting

Child of #59

Asciidoctor supports the following:

  • frame could be all, topbot, sides or none. topbot and sides do not create the intended effect and render the same as all.
  • grid could be all, rows, cols or none. However, it does not really have an effect.

I had only implemented grid. Will need to check.

I found that this configuration works:

[cols="2*^", frame="sides", grid="cols"]
|===
|ttcol #1 |ttcol #2

|c #1 |c #2
|c #3 |c #4
|c #5 |c #6
|===

Command line interface

At the moment it's this:

asciidoctor_rfcxml SOURCE_FILE [options] FORMATS...

-h           --help              show the help message and exit
-n           --no-dtd            disable DTD validation step
-N           --no-network        don't use the network to resolve references
-q           --quiet             don't print anything
-v           --verbose           print extra information
-V           --version           display the version number and exit

-b BASENAME  --basename=BASENAME specify the base name for output files
-D DATE      --date=DATE         run as if today's date is DATE (format: yyyy-mm-dd)
-d DTD       --dtd=DTD           specify an alternate dtd file
-o FILENAME  --out=FILENAME      specify an output filename


FORMATS:
--txt
--html

embed relaxng validation into gem for output

We should validate output against the RFC XML schemas as a courtesy: if output is invalid, and we can't make it valid automatically, the user needs to be warned. This is important for v3, for which xml2rfc has not been implemented yet.
