dioscuri

A gemtext (text/gemini) parser with support for streaming, ASTs, and CSTs.

Do you:

  • 🤨 think that HTTP and HTML are bloated?
  • 😔 feel markdown has superfluous features?
  • 🤔 find gopher too light?
  • 🥰 like BRUTALISM?

Then Gemini might be for you (see this post or this one on why it’s cool).

What is this?

Dioscuri (named for the gemini twins Castor and Pollux) is a tokenizer/lexer/parser/etc for gemtext (the text/gemini markup format). It gives you several things:

  • buffering and streaming interfaces that compile to HTML
  • interfaces to create unist compliant abstract syntax trees and serialize those back to gemtext
  • interfaces to transform to and from mdast (markdown ast)
  • parts that could be used to generate CSTs

Use these tools if you currently write markdown and want to transform it to gemtext, or if you want to combine your posts into an RSS feed or onto your “homepage”. And many other things!

When should I use this?

Use this for all your gemtext needs!

Install

This package is ESM only. In Node.js (version 14.14+, 16.0+), install with npm:

npm install dioscuri

In Deno with esm.sh:

import * as dioscuri from 'https://esm.sh/dioscuri@1'

In browsers with esm.sh:

<script type="module">
  import * as dioscuri from 'https://esm.sh/dioscuri@1?bundle'
</script>

Use

See each interface below for examples.

API

This package exports the identifiers buffer, stream, fromGemtext, toGemtext, fromMdast, toMdast. The raw compiler and parser are also exported. There is no default export.

buffer(doc, encoding?, options?)

Compile gemtext to HTML.

doc

Gemtext to parse (string or Buffer).

encoding

Character encoding to understand doc as when it’s a Buffer (string, default: 'utf8').

options.defaultLineEnding

Value to use for line endings not in doc (string, default: first line ending or '\n').

Generally, dioscuri copies line endings ('\n' or '\r\n') in the document over to the compiled HTML. In some cases, such as for the quote > a, extra line endings are added between the generated tags: <blockquote>\n<p>a</p>\n</blockquote>.
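The default described above can be read as a small resolution rule. The following is a minimal sketch of that rule, not dioscuri's actual implementation; resolveLineEnding is a hypothetical helper name introduced only for illustration.

```javascript
// Hypothetical helper, not part of dioscuri's API.
// Picks the line ending to use where the document does not supply one.
function resolveLineEnding(doc, defaultLineEnding) {
  // An explicitly passed option wins.
  if (typeof defaultLineEnding === 'string') return defaultLineEnding
  // Otherwise use the first line ending found in the document.
  const match = /\r?\n/.exec(doc)
  // Fall back to '\n' when the document has no line ending at all.
  return match ? match[0] : '\n'
}

console.log(JSON.stringify(resolveLineEnding('a\r\nb\nc')))
console.log(JSON.stringify(resolveLineEnding('> a')))
```

Under these assumptions, a document that starts with Windows line endings keeps them for any line endings that are added.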

options.allowDangerousProtocol

Whether to allow potentially dangerous protocols in URLs (boolean, default: false). Relative URLs (such as image.jpg) are always allowed. Otherwise, the allowed protocols are gemini, http, https, irc, ircs, mailto, and xmpp.
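To make the allow-list concrete, here is a rough sketch of how such a check could work. It is an illustration under stated assumptions, not the check dioscuri performs internally: isAllowedUrl is a hypothetical name, and treating a `/`, `?`, or `#` before the first colon as a sign of a relative URL is an assumption of this sketch.

```javascript
// Hypothetical helper, not part of dioscuri's API.
// The protocol list is taken from the option description above.
const allowedProtocols = ['gemini', 'http', 'https', 'irc', 'ircs', 'mailto', 'xmpp']

function isAllowedUrl(url) {
  const colon = url.indexOf(':')
  // No colon at all: the URL is relative, so it is allowed.
  if (colon === -1) return true
  const head = url.slice(0, colon)
  // Assumption: a `/`, `?`, or `#` before the first colon means the colon
  // belongs to a path, query, or fragment, so the URL is still relative.
  if (/[/?#]/.test(head)) return true
  // Otherwise `head` is a protocol; compare it case-insensitively.
  return allowedProtocols.includes(head.toLowerCase())
}

console.log(isAllowedUrl('gemini://example.com'))
console.log(isAllowedUrl('javascript:alert(1)'))
```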

Returns

Compiled HTML (string).

Example

Say we have a gemtext document, example.gmi:

# Hello, world!

Some text

=> https://example.com An example

> A quote

* List

…and our module example.js looks as follows:

import fs from 'node:fs/promises'
import {buffer} from 'dioscuri'

const doc = await fs.readFile('example.gmi')

console.log(buffer(doc))

…now running node example.js yields:

<h1>Hello, world!</h1>
<br />
<p>Some text</p>
<br />
<div><a href="https://example.com">An example</a></div>
<br />
<blockquote>
<p>A quote</p>
</blockquote>
<br />
<ul>
<li>List</li>
</ul>
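The example document uses one line of each common gemtext line type. As a rough illustration of how those types are told apart by their prefix, here is a minimal classifier. It is a sketch, not dioscuri's tokenizer: lineType and its return labels are hypothetical, and it does not track whether a line sits inside a preformatted block.

```javascript
// Hypothetical helper, not part of dioscuri's API.
// Built with repeat() so the fence marker does not appear literally here.
const fence = '`'.repeat(3)

function lineType(line) {
  if (line.startsWith(fence)) return 'preToggle' // Opens or closes preformatted text.
  if (line.startsWith('=>')) return 'link'
  if (line.startsWith('#')) return 'heading'
  if (line.startsWith('* ')) return 'listItem' // The space after `*` is required.
  if (line.startsWith('>')) return 'quote'
  return line === '' ? 'blank' : 'text'
}

const doc =
  '# Hello, world!\n\nSome text\n\n=> https://example.com An example\n\n> A quote\n\n* List'
const types = doc.split(/\r?\n/).map(lineType)

console.log(types)
```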

stream(options?)

Streaming interface to compile gemtext to HTML. options is the same as the buffering interface above.

Example

Assuming the same example.gmi as before and an example.js like this:

import fs from 'node:fs'
import {stream} from 'dioscuri'

fs.createReadStream('example.gmi')
  .on('error', handleError)
  .pipe(stream())
  .pipe(process.stdout)

function handleError(error) {
  throw error // Handle your error here!
}

…then running node example.js yields the same as before.

fromGemtext(doc, encoding?)

Parse gemtext to an AST (gast). doc and encoding are the same as the buffering interface above.

Returns

Root.

Example

Assuming the same example.gmi as before and an example.js like this:

import fs from 'node:fs/promises'
import {fromGemtext} from 'dioscuri'

const doc = await fs.readFile('example.gmi')

console.dir(fromGemtext(doc), {depth: null})

…now running node example.js yields (positional info removed for brevity):

{
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'text', value: 'Some text'},
    {type: 'break'},
    {type: 'link', url: 'https://example.com', value: 'An example'},
    {type: 'break'},
    {type: 'quote', value: 'A quote'},
    {type: 'break'},
    {type: 'list', children: [{type: 'listItem', value: 'List'}]}
  ]
}
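Because the result is a plain object tree, it can be walked without extra tooling. The sketch below collects all nodes of a given type; selectAll is a hypothetical helper written for this example, and the tree literal is copied from the output above so the snippet runs on its own.

```javascript
// The tree printed above, inlined so this sketch is self-contained.
const tree = {
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'text', value: 'Some text'},
    {type: 'break'},
    {type: 'link', url: 'https://example.com', value: 'An example'},
    {type: 'break'},
    {type: 'quote', value: 'A quote'},
    {type: 'break'},
    {type: 'list', children: [{type: 'listItem', value: 'List'}]}
  ]
}

// Hypothetical helper: depth-first search for nodes with a matching `type`.
function selectAll(node, type, found = []) {
  if (node.type === type) found.push(node)
  // Only parents (root, list) have `children`; literals do not.
  for (const child of node.children || []) selectAll(child, type, found)
  return found
}

console.log(selectAll(tree, 'link').map((link) => link.url))
```

Since gast extends unist, existing unist utilities should also work for this kind of traversal.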

toGemtext(tree)

Serialize gast to gemtext (string).

Example

Say our script example.js looks as follows:

import {toGemtext} from 'dioscuri'

const tree = {
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'text', value: 'Some text'}
  ]
}

console.log(toGemtext(tree))

…then running node example.js yields:

# Hello, world!

Some text
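To show how each node type maps back to a line of gemtext, here is a simplified serializer. It is a sketch for illustration only and may differ from dioscuri's real toGemtext in edge cases (for example around empty values or trailing line endings); serialize is a hypothetical function name.

```javascript
// Built with repeat() so the fence marker does not appear literally here.
const fence = '`'.repeat(3)

// Hypothetical, simplified serializer: one gast node in, gemtext out.
function serialize(node) {
  switch (node.type) {
    case 'root':
    case 'list':
      // Children are joined by a single line ending.
      return node.children.map(serialize).join('\n')
    case 'break':
      // A break is an empty line.
      return ''
    case 'heading':
      return '#'.repeat(node.rank) + (node.value ? ' ' + node.value : '')
    case 'link':
      return '=> ' + node.url + (node.value ? ' ' + node.value : '')
    case 'listItem':
      return '* ' + (node.value || '')
    case 'pre':
      return fence + (node.alt || '') + '\n' + (node.value ? node.value + '\n' : '') + fence
    case 'quote':
      return '>' + (node.value ? ' ' + node.value : '')
    case 'text':
      return node.value
    default:
      throw new Error('Unknown node type: ' + node.type)
  }
}

const tree = {
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'text', value: 'Some text'}
  ]
}

console.log(serialize(tree))
```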

fromMdast(tree, options?)

Transform mdast to gast.

options.endlinks

Place links at the end of the document (boolean, default: false). The default is to place links before the next heading.

options.tight

Do not put blank lines between blocks (boolean, default: false). The default is to place breaks between each block (paragraph, heading, etc).

Returns

gast, probably. Some mdast nodes have no gast representation so they are dropped. If you pass one of those in as tree, you’ll get undefined out.

Example

Say we have a markdown document example.md:

# Hello, world!

Some text, *emphasis*, **strong**\
`code()`, and ~~scratch that~~strikethrough.

Here’s a [link](https://example.com 'Just an example'), [link reference][*],
and images: ![image reference][*], ![](example.png 'Another example').

***

> Some
> quotes

*   a list
*   with another item

1.  “Ordered”
2.  List

```
A
Poem
```

```js
console.log(1)
```

| Name | Value |
| ---- | ----- |
| Beep | 1.2   |
| Boop | 3.14  |

*   [x] Checked
*   [ ] Unchecked

Footnotes[^], ^[even inline].

[*]: https://example.org "URL definition"

[^]: Footnote definition

…and our module example.js looks as follows:

import fs from 'node:fs/promises'
import {gfm} from 'micromark-extension-gfm'
import {footnote} from 'micromark-extension-footnote'
import {fromMarkdown} from 'mdast-util-from-markdown'
import {gfmFromMarkdown} from 'mdast-util-gfm'
import {footnoteFromMarkdown} from 'mdast-util-footnote'
import {fromMdast, toGemtext} from 'dioscuri'

const mdast = fromMarkdown(await fs.readFile('example.md'), {
  extensions: [gfm(), footnote({inlineNotes: true})],
  mdastExtensions: [gfmFromMarkdown, footnoteFromMarkdown]
})

console.log(toGemtext(fromMdast(mdast)))

…now running node example.js yields:

# Hello, world!

Some text, emphasis, strong code(), and strikethrough.

Here’s a link[1], link reference[2], and images: image reference[2], [3].

> Some quotes

* a list
* with another item

* “Ordered”
* List

```
A
Poem
```

```js
console.log(1)
```

```csv
Name,Value
Beep,1.2
Boop,3.14
```

* ✓ Checked
* ✗ Unchecked

Footnotes[a], [b].

=> https://example.com [1] Just an example

=> https://example.org [2] URL definition

=> example.png [3] Another example

[a] Footnote definition

[b] even inline
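In the output above, inline links become numbered markers, and the link and image reference that share a definition share the number [2]. The following sketch shows one way such numbering could be done. It is not dioscuri's implementation: createLinkRegistry is a hypothetical helper, and treating two links as the same when both URL and title match is an assumption.

```javascript
// Hypothetical helper, not part of dioscuri's API.
function createLinkRegistry() {
  const definitions = []

  return {
    // Return the marker for a link, reusing the number of an earlier
    // link with the same URL and title (an assumption of this sketch).
    reference(url, title) {
      let index = definitions.findIndex((d) => d.url === url && d.title === title)
      if (index === -1) {
        definitions.push({url, title})
        index = definitions.length - 1
      }
      return '[' + (index + 1) + ']'
    },
    // Produce one gemtext link line per unique definition, in order.
    lines() {
      return definitions.map(
        (d, i) => '=> ' + d.url + ' [' + (i + 1) + ']' + (d.title ? ' ' + d.title : '')
      )
    }
  }
}

const registry = createLinkRegistry()
const sentence =
  'link' + registry.reference('https://example.com', 'Just an example') +
  ', link reference' + registry.reference('https://example.org', 'URL definition') +
  ', image reference' + registry.reference('https://example.org', 'URL definition') +
  ', ' + registry.reference('example.png', 'Another example')

console.log(sentence)
console.log(registry.lines().join('\n'))
```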

toMdast(tree)

Transform gast to mdast.

Returns

mdast, probably. Some gast nodes have no mdast representation so they are dropped. If you pass one of those in as tree, you’ll get undefined out.

Example

Say we have a gemtext document example.gmi:

# Hello, world!

Some text

=> https://example.com An example

> A quote

* List

…and our module example.js looks as follows:

import fs from 'node:fs/promises'
import {fromGemtext, toMdast} from 'dioscuri'

const doc = await fs.readFile('example.gmi')

console.dir(toMdast(fromGemtext(doc)), {depth: null})

…now running node example.js yields (position info removed for brevity):

{
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [{type: 'text', value: 'Hello, world!'}]
    },
    {
      type: 'paragraph',
      children: [{type: 'text', value: 'Some text'}]
    },
    {
      type: 'paragraph',
      children: [
        {
          type: 'link',
          url: 'https://example.com',
          title: null,
          children: [{type: 'text', value: 'An example'}]
        }
      ]
    },
    {
      type: 'blockquote',
      children: [
        {type: 'paragraph', children: [{type: 'text', value: 'A quote'}]}
      ]
    },
    {
      type: 'list',
      ordered: false,
      spread: false,
      children: [
        {
          type: 'listItem',
          spread: false,
          children: [
            {type: 'paragraph', children: [{type: 'text', value: 'List'}]}
          ]
        }
      ]
    }
  ]
}
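The output shows that breaks are dropped and that each gast literal is wrapped in the matching mdast structure. A simplified version of that mapping, covering only the node types in this example, might look as follows. toMdastSketch is a hypothetical name, and this is not dioscuri's actual toMdast; other node types such as pre are deliberately left out and throw.

```javascript
// Wrap a string in mdast text children; empty values yield no children.
const text = (value) => (value ? [{type: 'text', value}] : [])
const paragraph = (children) => ({type: 'paragraph', children})

// Hypothetical, simplified transform for the node types shown above.
function toMdastSketch(node) {
  switch (node.type) {
    case 'root':
      return {
        type: 'root',
        // Breaks map to `undefined` and are filtered out.
        children: node.children.map(toMdastSketch).filter(Boolean)
      }
    case 'break':
      return undefined
    case 'heading':
      return {type: 'heading', depth: node.rank, children: text(node.value)}
    case 'text':
      return paragraph(text(node.value))
    case 'link':
      return paragraph([
        {type: 'link', url: node.url, title: null, children: text(node.value)}
      ])
    case 'quote':
      return {type: 'blockquote', children: [paragraph(text(node.value))]}
    case 'list':
      return {
        type: 'list',
        ordered: false,
        spread: false,
        children: node.children.map(toMdastSketch)
      }
    case 'listItem':
      return {type: 'listItem', spread: false, children: [paragraph(text(node.value))]}
    default:
      throw new Error('Not covered by this sketch: ' + node.type)
  }
}

const gast = {
  type: 'root',
  children: [
    {type: 'heading', rank: 1, value: 'Hello, world!'},
    {type: 'break'},
    {type: 'link', url: 'https://example.com', value: 'An example'},
    {type: 'break'},
    {type: 'list', children: [{type: 'listItem', value: 'List'}]}
  ]
}

const result = toMdastSketch(gast)
console.log(JSON.stringify(result, null, 2))
```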

gast

gast extends unist, a format for syntax trees, to benefit from its ecosystem of utilities.

Root

interface Root <: Parent {
  type: 'root'
  children: [Break | Heading | Link | List | Pre | Quote | Text]
}

Root (Parent) represents a document.

Break

interface Break <: Node {
  type: 'break'
}

Break (Node) represents a hard break.

Heading

interface Heading <: Literal {
  type: 'heading'
  rank: 1 | 2 | 3
  value: string?
}

Heading (Literal) represents a heading of a section.

Link

interface Link <: Literal {
  type: 'link'
  url: string
  value: string?
}

Link (Literal) represents a resource.

A url field must be present. It represents a URL to the resource.

List

interface List <: Parent {
  type: 'list'
  children: [ListItem]
}

List (Parent) represents an enumeration.

ListItem

interface ListItem <: Literal {
  type: 'listItem'
  value: string?
}

ListItem (Literal) represents an item in a list.

Pre

interface Pre <: Literal {
  type: 'pre'
  alt: string?
  value: string?
}

Pre (Literal) represents preformatted text.

An alt field may be present. When present, the node represents computer code, and the field gives the language of computer code being marked up.

Quote

interface Quote <: Literal {
  type: 'quote'
  value: string?
}

Quote (Literal) represents a quote.

Text

interface Text <: Literal {
  type: 'text'
  value: string
}

Text (Literal) represents a paragraph.
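When building trees by hand, it can help to check them against the interfaces above. The sketch below encodes those definitions as a small validator. isValidGast is a hypothetical helper, not something dioscuri exports, and it reads `string?` as "a string, null, or absent".

```javascript
// Node types allowed directly in a root, per the Root interface above.
const rootContent = new Set(['break', 'heading', 'link', 'list', 'pre', 'quote', 'text'])

// Assumption: `string?` means the field may be a string, null, or missing.
const optionalString = (value) =>
  value === undefined || value === null || typeof value === 'string'

// Hypothetical helper, not part of dioscuri's API.
function isValidGast(node) {
  switch (node.type) {
    case 'root':
      return (
        Array.isArray(node.children) &&
        node.children.every((child) => rootContent.has(child.type) && isValidGast(child))
      )
    case 'list':
      return (
        Array.isArray(node.children) &&
        node.children.every((child) => child.type === 'listItem' && isValidGast(child))
      )
    case 'break':
      return true
    case 'heading':
      return [1, 2, 3].includes(node.rank) && optionalString(node.value)
    case 'link':
      // A url field must be present.
      return typeof node.url === 'string' && optionalString(node.value)
    case 'pre':
      return optionalString(node.alt) && optionalString(node.value)
    case 'listItem':
    case 'quote':
      return optionalString(node.value)
    case 'text':
      return typeof node.value === 'string'
    default:
      return false
  }
}

console.log(isValidGast({type: 'heading', rank: 2, value: 'Ok'}))
console.log(isValidGast({type: 'heading', rank: 4, value: 'Too deep'}))
```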

Types

This package is fully typed with TypeScript. It exports the additional types Value (for the input, string or buffer), BufferEncoding ('utf8' etc), CompileOptions (options to turn things to a string), and FromMdastOptions (options to turn things into gast).

Compatibility

This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+ and 16.0+. It also works in Deno and modern browsers.

Related

  • @derhuerst/gemini – gemini protocol server and client
  • gemini-fetch – load gemini protocol data the way you would fetch from HTTP in JavaScript

Contribute

Yes please! See How to Contribute to Open Source.

Security

Gemtext is safe. As for the generated HTML: that’s safe by default. Pass allowDangerousProtocol: true if you want to live dangerously.

License

MIT © Titus Wormer

dioscuri's Issues

Compatibility with CommonJS?

While ESM is on the rise, it's not supported absolutely everywhere. Electron doesn't work with ESM (my particular concern), and CommonJS is still the default in Node.js's latest LTS release (v14). What is the advantage of limiting this module to ESM code? (As far as I know, ESM cannot be used by CommonJS modules.)

Background
I'm making a gemini client with Electron. I'd like to parse it with dioscuri. Electron doesn't yet support running ESM code in the main process (see electron/electron#21457). In the renderer process, I tried both using the import() function and loading it as type=module, and it complained about bad imports and/or was unable to find the module.

readme: link to Gemini client/server?

Hey, author of @derhuerst/gemini here. 👋

I think for users wanting to interact with the Gemini ecosystem, it's genuinely useful to quickly find a Gemtext parser and a Gemini client/server lib. As both @derhuerst/gemini & dioscuri are somewhat hard to find with regular search tools, let's cross-link from both readmes.

What do you think?

A line with a single link in markdown is still formatted as a footnote in gemtext

A quick introduction to "gemtext" markup says that gemtext can only contain one link per line (which makes sense).
I think it's really smart to then format inline links from markdown as footnotes in gemtext. But is it also intended that a markdown line containing only a single link is formatted as a footnote too, instead of as a gemtext link line?
Or am I missing something?

Example:

(async () => {
  const { toGemtext, fromMdast } = await import('dioscuri');
  const { fromMarkdown } = await import('mdast-util-from-markdown');

  // The template literal stays flush left so the markdown is not indented.
  const markdown = `# Title
This [link](http://example.com) is formatted as a footnote.
Should the following link also be a footnote?

[Another link](http://placekitten.com/)`;

  const tree = fromMarkdown(markdown);
  const gemtext = toGemtext(fromMdast(tree));
  console.log(gemtext);
})();

Result:

This link[1] is formatted as a footnote. Should the following link also be a footnote?

Another link[2]

=> http://example.com [1]

=> http://placekitten.com/ [2]
