thaw's Introduction

Thaw

Thaw is a tool for creating documents with export to PDF in a text-focused working style. It offers a feature-rich, easy-to-learn markup language that lets you write your documents in a human-readable way.

Thaw is a prototype of a document layout engine and is no longer maintained. The successor Letter can be found at https://github.com/bennyboer/letter and is under active development.

Motivation

Today's options for writing documents each have drawbacks: WYSIWYG editors and DTP software are usually very expensive (InDesign), TeX/LaTeX and its derivatives are hard to learn, and human-readable formats like Markdown lack a lot of features (math typesetting, captions, ...). We want to improve the situation by proposing Thaw, which lets you write your documents in a distraction-free and human-readable way while being easy to learn.

Since Thaw documents consist only of simple text files (except for images), you'll have no problems using Git or other version control software to version your documents.

Example

Usually a Thaw document is defined by three files:

  • A text file (ending with *.tdt) - in some ways similar to Markdown - where you define the contents (text) of the document
  • A style file (ending with *.tds) - very similar to CSS - where you define the style of the document (page size, colors, fonts, ...)
  • An info file (ending with *.tdi) - a properties file - where you define the encoding of the project files, a bibliography to use, citation style, variables, ...

Below are some code snippets and the resulting output, taken from the demo example in this repository under example/demo. Running the CLI with Gradle (an executable version should be downloadable from the Releases page once there is a release) on the demo example files with ./gradlew.bat :cli:run --args="--root-folder='../example/demo' --output='../example/demo/demo.pdf'" produces the following PDF (only a screenshot is shown):

Screenshot

Check out the full PDF here.

Info file

The info file defines some info about the document.

encoding = UTF-8
language = en

bibliography.file = literature.bib
bibliography.style = apa

var.version = v0.1
var.author = Benjamin Eder

Style file

The style file is used to adapt the look of the document to your needs.

document {
	font-family: Cambria;
	inline-code-font-family: Consolas;

	font-size: 13pt;
	color: #222222;

	width: 210mm;
	height: 297mm;

	margin: 2cm 3cm;
}

/* ... */

Text file

The text file defines the contents and the structure of the document.

#TITLE# Thaw Demo

For demonstration purposes we show some of the features *Thaw* has to offer.
We begin with a table of contents that is simply included in the text file with `#TOC#`.
The table of contents will be automatically generated from the present headlines.

#TOC#

#H1# Images

Most likely when you create a document you will have something like charts or other images - depending on the document type - that you want to display.
As an example you can see a bird I took a picture of a while ago in #REF, bird-image, prefix=Image#.

#IMAGE,
src="res/bird.jpg",
caption="This is an image of a bird. I don't know which kind, since I am hardly an ornithologist. But it is fun to take pictures of those animals!",
label=bird-image
#

...

Documentation

We are currently working on documentation for the project that specifies all available features and how to use them. In the meantime, you can check out the demo example project at example/demo.

Contributing

I'd be glad if you'd like to contribute to the project. If you're interested, write me a message via email (see my GitHub profile).

Project structure

The project is organized in multiple modules that each take care of a specific part of the application:

| Module name | Folder | Description |
| --- | --- | --- |
| CLI | /cli | Command-line interface for the Thaw project. |
| Core | /core | The core module containing the document model. |
| Text | /text | Text file parsing and model. |
| Style | /style | Style file parsing and model. |
| Reference | /reference | Reference file parsing and model. |
| Info | /info | Document information (meta data, etc.) and model. |
| Typesetting | /typeset | Code related to typesetting a document. |
| Export | /export | Related to exporting a document (for example to PDF). |
| Font | /font | Helps dealing with fonts. |
| Hyphenation | /hyphenation | Module dealing with hyphenating individual words. |
| Plugin | /plugin | Plugin development resources. |
| Shared | /shared | Some shared classes. |
| Math | /math | LaTeX to MathML converter + MathML typesetting. |
| Util | /util | Some useful module-spanning utilities. |
| Code | /code | Dealing with syntax highlighting code. |
| Table | /table | Table-related logic (table model dealing with cell spans). |

thaw's Issues

Build executable files on tagged releases

Since with #13 we can build executables using JLink for each platform (when running on the platform), we can use Travis to automatically build the images and append them to the GitHub releases page for tagged commits.

Code block problems with empty lines when typing code directly

When typing code directly such as:

#CODE, '
# Headline

This is some text with emphasis applied **here**, *here* and _here_.
Even combining emphasis is allowed ***_in this example_***.
', language=java, style=manni#

it does not work and throws an exception. It works without the empty line.

Extended math support

Extend the math support by supporting more MathML elements.
This is a follow-up issue for the initial math support implemented in #16.

Tasks

  • Support units in the attribute values (for example mm, px, cm, em, ..)
  • Table support
    • <mtable> element
    • <mtr> element
    • <mtd> element
  • <menclose> support
  • <maligngroup> and <malignmark> support
  • <merror> support
  • <mfenced> support
  • <mphantom> support
  • Full <mo> support
    • Support for displaying large operators (parentheses, brackets, sum, product, ...) - Attribute largeop
    • Attribute maxsize
    • Attribute minsize
  • mstyle support

If #23 is already done before working on these tasks, the LaTeX to MathML converter needs to be adjusted as well!

Manage (add, remove) literature sources using CLI

It would be cool if we could manage literature sources (*.tdr files) using the CLI with something like thaw sources add IDENTIFIER, thaw sources remove IDENTIFIER and thaw sources list.

The CLI should then offer two possibilities:

  • Adding the source interactively (by asking for the required fields for the source type (Book, OnlineBook, Website, ...))
  • Trying to automatically fill all required fields using a DOI, ISBN or URL

Thaw editor - Web interface

Tooling is currently planned for IntelliJ IDEA or VSCode. Maybe it would be better to have a web interface for creating and editing Thaw documents (comparable to Overleaf).

Column-Layout

Allow setting the layout in the style file to support multi-column layouts.

#DATE# Thingy to show the current build date

Create a #DATE# thingy that is replaced by the current date during the document build.

With an option such as #DATE, format=yyyy-MM-dd#, the user should be able to choose the date format, given as a pattern that the SimpleDateFormat class can parse.
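
As a rough illustration of the proposal, the following sketch formats a build date with a user-supplied SimpleDateFormat pattern. The class name DateThingySketch, the render method, the fallback pattern and the fixed UTC time zone are assumptions made for this example; they are not taken from the Thaw code base.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Thaw: renders the value a #DATE# thingy could expand to.
class DateThingySketch {

    // Assumption: pattern used when the thingy has no format option.
    static final String DEFAULT_FORMAT = "yyyy-MM-dd";

    static String render(String format, Date buildDate) {
        String pattern = (format == null || format.isEmpty()) ? DEFAULT_FORMAT : format;

        // SimpleDateFormat throws IllegalArgumentException for an invalid pattern,
        // which a real implementation would probably report as a document error.
        SimpleDateFormat formatter = new SimpleDateFormat(pattern, Locale.ENGLISH);

        // Assumption: a fixed time zone keeps builds reproducible across machines.
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter.format(buildDate);
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(render("yyyy-MM-dd", now));
        System.out.println(render("dd.MM.yyyy HH:mm", now));
    }
}
```

Passing the build date in as a parameter, instead of calling new Date() inside render, keeps the helper easy to test.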

LaTeX to MathML converter

In issue #16 we introduced support for math terms and expressions embedded in the document using MathML.
It would be cool to additionally provide support for LaTeX syntax using a converter from LaTeX to MathML.

Document build watch mode

I'd like to introduce a watch mode for the CLI, where the project files are monitored for changes and the document is rebuilt whenever they change.

No hyphenation on monospaced text

Currently we allow hyphenation on monospaced text which I think is wrong.
This affects only in-line code using backticks. Code blocks are not affected.

Setup application

A setup application would be cool to easily install Thaw as well as having the possibility to tick checkboxes for dependencies that are not installed with Thaw (Syntax highlighting for example needs Python and Pygments installed). When the checkboxes are ticked the dependencies should be installed automatically.

Refined style format

The current style file format is JSON, which is not the best format. We should aim for a more CSS-like format.

TODO

  • Support a CSS-like format for styling the document
    • Support for units (mm, cm, em, pt, ...)
    • Write parser for the new format
      • Write lexer for the new format
    • Apply new format to the current model / refactor the current style model
  • Unit tests for the new parser
  • New class option that should be supported on all thingies
  • Make sure all number settings use unit conversion when they are used in code
  • Apply TOC specific styling
  • Apply new line-height settings (1.0/100% from font size as basis)
    • Allow 1.0 for line-height
    • Allow % for line-height
  • Footnote styling using the footnote {...} style block
  • Apply new enumeration settings
  • Apply background-color settings to a page or paragraph
  • Add border styles
  • Add padding and margin setting to set left, top, bottom, right in one line
  • Change documentation and math torture test *.tds file to the new format
  • Apply background and border to code paragraphs
  • Apply background and border to "normal" paragraphs
  • Apply background and border to the table of contents
  • Apply background and border to math paragraphs
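
The unit-related items in the list above could be handled by converting every length to one internal unit. The sketch below converts to PDF points (1 in = 72 pt = 25.4 mm). The class name UnitSketch, the toPoints method and the choice to treat unitless numbers as points are assumptions for illustration only. The px unit is rejected here, following the comment on margin in the example below.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Thaw: converts style lengths to points.
class UnitSketch {

    // A number followed by an optional unit; px is deliberately not accepted.
    private static final Pattern VALUE =
            Pattern.compile("^\\s*(-?\\d+(?:\\.\\d+)?)\\s*(mm|cm|in|pt|em)?\\s*$");

    static double toPoints(String raw, double fontSizeInPoints) {
        Matcher m = VALUE.matcher(raw.toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a valid length: " + raw);
        }

        double number = Double.parseDouble(m.group(1));
        // Assumption: a value without a unit (for example "margin: 0") is read as points.
        String unit = m.group(2) == null ? "pt" : m.group(2);

        switch (unit) {
            case "mm": return number * 72.0 / 25.4;
            case "cm": return number * 720.0 / 25.4;
            case "in": return number * 72.0;
            case "em": return number * fontSizeInPoints; // relative to the current font size
            default:   return number;                    // already in points
        }
    }

    public static void main(String[] args) {
        // Page width from the style example, expressed in points.
        System.out.println(toPoints("210mm", 12.0));
        System.out.println(toPoints("1.5em", 12.0));
    }
}
```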

CSS-like style format

Quick example:

document { // Styles applying to the whole document
  width: 210mm;
  height: 297mm;
  font-family: url(my-cool-font.ttf); // Can also be "font-family: Arial", or alike
  font-size: 12pt;
  font-variant: plain; // Allowed are also 'bold', 'italic', 'underlined' or even a combination of them like 'bold italic underlined'
  color: #333333; // Font color, also rgba(0.5, 0.5, 0.5, 1.0) or rgb(0.5, 0.5, 0.5) should be allowed
  font-kerning: native; // Can also be "optical"
  line-height: 1.0; // Can also be 100% or 10pt, (or another unit). When 100%/1.0 the line-height is calculated from the letter M as basis
  text-align: center; // Alignment of the text paragraphs
  text-justify: true; // Whether to justify text paragraphs
  margin: 0; // Should work like the css spec says (except for the units where we do not allow px)
  padding: 0; // See comment for margin
  background-color: #EAEAEA; // Background color of the paragraph or page (Should work like the color attribute)
  inline-code-font-family: 'Consolas'; // Special attribute to set the font-family for inline-code using backticks
}

page { // Styles applying to the current page
  header: "my-header-folder"; // Header settings for the current page
  footer: "my-footer-folder";
}

page:page(end=5) {
  // Special page selector, meaning this will apply to all pages starting from page `undefined` (first page) until page 5
  // So in this case this will apply to page 1 to 5.
  // Another example would be :page(5) which means just page 5.
  // To select the last page you would have to use :last-page, also :first-page should be available (for consistency) with a variable offset. For example :last-page(1) would mean the page before the last page.
}

paragraph {
  // Styles applying to all paragraphs (including h1, h2, image, code, ...)
}

h1, h2, h3, h4, h5, h6 { // Multiple thingy names are allowed here as well
  font-family: Calibri; // Will be applied for all the listed thingies
}

h { // Styles that apply to all headline thingies!
  numbering: "%parent-heading%.%level-counter%"; // Customizable numbering
  counter-style: decimal; // As well as upper-roman (I., II., III., ...) or lower-latin (a, b, c, d, ...), or upper-latin, or lower-roman.
  // The variable %level-counter% is determined by the counter-style
  // The variable %parent-heading% is the resulting parent heading string. e.g. the parent heading is "2.1" and the current level counter is 3, then the resulting numbering is "2.1.3".
}

h1.appendix { // Special class for appendix headlines
  numbering: "%level-counter%";
  counter-style: upper-latin;

  // Result would be something like, A, B, C, D, ..., AA, AB, ...
}

h.appendix {
  numbering: "%parent-heading%.%level-counter%";
  counter-style: lower-latin;

  // Result would be for a second-level-headline: "A.a" or "C.b", ...
}

image {
  margin-top: 5mm;
  margin-bottom: 5mm;
  border: 1px solid #000000; // Border can be applied to all paragraphs
}

image.special-class { // Special class you can specify in the text file using #IMAGE, class=special-class#
  margin-left: 5mm; // This could be useful to apply to images that are floating to the left or right
}

enumeration { // Enumeration specific styles
  margin-left: 10mm; // Indent per level
}

enumeration:level(2) { // Style per level of indentation
  list-style-type: circle; // Also allowed are 'square', 'upper-roman', 'lower-alpha', and so on (see https://www.w3schools.com/cssref/pr_list-style-type.asp)
  color: #FF0000; // Also colors should be allowed as well as the font-variant or font settings in general!
  margin: 10mm; // Also ok - why not!
}

toc { // Table of contents specific settings
  margin-left: 5mm; // Indent per level
  fill: dots; // Fill the toc-entry space between heading title and page number with dots, other options could be 'solid' (A solid line) or 'empty' // Not filling anything
}

toc:level(2) {
  // Level specific table of contents settings
  font-variant: italic;
}

style.highlighted { // This is used to format text in any way you want in a paragraph. e. g. "My normal text *.highlighted*Some highlighted text** now again normal text"
  color: #FF0000; // Red color
  font-variant: bold;
  margin: 100mm; // Things like margin that cannot be applied to inline text are ignored
}
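
To make the numbering and counter-style properties from the h blocks above more concrete, here is a minimal sketch of how a heading number could be assembled from a template. The class HeadingNumberingSketch and its methods are hypothetical and only illustrate the idea; the actual implementation may differ.

```java
import java.util.Locale;

// Hypothetical helper, not part of Thaw: builds heading numbers from a template.
class HeadingNumberingSketch {

    // Formats a 1-based counter according to the given counter-style.
    static String formatCounter(int counter, String counterStyle) {
        if (counter < 1) {
            throw new IllegalArgumentException("Counter must be >= 1");
        }
        switch (counterStyle) {
            case "decimal":     return Integer.toString(counter);
            case "upper-latin": return latin(counter, 'A');
            case "lower-latin": return latin(counter, 'a');
            case "upper-roman": return roman(counter);
            case "lower-roman": return roman(counter).toLowerCase(Locale.ROOT);
            default: throw new IllegalArgumentException("Unknown counter-style: " + counterStyle);
        }
    }

    // Bijective base 26: 1 -> A, 26 -> Z, 27 -> AA, 28 -> AB, ...
    private static String latin(int counter, char first) {
        StringBuilder sb = new StringBuilder();
        int n = counter;
        while (n > 0) {
            n--;
            sb.append((char) (first + n % 26));
            n /= 26;
        }
        return sb.reverse().toString();
    }

    // Standard subtractive roman numerals (IV, IX, XL, ...).
    private static String roman(int counter) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
        StringBuilder sb = new StringBuilder();
        int n = counter;
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    // Fills the %parent-heading% and %level-counter% variables of a numbering template.
    static String number(String template, String parentHeading, int counter, String counterStyle) {
        return template
                .replace("%parent-heading%", parentHeading)
                .replace("%level-counter%", formatCounter(counter, counterStyle));
    }

    public static void main(String[] args) {
        System.out.println(number("%parent-heading%.%level-counter%", "2.1", 3, "decimal"));
        System.out.println(number("%level-counter%", "", 28, "upper-latin"));
    }
}
```

With the inputs in main, the first call reproduces the "2.1.3" case from the comment in the h block, and the second shows how upper-latin continues after Z.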

Update README to reflect `v0.1` state

The README is outdated.

  • Represent v0.1 release just before the release
  • Link to documentation PDF (uploaded in GitHub releases)
  • Quick example with screenshot of the output PDF
  • Contribution notice (e-mail to me for more details)

Abandon font-variant property

Handling the font-family and font-variant properties is not working very well. Instead we should just be able to specify a font for each variant (font, font-bold, font-italic, font-bold-italic, font-monospaced, line-number-font)

For example:
font: Arial;
font-italic: Roboto Black Italic;

The values must be the full font name or the file of the font using url(my-font.ttf).

Math support (Simple)

We need support for a #MATH# thingy that allows entering LaTeX-formatted formulas that will be properly rendered in the document.

Maybe we want to convert LaTeX first to MathML using SnuggleTeX and then rather typeset MathML.

Investigate using MathJax as an alternative rendering method for displaying math

The standard way to display math in browsers is https://www.mathjax.org/ which is very complete and well tested.
We may investigate whether we can run it with GraalVM JavaScript to produce SVG.

When #7 is finished we should be able to include SVGs into the PDF and thus can use MathJax as an alternative to the native implementation we currently have.

For example, we could introduce a new option for the #MATH# thingy that enables the use of MathJax: #MATH, 'MY MATH CODE', renderer=MathJax#, where the allowed values are native and MathJax.

Include PDF

We need a thingy to include PDF files in the document.

VSCode support

It would be cool to have a plugin for VSCode to support Thaw projects.

Default font settings per platform

Currently either a random font family or simply the first one found on a platform is picked as the default font. It would be far more useful if we had a list of preferred fonts for each font variant on each platform (Windows, Mac, Linux), to be used where available.

Info file variables

The info file is just a simple properties file that may contain variables like:

author.name = Benjamin Eder
author.email = ...

There are two kinds of variables:

  • Predefined variables like the ones mentioned above starting with author (will be done with #51)
  • Custom variables you'd like to use in the document (for example title, due-date, ...)

The predefined ones need to be included as meta data in the generated PDF file.

The custom ones should be usable in the *.tdt file: for example #VAR, author.name# should be replaced with the defined variable value.
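
A minimal sketch of the substitution described above, assuming the info file has already been loaded into a java.util.Properties object. The class VariableSketch, the regular expression and the decision to fail on undefined variables are assumptions for this example and not Thaw's actual behaviour.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Thaw: expands #VAR, name# thingies in a text.
class VariableSketch {

    // Matches "#VAR, some.key#" and captures the key.
    private static final Pattern VAR_THINGY = Pattern.compile("#VAR,\\s*([^#\\s]+)\\s*#");

    static String substitute(String text, Properties info) {
        Matcher matcher = VAR_THINGY.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = info.getProperty(key);
            if (value == null) {
                // Assumption: an undefined variable is treated as an error.
                throw new IllegalArgumentException("Undefined variable: " + key);
            }
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        Properties info = new Properties();
        info.setProperty("author.name", "Benjamin Eder");
        System.out.println(substitute("Written by #VAR, author.name#.", info));
    }
}
```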

Background color styles

Currently the background color styles are not working.

  • Allow setting background color for each paragraph type (#CODE# included!)
  • Alternating lines background color settings for normal text paragraphs as well as #CODE#
  • Add to documentation

Support border styles

For things like horizontal rules we need to support border properties in the style file.
We currently have them defined, but they are not yet used.

    BORDER("border", new StringValueParser()),
    BORDER_TOP("border-top", new StringValueParser()),
    BORDER_BOTTOM("border-bottom", new StringValueParser()),
    BORDER_LEFT("border-left", new StringValueParser()),
    BORDER_RIGHT("border-right", new StringValueParser()),
    BORDER_COLOR("border-color", new StringValueParser()),
    BORDER_WIDTH("border-width", new StringValueParser()),
    BORDER_STYLE("border-style", new StringValueParser()),
    BORDER_RADIUS("border-radius", new StringValueParser()),

Apply all CSL (citeproc-java) bibliography styles properly

Currently we do not really apply the styles given by the CSL bibliography (using citeproc-java) to the reference list/bibliography.
Reason for that is that we still need the table thingy (#12).
We should at least apply hanging indent, entry spacing and second field align.

Fill style for headlines

Would be cool if the fill style properties like in the table of contents would also work for headlines.

Include SVG

Including SVG in the document like an image.

Maybe using Apache Batik and FOP?

Literature/Bibliography/Citation style (APA) - First working draft

Creating the first working draft of a literature file (*.tdr).

  • #CITE# Thingy to cite predefined sources
  • In the first draft creating only APA citation style
  • Creating a lot of possible source styles (We currently have Book, OnlineBook, EBook, Website, Article) -> We have to create some more

Line height reconsiderations

Currently we define "lineHeight" in the style file which depends on the font size used. For example when we define the font size to be 16pt, we have to change the lineHeight as well.

It would be better to have something like "lineSpacing: 1.0" where the value 1.0 means that the lines should be spaced as close together as possible. We can determine the maximum lineHeight by checking the height of one of the largest characters. We could for example just check the character M. That way we don't have to specify "lineHeight".

Support Markdown as input format

You might have existing Markdown documents that you want to convert into PDFs, or you might not want to use the Thaw document text format because you don't need most of its more elaborate features.

Thus we could allow specifying markdown files as text file format instead of *.tdt files.

This should be a configuration option in the *.tdi file:

TDI file

encoding = UTF-8
# Default value is tdt
text-file-format = md

When specifying md as value for the text-file-format key, Thaw should look for *.md files in the project folders instead of *.tdt files. We need to provide a separate converter to the Thaw document text format tree for it to work properly.
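
The lookup of the proposed key could be sketched as follows. The class TextFormatSketch, the enum and the helper methods are hypothetical. The sketch assumes the info file is read with java.util.Properties, which only treats lines starting with # or ! as comments, so a comment placed after a value on the same line would become part of that value.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical helper, not part of Thaw: resolves the text file format of a project.
class TextFormatSketch {

    enum TextFileFormat {
        TDT("tdt"),
        MD("md");

        final String extension;

        TextFileFormat(String extension) {
            this.extension = extension;
        }
    }

    // Reads the proposed text-file-format key and falls back to tdt when it is absent.
    static TextFileFormat resolve(Properties info) {
        String raw = info.getProperty("text-file-format", "tdt").trim().toLowerCase(Locale.ROOT);
        for (TextFileFormat format : TextFileFormat.values()) {
            if (format.extension.equals(raw)) {
                return format;
            }
        }
        throw new IllegalArgumentException("Unsupported text-file-format: " + raw);
    }

    // Decides whether a file in the project folder is a text file for the chosen format.
    static boolean isTextFile(String fileName, TextFileFormat format) {
        return fileName.toLowerCase(Locale.ROOT).endsWith("." + format.extension);
    }

    public static void main(String[] args) {
        Properties info = new Properties();
        info.setProperty("text-file-format", "md");
        TextFileFormat format = resolve(info);
        System.out.println(format + " " + isTextFile("chapter-1.md", format));
    }
}
```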

TODO

  • Introduce new key text-file-format and values md, tdt to the Thaw info model
  • Implement converter from markdown files to the Thaw document text format tree
    • Use existing markdown parser library to provide the extension faster
    • Map the HTML tree that the Markdown parser probably produces to the Thaw document text format tree
    • Unit tests
  • Update documentation
