typeset's Introduction

Typeset

Typeset is an HTML pre-processor for web typography. It provides typographic features that are traditional in fine printing but remain unavailable to browser layout engines, such as hanging punctuation, soft hyphenation, and small caps (the full list appears under Features).

Typeset does not require any client-side JavaScript and uses less than a kilobyte of CSS. Processed HTML & CSS works in Internet Explorer 5 and without any CSS. Typeset can be used manually or as a plugin for Grunt and gulp.


Getting Started

Install

$ npm i typeset

Usage

const typeset = require('typeset');
let html = '<p>"Hello," said the fox.</p>';
let output = typeset(html);

CSS

Then tweak the CSS to match the metrics of your font and include it on your page.

/*
 Small Capitals
 https://en.wikipedia.org/wiki/Small_caps 
*/

.small-caps {font-variant: small-caps;}

/*
 Optical margin alignment for particular letters 
 https://en.wikipedia.org/wiki/Optical_margin_alignment
*/

.pull-T, .pull-V, .pull-W, .pull-Y {margin-left: -0.07em}
.push-T, .push-V, .push-W, .push-Y {margin-right: 0.07em}

.pull-O, .pull-C, .pull-o, .pull-c {margin-left: -0.04em}
.push-O, .push-C, .push-o, .push-c {margin-right: 0.04em}

.pull-A {margin-left: -0.03em}
.push-A {margin-right: 0.03em}

/* 
 Quotation mark 
 https://en.wikipedia.org/wiki/Quotation_mark
*/

/* Single quotation marks (') */
.pull-single{margin-left:-.27em}
.push-single{margin-right:.27em}

.pull-double, .push-double,
.pull-single, .push-single {display: inline-block}

/* Double quotation marks (") */
.pull-double{margin-left:-.46em}
.push-double{margin-right:.46em}

Options

You can pass an options object to influence how your HTML is typeset:

const options = {
  ignore: '.skip, #anything, .which-matches', // string of CSS selector(s) to ignore
  only: '#only-typeset, .these-elements', // string of CSS selector(s) to exclusively apply typeset to
  disable: ['hyphenate'] // array of typeset feature(s) to disable
};

Features

The following features may be disabled:

  • hyphenate
  • hangingPunctuation
  • ligatures
  • punctuation
  • quotes
  • smallCaps
  • spaces

CLI Usage

$ npm i -g typeset
Usage: typeset-js [options] [<infile> [<outfile>]]

Options:

  -h, --help      output usage information
  -V, --version   output the version number
  -i, --ignore    string of CSS selector(s) to ignore
  -O, --only      string of CSS selector(s) to exclusively apply typeset to
      --disable   string of typeset feature(s) to disable, separated by commas

Examples:

Compile a file and print it to stdout:

$ typeset-js inputFile.html

To create an output file, just add a second argument:

$ typeset-js inputFile.html outputFile.html

Use the --ignore option to ignore specific CSS selectors:

$ typeset-js inputFile.html outputFile.html --ignore ".some-class, h3"

Use the --disable option to disable typeset features:

$ typeset-js inputFile.html outputFile.html --disable "hyphenate,ligatures"

CLI redirections:

$ cat index.html | typeset-js > outputFile.html
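
The --disable value is a comma-separated string, while the options object expects an array. A minimal sketch of that conversion follows; parseDisableOption and FEATURES are hypothetical names used for illustration, and this is not necessarily how typeset-js parses the flag internally.

```javascript
// Feature names copied from the "Features" list above.
const FEATURES = [
  'hyphenate', 'hangingPunctuation', 'ligatures',
  'punctuation', 'quotes', 'smallCaps', 'spaces'
];

// Hypothetical helper: turns "hyphenate,ligatures" into ['hyphenate', 'ligatures'].
function parseDisableOption(value) {
  const names = value
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean); // drop empty entries such as a trailing comma

  const unknown = names.filter((name) => !FEATURES.includes(name));
  if (unknown.length > 0) {
    throw new Error('Unknown feature(s): ' + unknown.join(', '));
  }
  return names;
}

const disable = parseDisableOption('hyphenate, ligatures');
```

The resulting array has the same shape as the disable key of the options object shown under Options.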

Plugins


Support

If you don't find the answer to your problem in our docs, or have a suggestion for improvements, submit your question here.


License

This software is dedicated to the public domain and licensed under Creative Commons Zero. See the LICENSE file for details.


To Do:

typeset's People

Contributors

danielhaim1, davidmerfield, faheempatel, gabrielperales, jonathanzong, kevinschaul, lucasconstantino, lydell, snyk-bot

typeset's Issues

add .pull-T etc will mess up kerning

.pull-T and .push-T and other pulled letters do more harm than good. The only one needed here is the pull quote, in my opinion
Before adding pull-T push-T: [image]
After adding pull-T push-T: [image]

add readme instruction on using it in browser

Hi, can you add a small section in the readme about how to use it in the browser? So far I found that I can do typeset(jq("p"));

can I do this?

    typeset(jq("p"), {
      disable: ['hyphenate'], // array of features to disable
    });

Recommendation against use of for...in for array iteration.

Howdy, love the library.

Running into some issues regarding your use of the for...in statement for array iteration. for...in is for enumerable properties of collections, not iteration through iterables (e.g. Arrays). I'd recommend using array.forEach(), for...of (broadly supported by Node >= 0.12, dunno about browser compat.) or a standard for loop, which is obviously more verbose but more appropriate than for...in.

for...in will throw when some dingus (not me, but there are lots of dinguses out there) modifies the Array prototype, because for...in will kick back any additional enumerable properties, not just the elements of the array.

Copy-pasta'd some relevant code from the MDN demonstrating the issue.

Object.prototype.objCustom = function () {}; 
Array.prototype.arrCustom = function () {};

let iterable = [3, 5, 7];
iterable.foo = "hello";

for (let i in iterable) {
  console.log(i); // logs 0, 1, 2, "foo", "arrCustom", "objCustom"
}

for (let i of iterable) {
  console.log(i); // logs 3, 5, 7
}

Add Array.prototype.foo = () => 'bar' to index.js and run the test suite. It'll run through all the places where that needs to be changed.

Cheers!
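
To make the recommendation concrete, here is a small sketch of index-based iteration that stays correct when the Array prototype has been extended. eachElement is a hypothetical helper, not a function from typeset's source.

```javascript
// Hypothetical helper: visits array elements only, by numeric index.
function eachElement(list, visit) {
  // An index loop never sees enumerable properties added to
  // Array.prototype or to the array object itself.
  for (let i = 0; i < list.length; i++) {
    visit(list[i], i);
  }
}

// Simulate a third party extending the prototype.
Array.prototype.arrCustom = function () {};

// for...in yields keys, including the inherited enumerable one.
const seenForIn = [];
for (const key in ['a', 'b']) {
  seenForIn.push(key);
}

// The index loop yields only the two elements.
const seenSafe = [];
eachElement(['a', 'b'], (value) => seenSafe.push(value));

delete Array.prototype.arrCustom; // undo the simulated extension
```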

Black magic

This lib looks great! But I believe in neither fairy tales nor magic.

I won't pass my content in a "black box" that magically promises enhancements.

Could you explain (in a blog post?) exactly what problem your library solves, and how?

Thanks

Markdown

I imagine that Typeset processing would work well with raw markdown. It seems so, but maybe I'm wrong.

Is there something I should be worried about if preprocessing markdown with Typeset? (Maybe something that would break markdown -- the \n\n---\n\n that represents an <hr>, for example.)

Custom class names

Add option to change the class name for the elements that this library creates.

Cheerio Dependency

How integral is cheerio to the problem you're solving? What is its biggest benefit?

language option

Is there a way to specify the language in the case of processing HTML fragments?

Types of options

This is more of a pet peeve than anything but in my opinion the ignore option should be represented by an array of selectors instead of one huge selector string, reason being that it is more "intuitive" - one would usually expect to ignore multiple selector-strings instead of just one.

So I think the better approach would be to use an Array and join the array when doing the selection/ignorance(?). But then again if you're going into the direction of cutting down the code size I'm totally fine with that.
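
A rough sketch of the suggested approach: accept either form and join an array into one selector string before selecting. normalizeSelectors is a hypothetical name, and typeset's documented ignore and only options currently take a single string.

```javascript
// Hypothetical helper: accepts a selector string or an array of selectors
// and returns one comma-separated selector string.
function normalizeSelectors(value) {
  if (value == null) {
    return ''; // option not provided
  }
  if (Array.isArray(value)) {
    return value
      .map((selector) => selector.trim())
      .filter(Boolean) // skip empty entries
      .join(', ');
  }
  return String(value).trim();
}

const ignore = normalizeSelectors(['.skip', '#anything', '.which-matches']);
```

Because a string passes through unchanged, existing callers that supply one selector string would keep working under this scheme.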

Command-line tool

It would be great to have a command line tool to use typeset as a post-processor (bearing in mind that typeset is sometimes a reserved shell command).

En-dash for number ranges

I was planning to add a line in punctuation.js to replace a hyphen between two numbers, without spaces, with an en-dash ("14-31 days" becomes "14&ndash;31 days").

It looks like for ranges with a space, we're expecting it to be replaced with an em-dash. I'm unsure if this is by design but the test HTML (and the Chicago style) calls for an en-dash. Should this be corrected or is there a different rule for ranges with spaces that I'm missing?

Thanks!
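
One possible shape for the no-space rule, as a sketch only: enDashRanges is a hypothetical function, it emits the en-dash character (U+2013) rather than the &ndash; entity, and it is not the code in punctuation.js.

```javascript
// Hypothetical rule: a hyphen directly between two digits becomes an en-dash.
function enDashRanges(text) {
  // (\d) captures the digit before the hyphen; the lookahead (?=\d)
  // requires a digit after it without consuming it, so chained
  // ranges like 1-2-3 are all converted.
  return text.replace(/(\d)-(?=\d)/g, '$1\u2013');
}

const range = enDashRanges('14-31 days');
```

Note that a rule this simple would also rewrite ISO dates and phone numbers such as 2019-10-01, so a real implementation would probably need exclusions.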

Usage with webpack?

For the life of me, I can't figure out how to import this as a dependency using webpack. What am I missing?

Cheerio is a dev dependency, should be dependency

Cheerio should be a dependency, not a dev dependency:

> Cannot find module 'cheerio'
Require stack:
- /Users/anandchowdhary/Projects/open-source/anandchowdhary.com/node_modules/typeset/src/eachTextNode.js
- /Users/anandchowdhary/Projects/open-source/anandchowdhary.com/node_modules/typeset/src/index.js

Build file

The build.js script does not seem to produce any .js file and hangs after outputting "done". It also depends on numerous non-listed Node modules.

js build.js
BUILDING!
DONE!
[Have to type Ctrl-C to exit]

`smallCaps` is escaping HTML tags in the browser

Trying to run Typeset from the browser, with jQuery, I got this string:

<p>Yjarni Sigurðardóttir spoke to NATO from Iceland yesterday: "Light of my life, fire of my florins -- my sin, my soul. The tip of the tongue taking a trip to 118° 19' 43.5"."</p>

  <p>"She's faster than a 120' 4" whale." <em>Piña co&shy;ladas</em> were widely consumed in Götterdämmerung from 1880–1912. For the low price of $20 / year from Ex&shy;hi&shy;bits A–E... Then the <em>duplex</em> came forward. "Thrice the tower, he mounted the round gunrest, 'awaking' HTML. He can print a fixed num&shy;ber of dots in a square inch (for in&shy;stance, 600 × 600)."
  </p>

turned into this:

<p>&lt;span class="pull-Y"&gt;Y&lt;/span&gt;jarni Sigurðardót&shy;tir spoke to &lt;span&lt;span class="push-c"&gt;&lt;/span&gt; &lt;span class="pull-c"&gt;c&lt;/span&gt;lass="small-caps"&gt;NATO&lt;/span&gt; from Ice&shy;land yes&shy;ter&shy;day:&lt;span class="push-double"&gt;&lt;/span&gt; &lt;span class="pull-double"&gt;“&lt;/span&gt;Light&lt;span class="push-o"&gt;&lt;/span&gt; &lt;span class="pull-o"&gt;o&lt;/span&gt;f my life, fire&lt;span class="push-o"&gt;&lt;/span&gt; &lt;span class="pull-o"&gt;o&lt;/span&gt;f my florins&amp;thinsp;&amp;mdash;&amp;thinsp;my sin, my soul.&lt;span class="push-T"&gt;&lt;/span&gt; &lt;span class="pull-T"&gt;T&lt;/span&gt;he tip&lt;span class="push-o"&gt;&lt;/span&gt; &lt;span class="pull-o"&gt;o&lt;/span&gt;f the tongue tak&shy;ing a trip to 118° 19′ 43.5″.”</p>

  <p>&lt;span class="pull-double"&gt;“&lt;/span&gt;She’s faster than a 120′ 4″&lt;span class="push-w"&gt;&lt;/span&gt; &lt;span class="pull-w"&gt;w&lt;/span&gt;hale.” <em>Piña&lt;span class="push-c"&gt;&lt;/span&gt; &lt;span class="pull-c"&gt;c&lt;/span&gt;o&shy;ladas</em> &lt;span class="pull-w"&gt;w&lt;/span&gt;ere&lt;span class="push-w"&gt;&lt;/span&gt; &lt;span class="pull-w"&gt;w&lt;/span&gt;idely&lt;span class="push-c"&gt;&lt;/span&gt; &lt;span class="pull-c"&gt;c&lt;/span&gt;on&shy;sumed in Göt&shy;ter&shy;däm&shy;merung from 1880–1912. For the low price&lt;span class="push-o"&gt;&lt;/span&gt; &lt;span class="pull-o"&gt;o&lt;/span&gt;f $20&#8202;/&#8202;year from Ex&shy;hi&shy;bits&lt;span class="push-A"&gt;&lt;/span&gt; &lt;span class="pull-A"&gt;A&lt;/span&gt;–E…&lt;span class="push-T"&gt;&lt;/span&gt; &lt;span class="pull-T"&gt;T&lt;/span&gt;hen the <em>du&shy;plex</em> &lt;span class="pull-c"&gt;c&lt;/span&gt;ame for&shy;ward.&lt;span class="push-double"&gt;&lt;/span&gt; &lt;span class="pull-double"&gt;“&lt;/span&gt;Thrice the tower, he mounted the round gun&shy;rest,&lt;span class="push-single"&gt;&lt;/span&gt; &lt;span class="pull-single"&gt;‘&lt;/span&gt;awak&shy;ing’ &lt;span&lt;span class="push-c"&gt;&lt;/span&gt; &lt;span class="pull-c"&gt;c&lt;/span&gt;lass="small-caps"&gt;HTML&lt;/span&gt;. He&lt;span class="push-c"&gt;&lt;/span&gt; &lt;span class="pull-c"&gt;c&lt;/span&gt;an print a fixed num&shy;ber&lt;span class="push-o"&gt;&lt;/span&gt; &lt;span class="pull-o"&gt;o&lt;/span&gt;f dots in a square inch (for in&shy;stance, 600&#8202;×&#8202;600).”
  </p>

I investigated the sequential calling of modules that continuously transform the initial text, and noticed that the HTML tags were not escaped by "quotes", "hyphenate" or "ligatures", but only after "smallCaps".

I couldn't understand why.

Punctuation substitution issue with inline elements

An example to illustrate the issue (note the quotation marks):

Input:

   <p>How about "<a href="/foo">that</a>" said the old man.</p>

Output:
How about ”that“ said the old man.

Expected:
How about “that” said the old man.

Why does this happen?

This is because each text node is handled individually. In the example, the three text nodes are How about ", that and " said the old man..

Solutions

Compute the text content of block elements (p tags, blockquotes) and run the substitution on that, instead of on the nodes individually?
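
A sketch of that idea, under stated assumptions: the block's text nodes are represented as an array of strings, the substitution is length-preserving (one straight quote becomes one curly quote), and curlDoubleQuotes is a deliberately naive stand-in for typeset's real quote rules. Both function names are hypothetical.

```javascript
// Naive stand-in rule: a double quote opens at the start of the text or
// after whitespace or an opening bracket, and closes otherwise.
function curlDoubleQuotes(text) {
  return text.replace(/"/g, (match, offset) => {
    const previous = text[offset - 1];
    const opening = offset === 0 || /[\s(\[{]/.test(previous);
    return opening ? '\u201C' : '\u201D';
  });
}

// Join the text nodes of one block, substitute once over the whole string,
// then cut the result back into pieces of the original lengths.
function curlAcrossNodes(chunks) {
  const curled = curlDoubleQuotes(chunks.join(''));
  let start = 0;
  return chunks.map((chunk) => {
    const piece = curled.slice(start, start + chunk.length);
    start += chunk.length;
    return piece;
  });
}

const pieces = curlAcrossNodes(['How about "', 'that', '" said the old man.']);
```

Splitting by original length only works while every replacement keeps the character count unchanged; substitutions that grow or shrink the text would need an offset map instead.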

Add support for avoiding widows

I would like to avoid the last 2 words in a paragraph from appearing on the last line by themselves.

For now I've been hacking around this by manually wrapping the last few words in an element with a class that typeset is configured to ignore (to avoid soft hyphens) and manually replacing the spaces between those words with &nbsp;.

Was this functionality intentionally omitted? If not, I'm happy to take a shot at implementing it myself.

Request: Widow control (widont)

This is a great library! Thanks so much for maintaining it.

I'd love to see an option for controlling widows. For instance typogr.js uses the "widont" pattern of replacing the space between the last two words in a block with &nbsp;.

(Typographically a widow is actually a single line on a wrapped paragraph rather than a single word on a wrapped line, but on the web the usage seems to be mostly about not leaving a single word, since multicolumn layouts are less common.)
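
The "widont" pattern can be sketched in a few lines. This widont function is an illustration written for this thread, not typogr.js's implementation, and it works on plain text rather than on HTML.

```javascript
// Replace the whitespace before the last word with a non-breaking space
// (U+00A0), so the last two words always wrap together.
function widont(text) {
  // \s+(\S+)\s*$ matches the final whitespace run that is followed by a
  // word and then only optional trailing whitespace.
  return text.replace(/\s+(\S+)\s*$/, '\u00A0$1');
}

const joined = widont('one two three');
```

A single-word string has no matching whitespace and is returned unchanged.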

Kerning pairs?

Add option to wrap a sequence of characters in a node which could be targeted? This would not be default behaviour and is naturally dependent on the typeface e.g.

typeset("Ave, Imperator, morituri te salutant", {kern: ["av", "at"]}) 
.kern-av {letter-spacing: 0.96em}
.kern-at {letter-spacing: 0.98em}

Abbreviations?

A cool feature would be to automatically encapsulate instances of manually configured—and perhaps predefined—abbreviations with <abbr title="…"></abbr>.
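
As a rough illustration of the request, the sketch below wraps configured abbreviations found in plain text. wrapAbbreviations and its abbreviations map are hypothetical; typeset has no such option in the documentation above, and a real feature would need to run per text node to avoid touching tag names or attributes.

```javascript
// Escape characters that are special inside a regular expression.
function escapeRegExp(source) {
  return source.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Escape characters that would break a double-quoted attribute value.
function escapeAttribute(value) {
  return value.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// Hypothetical feature: wrap each whole-word match in <abbr title="...">.
function wrapAbbreviations(text, abbreviations) {
  const keys = Object.keys(abbreviations);
  if (keys.length === 0) {
    return text;
  }
  const pattern = new RegExp('\\b(' + keys.map(escapeRegExp).join('|') + ')\\b', 'g');
  return text.replace(pattern, (match) => {
    const title = escapeAttribute(abbreviations[match]);
    return `<abbr title="${title}">${match}</abbr>`;
  });
}

const wrapped = wrapAbbreviations('NATO met.', {
  NATO: 'North Atlantic Treaty Organization'
});
```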

Screen readers cannot read output

Screen readers stumble on something like <span>a</span>t, reading it as 'A T' instead of 'at'.

This makes the optical margin alignment and hanging punctuation output unusable for people using screen readers.

We should be able to solve this using:

<p aria-label="at"><span>a</span>t</p>

Mid-paragraph push/pull spans not working

Hanging punctuation no longer works mid paragraph:

image

Captured on Chrome Version 78.0.3904.70 (Official Build) (64-bit)

Here's the breaking change to fix this courtesy of MB:

.pull-double, 
.push-double,
.pull-single,
.push-single {
display: inline-block
}

Hanging punctuation inside em inside a

Input:

<small><em>"A"</em></small>

Output:
<small><em><span class="pull-double">“</span>A”</em></small>

Expected:
<span class="push-double"></span><small><em><span class="pull-double">“</span>A”</em></small>

Typeset removing closing slash on void (singleton) elements

Void elements, or singletons, like img, hr, br and others contain a closing forward slash in (X)HTML validation, e.g. <img src="foo.jpg" />.

When Typeset processes content with HTML, it is removing those closing slashes, e.g. rendering the above as <img src="foo.jpg">.

Closing slashes are of course optional, but:

a) I don't think Typeset should be messing with tag syntax in the first place.
b) In HTML emails, using closing slashes is recommended for cross-email compatibility in all their crappy rendering engines.
c) In my case, I'm using MJML, which uses void/singleton elements for things like mj-image -- and in that case a tag without a closing slash isn't valid.

My specific use might be an edge case, but I'm sure I'm not the only one formatting HTML emails.

Can Typeset avoid changing HTML tags?

Option to enable/disable certain features

Lovely tool. Great work.

It might be nice to be able to enable or disable certain features such as hanging punctuation and hyphenation.

In my first attempted use of the tool, I found the hyphenation to be distracting for the particular type size and style I was using. There are probably cases where hanging punctuation is unnecessary.

I'm happy to work on a quick implementation of this to get the conversation rolling.

hanging punctuation for right-aligned blocks?

As part of my research on #52 I also studied whether the push/pull technique could be applied to blocks with text-align:right (that is, the punctuation would hang over the right margin instead of the left). I haven’t gotten it to work yet. It may be too complicated to be worthwhile. What I tried:

  • In the processor script, closing punctuation would need to be wrapped similar to opening punctuation, though in opposite order: the pull would precede the push. Also, the wrapping tag for the closing punctuation would want to be distinct from the opening (e.g., push-open and push-closed rather than just push). This part seemed tractable.

  • In the CSS, however, I couldn’t come up with a way of styling the push-closed and pull-closed to get analogous behavior with the usual push/pull pairs. The usual idea is that in the middle of a line, the two appear together, but at a line break, the push remains at the end of one line, the line break happens, and the pull appears at the beginning of the next. On the right edge, the pull would happen first, at the end of the line, and then assumedly you’d want the push to wrap to the next.

  • More troublesome still, I couldn’t come up with a way to toggle this behavior purely with CSS. That is, in any given text block, the text is either aligned left or right (or neither) so only the opening push/pull pairs or closing push/pull pairs should work (or neither). But there isn’t any way to write a CSS selector conditioned on the presence of another CSS property. If every right-aligned block was guaranteed to have, say, class="right", then you can have CSS selectors like .right .push-closed and so on. But that requires the right-alignment to be encoded at “compile time”, rather than strictly in the CSS (where it should be).

Consolidate eachTextNode

Currently each typographic feature in the library runs src/eachTextNode.js. This repetition is inefficient.

We should instead compose a function of all the modules that need to modify text nodes, then pass that function to eachTextNode once.
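
The composition step might look like the sketch below. composeTransforms is a hypothetical name, the two sample transforms are stand-ins for typeset's modules, and the (text, node) signature is assumed from the doThis(childNode.data, childNode) call quoted in another issue.

```javascript
// Build one function that applies every text transform in order, so the
// tree only needs to be walked once.
function composeTransforms(transforms) {
  return (text, node) =>
    transforms.reduce((current, transform) => transform(current, node), text);
}

// Stand-in transforms for illustration only.
const ellipsis = (text) => text.replace(/\.\.\./g, '\u2026');
const emDash = (text) => text.replace(/--/g, '\u2014');

const combined = composeTransforms([ellipsis, emDash]);
const result = combined('Wait... no -- yes', null);
```

Order matters here: each transform receives the output of the previous one, so modules that depend on earlier substitutions must come later in the array.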

HTML Entities are turned into actual characters

I have HTML with entities like &lt; and &gt; in it and when I pass this HTML through typeset, they are replaced with the actual < and > characters and so my HTML comes up incorrect. Check out the following example:

console.log(typeset(`
<!doctype html>
<html lang="en">

<p>Hello &lt;there&gt; you!</p>
`.trim()));

This produces the following output:

<!DOCTYPE html><html lang="en"><head></head><body><p>Hello <there> you!</there></p></body></html>

The &lt; and &gt; around the word there have now turned it into a <there> tag, with a closing tag as well just before </p> in the final output.

Here's a REPL demonstrating this test case: https://repl.it/@sharat87/BigheartedDefiantCones

Thank you very much for your work.
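
One way to think about a fix, as an assumption rather than a confirmed diagnosis: text that the parser has decoded needs to be re-escaped before it is written back into markup. escapeText below is a hypothetical helper showing that step in isolation.

```javascript
// Re-escape the characters that are significant in HTML text content.
// The ampersand must be handled first so that the entities produced by
// the later replacements are not escaped a second time.
function escapeText(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const safe = escapeText('Hello <there> you!');
```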

Collab with Normalize-OpenType.css

Hey! Thanks for the shout-out in the README. I maintain Normalize-OpenType.css.

If you are thinking of incorporating any features, let me know. I’d be happy to help. I have also been using and contributing very small pieces to Typogr.js for a while, which is in a very similar space. Perhaps having a consensus on a class name or tag between all three for small caps would be a start.

I have been partial to using abbr tags for small caps, but I can understand the appeal of just using .caps or .small-caps.

I would also recommend dropping the ligature support, since these can be handled entirely with CSS now. My understanding is that ﬁ and ﬂ as single glyphs are kind of a deprecated or frowned-upon part of Unicode, since OpenType and font-feature-settings take care of that now while preserving the f, l, and i as separate characters.

Let me know what you think! Nice job on the library so far, always nice to find other people that care this much about this stuff.

Assorted quote substitution bugs

language switch changes rules?

Found three small issues, all related to different behaviour when processing greek characters, instead of latin.

Problem 1: single quote before space

         correct             wrong
input    english' english    ελληνικά' ελληνικά
output   english’ english    ελληνικά′ ελληνικά

Problem 2: double quotes before full stop

         correct       wrong
input    "english".    "ελληνικά".
output   “english”.    “ελληνικά“.

Problem 3: double quotes before comma

         correct       wrong
input    "english",    "ελληνικά",
output   “english”,    “ελληνικά“,

`childNode.data` is undefined

I'm testing the client version from Chrome, using jQuery:

Sometimes this line doesn't work because .data is undefined.

        childNode.data = doThis(childNode.data, childNode);

When it happens everything fails.
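
A defensive version of that line could skip nodes whose data is not a string. This is only a sketch: transformTextNode is a hypothetical wrapper, and why .data is sometimes undefined in the browser build is not established here.

```javascript
// Apply doThis only when the node actually carries string data.
// Returns true when the node was transformed, false when it was skipped.
function transformTextNode(childNode, doThis) {
  if (typeof childNode.data !== 'string') {
    return false; // leave the node untouched instead of failing
  }
  childNode.data = doThis(childNode.data, childNode);
  return true;
}

// Plain objects standing in for DOM nodes.
const textNode = { data: 'hello' };
const oddNode = {};
const upper = (text) => text.toUpperCase();

const changed = transformTextNode(textNode, upper);
const skipped = transformTextNode(oddNode, upper);
```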
