mundschenk-at / php-typography

A PHP library for improving your web typography.

License: GNU General Public License v2.0

PHP 97.81% HTML 2.19%
php typography hyphenation smartquotes css-hooks composer-package

php-typography's Introduction

PHP-Typography

A PHP library for improving your web typography:

  • Hyphenation — over 50 languages supported
  • Space control, including:
    • widow protection
    • gluing values to units
    • forced internal wrapping of long URLs & email addresses
  • Intelligent character replacement, including smart handling of:
    • quote marks (‘single’, “double”)
    • dashes ( – )
    • ellipses (…)
    • trademarks, copyright & service marks (™ ©)
    • math symbols (5×5×5=5³)
    • fractions (¹⁄₁₆)
    • ordinal suffixes (1st, 2nd)
  • CSS hooks for styling:
    • ampersands,
    • uppercase words,
    • numbers,
    • initial quotes & guillemets.

Requirements

  • PHP 7.4.0 or above
  • The mbstring extension

Installation

The best way to use this package is through Composer:

$ composer require mundschenk-at/php-typography
$ vendor/bin/update-iana.php

Basic Usage

  1. Create a Settings object and enable the fixes you want.
  2. Create a PHP_Typography instance and use it to process HTML fragments (or whole documents) using your defined settings.

$settings = new \PHP_Typography\Settings();
$settings->set_hyphenation( true );
$settings->set_hyphenation_language( 'en-US' );

$typo = new \PHP_Typography\PHP_Typography();

$hyphenated_html = $typo->process( $html_snippet, $settings );

Roadmap

Please have a look at the ROADMAP file for upcoming releases.

License

PHP-Typography is licensed under the GNU General Public License 2 or later - see the LICENSE file for details.

php-typography's People

Contributors

jeffreydking, melindrea, mundschenk-at, scrutinizer-auto-fixer

php-typography's Issues

Not displaying text after ,<br>

Describe the bug

I have a redactor field that has the following text in it:
<p>Test,<br>test,</p><p>test,<br>test,<br>test,</p><p>test</p>

In my twig file I have the following:

{% set paragraphs = block.text | split('</p>') %}
{% for para in paragraphs %}
	{{ para | typogrify }}
{% endfor %}

The output, as HTML:

<p>Test,<br></p>
<p>test,<br><br></p>
<p>test</p>
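
One detail of the template above may matter here: splitting on '</p>' strips every closing tag, so each fragment handed to the filter opens a <p> that is never closed. Whether that is the actual cause of the missing text is not confirmed in this report. As a minimal sketch of an alternative, the PHP below extracts complete paragraph elements instead. split_paragraphs() is a hypothetical helper, not part of PHP-Typography or of any Twig plugin.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return each complete <p>...</p> element as its own
 * fragment, so every fragment stays well-formed HTML.
 */
function split_paragraphs(string $html): array
{
    // Non-greedy match from an opening <p ...> to the nearest closing </p>.
    // Assumes paragraphs are not nested, which HTML does not allow anyway.
    preg_match_all('/<p\b[^>]*>.*?<\/p>/is', $html, $matches);

    return $matches[0];
}

$html = '<p>Test,<br>test,</p><p>test,<br>test,<br>test,</p><p>test</p>';

foreach (split_paragraphs($html) as $paragraph) {
    echo $paragraph, PHP_EOL;
}
```

Each fragment could then be passed to the typography filter on its own, with its closing tag intact.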

Custom hyphenation configs

I stumbled upon a problem with French language and this led me to the idea that it would be nice to allow custom values to be passed to the hyphenation.

Hyphenating this text: Avec le Valais, on ne sait jamais à quoi s'attendre
Results in this: Avec le Valais, on né sait jamais à quoi s'attendre
Notice the né vs. the ne: an accent has been added above the e.

My client says this is not correct (and since I do not remember my French, I take his word for it).

Is there any way to pass custom hyphenation configuration to the hyphenator?

The same question was posted at: https://www.drupal.org/project/twig_typography/issues/2950828

Allow injection of <body> classes

Allowing classes to be added to the <body> element generated during process calls can make the "ignore classes" settings work more intuitively for some applications.

Simplify settings methods

The setter methods of the Settings class should be more straightforward to use and only take arrays where appropriate. (A set of helper functions could be provided to convert formatted strings to arrays.)

&nbsp; appearing before ? and !

Using the default setup, it appears that a &nbsp; is being inserted before ? and ! characters. What is the setting to disable this?

Adjustable list of "apostrophe exceptions"

From @Jellby on April 25, 2018 16:54

Issue Overview

When a word starts with an apostrophe (like '60s, 'em), the apostrophe is rendered as an opening quote (instead of a closing one). This is with version 5.3.4, just installed.

Steps to Reproduce (for bugs)

Just write the above examples with "smart quotes" enabled.

Expected Behavior

Ideally, the apostrophe should be rendered as a closing single quote.

Current Behavior

The apostrophe appears as an opening single quote.

Possible Solution

I know there's no bulletproof way of identifying apostrophes short of language recognition, but there could be some markup-like way of forcing an opening or closing quote, or a list of words that should get an apostrophe... If there is something like that, I didn't find it.

Copied from original issue: mundschenk-at/wp-typography#214
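
The "list of words" idea from the Possible Solution section can be sketched as a pre-processing step that runs before any smart-quote handling. This is only an illustration: fix_leading_apostrophes() and the word list are hypothetical, and the sketch assumes (without having verified it) that a later smart-quotes pass leaves an existing U+2019 character untouched.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: replace a straight apostrophe with U+2019 when it
 * directly precedes one of the listed words (given without the apostrophe).
 */
function fix_leading_apostrophes(string $text, array $words): string
{
    if ($words === []) {
        return $text;
    }

    $alternatives = implode('|', array_map(
        static fn(string $word): string => preg_quote($word, '/'),
        $words
    ));

    // The apostrophe must not follow a letter or digit (so "don't" is left
    // alone), and the listed word must end right after the match.
    $pattern = "/(?<![\\p{L}\\p{N}])'(?=(?:{$alternatives})(?![\\p{L}\\p{N}]))/ui";

    return preg_replace($pattern, "\u{2019}", $text) ?? $text;
}

echo fix_leading_apostrophes("Back in the '60s we told 'em.", ['60s', 'em']), PHP_EOL;
```

A fixed word list cannot cover every case, as the report itself notes, but it keeps the exceptions adjustable by the caller.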

Generalize glyph remappings

Currently, there exists the specific remapping of the NARROW NO-BREAK SPACE to the regular NO-BREAK SPACE due to problems with browser compatibility. This mechanism should be extended to allow for other re-mappings (like the HYPHEN to the HYPHEN-MINUS).
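
A generalized remapping could be as simple as a lookup table applied in one pass. The sketch below is an assumption about what such a mechanism might look like, not the library's implementation; GLYPH_REMAPPINGS and remap_glyphs() are hypothetical names, and the two entries are the examples named above.

```php
<?php
declare(strict_types=1);

// Hypothetical remapping table: source glyph => replacement glyph.
const GLYPH_REMAPPINGS = [
    "\u{202F}" => "\u{00A0}", // NARROW NO-BREAK SPACE -> NO-BREAK SPACE
    "\u{2010}" => "\u{002D}", // HYPHEN -> HYPHEN-MINUS
];

function remap_glyphs(string $text, array $map = GLYPH_REMAPPINGS): string
{
    // strtr() with an array replaces all keys in a single pass, so the
    // output of one mapping is never remapped again by another entry.
    return strtr($text, $map);
}

echo remap_glyphs("10\u{202F}km, north\u{2010}east"), PHP_EOL;
```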

Add Release Roadmap

First, thank you for your work, I very much appreciate your contributions.

I have a small suggestion about the versioning that you're doing; typically we'll have something like this in our composer.json:

    "require": {
        "mundschenk-at/php-typography": "^5.0"
    },

So whenever you update the major version number (as recently happened to 6.0), that means that we will no longer receive updates.

This can be further complicated if we have a larger project where there are components or plugins that may also require the library, and thus become incompatible when one asks for ^5.0 and the other ^6.0

So it might be helpful if the major version in the semver is only changed when there are very large, breaking changes in the library. So for instance, if php-typography stayed at 6.x.x for a very long time (until there were major breaking changes to the API or major breaking architectural changes), that'd probably be ideal from an application developer's POV.

Just my two cents!

Space between every letter

Hi, I do not understand why all the text is broken into one letter per word...

The code is a basic trial:

$settings = new \PHP_Typography\Settings();
$settings->set_hyphenation( true );
if ( user()->language->name == "fr" ) {
    $settings->set_hyphenation_language( 'fr' );
    $settings->set_french_punctuation_spacing( true );
} else {
    $settings->set_hyphenation_language( 'en-US' );
}
$typo = new \PHP_Typography\PHP_Typography();
$test = "Ange Michael is 27-28 years old and grows cocoa. His father died of malaria in 2009. The son had to take over the hectare of cocoa trees on his own to support the family. His mother works as a housekeeper, and, year in, year out, Ange Michael manages to provide them with the bare necessities. He does not live with them, however, and has his own house. He would like to get married, has a girlfriend who is still studying, but as long as he cannot live decently, he cannot get married.";
$hyphenated_html = $typo->process( $test, $settings );

HTML special chars processing

Thanks for this powerful library. I found some strange behaviour in the output.

Settings

        $typoSettings = new Settings(false);

        $typoSettings->set_space_collapse();
        $typoSettings->set_smart_marks();
        $typoSettings->set_single_character_word_spacing();
        $typoSettings->set_unit_spacing();
        $typoSettings->set_numbered_abbreviation_spacing();

        $typoSettings->set_smart_quotes();
        $typoSettings->set_smart_quotes_primary();
        $typoSettings->set_smart_quotes_secondary();

Predictable (Correct)

input: He is robot, am i&nbsp;too?

process input (htmlentity()): He is robot, am i&amp;nbsp;too?

typo output: He is robot, am i&amp;nbsp;too?

The output string is the same as the input, with &nbsp; kept as text.

Unpredictable (Incorrect)

input: He is a robot, am i&nbsp;too?

process input: He is a robot, am i&amp;nbsp;too?

typo: He is a&nbsp;robot, am i&nbsp;too?

The output string no longer contains &nbsp; as text; every occurrence has become a space.

PS: The same happens with & and &amp;.
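
The "process input" step in this report is consistent with escaping ampersands before the text reaches the library. The report names htmlentity(), which is not a PHP built-in; the sketch below assumes htmlspecialchars() (or htmlentities()) was meant. It only demonstrates the escaping round trip in plain PHP and makes no claim about what the library does internally.

```php
<?php
declare(strict_types=1);

$input = 'He is a robot, am i&nbsp;too?';

// Escaping turns the ampersand into &amp;, so a browser would display the
// literal text "&nbsp;" instead of a non-breaking space.
$escaped = htmlspecialchars($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $escaped, PHP_EOL;

// Decoding exactly once restores the original markup.
$decoded = html_entity_decode($escaped, ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo $decoded, PHP_EOL;
```

Comparing the library's output against $escaped shows whether the &amp; survived processing or was decoded along the way.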

Orphan prepositions

Hi, thank you for an awesome library!

Is there a fix or setting to add non-breaking spaces after two-character prepositions as well?

PHP 7.3

I’m seeing some odd output from PHP 7.3.

Code

$settings = new \PHP_Typography\Settings();
$typography = new \PHP_Typography\PHP_Typography();

echo $typography->process('Adipiscing Vehicula Ridiculus Pharetra', $settings);

PHP 7.2 Output

Adipiscing Vehicula Ridiculus&nbsp;Pharetra

PHP 7.3 Output

A&nbsp;d&nbsp;i&nbsp;p&nbsp;­&nbsp;i&nbsp;s&nbsp;c&nbsp;­&nbsp;i&nbsp;n&nbsp;g&nbsp;V&nbsp;e&nbsp;h&nbsp;i&nbsp;c&nbsp;­&nbsp;u&nbsp;­&nbsp;l&nbsp;a&nbsp;R&nbsp;i&nbsp;d&nbsp;i&nbsp;c&nbsp;u&nbsp;­&nbsp;l&nbsp;u&nbsp;s&nbsp;P&nbsp;h&nbsp;a&nbsp;r&nbsp;e&nbsp;t&nbsp;r&nbsp;a&nbsp;nbsp;

Overzealous Roman numerals matcher

Standalone C, D and L are correctly detected as Roman numerals, but together with the suffix e (and possibly some others), the result is problematic in (at least) French and Dutch.

No support for the uppercase Norwegian character 'Å'?

I'm using this on a website, through the plugin Craft Typogrify - and whenever I try to typogrify a word/sentence starting with a Norwegian character it seems to output nothing at all. I've reported an issue to the plugin-creator, but after further investigation, it seems that this behavior is the same when I try to use just this library directly as well.

Example:

$settings = new \PHP_Typography\Settings();
$settings->set_hyphenation( true );
$settings->set_hyphenation_language( 'en-US' );

$typo = new \PHP_Typography\PHP_Typography();

$hyphenated_html = $typo->process( $html_snippet, $settings );
var_dump($html_snippet, $hyphenated_html);
exit;

Output:

string(15) "Årø Bilsenter" string(0) ""

EDIT:

I've tested a little more - most Norwegian characters seem to work just fine, the exception being the uppercase Å.

Example:

$typo->process('æ', $settings);
$typo->process('ø', $settings);
$typo->process('å', $settings);
$typo->process('Æ', $settings);
$typo->process('Ø', $settings);
$typo->process('Å', $settings);

Output:

ø
æ
å
Æ
Ø
''

Reduce coupling

Some classes are very tightly coupled at the moment, these dependencies should be removed/reduced.

Break at beginning/break preferences

From @strarsis on October 12, 2017 13:6

Is it possible to let the browser break at a specific soft-hyphen?
Or let it prefer breaking at hyphens at end of a word instead at the beginning?
Alternatively, first word-breaking and resorting to breaking at
soft-hyphens if the word would overflow (as last resort)?
Are there JavaScript libraries that can make use of the soft-hyphens injected by this plugin?
Or can this plugin even enqueue one by itself to add this functionality?

Copied from original issue: mundschenk-at/wp-typography#159

Not all settings are taken into account for the settings hash

$settings->get_hash() only considers the $data member of the Settings class, but not other relevant members such as $no_break_narrow_space, $primary_quote_style, $dash_style, etc.

This results in a hash collision for two Settings instances that, e.g., only differ in the dash style.

In my client code, I quick-fixed this by using

md5(print_r($settings, true))

as hash.
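
The collision and the quick fix can be reproduced without the library. In the sketch below, DemoSettings is a hypothetical stand-in (it is not the library's Settings class), and serialize() is used in place of print_r() as one possible way to cover every property.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a settings object; NOT the library's Settings class.
final class DemoSettings
{
    private array $data;
    private string $dash_style;

    public function __construct(array $data, string $dash_style)
    {
        $this->data       = $data;
        $this->dash_style = $dash_style;
    }

    // Mirrors the reported problem: only $data feeds the hash.
    public function data_only_hash(): string
    {
        return md5(serialize($this->data));
    }

    // Hashes every property of the object, including private ones.
    public function full_hash(): string
    {
        return md5(serialize($this));
    }
}

$a = new DemoSettings(['hyphenation' => true], 'traditionalUS');
$b = new DemoSettings(['hyphenation' => true], 'international');

var_dump($a->data_only_hash() === $b->data_only_hash()); // bool(true): collision
var_dump($a->full_hash() === $b->full_hash());           // bool(false)
```

Note that serialize() throws if the object graph contains closures, so this approach only suits objects made of plain data.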

Update hyphenation patterns

There have been significant upstream changes to these patterns:

  • German
  • German (Traditional)
  • German (Swiss Traditional)
  • Latin (Classical)
  • Latin (Liturgical)
  • Spanish

Patterns with minor updates:

  • Amharic
  • Chinese pinyin (Latin)

French: ne becomes né

When using the following settings:

$settings = new PHP_Typography\Settings();
$settings->set_hyphenation( false );
$settings->set_hyphenation_language( 'fr' );
$settings->set_french_punctuation_spacing( true );
$settings->set_smart_quotes_primary( "doubleGuillemets" );
$typo = new PHP_Typography\PHP_Typography();
$content = $typo->process($content,$settings);

the script replaces every ne with né.
My simple workaround:

$content = str_replace(" né "," ne ",$content);

Or am I setting something wrong?

Keep units together (€)

Issue Overview

Units should be kept together; '€' has been added to the list of units.

Steps to Reproduce (for bugs)

  1. Have text on page that contains a price with a currency like "10 €" where its container should cause word breaking/hyphenation.
  2. Notice that still a break occurs between the number (here "10") and the currency symbol (here "€").
    The value and currency symbol are not wrapped into a wrapper element.
  3. Adding the "€" to the list of units in wp-typography settings doesn't seem to prevent breaking.

WordPress 4.9.8, wp-typography 5.4.2, Gutenberg editor page.
Recent Chrome on Windows 10 Pro x64.

Expected Behavior

No breaking between value and unit symbol.

Current Behavior

Prices with currency symbol (e.g. "10 €") are apparently not handled.

Square and cubic metre

Hi!

Square and cubic metre, m2 and m3, aren't transformed to m² and m³.

When somebody types 250m2 or 250m3, it would be a good enhancement to convert the 2 and 3 to the glyphs ² and ³ (and not to superscript markup).

Thanks!
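
As a rough illustration of the requested conversion, the following sketch rewrites a trailing 2 or 3 after a metric length unit into the corresponding Unicode glyph. superscript_units() and its short unit list are hypothetical; this is not how the library handles units.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: turn "250m2" into "250m²" and "250 m3" into "250 m³".
 * Only fires when a digit (plus an optional space) precedes the unit.
 */
function superscript_units(string $text): string
{
    $glyphs = ['2' => "\u{00B2}", '3' => "\u{00B3}"];

    return preg_replace_callback(
        '/(\d\s?(?:mm|cm|km|m))([23])\b/u',
        static fn(array $m): string => $m[1] . $glyphs[$m[2]],
        $text
    ) ?? $text;
}

echo superscript_units('A flat of 250m2 and a tank of 3 m3.'), PHP_EOL;
```

Requiring a leading digit keeps strings such as H2O or a bare m25 unchanged.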

Can’t use widows when hyphens disabled

I have the following settings, but dewidowing seems dependent on hyphenation being enabled. Is this a bug or expected behaviour? It would be very useful to dewidow without hyphenation.

$typographySettings = new Settings();
$typographySettings->set_dewidow(true);
$typographySettings->set_max_dewidow_length(20);
$typographySettings->set_hyphenation(true); // when false widows are not applied

$typography->process($page->body, $typographySettings);

Doesn't work on Laravel 5.8

php 7.3

vagrant@homestead:~/code/shy$ php -r "print_r(get_loaded_extensions());"
Array
(
...
    [22] => mbstring
...
public function shy()
{
    $settings = new \PHP_Typography\Settings();
    $settings->set_hyphenation( true );
    $settings->set_hyphenation_language( 'en-US' );

    $typo = new \PHP_Typography\PHP_Typography();

    $text1 = '<p>Make sure to place composers system-wide vendor bin directory in your so the laravel executable can be located by your system. </p>';

    $hyphenated_html = $typo->process( $text1, $settings );

    dd($hyphenated_html);
}

output:

"<p>Make sure to place com­posers sys­tem-wide ven­dor bin direc­to­ry in your so the lar­avel exe­cutable can be locat­ed by your system.&nbsp;</p>"

Dewidowing seems to override unit spacing

When dewidowing, unit spaces don’t appear to be affected when at the end of a paragraph. As such 1 KB will be converted to 1&nbsp;KB, even when unit spacing is turned on. Expected result is 1KB.

Use non-breaking hyphen where appropriate

Use the NON-BREAKING HYPHEN character at the beginning of words (e.g. German Warenein- und -ausgang) and in compound words when one part is very short (1 or 2 letters).

License restrictions

We are interested in using PHP Typography in a plugin for Kirby CMS. From what I understand, the license of this library (GPL2) requires such a plugin to be released under the same license. Although Kirby's source is open it's released under a commercial license which makes me doubt that it can be combined with any plugin under GPL. Same is true for any other CMS under MIT license.

Is this limitation intended?
Would it be possible to open the license using LGPL, MIT or a dual model alongside GPL?
