truncato is a Ruby library for truncating HTML strings while keeping the markup valid.
Add it to your Gemfile:
gem 'truncato'
Truncato.truncate "<p>some text</p>", max_length: 4 # => "<p>s...</p>"
Truncato.truncate "<p>some text</p>", max_length: 4, count_tags: false # => "<p>some...</p>"
The configuration options are:
max_length
: The size, in characters, to truncate (30 by default)

tail
: The string to append when the truncation occurs ('...' by default)

count_tags
: Boolean value indicating whether tags size should be considered when truncating (true by default)

filtered_attributes
: Array of attribute names that will be removed from the output. This allows you to make the truncated string shorter by excluding the content of attributes you can discard in some given context, e.g. the HTML style attribute.
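As a rough illustration, the four options can be combined in a single call. This is a sketch, not taken from the library's documentation: the sample markup and the option values are made up, and passing the attribute names as strings is an assumption. No expected output is shown because it depends on the installed version.

```ruby
# Assumes the truncato gem is installed; the guard only keeps this file
# loadable on machines where it is not.
begin
  require 'truncato'
rescue LoadError
  warn 'truncato is not installed; skipping the call'
end

# Illustrative input (not from the README).
SAMPLE_HTML = '<p style="color: red">some text that is too long</p>'

# The keys are the documented options; the values are arbitrary examples.
TRUNCATE_OPTIONS = {
  max_length: 10,                 # truncate to 10 characters
  tail: ' (more)',                # appended where the cut happens
  count_tags: false,              # do not count tag size towards max_length
  filtered_attributes: ['style']  # assumption: attribute names given as strings
}

puts Truncato.truncate(SAMPLE_HTML, TRUNCATE_OPTIONS) if defined?(Truncato)
```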
Truncato was designed with performance in mind. Its main motivation was that existing libs couldn't truncate a multiple-MB document into a few-KB one in a reasonable time. It uses the Nokogiri SAX parser.
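To give an idea of why a SAX-style parser suits this job, here is a minimal sketch of event-driven truncation. It is not Truncato's implementation: it swaps in REXML's stream parser (shipped with Ruby) for Nokogiri to stay dependency-free, and `TruncatingListener` and `truncate_markup` are hypothetical names. It counts only text characters, similar in spirit to `count_tags: false`, drops attributes, does not re-escape entities, and expects well-formed input.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: copies markup to the output as parser events arrive
# and stops copying once max_length text characters have been written.
class TruncatingListener
  include REXML::StreamListener

  attr_reader :output

  def initialize(max_length, tail = '...')
    @remaining = max_length
    @tail = tail
    @output = +''
    @done = false
    @skipped = 0 # open tags seen after the cut, which were never written
  end

  def tag_start(name, _attrs)
    if @done
      @skipped += 1
    else
      @output << "<#{name}>" # attributes are dropped in this sketch
    end
  end

  def text(string)
    return if @done

    if string.length > @remaining
      @output << string[0, @remaining] << @tail
      @done = true
    else
      @output << string
      @remaining -= string.length
    end
  end

  def tag_end(name)
    if @skipped.positive?
      @skipped -= 1 # closes a tag that was never written
    else
      @output << "</#{name}>" # closes a tag written before the cut
    end
  end
end

def truncate_markup(source, max_length)
  listener = TruncatingListener.new(max_length)
  REXML::Document.parse_stream(source, listener)
  listener.output
end

puts truncate_markup('<p>some text</p>', 4) # prints <p>some...</p>
```

Because the listener only keeps a counter and the output buffer, no document tree is built for the input, which is what makes the streaming approach attractive for multi-megabyte documents.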
There is a benchmark included that generates a synthetic XML of 4MB and truncates it to 400 KB. You can run the benchmark using
rake truncato:benchmark
There is also a comparison benchmark that runs the same data against other alternatives:
rake truncato:vendor_compare
The results comparing truncato with other libs:
| | Truncato | truncate_html | HTML Truncator | peppercorn |
|---|---|---|---|---|
| Time for truncating a 4MB XML document to 4KB | 1.5 s | 20 s | 220 s | 232 s |
To run the test suite:
rake spec