Comments (8)
@simonbate if you are publishing with the DITA OT PDF5 plugin developed by Antenna House, maybe they could help you further:
https://www.antennahouse.com/dita-pdf5-plugin
from dita-ot.
Unfortunately, we're using the OT pdf2 plugin as our base.
Our temporary solution is to ensure the sources do not include newlines, but that shouldn't be necessary.
@simonbate right, you can probably define a Schematron rule to catch such problems.
I reported a similar problem, but for CHM and indexterm elements, a couple of days ago: #4336
Testcase here: 4337.zip
<index-see> and <index-see-also> handle surrounding whitespace correctly; only <index-sort-as> has the issue.
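Until a fix lands, affected sources can be found with a small check. The following is a minimal sketch in plain Java, standing in for the Schematron rule suggested earlier in the thread rather than implementing one. The class and method names are hypothetical, and the sample markup is an inline string; real topics with a DOCTYPE would also need DTD or catalog resolution, which is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of DITA-OT.
public class SortAsWhitespaceCheck {

    /** Returns the raw text of every index-sort-as element with leading or trailing whitespace. */
    public static List<String> findPaddedSortKeys(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("index-sort-as");
            List<String> padded = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                String text = nodes.item(i).getTextContent();
                // trim() strips spaces, tabs, and newlines from both ends
                if (!text.equals(text.trim())) {
                    padded.add(text);
                }
            }
            return padded;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample content: one padded sort key, one clean sort key
        String topic = "<p>"
            + "<indexterm>Zebra<index-sort-as>\n  zebra\n</index-sort-as></indexterm>"
            + "<indexterm>Apple<index-sort-as>apple</index-sort-as></indexterm>"
            + "</p>";
        System.out.println("Padded sort keys found: " + findPaddedSortKeys(topic).size());
    }
}
```

This only reports the problem; it does not change how the toolkit sorts the index.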
I think the fix would be inside src/main/java/org/dita/dost/reader/IndexTermReader.java by applying the trimSpaceAtStart() function somewhere, but I can't figure out where it should be. @raducoravu - if you see a potential fix for this, I would be happy to test it.
@chrispy-snps I'm afraid I do not have time to look into this. In "org.dita.dost.reader.IndexTermReader.characters(char[], int, int)" there seems to be an initial normalization using normalizeAndCollapseWhitespace, which replaces all consecutive spaces with one.
And then there is this code for <index-sort-as>:
else if (insideSortingAs && temp.length() > 0) {
    final IndexTerm indexTerm = termStack.peek();
    temp = trimSpaceAtStart(temp, indexTerm.getTermKey());
    indexTerm.setTermKey(setOrAppend(indexTerm.getTermKey(), temp, false));
}
It's unclear to me what trimSpaceAtStart does. It does not seem to always remove the first space; it seems to remove it only when the second parameter to the method also starts with a space. But "indexTerm.getTermKey()" is probably null at that moment. So maybe replace:
temp = trimSpaceAtStart(temp, indexTerm.getTermKey());
with:
temp = trimSpaceAtStart(temp, indexTerm.getTermKey() != null ? indexTerm.getTermKey() : temp);
so that the first time, when the term key is not yet computed, the first space is always removed?
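Whatever trimSpaceAtStart actually does, the result being sought is a sort key with no surrounding whitespace and no runs of internal whitespace. A minimal sketch of that target behavior follows. The class and the normalizeSortKey helper are hypothetical and are not DITA-OT code; where such a step would belong in the reader is left open.

```java
// Hypothetical helper, not part of DITA-OT.
public class SortKeyNormalizer {

    /** Collapses every run of whitespace to one space, then removes leading and trailing space. */
    public static String normalizeSortKey(String raw) {
        if (raw == null) {
            return null;
        }
        // "\\s+" matches spaces, tabs, and newlines, so line breaks in the source are covered
        return raw.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Assumed sample: a sort key written across several source lines
        String raw = "\n    zebra   crossing\n  ";
        System.out.println("[" + normalizeSortKey(raw) + "]");
    }
}
```

Because SAX may deliver element text in several chunks, a normalization like this would presumably need to run on the fully assembled key rather than on each chunk, otherwise spaces between chunks could be lost.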
So I think the IndexTermReader.java code is used only by the htmlhelp transformation. For pdf2 transformations, index term processing appears to be provided by the org.dita.index plugin. It looks like there is a unit test for <index-sort-as> in there that could be modified to test and resolve the whitespace issue, but I haven't been able to build and test that plugin yet.
In org.dita.index, I filed the following issue:
#2: cannot build or test plugin
Once that is figured out, I hope to implement a fix there.