Comments (6)
What kind of help is required here? I'm a native Farsi speaker who can help change the code and verify the results, but I'm not sure which piece of code needs to change. I took a short look at the latest version and couldn't spot where an element with Unicode text gets drawn.
from flyingsaucer.
FYI, I tracked it down to the method com.lowagie.text.pdf.BaseFont#convertToBytes(java.lang.String), and it looks like the encoding is always set to Cp1252, with which I would not expect any non-Latin characters to render. Maybe setting the charset properly on that (I don't know how) will fix the issue, possibly together with using a font that actually contains the required characters.
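To illustrate why Cp1252 is a problem for Farsi text, here is a minimal JDK-only sketch. It does not call OpenPDF at all; it assumes that a Cp1252 conversion behaves like the standard Java charset windows-1252 (of which Cp1252 is an alias). The class and method names are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class Cp1252Check {
    // Hypothetical helper: true if every character of the text maps to windows-1252 (Cp1252).
    static boolean encodableInCp1252(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    // Hypothetical helper: counts the '?' bytes that appear after encoding to windows-1252.
    // String.getBytes(Charset) replaces unmappable characters with the charset's
    // replacement byte, which is '?' for this charset.
    static int replacedCount(String text) {
        byte[] bytes = text.getBytes(Charset.forName("windows-1252"));
        int count = 0;
        for (byte b : bytes) {
            if (b == '?') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String farsi = "\u062A\u0633\u062A"; // the word "تست" written with Unicode escapes
        System.out.println(encodableInCp1252("test")); // Latin text is encodable
        System.out.println(encodableInCp1252(farsi));  // Farsi text is not
        System.out.println(replacedCount(farsi));      // each Farsi letter became one '?' byte
    }
}
```

If the PDF font path goes through a conversion like this, the Farsi letters are already lost before any glyph lookup happens, regardless of what the font contains.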
@mohamnag Hi. Wow, thank you for debugging this problem with fonts.
Yes, now I see: FS always uses the encoding winansi (which I guess means Cp1252). I don't know why, but it has been used from the very beginning, 01.02.2006 :)
I think we can change this encoding. Can you provide a simple example of such HTML and a font, so we can add it to the FS tests?
Well, I went ahead and used a custom font for which I can set the encoding. Unfortunately, the result was still problematic.
Let's take this sample HTML:
<html lang="fa">
<head>
<meta charset="UTF-8"/>
<title>Title</title>
<style>
.rtl-font {
font-family: Vazirmatn;
direction: rtl;
}
</style>
</head>
<body>
<div style="background-color: blue">
تست فارسی
</div>
<div class="rtl-font" style="background-color: green">
تست فارسی
</div>
<div dir="rtl" style="background-color: red; font-family: Vazirmatn">
تست فارسی
</div>
</body>
</html>
I have the font (freely available from https://github.com/rastikerdar/vazirmatn/releases/tag/v33.003) unzipped into the resources directory, and this is my Java code:
try (OutputStream outputStream = new FileOutputStream("build/pdf/method4.pdf")) {
    // parse and improve HTML
    Document document = Jsoup.parse(new File(inputHtml.getFile()), "UTF-8");
    document.outputSettings().syntax(Document.OutputSettings.Syntax.xml);
    var htmlString = document.html();

    // initialize Flying Saucer
    ITextRenderer renderer = new ITextRenderer();
    SharedContext sharedContext = renderer.getSharedContext();
    sharedContext.setPrint(true);
    sharedContext.setInteractive(false);

    // register the custom font with Identity-H encoding, embedded in the PDF
    renderer
        .getFontResolver()
        .addFont(
            Main.class.getClassLoader().getResource("Vazirmatn/ttf/Vazirmatn-Regular.ttf").toString(),
            BaseFont.IDENTITY_H,
            true
        );

    renderer.setDocumentFromString(htmlString);
    renderer.layout();
    renderer.createPDF(outputStream);
    // relative resources: see https://www.baeldung.com/java-html-to-pdf#dependencies-4
}
Now, this is the output that FS gives me:
And this is what a browser gives me (ignoring that the font is not applied):
There are two problems here:
- The connection between letters: Farsi/Arabic letters get connected and change shape based on their position and neighbouring letters. This is not handled.
- The RTL orientation is not applied. The first letter, ت, should be positioned rightmost but is leftmost.
In general, I would first go for solving this problem using a custom font (which certainly has all the characters) and then maybe look into fixing the charset for the default font.
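As a rough illustration of the RTL half of the problem, the JDK's java.text.Bidi class can report how a string should be ordered. This is only a sketch of the direction analysis, not what FS or OpenPDF do internally; letter joining (shaping) is a separate step that it does not cover. The class and method names are made up for this example.

```java
import java.text.Bidi;

public class BidiCheck {
    // Hypothetical helper: true if the whole text is right-to-left with an RTL base direction.
    // DIRECTION_DEFAULT_LEFT_TO_RIGHT lets the first strong character decide the base
    // direction and falls back to LTR only when there is none.
    static boolean isRightToLeft(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).isRightToLeft();
    }

    // Hypothetical helper: true if the text contains both LTR and RTL runs.
    static boolean isMixed(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).isMixed();
    }

    public static void main(String[] args) {
        // "تست فارسی" written with Unicode escapes
        String farsi = "\u062A\u0633\u062A \u0641\u0627\u0631\u0633\u06CC";
        System.out.println(isRightToLeft(farsi));   // pure Farsi text is RTL
        System.out.println(isRightToLeft("test"));  // pure Latin text is not
        System.out.println(isMixed("test " + farsi)); // Latin followed by Farsi is mixed
    }
}
```

A renderer that lays out glyphs in plain string order ignores this analysis, which would match the symptom of ت ending up leftmost.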
By the way, you have probably seen this example of RTL rendering using OpenPDF, but I just want to mention it: https://github.com/LibrePDF/OpenPDF/blob/master/pdf-toolbox/src/test/java/com/lowagie/examples/fonts/styles/RightToLeft.java
I don't know whether this differs from what FS does under the hood when working with OpenPDF, but I couldn't find any of those methods being called.
I also found this post, and the whole thread around it, which is related to this ticket: https://groups.google.com/g/flying-saucer-users/c/n0CfuYfpQ6I/m/3iJIaZ4IAAAJ