CUPS PDF creation, font encoding, garbled / gibberish text

marlemion
Newbie
Posts: 2
Joined: Thu Mar 30, 2017 4:46 am

Post by marlemion »

Hi,

I am running MS Office 2007 on Wine 2.4 on Arch. It runs very well, and I use it to create PDF files from various Office files in the background. The PDF files look very good and are searchable. However, some of the text they contain is malformed when it is copied out. I suppose this depends on the font encoding.

As a test setup, I copied the Fonts directory from my Windows machine to the respective Wine prefix (drive_c/windows/Fonts) and also installed the fonts in the TrueType fonts directory on the Linux machine (/usr/share/fonts/TTF, followed by fc-cache and fc-cache-32). So the PDF printing picks up all necessary fonts.
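To double-check the claim that the fonts are picked up, one could compare the family names fontconfig reports against the fonts the document uses. The following is only a minimal sketch: the helper name `missing_families` and the sample output are hypothetical, and it assumes the text comes from running `fc-list : family`. It covers the Linux side only and says nothing about what Wine finds inside the prefix.

```python
def missing_families(fc_list_output, wanted):
    """Return the wanted families that do not appear in `fc-list : family` output."""
    seen = set()
    for line in fc_list_output.splitlines():
        # fc-list may print several comma-separated names for one font.
        for family in line.split(","):
            family = family.strip()
            if family:
                seen.add(family.lower())
    return [name for name in wanted if name.lower() not in seen]


# Hypothetical sample; on a real system, capture `fc-list : family` instead.
sample_output = """\
Arial
Calibri
DejaVu Sans,DejaVu Sans Condensed
"""

# "Cambria" is included only to show what a missing font looks like.
print(missing_families(sample_output, ["Calibri", "Arial", "Cambria"]))
```

An empty list would mean fontconfig knows every requested family; any name printed here is worth re-installing before looking further.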

Printing is done via cups-pdf. It is currently at 3.0.1, but I have tried several versions starting from 2.6, with and without several patches found on the internet:

https://github.com/alexivkin/CUPS-PDF-to-PDF
http://www.linuxquestions.org/questions ... 175440557/
https://launchpadlibrarian.net/15378121 ... port.patch

It seems that the cups-pdf driver is not the culprit here, as the result is always the same.

My test.docx is attached and contains two lines ('test'), the first one in Calibri, the second one in Arial. When I print this docx with Wine 2.4, Word 2007 and cups-pdf, a PDF file is created which looks fine (see attached), but when I copy and paste the text, I get:

'[garbled text]
test'

So apparently, the Calibri-font part is somehow problematic.

pdffonts test_wine.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
OQHKQC+ArialMT                       Type 1C           WinAnsi          yes yes no      10  0
YIGKZZ+Calibri                       Type 1C           Custom           yes yes no       8  0

I know I can save the docx directly as a PDF in Word, and I can also automate this via a macro. However, I would also like to print from Outlook, which is not as easy to automate via macros as Word, but prints easily via the /p command line switch.

When I directly save the docx as pdf, I get:

'Test
test'

pdffonts test_wine_savepdf.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ABCDEE+Calibri                       TrueType          WinAnsi          yes yes no       5  0
Arial                                TrueType          WinAnsi          no  no  no       7  0

The difference is that here the Calibri font uses the WinAnsi encoding instead of a Custom one.

I checked whether the cups-pdf driver might be the culprit by opening and printing the test.docx file with a native copy of LibreOffice, and I get:

'Test
test'

pdffonts test_lo.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
BAAAAA+Calibri                       TrueType          WinAnsi          yes yes yes     14  0
CAAAAA+ArialMT                       TrueType          WinAnsi          yes yes yes      9  0

So, basically, both the cups-pdf driver and Wine/Word are able to produce WinAnsi-encoded fonts, but printing from Word apparently forces the cups-pdf driver to include custom-encoded font subsets. I read somewhere that Ghostscript cannot do anything related to re-encoding fonts, so I assume that something goes wrong before cups-pdf starts.
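The comparison of the three pdffonts reports can be automated with a small script. This is a minimal sketch, not a definitive diagnostic: the function names `parse_pdffonts` and `suspect_fonts` are made up for illustration, the parser assumes font names contain no spaces, and the rule it applies (Custom encoding together with "uni no", meaning no ToUnicode map) is a heuristic for "copy and paste will probably fail", based on the observations in this post.

```python
def parse_pdffonts(report):
    """Parse pdffonts text output into a list of dicts, one per font."""
    fonts = []
    for line in report.splitlines():
        parts = line.split()
        # Skip blank lines, the column header and the dashed rule (7 tokens).
        if len(parts) < 8 or parts[0] == "name":
            continue
        fonts.append({
            "name": parts[0],                 # assumed to contain no spaces
            "type": " ".join(parts[1:-6]),    # may be two words, e.g. "Type 1C"
            "encoding": parts[-6],
            "emb": parts[-5],
            "sub": parts[-4],
            "uni": parts[-3],                 # last two tokens are the object ID
        })
    return fonts


def suspect_fonts(report):
    """Heuristic: Custom encoding and no ToUnicode map."""
    return [f["name"] for f in parse_pdffonts(report)
            if f["encoding"] == "Custom" and f["uni"] == "no"]


# The three reports quoted in this post.
reports = {
    "test_wine.pdf": """\
name type encoding emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
OQHKQC+ArialMT Type 1C WinAnsi yes yes no 10 0
YIGKZZ+Calibri Type 1C Custom yes yes no 8 0
""",
    "test_wine_savepdf.pdf": """\
ABCDEE+Calibri TrueType WinAnsi yes yes no 5 0
Arial TrueType WinAnsi no no no 7 0
""",
    "test_lo.pdf": """\
BAAAAA+Calibri TrueType WinAnsi yes yes yes 14 0
CAAAAA+ArialMT TrueType WinAnsi yes yes yes 9 0
""",
}

for filename, report in reports.items():
    print(filename, suspect_fonts(report))
```

Only the Calibri subset in test_wine.pdf is flagged, which matches the copy-and-paste results above. To check other files, the output of `pdffonts file.pdf` can be passed to `suspect_fonts` as a string.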

Any help? Searchable PDFs would be very useful for me and I suppose also for others.

Thanks in advance!

PS: Unfortunately, I cannot add the files!
dimesio
Moderator
Posts: 13204
Joined: Tue Mar 25, 2008 10:30 pm

Re: CUPS PDF creation, font encoding, garbled / gibberish t

Post by dimesio »

File a bug.
marlemion
Newbie
Posts: 2
Joined: Thu Mar 30, 2017 4:46 am

Re: CUPS PDF creation, font encoding, garbled / gibberish t

Post by marlemion »
