
Author Topic: Incorrect text colors when converting from PDF > SVG  (Read 2617 times)

May 29, 2018, 10:29:42 PM

InkScapist

  • Sr. Newbie

  • Offline
  • **

  • 5
I'm using Inkscape 0.92.3 on Win7 in shell mode to batch-convert to SVG a bunch of PDF files that contain graphs created with a tool that uses Mathematica.

Here is a sample code snippet to illustrate the batch file options piped into the shell:

"--file=C:/Temp/graphs/sample.pdf" "--export-plain-svg=C:/Temp/graphs/sample.svg"
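For illustration, lines like the one above could be generated for a whole list of PDFs with a small script. This is only a sketch: the function names `shell_line` and `shell_script` are made up for this example, and it assumes each SVG should sit next to its PDF with the same base name. The closing `quit` line is what the 0.92 shell prompt asks for to end the session.

```python
from pathlib import PureWindowsPath


def shell_line(pdf_path: str) -> str:
    """Build one Inkscape 0.92 shell-mode line for a PDF-to-plain-SVG export."""
    pdf = PureWindowsPath(pdf_path)
    svg = pdf.with_suffix(".svg")  # assumption: same folder, same base name
    # as_posix() yields forward slashes, matching the sample line above.
    return f'"--file={pdf.as_posix()}" "--export-plain-svg={svg.as_posix()}"'


def shell_script(pdf_paths) -> str:
    """Join one line per PDF, followed by `quit` to end the shell session."""
    return "\n".join(shell_line(p) for p in pdf_paths) + "\nquit\n"


if __name__ == "__main__":
    print(shell_script(["C:/Temp/graphs/sample.pdf"]), end="")
```

The printed text is meant to be piped into `inkscape --shell`, one conversion per line.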

For some graphs Inkscape gets the text colours wrong. For example, when I convert this PDF file

[Attachment: original.gif, 594x339]


I get this SVG result:

[Attachment: inkscape.gif, 712x391]


When I import the same file in the desktop instance, the same happens with the default "Internal import" option, but not with the "Poppler/Cairo import" option. However, since I need small file sizes, converting text to lines is not an option.

Is this a known bug? Is there a workaround or fix available?

Thank you.
« Last Edit: May 31, 2018, 03:03:42 PM by InkScapist »
  • 0.92.3
  • Win7

May 30, 2018, 01:30:30 PM
Reply #1

brynn

  • Administrator

  • Offline
  • ******

  • 3,941
  • Gender: Female
    • Inkscape Community
Welcome to the forum!

Can you share the PDF file and the resulting PNG file, so that we can investigate? 

I'll search, as soon as I post this, to see if there is an existing report for this.  Although, my searching doesn't always turn up an existing report, so you're welcome to search yourself as well, if you like.  https://launchpad.net/inkscape

My first thought is that maybe there's some conflict with how the colors are named or coded.  But mostly this is over my head, and I'll be interested to learn the outcome.

Edit
I think I'll move this to Beyond the Basics board.
  • Inkscape version 0.92.3
  • Windows 7 Pro, 64-bit
Inkscape Tutorials (and manuals)                      Inkscape Community Gallery                        Inkscape for Cutting Design                     



"Be ashamed to die until you have won some victory for humanity" - Horace Mann                       

May 30, 2018, 05:44:08 PM
Reply #2

InkScapist

Hi Brynn,

thank you for your investigation. Please find the original PDF and the converted SVG attached.

May 30, 2018, 06:52:31 PM
Reply #3

brynn

I found a couple of bugs with search terms "pdf text color"

https://bugs.launchpad.net/inkscape/+bug/885244
https://bugs.launchpad.net/inkscape/+bug/1671789
https://bugs.launchpad.net/inkscape/+bug/1488308

As I said before, I probably don't have the right skills to evaluate what's causing this problem.  I can't even be sure if those bugs I found are exactly the same problem as yours.  However, I can make some observations.

When I open the PDF in Inkscape, using the Internal Import, where the text becomes red, I also notice that the text has become Symbols.  I can only guess that something about changing the text to symbols might be part of the problem.

If I choose the Poppler Cairo import, I end up with green text -- not green paths.  Oh, but wait.  I just realized about this:

Quote from InkScapist: "When I import the same file in the desktop instance, the same happens with the default "Internal import" option, but not with the "Poppler/Cairo import" option. However, since I need small file sizes, converting text to lines is not an option."

If you are converting the PDFs to PNGs, the text will not become lines or paths, or any vector element.  Once it's converted to the raster format, everything is mapped to the pixels.

So the problem is happening by opening the PDF in Inkscape, using the Internal Import option (or whatever is the equivalent part of your shell/batch/code process (which I don't understand at all)), and not any other part of your process.  At first, I thought you were suspicious of the PDF to PNG conversion being the problem.  But the problem seems to be PDF to SVG, and only with Internal Import.

I wonder if there would be any way for you to specify that Inkscape uses the PopplerCairo option, via the shell/batch/code process (sorry, I just don't know the proper language).  (is it commandline?)

May 31, 2018, 03:14:53 PM
Reply #4

InkScapist

I'm converting to SVG, apologies for the confusion. I've corrected the one instance where my original post said "PNG".

The shell mode is actually a feature of Inkscape (check the command line option "--shell").

Inkscape shows the same bug when just using the regular mode, i.e.:

inkscape.com "--file=Employment by Wage Level.pdf" "--export-plain-svg=C:/Temp/graphs/sample.svg"

I'm not aware of any command line options (https://inkscape.org/en/doc/inkscape-man.html#OPTIONS) to specify the other import filter.

May 31, 2018, 06:30:14 PM
Reply #5

Moini

  • IC Mentor

  • Offline
  • ******

  • 1,568
    • VektorRascheln
Quote from InkScapist: "I'm not aware of any command line options (https://inkscape.org/en/doc/inkscape-man.html#OPTIONS) to specify the other import filter."

You can't be, because there isn't one yet. Here are the related bug reports:

https://bugs.launchpad.net/inkscape/+bug/1506043
https://bugs.launchpad.net/inkscape/+bug/1507740

As for the bug with the color itself, I haven't looked into it.

June 02, 2018, 11:51:25 AM
Reply #6

brynn

Quote from InkScapist: "I'm converting to SVG, apologies for the confusion. I've corrected the one instance where my original post said 'PNG'."

Oh, ok.  I still can't help very much with this.  But I do want to make sure this gets attention from developers.  If any of the bugs I found seem to fit your situation, you're welcome to add more comments, if you think you have some relevant info about the problem, which hasn't already been mentioned.

Or if your problems don't seem to fit, or if you're not sure if they fit, please make a new bug report.  Be sure to include your Inkscape version and operating system, as well as the relevant SVG and PDF files.

Actually, in struggling to understand, it seems like you've run into 2 different issues -- not having an option to use the Poppler/Cairo import, and the internal import giving the wrong color.

But anyway, besides helping with Inkscape development, reading the relevant bug reports, adding new comments, or posting a new report if necessary could also turn up a workaround.

June 03, 2018, 04:42:21 PM
Reply #7

InkScapist

I did some analysis of all the PDF files that we need to convert: Inkscape gets font colours wrong for 63% of them!

I then tried to narrow down what might be throwing Inkscape off, but nothing I did to the PDF had an effect (e.g. RGB instead of CMYK, unembedding or replacing the Mathematica Sans font, printing into a new PDF, etc.).
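A check like this across many files could be scripted rather than done by eye. The following is only a rough sketch, not the method actually used here: the names `fill_of` and `text_fill_colours` are made up, and it assumes the converter writes the colour either into a `fill` attribute or into an inline `style` on `<text>`/`<tspan>` elements. Colours inherited from parent groups are ignored.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

SVG_NS = "{http://www.w3.org/2000/svg}"


def fill_of(element):
    """Return the fill colour from an inline style or a fill attribute, else None."""
    style = element.get("style", "")
    # Matches "fill:..." but not "fill-opacity:...".
    match = re.search(r"(?:^|;)\s*fill\s*:\s*([^;]+)", style)
    if match:
        return match.group(1).strip().lower()
    fill = element.get("fill")
    return fill.strip().lower() if fill else None


def text_fill_colours(svg_source: str) -> Counter:
    """Count the fill colours found on <text> and <tspan> elements."""
    root = ET.fromstring(svg_source)
    counts = Counter()
    for tag in ("text", "tspan"):
        for element in root.iter(SVG_NS + tag):
            colour = fill_of(element)
            if colour:
                counts[colour] += 1
    return counts
```

Comparing the counts from an Inkscape-converted SVG against a known-good rendering of the same graph would flag files whose text colours differ.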

If I convert the same file from GIF to SVG the colours are preserved.

It really looks like a bug in Inkscape to me. What a pity, I would have loved using this tool.

June 04, 2018, 03:25:51 PM
Reply #8

InkScapist

I have found another tool, Mutool (https://mupdf.com/docs/manual-mutool-convert.html), which has no problem converting my graphs.

As I cannot invest more time investigating Inkscape, I have now switched to that tool.
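For anyone wanting to try the same route, a batch script could assemble the `mutool convert` call along these lines. This is a sketch under assumptions, not the exact setup used here: the function name `mutool_svg_command` is made up, and the options (`-F svg`, `-O text=text`, `%d` in the output name for the page number) should be checked against the manual linked above. `text=text` is chosen on the assumption that keeping text as text, rather than outlines, matters for file size.

```python
from pathlib import Path


def mutool_svg_command(pdf_path, out_dir):
    """Build an argument list for `mutool convert` that writes SVG output."""
    pdf = Path(pdf_path)
    # "%d" in the output name stands for the page number (one file per page).
    output = Path(out_dir) / f"{pdf.stem}-%d.svg"
    return [
        "mutool", "convert",
        "-F", "svg",          # output format
        "-O", "text=text",    # assumption: emit <text> elements, not paths
        "-o", str(output),
        str(pdf),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, provided `mutool` is on the PATH.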

Thank you for your help so far.

November 26, 2018, 06:08:10 AM
Reply #9

mark

  • Sr. Newbie

  • Offline
  • **

  • 5
Was there ever a bug report raised for the text colour issue?  Or a fix?

We are seeing a similar issue: when converting a PDF to SVG using the Inkscape command line, the text colour changes.

Using the Poppler/Cairo import option (even if it existed as a command line option) would not work for us, as we have a requirement to keep text as text so that it remains selectable/parsable.


November 26, 2018, 01:39:54 PM
Reply #10

brynn

Re-reading this thread, it doesn't appear that a new bug report was made.  Normally I would have made the report myself (if the user didn't want to) but I don't understand this problem well enough.  (I really have no idea what shell mode is.  I do understand what commandline is, in general, but not at all how to use it.)  I would not be able to answer developers' questions, or even supply example files or test files.

I also did not search any further for existing reports.  But you're welcome to do that, and you're also welcome to make a report if you don't find one.  https://launchpad.net/inkscape  I wish I could make the report, since the op apparently didn't want to get involved.  But the problem is too far outside my understanding, to be able to make a potentially successful report.

As an open source program, Inkscape is developed by its community.  If a report already exists, it might have a fix or workaround.  If there is no report, it's just not going to get fixed.  Someone who understands it (better than me) needs to make the report.

If you're unsure what kind of info needs to be included in a bug report, you can read this:  https://inkscape.org/contribute/report-bugs/

December 18, 2018, 07:55:10 AM
Reply #11

mark

Thanks Brynn

I did do a search originally, and only found this thread.

I'll do some more searching, and raise the right reports if one does not exist.

Mark

December 19, 2018, 07:47:32 AM
Reply #12

TimiZero

  • Jr. Member

  • Offline
  • ***

  • 17
  • Gender: Male
Hi, I tried opening the OP's PDF in Inkscape and can confirm that the text colours changed. But I'm not sure whether it's a bug or simply a limitation of Inkscape's text handling.

This is what it looks like after importing with the internal method. Please look at the text box/text field area of the blue text:
[Attachment: Screenshot_2018-12-19_22-27-23.png, 1075x838]

As you can see, both the subtitle and the label are in the same text area. This is probably why Poppler chose one of the colours instead of showing both. The same goes for the red text.

However, we don't know whether this is due to the software that made the PDF (I mean, whether the text fields were originally set up as one instead of separately), or whether Inkscape/Poppler simply got it wrong when importing the text. For the OP's case, I think the more effective solution would be to include a "convert text to path" option on the command line, but I haven't tried running Inkscape from a terminal myself, so I might be wrong. Otherwise, he/she could use other tools that are more suitable for batch conversion, such as Ghostscript, or even Poppler on its own.

@Mark
I think SVG might not be the right format for preserving text formatting, but again I could be wrong. The reason is that, from my observation, any text I have in an SVG is not shown properly (in terms of kerning and positioning) in my image viewer (I'm using Ristretto on Xubuntu) or in Firefox. Even in a PDF, text does not display correctly unless you embed the font.

I don't know the SVG specification, but if it can't embed fonts, then there is no point in keeping text as text, because people on other computers cannot see it properly unless they have that specific font installed.
  • 0.92.4
  • xubuntu 18.04

December 22, 2018, 02:14:34 PM
Reply #13

brynn

Quote from mark: "I did do a search originally, and only found this thread. I'll do some more searching, and raise the right reports if one does not exist."

I meant searching the bug tracker, not the internet.  https://launchpad.net/inkscape



TimiZero, thanks for your investigation.  I'm probably going to regret asking this, because the answer is probably over my head.  But I'm just too curious.

Why would having a few blank line spaces between lines of text, in a single text object, befuddle Poppler?  Or is it because of where the text is placed, with part of one text object lying between, or in the blank line spaces, of the other text object?

Have you seen something like that before, causing a problem?

December 27, 2018, 06:36:56 AM
Reply #14

TimiZero

Quote from brynn: "Why would having a few blank line spaces between lines of text, in a single text object, befuddle Poppler?"
Actually, I was thinking that they should have been separate objects to retain the intended colours, but I'm not sure how Poppler works anyway, and this leads to your next question.

Quote from brynn: "Have you seen something like that before, causing a problem?"
I just realized that I was only assuming in the previous post, so I ran some simple tests to replicate the problem.

First, I created a PDF in Inkscape and re-imported it into Inkscape.
Here is how it looks:
[Attachment: Screenshot_2018-12-27_21-48-12.png, 929x722]


Although only one object was used when creating the text, we now have two objects, and thus two colours. You can also refer to the attached SVG and PDF for further investigation.

I also created a PDF in Scribus v1.4.7, but Poppler couldn't render the whitespace properly, so every glyph was converted to a separate text object; the colours were preserved, though. (No attachment.)

Finally, I created another PDF using LibreOffice, and here is the result:
[Attachment: Screenshot_2018-12-27_21-45-45.png, 754x629]

This time there are three separate objects.

So my conclusion from this simple test: the text import behaviour depends on which software generated the PDF. In the OP's case, it might be caused by that particular PDF generator, but I don't know.

December 27, 2018, 12:09:18 PM
Reply #15

brynn

Thanks for your testing.  Were you using shell mode, like the op?  Hopefully someone who understands all this better than me can make the bug report.  Then, developers can investigate.

December 28, 2018, 10:04:41 AM
Reply #16

TimiZero

Quote from brynn: "Were you using shell mode, like the op?"
No, I just opened the file by right-clicking and choosing "Open with Inkscape".

Quote from brynn: "Hopefully someone who understands all this better than me can make the bug report"
I think the problem is related to the OP's source software, about which we have no information. All three tests I did gave correct text colours. So there is a high chance that it's a bug or problem with opening PDFs generated by the OP's software.

Oh, it seems Mark is having the same problem. Maybe he can give us more info?
Quote from mark: "We are seeing similar issues that upon converting a PDF to an SVG using inkscape command line, the text colour is changing."

December 29, 2018, 07:55:27 AM
Reply #17

TimiZero

I just realized that the software information can be obtained by looking at the PDF properties.

So in the OP's case, the file was created by Wolfram Mathematica. After some quick googling, it turns out to be paid software, so there is no way to reproduce this unless somebody already has it installed.

But there's another interesting point to note: the OP's file is PDF version 1.6. Maybe this could be the reason somehow? I have no way to produce PDF 1.6, as the highest version any of my software can produce is 1.5.
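If someone wants to check whether the affected files share a PDF version, the header can be read with a few lines of script. This is only a sketch with a made-up function name, `pdf_header_version`. It looks only at the `%PDF-x.y` header at the start of the file, and a PDF may declare a different version elsewhere in the document, so treat the result as a hint.

```python
import re


def pdf_header_version(data: bytes):
    """Return the version from the '%PDF-x.y' header, or None if none is found."""
    # The header normally sits at the very start; search the first 1024 bytes
    # to tolerate a little leading junk.
    match = re.search(rb"%PDF-(\d+\.\d+)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def pdf_file_version(path):
    """Read the start of a file and report its header version."""
    with open(path, "rb") as handle:
        return pdf_header_version(handle.read(1024))
```

Running `pdf_file_version` over the files that convert correctly and those that don't would show whether the version really correlates with the colour problem.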