I have been trying to import some HTML pages, which are mostly text, into PowerPoint 2003 presentations while keeping the text as text. A little bit of editing is needed too.
The obvious methods to get it into PPT do not work. Open the page in a browser, select all, copy, open a slide in PPT, paste: the result is an unformatted pile of text. Open the saved HTML in Word, select all, copy, paste into PPT: again, the same pile of unformatted text.
This version of PPT will accept (in theory) vector input in WMF or EMF formats. (Also CGM, which is supposedly supported by uniconverter, but does not appear as an option in Inkscape 0.48 on Windows.)
So open page in a browser, print to pdf.
Opened the PDF in Inkscape which works fairly well. Can edit the resulting objects and text _seems_ to be text. Can click the "A" icon and select a few words by sliding over them. HOWEVER, save as WMF or EMF format, and Insert -> Picture -> From file (use EMF one) in PPT, and the text which was imported from PDF is always converted into drawn objects. Each letter becomes its own drawn object. I think that is probably an issue with uniconverter. A further description of what happens is here:
viewtopic.php?f=5&t=10521
Strangely, if in the same inkscape document, after the PDF is imported, a line of text like "This is normal text" is created using the "Create and Edit Text Objects" button, and it is exported along with the imported PDF text, that line is imported into PPT correctly. So Inkscape seems to have two types of text, and the imported PDF text is somehow cursed as far as export to EMF/WMF is concerned.
Can somebody please explain how one can distinguish between these two types of text, and tell me if there is a way to convert the cursed text into normal text? In the example attached, all of the text is cursed except for the line "This is some text entered as a text box".
Two kinds of text???
- Attachments
-
- saf_test.svg
- example svg with cursed and normal text
- (81.36 KiB) Downloaded 157 times
Re: Two kinds of text???
I should add that when the Save as... (to EMF/WMF) is performed, a dialog pops up:
EMF convert
Convert texts to paths [ ]
[Cancel] [OK]
I do NOT check that box, and just click OK.
Re: Two kinds of text???
I edited the example down to just two lines, one of each type. The text that stays as text when saved as an EMF is in a <flowRegion> <flowPara> section, whereas the cursed text (from the PDF, converts to drawn objects) is not. Here is the code from the SVG. Simplified file is attached.
Is there some simple way to convert the latter type to the former?
Code: Select all
transform="matrix(0.8,0,0,-0.8,551.51097,962.3509)"><flowRegion
id="flowRegion3832"><rect
id="rect3834"
width="183.65956"
height="77.190247"
x="-334.04745"
y="464.30777" /></flowRegion><flowPara
id="flowPara3836">This is some text entered as a text box</flowPara></flowRoot><text
transform="scale(1,-1)"
id="text3050"
x="268.96753"
y="-651.39398"><tspan
style="font-size:8.76000023px;font-variant:normal;font-weight:normal;font-stretch:normal;writing-mode:lr-tb;fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:none;font-family:Courier New;-inkscape-font-specification:CourierNew"
x="268.96753 274.36633 279.76511 285.28391 290.68271 296.08148 301.48029 306.99908 312.39786 317.79666 323.19543 328.59424 334.11304 339.51181 344.91061 350.30939 355.70819 361.22699 366.62576 372.02457 377.42334 382.94214 388.34094 393.73972 399.13852 404.53729 410.05609 415.4549"
y="-651.39398"
sodipodi:role="line"
id="tspan3052">), Biology Division, Caltech</tspan></text>
- Attachments
-
- saf_test.svg
- simplified example
- (7.32 KiB) Downloaded 255 times
Re: Two kinds of text???
Hmm. Flowed seems to be a red herring. Added one more line of text by clicking the 'A' icon, clicking on the document, and just starting to type. That one is also successfully moved via EMF into PPT, but it has no flow attributes.
Aha, the "x" value for the ones that import as text is a single value, whereas the "x" value for the ones that break up on importing is a list. It looks like the PDF specified where every letter goes and that was imported into the SVG. That is:
x="162.36569"
vs.
x="268.96753 274.36633 279.76511 285.28391 290.68271 296.08148 301.48029 306.99908 312.39786 317.79666 323.19543 328.59424 334.11304 339.51181 344.91061 350.30939 355.70819 361.22699 366.62576 372.02457 377.42334 382.94214 388.34094 393.73972 399.13852 404.53729 410.05609 415.4549"
Do the experiment. With an editor, eliminate all but the first entry in that long list. Open the result in Inkscape, save as EMF, insert into PPT. Good, it stayed text! It can now be selected and edited normally within PPT.
Now, is there some way to convert text with lists of X to just the first X, without resorting to editing the SVG file???
Code: Select all
<text
xml:space="preserve"
style="font-size:16px;font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;text-align:start;line-height:125%;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;font-family:Arial;-inkscape-font-specification:Arial"
x="162.36569"
y="500.24118"
id="text3038"
sodipodi:linespacing="125%"
transform="matrix(0.8,0,0,-0.8,0,792)"><tspan
sodipodi:role="line"
id="tspan3040"
x="162.36569"
y="500.24118">This text was not dragged.</tspan></text>
- Attachments
-
- saf_test.svg
- modified again, x list reduced to x one value
- (7.63 KiB) Downloaded 250 times
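The hand edit described in this post (cut each "x" list down to its first value) could be scripted instead of done in a text editor. The following is only a rough sketch using Python's standard `xml.etree.ElementTree`, not an Inkscape feature; the function name `keep_first_x` and the tiny sample document are made up for illustration, and the x values are copied from the snippet above. It assumes the lists are separated by spaces or commas, as in the imported PDF text.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Register the prefixes Inkscape files normally use so the output keeps
# readable names instead of ns0:, ns1:, ...
ET.register_namespace("", SVG_NS)
ET.register_namespace("sodipodi", "http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd")
ET.register_namespace("inkscape", "http://www.inkscape.org/namespaces/inkscape")

TEXT_TAGS = {f"{{{SVG_NS}}}text", f"{{{SVG_NS}}}tspan"}


def keep_first_x(svg_source):
    """Return (new_svg_text, count) with each x list cut to its first value."""
    root = ET.fromstring(svg_source)
    changed = 0
    for elem in root.iter():
        if elem.tag not in TEXT_TAGS:
            continue
        xs = elem.get("x", "").replace(",", " ").split()
        if len(xs) > 1:           # a per-letter position list
            elem.set("x", xs[0])  # keep only the start of the run
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical two-element sample; the x values come from the snippet above.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text id="text3050" x="268.96753" y="-651.39398">'
    '<tspan id="tspan3052" x="268.96753 274.36633 279.76511" '
    'y="-651.39398">), </tspan>'
    '</text></svg>'
)
fixed, count = keep_first_x(sample)
print(count, "element(s) changed")
```

For a real file one would presumably load it with `ET.parse(path)` and write the tree back out to a copy. Dropping the per-letter positions changes how the text is laid out, so it is safer to keep the original file.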
Re: Two kinds of text???
mathog wrote:So open page in a browser, print to pdf.
Opened the PDF in Inkscape which works fairly well. Can edit the resulting objects and text _seems_ to be text. Can click the "A" icon and select a few words by sliding over them.
mathog wrote:Strangely, if in the same inkscape document, after the PDF is imported, a line of text like "This is normal text" is created using the "Create and Edit Text Objects" button, and it is exported along with the imported PDF text, that line is imported into PPT correctly. So Inkscape seems to have two types of text, and the imported PDF text is somehow cursed as far as export to EMF/WMF is concerned.
These are not really "two kinds of text": the difference is that the text resulting from importing a PDF file in Inkscape is absolutely kerned (each letter is absolutely positioned on the page).
Please read the notes on 'Importing PDF > Editing text' in the release notes of Inkscape 0.46, which explain why (this also applies to the current version, 0.48.2):
PDF and AI import > Text editing tips wrote:Text editing tips: Any text imported from PDF or AI has each letter's precise place on the page fixed. While this preserves the exact appearance (e.g. justification of text blocks) of the imported document, it makes editing such text difficult: deleting text fails to contract the text line and inserting text fails to expand it, i.e. typed letters overlay the existing letters. (However, you still can replace a letter with another letter of about the same width, although you may need to kern it into place with Alt+arrows.)
To work around this, select the text object you want to edit and use Text > Remove manual kerns command. This will remove the exact positioning information, so if the text block was justified it will lose justification, but instead you will be able to edit it as usual.
Note that there is a way to select even a single line in a text block. For this, open the XML editor, expand the <svg:text> tree branch corresponding to your text, and select any of the <svg:tspan> objects under it. Now you can remove manual kerns from this line only. After you finish editing the line, you can manually justify it back, for example by adding spaces, manual kerns (Alt+arrows), or by adjusting letterspacing (select the whole line and use Alt+> or Alt+<).
The native PDF/AI importer is based on the poppler library and was implemented by Miklós Erdélyi as part of the Google Summer of Code 2007.
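To see which text objects in a file carry this per-letter positioning, one option is to look for "x" attributes that hold more than one number. The sketch below is an illustration only, written with Python's standard library; `per_letter_runs` is a made-up helper name, and the sample is trimmed from the SVG fragments posted earlier in the thread. It assumes the coordinates are bare numbers without units, as they are in those fragments.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
TEXT_TAGS = {f"{{{SVG_NS}}}text", f"{{{SVG_NS}}}tspan"}


def per_letter_runs(svg_source):
    """Return [(id, gaps)] for text elements whose x attribute is a list.

    gaps holds the distance between consecutive letter positions,
    rounded to three decimals.
    """
    found = []
    for elem in ET.fromstring(svg_source).iter():
        if elem.tag not in TEXT_TAGS:
            continue
        # Assumes unitless numbers separated by spaces or commas.
        xs = [float(v) for v in elem.get("x", "").replace(",", " ").split()]
        if len(xs) > 1:
            gaps = [round(b - a, 3) for a, b in zip(xs, xs[1:])]
            found.append((elem.get("id"), gaps))
    return found


# Sample trimmed from the thread: one typed line, one line imported from PDF.
sample = """<svg xmlns="http://www.w3.org/2000/svg">
  <text id="text3038" x="162.36569" y="500.24118"><tspan id="tspan3040"
    x="162.36569" y="500.24118">This text was not dragged.</tspan></text>
  <text id="text3050" x="268.96753" y="-651.39398"><tspan id="tspan3052"
    x="268.96753 274.36633 279.76511 285.28391" y="-651.39398">), B</tspan></text>
</svg>"""

runs = per_letter_runs(sample)
for elem_id, gaps in runs:
    print(elem_id, gaps)
```

Only the tspan imported from the PDF is reported. Its gaps are not all equal even though the font is Courier New, which fits the description of each letter being pinned to its own place on the page.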
Re: Two kinds of text???
Right, manual kerning. Unfortunately,
select all
text -> remove manual kerning
massacres the layout of the text. Bizarrely, some of the text moves vertically, and other text moves to odd positions horizontally. Undo "remove manual kerns" and the page is NOT restored.
See the attached images for all of these effects. Hmm, they are showing up in the wrong order. The order should be before.png, unkerned.png, and then undo_unkern.png.
- Attachments
-
- Undo the remove manual kerns - result is a big mess.
- undo_unkern.png (76.64 KiB) Viewed 2008 times
-
- After text -> remove manual kerns
- unkerned.png (77.24 KiB) Viewed 2008 times
-
- Imported from PDF, edited down to just a couple of lines
- before.png (77.33 KiB) Viewed 2008 times
Re: Two kinds of text???
Well this is interesting. Opened a copy of the SVG and edited out the "X" kerning information - and the same thing happened as with text -> remove manual kerning. Bizarre. This SVG file is attached. Used diff to verify that no y value had been touched:
Code: Select all
$ diff saf_test_kernissue.svg saf_test_kernissue_edited_to_unkern.svg
127c127
< x="126.1731 131.5719 136.97064"
---
> x="126.1731"
138c138
< x="142.37299 147.89178 153.29053 158.68933"
---
> x="142.37299"
153c153
< x="169.49487 175.01367 180.41241 185.81122"
---
> x="169.49487"
158c158
< x="191.21527 196.73407 202.13287"
---
> x="191.21527"
173c173
< x="212.93488 218.33362 223.85242 229.25122 234.65002 240.04883"
---
> x="212.93488"
182c182
< x="142.37482 147.89362 153.29236 158.69116 164.08997 169.48877 175.00757 180.40631 185.80511 191.20392 196.72272 202.12152 207.52026 212.91907 218.31787 223.83667 229.23547 234.63422 240.03302 245.55182 250.95062 256.34943"
---
> x="142.37482"
Yet WHO definitely moves from one line to another, just as it did for the other method.
- Attachments
-
- saf_test_kernissue_edited_to_unkern.svg
- (8.21 KiB) Downloaded 257 times
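A line-based diff works for this check, but the same question (which attributes actually differ between the two files?) can also be asked per element. The following is a sketch under assumptions: it matches elements by their "id" attribute, the helper name `changed_attributes` is invented here, and the sample ids, text, and y value are hypothetical. Only the x values are taken from the diff above.

```python
import xml.etree.ElementTree as ET


def changed_attributes(svg_a, svg_b):
    """Return {element id: sorted names of attributes that differ}."""
    def by_id(source):
        # Map each element that has an id to its attribute dictionary.
        return {e.get("id"): e.attrib
                for e in ET.fromstring(source).iter() if e.get("id")}

    a, b = by_id(svg_a), by_id(svg_b)
    report = {}
    for elem_id in a.keys() & b.keys():
        names = sorted(k for k in a[elem_id].keys() | b[elem_id].keys()
                       if a[elem_id].get(k) != b[elem_id].get(k))
        if names:
            report[elem_id] = names
    return report


# Hypothetical ids, text and y value; x values are from the diff above.
before = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<text id="t1" x="126.1731" y="100">'
    '<tspan id="s1" x="126.1731 131.5719 136.97064" y="100">abc</tspan>'
    '</text></svg>'
)
after = before.replace('x="126.1731 131.5719 136.97064"', 'x="126.1731"')

report = changed_attributes(before, after)
print(report)
```

If every entry in the report lists only "x", the edit itself did not touch any "y" value, which matches what the diff shows.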