How to select text in PDF when entire page is an image

luvmywife
Posts: 5
Joined: Wed Sep 19, 2012 6:29 am

How to select text in PDF when entire page is an image

Postby luvmywife » Wed Sep 19, 2012 6:32 am

I open a PDF page in Inkscape. I choose the selection tool (arrow). I click on the text that I want to change, but the entire page is selected, not the text. The status bar at the bottom shows the page as an image. How can I select text and change it? I have attached the PDF in question.
Attachments
Pages from DP 600.050c Pharmanex DL Number and Expiration Date Calculation.pdf
(17.43 KiB) Downloaded 320 times

druban
Posts: 1917
Joined: Fri Nov 20, 2009 10:48 pm

Re: How to select text in PDF when entire page is an image

Postby druban » Wed Sep 19, 2012 8:26 am

Your file seems fine. Maybe the 'Import text as text' option in your PDF import dialog is not selected? That might be why the text is becoming an image. Anyway, imported text will never be in one big selectable block, just so you know, but all your text is editable. Here's the SVG.
Attachments
Pages from.svg
(37.92 KiB) Downloaded 300 times
Your mind is what you think it is.

luvmywife
Posts: 5
Joined: Wed Sep 19, 2012 6:29 am

Re: How to select text in PDF when entire page is an image

Postby luvmywife » Thu Sep 20, 2012 1:16 am

Your experience is different from mine. I have attached yet another file. When I open the pdf into Inkscape the option "Import text as text" is selected. I click on the arrow in Inkscape. I click on the text and the entire page is an image with black arrows on the outside of the page. I cannot select and modify any text. I don't know what I am doing wrong. I don't know why you are able to modify text and I am not.
Attachments
Pages from SOP 600.010c Label Printing.pdf
(241.06 KiB) Downloaded 198 times

druban
Posts: 1917
Joined: Fri Nov 20, 2009 10:48 pm

Re: How to select text in PDF when entire page is an image

Postby druban » Thu Sep 20, 2012 1:39 am

Hi, in PM you communicated that you're still having probs. Since this is not a terribly unusual problem it might be of benefit to new users to have this discussion here where it will be searchable in the future, so I hope that's OK with you.
First, are you on a Mac? If so then someone else will be along to help, and you can ignore my further comments.
If not, then, picking up where we left off, step by step:
Were you able to open the .svg file I attached in the last post and select the text by clicking on it? If the answer is no, then your problems are in your Inkscape installation - or worse!
If yes, then: how are you opening the PDF in Inkscape? Try starting Inkscape, then File > Open, and browse to the PDF. In the dialog that comes up, make sure that the option to "Import text as text" is chosen. Do you not see this option? Then stop right here and let us address that issue.
If you do see it, then continue, and when the file opens, press Ctrl+A to select all. Look down in the status bar. Does it say ".... blah de blah Group something"? If no, please post exactly what it says. If yes, press Ctrl+U to ungroup. Does it still say group something? Ungroup again, and keep doing it at least until your status line starts showing 'text' among the other things in the list. Now you should be able to select text objects to edit.

Want to leave your groups as they are for some reason? When your file first opens, try Ctrl+clicking on a bit of text. It should be selected as text (check the status line) and editable.
Still no joy? Then we will address other possibilities, and we will need your Inkscape version number and what platform (Linux/Mac/Windows) you are working on. You may need to upgrade your Inkscape to a current version.
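
For anyone who would rather check this outside the Inkscape window: once the imported page has been saved as an SVG, a short script can report roughly what the status bar reports. This is only a sketch using Python's standard library. The function name count_svg_objects is made up for illustration and is not part of Inkscape.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_svg_objects(svg_source):
    """Count <text>, <image>, and <g> elements in an SVG document string.

    Hypothetical helper for illustration; not an Inkscape API.
    """
    root = ET.fromstring(svg_source)
    counts = Counter()
    for element in root.iter():
        # Skip anything whose tag is not a plain string (e.g. comments).
        if not isinstance(element.tag, str):
            continue
        # Tags look like "{namespace}name"; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in ("text", "image", "g"):
            counts[local_name] += 1
    return counts


# Usage (the file name is an assumption):
#   with open("imported_page.svg", encoding="utf-8") as handle:
#       print(count_svg_objects(handle.read()))
```

A result with one image and no text means the page came in as a picture, and no amount of ungrouping will produce editable text.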
Your mind is what you think it is.

luvmywife
Posts: 5
Joined: Wed Sep 19, 2012 6:29 am

Re: How to select text in PDF when entire page is an image

Postby luvmywife » Thu Sep 20, 2012 4:03 am

I am on a PC using Win XP. Inkscape version 0.48.
Yes I was able to open the .svg and select the text and edit.
I open the pdf as you described. Open Inkscape then file>open and browse to the pdf, in the dialog that comes up I make sure "Import text as text" is chosen and it opens.
I do a CTRL-A and the status bar says the same thing as before CTRL-A: "Image 2544x3295 embedded in layer" (I have attached a jpeg printscreen)

brynn
Posts: 10309
Joined: Wed Sep 26, 2007 4:34 pm
Location: western USA

Re: How to select text in PDF when entire page is an image

Postby brynn » Thu Sep 20, 2012 4:58 am

In your first file, I can select the text. It shows as Text in the status bar. But all I can do is move it or delete it. I can't, for example, fix a typo. I can backspace, and it works as expected, but when I try to type a new letter, it types it on top of existing text, leaving the space open. If I instead highlight a few letters and type in new ones, the same thing happens: the first new letter is placed over an existing letter, and all the rest go on top of it.

In your 2nd file, the entire page is an image, and I cannot select the text.

luvmywife
Posts: 5
Joined: Wed Sep 19, 2012 6:29 am

Re: How to select text in PDF when entire page is an image

Postby luvmywife » Thu Sep 20, 2012 7:35 am

Forget the first image. How do I edit text in the 2nd image where the entire page is one image? Every PDF I open in Inkscape is an image with the black arrows on the outside of the image. I cannot select any text.
FYI, I am scanning the hardcopy via a photocopy/scanner which automatically creates PDFs on my Win computer. This is the source of my PDFs

brynn
Posts: 10309
Joined: Wed Sep 26, 2007 4:34 pm
Location: western USA

Re: How to select text in PDF when entire page is an image

Postby brynn » Thu Sep 20, 2012 9:12 am

Oh, well, you won't be able to select the text. At least not that I know of. When you scan a document, it makes an image, a raster image... hmm... well, actually it can probably make either an image or a text document. In my experience, scanners are not very good at creating text documents, at least not my scanner, lol. I don't know if you have the scanner set for image or text document, but as long as Inkscape identifies it as an image, you won't be able to edit the text.

druban
Posts: 1917
Joined: Fri Nov 20, 2009 10:48 pm

Re: How to select text in PDF when entire page is an image

Postby druban » Thu Sep 20, 2012 9:21 am

luvmywife wrote:Forget the first image.

Does that mean that following my instructions you are able to edit the original PDF that you first posted but that was not your problem after all? (Sorry to go back to that but there's no way to know what to do if we don't follow an organized route of inquiry, you being there and me being here, wherever here may be...)
luvmywife wrote:FYI, I am scanning the hardcopy via a photocopy/scanner which automatically creates PDFs on my Win computer. This is the source of my PDFs

Unless you have some very fancy OCR software this method will NEVER generate editable text, even though it might as a matter of course wrap the image in a PDF wrapper. End of story. Once again, I need to know that the first PDF you posted (I just can't forget it! :) ) was not generated in this manner, correct? Did you post it by mistake?
luvmywife wrote:Every PDF I open in Inkscape is an image with the black arrows on the outside of the image.

Can we stick to one file so we are all on the same ... ahem... page?
Here's a suggestion - open the attached pdf with inkscape and please tell us what happens. How many text objects do you see? 1, 2, or 0? More?
Also, have you considered the bugfix release Inkscape 0.48.2? Maybe that's the problem?
Attachments
test pdf.pdf
(41.74 KiB) Downloaded 207 times
Your mind is what you think it is.

rich2005
Posts: 55
Joined: Fri Mar 30, 2012 9:06 pm

Re: How to select text in PDF when entire page is an image

Postby rich2005 » Thu Sep 20, 2012 9:07 pm

Could I suggest that Inkscape is perhaps not the most appropriate tool for this job? LibreOffice now edits PDFs reasonably well.

The first pdf is composed of editable elements. Not a problem.

The second PDF has one line that edits ("date printed..") and the rest is very obviously a bitmap. You can edit this by overprinting in a box, that is, covering the old text with a filled box and typing new text on top. Of course you can do this in Inkscape as well. Tedious, but the other option is to edit in a bitmap editor such as GIMP, which is just as tedious.

example http://i.imgur.com/adPO7.jpg
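
The overprinting idea can also be sketched in code. The following is a minimal Python example, not something from Inkscape itself: it builds an SVG with the scanned page as the bottom object, a white box over the old text, and new text on top. The function name overprint_svg, the file name page.png, and the box coordinates are all hypothetical. The page size in the usage comment is the 2544x3295 that the status bar reported earlier in the thread.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def overprint_svg(image_href, page_w, page_h, box, new_text, font_size=40):
    """Return SVG markup: page image, white cover box, replacement text.

    box is (x, y, width, height) of the area to cover, in page units.
    Hypothetical helper for illustration only.
    """
    x, y, w, h = box
    svg = ET.Element(
        f"{{{SVG_NS}}}svg", {"width": str(page_w), "height": str(page_h)}
    )
    # Bottom object: the scanned page, linked as a raster image.
    ET.SubElement(svg, f"{{{SVG_NS}}}image", {
        f"{{{XLINK_NS}}}href": image_href,
        "x": "0", "y": "0",
        "width": str(page_w), "height": str(page_h),
    })
    # Middle object: a white box that hides the old text.
    ET.SubElement(svg, f"{{{SVG_NS}}}rect", {
        "x": str(x), "y": str(y),
        "width": str(w), "height": str(h),
        "fill": "white",
    })
    # Top object: the new text, with its baseline at the bottom of the box
    # (a rough placement; adjust by eye in Inkscape afterwards).
    text = ET.SubElement(svg, f"{{{SVG_NS}}}text", {
        "x": str(x), "y": str(y + h), "font-size": str(font_size),
    })
    text.text = new_text
    return ET.tostring(svg, encoding="unicode")


# Usage (all values are assumptions):
#   markup = overprint_svg("page.png", 2544, 3295, (300, 400, 900, 60),
#                          "Date printed: 21 Sep 2012")
```

Later objects are drawn over earlier ones, which is why the box has to come after the image and the text after the box.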

luvmywife
Posts: 5
Joined: Wed Sep 19, 2012 6:29 am

Re: How to select text in PDF when entire page is an image

Postby luvmywife » Fri Sep 21, 2012 2:05 am

To reply to druban.
1. I said to forget the first image because it is an anomaly. I have no idea why it is editable. I thought I had scanned it like all the others but the fact that it is editable and all my others aren't has me confused. Thus, being confused, I said to just forget it. However, If I can find out what I did differently with it, then I can solve the problem. But I can't figure out why it is different and editable. I will keep experimenting.

2. Someone said I could edit the 2nd by overprinting in a box. How is that done?
3. On your attached PDF, the status bar says, "Group of 37 objects in layer"
4. How else would I create a PDF that is editable, without using the photocopier/scanner?
5. Would Adobe Illustrator CS edit a PDF image? Can anything edit a PDF image?

I'm sorry, I'm a newbie and pretty dumb at all this and thank everyone for their input and patience.

druban
Posts: 1917
Joined: Fri Nov 20, 2009 10:48 pm

Re: How to select text in PDF when entire page is an image

Postby druban » Fri Sep 21, 2012 7:10 am

luvmywife wrote:I have no idea why it is editable. I thought I had scanned it like all the others but the fact that it is editable and all my others aren't has me confused.

Me too! Does your scanner have an OCR option? That's Optical Character Recognition. It's the only way to generate editable text from a physical object through a scanner. Basically you scan the page, aligning it as carefully as possible with the vertical and horizontal axes. Then you give the resulting image to a program that 'reads' the letters and makes a best guess at what the original text was. The results can range from startling to amusing to accurate. At this point it's your best bet, and in fact, if you actually did scan in that first document, it's the only way you could have gotten it so neatly editable.

To summarize what everyone is (probably) telling you: A pdf is a file format that can hold a combination of pictures and text and can be many pages or one page. Its original purpose was not to be edited but to be viewed and printed on many platforms and look essentially the same.

However, in spite of the original intent, it can be opened, and its contents separated and edited according to their original form. If a picture was put in, the best you can get out is a picture. If text was originally put in, then, depending on your tool, you can get text back out.
It's important to realize that a picture of text is, as far as software is concerned, pretty much the same as a picture of Elvis... But just as some software can recognize faces and maybe even put a name to them, there's software that can try to read the text in a picture and that's what I am referring to above.
The cost of such a program can vary from free (results are rather surreal but sometimes quite poetic) to very expensive (can transcribe a doctor's handwriting accurately). If your project budget allows it, you could buy such a program. When people say you can overtype, they mean, AFAIK, that you enter text by typing each letter while looking at the picture underneath.
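
The picture-versus-text distinction can often be checked before opening the file in anything. Below is a rough heuristic in Python, a sketch only and not a real PDF parser: it looks in the raw bytes for image objects and font resources. The function name classify_pdf_bytes is hypothetical. The main assumption is that the relevant dictionaries are stored uncompressed, which is not always true, so the function answers "inconclusive" when it finds compressed object streams and no visible fonts.

```python
import re


def classify_pdf_bytes(data):
    """Guess whether a PDF holds text, images, or both, from its raw bytes.

    Heuristic only: it searches for name tokens and does not parse the file.
    """
    has_images = re.search(rb"/Subtype\s*/Image\b", data) is not None
    # \b keeps /FontDescriptor and /FontFile from counting as /Font.
    has_fonts = re.search(rb"/Font\b", data) is not None
    # Object streams can hide font dictionaries inside compressed data.
    has_object_streams = re.search(rb"/Type\s*/ObjStm\b", data) is not None

    if has_fonts and has_images:
        return "text and images"
    if has_fonts:
        return "text"
    if has_object_streams:
        return "inconclusive"
    if has_images:
        return "image only"
    return "inconclusive"


# Usage (the file name is an assumption):
#   with open("scan.pdf", "rb") as handle:
#       print(classify_pdf_bytes(handle.read()))
```

An "image only" answer is the scanner case described above: a picture in a PDF wrapper. A scan that went through OCR would normally report "text and images", since the recognised text is stored alongside the page image.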
Your mind is what you think it is.

rich2005
Posts: 55
Joined: Fri Mar 30, 2012 9:06 pm

Re: How to select text in PDF when entire page is an image

Postby rich2005 » Fri Sep 21, 2012 6:51 pm

For the OP, a little video.

http://youtu.be/e4BuL-OKJc4 (3 mins) not exactly Screencasters.

It is one way using Inkscape. It is a lot easier with LO (LibreOffice), but that is another application.

Don't try changing your 10 Euro notes into 100's ;)

