A PDF made from a layered or text-based file will open in Acrobat and let you highlight text for copying or editing. Sometimes, though, PDF files come from programs like Photoshop, which flatten the text and don’t allow highlighting in Acrobat.
OCR stands for Optical Character Recognition. There are tons of OCR programs that will read an image file and turn the text into editable letters, and Adobe Acrobat has added this to its available features.
Someone sent me a flattened PDF file that would not let me copy and paste text. Of course I did not want to retype 35 pages, so I started looking for a way to grab the text.
- First I opened the file in Adobe Acrobat.
- I opened Tools on the right.
- I tried to use the text tool under Content, Edit Text & Objects but could not grab any text.
- Next I selected Recognize Text and clicked the first option I saw, “In this file.”
- The built-in OCR ran through all the pages and converted the text to actual letters I could select.
- I again went into Content, Edit Text & Objects and selected the text tool.
- This time it let me select the text, which I was able to copy and paste into my presentation file.
That just saved me hours of typing (and hours of proofing, because I type horribly). For the most part, it was very easy to highlight the type, then copy and paste it into my presentation. But like any OCR, the software sometimes has a hard time recognizing some of the type.
- To refine the OCR conversion I went back into Recognize Text and selected Find All Suspects under OCR Suspects.
It went through my text and highlighted the areas it could not distinguish. I clicked Accept and Find, which converted that word to text and moved on to the next one. With this extra step I was able to make even better selections to copy and paste actual characters of text.
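To make the idea of OCR "suspects" a little more concrete, here is a minimal Python sketch. It is not how Acrobat works internally (I don't know that), just an illustration under one assumption: that an OCR engine gives each recognized word a confidence score, and anything below a cutoff gets flagged for review. The names `RecognizedWord`, `find_suspects`, `review`, the 0.80 threshold, and the sample words are all made up for this example.

```python
from dataclasses import dataclass


@dataclass
class RecognizedWord:
    text: str
    confidence: float  # hypothetical score from 0.0 (no idea) to 1.0 (certain)


def find_suspects(words, threshold=0.80):
    """Return (index, word) pairs whose confidence is below the threshold."""
    return [(i, w) for i, w in enumerate(words) if w.confidence < threshold]


def review(words, corrections, threshold=0.80):
    """Walk the suspects; use a correction if one is given, else accept as-is."""
    result = [w.text for w in words]
    for i, w in find_suspects(words, threshold):
        result[i] = corrections.get(i, w.text)
    return " ".join(result)


# Made-up sample data: the middle word was misread by the imaginary OCR pass.
words = [
    RecognizedWord("Quarterly", 0.98),
    RecognizedWord("rn0dern", 0.41),
    RecognizedWord("report", 0.95),
]

suspects = find_suspects(words)
print([w.text for _, w in suspects])
print(review(words, {1: "modern"}))
```

Stepping through the suspects one at a time, accepting or correcting each, is roughly what clicking Accept and Find felt like.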
Like most OCR software, it’s not perfect, but what a great feature to have.