Retyping long documents is a task that most of us would rather avoid. Thanks to Infix PDF Editor, we can. Infix can acquire a page from a scanner and convert the scanned image to editable text. And because its editing features are intuitive, it makes the whole process really straightforward.
Being able to edit a scanned document can come in really handy in a variety of situations:
- You can re-use content that you’ve already created and printed, even if the original digital file has been lost, without having to retype everything.
- It’s possible to change other people’s work and reprint it.
- You can fill in forms that you’ve picked up on paper. Instead of manually writing on the form and posting it back, you can type clear, legible answers and email it instead.
In this article, we’ll look at how text is scanned, recognised and edited in Infix PDF Editor. This article is relevant to the Standard and Pro versions of the application.
How Scanning Works
Infix PDF Editor can import a page directly from your scanner, so there’s no need to use a third-party application to obtain the image. You can initiate the scan from the Document menu, under Pages.
- Choose Insert From Scanner to add pages to the PDF you’re working on.
- Choose Create From Scanner to create a new file.
As you work your way through the scan wizard dialog boxes, make sure you tick the Recognise Text (OCR) checkbox. You’ll need to select the right language as well.
Finally, you’ll have to choose whether you want to acquire Editable Text or a Searchable Image.
Briefly, the choice you make here will depend on your plans for the document. Is it more important for you to edit the PDF content (Editable Text) or retain the look of the document (Searchable Image)? If this doesn’t make sense, you can read more about Editable Text vs Searchable Image on page 140 of the Infix PDF Editor user guide.
Note: you can also trigger this process on a document you’ve already opened. Click Document, then Recognise Text, and follow the same instructions.
Correcting Text in the Scan
The technology used to convert scanned images to text is called optical character recognition, or OCR. This is a standard term used in many different applications. Infix PDF Editor uses the same technology to convert the scanned image to characters that can be edited digitally.
Editing Hidden Text
You then process scanned text using OCR Mode. You activate OCR Mode under Document->OCR Corrections.
OCR Mode is an alternative editing mode that reveals text content that would ordinarily be hidden from the reader. Hidden text is indexed for searching, so it’s important that you correct mistakes before distributing the document.
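To see why uncorrected hidden text matters, here is a small Python sketch. It is only an illustration and says nothing about how Infix PDF Editor works internally: the function name, the page contents and the misread word are all made up for the example. It models each page’s hidden text as a plain string and shows that a word OCR has misread simply can’t be found by a search.

```python
# Hypothetical model: page number -> hidden (OCR) text for that page.
# This is an illustration only, not Infix PDF Editor's actual search logic.

def find_pages(hidden_text_by_page, query):
    """Return the page numbers whose hidden text contains the query (case-insensitive)."""
    needle = query.casefold()
    return [
        page
        for page, text in sorted(hidden_text_by_page.items())
        if needle in text.casefold()
    ]

# Assume OCR misread "modern" as "rnodern" on page 2 (an m/rn mix-up).
uncorrected = {1: "A modern approach to invoicing", 2: "Our rnodern billing system"}

# The same document after the mistake has been corrected.
corrected = {**uncorrected, 2: "Our modern billing system"}

print(find_pages(uncorrected, "modern"))  # [1]
print(find_pages(corrected, "modern"))    # [1, 2]
```

Before the correction, a reader searching for “modern” would never land on page 2, even though the word is plainly visible in the scanned image.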