Want to copy text from a scanned PDF file? It’s simple with the right tool.
Today, PDF is one of the main document formats used on the internet. PDFs are secure, easy to view, easy to share, and take up little space on your hard drive. Awesome.
The number one problem with PDFs shows up when you need to edit or copy the content of a file, especially a scanned one. Not being able to edit a PDF is great for security, but it makes PDF a poor format for collaboration.
In this step-by-step tutorial, I’m going to show you four easy ways to extract text from a scanned PDF using the right tools, online or offline.
How to Extract text from scanned pdf
It’s very easy to copy text from any PDF file except a scanned document. A scanned PDF stores each page as an image, so there is no text to select.
To copy text from a scanned PDF, you first need an Optical Character Recognition (OCR) tool (onlineocr.net, for example) to convert the document into an editable format such as Microsoft Word. Only then can you copy the content from the editable file. Plain and simple.
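If you’re not sure whether a PDF is scanned, a rough check is to see whether the file contains any font data, since a page that is purely an image usually has none. Here is a minimal Python sketch of that idea using only the standard library. The function name `looks_scanned` is my own, and this is a heuristic rather than a definitive test: a scan that has already been through OCR will carry fonts, and an unusual file can fool it.

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" in a PDF file.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def looks_scanned(pdf_bytes: bytes) -> bool:
    """Heuristic: return True when no font reference is found in the file."""
    chunks = [pdf_bytes]
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # Newer PDFs may keep page dictionaries inside compressed streams,
            # so try to inflate each stream and search the result as well.
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not zlib-compressed (for example a JPEG image); skip it
    return not any(b"/Font" in chunk for chunk in chunks)
```

To try it, read the file with `open(path, "rb").read()` and pass the bytes in. If it returns `True`, you will most likely need one of the OCR options in this tutorial.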
Let me now show you some of the best tools (OCR software) that you can use to copy text from any scanned pdf file.
NOTE: With all the options covered here, the quality of the output depends largely on the quality of your scan. To get accurate text from a scanned document, the text in the PDF shouldn’t be dark or blurry, it should be clear enough to read, and the pages should not be upside down.
With this in mind, let’s get started.
Option 1: Copy text from scanned pdf online
There are a gazillion online tools that can convert a PDF to other formats such as Word, Excel, or even PowerPoint. However, a lot of these tools don’t support scanned PDFs.
This is where onlineocr.net comes into play.
onlineocr.net is an online tool specialized in converting any scanned PDF file into a format that allows you to copy the text or even make edits to it.
NOTE: This option works on both Mac and Windows.
Follow these step-by-step instructions to use this tool.
- Open your browser and go to onlineocr.net
You’ll be taken to a page with simple, easy-to-use controls. Use these controls to upload your PDF file, select the language of the file, and choose the output format.
- Using the Select button, choose the scanned pdf file from your PC.
Wait for the upload to complete.
- Specify the Language of the scanned document. This is to ensure accuracy in the conversion process.
- Also, select the output format you want. In most cases, Microsoft Word will be the ideal format. If your scanned pdf contains lots of tables, you should select Excel instead.
- Then click on the Convert button.
You’ll have to wait for the tool to perform its magic. It may take a while depending on the number of pages and file size of your pdf file. Small files convert much faster than big files.
When the tool successfully converts the file, you’ll see a download button that you can use to download the converted file in the format you chose earlier (Word, Excel, or Plain Text).
You’ll also see a text box that contains the text of the scanned pdf. Thus, you can either copy the portion you want from this text box or download and use the output file as a Microsoft Word document.
- To copy the text, just select the portion you need and press Ctrl+C to copy. Then switch over to wherever you need it and press Ctrl+V to paste.
This is how you can copy text from a scanned PDF online for free.
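Text copied out of an OCR result often keeps the line breaks and end-of-line hyphens of the scanned page. If you paste a lot of it, a small clean-up script can save some manual fixing. The sketch below is one way to do it in Python. The function name `tidy_ocr_text` is my own, and the rules are deliberately simple, so check the result: a genuine hyphenated word that happens to fall at the end of a line would be merged too.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Rejoin words split by end-of-line hyphens and unwrap hard line breaks."""
    text = raw.replace("\r\n", "\n")
    # "docu-\nment" becomes "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # A single line break inside a paragraph becomes a space;
    # blank lines between paragraphs are kept.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs left over from the page layout.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()
```

Paste the copied text into a string, pass it to `tidy_ocr_text`, and you get back paragraphs that flow normally in Word or an email.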
Option 2: Convert scanned PDF with Google Docs OCR
Another free way to extract and copy text from a PDF is the OCR feature built into Google Docs.
Using Google Docs, you can edit and copy text from a scanned PDF with ease.
NOTE: Copying text from scanned pdf with Google Docs works on both Mac and Windows. However, it does not work on Mobile devices at the time of this writing.
Follow these instructions to do it.
- Launch your browser and log in to your Google Drive account.
Everybody who has a Gmail account also has a Google Drive account. If you don’t know how to get to Google Drive, simply log in to your Gmail account and look at the top right corner of the window. You’ll see a square made of dots; click on it to show all the Google apps, then click on the Drive icon. It opens in a new tab.
- Upload the scanned PDF file to your Google Drive.
In the top left corner, you’ll see the New button. Click on it and select File upload from the drop-down. Navigate to the location of your scanned PDF file and upload it.
In no time you should see a message in the bottom right corner of your window informing you that your file has been uploaded.
- Right-click on the uploaded file and go to Open with > Google Docs.
The scanned document will open in Google Docs in a new tab.
Google Docs will then convert all the text in the scanned PDF into an editable format.
You’ll have to wait a while for the conversion to complete. The fewer the pages, the faster it goes.
Once the conversion finishes, the scanned PDF appears in Google Docs as editable text.
When you switch back to your Google Drive tab, you’ll notice a new Google Docs version of the scanned document. You can download this document from there.
- Copy the text you need or download as Microsoft Word
Scroll to the portion of the document that has the text you need to copy, select the text and press Ctrl+C to copy. Switch over to where you need the text and press Ctrl+V to paste.
However, if you need all the text in the document, it is better to download it as a Microsoft Word document. This way, you’ll have a complete editable Word version of your scanned PDF.
To download, go to File > Download > Microsoft Word (.docx).
This is how you can extract the text from a scanned PDF document using Google Docs.
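If you would rather pull all the text out of the downloaded Word file without opening it, a .docx file is a ZIP archive whose main text lives in `word/document.xml`. The Python sketch below reads the paragraphs using only the standard library. The function name `docx_paragraphs` is my own, and it is a simplified reader: it ignores headers, footers, and footnotes, and text inside tables comes out as ordinary paragraphs.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path_or_file) -> list[str]:
    """Return the text of each non-empty paragraph in a .docx file."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Call it with the path of the downloaded file, then join the result with `"\n\n".join(...)` to get plain text you can paste anywhere.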
NOTE: This same method can be used to extract text from images. The output will be a Google Docs file with the image as well as the text on the image. However, converting from PDF gives better output than images.
Option 3: Copy Text Using PDF Element
PDF Element by Wondershare is one of the leading PDF editors and offers a simple way to create, edit, and convert PDF documents on both Windows and Mac.
Once you get the software on your device, everything works completely offline.
To copy text from scanned documents using this software, go to pdf.wondershare.com and download PDF Element (free trial).
After downloading the software, follow the on-screen instructions to install and launch the app.
When the app is ready for use, follow these instructions to extract text from a scanned PDF file:
- Click on the Open File… button to select the scanned pdf file whose text you want to copy.
After opening the file in this app, you’ll see a notification bar at the top saying that your document is a scanned PDF and therefore needs to be converted with OCR.
- Click the Perform OCR button.
- You may be asked to download the OCR component if it wasn’t included in your installation package. When that happens, hit the Click to Download button.
- Wait for the download to finish. This may take a while depending on your internet download speed.
- After the download is complete, you’ll see a dialog saying that you can now turn any scanned document into an editable or searchable file. Click on OK to proceed.
- PDF Element will now begin to extract all the text from the scanned document into an editable or searchable format. Wait for this process to complete.
- Now, you should be able to copy and even edit the text from the extracted format.
At this point, you can go to File > Save As to save the scanned PDF as an editable Word document.
This is how you can use PDF Element to extract text from a scanned PDF file.
Option 4: Copy Text from Scanned PDF with Adobe Acrobat Pro
When it comes to anything concerning PDFs, scanned or not, Adobe Acrobat Pro is your safe bet.
The price is the only reason why I wouldn’t recommend this software for your PDF-related tasks. Other than that, it is the heavyweight in this space and will continue to rule for a very long time.
This package provides everything you need to deal with PDFs. You can create, edit, and extract or copy text from any scanned pdf document without wasting much time.
If you don’t have Adobe Acrobat Pro DC installed on your system, you can download the free trial from Adobe. It’s free for 7 days only.
If you are still reading this, I’m pretty sure you have downloaded and installed the software, or maybe it was already on your computer. If so, follow the instructions below to copy text from a scanned PDF.
- Open the scanned PDF in Adobe Acrobat Pro DC.
- On the tools pane at the right, click the Edit PDF tool.
- In just a moment, the scanned pdf will be converted to an editable or searchable document.
When this happens, you can freely copy text from any part of the document.
You can also save the file as a Microsoft Word document. This way, you’ll be able to edit the pdf content right inside Word.
Conclusion
All four methods discussed in this article are good ways to extract and copy text from a scanned PDF document.
I recommend the first two options if you don’t deal with PDF documents on a daily basis. They provide a quick way to get the job done without having to install any OCR software.
However, if your work or organization requires you to work with PDF files regularly, I recommend that you invest some money in either PDF Element or Adobe Acrobat Pro. Both options are great and will make your work much simpler and more fun.
For any questions or suggestions, please let me know in the comments section below.