What is a PDF?
“Portable Document Format,” or PDF, is a file format that lets an author preserve a document in its original form, complete with text, images, and other formatting features. By its nature, PDF is not designed for editing; it is meant for documents that are ready for final distribution.
This can make translating from PDFs complicated, even though various types of software and strategies exist for working around the semi-permanent nature of these files.
Working with PDFs
You may come across two types of PDFs: application-generated and scanned.
Working with the former type is much less complicated, because the document was originally created in another computer application and then converted to PDF. You will be able to extract lines, paragraphs, pages, and entire sections of the document and save them as a Word document.
If you receive a scanned PDF, your work becomes more complicated, because the pages are images and their text cannot be edited directly. You can purchase OCR (optical character recognition) software such as ABBYY FineReader, TextBridge, or Readiris, which will read the image and convert it into text.
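One rough way to tell the two types apart before quoting a job is to check whether each page yields any extractable text. The sketch below is only an illustration under stated assumptions: it expects that you have already pulled the text of each page into a list of strings with a PDF library of your choice (for example, pypdf's `extract_text()`), and the function name `classify_pdf` and the `min_chars` threshold are hypothetical. Note that a scanned PDF that has already been run through OCR may carry a text layer and would look application-generated to this check.

```python
from typing import List


def classify_pdf(page_texts: List[str], min_chars: int = 20) -> str:
    """Guess the PDF type from the text extracted from each page.

    page_texts: one string per page, as returned by a PDF text extractor.
    min_chars:  hypothetical threshold; pages with fewer non-whitespace
                characters are treated as images with no text layer.
    """
    if not page_texts:
        raise ValueError("need at least one page")

    # Count pages that carry a usable amount of real text.
    text_pages = sum(
        1 for text in page_texts
        if len("".join(text.split())) >= min_chars
    )

    if text_pages == len(page_texts):
        return "application-generated"
    if text_pages == 0:
        return "scanned"
    return "mixed"


# Example with made-up page contents: one text page, two empty pages.
pages = ["Terms and Conditions of Sale, effective 1 January.", "", "   "]
print(classify_pdf(pages))  # mixed
```

A result of "scanned" or "mixed" suggests that at least part of the document will need OCR, which is worth knowing before you agree on a price and deadline.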
Whether you are working from application-generated or scanned PDFs, preserving the formatting of the original document is a challenge, especially if the document is rich in images and tables. Once you extract or convert the text, you can manipulate the Word or Rich Text document to match the formatting of the PDF, but this is time-consuming and rarely practical, especially if your contract with the client doesn’t reflect the extra time and effort needed to return a perfectly formatted finished document.
If it’s important to your client that the translation be formatted exactly like the original, request that they send the source files rather than a PDF. Working from the source files lets you provide a well-formatted translation efficiently.
You can then convert the finished document into a PDF for the client or return the Word or Rich Text document to them so that their graphic designers can lay out and format the piece as they wish.