How to Read and Add Comments to a PDF Document
Explore how to read and add comments to specific keywords within PDF documents using Python and the PyMuPDF library. This lesson helps you efficiently insert, view, and manage comments directly in PDFs, which supports better collaboration and review processes in document handling.
Introduction
Comments act like revision tools for people reviewing and exchanging PDF documents.
Conventionally, comments added to a document raise questions, ideas, or concerns about a particular section or keyword in that document.
In general, you can add comments to any PDF document, unless security constraints have been applied to the document that prohibit commenting.
Objective
When we need to communicate with our colleagues about the content of a PDF document, it’s more straightforward to insert our comments in the PDF itself than to write them up and send them by email or through another communication channel.
This lesson lays out the steps required to read comments and to put comments on a specific keyword in a PDF document, using a command-line utility written in Python.
Requirements
We need the following libraries to add comments to a PDF document:
PyMuPDF
This third-party Python library, developed by Artifex Software Inc., provides Python bindings for MuPDF. It runs on multiple platforms, including Windows, Linux, and macOS.
Filetype
This dependency-free Python library infers the type, as well as the MIME type, of a file or a buffer by checking its signature.
| Library | Version |
|---|---|
| PyMuPDF | 1.18.9 |
| Filetype | 1.0.7 |
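To make "checking signatures" concrete, here is a minimal standard-library sketch of the idea for the PDF case only. It is a simplified stand-in, not Filetype's implementation, and the names `PDF_SIGNATURE`, `looks_like_pdf`, and `file_looks_like_pdf` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Assumption for this sketch: a PDF is recognized by the bytes "%PDF-"
# at the very start of the file or buffer.
PDF_SIGNATURE = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the buffer starts with the PDF signature."""
    return data.startswith(PDF_SIGNATURE)


def file_looks_like_pdf(path: str) -> bool:
    """Read only the leading bytes of the file and check its signature."""
    with Path(path).open("rb") as handle:
        return looks_like_pdf(handle.read(len(PDF_SIGNATURE)))
```

With Filetype itself, `filetype.guess(path)` plays this role: it returns an object whose `mime` attribute describes the detected type, or `None` when the type is not recognized.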
Code explanation
The comment_pdf function is the main function of our utility. It performs the following steps:
- Open the selected PDF document (Line 11).
- Iterate through its pages, skipping any page that was not selected (Lines 14-19).
- Using the function searchFor, search for the keyword (parameter search_text) to put a comment on. This function returns a list representing the positions of the found instances of this keyword (Line 22).
- Loop through the found instances of the keyword (Line 27).
- Enclose each found instance with a bounding box, dashed and colored in blue (Lines 29-31).
- Add the supplied comment to the found instance (Lines 34-38).
- Save the comment set (Line 40).
- Save the processed document to the output file (Line 43).
- Close the document.
- Display a summary of the executed process.
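The steps above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the lesson's original listing, so its line numbers do not match the ones cited in the list. It uses the snake_case method names of newer PyMuPDF releases (`search_for`, `add_rect_annot`, `set_border`, `set_colors`, `set_info`); on version 1.18.9 the camelCase equivalents such as `searchFor` and `addRectAnnot` apply. The helpers `is_selected` and `annotate_matches`, the parameter names `title` and `comment`, and the border width are assumptions made for the example.

```python
from typing import Collection, Optional


def is_selected(page_number: int, pages: Optional[Collection[int]]) -> bool:
    """Pages are 1-based; an empty or missing selection means every page."""
    return not pages or page_number in pages


def annotate_matches(page, search_text: str, title: str, comment: str) -> int:
    """Box every hit of search_text on one page and attach the comment."""
    hits = page.search_for(search_text)  # one rectangle per found instance
    for rect in hits:
        annot = page.add_rect_annot(rect)
        annot.set_border(width=0.5, dashes=[2])  # dashed outline (assumed width)
        annot.set_colors(stroke=(0, 0, 1))       # blue, RGB components in 0..1
        annot.set_info(title=title, content=comment)
        annot.update()                           # commit the annotation changes
    return len(hits)


def comment_pdf(input_file, search_text, title, comment, output_file, pages=None) -> int:
    """Annotate every selected page and write the result to output_file."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    doc = fitz.open(input_file)
    total = 0
    try:
        for index, page in enumerate(doc):
            if is_selected(index + 1, pages):
                total += annotate_matches(page, search_text, title, comment)
        doc.save(output_file)
    finally:
        doc.close()
    print(f"{total} comment(s) added to {output_file}")
    return total
```

Keeping the per-page work in its own function separates the annotation logic from file handling, which makes it easier to try out on a single page.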
Test scenario
Let’s put a comment on a specified keyword in a sample PDF document.
Run the code snippet and inspect the generated output.
We will notice that the value we searched for is enclosed in a dashed blue box, and that the comment we specified is displayed when we hover the mouse over it.
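The comments can also be read back programmatically instead of by hovering. The sketch below is one possible way to do that, assuming PyMuPDF's `Page.annots()` iterator and the `info` dictionary of each annotation; the function names `collect_comments` and `read_comments` are hypothetical and introduced only for this example.

```python
def collect_comments(pages):
    """Return a (page_number, title, content) tuple for every annotation."""
    found = []
    for index, page in enumerate(pages):
        for annot in page.annots():
            info = annot.info  # dict with keys such as "title" and "content"
            found.append((index + 1, info.get("title", ""), info.get("content", "")))
    return found


def read_comments(pdf_path):
    """Open a PDF with PyMuPDF and list the comments on all of its pages."""
    import fitz  # PyMuPDF; imported here so collect_comments works without it

    doc = fitz.open(pdf_path)
    try:
        return collect_comments(doc)
    finally:
        doc.close()
```

Calling `read_comments` on the output file of the utility should list one entry per commented keyword instance, with 1-based page numbers.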
We highly encourage you to change the code snippet and develop your own test cases as well.
Conclusion
Comments become a necessity when developing documents collaboratively. In short, they highlight the areas that need attention.