Mirror of https://github.com/kanzure/pdfparanoia.git (synced 2024-12-04 23:15:52 +01:00, latest commit 47bc734318)
Some publishers generate PDFs with the watermark inside the text of a page, in which case the object needs to be replaced. This change decompresses the object's stream and writes it back as plaintext. While this increases the size of the PDF, it is also effective for removing watermarks from the stream.
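The approach in the commit note above (expand the compressed page object, remove the watermark text, keep the result as plaintext) might look roughly like the sketch below. This is not pdfparanoia's actual code: the function name `strip_watermark_from_stream`, the marker list, and the sample page content are all hypothetical, and it assumes the stream is Flate-compressed and that the watermark occupies whole lines of the content stream.

```python
import zlib


def strip_watermark_from_stream(compressed_stream, markers):
    """Decompress a Flate-encoded stream and drop lines containing a marker.

    Hypothetical helper for illustration; `markers` is a list of byte
    strings that identify watermark lines.
    """
    # Expand the compressed stream into plaintext content.
    plain = zlib.decompress(compressed_stream)
    # Keep only the lines that contain none of the watermark markers.
    kept = [
        line
        for line in plain.split(b"\n")
        if not any(marker in line for marker in markers)
    ]
    # Return plaintext; it is larger than the compressed form but easy to filter.
    return b"\n".join(kept)


# Made-up page content: one real text line and one watermark line.
page_ops = b"BT\n(Real article text) Tj\n(Downloaded by example user) Tj\nET"
compressed = zlib.compress(page_ops)
cleaned = strip_watermark_from_stream(compressed, [b"Downloaded by"])
```

In a real PDF, the object that replaces the original would also need its compression filter entry removed and its stream length updated to match the plaintext; the sketch leaves that out.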
pdfparanoia
pdfparanoia is a PDF watermark removal library for academic papers.
Installing
Install from PyPI:
sudo pip install pdfparanoia
or, from a checkout of the source:
sudo python setup.py install
Usage
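import pdfparanoia

# Scrub the watermarked PDF; the input file is closed when the block exits.
with open("nmat91417.pdf", "rb") as input_file:
    pdf = pdfparanoia.scrub(input_file)

# Write the scrubbed PDF content to a new file.
with open("output.pdf", "wb") as output_file:
    output_file.write(pdf)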
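To scrub several papers at once, the same call can be wrapped in a small loop. The helper below is a sketch rather than part of pdfparanoia: the name `scrub_directory` and its parameters are hypothetical, and it assumes `pdfparanoia.scrub` accepts an open binary file and returns content that can be written in binary mode, as in the usage above. The `scrub` argument exists only so that a different function can be substituted, for example in tests.

```python
from pathlib import Path


def scrub_directory(src_dir, dst_dir, scrub=None):
    """Scrub every *.pdf in src_dir and write the results into dst_dir.

    Hypothetical helper; returns the list of output paths it wrote.
    """
    if scrub is None:
        # Imported lazily so the helper can be exercised without the library.
        import pdfparanoia
        scrub = pdfparanoia.scrub

    src_dir = Path(src_dir)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # Sort for a stable processing order; non-PDF files are skipped.
    for pdf_path in sorted(src_dir.glob("*.pdf")):
        with open(pdf_path, "rb") as input_file:
            cleaned = scrub(input_file)
        out_path = dst_dir / pdf_path.name
        with open(out_path, "wb") as output_file:
            output_file.write(cleaned)
        written.append(out_path)
    return written


# Example call (directory names are placeholders):
# scrub_directory("papers", "papers_clean")
```

Writing to a separate output directory keeps the original files untouched, which makes it easy to compare a scrubbed paper against its source.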
Changelog
- 0.0.9 - AIP: better checks for false-positives; IEEE: remove stdout garbage.
- 0.0.8 - IEEE support
- 0.0.1 - initial commit
License
BSD.