mirror of
https://github.com/kanzure/pdfparanoia.git
synced 2024-06-17 08:09:51 +02:00
# pdfparanoia

pdfparanoia is a PDF watermark removal library for academic papers. Some publishers include private information such as institution names, personal names, IP addresses, timestamps and other identifying details in watermarks on each page.

## Installing

Simple.

``` bash
sudo pip install pdfparanoia
```

or,

``` bash
sudo python setup.py install
```

pdfparanoia is written for Python 2.7+ or Python 3.
If you do not use pip to install pdfparanoia, you will also need to install "pdfminer" manually.

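After installing from source, a quick check like the one below can confirm that the dependency is importable. This is only a minimal sketch for Python 3: `missing_modules` is a hypothetical helper, not part of pdfparanoia, and it assumes the dependency is importable under the name `pdfminer`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: the "pdfminer" dependency is importable as `pdfminer`.
    missing = missing_modules(["pdfminer"])
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules are importable.")
```
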
## Usage

``` python
import pdfparanoia

# Open the watermarked PDF in binary mode and scrub it.
with open("nmat91417.pdf", "rb") as input_file:
    pdf = pdfparanoia.scrub(input_file)

# Write the scrubbed PDF data to a new file.
with open("output.pdf", "wb") as file_handler:
    file_handler.write(pdf)
```

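The same call can be applied to a whole directory of papers. The following is a minimal sketch, not part of pdfparanoia: `scrub_directory` and its `scrub` parameter are hypothetical names, and it assumes, as in the example above, that `pdfparanoia.scrub` accepts a file opened in binary mode and returns data that can be written to a file opened in binary mode.

```python
import os


def scrub_directory(input_dir, output_dir, scrub=None):
    """Scrub every .pdf file in input_dir and write the results to output_dir.

    Returns the sorted list of file names that were processed.
    """
    if scrub is None:
        # Imported lazily so the helper can be tried with a stand-in callable.
        import pdfparanoia
        scrub = pdfparanoia.scrub

    os.makedirs(output_dir, exist_ok=True)
    processed = []
    for name in sorted(os.listdir(input_dir)):
        if not name.lower().endswith(".pdf"):
            continue  # skip anything that is not a PDF
        # Read in binary mode, scrub, then write the result in binary mode.
        with open(os.path.join(input_dir, name), "rb") as input_file:
            cleaned = scrub(input_file)
        with open(os.path.join(output_dir, name), "wb") as output_file:
            output_file.write(cleaned)
        processed.append(name)
    return processed
```

For example, `scrub_directory("papers", "papers_clean")` would process each PDF in a (hypothetical) `papers` directory and leave the originals untouched.
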
pdfparanoia can also be run from the shell:

``` bash
pdfparanoia --verbose input.pdf -o output.pdf
```

It also reads a PDF from standard input and writes the result to standard output:

``` bash
cat input.pdf | pdfparanoia > output.pdf
```

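The two shell invocations above can also be driven from a Python 3 script. This is a rough sketch under a few assumptions: the function names are hypothetical, the `pdfparanoia` command is on `PATH`, and `--verbose` is optional. It uses only the flags shown above.

```python
import subprocess


def build_command(input_path, output_path, verbose=False):
    """Build the argument list for the file-based invocation."""
    command = ["pdfparanoia"]
    if verbose:
        command.append("--verbose")
    command += [input_path, "-o", output_path]
    return command


def scrub_file(input_path, output_path, verbose=False):
    """Run pdfparanoia on a file; raises if the command exits with a non-zero status."""
    subprocess.run(build_command(input_path, output_path, verbose), check=True)


def scrub_bytes(pdf_data):
    """Pipe PDF bytes through pdfparanoia, like `cat input.pdf | pdfparanoia`."""
    result = subprocess.run(
        ["pdfparanoia"], input=pdf_data, stdout=subprocess.PIPE, check=True
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.
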
## Supported

* AIP
* IEEE
* JSTOR
* RSC
* SPIE (sort of)

## Changelog

* 0.0.13 - RSC
* 0.0.12 - SPIE
* 0.0.11 - pdfparanoia command-line interface. Use it by either piping in PDF data or passing a path to a PDF as the first argument.
* 0.0.10 - JSTOR
* 0.0.9 - AIP: better checks for false positives; IEEE: remove stdout garbage.
* 0.0.8 - IEEE

## License

BSD.