Commit Graph

6 Commits

Author SHA1 Message Date
Bryan Bishop caed396870 SPIE watermark removal
This is slightly broken because the SPIE plugin removes more than just
watermarks. For some reason it seems to also remove images and large
blocks of text from the paper. However, the object that is being removed
is tiny. In the unit testing sample, the removed object is pdf stream
55.

For now, SPIE is partially disabled until this is fixed. The problem
does not originate from the other plugins.

fixes #20
2013-02-11 23:52:59 -06:00
Bryan Bishop e108a43e26 make eraser handle more pdf formats 2013-02-07 03:56:18 -06:00
Bryan Bishop 47bc734318 replace_object_with - alternative removal method
Some publishers generate pdfs with the watermarks inside the text of a
page, in which case the object needs to be replaced. This inflates
(decompresses) the object and uses plaintext instead. While this
increases the size of the pdf, it is also effective for removing
watermarks from the stream.
2013-02-06 17:27:12 -06:00
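
The replacement approach described in the commit above can be illustrated with a minimal sketch. This is not the project's actual `replace_object_with` code: the function name `strip_watermark_from_stream`, the sample page content, and the watermark string are all hypothetical, and the sketch assumes the stream body is Flate-compressed (zlib).

```python
import zlib


def strip_watermark_from_stream(compressed: bytes, watermark: bytes) -> bytes:
    """Decompress a Flate-encoded stream body and cut the watermark out of it.

    Hypothetical helper: returns the cleaned content as uncompressed plaintext.
    """
    plain = zlib.decompress(compressed)
    return plain.replace(watermark, b"")


# Hypothetical page content where the watermark sits next to the real text.
page = b"BT (Real text) Tj ET\nBT (Downloaded by example) Tj ET\n"
compressed = zlib.compress(page)

cleaned = strip_watermark_from_stream(
    compressed, b"BT (Downloaded by example) Tj ET\n"
)
```

Writing the cleaned bytes back uncompressed is what makes the pdf larger. A real implementation would also have to update the stream dictionary (drop the `/Filter` entry and correct `/Length`), which this sketch leaves out.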
Bryan Bishop 8eb8797eeb support pdf formats with whitespace line endings
JSTOR pdfs have trailing whitespace at the end of each line. Though
their watermarks are not yet removable, this makes it possible to parse
their files in the future, along with those of any other publisher that
does similar things.

see #1
2013-02-05 19:07:28 -06:00
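
One way to tolerate trailing whitespace, as the commit above describes, is to strip each line before matching it. The sketch below is an assumption about how such parsing could look, not the project's parser: `OBJ_HEADER`, `find_object_headers`, and the sample bytes are hypothetical names and data.

```python
import re

# Matches an object header such as b"12 0 obj" once trailing whitespace is gone.
OBJ_HEADER = re.compile(rb"^(\d+)\s+(\d+)\s+obj$")


def find_object_headers(data: bytes):
    """Return (object number, generation) pairs for every object header line."""
    headers = []
    for line in data.splitlines():
        # rstrip() drops trailing spaces and tabs left at the end of the line.
        match = OBJ_HEADER.match(line.rstrip())
        if match:
            headers.append((int(match.group(1)), int(match.group(2))))
    return headers


# Hypothetical input: some lines end in a space or tab before the newline.
sample = b"1 0 obj \n<< /Type /Catalog >> \nendobj \n2 0 obj\t\n<< >>\nendobj\n"
headers = find_object_headers(sample)
```

Stripping at the line level keeps the pattern itself strict, so the same code path handles files with and without the extra whitespace.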
Bryan Bishop 14f1439c76 ieee watermark removal 2013-02-05 04:49:56 -06:00
Bryan Bishop d8fc6c1d8f initial commit 2013-02-05 03:10:14 -06:00