Bryan Bishop
681af5c175
upgrade nosetests from python2.7
2020-09-07 09:25:12 -05:00
Bryan Bishop
906150e033
render markdown on pypi
...
fixes #54
2020-09-07 09:21:19 -05:00
Bryan Bishop
585a0ac3a4
Merge pull request #51 from ashwini0529/master
...
fix python3 installation error
2017-06-04 20:55:45 -05:00
Ashwini Purohit
5669e4e289
fixes python3 installation error
...
Fixes invalid syntax error
2017-06-05 06:30:56 +05:30
Bryan Bishop
c594ff41d6
version bump to: v0.0.16
2016-05-29 14:09:00 -05:00
Bryan Bishop
1a01757f44
README: fix typo in Russian text
...
http://gnusha.org/logs/2016-02-25.log
2016-02-25 08:45:27 -06:00
Bryan Bishop
5cc682e2c5
Merge pull request #38 from fmap/pdfminer-api
...
PDFMiner made breaking interface changes
2013-12-06 15:27:42 -08:00
vi
e95374ec04
getobj can raise PDFObjectNotFound
2013-12-07 07:23:55 +08:00
vi
95a249d8ab
Package: require a version of PDFMiner from after the interface change (#37).
2013-12-07 07:23:47 +08:00
vi
380bc289b3
Adapt to PDFMiner's breaking interface changes (#37).
2013-12-07 07:23:34 +08:00
Bryan Bishop
713776af67
version bump to: v0.0.15
2013-09-16 15:14:27 -05:00
Bryan Bishop
e9e0ea4467
Merge pull request #34 from kanzure/fixsciencemag
...
Fix another syntax error in sciencemag
2013-09-16 13:12:37 -07:00
Bryan Bishop
61e67d2c4a
Merge pull request #33 from kanzure/fixsetup
...
Fix setup.py to not have a syntax error
2013-09-16 13:12:29 -07:00
Bryan Bishop
28bf8f5825
fix another syntax error in sciencemag
...
How were these missed??
2013-09-16 15:11:42 -05:00
Bryan Bishop
1ff513389f
wow, how did setup.py stay like that for so long?
2013-09-16 15:09:59 -05:00
Bryan Bishop
cc7d14d173
WIP of "AdBlock for Science"
...
The purpose of adblock for science is to remove nasty ads from papers,
which at the moment means only papers from Science Magazine as published
by the American Association for the Advancement of Science (AAAS).
I am really annoyed that I have to write an ad blocker... for science
papers.
2013-07-19 21:31:30 -05:00
Bryan Bishop
71aaf23285
io.StringIO fallback for py3k
2013-07-19 21:27:06 -05:00
Bryan Bishop
528eae7e46
minor py3k-compat changes
2013-07-19 21:26:12 -05:00
Bryan Bishop
59a71a7cd3
use io.StringIO when py3k
2013-07-19 21:25:42 -05:00
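The three py3k commits above describe an io.StringIO fallback. A minimal sketch of that kind of import fallback follows; the helper name make_buffer is hypothetical and not taken from the repository, and the actual commits may have structured the fallback differently.

```python
# Sketch only: pick a StringIO implementation that exists on the running
# interpreter. make_buffer is a hypothetical helper for illustration.
try:
    from StringIO import StringIO  # Python 2 location
except ImportError:
    from io import StringIO  # Python 3 location


def make_buffer(text):
    """Return an in-memory, file-like object positioned at the start of text."""
    buf = StringIO()
    buf.write(text)
    buf.seek(0)
    return buf
```

Note that on Python 3 raw PDF data is bytes, so real code handling PDF content would probably need io.BytesIO instead; that is outside what these commit messages state.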
Bryan Bishop
c3e590f22f
Revert "fixed self-referential package install and cleaned out __init__.py"
...
This reverts commit 2275565fb2.
__init__.py is needed for ./bin/pdfparanoia to work.
Conflicts:
setup.py
2013-07-19 20:19:44 -05:00
Cathal Garvey
f3e4b74b69
Amended readme for those not using pip to install.
2013-07-14 11:10:15 +01:00
Cathal Garvey
6030778089
Made dependencies vary by version to select py3k port of pdfminer if using Py3k
2013-07-14 11:01:27 +01:00
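The commit above makes the dependency list depend on the interpreter version. One way this could look in a setup script is sketched below; the function name pick_requirements is hypothetical, and the package name "pdfminer3k" is an assumption about which py3k port was meant, since the commit message does not name it.

```python
import sys


def pick_requirements(version_info=sys.version_info):
    """Return the install requirements for the given interpreter version.

    Assumption: "pdfminer3k" stands in for the py3k port of pdfminer;
    the commit message does not say which package was selected.
    """
    if version_info[0] >= 3:
        return ["pdfminer3k"]
    return ["pdfminer"]
```

A setup.py could then pass the result as install_requires=pick_requirements().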
Bryan Bishop
1070605316
use explicit imports
...
Mock tests will be much easier if explicit imports are used everywhere
instead of the previous format.
2013-07-09 01:47:18 -05:00
Bryan Bishop
cafb7330b6
bump version to 0.0.14
2013-07-09 01:35:23 -05:00
Bryan Bishop
13d388e7ee
Merge pull request #31 from delinquentme/master
...
Make pdfparanoia install on systems without pdfminer.
2013-07-08 23:35:52 -07:00
Carl Crott
2275565fb2
fixed self-referential package install and cleaned out __init__.py
2013-07-08 22:48:45 -07:00
Bryan Bishop
388d2b289e
README: write Russian intro
2013-05-21 14:02:33 -05:00
Bryan Bishop
d54f6e826c
Merge pull request #27 from DonnchaC/rsc
...
Check PDF is from the RSC before cleaning.
2013-05-13 13:06:23 -07:00
Donncha O'Cearbhaill
c673d77ec6
Check PDF is from the RSC before cleaning
2013-05-13 21:01:52 +01:00
Bryan Bishop
404e3577e0
Merge pull request #26 from DonnchaC/rsc
...
Watermark removal for Royal Society of Chemistry.
2013-05-13 12:52:40 -07:00
Donncha O'Cearbhaill
18140d838d
Adding support for PDFs from pubs.rsc.org
2013-05-13 20:28:35 +01:00
Bryan Bishop
9d26a0aa01
Merge pull request #25 from semorrison/master
...
comparediffs, a tool to download, scrub, and compare PDFs
2013-05-02 11:39:25 -07:00
Scott Morrison
2ec1ca21a6
hash bang
2013-05-02 23:29:07 +10:00
Scott Morrison
11b59bd544
fixing README
2013-05-02 23:27:49 +10:00
Scott Morrison
702f2e2895
adding README for tests/diff/
2013-05-02 23:25:10 +10:00
Scott Morrison
74649a1a05
merging urls.denied back into urls
2013-05-02 23:18:11 +10:00
Scott Morrison
27ad746861
adding a few more URLs for testing
2013-05-02 23:16:15 +10:00
Scott Morrison
2fb3783dea
comparediffs seems to be working nicely
2013-05-02 22:45:34 +10:00
Scott Morrison
54b6ab070a
initial attempt at pairwise diff testing
2013-05-02 20:06:18 +10:00
Bryan Bishop
6abfe2a380
Merge pull request #23 from cathalgarvey/master
...
Updated terminal script to use argparse.
2013-03-24 23:12:57 -07:00
Cathal Garvey
db514ff744
Fixed a few bugs so reading from stdin now works. Involves a potentially costly recast of file contents as StringIO.
2013-03-21 23:49:03 +00:00
Cathal Garvey
95e92420c9
Modified the "pdfparanoia" script in bin/ so it uses Argparse and the "with" context statement.
...
As python 2.6 was already commented as a potential environment, there seemed little
reason to not use Argparse rather than a sys.argv popping system; argparse offers
automatically generated usage documentation and can offer useful errors when input
is incorrect.
The "with" context statement is also highly excellent and should be used wherever
legacy support for old-timers using 2.6 is not needed.
2013-03-21 23:37:34 +00:00
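The commit above replaces sys.argv popping with argparse and adopts the "with" statement. A rough sketch of that pattern follows; the option names (-i/--input, -o/--output), the "-" convention for stdin/stdout, and both function names are illustrative assumptions, not the actual interface of bin/pdfparanoia.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical options; argparse generates --help."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch of a PDF-scrubbing command line.")
    parser.add_argument("-i", "--input", default="-",
                        help="input file path, or - for stdin (assumed)")
    parser.add_argument("-o", "--output", default="-",
                        help="output file path, or - for stdout (assumed)")
    return parser


def read_all(stream):
    """Read a file-like object fully; 'with' closes it even on error."""
    with stream:
        return stream.read()
```

With a parser like this, an unknown flag produces a usage message and a non-zero exit instead of an index error from manual argv handling.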
Bryan Bishop
0d1da12f71
README: expand the explanation
2013-02-14 03:43:23 -06:00
Bryan Bishop
ee483ab986
Merge pull request #21 from zooko/verbose-option
...
Verbosity argument.
2013-02-14 01:39:19 -08:00
Zooko O'Whielacronx
503b8aead5
add -v -v mode which prints out the details (potentially sensitive, potentially bulky)
...
remove spie, which appears to do nothing
2013-02-13 21:08:49 +00:00
Zooko O'Whielacronx
9204b2e17e
fix up verbose printouts, don't print out large data
2013-02-13 20:56:33 +00:00
Zooko O'Whielacronx
56cc7719da
add a "--verbose" option that writes to stderr if it finds anything to omit
...
Also cleaned up some flakes noticed by pyflakes, and made scrub() a @classmethod instead of a @staticmethod so I could use the class for the verbose output.
caveats:
* there are no unit tests of this patch
* now your logs of your stderr have potentially sensitive information in them
* the implementation of arg parsing is very low-tech; (a *good* way to do arg parsing is the "argparse" module)
2013-02-13 19:58:47 +00:00
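The commit above switches scrub() from @staticmethod to @classmethod so the class is available for verbose output. The sketch below shows why that helps; the class name ExamplePlugin, the MARKER attribute, the line-based filtering, and the log parameter are all hypothetical and do not reflect how the real plugins inspect PDF objects.

```python
import sys


class ExamplePlugin(object):
    """Hypothetical plugin; real plugins work on PDF objects, not text lines."""

    MARKER = "WATERMARK"  # assumed marker string, for illustration only

    @classmethod
    def scrub(cls, content, verbose=False, log=None):
        # A classmethod receives cls, so the report can name the plugin.
        # A staticmethod would have no reference to the class at all.
        log = log if log is not None else sys.stderr
        kept = []
        for line in content.splitlines():
            if cls.MARKER in line:
                if verbose:
                    log.write("%s: omitting a line\n" % cls.__name__)
                continue
            kept.append(line)
        return "\n".join(kept)
```

The sketch reports only that something was omitted, not the omitted data itself, which sidesteps the caveat about sensitive information ending up in stderr logs.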
Bryan Bishop
caed396870
SPIE watermark removal
...
This is slightly broken because the SPIE plugin removes more than just
watermarks. For some reason it seems to also remove images and large
blocks of text from the paper. However, the object that is being removed
is tiny. In the unit testing sample, the removed object is pdf stream
55.
For now, SPIE is partially disabled until this is fixed. The problem
does not originate from the other plugins.
fixes #20
2013-02-11 23:52:59 -06:00
Bryan Bishop
9d7fd1dbb6
README: add command-line usage
2013-02-10 01:29:58 -06:00
Bryan Bishop
775b927b42
pdfparanoia command-line interface
2013-02-09 09:44:48 -06:00