fixing README
parent
702f2e2895
commit
11b59bd544
|
@ -1,15 +1,17 @@
|
||||||
`comparediffs` provides a tools for downloading a PDF from two different sources, running pdfparanoia on the files, comparing the outputs byte-for-byte,
|
`comparediffs` provides a tool which
|
||||||
and reporting the results.
|
* downloads a PDF from two different sources,
|
||||||
|
* runs `pdfparanoia` on both files, and
|
||||||
|
* compares the outputs byte-for-byte.
|
||||||
|
|
||||||
Typical usage is to first establish two `ssh` tunnels to hosts with access to the literature, e.g. via
|
Typical usage is to first establish two `ssh` tunnels to hosts with access to the literature, e.g. via
|
||||||
`ssh -D 1080 host1` and `ssh -D 1081 host2`. You can then invoke `comparediffs` via
|
`ssh -D 1080 host1` and `ssh -D 1081 host2`. You can then invoke `comparediffs` via
|
||||||
|
|
||||||
./comparediffs localhost:1080 localhost:1081 < urls
|
./comparediffs localhost:1080 localhost:1081 < urls
|
||||||
|
|
||||||
where urls is a file containing one URL per line (e.g. the example file in this directory).
|
where `urls` is a file containing one URL per line (e.g. the example file in this directory).
|
||||||
|
|
||||||
`comparediffs` creates a subdirectory `pdf/`, in which is stores PDFs. It won't try to download the same PDF twice, so if you fix pdfparanoia you'll
|
`comparediffs` creates a subdirectory `pdf/`, in which is stores PDFs. It won't try to download the same PDF twice, so if you make changes to `pdfparanoia` you'll
|
||||||
need to clean out some or all of this subdirectory.
|
want to clean out some or all of this subdirectory.
|
||||||
|
|
||||||
It's easy to see which PDFs pdfparanoia failed on, as it leaves copies of the scrubbed files with suffixes `.1.cleaned.pdf` and `.2.cleaned.pdf`.
|
It's easy to see which PDFs `pdfparanoia` failed on afterwards, as it leaves copies of the scrubbed files with suffixes `.1.cleaned.pdf` and `.2.cleaned.pdf`.
|
||||||
When pdfparanoia succeeds (or isn't even needed, because the downloaded files were identical), the scrubbed files are removed.
|
When `pdfparanoia` succeeds (or isn't even needed, because the downloaded files were identical), the scrubbed files are removed.
|
||||||
|
|
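The compare-and-clean-up behaviour described in the README can be sketched roughly as follows. This is a minimal illustration, not the actual `comparediffs` implementation: the function name `compare_and_clean` and the `stem` argument are hypothetical, and only the `.1.cleaned.pdf` / `.2.cleaned.pdf` suffixes come from the README text.

```python
import filecmp
import os


def compare_and_clean(stem):
    """Compare the two scrubbed copies of one PDF byte-for-byte.

    `stem` is a hypothetical path prefix such as "pdf/paper"; the suffixes
    follow the README. Returns True if the two copies are identical.
    """
    first = stem + ".1.cleaned.pdf"
    second = stem + ".2.cleaned.pdf"
    # shallow=False forces a comparison of file contents, not just os.stat() data.
    identical = filecmp.cmp(first, second, shallow=False)
    if identical:
        # Success: remove the scrubbed copies, as the README describes.
        os.remove(first)
        os.remove(second)
    # On a mismatch the scrubbed copies stay behind for inspection.
    return identical
```

Under these assumptions, any `*.cleaned.pdf` files left in `pdf/` after a run point at the PDFs whose two scrubbed copies still differed.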