```
 _____ _____ _____ ___
|     |  _  |_   _|_  |  Keep your data,
| | | |     | | | |  _|     trash your meta!
|_|_|_|__|__| |_| |___|
```

This software is currently in **beta**; please don't use it for anything
critical.

# Metadata and privacy

Metadata consist of information that characterizes data.
Metadata are used to provide documentation for data products.
In essence, metadata answer who, what, when, where, why, and how about
every facet of the data that are being documented.

Metadata within a file can tell a lot about you.
Cameras record data about when a picture was taken and what
camera was used. Office software automatically adds author and company
information to documents, spreadsheets, and PDF files.
Maybe you don't want to disclose this information on the web.

This is precisely the job of MAT2: getting rid, as much as possible, of
metadata.

# Requirements

- `python3-mutagen` for audio support
- `python3-gi-cairo` and `gir1.2-poppler-0.18` for PDF support
- `gir1.2-gdkpixbuf-2.0` for image support
- `libimage-exiftool-perl` for everything else

Please note that MAT2 requires at least Python 3.5, meaning that it
doesn't run on [Debian Jessie](https://packages.debian.org/jessie/python3).
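
The requirements above can be checked before running MAT2 with a short script. This is only a minimal sketch, not part of MAT2: the function name `check_requirements` is made up for illustration, and it assumes that the Debian packages expose the Python modules `mutagen` and `gi` and the `exiftool` executable. It does not verify the GObject introspection typelibs (Poppler, GdkPixbuf).

```python
import importlib.util
import shutil
import sys

# Minimum Python version stated in the README.
MIN_PYTHON = (3, 5)


def check_requirements():
    """Return a mapping of requirement name -> whether it looks available.

    Assumption: python3-mutagen provides the `mutagen` module, the
    python3-gi* packages provide `gi`, and libimage-exiftool-perl
    provides an `exiftool` executable on the PATH.
    """
    return {
        "python>=3.5": sys.version_info >= MIN_PYTHON,
        "mutagen": importlib.util.find_spec("mutagen") is not None,
        "gi": importlib.util.find_spec("gi") is not None,
        "exiftool": shutil.which("exiftool") is not None,
    }


if __name__ == "__main__":
    for name, available in check_requirements().items():
        print("{:<12} {}".format(name, "ok" if available else "MISSING"))
```

A missing entry only means that the corresponding file formats are likely to be unsupported on that machine.
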
# Running the test suite

```bash
$ python3 -m unittest discover -v
```
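
The same discovery can be driven from Python, for instance from a helper script. The sketch below uses only the standard `unittest` module; the function name `run_test_suite` is hypothetical and not something MAT2 ships.

```python
import unittest


def run_test_suite(start_dir="."):
    """Discover and run tests below start_dir, like `unittest discover -v`."""
    # The default loader looks for files matching "test*.py".
    suite = unittest.defaultTestLoader.discover(start_dir)
    # verbosity=2 corresponds to the -v flag of the command line.
    runner = unittest.TextTestRunner(verbosity=2)
    return runner.run(suite)
```

Calling `run_test_suite()` from the root of the repository returns a `unittest.TestResult`, whose `wasSuccessful()` method tells whether every test passed.
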
# How to use MAT2

```bash
usage: mat2 [-h] [-v] [-l] [-s | -L] [files [files ...]]

Metadata anonymisation toolkit 2

positional arguments:
  files

optional arguments:
  -h, --help         show this help message and exit
  -v, --version      show program's version number and exit
  -l, --list         list all supported fileformats
  -s, --show         list all the harmful metadata of a file without removing
                     them
  -L, --lightweight  remove SOME metadata
```
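
When MAT2 is called from a script, the usage line above translates into an argument list. The helper below is a hedged sketch: `build_mat2_command` is a hypothetical name, it only assembles the flags documented in the help text, and it assumes a `mat2` executable is on the PATH when the command is eventually run.

```python
def build_mat2_command(files, show=False, lightweight=False):
    """Build an argument list for the mat2 command line.

    `show` maps to --show and `lightweight` to --lightweight. The usage
    text marks the two as mutually exclusive ([-s | -L]), so asking for
    both is rejected here before mat2 is ever invoked.
    """
    if show and lightweight:
        raise ValueError("--show and --lightweight are mutually exclusive")
    command = ["mat2"]
    if show:
        command.append("--show")
    elif lightweight:
        command.append("--lightweight")
    command.extend(files)
    return command


if __name__ == "__main__":
    print(build_mat2_command(["picture.jpg"], show=True))
    # prints ['mat2', '--show', 'picture.jpg']
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine where MAT2 is installed.
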
# Notes about detecting metadata

MAT2 does its very best to display metadata when the `--show` flag is
passed, but a file is not necessarily free of metadata just because MAT2 doesn't
show any. There is no reliable way to detect every possible piece of metadata in
complex file formats.

This is why you shouldn't rely on the presence of displayed metadata to decide
whether a file must be cleaned.

# Related software

- The first iteration of [MAT](https://mat.boum.org)
- [Exiftool](https://sno.phy.queensu.ca/~phil/exiftool/mat)
- [pdf-redact-tools](https://github.com/firstlookmedia/pdf-redact-tools), which
	tries to deal with *printer dots* too.
- [pdfparanoia](https://github.com/kanzure/pdfparanoia), which removes
	watermarks from PDF files.

# Contact

If possible, use the [issues system](https://0xacab.org/jvoisin/mat2/issues).
If you think that a more private contact is needed (e.g. for reporting security issues),
you can email Julien (jvoisin) Voisin at `julien.voisin+mat@dustri.org`,
using the GPG key `9FCDEE9E1A381F311EA62A7404D041E8171901CC`.

# License

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.

Copyright 2018 Julien (jvoisin) Voisin <julien.voisin+mat2@dustri.org>

# Thanks

MAT2 wouldn't exist without:

- the [Google Summer of Code](https://summerofcode.withgoogle.com/);
- the fine people from [Tails](https://tails.boum.org);
- friends.

Many thanks to them!