1
0
Fork 0
mirror of synced 2025-07-03 20:07:28 +02:00

This is mat2, not MAT2

Closes #131
This commit is contained in:
georg 2019-11-28 02:15:20 +00:00 committed by jvoisin
parent 6e52661cfb
commit 697cb36b81
13 changed files with 58 additions and 58 deletions

View file

@ -4,7 +4,7 @@ Implementation notes
Lightweight cleaning mode
-------------------------
Due to *popular* request, MAT2 is providing a *lightweight* cleaning mode,
Due to *popular* request, mat2 is providing a *lightweight* cleaning mode,
that only cleans the superficial metadata of your file, but not
the ones that might be in **embedded** resources. Like for example,
images in a PDF or an office document.
@ -19,7 +19,7 @@ are entirely removed.
deleted. For example journalists that are editing a document to erase
mentions sources mentions.
- Or they are aware of it, and will likely not expect MAT2 to be able to keep
- Or they are aware of it, and will likely not expect mat2 to be able to keep
the revisions, that are basically traces about how, when and who edited the
document.
@ -27,15 +27,15 @@ are entirely removed.
Race conditions
---------------
MAT2 does its very best to avoid crashing at runtime. This is why it's checking
if the file is valid __at parser creation__. MAT2 doesn't take any measure to
mat2 does its very best to avoid crashing at runtime. This is why it's checking
if the file is valid __at parser creation__. mat2 doesn't take any measure to
ensure that the file is not changed between the time the parser is
instantiated, and the call to clean or show the metadata.
Symlink attacks
---------------
MAT2 output predictable filenames (like yourfile.jpg.cleaned).
mat2 output predictable filenames (like yourfile.jpg.cleaned).
This may lead to symlink attack. Please check if you OS prevent
against them
@ -65,10 +65,10 @@ didn't remove any *deep metadata*, like the ones in embedded pictures. This was
on of the reason MAT was abandoned: the absence of satisfying solution to
handle PDF. But apparently, people are ok with [pdf redact
tools](https://github.com/firstlookmedia/pdf-redact-tools), that simply
transform the PDF into images. So this is what's MAT2 is doing too.
transform the PDF into images. So this is what's mat2 is doing too.
Of course, it would be possible to detect images in PDf file, and process them
with MAT2, but since a PDF can contain a lot of things, like images, videos,
with mat2, but since a PDF can contain a lot of things, like images, videos,
javascript, pdf, blobs, … this is the easiest and safest way to clean them.
Images handling
@ -81,7 +81,7 @@ XML attacks
-----------
Since our threat model conveniently excludes files crafted to specifically
bypass MAT2, fileformats containing harmful XML are out of our scope.
But since MAT2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
bypass mat2, fileformats containing harmful XML are out of our scope.
But since mat2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
to process XML, it's "only" vulnerable to DoS, and not memory corruption:
odds are that the user will notice that the cleaning didn't succeed.

View file

@ -1,4 +1,4 @@
.TH MAT2 "1" "May 2019" "MAT2 0.9.0" "User Commands"
.TH mat2 "1" "May 2019" "mat2 0.9.0" "User Commands"
.SH NAME
mat2 \- the metadata anonymisation toolkit 2
@ -32,7 +32,7 @@ show program's version number and exit
list all supported fileformats
.TP
\fB\-\-check\-dependencies\fR
check if MAT2 has all the dependencies it needs
check if mat2 has all the dependencies it needs
.TP
\fB\-V\fR, \fB\-\-verbose\fR
show more verbose status information
@ -41,7 +41,7 @@ show more verbose status information
how to handle unknown members of archive-style files (policy should be one of: abort, omit, keep)
.TP
\fB\-s\fR, \fB\-\-show\fR
list harmful metadata detectable by MAT2 without
list harmful metadata detectable by mat2 without
removing them
.TP
\fB\-L\fR, \fB\-\-lightweight\fR

View file

@ -3,7 +3,7 @@ Threat Model
The Metadata Anonymisation Toolkit 2 adversary has a number
of goals, capabilities, and counter-attack types that can be
used to guide us towards a set of requirements for the MAT2.
used to guide us towards a set of requirements for the mat2.
This is an overhaul of MAT's (the first iteration of the software) one.
@ -53,7 +53,7 @@ Adversary
user. This is the strongest position for the adversary to
have. In this case, the adversary is capable of inserting
arbitrary, custom watermarks specifically for tracking
the user. In general, MAT2 cannot defend against this
the user. In general, mat2 cannot defend against this
adversary, but we list it for completeness' sake.
- The adversary created the document for a group of users.
@ -65,7 +65,7 @@ Adversary
- The adversary did not create the document, the weakest
position for the adversary to have. The file format is
(most of the time) standard, nothing custom is added:
MAT2 must be able to remove all metadata from the file.
mat2 must be able to remove all metadata from the file.
Requirements
@ -73,28 +73,28 @@ Requirements
* Processing
- MAT2 *should* avoid interactions with information.
- mat2 *should* avoid interactions with information.
Its goal is to remove metadata, and the user is solely
responsible for the information of the file.
- MAT2 *must* warn when encountering an unknown
format. For example, in a zipfile, if MAT2 encounters an
- mat2 *must* warn when encountering an unknown
format. For example, in a zipfile, if mat2 encounters an
unknown format, it should warn the user, and ask if the
file should be added to the anonymised archive that is
produced.
- MAT2 *must* not add metadata, since its purpose is to
- mat2 *must* not add metadata, since its purpose is to
anonymise files: every added items of metadata decreases
anonymity.
- MAT2 *should* handle unknown/hidden metadata fields,
- mat2 *should* handle unknown/hidden metadata fields,
like proprietary extensions of open formats.
- MAT2 *must not* fail silently. Upon failure,
MAT2 *must not* modify the file in any way.
- mat2 *must not* fail silently. Upon failure,
mat2 *must not* modify the file in any way.
- MAT2 *might* leak the fact that MAT2 was used on the file,
- mat2 *might* leak the fact that mat2 was used on the file,
since it might be uncommon for some file formats to come
without any kind of metadata, an adversary might suspect that
the user used MAT2 on certain files.
the user used mat2 on certain files.