parent 6e52661cfb
commit 697cb36b81
13 changed files with 58 additions and 58 deletions
@@ -4,7 +4,7 @@ Implementation notes
 Lightweight cleaning mode
 -------------------------
 
-Due to *popular* request, MAT2 is providing a *lightweight* cleaning mode,
+Due to *popular* request, mat2 is providing a *lightweight* cleaning mode,
 that only cleans the superficial metadata of your file, but not
 the ones that might be in **embedded** resources. Like for example,
 images in a PDF or an office document.
@@ -19,7 +19,7 @@ are entirely removed.
 deleted. For example journalists that are editing a document to erase
 mentions sources mentions.
 
-- Or they are aware of it, and will likely not expect MAT2 to be able to keep
+- Or they are aware of it, and will likely not expect mat2 to be able to keep
 the revisions, that are basically traces about how, when and who edited the
 document.
 
@@ -27,15 +27,15 @@ are entirely removed.
 Race conditions
 ---------------
 
-MAT2 does its very best to avoid crashing at runtime. This is why it's checking
-if the file is valid __at parser creation__. MAT2 doesn't take any measure to
+mat2 does its very best to avoid crashing at runtime. This is why it's checking
+if the file is valid __at parser creation__. mat2 doesn't take any measure to
 ensure that the file is not changed between the time the parser is
 instantiated, and the call to clean or show the metadata.
 
 Symlink attacks
 ---------------
 
-MAT2 output predictable filenames (like yourfile.jpg.cleaned).
+mat2 output predictable filenames (like yourfile.jpg.cleaned).
 This may lead to symlink attack. Please check if you OS prevent
 against them
 
@@ -65,10 +65,10 @@ didn't remove any *deep metadata*, like the ones in embedded pictures. This was
 on of the reason MAT was abandoned: the absence of satisfying solution to
 handle PDF. But apparently, people are ok with [pdf redact
 tools](https://github.com/firstlookmedia/pdf-redact-tools), that simply
-transform the PDF into images. So this is what's MAT2 is doing too.
+transform the PDF into images. So this is what's mat2 is doing too.
 
 Of course, it would be possible to detect images in PDf file, and process them
-with MAT2, but since a PDF can contain a lot of things, like images, videos,
+with mat2, but since a PDF can contain a lot of things, like images, videos,
 javascript, pdf, blobs, … this is the easiest and safest way to clean them.
 
 Images handling
@@ -81,7 +81,7 @@ XML attacks
 -----------
 
 Since our threat model conveniently excludes files crafted to specifically
-bypass MAT2, fileformats containing harmful XML are out of our scope.
-But since MAT2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
+bypass mat2, fileformats containing harmful XML are out of our scope.
+But since mat2 is using [etree](https://docs.python.org/3/library/xml.html#xml-vulnerabilities)
 to process XML, it's "only" vulnerable to DoS, and not memory corruption:
 odds are that the user will notice that the cleaning didn't succeed.
@@ -1,4 +1,4 @@
-.TH MAT2 "1" "May 2019" "MAT2 0.9.0" "User Commands"
+.TH mat2 "1" "May 2019" "mat2 0.9.0" "User Commands"
 
 .SH NAME
 mat2 \- the metadata anonymisation toolkit 2
@@ -32,7 +32,7 @@ show program's version number and exit
 list all supported fileformats
 .TP
 \fB\-\-check\-dependencies\fR
-check if MAT2 has all the dependencies it needs
+check if mat2 has all the dependencies it needs
 .TP
 \fB\-V\fR, \fB\-\-verbose\fR
 show more verbose status information
@@ -41,7 +41,7 @@ show more verbose status information
 how to handle unknown members of archive-style files (policy should be one of: abort, omit, keep)
 .TP
 \fB\-s\fR, \fB\-\-show\fR
-list harmful metadata detectable by MAT2 without
+list harmful metadata detectable by mat2 without
 removing them
 .TP
 \fB\-L\fR, \fB\-\-lightweight\fR
@@ -3,7 +3,7 @@ Threat Model
 
 The Metadata Anonymisation Toolkit 2 adversary has a number
 of goals, capabilities, and counter-attack types that can be
-used to guide us towards a set of requirements for the MAT2.
+used to guide us towards a set of requirements for the mat2.
 
 This is an overhaul of MAT's (the first iteration of the software) one.
 
@@ -53,7 +53,7 @@ Adversary
 user. This is the strongest position for the adversary to
 have. In this case, the adversary is capable of inserting
 arbitrary, custom watermarks specifically for tracking
-the user. In general, MAT2 cannot defend against this
+the user. In general, mat2 cannot defend against this
 adversary, but we list it for completeness' sake.
 
 - The adversary created the document for a group of users.
@@ -65,7 +65,7 @@ Adversary
 - The adversary did not create the document, the weakest
 position for the adversary to have. The file format is
 (most of the time) standard, nothing custom is added:
-MAT2 must be able to remove all metadata from the file.
+mat2 must be able to remove all metadata from the file.
 
 
 Requirements
@@ -73,28 +73,28 @@ Requirements
 
 * Processing
 
-- MAT2 *should* avoid interactions with information.
+- mat2 *should* avoid interactions with information.
 Its goal is to remove metadata, and the user is solely
 responsible for the information of the file.
 
-- MAT2 *must* warn when encountering an unknown
-format. For example, in a zipfile, if MAT2 encounters an
+- mat2 *must* warn when encountering an unknown
+format. For example, in a zipfile, if mat2 encounters an
 unknown format, it should warn the user, and ask if the
 file should be added to the anonymised archive that is
 produced.
 
-- MAT2 *must* not add metadata, since its purpose is to
+- mat2 *must* not add metadata, since its purpose is to
 anonymise files: every added items of metadata decreases
 anonymity.
 
-- MAT2 *should* handle unknown/hidden metadata fields,
+- mat2 *should* handle unknown/hidden metadata fields,
 like proprietary extensions of open formats.
 
-- MAT2 *must not* fail silently. Upon failure,
-MAT2 *must not* modify the file in any way.
+- mat2 *must not* fail silently. Upon failure,
+mat2 *must not* modify the file in any way.
 
-- MAT2 *might* leak the fact that MAT2 was used on the file,
+- mat2 *might* leak the fact that mat2 was used on the file,
 since it might be uncommon for some file formats to come
 without any kind of metadata, an adversary might suspect that
-the user used MAT2 on certain files.
+the user used mat2 on certain files.
 