Threat Model
============
The Metadata Anonymisation Toolkit 2 (mat2) adversary has a number
of goals, capabilities, and counter-attack types that can be
used to guide us towards a set of requirements for mat2.

This is an overhaul of the threat model of MAT, the first iteration
of the software.

Warnings
--------

mat2 only removes standard metadata from your files; it does _not_:
- anonymise their content (the substance and the form)
- handle watermarking
- handle steganography or homoglyphs
- handle stylometry
- handle any non-standard metadata field/system
- handle file-system related metadata

If you really want to be anonymous, use a format that does not contain any
metadata, or better: use plain-text ASCII without trailing spaces.

And as usual, think twice before clicking.

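
The "plain-text ASCII without trailing spaces" advice can be checked
mechanically. The sketch below is only an illustration and not part of mat2;
the helper name `find_plaintext_issues` is hypothetical. It reports lines that
contain non-ASCII characters (where homoglyphs could hide) or trailing
whitespace (which could presumably carry a hidden mark).

```python
def find_plaintext_issues(text: str) -> list:
    """Return a list of problems that keep `text` from being plain ASCII
    without trailing whitespace. Hypothetical helper, not a mat2 API."""
    issues = []
    # Split on "\n" only, so that exotic Unicode line separators stay inside
    # a line and are reported as non-ASCII instead of silently vanishing.
    for number, line in enumerate(text.split("\n"), start=1):
        body = line.rstrip("\r")  # tolerate Windows line endings
        if not body.isascii():
            issues.append(f"line {number}: non-ASCII character")
        if body.endswith((" ", "\t")):
            issues.append(f"line {number}: trailing whitespace")
    return issues
```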
Adversary
---------
* Goals:

    - Identify the source of the document, since a document
      always has one: who/where/when/how a picture was
      taken, where the document was leaked from and by
      whom, ...
    - Identify the author. In some cases documents may be
      anonymously authored or created; in these cases,
      identifying the author is the goal.
    - Identify the equipment/software used. If the attacker fails
      to directly identify the author and/or source, their next
      goal is to determine the source of the equipment used
      to produce, copy, and transmit the document. This can
      include the model of camera used to take a photo or a film,
      which software was used to produce an office document, …
* Adversary Capabilities - Positioning

    - The adversary created the document specifically for this
      user. This is the strongest position for the adversary to
      have. In this case, the adversary is capable of inserting
      arbitrary, custom watermarks specifically for tracking
      the user. In general, mat2 cannot defend against this
      adversary, but we list it for completeness' sake.
    - The adversary created the document for a group of users.
      In this case, the adversary knows that they attempted to
      limit distribution to a specific group of users. They may
      or may not have watermarked the document for these
      users, but they certainly know the format used.
    - The adversary did not create the document. This is the weakest
      position for the adversary to have. The file format is
      (most of the time) standard and nothing custom is added:
      mat2 must be able to remove all metadata from the file.

Requirements
------------
* Processing

    - mat2 *should* avoid interacting with the information a file
      carries. Its goal is to remove metadata, and the user is solely
      responsible for the information of the file.
    - mat2 *must* warn when encountering an unknown
      format. For example, in a zipfile, if mat2 encounters an
      unknown format, it should warn the user, and ask whether the
      file should be added to the anonymised archive that is
      produced.
    - mat2 *must not* add metadata, since its purpose is to
      anonymise files: every added item of metadata decreases
      anonymity.
    - mat2 *should* handle unknown/hidden metadata fields,
      like proprietary extensions of open formats.
    - mat2 *must not* fail silently. Upon failure,
      mat2 *must not* modify the file in any way.
    - mat2 *might* leak the fact that mat2 was used on the file:
      since it might be uncommon for some file formats to come
      without any kind of metadata, an adversary might suspect that
      the user used mat2 on certain files.

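
Several of the processing requirements above can be illustrated together with
a small sketch. This is not mat2's actual implementation: the function
`rebuild_archive`, the `keep_unknown` callback, and the `KNOWN_EXTENSIONS`
allow-list are all hypothetical, and recognising formats by file extension is
a simplifying assumption. The sketch asks about every unknown member of a zip
file, writes members with a fixed timestamp so that no new metadata is added,
and builds the result in a temporary file so that a failure raises an
exception and leaves both the input and the destination untouched.

```python
import os
import tempfile
import zipfile

# Hypothetical allow-list; the real set of supported formats is not given here.
KNOWN_EXTENSIONS = {".txt", ".png", ".jpg"}


def rebuild_archive(src, dst, keep_unknown):
    """Copy the members of the zip file `src` into a new zip file `dst`.

    `keep_unknown(name)` is called for each member whose extension is not
    recognised and must return True to keep that member. Returns the names
    that were asked about. `src` is only ever read, and `dst` is created
    only if the whole run succeeds.
    """
    asked = []
    out_dir = os.path.dirname(os.path.abspath(dst))
    fd, tmp_path = tempfile.mkstemp(suffix=".zip", dir=out_dir)
    os.close(fd)
    try:
        with zipfile.ZipFile(src) as zin, zipfile.ZipFile(tmp_path, "w") as zout:
            for info in zin.infolist():
                if info.is_dir():
                    continue
                ext = os.path.splitext(info.filename)[1].lower()
                if ext not in KNOWN_EXTENSIONS:
                    asked.append(info.filename)  # warn and ask, never guess
                    if not keep_unknown(info.filename):
                        continue
                # A fresh ZipInfo carries the default 1980-01-01 timestamp,
                # so the current time is not leaked into the output.
                clean_info = zipfile.ZipInfo(info.filename)
                # A real cleaner would also strip the member's own metadata.
                zout.writestr(clean_info, zin.read(info))
        os.replace(tmp_path, dst)  # publish the result only on success
    except Exception:
        os.unlink(tmp_path)  # drop the partial output
        raise  # fail loudly, never silently
    return asked
```

A caller could pass `keep_unknown=lambda name: False` to drop every unknown
member, or a function that prompts the user interactively.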