jvoisin
565cb66d14
Minor simplification in how we're handling xml for office files
2018-07-19 22:55:08 +02:00
jvoisin
5a7c7f35f7
Remove print
from libmat, and use the logging
module instead
...
This should close #28
2018-07-10 21:30:38 +02:00
jvoisin
080d6769ca
Make pylint even happier
2018-07-09 01:11:44 +02:00
jvoisin
8c21006e6c
Fix some pep8 issues spotted by pyflakes
2018-07-08 22:40:36 +02:00
jvoisin
f49aa5cab7
Achieve 100% coverage!
2018-07-08 22:27:37 +02:00
jvoisin
ad3e7ccee8
Bump coverage for office files and fix some related crashes
2018-07-08 21:35:45 +02:00
jvoisin
ca01484126
Silence a mypy's stupid warning
2018-07-08 17:12:17 +02:00
jvoisin
f9bc022c96
Add defusedxml as an (optional) way to prevent XML-based attacks
...
Those attacks are DoS-only.
2018-07-08 17:07:26 +02:00
jvoisin
85455a4419
Fix a mistake in office file revisions handling
2018-07-07 18:05:54 +02:00
jvoisin
893f58554a
Improve a bit the formatting of the code thanks to pyflakes3
2018-07-02 00:22:05 +02:00
jvoisin
bee56a57ce
Remove docx revisions
2018-07-01 23:16:14 +02:00
jvoisin
02f7605ac1
MAT2 is now cleaning revisions from odt files!
2018-07-01 21:09:20 +02:00
jvoisin
80fc4ffb40
Remove the thumbnails from libreoffice files
2018-07-01 17:29:05 +02:00
jvoisin
177184ac67
Massively simplify how we're cleaning office files
2018-06-27 21:48:46 +02:00
jvoisin
5b38bd7ccd
Improve the reliability of the office parser
2018-06-21 23:18:59 +02:00
jvoisin
846a261465
Fix some linter warnings
2018-06-21 23:07:21 +02:00
jvoisin
09e748fa4c
Refactor how offices files are handled
...
- xml files are no longer considered harmless
- Factorization of the `remove_all` method for office files
- Explicit whitelist are used
- Blacklist are used to skip files completely
- Non-blacklisted files are _still cleaned_
- Unsupported files are still triggering an error
2018-06-21 23:02:41 +02:00
jvoisin
a89dae054a
Minor simplification of the office-related code
2018-06-21 21:24:53 +02:00
jvoisin
545887af98
Minor code simplification
2018-06-10 20:20:32 +02:00
jvoisin
7dad77a785
Make the parsing of office format's metadata more robust
2018-06-10 20:20:00 +02:00
jvoisin
8c7979aae3
Add some tests for non-supported embedded fileformats
2018-06-10 20:19:35 +02:00
jvoisin
6a1b0b31f0
Add more typing and use mypy in the CI
2018-06-04 23:20:30 +02:00
jvoisin
38fae60b8b
Rename some files to simplify packaging
...
- the `src` folder is now `libmat2`
- the `main.py` script is now `mat2.py`
2018-05-18 23:52:40 +02:00