Antoine Tenart
0e3c2c9b1b
libmat2: audio: not all id3 types have a text attribute
...
Not all id3 types have a text attribute (such as mutagen.id3.APIC or
mutagen.id3.UFID). This leads to the get_meta helper to crash when
trying to access the text attribute of an object which does not have it.
Fixes it by checking the text attribute is available before accessing
it.
Signed-off-by: Antoine Tenart <antoine.tenart@ack.tf>
2019-03-23 00:32:44 +01:00
Brolf
5ac91cd4f9
Refactor {black,white}list into {block,allow}list
...
Closes #96
2019-03-05 23:13:42 +00:00
georg
c3f097a82b
fix typo
2019-03-01 22:00:23 +00:00
jvoisin
55214206b5
Improve the previous commit
...
- More tests
- More documentation
- Minor code cleanup
2019-02-27 23:53:07 +01:00
jvoisin
73d2966e8c
Improve epub support
2019-02-27 23:04:38 +01:00
jvoisin
eb2e702f37
Document the previous commit
2019-02-25 15:37:44 +01:00
jvoisin
545dccc352
In archive-based formats, the mimetype
file comes first
...
This should improve epub compatibility,
along with other formats as a side-effect
2019-02-24 23:32:32 +01:00
jvoisin
524bae5972
<title> is also an html metadata
2019-02-23 20:47:26 +01:00
jvoisin
c757a9b7ef
Fix a bug in css cleaning
...
It's not mandatory to actually have a comment inside
comment delimiter, like `/**/`.
2019-02-23 20:21:11 +01:00
jvoisin
02ff21b158
Implement epub support
2019-02-20 16:28:11 -08:00
jvoisin
a81b7658a8
Make the mandatory metadata warning generic
...
This should close #95 .
2019-02-10 21:46:13 +01:00
jvoisin
6e63e03b86
Streamline a bit the previous commit
2019-02-09 15:23:16 +01:00
Poncho
a71488d459
bind mount /etc/ld.so.cache to the sandbox
...
without /etc/ld.so.cache available in the sandbox, tests fail on gentoo with:
/usr/bin/ffmpeg: error while loading shared libraries: libstdc++.so.6:
cannot open shared object file: No such file or directory
2019-02-09 09:49:51 +01:00
jvoisin
6ef6aaa222
Improve a bit get_meta for libreoffice files
2019-02-08 23:23:56 +01:00
jvoisin
6cc034e81b
Add support for html files
2019-02-08 23:05:18 +01:00
jvoisin
e1dd439fc8
Use of the archive refactoring for the office documents too
2019-02-07 22:19:37 +01:00
jvoisin
b9a62d798a
Refactor a bit office get_meta handling
...
This should make easier to get more metadata from
archive-based file formats.
2019-02-04 00:31:26 +01:00
jvoisin
433609f8ea
Implement .gif support
2019-02-03 21:01:58 +01:00
intrigeri
e8c1bb0e3c
Whenever possible, use bwrap for subprocesses
...
This should closes #90
2019-02-03 19:18:41 +01:00
jvoisin
8e84ba547a
Add support for wmv
2019-02-02 19:19:36 +01:00
jvoisin
04bb8c8ccf
Add mp4 support
2018-10-28 07:41:04 -07:00
jvoisin
3a070b0ab7
Add support for zip files
2018-10-25 11:56:46 +02:00
jvoisin
283e5e5787
Improve archive-based parser's robustness against corrupted embedded files
2018-10-25 11:56:12 +02:00
jvoisin
513d897ea0
Implement get_meta() for archives
2018-10-25 11:29:50 +02:00
jvoisin
5a9dc388ad
Minor refactorisation of how we're checking for exiftool's presence
2018-10-25 11:05:06 +02:00
jvoisin
fe885babee
Implement lightweight cleaning for jpg
2018-10-24 19:35:07 +02:00
jvoisin
9a81b3adfd
Improve type annotation coverage
2018-10-23 16:32:28 +02:00
jvoisin
f1a071d460
Implement lightweight cleaning for png and tiff
2018-10-23 16:22:11 +02:00
jvoisin
38df679a88
Optimize the handling of problematic files
2018-10-23 13:49:58 +02:00
jvoisin
44f267a596
Improve problematic filenames support
2018-10-22 16:56:05 +02:00
jvoisin
83389a63e9
Test mat2's reliability wrt. corrupted video files
2018-10-22 13:42:04 +02:00
jvoisin
e70ea811c9
Implement support for .avi files, via ffmpeg
...
- This commit introduces optional dependencies (namely ffmpeg):
mat2 will spit a warning when trying to process an .avi file
if ffmpeg isn't installed.
- Since metadata are obtained via exiftool, this commit
also refactors a bit our exfitool wrapper.
2018-10-22 12:58:01 +02:00
jvoisin
2ba38dd2a1
Bump mypy typing coverage
2018-10-12 14:32:09 +02:00
jvoisin
b832a59414
Refactor lightweight mode implementation
2018-10-12 11:49:24 +02:00
jvoisin
b9dbd12ef9
Implement recursive metadata for FLAC files
...
Since FLAC files can contain covers, it makes sense
to parse their metadata
2018-10-11 19:52:47 +02:00
jvoisin
b2e153b69c
Delete pictures of FLAC files
2018-10-11 18:15:11 +02:00
jvoisin
0d25b18d26
Improve both the typing and the comments
2018-10-05 17:07:58 +02:00
jvoisin
d0f3534eff
Hide unsupported extensions in mat2 -l
2018-10-05 12:43:21 +02:00
jvoisin
8e98593b02
Trash word/people.xml in office files
2018-10-04 16:28:20 +02:00
georg
34fbd633fd
libmat2: fix shebang
...
Relates 0a2a398c9c
2018-10-03 18:38:28 +00:00
jvoisin
5a5c642a46
Don't break office files for MS Office
...
We didn't take the whitelist into account while
removing dangling files from [Content_types].xml
2018-10-03 16:38:05 +02:00
jvoisin
1b356b8c6f
Improve mat2's cli reliability
...
- Replace some class members by instance members
- Don't thread the cleaning process anymore for now
2018-10-03 15:22:36 +02:00
jvoisin
c67bbafb2c
Use [Content_Types].xml to improve MS Office coverage
2018-10-02 11:55:42 -07:00
georg
5b606f939d
fix typo
2018-10-02 16:01:24 +00:00
jvoisin
652b8e519f
Files processed via MAT2 are now accepted without warnings by MS Office
2018-10-01 12:25:37 -07:00
jvoisin
81a3881aa4
Please mypy
2018-09-30 19:55:17 +02:00
jvoisin
e342671ead
Remove dangling references in MS Office's [Content_types].xml
2018-09-30 19:53:18 +02:00
jvoisin
719cdf20fa
Second pass of minor formatting
2018-09-24 20:15:07 +02:00
jvoisin
2e243355f5
Fix some minor formatting issues
2018-09-24 19:50:24 +02:00
jvoisin
174d4a0ac0
Implement rsid stripping for office files
...
MS Office XML rsid is a "unique identifier used to track the editing session
when the physical character representing this section mark was last formatted."
See the following links for details:
- https://msdn.microsoft.com/en-us/library/office/documentformat.openxml.wordprocessing.previoussectionproperties.rsidrpr.aspx
- https://blogs.msdn.microsoft.com/brian_jones/2006/12/11/whats-up-with-all-those-rsids/ .
2018-09-24 18:03:59 +02:00