An infinite loop may occur when scanning some malformed ZIP files.
I introduced this issue in 96c00b6d80
with this line:
```c
// decrement coff by 1 to account for the increment at the end of the loop
coff -= 1;
```
The problem is that the function may return 0, which should
indicate that there are no more files. The result was that
`coff` would stay the same and the loop would repeat.
This issue is in 1.5 development and affects the 1.5.0 beta but
does not affect any production versions.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1534
Special thanks to Sophie0x2E for an initial fix, proposed in
https://github.com/Cisco-Talos/clamav/pull/1539
In review, I was uncomfortable with other existing code and
decided to to a more significant overhaul of the error handling
in the ZIP module.
In addition to cleanup, this commit has some functional changes:
- When parsing a central directory file header inside of
`parse_central_directory_file_header()`, it will now fail out if the
"extra length" or "comment length" fields would exceced the length of
the archive. That doesn't mean the associated local file header won't
be parsed later, but it won't use the central directory file header
to find it. Instead, the ZIP module will have to find the local file
header by searching for extra records not listed in the central directory.
This change was mostly to tidy up complex error handling.
- Add two FTM new signatures to identify split ZIP archives.
This signature identifies the first segment (first file) in a split or
spanned ZIP archive. It may also be found on a single-segment "split"
archive, depending on the ZIP archiver.
```
0:0:504b0708504b0304:ZIP (First segment split/spanned):CL_TYPE_ANY:CL_TYPE_ZIP
```
Practically speaking, this new signature makes it so ClamAV identifies
the file as a ZIP right away without having to rely on SFX_ZIP detection.
Extraction is then handled by the ZIP `cli_unzip` function rather than
extracting each with `cli_unzip_single` which handles SFX_ZIP entries.
Note: ClamAV isn't capable of finding additional files on disk to support
handling the additional segments. So it doesn't make any difference with
handling those other files.
This signature is for single-segment split/spanned archives, depending
on the ZIP archiver.
```
0:0:504b0303504b0304:ZIP (Single-segment split/spanned):CL_TYPE_ANY:CL_TYPE_ZIP
```
Like the first one, this also means we won't rely on SFX_ZIP detection
and will treat this files as regular ZIPs.
- Added a test file to verify that ClamAV can extract a single-file
"split" ZIP.
- Added a clamscan test with test files to verify that scanning a split
archive across two segments correctly extracts the properly formed zip
file entries. Sadly, we can't join the segments to extract everything.