clamav/unit_tests/input
Val S. a77a271fb5
Reduce unnecessary scanning of embedded file FPs (#1571)
When embedded file type recognition finds a possible embedded file, it
is being scanned as a new embedded file even if it turns out it was a
false positive and parsing fails. My solution is to pre-parse the file
headers as little as possible to determine if it is valid. If possible,
also determine the file size based on the headers. That will make it so
we don't have to scan additional data when the embedded file is not at
the very end.

This commit adds header checks prior to embedded ZIP, ARJ, and CAB
scanning. For these types I was also able to use the header checks to
determine the object size so as to prevent excessive pattern matching.
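
As an illustration of the idea only, and not the code added by this commit, a
header pre-check for an embedded ZIP entry could look roughly like the sketch
below. The function name `zip_entry_size` is hypothetical. The sketch assumes
the standard 30-byte ZIP local file header layout, and it only sizes a single
entry; a real implementation would also have to walk the remaining entries and
the central directory.

```python
import struct

# ZIP local file header: 4-byte signature followed by fixed little-endian fields.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")  # 30 bytes in total
ZIP_MAGIC = b"PK\x03\x04"


def zip_entry_size(buf, offset):
    """Return the byte length of the ZIP entry at `offset`, or None.

    Hypothetical helper: None means the candidate is treated as a false
    positive (or its size cannot be derived from the header alone).
    """
    if offset < 0 or offset + LOCAL_HEADER.size > len(buf):
        return None  # not enough data left for a complete header
    (magic, _version, flags, _method, _mtime, _mdate, _crc,
     comp_size, _uncomp_size, name_len, extra_len) = LOCAL_HEADER.unpack_from(buf, offset)
    if magic != ZIP_MAGIC:
        return None  # signature mismatch: not a local file header
    if flags & 0x0008:
        return None  # sizes live in a trailing data descriptor, not the header
    total = LOCAL_HEADER.size + name_len + extra_len + comp_size
    if offset + total > len(buf):
        return None  # header claims more data than the buffer holds
    return total
```

With a known size, the scanner can restrict pattern matching to
`buf[offset:offset + size]` instead of everything up to the end of the parent
file.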

TODO: Add the same for RAR, EGG, 7Z, NULSFT, AUTOIT, IShield, and PDF.

This commit also removes duplicate matching for embedded MSEXE.
The embedded MSEXE detection and scanning logic was accidentally
creating an extra duplicate layer in between scanning and detection
because of the logic within the `cli_scanembpe()` function.
That function was effectively doing the header check which this commit
adds for ZIP, ARJ, and CAB but minus the size check.
Note: It is unfortunately not possible to get an accurate size from PE
file headers.
The `cli_scanembpe()` function also used to dump to a temp file for no
reason since FMAPs were extended to support windows into other FMAPs.
So this commit removes the intermediate layer as well as dropping a temp
file for each embedded PE file.
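
For reference, the kind of header check described above for embedded MSEXE can
be sketched as follows. This is not the logic of `cli_scanembpe()`; the
function name `looks_like_pe` is hypothetical, and the sketch relies only on
the documented PE layout: an "MZ" DOS header whose 32-bit field at offset 0x3C
points to a "PE\0\0" signature. It validates the candidate but, as noted
above, does not yield a size.

```python
import struct


def looks_like_pe(buf, offset):
    """Return True if `buf` plausibly holds an MZ+PE image at `offset`.

    Hypothetical helper: checks the DOS magic and the PE signature only.
    """
    if offset < 0 or offset + 0x40 > len(buf):
        return False  # the DOS header is 64 bytes long
    if buf[offset:offset + 2] != b"MZ":
        return False  # missing DOS magic
    # e_lfanew: offset of the PE signature, relative to the start of the image
    (e_lfanew,) = struct.unpack_from("<I", buf, offset + 0x3C)
    pe_offset = offset + e_lfanew
    if pe_offset + 4 > len(buf):
        return False  # signature would lie outside the buffer
    return buf[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```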

Further, this commit adds configuration and DCONF safeguards around all
embedded file type scanning.

Finally, this commit adds a set of tests to validate proper extraction
of embedded ZIP, ARJ, CAB, and MSEXE files.

CLAM-2862

Co-authored-by: TheRaynMan <draynor@sourcefire.com>
2025-09-23 15:57:28 -04:00
bytecode_scanfiles XOR test files; clean up tests directory 2021-07-17 10:39:27 -07:00
bytecode_sigs Tests: add allmatch regression tests 2022-10-19 13:13:57 -07:00
clamav_hdb_scanfiles ZIP: Fix infinite loop + significant code cleanup 2025-08-11 18:14:19 -04:00
embedded_testfiles Reduce unnecessary scanning of embedded file FPs (#1571) 2025-09-23 15:57:28 -04:00
freshclam_testfiles Codesign: fix test files & upgrade clamav-signature-util for related fix 2025-05-05 16:54:07 -04:00
htmlnorm_reffiles XOR test files; clean up tests directory 2021-07-17 10:39:27 -07:00
htmlnorm_scanfiles HTML <style> image extraction improvement 2023-02-07 22:02:02 -06:00
other_scanfiles ZIP: Fix infinite loop + significant code cleanup 2025-08-11 18:14:19 -04:00
other_sigs Add parser for ALZ archives 2024-04-15 10:03:02 -07:00
pe_allmatch Bump copyright dates for 2025 2025-02-14 10:24:30 -05:00
signing Fix several codesign bugs 2025-03-29 20:38:08 -04:00
verify Fix several codesign bugs 2025-03-29 20:38:08 -04:00
clamav.hdb Swap clean cache from MD5 to SHA2-256 2025-08-14 21:23:30 -04:00
CMakeLists.txt ZIP: Fix infinite loop + significant code cleanup 2025-08-11 18:14:19 -04:00
COPYING more tests for regex 2008-07-25 16:03:04 +00:00
README XOR test files; clean up tests directory 2021-07-17 10:39:27 -07:00
virusaction-test.sh XOR test files; clean up tests directory 2021-07-17 10:39:27 -07:00
xor_testfile.py CMake: Fix race condition with parallel builds 2021-09-27 13:03:24 -07:00

clam.exe is an extremely small (544 bytes!) MZ+PE executable that prints
a nice message :-) You can use it to test attachment scanning in your
ClamAV-based mail scanner.

NOTE: Upon request, the test files are no longer shipped.
Instead, they are generated dynamically at make time.