clamav

mirror of https://github.com/Cisco-Talos/clamav.git synced 2025-10-19 10:23:17 +00:00

Author	SHA1	Message	Date
Val S.	d758c00537	Tests: Fix a couple of valgrind complaints (#1554) Fix valgrind issues regarding: - Unclosed log file descriptor in libclamav unit test program. Also need to disable debug logging for `iconv_cache_destroy()` for this or else it will try to use that file descriptor after `main()` exits. - Unclosed socket file descriptor in ClamDScan when doing `ping()` function. CLAM-2872	2025-09-09 12:35:14 -04:00
Valerie Snyder	13c4788f36	FIPS & FIPS-like limits on hash algs for cryptographic uses ClamAV will not function when using a FIPS-enabled OpenSSL 3.x. This is because ClamAV uses MD5 and SHA1 algorithms for a variety of purposes including matching for malware detection, matching to prevent false positives on known-clean files, and for verification of MD5-based RSA digital signatures for determining CVD (signature database archive) authenticity. Interestingly, FIPS had been intentionally bypassed when creating hashes based on whole buffers and whole files (by descriptor or `FILE`-pointer): `78d4a9985a` Note: this bypassed FIPS the 1.x way with: `EVP_MD_CTX_set_flags(ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW);` It was NOT disabled when using `cl_hash_init()` / `cl_update_hash()` / `cl_finish_hash()`. That likely worked by coincidence in that the hash was already calculated most of the time. It certainly would have made use of those functions if the hash had not been calculated prior: `78d4a9985a/libclamav/matcher.c (L743)` Regardless, bypassing FIPS entirely is not the correct solution. The FIPS restrictions against using MD5 and SHA1 are valid, particularly when verifying CVD digital signatures, but also I think when using a hash to determine if the file is known-clean (i.e. the "clean cache" and also MD5-based and SHA1-based FP signatures). This commit extends the work to bypass FIPS using the newer 3.x method: `md = EVP_MD_fetch(NULL, alg, "-fips");` It does this for the legacy `cl_hash*()` functions including `cl_hash_init()` / `cl_update_hash()` / `cl_finish_hash()`. It also introduces extended versions that allow the caller to choose if they want to bypass FIPS: - `cl_hash_data_ex()` - `cl_hash_init_ex()` - `cl_update_hash_ex()` - `cl_finish_hash_ex()` - `cl_hash_destroy_ex()` - `cl_hash_file_fd_ex()` See the `flags` parameter for each. Ironically, this commit does NOT use the new functions at this time. 
The rationale is that ClamAV may need MD5, SHA1, and SHA-256 hashes of the same files both for determining if the file is malware, and for determining if the file is clean. So instead, this commit will do checks when: 1. Creating a new ClamAV scanning engine. If FIPS-mode enabled, it will automatically toggle the "FIPS limits" engine option. When loading signatures, if the engine "FIPS limits" option is enabled, then MD5 and SHA1 FP signatures will be skipped. 2. Before verifying a CVD (e.g. also for loading, unpacking when verification enabled). If "FIPS limits" or FIPS-mode are enabled, then the legacy MD5-based RSA method is disabled. Note: This commit also refactors the interface for `cl_cvdverify_ex()` and `cl_cvdunpack_ex()` so they take a `flags` parameter, rather than a single `bool`. As these functions are new in this version, it does not break the ABI. The cache was already switched to use SHA2-256, so that's not a concern for checking FIPS-mode / FIPS limits options. This adds an option for `freshclam.conf` and `clamd.conf`: FIPSCryptoHashLimits yes And an equivalent command-line option for `clamscan` and `sigtool`: --fips-limits You may programmatically enable FIPS-limits for a ClamAV engine like this: ```C cl_engine_set_num(engine, CL_ENGINE_FIPS_LIMITS, 1); ``` CLAM-2792	2025-08-14 22:39:15 -04:00
Valerie Snyder	f7e60d566f	Record unique object-id for each layer scanned Every time we push a new map onto the scanning recursion context, give it a unique object id number, which counts from zero. Moved the location where we add metadata for each file from the "cli_magic_scan" function over to the "recursion stack push" function. Include a "path" as a parameter for creating a new fmap, and rename some related variables and functions to be more intuitive. CLAM-2796 See also: CLAM-2485, CLAM-2626	2025-08-14 21:23:33 -04:00
Valerie Snyder	aa7b7e9421	Swap clean cache from MD5 to SHA2-256 Change the clean-cache to use SHA2-256 instead of MD5. Note that all references are changed to specify "SHA2-256" now instead of "SHA256", for clarity. But there is no plan to add support for SHA3 algorithms at this time. Significant code cleanup. E.g.: - Implemented goto-done error handling. - Used `uint8_t` instead of `unsigned char`. - Use `bool` for boolean checks, rather than `int`. - Used `#defines` instead of magic numbers. - Removed duplicate `#defines` for things like hash length. Add new option to calculate and record additional hash types when the "generate metadata JSON" feature is enabled: - libclamav option: `CL_SCAN_GENERAL_STORE_EXTRA_HASHES` - clamscan option: `--json-store-extra-hashes` (default off) - clamd.conf option: `JsonStoreExtraHashes` (default 'no') Renamed the sigtool option `--sha256` to `--sha2-256`. The original option is still functional, but is deprecated. For the "generate metadata JSON" feature, the file hash is now stored as "sha2-256" instead of "FileMD5". If you enable the "extra hashes" option, then it will also record "md5" and "sha1". Deprecate and disable the internal "SHA collect" feature. This option had been hidden behind C #ifdef checks for an option that wasn't exposed through CMake, so it was basically unavailable anyways. Changes to calculate file hashes when they're needed and no sooner. For the FP feature in the matcher module, I have mimicked the optimization in the FMAP scan routine which makes it so that it can calculate multiple hashes in a single pass of the file. The `HandlerType` feature stores a hash of the file in the scan ctx to prevent retyping the exact same data more than once. I removed that hash field and replaced it with an attribute flag that is applied to the new recursion stack layer when retyping a file. This also closes a minor bug that would prevent retyping a file with an all-zero hash. 
:) The work upgrading cache.c to support SHA2-256 sized hashes is thanks to: https://github.com/m-sola CLAM-255 CLAM-1858 CLAM-1859 CLAM-1860	2025-08-14 21:23:30 -04:00
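The "goto-done error handling" cleanup named in aa7b7e9421 can be illustrated with a minimal sketch. This is not code from cache.c; the function `sum_file_bytes()` and the `SKETCH_BUF_SIZE` constant are hypothetical names chosen for the example. It shows the general pattern: every failure jumps to a single `done:` label, so resources are released in exactly one place.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_BUF_SIZE 4096 /* hypothetical; a #define instead of a magic number */

/* Hypothetical helper: adds up every byte in a file.
 * Returns 0 on success, -1 on any failure. */
int sum_file_bytes(const char *path, uint64_t *sum_out)
{
    int status   = -1; /* assume failure until everything has succeeded */
    FILE *fp     = NULL;
    uint8_t *buf = NULL;
    uint64_t sum = 0;
    size_t n, i;

    if (NULL == path || NULL == sum_out) {
        goto done;
    }

    fp = fopen(path, "rb");
    if (NULL == fp) {
        goto done;
    }

    buf = malloc(SKETCH_BUF_SIZE);
    if (NULL == buf) {
        goto done;
    }

    while ((n = fread(buf, 1, SKETCH_BUF_SIZE, fp)) > 0) {
        for (i = 0; i < n; i++) {
            sum += buf[i];
        }
    }
    if (ferror(fp)) {
        goto done;
    }

    *sum_out = sum;
    status   = 0;

done:
    /* Single cleanup path: safe whether or not each resource was acquired. */
    free(buf);
    if (NULL != fp) {
        fclose(fp);
    }
    return status;
}
```

Because `free(NULL)` is a no-op and `fp` is checked before closing, the cleanup block is correct no matter which step failed.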
Valerie Snyder	0cc5d75093	ZIP: Fix infinite loop + significant code cleanup An infinite loop may occur when scanning some malformed ZIP files. I introduced this issue in `96c00b6d80` with this line: ```c // decrement coff by 1 to account for the increment at the end of the loop coff -= 1; ``` The problem is that the function may return 0, which should indicate that there are no more files. The result was that `coff` would stay the same and the loop would repeat. This issue is in 1.5 development and affects the 1.5.0 beta but does not affect any production versions. Fixes: https://github.com/Cisco-Talos/clamav/issues/1534 Special thanks to Sophie0x2E for an initial fix, proposed in https://github.com/Cisco-Talos/clamav/pull/1539 In review, I was uncomfortable with other existing code and decided to do a more significant overhaul of the error handling in the ZIP module. In addition to cleanup, this commit has some functional changes: - When parsing a central directory file header inside of `parse_central_directory_file_header()`, it will now fail out if the "extra length" or "comment length" fields would exceed the length of the archive. That doesn't mean the associated local file header won't be parsed later, but it won't use the central directory file header to find it. Instead, the ZIP module will have to find the local file header by searching for extra records not listed in the central directory. This change was mostly to tidy up complex error handling. - Add two new FTM signatures to identify split ZIP archives. This signature identifies the first segment (first file) in a split or spanned ZIP archive. It may also be found on a single-segment "split" archive, depending on the ZIP archiver. ``` 0:0:504b0708504b0304:ZIP (First segment split/spanned):CL_TYPE_ANY:CL_TYPE_ZIP ``` Practically speaking, this new signature makes it so ClamAV identifies the file as a ZIP right away without having to rely on SFX_ZIP detection. 
Extraction is then handled by the ZIP `cli_unzip` function rather than extracting each with `cli_unzip_single` which handles SFX_ZIP entries. Note: ClamAV isn't capable of finding additional files on disk to support handling the additional segments. So it doesn't make any difference with handling those other files. This signature is for single-segment split/spanned archives, depending on the ZIP archiver. ``` 0:0:504b0303504b0304:ZIP (Single-segment split/spanned):CL_TYPE_ANY:CL_TYPE_ZIP ``` Like the first one, this also means we won't rely on SFX_ZIP detection and will treat these files as regular ZIPs. - Added a test file to verify that ClamAV can extract a single-file "split" ZIP. - Added a clamscan test with test files to verify that scanning a split archive across two segments correctly extracts the properly formed zip file entries. Sadly, we can't join the segments to extract everything.	2025-08-11 18:14:19 -04:00
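Both FTM signatures in 0cc5d75093 match eight fixed bytes at offset 0. As a rough illustration of what that comparison amounts to, here is a minimal sketch; `looks_like_split_zip()` is a hypothetical name and this is not how libclamav's file-type matcher is implemented. The byte values are copied from the two signatures quoted in the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic bytes taken verbatim from the two signatures in the commit message. */
static const uint8_t FIRST_SEGMENT_MAGIC[8]  = {0x50, 0x4b, 0x07, 0x08, 0x50, 0x4b, 0x03, 0x04};
static const uint8_t SINGLE_SEGMENT_MAGIC[8] = {0x50, 0x4b, 0x03, 0x03, 0x50, 0x4b, 0x03, 0x04};

/* Hypothetical helper: true if the buffer starts with either magic sequence. */
bool looks_like_split_zip(const uint8_t *buf, size_t len)
{
    if (NULL == buf || len < sizeof(FIRST_SEGMENT_MAGIC)) {
        return false; /* too short to hold the 8-byte marker */
    }
    return 0 == memcmp(buf, FIRST_SEGMENT_MAGIC, sizeof(FIRST_SEGMENT_MAGIC)) ||
           0 == memcmp(buf, SINGLE_SEGMENT_MAGIC, sizeof(SINGLE_SEGMENT_MAGIC));
}
```

A plain ZIP that begins directly with the local file header bytes `50 4b 03 04` does not match either pattern, which is consistent with these signatures only covering the split/spanned cases.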
Val Snyder	8d485b9bfd	FIPS-compliant CVD signing and verification Add X509 certificate chain based signing with PKCS7-PEM external signatures distributed alongside CVD's in a custom .cvd.sign format. This new signing and verification mechanism is primarily in support of FIPS compliance. Fixes: https://github.com/Cisco-Talos/clamav/issues/564 Add a Rust implementation for parsing, verifying, and unpacking CVD files. Now installs a 'certs' directory in the app config directory (e.g. <prefix>/etc/certs). The install location is configurable. The CMake option to configure the CVD certs directory is: `-D CVD_CERTS_DIRECTORY=PATH` New options to set an alternative CVD certs directory: - Commandline for freshclam, clamd, clamscan, and sigtool is: `--cvdcertsdir PATH` - Env variable for freshclam, clamd, clamscan, and sigtool is: `CVD_CERTS_DIR` - Config option for freshclam and clamd is: `CVDCertsDirectory PATH` Sigtool: - Add sign/verify commands. - Also verify CDIFF external digital signatures when applying CDIFFs. - Place commonly used commands at the top of --help string. - Fix up manpage. Freshclam: - Will try to download .sign files to verify CVDs and CDIFFs. - Fix an issue where making a CLD would only include the CFG file for daily and not if patching any other database. libclamav.so: - Bump version to 13:0:1 (aka 12.1.0). - Also remove libclamav.map versioning. Resolves: https://github.com/Cisco-Talos/clamav/issues/1304 - Add two new API's to the public clamav.h header: ```c extern cl_error_t cl_cvdverify_ex(const char *file, const char *certs_directory); extern cl_error_t cl_cvdunpack_ex(const char *file, const char *dir, bool dont_verify, const char *certs_directory); ``` The original `cl_cvdverify` and `cl_cvdunpack` are deprecated. - Add `cl_engine_field` enum option `CL_ENGINE_CVDCERTSDIR`. You may set this option with `cl_engine_set_str` and get it with `cl_engine_get_str`, to override the compiled in default CVD certs directory. 
libfreshclam.so: Bump version to 4:0:0 (aka 4.0.0). Add sigtool sign/verify tests and test certs. Make it so downloadFile doesn't throw a warning if the server doesn't have the .sign file. Replace use of md5-based FP signatures in the unit tests with sha256-based FP signatures because the md5 implementation used by Python may be disabled in FIPS mode. Fixes: https://github.com/Cisco-Talos/clamav/issues/1411 CMake: Add logic to enable the Rust openssl-sys / openssl-rs crates to build against the same OpenSSL library as is used for the C build. The Rust unit test application must also link directly with libcrypto and libssl. Fix some log messages with missing new lines. Fix missing environment variable notes in --help messages and manpages. Deconflict CONFDIR/DATADIR/CERTSDIR variable names that are defined in clamav-config.h.in for libclamav from variable that had the same name for use in clamav applications that use the optparser. The 'clamav-test' certs for the unit tests will live for 10 years. The 'clamav-beta.crt' public cert will only live for 120 days and will be replaced before the stable release with a production 'clamav.crt'.	2025-03-26 19:33:25 -04:00
Micah Snyder	4b5130d50a	Tests: remove dead code Remove check for 'srcdir' and 'unrar_disabled' variables. These were only used by legacy Automake tooling. Resolves: https://github.com/Cisco-Talos/clamav/issues/1447	2025-02-20 10:42:16 -05:00
Micah Snyder	45c6938be8	Remove libbz2 dead code As of ClamAV 0.105, libbz2 is required. There is also no option to disable bz2 support. This commit removes the dead code associated with the old build option.	2024-04-13 12:34:15 -04:00
Micah Snyder	71ff5c579c	Remove libxml2 dead code As of ClamAV 0.105, libxml2 is required. There is also no option to disable PCRE support. This commit removes the dead code associated with the old build option.	2024-04-13 12:34:15 -04:00
RainRat	143d23c326	Fix typos and remove duplicate #include	2024-04-10 19:31:46 -04:00
Micah Snyder	e48dfad49a	Windows: Fix C/Rust FFI compat issue + Windows compile warnings Primarily this commit fixes an issue with the size of the parameters passed to cli_checklimits(). The parameters were "unsigned long", which varies in size depending on platform. I've switched them to uint64_t / u64. While working on this, I observed some concerning warnings on Windows, and some less serious ones, primarily regarding inconsistencies with `const` parameters. Finally, in `scanmem.c`, there is a warning regarding use of `wchar_t *` with `GetModuleFileNameEx()` instead of `GetModuleFileNameExW()`. This made me realize this code assumes we're not defining `UNICODE`, which would have such macros use the 'A' variant. I have fixed it the best I can, although I'm still a little uncomfortable with some of this code that uses `char` or `wchar_t` instead of TCHAR. I also remove the `if (GetModuleFileNameEx) {` conditional, because this macro/function will always be defined. The original code was checking a function pointer, and so this was a bug when integrating into ClamAV. Regarding the changes to `rijndael.c`, I found that this module assumes `unsigned long` == 32bits. It does not. I have corrected it to use `uint32_t`.	2024-04-09 10:35:22 -04:00
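The `unsigned long` problem described in e48dfad49a comes from the type being 32 bits wide on 64-bit Windows but 64 bits wide on most 64-bit Unix-like systems. A minimal sketch of why fixed-width types avoid the issue follows; `rotl32()` is a hypothetical helper written for this example and is not taken from `rijndael.c`.

```c
#include <stdint.h>

/* Hypothetical 32-bit left rotate. With uint32_t the result is identical on
 * every platform. Written with "unsigned long", the high bits would not wrap
 * around on platforms where that type is 64 bits wide. */
uint32_t rotl32(uint32_t x, unsigned int n)
{
    n &= 31; /* keep the shift count in range */
    if (0 == n) {
        return x;
    }
    return (uint32_t)((x << n) | (x >> (32 - n)));
}
```

The same reasoning applies across an FFI boundary: a C `uint64_t` parameter always matches a Rust `u64`, whereas the width of `unsigned long` depends on the target.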
Micah Snyder	902623972d	Remove max-allocation limits where not required The cli_max_malloc, cli_max_calloc, and cli_max_realloc functions provide a way to protect against allocating too much memory when the size of the allocation is derived from the untrusted input. Specifically, we worry about values in the file being scanned being manipulated to exhaust the RAM and crash the application. There is no need to check the limits if the size of the allocation is fixed, or if the size of the allocation is necessary for signature loading, or the general operation of the applications. E.g. checking the max-allocation limit for the size of a hash, or for the size of the scan recursion stack, is a complete waste of time. Although we significantly increased the max-allocation limit in a recent release, it is best not to check an allocation if the allocation will be safe. It would be a waste of time. I am also hopeful that if we can reduce the number of allocations that require a limit-check to those that require it for the safe scan of a file, then eventually we can store the limit in the scan- context, and make it configurable.	2024-03-15 13:18:47 -04:00
Micah Snyder	8e04c25fec	Rename clamav memory allocation functions We have some special functions to wrap malloc, calloc, and realloc to make sure we don't allocate more than some limit, similar to the max-filesize and max-scansize limits. Our wrappers are really only needed when allocating memory for scans based on untrusted user input, where a scan file could have bytes that claim you need to allocate some ridiculous amount of memory. Right now they're named: - cli_malloc - cli_calloc - cli_realloc - cli_realloc2 ... and these names do not convey their purpose This commit renames them to: - cli_max_malloc - cli_max_calloc - cli_max_realloc - cli_max_realloc2 The realloc ones also have an additional feature in that they will not free your pointer if you try to realloc to 0 bytes. Freeing the memory is undefined by the C spec, and only done with some realloc implementations, so this stabilizes on the behavior of not doing that, which should prevent accidental double-free's. So for the case where you may want to realloc and do not need to have a maximum, this commit adds the following functions: - cli_safer_realloc - cli_safer_realloc2 These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when MPOOL is disabled (e.g. because mmap-support is not found), so as to match the behavior in the mpool_realloc/2 functions that do not make use of the allocation-limit.	2024-03-15 13:18:47 -04:00
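The two behaviors described in 8e04c25fec, a size limit for allocations driven by untrusted input and a realloc that never frees on a zero-byte request, can be sketched as follows. This is an illustration under assumptions, not the libclamav implementation: the `sketch_` names and the 1 GiB `SKETCH_MAX_ALLOCATION` value are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limit; the real threshold in libclamav may differ. */
#define SKETCH_MAX_ALLOCATION ((size_t)1 << 30)

/* Limit-checked malloc, for sizes derived from untrusted file contents. */
void *sketch_max_malloc(size_t size)
{
    if (0 == size || size > SKETCH_MAX_ALLOCATION) {
        return NULL; /* refuse empty or oversized requests */
    }
    return malloc(size);
}

/* realloc without a limit, but with defined behavior for size 0:
 * it returns NULL and leaves ptr untouched, so the caller still owns
 * ptr and frees it exactly once. */
void *sketch_safer_realloc(void *ptr, size_t size)
{
    if (0 == size) {
        return NULL;
    }
    return realloc(ptr, size);
}
```

With this convention a caller that receives NULL can always free the original pointer, which is what prevents the accidental double-free the commit message mentions.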
Micah Snyder	6d6e04ddf8	Optimization: replace limited allocation calls There are a large number of allocations for fixed-size buffers using the `cli_malloc` and `cli_calloc` calls that check if the requested size is larger than our allocation threshold for allocations based on untrusted input. These allocations will always be lower than the threshold, so the extra stack frame and check for these calls is a waste of CPU. This commit replaces needless calls with A -> B: - cli_malloc -> malloc - cli_calloc -> calloc - CLI_MALLOC -> MALLOC - CLI_CALLOC -> CALLOC I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by the max-allocation threshold, when MMAP is found/enabled. But the alternative was set to cli_malloc / cli_calloc when disabled. I changed those as well. I didn't change the cli_realloc/2 calls because our version of realloc not only implements a threshold but also stabilizes the undefined behavior in realloc to protect against accidental double-free's. It may be worth implementing a cli_realloc that doesn't have the threshold built-in, however, so as to allow reallocations for things like buffers for loading signatures, which aren't subject to the same concern as allocations for scanning possible malware. There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC, because it appears to be allocating based on untrusted input.	2024-03-15 13:18:47 -04:00
Micah Snyder	e389c3edac	Tests: add 3 test cases for OneNote 2007, 2010, and a recent webapp export	2023-12-11 15:18:41 -05:00
RainRat	caf324e544	Fix typos (no functional changes)	2023-11-26 18:01:19 -05:00
Micah Snyder	a08377f9cf	Coverity-344513: Fix use-after-free in unit test error condition.	2023-04-26 10:43:13 -07:00
Micah Snyder	0bd2ae26bc	Scanners: Remove allmatch checks + significant code cleanup Also fixed a number of conditions where magic_scan() critical errors may be ignored. To ensure that the scan truly aborts for signature matches (not in allmatch mode) and for timeouts, the `ctx->abort` option is now set in these two conditions, and checked in several spots in magic_scan(). Additionally, I've consolidated some of the "scan must halt" type of checks (mostly large switch statements) into a function so that we can use the exact same logic in a few places in magic_scan(). I've also fixed a few minor warnings and code format issues.	2022-10-19 13:13:57 -07:00
Micah Snyder	8340e55660	Tests: unit tests for cl_load(), cl_cvdverify(), cl_cvdunpack() Some basic testing is needed for the new cl_cvdunpack() API, so this commit adds basic unit tests for that. For reasons unknown, a number of cl_* API's have stubs for unit tests that weren't filled out. The CVD load/verify ones in particular required access to a signed CVD. We actually ship a very basic signed CVD with the databases now, so I added tests for those while I was at it.	2022-10-12 21:46:54 -07:00
mko-x	a21cc6dcd7	Add explicit log level parameter to application logging API * Added loglevel parameter to logg() * Fix logg and mprintf internals with new loglevels * Update all logg calls to set loglevel * Update all mprintf calls to set loglevel * Fix hidden logg calls * Executed clam-format	2022-02-15 15:13:55 -08:00
Micah Snyder	0354482e16	Fix issues reading from uncompressed nested files The fmap module provides a mechanism for creating a mapping into an existing map at an offset and length that's used when a file is found with an uncompressed archive or when embedded files are found with embedded file type recognition in scanraw(). This is the "fmap_duplicate()" function. Duplicate fmaps just reference the original fmap's 'data' or file handle/descriptor while allowing the caller to treat it like a new map using offsets and lengths that don't account for the original/actual file dimensions. fmap's keep track of this with m->nested_offset & m->real_len, which admittedly have confusing names. I found incorrect uses of these in a handful of locations. Notably: - In cli_magic_scan_nested_fmap_type(). The force-to-disk feature would have been checking incorrect sizes and may have written incorrect offsets for duplicate fmaps. - In XDP parser. - A bunch of places from the previous commit when making dupe maps. This commit fixes those and adds lots of documentation to the fmap.h API to try to prevent confusion in the future. nested_offset should never be referenced outside of fmap.c/h. The fmap_* functions for accessing or reading map data have two implementations, mem_* or handle_*, depending the data source. I found issues with some of these so I made a unit test that covers each of the functions I'm concerned about for both types of data sources and for both original fmaps and nested/duplicate fmaps. With the tests, I found and fixed issues in these fmap functions: - handle_need_offstr(): must account for the nested_offset in dupe maps. - handle_gets(): must account for nested_offset and use len & real_len correctly. - mem_need_offstr(): must account for nested_offset in dupe maps. - mem_gets(): must account for nested_offset and use len & real_len correctly. Moved CDBRANGE() macro out of function definition for better legibility. Fixed a few warnings.	2021-10-25 16:02:29 -07:00
Micah Snyder	4a9cff9214	CMake: support Xcode builds Xcode (and perhaps some other generators?) do not like targets that have only object files. See: https://cmake.org/cmake/help/latest/command/add_library.html#object-libraries And: https://cmake.org/pipermail/cmake/2016-May/063479.html This issue manifests when using `-G Xcode` on macOS as the library dylibs being missing when linking with other binaries. This commit removes the object libraries for libclamav, libfreshclam, libclamunrar_iface, libclamunrar, libclammspack, and (lib)common because they were used by static or shared libs that didn't themselves have any added sources. Add getter & setter for the debug flag, so it isn't referenced by unit tests or other code that links with libclamav. This is needed because global variables are exported symbols on Windows.	2021-08-18 13:53:34 -07:00
Andy Ragusa	c4af06c317	Fix ENABLE_UNRAR=off build Cmake errors out when the ENABLE_UNRAR=off option is used. This commit addresses that.	2021-07-31 11:17:27 -07:00
Micah Snyder	201e1b12a7	XOR test files; clean up tests directory The split test files are flagged by some AV's because they look like broken executables. Instead of splitting the test files to prevent detections, we should encrypt them. This commit replaces the "reassemble testfiles" script with a basic "XOR testfiles" script that can be used to encrypt or decrypt test files. This commit also of course then replaces all the split files with xor'ed files. The test and unit_tests directories were a bit of a mess, so I reorganized them all into unit_tests with all of the test files placed under "unit_tests/input" using subdirectories for different types of files.	2021-07-17 10:39:27 -07:00
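The idea behind the "XOR testfiles" script in 201e1b12a7 is that XOR with a fixed key is its own inverse, so one routine both obscures and restores a test file. A minimal sketch of that property follows; it is written in C for consistency with the rest of this log's examples, and the function name and single-byte key are assumptions made for illustration, not the format the project's script uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: XOR every byte in place with a one-byte key.
 * Calling it twice with the same key restores the original data. */
void xor_buffer(uint8_t *buf, size_t len, uint8_t key)
{
    size_t i;

    if (NULL == buf) {
        return;
    }
    for (i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}
```

Because the transformed bytes no longer resemble an executable, other scanners are less likely to flag the stored test files.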
Micah Snyder	afbf0b6180	Fix Windows text file EOL conversion issues On Windows, files open()'ed without the O_BINARY flag will have new-line LF (aka \n) converted to CRLF (aka \r\n) automatically when read from or written to. This is undesirable for all scan targets AND temp files because it affects pattern matching and with hashing. This commit converts a handful of instances throughout the codebase where it appears that O_BINARY was mistakenly omitted and could result in unexpected behavior on Windows. Git on Windows also converts LF -> CRLF for "text" files, for editing purposes. This is problematic for scan files and test files that should match verbatim. We can prevent this issue by marking .ref test files as "binary" in the .gitattributes file and by always opening scan files and temp files as binary. In this commit I've also removed the `ChangeLog merge=cl-merge` line that was once used to reduce ChangeLog merge conflicts by using the gnulib git-merge-changlog tool. This project now categorizes changes in the NEWS.md. For finer detail, git commit history is fully accessible on github.com.	2021-02-25 11:41:28 -08:00
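The O_BINARY fix in afbf0b6180 can be sketched in a portable way. On POSIX systems `O_BINARY` is normally not defined, so a common idiom is to define it as 0 there and always pass it to `open()`. The helper name `open_for_scan()` is hypothetical, and the sketch assumes a POSIX-style `open()` declared in `<fcntl.h>`; on Windows the declaration may come from `<io.h>` instead.

```c
#include <fcntl.h>

/* On platforms without text-mode translation the flag does not exist,
 * so make it a harmless no-op there. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: open a scan target read-only, never in text mode,
 * so LF/CRLF translation cannot alter the bytes that get hashed or matched. */
int open_for_scan(const char *path)
{
    return open(path, O_RDONLY | O_BINARY);
}
```

The return value follows `open()`: a file descriptor on success, or -1 on failure.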
Micah Snyder	2552cfd0d1	CMake: Add CTest support to match Autotools checks An ENABLE_TESTS CMake option is provided so that users can disable testing if they don't want it. Instructions for how to use this are included in the INSTALL.cmake.md file. If you run `ctest`, each testcase will write out a log file to the <build>/unit_tests directory. As with Autotools' make check, the test files are from test/.split and unit_tests/.split files, but for CMake these are generated at build time instead of at test time. On Posix systems, sets the LD_LIBRARY_PATH so that ClamAV-compiled libraries can be loaded when running tests. On Windows systems, CTest will identify and collect all library dependencies and assemble a temporary install under the build/unit_tests directory so that the libraries can be loaded when running tests. The same feature is used on Windows when using CMake to install to collect all DLL dependencies so that users don't have to install them manually afterwards. Each of the CTest tests are run using a custom wrapper around Python's unittest framework, which is also responsible for finding and inserting valgrind into the valgrind tests on Posix systems. Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by default, if Valgrind can be found. There's no need to set VG=1. CTest's memcheck module is NOT supported, because we use Python to orchestrate our tests. Added a bunch of Windows compatibility changes to the unit tests. These were primarily changing / to PATHSEP and making adjustments to use Win32 C headers and ifdef out the POSIX ones which aren't available on Windows. Also disabled a bunch of tests on Win32 that don't work on Windows, notably the mmap ones and FD-passing (i.e. FILEDES) ones. Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate warnings on Windows where json.h is included after inttypes.h because json-c's inttypes replacement relies on it. 
This is a bit of a hack and may be removed if json-c fixes their inttypes header stuff in the future. Add preprocessor definitions on Windows to disable MSVC warnings about CRT secure and nonstandard functions. While there may be a better solution, this is needed to be able to see other more serious warnings. Add missing file comment block and copyright statement for clamsubmit.c. Also change json-c/json.h include filename to json.h in clamsubmit.c. The directory name is not required. Changed the hash table data integer type from long, which is poorly defined, to size_t -- which is capable of storing a pointer. Fixed a bunch of casts regarding this variable to eliminate warnings. Fixed two bugs causing utf8 encoding unit tests to fail on Windows: - The in_size variable should be the number of bytes, not the character count. This was causing the SHIFT_JIS (japanese codepage) to UTF8 transcoding test to only transcode half the bytes. - It turns out that the MultiByteToWideChar() API can't transcode UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer and flip the bytes on each uint16_t. This bug was causing the UTF16-BE to UTF8 tests to fail. I also split up the utf8 transcoding tests into separate tests so I could see all of the failures instead of just the first one. Added a flags parameter to the unit test function to open testfiles because it turns out that on Windows if a file contains the \r\n it will replace it with just \n if you opened the file as a text file instead of as binary. However, if we open the CBC files as binary, then a bunch of bytecode tests fail. So I've changed the tests to open the CBC files in the bytecode tests as text files and open all other files as binary. Ported the feature tests from shell scripts to Python using a modified version of our QA test-framework, which is largely compatible and will allow us to migrate some QA tests into this repo. 
I'd like to add GitHub Actions pipelines in the future so that all public PR's get some testing before anyone has to manually review them. The clamd --log option was missing from the help string, though it definitely works. I've added it in this commit. It appears that clamd.c was never clang-format'd, so this commit also reformats clamd.c. Some of the check_clamd tests expected the path returned by clamd to match character for character with original path sent to clamd. However, as we now evaluate real paths before a scan, the path returned by clamd isn't going to match the relative (and possibly symlink-ridden) path passed to clamdscan. I fixed this test by changing the test to search for the basename: <signature> FOUND within the response instead of matching the exact path. Autotools: Link check_clamd with libclamav so we can use our utility functions in check_clamd.c.	2021-02-25 11:41:26 -08:00
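The UTF16-BE to UTF16-LE fix described in 2552cfd0d1 ("iterate over the buffer and flip the bytes on each uint16_t") can be sketched as below. The function name `utf16_swap_bytes()` is hypothetical and this is not the code from the commit; it only illustrates the byte flip on each 16-bit code unit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of every 16-bit code unit in place.
 * Applied to UTF16-BE data it yields UTF16-LE, and vice versa. */
void utf16_swap_bytes(uint16_t *units, size_t count)
{
    size_t i;

    if (NULL == units) {
        return;
    }
    for (i = 0; i < count; i++) {
        units[i] = (uint16_t)((units[i] >> 8) | (units[i] << 8));
    }
}
```

Note that `count` is the number of 16-bit units, not the number of bytes, which is the same byte-count versus character-count distinction behind the `in_size` bug mentioned in the commit.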
Micah Snyder	e4e3149368	Fix fmap-duplicate performance issue The fmap_duplicate function is used to create a new fmap with a view into an existing fmap. When the new view is a different size than the old fmap, a new hash must be calculated for the duplicate fmap. However, when the duplicated fmap is the same size as the original fmap, the hash will be the same and there's no point recalculating. The issue is apparent when scanning large EXE files because the hash was being calculated at the beginning and end of the scan. Digging into this issue revealed that hash calculations for fmaps were also being performed at the wrong place. For scans of maps we use fmap_duplicate() early in the process to apply the name API argument to the duplicate fmap. Fixing the logic so we don't recalculate the hash revealed that we never calculated hashes for fmap's created from buffers in the first place, so that also had to be fixed by relocating where the hash is calculated. I also found that fmap_duplicate()'s offset argument used an off_t, though it and all caller offsets are not allowed to be negative. This was a bit of tangent to fix a bunch of off_t variables and parameters that should've been size_t. Added a couple unit tests to verify that making duplicate fmaps, and duplicate-duplicate fmaps works as expected after the change. Changed CLI_ISCONTAINED() and CLI_ISCONTAINED2() macros to cast to size_t, because pointers and buffer sizes may not be negative, and these two macros do not rely on subtraction.	2021-01-28 12:54:50 -08:00
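A containment check in the spirit of the CLI_ISCONTAINED() macros mentioned in e4e3149368 can be written using only unsigned comparisons and additions. The sketch below is an assumption about the general technique, not the macro's actual definition; `is_contained()` is a hypothetical name, and `uintptr_t` is used here for the pointer values where the commit message speaks of casting to `size_t`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the small buffer [sb, sb + sb_size) lies
 * entirely inside the big buffer [bb, bb + bb_size). All arithmetic is
 * unsigned and uses addition only. This sketch assumes neither range
 * wraps around the end of the address space. */
bool is_contained(const void *bb, size_t bb_size, const void *sb, size_t sb_size)
{
    uintptr_t big   = (uintptr_t)bb;
    uintptr_t small = (uintptr_t)sb;

    return bb_size > 0 &&
           sb_size > 0 &&
           sb_size <= bb_size &&
           small >= big &&
           small + sb_size <= big + bb_size;
}
```

Using unsigned types removes the sign-comparison warnings that appear when a signed offset type such as `off_t` is mixed with buffer sizes.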
Micah Snyder (micasnyd)	f2787f77d1	Remove 'v' typo from unit test source Remove what looks to be a copypaste typo. The build & unit tests seem to work fine despite the typo.	2020-08-24 15:07:09 -07:00
Micah Snyder (micasnyd)	9e20cdf6ea	Add CMake build tooling This patch adds experimental-quality CMake build tooling. The libmspack build required a modification to use "" instead of <> for header #includes. This will hopefully be included in the libmspack upstream project when adding CMake build tooling to libmspack. Removed use of libltdl when using CMake. Flex & Bison are now required to build. If -DMAINTAINER_MODE, then GPERF is also required, though it currently doesn't actually do anything. TODO! I found that the autotools build system was generating the lexer output but not actually compiling it, instead using previously generated (and manually renamed) lexer c source. As a consequence, changes to the .l and .y files weren't making it into the build. To resolve this, I removed generated flex/bison files and fixed the tooling to use the freshly generated files. Flex and bison are now required build tools. On Windows, this adds a dependency on the winflexbison package, which can be obtained using Chocolatey or may be manually installed. CMake tooling only has partial support for building with external LLVM library, and no support for the internal LLVM (to be removed in the future). I.e. The CMake build currently only supports the bytecode interpreter. Many files used include paths relative to the top source directory or relative to the current project, rather than relative to each build target. Modern CMake support requires including internal dependency headers the same way you would external dependency headers (albeit with "" instead of <>). This meant correcting all header includes to be relative to the build targets and not relative to the workspace. For example, ... ```c #include "../libclamav/clamav.h" #include "clamd/clamd_others.h" ``` ... becomes: ```c // libclamav #include "clamav.h" // clamd #include "clamd_others.h" ``` Fixes header name conflicts by renaming a few of the files. 
Converted the "shared" code into a static library, which depends on libclamav. The ironically named "shared" static library provides features common to the ClamAV apps which are not required in libclamav itself and are not intended for use by downstream projects. This change was required for correct modern CMake practices but was also required to use the automake "subdir-objects" option. This eliminates warnings when running autoreconf which, in the next version of autoconf & automake are likely to break the build. libclamav used to build in multiple stages where an earlier stage is a static library containing utils required by the "shared" code. Linking clamdscan and clamdtop with this libclamav utils static lib allowed these two apps to function without libclamav. While this is nice in theory, the practical gains are minimal and it complicates the build system. As such, the autotools and CMake tooling was simplified for improved maintainability and this feature was thrown out. clamdtop and clamdscan now require libclamav to function. Removed the nopthreads version of the autotools libclamav_internal_utils static library and added pthread linking to a couple apps that may have issues building on some platforms without it, with the intention of removing needless complexity from the source. Kept the regular version of libclamav_internal_utils.la though it is no longer used anywhere but in libclamav. Added an experimental doxygen build option which attempts to build clamav.h and libfreshclam doxygen html docs. The CMake build tooling also may build the example program(s), which isn't a feature in the Autotools build system. Changed C standard to C90+ due to inline linking issues with socket.h when linking libfreshclam.so on Linux. Generate common.rc for win32. Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef from misc.c. Add CMake files to the automake dist, so users can try the new CMake tooling w/out having to build from a git clone. 
clamonacc changes: - Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other similar macros. - Added a new clamav-clamonacc.service systemd unit file, based on the work of ChadDevOps & Aaron Brighton. - Added missing clamonacc man page. Updates to clamdscan man page, add missing options. Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use libclamav). Rename Windows mspack.dll to libmspack.dll so all ClamAV-built libraries have the lib-prefix with Visual Studio as with CMake.	2020-08-13 00:25:34 -07:00
Micah Snyder	07a66adc75	Fix bug added in previous patch, fixup unit tests to use newly added sanitized_basename parameter.	2020-08-11 11:45:06 -07:00
Micah Snyder (micasnyd)	8db5fcae6f	Add unit tests for conv to UTF-8 Also relocated codepage table from msdoc.h to entconv.h Also adds new macros for codepages to reduce use of magic numbers when referencing code pages elsewhere in libclamav.	2020-07-24 14:45:44 -07:00
Micah Snyder	50455664a7	libclamav: Fix fmap leak in bytecode runtime Fixes an fmap leak in the bytecode switch_input() API. The switch_input() API provides a way to read from an extracted file instead of reading from the current file. The issue is that the current implementation fails to free the fmap created to read from the extracted file on cleanup or when switching back to the original fmap. In addition, it fails to use the cli_bytecode_context_setfile() function to restore the file_size in the context for the current fmap. Fixes a couple fmap leaks in the unit tests.	2020-04-20 11:26:43 -07:00
Micah Snyder (micasnyd)	485d8dec67	Check test support for check 0.13 Tests in libcheck 0.13 must have {} between START_TEST and END_TEST else it will not compile. Also replaced all deprecated "fail_" macros with "ck_" macros. E.g. fail_unless() becomes ck_assert_msg() The checks_common.h header file provided a couple of macros to support versions older than 0.9.3. As these older versions are no longer relevant, I've removed those compatibility macros entirely.	2020-01-15 08:14:23 -08:00
Micah Snyder	ee40795fe2	Converted mpool calls to macros when USE_MPOOL is defined to clearly differentiate between function and macro behavior.	2019-10-02 16:08:25 -04:00
Micah Snyder	cef54eaf8f	Freshclam refresh. This update makes libcurl a hard requirement for ClamAV. New features added to freshclam: - Update signature definitions over HTTPS. - Support for HTTP protocol v1.1 (formerly v1.0). - New libfreshclam library with an all new API and versioning separate from libclamav (v2.0.0). This library is now built and installed alongside libclamav as a hard dependency of freshclam. - The ability to opt-in and opt-out of standard and optional official ClamAV databases (ExtraDatabase, ExcludeDatabase) - The option to specify the protocol and port number of official and private mirror servers. - Support for additional types of proxy servers beyond plain HTTP (SOCKS 4, SOCKS 5). Features removed from freshclam: - Mirror management (mirrors.dat) file. This feature is no longer needed as official signature databases are distributed using a paid content delivery network (Cloudflare). This commit also adds the following features for Windows users: - The clamsubmit tool. - The json-c library dependency, which will enable the --gen-json option in clamscan. - Third party libraries under the win32/3rdparty directory have been removed. Developers will need to build the libraries separately from ClamAV and provide the headers and lib/dll library files the same way they do for OpenSSL. This includes libxml2, pthread-win32, bzip2, zlib, pcre2 as well as new dependencies: curl, json-c. Developers are encouraged to use the build tool Mussels to simplify this task.	2019-10-02 16:08:22 -04:00
Micah Snyder	155eaaad8b	bb12284 - Fix to prevent path traversal when using cli_genfname() to generate filenames that may retain path and filename information. Changed scanrar so that it will no longer retain path information for extracted files.	2019-10-02 16:08:19 -04:00
Micah Snyder	72fd33c8b2	clang-format'd using new .clang-format rules.	2019-10-02 16:08:16 -04:00
Micah Snyder	d39cb6581f	Updating libclamunrar from legacy C implementation to modern unrar 5.6.5. API changes and supporting changes included to pass the filepath of the scanned file into libclamav through the cli_ctx structure, required by the unrar library to open archives. The filename argument may be optional for the scandesc scanning variant, but libclamav will make a best effort to identify the filename from the file descriptor if it was not provided. In addition, included the ability to prefix temp file and directory names with file basenames.	2018-12-02 23:06:59 -05:00
Micah Snyder	d7979d4ff7	Restructured scan options flags from a single bitflag field to a structure containing multiple bitflag fields. This also required adding a new function to the bytecode API to get scan options a la carte, and modifying the existing function to hand back scan options in the old/deprecated uint32_t bitflag format. Re-generated bytecode iface header files. Updated libclamav documentation detailing new scan options structure. Renamed references to 'algorithmic' detection to 'heuristic' detection. Renamed references to 'properties' to 'collect metadata'. Renamed references to 'scan all' to 'scan all match'. Renamed a couple of 'Hueristic.' signature names as 'Heuristics.' signatures (plural) to match majority of other heuristics.	2018-12-02 23:06:59 -05:00
Steven Morgan	1f1bf36b8e	Add 'virus found' callback. Refactor scan-all API.	2015-10-01 17:47:37 -04:00
Kevin Lin	c2b36ddd95	bb#11355 - added messages to unit-test log about 'T' env var	2015-07-28 16:22:29 -04:00
Kevin Lin	11bdd8a7c6	make check: added env check 'T' to set timeout	2015-03-24 12:06:57 -04:00
Shawn Webb	60d8d2c352	Move all the crypto API to clamav.h	2014-07-01 19:38:01 -04:00
Shawn Webb	849cdc78c9	Clean up the XML parser in the unit tests	2014-06-30 16:35:48 -04:00
Shawn Webb	f9afc3092f	Cleanup OpenSSL on program exit	2014-05-09 17:14:21 -04:00
Kevin Lin	f4f331ae26	unit_tests: fixed testing involving optional features: unrar, bzip2, and xml	2014-03-18 17:23:27 -04:00
Steven Morgan	97fd85d1f3	bz#10534: add patch from Scott Kitterman/Sebastian Andrzej Siewior for make check unit testing when unrar is disabled. Rework for 0.98.2.	2014-03-13 12:50:30 -04:00
Shawn Webb	da6e06dd68	Provide further abstractions to the OpenSSL integration work	2014-02-28 12:12:30 -05:00
Shawn Webb	f077c6174f	Fix some race conditions. Fix some memory leaks.	2014-02-13 13:05:50 -05:00
Shawn Webb	b2e7c931d0	Use OpenSSL for hashing.	2014-02-08 00:31:12 -05:00

109 commits