clamav

mirror of https://github.com/Cisco-Talos/clamav.git synced 2025-10-19 18:33:16 +00:00

Author	SHA1	Message	Date
Valerie Snyder	51adfb8b61	ClamScan & libclamav: improve precision of bytes-scanned, bytes-read The ClamScan scan summary prints bytes scanned and bytes read in multiples of 4096 (aka `CL_COUNT_PRECISION`), as is provided by the `cl_scanfile()`, `cl_scandesc()`, `cl_scanfile_callback()`, and `cl_scandesc_callback()` functions. I believe this imprecision was the result of using an `unsigned long int` which may be 64-bit or 32-bit, depending on platform. I believe the intention was to be able to support scanning more than 4 GiB of data. Since the new `cl_scan*_ex()` functions use a `uint64_t`, which guarantees a 64-bit integer and supports ~16,777,216 terabytes, I find no reason not to report an accurate count. For the legacy scan functions (above) I've kept the `CL_COUNT_PRECISION` behavior to maintain backwards compatibility. I have also improved the bytes scanned/read output to report GiB, MiB, KiB, or B as appropriate. Previously, it always reported "MB". CLAM-1433	2025-08-14 22:39:15 -04:00
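The unit selection described in commit 51adfb8b61 (GiB, MiB, KiB, or B as appropriate) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `format_byte_count` is a hypothetical helper name, and the two-decimal output format is an assumption.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not ClamAV's actual code): render an exact byte
 * count using the largest binary unit that fits. */
static void format_byte_count(uint64_t bytes, char *out, size_t out_len)
{
    const uint64_t KiB = 1024ULL;
    const uint64_t MiB = KiB * 1024ULL;
    const uint64_t GiB = MiB * 1024ULL;

    if (bytes >= GiB) {
        snprintf(out, out_len, "%.2f GiB", (double)bytes / (double)GiB);
    } else if (bytes >= MiB) {
        snprintf(out, out_len, "%.2f MiB", (double)bytes / (double)MiB);
    } else if (bytes >= KiB) {
        snprintf(out, out_len, "%.2f KiB", (double)bytes / (double)KiB);
    } else {
        /* Below 1 KiB the exact count is printed with no rounding. */
        snprintf(out, out_len, "%" PRIu64 " B", bytes);
    }
}
```

Because the counter is a `uint64_t`, the same helper covers every platform without the 32-bit overflow concern that motivated the 4096-byte granularity of the legacy functions.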
Valerie Snyder	31dcec1e42	libclamav: Add engine option to toggle temp directory recursion Temp directory recursion in ClamAV is when each layer of a scan gets its own temp directory in the parent layer's temp directory. In addition to temp directory recursion, ClamAV has been creating a new subdirectory for each file scan as a risk-averse method to ensure no temporary file leaks fill up the disk. Creating a directory is relatively slow on Windows in particular if scanning a lot of very small files. This commit: 1. Separates the temp directory recursion feature from the leave-temps feature so that libclamav can leave temp files without making subdirectories for each file scanned. 2. Makes it so that when temp directory recursion is off, libclamav will just use the configured temp directory for all files. The new option to enable temp directory recursion is for libclamav-only at this time. It is off by default, and you can enable it like this: ```c cl_engine_set_num(engine, CL_ENGINE_TMPDIR_RECURSION, 1); ``` For the `clamscan` and `clamd` programs, temp directory recursion will be enabled when `--leave-temps` / `LeaveTemporaryFiles` is enabled. The difference is that when disabled, it will return to using the configured temp directory without making a subdirectory for each file scanned, so as to improve scan performance for small files, mostly on Windows. Under the hood, this commit also: 1. Cleans up how we keep track of tmpdirs for each layer. The goal here is to align how we keep track of layer-specific stuff using the scan_layer structure. 2. Cleans up how we record metadata JSON for embedded files. Note: Embedded files being different from Contained files, as they are extracted not with a parser, but by finding them with file type magic signatures. CLAM-1583	2025-08-14 22:38:58 -04:00
Val Snyder	7ff29b8c37	Bump copyright dates for 2025	2025-02-14 10:24:30 -05:00
Micah Snyder	3ae9c1e434	Add LHA/LZH archive support File type magic signatures chosen based on the extensions supported by Rust delharc crate. See: https://docs.rs/delharc/latest/delharc/	2024-04-09 10:35:22 -04:00
Micah Snyder	405829ee88	Refine max-allocation and safer-allocation function and macro names We add the _OR_GOTO_DONE suffix to the macros that go to done if the allocation fails. This makes it obvious what is different about the macro versus the equivalent function, and that error handling is built-in. Renamed the cli_strdup to safer_strdup to make it obvious that it exists because it is safer than regular strdup. Regular strdup doesn't have the NULL check before trying to dup, and so may result in a NULL-deref crash. Also remove unused STRDUP (_OR_GOTO_DONE) macro, since the one with the NULL-check is preferred.	2024-03-15 13:18:47 -04:00
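The NULL-checked duplicate and the `_OR_GOTO_DONE` convention from commit 405829ee88 can be sketched as below. The names `example_safer_strdup`, `EXAMPLE_STRDUP_OR_GOTO_DONE`, and `duplicate_pair` are hypothetical stand-ins chosen for illustration; the real libclamav definitions may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a NULL-checking strdup: returns NULL for a
 * NULL input instead of dereferencing it. */
static char *example_safer_strdup(const char *s)
{
    if (s == NULL) {
        return NULL;
    }
    size_t len = strlen(s) + 1; /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Hypothetical macro in the _OR_GOTO_DONE style: the error handling
 * (jump to a local `done` label) is built into the macro. */
#define EXAMPLE_STRDUP_OR_GOTO_DONE(dst, src)  \
    do {                                       \
        (dst) = example_safer_strdup(src);     \
        if ((dst) == NULL) {                   \
            goto done;                         \
        }                                      \
    } while (0)

/* Duplicates two strings; returns 0 on success, -1 if either copy fails. */
static int duplicate_pair(const char *a, const char *b, char **out_a, char **out_b)
{
    int status   = -1;
    char *copy_a = NULL;
    char *copy_b = NULL;

    EXAMPLE_STRDUP_OR_GOTO_DONE(copy_a, a);
    EXAMPLE_STRDUP_OR_GOTO_DONE(copy_b, b);

    *out_a = copy_a;
    *out_b = copy_b;
    copy_a = copy_b = NULL; /* ownership moved to the caller */
    status = 0;

done:
    free(copy_a); /* free(NULL) is a no-op */
    free(copy_b);
    return status;
}
```

The suffix makes the hidden control flow visible at the call site: a reader sees that the macro may jump to `done`, which a plain function call cannot do.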
Micah Snyder	8e04c25fec	Rename clamav memory allocation functions We have some special functions to wrap malloc, calloc, and realloc to make sure we don't allocate more than some limit, similar to the max-filesize and max-scansize limits. Our wrappers are really only needed when allocating memory for scans based on untrusted user input, where a scan file could have bytes that claim you need to allocate some ridiculous amount of memory. Right now they're named: - cli_malloc - cli_calloc - cli_realloc - cli_realloc2 ... and these names do not convey their purpose This commit renames them to: - cli_max_malloc - cli_max_calloc - cli_max_realloc - cli_max_realloc2 The realloc ones also have an additional feature in that they will not free your pointer if you try to realloc to 0 bytes. Freeing the memory is undefined by the C spec, and only done with some realloc implementations, so this stabilizes on the behavior of not doing that, which should prevent accidental double-free's. So for the case where you may want to realloc and do not need to have a maximum, this commit adds the following functions: - cli_safer_realloc - cli_safer_realloc2 These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when MPOOL is disabled (e.g. because mmap-support is not found), so as to match the behavior in the mpool_realloc/2 functions that do not make use of the allocation-limit.	2024-03-15 13:18:47 -04:00
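The two behaviors described in commit 8e04c25fec, an allocation ceiling and a realloc that never frees on a zero-size request, can be sketched as follows. All names here (`EXAMPLE_MAX_ALLOCATION`, `example_max_malloc`, `example_safer_realloc`, `example_max_realloc`) and the 1 GiB ceiling are assumptions for illustration, not libclamav's definitions.

```c
#include <stdlib.h>

/* Assumed ceiling for allocations sized from untrusted input. */
#define EXAMPLE_MAX_ALLOCATION ((size_t)1024 * 1024 * 1024)

/* Hypothetical limited malloc: refuses zero-size and oversized requests. */
static void *example_max_malloc(size_t size)
{
    if (size == 0 || size > EXAMPLE_MAX_ALLOCATION) {
        return NULL;
    }
    return malloc(size);
}

/* Hypothetical "safer" realloc: a zero-size request returns NULL and
 * leaves ptr untouched, so the caller still owns ptr and frees it once. */
static void *example_safer_realloc(void *ptr, size_t size)
{
    if (size == 0) {
        return NULL;
    }
    return realloc(ptr, size);
}

/* Hypothetical limited realloc: same zero-size rule, plus the ceiling. */
static void *example_max_realloc(void *ptr, size_t size)
{
    if (size > EXAMPLE_MAX_ALLOCATION) {
        return NULL;
    }
    return example_safer_realloc(ptr, size);
}
```

In every failure path the original pointer stays valid, so the caller's normal cleanup frees it exactly once regardless of how the platform's `realloc` treats a zero size.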
Micah Snyder	6d6e04ddf8	Optimization: replace limited allocation calls There are a large number of allocations for fixed-size buffers using the `cli_malloc` and `cli_calloc` calls that check if the requested size is larger than our allocation threshold for allocations based on untrusted input. These allocations will always be lower than the threshold, so the extra stack frame and check for these calls is a waste of CPU. This commit replaces needless calls with A -> B: - cli_malloc -> malloc - cli_calloc -> calloc - CLI_MALLOC -> MALLOC - CLI_CALLOC -> CALLOC I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by the max-allocation threshold, when MMAP is found/enabled. But the alternative was set to cli_malloc / cli_calloc when disabled. I changed those as well. I didn't change the cli_realloc/2 calls because our version of realloc not only implements a threshold but also stabilizes the undefined behavior in realloc to protect against accidental double-frees. It may be worth implementing a cli_realloc that doesn't have the threshold built-in, however, so as to allow reallocations for things like buffers for loading signatures, which aren't subject to the same concern as allocations for scanning possible malware. There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC, because it appears to be allocating based on untrusted input.	2024-03-15 13:18:47 -04:00
Micah Snyder	9cb28e51e6	Bump copyright dates for 2024	2024-01-22 11:27:17 -05:00
Micah Snyder	0d3dc86f90	Coverity-514958: Error handling check with getpagesize call `cli_getpagesize()` may return -1 in an error condition. If it does, let's just treat it as 4096. I believe the actual coverity complaint is a false positive, but it's fair to account for the error case and this should shut it up.	2023-08-16 21:08:01 -07:00
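The fallback described in commit 0d3dc86f90 can be sketched as below. This assumes a POSIX system and uses `sysconf(_SC_PAGESIZE)` in place of the internal `cli_getpagesize()`; the helper names are hypothetical, and treating any non-positive value (not only -1) as an error is an added assumption.

```c
#include <unistd.h>

/* Hypothetical helper: map an error result (such as -1) to a 4096-byte
 * default; any positive value is passed through unchanged. */
static long example_page_size_from(long reported)
{
    return (reported > 0) ? reported : 4096;
}

/* Query the page size via POSIX sysconf() and apply the fallback. */
static long example_page_size(void)
{
    return example_page_size_from(sysconf(_SC_PAGESIZE));
}
```

Splitting the fallback from the system query keeps the error path easy to exercise without needing a platform where the query actually fails.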
Micah Snyder	6eebecc303	Bump copyright for 2023	2023-02-12 11:20:22 -08:00
Micah Snyder	fcd8902cb2	HWP3, ASN1, blob: Remove all-match checks	2022-10-19 13:13:57 -07:00
Micah Snyder	cd3134568a	Code quality: Refactor layer attributes as scan parameter The current implementation sets a "next layer attributes" flag field in the scan context. This may introduce bugs if accidentally not cleared during error handling, causing that attribute to be applied to a different layer than intended. This commit resolves that by adding an attribute flag to the major internal scan functions and removing the "next layer attributes" from the scan context. This attributes flag shares the same flag fields as the attributes flag in the new file inspection callback and the flags are defined in `clamav.h`.	2022-10-13 08:57:44 -07:00
mko-x	a21cc6dcd7	Add explicit log level parameter to application logging API * Added loglevel parameter to logg() * Fix logg and mprintf internals with new loglevels * Update all logg calls to set loglevel * Update all mprintf calls to set loglevel * Fix hidden logg calls * Executed clam-format	2022-02-15 15:13:55 -08:00
micasnyd	140c88aa4e	Bump copyright for 2022 Includes minor format corrections.	2022-01-09 14:23:25 -07:00
Micah Snyder	d46832d5cf	clamav.net URL update for new docs, github issues Replace new bugzilla ticket links with links to github issues. Replace clamav.net/documentation links with docs.clamav.net equivalents.	2021-07-17 15:28:02 -07:00
Micah Snyder (micasnyd)	b9ca6ea103	Update copyright dates for 2021 Also fixes up clang-format.	2021-03-19 15:12:26 -07:00
Micah Snyder	e2f59af30a	Clang-format touchup	2020-07-24 16:37:25 -07:00
Andy Ragusa (aragusa)	2049078622	fuzz-22348 null deref in egg utf8 conversion Corrected memory leaks and a null dereference in the egg utf8 conversion.	2020-07-13 19:31:27 -07:00
Micah Snyder	9b9999d778	Rename core scanning functions Many of the core scanning functions' names no longer represent their specific purpose or arguments. This commit aims to make the names more intuitive. Names are now prefixed with "magic" if they involve file-typing and file-type parsing. In addition, each function now includes the type of input being scanned whether its "desc", "fmap", or "buff". Some of the APIs also now specify "type" to indicate that a type other than "ANY" may be passed in to select the type rather than use file type magic for type recognition. \| current name \| new name \| \| ------------------------- \| --------------------------------- \| \| magic_scandesc() \| cli_magic_scan() \| \| cli_magic_scandesc_type() \| <delete> \| \| cli_magic_scandesc() \| cli_magic_scan_desc() \| \| cli_base_scandesc() \| cli_magic_scan_desc_type() \| \| cli_partition_scandesc() \| <delete> \| \| cli_map_scandesc() \| magic_scan_nested_fmap_type() \| \| cli_map_scan() \| cli_magic_scan_nested_fmap_type() \| \| cli_mem_scandesc() \| cli_magic_scan_buff() \| \| cli_scanbuff() \| cli_scan_buff() \| \| cli_scandesc() \| cli_scan_desc() \| \| cli_fmap_scandesc() \| cli_scan_fmap() \| \| cli_scanfile() \| cli_magic_scan_file() \| \| cli_scandir() \| cli_magic_scan_dir() \| \| cli_filetype2() \| cli_determine_fmap_type() \| \| cli_filetype() \| cli_compare_ftm_file() \| \| cli_partitiontype() \| cli_compare_ftm_partition() \| \| cli_scanraw() \| scanraw() \|	2020-06-03 11:00:40 -04:00
Micah Snyder	005cbf5a37	Record names of extracted files A way is needed to record scanned file names for two purposes: 1. File names (and extensions) must be stored in the json metadata properties recorded when using the --gen-json clamscan option. Future work may use this to compare file extensions with detected file types. 2. File names are useful when interpreting tmp directory output when using the --leave-temps option. This commit enables file name retention for later use by storing file names in the fmap header structure, if a file name exists. To store the names in fmaps, an optional name argument has been added to any internal scan APIs that create fmaps and every call to these APIs has been modified to pass a file name or NULL if a file name is not required. The zip and gpt parsers required some modification to record file names. The NSIS and XAR parsers fail to collect file names at all and will require future work to support file name extraction. Also: - Added recursive extraction to the tmp directory when the --leave-temps option is enabled. When not enabled, the tmp directory structure remains flat so as to prevent the likelihood of exceeding MAX_PATH. The current tmp directory is stored in the scan context. - Made the cli_scanfile() internal API non-static and added it to scanners.h so it would be accessible outside of scanners.c in order to remove code duplication within libmspack.c. - Added function comments to scanners.h and matcher.h - Converted TDB-type macros and LSIG-type macros to enums for improved type safety. - Converted more return status variables from `int` to `cl_error_t` for improved type safety, and corrected ooxml file typing functions so they use `cli_file_t` exclusively rather than mixing types with `cl_error_t`. - Restructured the magic_scandesc() function to use goto's for error handling and removed the early_ret_from_magicscan() macro and magic_scandesc_cleanup() function. This makes the code easier to read and made it easier to add the recursive tmp directory cleanup to magic_scandesc(). - Corrected zip, egg, rar filename extraction issues. - Removed use of extra sub-directory layer for zip, egg, and rar file extraction. For Zip, this also involved changing the extracted filenames to be randomly generated rather than using the "zip.###" file name scheme.	2020-06-03 10:39:18 -04:00
Micah Snyder	206dbaefe8	Update copyright dates for 2020	2020-01-03 15:44:07 -05:00
Micah Snyder	cef54eaf8f	Freshclam refresh. This update makes libcurl a hard requirement for ClamAV. New features added to freshclam: - Update signature definitions over HTTPS. - Support for HTTP protocol v1.1 (formerly v1.0). - New libfreshclam library with an all new API and versioning separate from libclamav (v2.0.0). This library is now built and installed alongside libclamav as a hard dependency of freshclam. - The ability to opt-in and opt-out of standard and optional official ClamAV databases (ExtraDatabase, ExcludeDatabase) - The option to specify the protocol and port number of official and private mirror servers. - Support for additional types of proxy servers beyond plain HTTP (SOCKS 4, SOCKS 5). Features removed from freshclam: - Mirror management (mirrors.dat) file. This feature is no longer needed as official signature databases are distributed using a paid content delivery network (Cloudflare). This commit also adds the following features for Windows users: - The clamsubmit tool. - The json-c library dependency, which will enable the --gen-json option in clamscan. - Third party libraries under the win32/3rdparty directory have been removed. Developers will need to build the libraries separately from ClamAV and provide the headers and lib/dll library files the same way they do for OpenSSL. This includes libxml2, pthread-win32, bzip2, zlib, pcre2 as well as new dependencies: curl, json-c. Developers are encouraged to use the build tool Mussels to simplify this task.	2019-10-02 16:08:22 -04:00
Micah Snyder	52cddcbcfd	Updating and cleaning up copyright notices.	2019-10-02 16:08:18 -04:00
Micah Snyder	72fd33c8b2	clang-format'd using new .clang-format rules.	2019-10-02 16:08:16 -04:00
Micah Snyder	d39cb6581f	Updating libclamunrar from legacy C implementation to modern unrar 5.6.5. API changes and supporting changes included to pass the filepath of the scanned file into libclamav through the cli_ctx structure, required by the unrar library to open archives. The filename argument may be optional for the scandesc scanning variant, but libclamav will make a best effort to identify the filename from the file descriptor if it was not provided. In addition, included the ability to prefix temp file and directory names with file basenames.	2018-12-02 23:06:59 -05:00
Micah Snyder	d7979d4ff7	Restructured scan options flags from a single bitflag field to a structure containing multiple bitflag fields. This also required adding a new function to the bytecode API to get scan options a la carte, and modifying the existing function to hand back scan options in the old/deprecated uint32_t bitflag format. Re-generated bytecode iface header files. Updated libclamav documentation detailing new scan options structure. Renamed references to 'algorithmic' detection to 'heuristic' detection. Renamed references to 'properties' to 'collect metadata'. Renamed references to 'scan all' to 'scan all match'. Renamed a couple of 'Hueristic.' signature names as 'Heuristics.' signatures (plural) to match majority of other heuristics.	2018-12-02 23:06:59 -05:00
Micah Snyder	964a1e7321	Converting http urls to https urls. Primary focus was on clamav.net urls. I updated a couple others and fixed a few broken links as well. There are many (non-clamav.net) urls I didn't address, especially in 3rd party or contrib code.	2018-04-02 07:58:33 -04:00
Josh Soref	7cd9337a70	Spelling Adjustments (#30 ) * spelling: accessed * spelling: alignment * spelling: amalgamated * spelling: answers * spelling: another * spelling: acquisition * spelling: apitid * spelling: ascii * spelling: appending * spelling: appropriate * spelling: arbitrary * spelling: architecture * spelling: asynchronous * spelling: attachments * spelling: argument * spelling: authenticode * spelling: because * spelling: boundary * spelling: brackets * spelling: bytecode * spelling: calculation * spelling: cannot * spelling: changes * spelling: check * spelling: children * spelling: codegen * spelling: commands * spelling: container * spelling: concatenated * spelling: conditions * spelling: continuous * spelling: conversions * spelling: corresponding * spelling: corrupted * spelling: coverity * spelling: crafting * spelling: daemon * spelling: definition * spelling: delivered * spelling: delivery * spelling: delimit * spelling: dependencies * spelling: dependency * spelling: detection * spelling: determine * spelling: disconnects * spelling: distributed * spelling: documentation * spelling: downgraded * spelling: downloading * spelling: endianness * spelling: entities * spelling: especially * spelling: empty * spelling: expected * spelling: explicitly * spelling: existent * spelling: finished * spelling: flexibility * spelling: flexible * spelling: freshclam * spelling: functions * spelling: guarantee * spelling: hardened * spelling: headaches * spelling: heighten * spelling: improper * spelling: increment * spelling: indefinitely * spelling: independent * spelling: inaccessible * spelling: infrastructure Conflicts: docs/html/node68.html * spelling: initializing * spelling: inited * spelling: instream * spelling: installed * spelling: initialization * spelling: initialize * spelling: interface * spelling: intrinsics * spelling: interpreter * spelling: introduced * spelling: invalid * spelling: latency * spelling: lawyers * spelling: libclamav * 
spelling: likelihood * spelling: loop * spelling: maximum * spelling: million * spelling: milliseconds * spelling: minimum * spelling: minzhuan * spelling: multipart * spelling: misled * spelling: modifiers * spelling: notifying * spelling: objects * spelling: occurred * spelling: occurs * spelling: occurrences * spelling: optimization * spelling: original * spelling: originated * spelling: output * spelling: overridden * spelling: parenthesis * spelling: partition * spelling: performance * spelling: permission * spelling: phishing * spelling: portions * spelling: positives * spelling: preceded * spelling: properties * spelling: protocol * spelling: protos * spelling: quarantine * spelling: recursive * spelling: referring * spelling: reorder * spelling: reset * spelling: resources * spelling: resume * spelling: retrieval * spelling: rewrite * spelling: sanity * spelling: scheduled * spelling: search * spelling: section * spelling: separator * spelling: separated * spelling: specify * spelling: special * spelling: statement * spelling: streams * spelling: succession * spelling: suggests * spelling: superfluous * spelling: suspicious * spelling: synonym * spelling: temporarily * spelling: testfiles * spelling: transverse * spelling: turkish * spelling: typos * spelling: unable * spelling: unexpected * spelling: unexpectedly * spelling: unfinished * spelling: unfortunately * spelling: uninitialized * spelling: unlocking * spelling: unnecessary * spelling: unpack * spelling: unrecognized * spelling: unsupported * spelling: usable * spelling: wherever * spelling: wishlist * spelling: white * spelling: infrastructure * spelling: directories * spelling: overridden * spelling: permission * spelling: yesterday * spelling: initialization * spelling: intrinsics * space adjustment for spelling changes * minor modifications by klin	2018-02-27 22:00:09 -05:00
Steven Morgan	7a307529d8	bb11580 - make cli_matchmeta() respect allmatch.	2016-06-08 16:25:34 -04:00
Mickey Sola	46a35abe56	mass update of copyright headers	2015-09-17 13:41:26 -04:00
Shawn Webb	cd94be7a52	Silence a bunch of compiler warnings in libclamav	2014-07-10 18:11:49 -04:00
Shawn Webb	60d8d2c352	Move all the crypto API to clamav.h	2014-07-01 19:38:01 -04:00
Shawn Webb	b2e7c931d0	Use OpenSSL for hashing.	2014-02-08 00:31:12 -05:00
Steve Morgan	b81cbc263c	some corrections and refinements identified during 0.97 retrofit	2012-10-25 12:36:05 -07:00
Shawn webb	a2a004df25	BB#3737 - Value too large for specified data type Create compile-time preprocessor defines for switching from calling stat() to stat64(). Add --enable-stat64 switch in configure script.	2012-07-16 15:36:49 -04:00
Tomasz Kojm	53d41b9793	libclamav/blob.c: properly scan files when LeaveTemporaryFiles is enabled (bb#2447)	2010-12-28 13:05:00 +01:00
Tomasz Kojm	bb1e844cc2	fix some warnings	2010-01-27 16:06:12 +01:00
Tomasz Kojm	2ecbd98a5e	cdb: handle mail files	2010-01-15 16:24:16 +01:00
Tomasz Kojm	55094a9c76	libclamav: base code for unified container metadata matcher (bb#1579)	2010-01-07 18:26:12 +01:00
aCaB	58481352d5	win32 paths handling	2009-09-24 19:07:39 +02:00
aCaB	081f64735d	win32#2	2009-09-24 16:24:07 +02:00
aCaB	be4bf7f4ab	win32	2009-09-24 16:08:52 +02:00
aCaB	cb680655f1	unify mail-container scans	2009-08-30 23:57:20 +02:00
aCaB	86d59b249e	fix portability issues for fseeko, sysconf(_SC_PAGESIZE), getpagesize() (bb#1658)	2009-07-16 14:21:25 +02:00
Tomasz Kojm	e06afe8e8e	libclamav: fix handling of signature offsets in cli_scanbuff() (bb#1546) git-svn: trunk@5026	2009-04-06 20:01:09 +00:00
aCaB	f2d79ab352	bb#1456 git-svn: trunk@4925	2009-03-11 18:04:01 +00:00
Tomasz Kojm	0138619577	libclamav/matcher.c: cli_scanbuff: add support for external acdata git-svn: trunk@4781	2009-02-13 12:42:35 +00:00
Tomasz Kojm	33068e0973	libclamav: drop cl_settempdir(); use cl_engine_set() with CL_ENGINE_TMPDIR and CL_ENGINE_KEEPTMP instead git-svn: trunk@4416	2008-11-14 22:23:39 +00:00
Török Edvin	6a21552ef2	have configure define NDEBUG unless we use --enable-debug, instead of having to #ifndef CL_DEBUG #define NDEBUG #endif in each .c file that uses assert. If you want assertions enabled you'll need to use --enable-debug to configure, as until now, no change there. git-svn: trunk@4343	2008-11-06 14:27:18 +00:00
Tomasz Kojm	6670d61d4b	drop support for Cygwin (due to broken ClamAV builds) git-svn: trunk@4143	2008-08-25 21:59:33 +00:00

74 commits