clamav

mirror of https://github.com/Cisco-Talos/clamav.git synced 2025-10-19 18:33:16 +00:00

Author	SHA1	Message	Date
Micah Snyder	6eebecc303	Bump copyright for 2023	2023-02-12 11:20:22 -08:00
Micah Snyder	2f49080847	Excel (XLM, VBA): Remove allmatch checks + some code cleanup And fixed a couple of conditions where critical scan errors may be ignored.	2022-10-19 13:13:57 -07:00
Micah Snyder	cd3134568a	Code quality: Refactor layer attributes as scan parameter The current implementation sets a "next layer attributes" flag field in the scan context. This may introduce bugs if accidentally not cleared during error handling, causing that attribute to be applied to a different layer than intended. This commit resolves that by adding an attribute flag to the major internal scan functions and removing the "next layer attributes" from the scan context. This attributes flag shares the same flag fields as the attributes flag in the new file inspection callback and the flags are defined in `clamav.h`.	2022-10-13 08:57:44 -07:00
Micah Snyder	15eef50656	Code cleanup: Refactor to clean up formatting issues Refactored the clamscan code that determines 'what to scan' in order to clean up some very messy logic and also to get around a difference in how vscode and clang-format handle formatting #ifdef blocks in the middle of an else/if. In addition to refactoring, there is a slight behavior improvement. With this change, doing `clamscan blah -` will now scan `blah` and then also scan `stdin`. You can even do `clamscan - blah` to now scan `stdin` and then scan `blah`. Before, the `-` had to be the only "filename" argument in order to scan from stdin. In addition, added a bunch of extra empty lines or changed multi-line function calls to single-line function calls in order to get around a bug in clang-format with these two options not playing nicely together: - AlignConsecutiveAssignments: true - AlignAfterOpenBracket: true AlignAfterOpenBracket is not taking into account the spaces inserted by AlignConsecutiveAssignments, so you end up with stuff like this: ```c bleeblah = 1; blah = function(arg1, arg2, arg3); // ^--- these args 4-left from where they should be. ``` VSCode, meanwhile, somehow fixes this whitespace issue so code that is correctly formatted by VSCode doesn't have this bug, meaning that: 1. The clang-format check in GH Actions fails. 2. We'd all have to stop using format-on-save in VSCode and accept the bug if we wanted those GH Actions tests to pass. Adding an empty line before variable assignments from multi-line function calls evades the buggy behavior. This commit should resolve the clang-format github action test failures, for now.	2022-03-22 10:42:46 -07:00
mko-x	a21cc6dcd7	Add explicit log level parameter to application logging API * Added loglevel parameter to logg() * Fix logg and mprintf internals with new loglevels * Update all logg calls to set loglevel * Update all mprintf calls to set loglevel * Fix hidden logg calls * Executed clam-format	2022-02-15 15:13:55 -08:00
micasnyd	140c88aa4e	Bump copyright for 2022 Includes minor format corrections.	2022-01-09 14:23:25 -07:00
Micah Snyder (micasnyd)	b9ca6ea103	Update copyright dates for 2021 Also fixes up clang-format.	2021-03-19 15:12:26 -07:00
Micah Snyder (micasnyd)	9e20cdf6ea	Add CMake build tooling This patch adds experimental-quality CMake build tooling. The libmspack build required a modification to use "" instead of <> for header #includes. This will hopefully be included in the libmspack upstream project when adding CMake build tooling to libmspack. Removed use of libltdl when using CMake. Flex & Bison are now required to build. If -DMAINTAINER_MODE, then GPERF is also required, though it currently doesn't actually do anything. TODO! I found that the autotools build system was generating the lexer output but not actually compiling it, instead using previously generated (and manually renamed) lexer c source. As a consequence, changes to the .l and .y files weren't making it into the build. To resolve this, I removed generated flex/bison files and fixed the tooling to use the freshly generated files. Flex and bison are now required build tools. On Windows, this adds a dependency on the winflexbison package, which can be obtained using Chocolatey or may be manually installed. CMake tooling only has partial support for building with external LLVM library, and no support for the internal LLVM (to be removed in the future). I.e., the CMake build currently only supports the bytecode interpreter. Many files used include paths relative to the top source directory or relative to the current project, rather than relative to each build target. Modern CMake support requires including internal dependency headers the same way you would external dependency headers (albeit with "" instead of <>). This meant correcting all header includes to be relative to the build targets and not relative to the workspace. For example, ... ```c #include "../libclamav/clamav.h" #include "clamd/clamd_others.h" ``` ... becomes: ```c // libclamav #include "clamav.h" // clamd #include "clamd_others.h" ``` Fixes header name conflicts by renaming a few of the files. 
Converted the "shared" code into a static library, which depends on libclamav. The ironically named "shared" static library provides features common to the ClamAV apps which are not required in libclamav itself and are not intended for use by downstream projects. This change was required for correct modern CMake practices but was also required to use the automake "subdir-objects" option. This eliminates warnings when running autoreconf which, in the next version of autoconf & automake are likely to break the build. libclamav used to build in multiple stages where an earlier stage is a static library containing utils required by the "shared" code. Linking clamdscan and clamdtop with this libclamav utils static lib allowed these two apps to function without libclamav. While this is nice in theory, the practical gains are minimal and it complicates the build system. As such, the autotools and CMake tooling was simplified for improved maintainability and this feature was thrown out. clamdtop and clamdscan now require libclamav to function. Removed the nopthreads version of the autotools libclamav_internal_utils static library and added pthread linking to a couple apps that may have issues building on some platforms without it, with the intention of removing needless complexity from the source. Kept the regular version of libclamav_internal_utils.la though it is no longer used anywhere but in libclamav. Added an experimental doxygen build option which attempts to build clamav.h and libfreshclam doxygen html docs. The CMake build tooling also may build the example program(s), which isn't a feature in the Autotools build system. Changed C standard to C90+ due to inline linking issues with socket.h when linking libfreshclam.so on Linux. Generate common.rc for win32. Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef from misc.c. Add CMake files to the automake dist, so users can try the new CMake tooling w/out having to build from a git clone. 
clamonacc changes: - Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other similar macros. - Added a new clamav-clamonacc.service systemd unit file, based on the work of ChadDevOps & Aaron Brighton. - Added missing clamonacc man page. Updates to clamdscan man page, add missing options. Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use libclamav). Rename Windows mspack.dll to libmspack.dll so all ClamAV-built libraries have the lib-prefix with Visual Studio as with CMake.	2020-08-13 00:25:34 -07:00
Micah Snyder	07a66adc75	Fix bug added in previous patch, fixup unit tests to use newly added sanitized_basename parameter.	2020-08-11 11:45:06 -07:00
Micah Snyder	860764eb16	Heuristic macro detection for imp VBA extraction Notably the commit adds a heuristic alert when VBA is extracted using the new VBA extraction code and similarly adds "HasMacros":true to the JSON scan properties. In addition, a change was added to the cli_sanitize_filepath() function so it converts posix pathseps to Windows pathseps on Windows and also outputs a sanitized basename pointer (optional) which is used when generating a temporary filename so that using a prefix with pathseps in it won't cause file creation failures (observed with --leave-temps where original filenames are incorporated into temporary filenames). Included some error handling improvements for cli_vba_scandir() to better track alert and macro detections. Downgraded utf8 conversion error messages to debug messages because they are too verbose in files with invalid filenames (observed in some malware). Changed the xlm macro and vba project temp filenames to include "xlm_macros" and "vba_project" prefix, to make it easier to find them. Relocated XLM and VBA temp files from the top-level tmp directory to the current sub_tmpdir, so tempfiles for a given scan are more organized.	2020-08-11 11:45:06 -07:00
Micah Snyder (micasnyd)	b35d1e1bec	fuzz-24354: Fix unknown read in VBA parser Fixes bound checks in recently rewritten VBA parser code (i.e. issue does not affect prior versions). Also improves VBA terminator header parsing to better match the spec, per recommendation by Jonas Zaddach.	2020-07-30 23:05:51 -07:00
Micah Snyder	e2f59af30a	Clang-format touchup	2020-07-24 16:37:25 -07:00
Andy Ragusa (aragusa)	2049078622	fuzz-22348 null deref in egg utf8 conversion Corrected memory leaks and a null dereference in the egg utf8 conversion.	2020-07-13 19:31:27 -07:00
Andrew	035265b96f	Bug fixes related to the recent HFS+/VBA/OLE2/XLM code changes This commit includes bug fixes and minor modifications based on warnings generated by Coverity. These include: - 287096 - In cli_xlm_extract_macros: Leak of memory or pointers to system resources (CWE-404). This was a legitimate leak of a generated temp filename and could occur frequently. - 287095 - In scan_for_xlm_macros: Use of an uninitialized variable. The uninitialized value (state.length) was likely never used uninitialized, but we now initialize it just in case. - 287094 - In cli_vba_readdir_new: Out-of-bounds access to a buffer (CWE-119). This looks like a copy-paste error and was a legitimate read past the bounds of a buffer in an error case. - 284479 - In hfsplus_walk_catalog: All paths that lead to this null pointer comparison already dereference the pointer earlier (CWE-476). In certain cases a NULL pointer could be returned in the success case of hfsplus_scanfile, which was not handled correctly. This case may have been prevented in practice by an earlier check, but adding a check for NULL just in case. - 284478 - In hfsplus_walk_catalog: A value assigned to a variable is never used. ret would be set if zlib's inflateEnd function fails. The fix is to just not set ret in this case, since the error doesn't seem fatal (although it would result in a memory leak by the zlib code...). - 284477 - In hfsplus_check_attribute: Pointer is checked against null but then dereferenced anyway. I just took out the NULL check of record and recordSize, since the code requires these values to not be NULL elsewhere and there's no way an error could occur as currently used (stack var addresses are passed via these parameters). I also fixed up some of the function identifiers in debug print messages.	2020-06-17 16:02:39 -04:00
Micah Snyder	11ef77007b	Improve tmp sub-directory names At present many parsers create tmp subdirectories to store extracted files. For parsers like the vba parser, this is required as the directory is later scanned. For other parsers, these subdirectories are probably not helpful now that we provide recursive sub-dirs when --leave-temps is enabled. It's not quite as simple as removing the extra subdirectories, however. Certain parsers, like autoit, don't create very unique filenames and would result in file name collisions when --leave-temps is not enabled. The best thing to do would be to make sure each parser uses unique filenames and doesn't rely on cli_magic_scan_dir() to scan extracted content before removing the extra subdirectory. In the meantime, this commit gives the extra subdirectories meaningful names to improve readability. This commit also: - Provides the 'bmp' prefix for extracted PE icons. - Removes empty tmp subdirs when extracting rtf files, to eliminate clutter. - The PDF parser sometimes creates tmp files when decompressing streams before it knows if there is actually any content to decompress. This resulted in a large number of empty files. While it would be best to avoid creating empty files in the first place, that's not quite as easy as it sounds. This commit does the next best thing and deletes the tmp files if nothing was actually extracted, even if --leave-temps is enabled. - Removes the "scantemp" prefix for unnamed fmaps scanned with cli_magic_scan(). The 5-character hashes given to tmp files with prefixes resulted in occasional file name collisions when extracting certain file types with thousands of embedded files. - The VBA and TAR parsers mistakenly used NAME_MAX instead of PATH_MAX, resulting in truncated file paths and failed extraction when --leave-temps is enabled and a lot of recursion is in play. This commit switches them from NAME_MAX to PATH_MAX.	2020-06-03 11:00:53 -04:00
Micah Snyder	9b9999d778	Rename core scanning functions Many of the core scanning functions' names no longer represent their specific purpose or arguments. This commit aims to make the names more intuitive. Names are now prefixed with "magic" if they involve file-typing and file-type parsing. In addition, each function now includes the type of input being scanned whether it's "desc", "fmap", or "buff". Some of the APIs also now specify "type" to indicate that a type other than "ANY" may be passed in to select the type rather than use file type magic for type recognition.
| current name | new name |
| ------------------------- | --------------------------------- |
| magic_scandesc() | cli_magic_scan() |
| cli_magic_scandesc_type() | <delete> |
| cli_magic_scandesc() | cli_magic_scan_desc() |
| cli_base_scandesc() | cli_magic_scan_desc_type() |
| cli_partition_scandesc() | <delete> |
| cli_map_scandesc() | magic_scan_nested_fmap_type() |
| cli_map_scan() | cli_magic_scan_nested_fmap_type() |
| cli_mem_scandesc() | cli_magic_scan_buff() |
| cli_scanbuff() | cli_scan_buff() |
| cli_scandesc() | cli_scan_desc() |
| cli_fmap_scandesc() | cli_scan_fmap() |
| cli_scanfile() | cli_magic_scan_file() |
| cli_scandir() | cli_magic_scan_dir() |
| cli_filetype2() | cli_determine_fmap_type() |
| cli_filetype() | cli_compare_ftm_file() |
| cli_partitiontype() | cli_compare_ftm_partition() |
| cli_scanraw() | scanraw() |	2020-06-03 11:00:40 -04:00
Micah Snyder	005cbf5a37	Record names of extracted files A way is needed to record scanned file names for two purposes: 1. File names (and extensions) must be stored in the json metadata properties recorded when using the --gen-json clamscan option. Future work may use this to compare file extensions with detected file types. 2. File names are useful when interpreting tmp directory output when using the --leave-temps option. This commit enables file name retention for later use by storing file names in the fmap header structure, if a file name exists. To store the names in fmaps, an optional name argument has been added to any internal scan APIs that create fmaps and every call to these APIs has been modified to pass a file name or NULL if a file name is not required. The zip and gpt parsers required some modification to record file names. The NSIS and XAR parsers fail to collect file names at all and will require future work to support file name extraction. Also: - Added recursive extraction to the tmp directory when the --leave-temps option is enabled. When not enabled, the tmp directory structure remains flat so as to reduce the likelihood of exceeding MAX_PATH. The current tmp directory is stored in the scan context. - Made the cli_scanfile() internal API non-static and added it to scanners.h so it would be accessible outside of scanners.c in order to remove code duplication within libmspack.c. - Added function comments to scanners.h and matcher.h - Converted TDB-type macros and LSIG-type macros to enums for improved type safety. - Converted more return status variables from `int` to `cl_error_t` for improved type safety, and corrected ooxml file typing functions so they use `cli_file_t` exclusively rather than mixing types with `cl_error_t`. - Restructured the magic_scandesc() function to use goto's for error handling and removed the early_ret_from_magicscan() macro and magic_scandesc_cleanup() function. 
This makes the code easier to read and made it easier to add the recursive tmp directory cleanup to magic_scandesc(). - Corrected zip, egg, rar filename extraction issues. - Removed use of extra sub-directory layer for zip, egg, and rar file extraction. For Zip, this also involved changing the extracted filenames to be randomly generated rather than using the "zip.###" file name scheme.	2020-06-03 10:39:18 -04:00
Micah Snyder (micasnyd)	a97ce0c837	fuzz-21960: Add missing size checks to vba parser Add missing size checks to validate size data parsed from a VBA file. This fixes a possible buffer overflow read that was caught by oss-fuzz before it made it into any release.	2020-05-12 17:28:37 -07:00
Jonas Zaddach (jzaddach)	b7f8440965	Modernize VBA code extraction from Microsoft Office files - Existing VBA extraction code uses undocumented cache structures. This code uses the documented way of accessing VBA projects. - Adds additional detail to the dumped information: Project name, Project doc string, ... All VBA projects are dumped into a single file. - Malware authors are currently evading detection by spreading malicious code over several projects. It is hard to write signatures if only part of the malicious code is visible.	2020-04-28 13:32:07 -07:00
Micah Snyder	206dbaefe8	Update copyright dates for 2020	2020-01-03 15:44:07 -05:00
Mickey Sola	622771bd58	oss-fuzz - 13468 - fix shift of negative value when converting from unicode	2019-10-02 16:08:26 -04:00
Micah Snyder	bbfe42e133	Correcting use of unsigned variable to a signed off_t variable in calculation that was intended to result in a negative number but failed on 32bit platforms without a cast.	2019-10-02 16:08:26 -04:00
Micah Snyder	4524c398f3	Argument and return types for fmap_readn(), cli_writen(), cli_readn() converted to use size_t instead of int.	2019-10-02 16:08:25 -04:00
Micah Snyder	50f178dc63	fuzz - 12166 - Fix for 4-byte out of bounds write wherein an invalid struct pointer member variable is set to zero. The fix adds bounds checking to the Uniq storage 'add' function as well as error code checks. Included a lot of new inline documentation.	2019-10-02 16:08:19 -04:00
Micah Snyder	52cddcbcfd	Updating and cleaning up copyright notices.	2019-10-02 16:08:18 -04:00
Micah Snyder	72fd33c8b2	clang-format'd using new .clang-format rules.	2019-10-02 16:08:16 -04:00
Micah Snyder	d39cb6581f	Updating libclamunrar from legacy C implementation to modern unrar 5.6.5. API changes and supporting changes included to pass the filepath of the scanned file into libclamav through the cli_ctx structure, required by the unrar library to open archives. The filename argument may be optional for the scandesc scanning variant, but libclamav will make a best effort to identify the filename from the file descriptor if it was not provided. In addition, included the ability to prefix temp file and directory names with file basenames.	2018-12-02 23:06:59 -05:00
Mickey Sola	46a35abe56	mass update of copyright headers	2015-09-17 13:41:26 -04:00
Kevin Lin	4de2f5ec2c	bb#11164 - fixed invalid wrap-around read with vba inflation	2014-11-05 16:46:11 -05:00
Kevin Lin	9b38ab7248	bb#11165 - added size check to prevent invalid reads	2014-11-03 13:07:55 -05:00
Shawn Webb	60d8d2c352	Move all the crypto API to clamav.h	2014-07-01 19:38:01 -04:00
Steven Morgan	e182c02ce3	support libjson-c 0.10, 1.11, and 1.12	2014-05-23 09:46:06 -04:00
Kevin Lin	4c37996842	doc/ppt: moved information stream parsing from vba source to ole2 source	2014-04-21 18:30:28 -04:00
Kevin Lin	09dddc5be3	doc/ppt: added SummaryInfo and DocumentSummary streams parsing, JSON or debug	2014-04-21 16:44:26 -04:00
Shawn Webb	b2e7c931d0	Use OpenSSL for hashing.	2014-02-08 00:31:12 -05:00
David Raynor	dc31213450	vba: length grab cleanup	2013-08-07 13:41:14 -04:00
David Raynor	8c66e38605	vba: grab length after middle test	2013-03-22 12:05:59 -04:00
David Raynor	d489ba8066	vba: fix vba_read_project_strings() looping and bad returns	2013-03-13 14:22:04 -04:00
David Raynor	80649b2842	cid #11398 , #11400 , #11401	2013-03-12 13:51:49 -04:00
Shawn Webb	241e7eb147	bb6258 - Add warnings when allocations fail	2013-03-01 13:51:15 -05:00
Shawn Webb	7e40bab956	bb6099 - check return value of lseek()	2013-02-28 21:01:40 -05:00
Shawn webb	a2a004df25	BB#3737 - Value too large for specified data type Create compile-time preprocessor defines for switching from calling stat() to stat64(). Add --enable-stat64 switch in configure script.	2012-07-16 15:36:49 -04:00
David Raynor	bebd86a60b	bb#5343	2012-06-22 16:55:29 -04:00
Tomasz Kojm	d21fb8d975	libclamav/vba_extract.c: fix error path double free (bb#2486)	2011-02-07 17:26:45 +01:00
Tomasz Kojm	fd45238eb6	libclamav: fix some error messages (bb#2083)	2010-07-05 17:31:25 +02:00
aCaB	58481352d5	win32 paths handling	2009-09-24 19:07:39 +02:00
aCaB	081f64735d	win32#2	2009-09-24 16:24:07 +02:00
Török Edvin	f6f2869f8d	avoid size 1 reads for performance reasons (bb #1542 ). git-svn: trunk@5037	2009-04-10 15:20:18 +00:00
Tomasz Kojm	871177cdd9	return codes cleanup (bb#1159) git-svn: trunk@4749	2009-02-12 13:53:23 +00:00
Tomasz Kojm	33068e0973	libclamav: drop cl_settempdir(); use cl_engine_set() with CL_ENGINE_TMPDIR and CL_ENGINE_KEEPTMP instead git-svn: trunk@4416	2008-11-14 22:23:39 +00:00

81 commits