Commit graph

89 commits

Author SHA1 Message Date
Valerie Snyder
4660141186
Auto-format touch-up 2025-08-14 22:39:16 -04:00
Valerie Snyder
13c4788f36
FIPS & FIPS-like limits on hash algs for cryptographic uses
ClamAV will not function when using a FIPS-enabled OpenSSL 3.x.
This is because ClamAV uses MD5 and SHA1 algorithms for a variety of
purposes including matching for malware detection, matching to prevent
false positives on known-clean files, and for verification of MD5-based
RSA digital signatures for determining CVD (signature database archive)
authenticity.

Interestingly, FIPS had been intentionally bypassed when creating hashes
based on whole buffers and whole files (by descriptor or `FILE`-pointer):
78d4a9985a
Note: this bypassed FIPS the 1.x way with:
`EVP_MD_CTX_set_flags(ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW);`

It was NOT disabled when using `cl_hash_init()` / `cl_update_hash()` /
`cl_finish_hash()`. That likely worked by coincidence in that the hash
was already calculated most of the time. It certainly would have made
use of those functions if the hash had not been calculated prior:
78d4a9985a/libclamav/matcher.c (L743)

Regardless, bypassing FIPS entirely is not the correct solution.
The FIPS restrictions against using MD5 and SHA1 are valid, particularly
when verifying CVD digital signatures, but also I think when using a
hash to determine if the file is known-clean (i.e. the "clean cache" and
also MD5-based and SHA1-based FP signatures).

This commit extends the work to bypass FIPS using the newer 3.x method:
`md = EVP_MD_fetch(NULL, alg, "-fips");`

It does this for the legacy `cl_hash*()` functions including
`cl_hash_init()` / `cl_update_hash()` / `cl_finish_hash()`.
It also introduces extended versions that allow the caller to choose if
they want to bypass FIPS:
- `cl_hash_data_ex()`
- `cl_hash_init_ex()`
- `cl_update_hash_ex()`
- `cl_finish_hash_ex()`
- `cl_hash_destroy_ex()`
- `cl_hash_file_fd_ex()`
See the `flags` parameter for each.

Ironically, this commit does NOT use the new functions at this time.
The rationale is that ClamAV may need MD5, SHA1, and SHA-256 hashes of
the same files both for determining if the file is malware, and for
determining if the file is clean.

So instead, this commit will do checks when:

1. Creating a new ClamAV scanning engine. If FIPS-mode enabled, it will
   automatically toggle the "FIPS limits" engine option.
   When loading signatures, if the engine "FIPS limits" option is enabled,
   then MD5 and SHA1 FP signatures will be skipped.

2. Before verifying a CVD (this also applies when loading, and when unpacking
   with verification enabled).
   If "FIPS limits" or FIPS-mode are enabled, then the legacy MD5-based RSA
   method is disabled.

   Note: This commit also refactors the interface for `cl_cvdverify_ex()`
   and `cl_cvdunpack_ex()` so they take a `flags` parameter, rather than a
   single `bool`. As these functions are new in this version, it does not
   break the ABI.
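
As an illustration of the first check, the filtering of FP signatures could look roughly like the sketch below. The enum and function names are hypothetical, chosen for this example only, and are not libclamav's actual identifiers.

```c
#include <stdbool.h>

/* Hypothetical hash-type tags for this sketch; not libclamav's real enum. */
typedef enum {
    SKETCH_HASH_MD5,
    SKETCH_HASH_SHA1,
    SKETCH_HASH_SHA2_256
} sketch_hash_type;

/* Returns true if an FP (false positive) signature of this hash type
 * should be loaded. With "FIPS limits" on, MD5 and SHA1 are skipped. */
bool sketch_should_load_fp_sig(sketch_hash_type type, bool fips_limits)
{
    if (!fips_limits) {
        return true; /* no restriction: load every FP signature */
    }
    return type == SKETCH_HASH_SHA2_256;
}
```

Malware-detection signatures are not covered by this sketch; it models only the FP (known-clean) signatures described in item 1.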

The cache was already switched to use SHA2-256, so that's not a concern
for checking FIPS-mode / FIPS limits options.

This adds an option for `freshclam.conf` and `clamd.conf`:

   FIPSCryptoHashLimits yes

And an equivalent command-line option for `clamscan` and `sigtool`:

   --fips-limits

You may programmatically enable FIPS-limits for a ClamAV engine like this:
```C
   cl_engine_set_num(engine, CL_ENGINE_FIPS_LIMITS, 1);
```

CLAM-2792
2025-08-14 22:39:15 -04:00
Valerie Snyder
aa7b7e9421
Swap clean cache from MD5 to SHA2-256
Change the clean-cache to use SHA2-256 instead of MD5.
Note that all references are changed to specify "SHA2-256" now instead
of "SHA256", for clarity. But there is no plan to add support for SHA3
algorithms at this time.

Significant code cleanup. E.g.:
- Implemented goto-done error handling.
- Used `uint8_t *` instead of `unsigned char *`.
- Used `bool` for boolean checks, rather than `int`.
- Used `#defines` instead of magic numbers.
- Removed duplicate `#defines` for things like hash length.

Add new option to calculate and record additional hash types when the
"generate metadata JSON" feature is enabled:
- libclamav option: `CL_SCAN_GENERAL_STORE_EXTRA_HASHES`
- clamscan option: `--json-store-extra-hashes` (default off)
- clamd.conf option: `JsonStoreExtraHashes` (default 'no')

Renamed the sigtool option `--sha256` to `--sha2-256`.
The original option is still functional, but is deprecated.

For the "generate metadata JSON" feature, the file hash is now stored as
"sha2-256" instead of "FileMD5". If you enable the "extra hashes" option,
then it will also record "md5" and "sha1".

Deprecate and disable the internal "SHA collect" feature.
This option had been hidden behind C #ifdef checks for an option that
wasn't exposed through CMake, so it was basically unavailable anyways.

Changes to calculate file hashes when they're needed and no sooner.

For the FP feature in the matcher module, I have mimicked the
optimization in the FMAP scan routine which makes it so that it can
calculate multiple hashes in a single pass of the file.
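
The single-pass idea can be sketched as follows. This is not the matcher code: two toy digests (32-bit FNV-1a and a plain byte sum) stand in for MD5, SHA1, and SHA2-256 so the example needs no crypto library, and every name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy digests standing in for the real hash algorithms. */
typedef struct {
    uint32_t fnv1a; /* 32-bit FNV-1a */
    uint32_t sum;   /* plain byte sum */
} sketch_digests;

void sketch_digests_init(sketch_digests *d)
{
    d->fnv1a = 2166136261u; /* FNV offset basis */
    d->sum   = 0;
}

/* Feed one chunk to every digest, so the data is read only once. */
void sketch_digests_update(sketch_digests *d, const uint8_t *chunk, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        d->fnv1a ^= chunk[i];
        d->fnv1a *= 16777619u; /* FNV prime */
        d->sum += chunk[i];
    }
}
```

A caller would invoke `sketch_digests_update()` once per chunk read from the file and then read both results, rather than re-reading the file once per algorithm.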

The `HandlerType` feature stores a hash of the file in the scan ctx to
prevent retyping the exact same data more than once.
I removed that hash field and replaced it with an attribute flag that is
applied to the new recursion stack layer when retyping a file.
This also closes a minor bug that would prevent retyping a file with an
all-zero hash. :)

The work upgrading cache.c to support SHA2-256 sized hashes is thanks to:
https://github.com/m-sola

CLAM-255
CLAM-1858
CLAM-1859
CLAM-1860
2025-08-14 21:23:30 -04:00
Val Snyder
272e84eaa8
Auto-format with clang-format 2025-03-26 20:00:14 -04:00
Val Snyder
8d485b9bfd
FIPS-compliant CVD signing and verification
Add X509 certificate chain based signing with PKCS7-PEM external
signatures distributed alongside CVD's in a custom .cvd.sign format.
This new signing and verification mechanism is primarily in support
of FIPS compliance.

Fixes: https://github.com/Cisco-Talos/clamav/issues/564

Add a Rust implementation for parsing, verifying, and unpacking CVD
files.

Now installs a 'certs' directory in the app config directory
(e.g. <prefix>/etc/certs). The install location is configurable.
The CMake option to configure the CVD certs directory is:
  `-D CVD_CERTS_DIRECTORY=PATH`

New options to set an alternative CVD certs directory:
- Commandline for freshclam, clamd, clamscan, and sigtool is:
  `--cvdcertsdir PATH`
- Env variable for freshclam, clamd, clamscan, and sigtool is:
  `CVD_CERTS_DIR`
- Config option for freshclam and clamd is:
  `CVDCertsDirectory PATH`

Sigtool:
- Add sign/verify commands.
- Also verify CDIFF external digital signatures when applying CDIFFs.
- Place commonly used commands at the top of --help string.
- Fix up manpage.

Freshclam:
- Will try to download .sign files to verify CVDs and CDIFFs.
- Fix an issue where making a CLD would only include the CFG file for
daily and not if patching any other database.

libclamav.so:
- Bump version to 13:0:1 (aka 12.1.0).
- Also remove libclamav.map versioning.
  Resolves: https://github.com/Cisco-Talos/clamav/issues/1304
- Add two new API's to the public clamav.h header:
  ```c
  extern cl_error_t cl_cvdverify_ex(const char *file,
                                    const char *certs_directory);

  extern cl_error_t cl_cvdunpack_ex(const char *file,
                                    const char *dir,
                                    bool dont_verify,
                                    const char *certs_directory);
  ```
  The original `cl_cvdverify` and `cl_cvdunpack` are deprecated.
- Add `cl_engine_field` enum option `CL_ENGINE_CVDCERTSDIR`.
  You may set this option with `cl_engine_set_str` and get it
  with `cl_engine_get_str`, to override the compiled in default
  CVD certs directory.

libfreshclam.so: Bump version to 4:0:0 (aka 4.0.0).

Add sigtool sign/verify tests and test certs.

Make it so downloadFile doesn't throw a warning if the server
doesn't have the .sign file.

Replace use of md5-based FP signatures in the unit tests with
sha256-based FP signatures because the md5 implementation used
by Python may be disabled in FIPS mode.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1411

CMake: Add logic to enable the Rust openssl-sys / openssl-rs crates
to build against the same OpenSSL library as is used for the C build.
The Rust unit test application must also link directly with libcrypto
and libssl.

Fix some log messages with missing new lines.

Fix missing environment variable notes in --help messages and manpages.

Deconflict CONFDIR/DATADIR/CERTSDIR variable names that are defined in
clamav-config.h.in for libclamav from variables that had the same name
for use in clamav applications that use the optparser.

The 'clamav-test' certs for the unit tests will live for 10 years.
The 'clamav-beta.crt' public cert will only live for 120 days and will
be replaced before the stable release with a production 'clamav.crt'.
2025-03-26 19:33:25 -04:00
Val Snyder
7ff29b8c37
Bump copyright dates for 2025 2025-02-14 10:24:30 -05:00
Stiliyan Tonev (Bark)
9a7b186aec
fix: Issue with --fail-if-cvd-older-than and non-CVD database files
Clamscan and ClamD will throw an error if you use the
'--fail-if-cvd-older-than=DAYS' / 'FailIfCvdOlderThan' option and
try to load any plaintext signature files.
That is, it throws an error when encountering plain signature files like
`.ign2`, `.ldb`, `.hdb`, etc.
This feature should only verify CVD / CLD files.

The feature (and bug) was introduced in ClamAV 1.1.0, here:
e4fe6654c1

With this change, the `cl_cvdgetage` checks will skip any file that is
not a CVD or CLD.

Fixes: https://github.com/Cisco-Talos/clamav/issues/1174
2024-07-23 16:01:07 -04:00
Micah Snyder
405829ee88 Refine max-allocation and safer-allocation function and macro names
We add the _OR_GOTO_DONE suffix to the macros that go to done if the
allocation fails. This makes it obvious what is different about the
macro versus the equivalent function, and that error handling is
built-in.
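
A minimal sketch of the pattern, assuming a local `status` variable and a `done:` label in the calling function. The macro and function names here are hypothetical and only illustrate the naming idea; they are not the macros from the ClamAV source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical macro: allocate, or set an error status and jump to `done`.
 * It expects a local `status` variable and a `done:` label in the caller. */
#define SKETCH_MALLOC_OR_GOTO_DONE(var, size) \
    do {                                      \
        (var) = malloc(size);                 \
        if ((var) == NULL) {                  \
            status = -1;                      \
            goto done;                        \
        }                                     \
    } while (0)

/* Returns 0 and stores a zero-filled buffer in *out, or returns -1 on failure. */
int sketch_make_buffer(size_t size, unsigned char **out)
{
    int status         = 0;
    unsigned char *buf = NULL;

    *out = NULL;

    SKETCH_MALLOC_OR_GOTO_DONE(buf, size);
    memset(buf, 0, size);

    *out = buf;
    buf  = NULL; /* ownership moved to the caller */

done:
    free(buf); /* free(NULL) is a no-op */
    return status;
}
```

The suffix tells the reader at the call site that control flow may leave the straight-line path, which a plain function call never does.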

Renamed the cli_strdup to safer_strdup to make it obvious that it exists
because it is safer than regular strdup. Regular strdup doesn't have the
NULL check before trying to dup, and so may result in a NULL-deref
crash.
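
The NULL check described above can be sketched like this. The function name is a hypothetical stand-in, and the body is an assumption about the general shape rather than a copy of the libclamav implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a NULL-checking strdup. */
char *sketch_safer_strdup(const char *s)
{
    size_t len;
    char *copy;

    if (s == NULL) {
        return NULL; /* refuse NULL input instead of reading through it */
    }

    len  = strlen(s) + 1; /* include the terminating NUL */
    copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}
```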

Also remove unused STRDUP (_OR_GOTO_DONE) macro, since the one with the
NULL-check is preferred.
2024-03-15 13:18:47 -04:00
Micah Snyder
902623972d Remove max-allocation limits where not required
The cli_max_malloc, cli_max_calloc, and cli_max_realloc functions
provide a way to protect against allocating too much memory
when the size of the allocation is derived from the untrusted input.
Specifically, we worry about values in the file being scanned being
manipulated to exhaust the RAM and crash the application.
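
For context, a limit-checking allocator of this kind can be sketched as below. The limit value and the names are assumptions made for the example; the real limit and implementation live in libclamav.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed limit for this sketch only. */
#define SKETCH_MAX_ALLOCATION ((size_t)1024 * 1024 * 1024)

/* Allocate `size` bytes, refusing zero-sized or over-limit requests.
 * Intended for sizes derived from untrusted input. */
void *sketch_max_malloc(size_t size)
{
    if (size == 0 || size > SKETCH_MAX_ALLOCATION) {
        return NULL;
    }
    return malloc(size);
}
```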

There is no need to check the limits if the size of the allocation
is fixed, or if the size of the allocation is necessary for signature
loading, or the general operation of the applications.
E.g. checking the max-allocation limit for the size of a hash, or
for the size of the scan recursion stack, is a complete waste of
time.

Although we significantly increased the max-allocation limit in
a recent release, it is best not to check an allocation if the
allocation will be safe. It would be a waste of time.

I am also hopeful that if we can reduce the number of allocations
that require a limit-check to those that require it for the safe
scan of a file, then eventually we can store the limit in the scan-
context, and make it configurable.
2024-03-15 13:18:47 -04:00
Micah Snyder
8e04c25fec Rename clamav memory allocation functions
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2

... and these names do not convey their purpose.

This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2

The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double-free's.
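
The zero-size behavior can be sketched as follows. This is a simplified illustration with a hypothetical name and no allocation limit; it is not the code from this commit.

```c
#include <stdlib.h>

/* Hypothetical sketch: never free `ptr` on a zero-size request.
 * Returns NULL on a zero-size request or on allocation failure;
 * in both cases `ptr` is still valid and still owned by the caller. */
void *sketch_safer_realloc(void *ptr, size_t size)
{
    if (size == 0) {
        return NULL;
    }
    return realloc(ptr, size);
}
```

Because the original pointer is never released behind the caller's back, a later `free(ptr)` in the caller's cleanup path cannot turn into a double-free.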

So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2

These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
2024-03-15 13:18:47 -04:00
Micah Snyder
6d6e04ddf8 Optimization: replace limited allocation calls
There are a large number of allocations for fixed-size buffers using the
`cli_malloc` and `cli_calloc` calls that check if the requested size is
larger than our allocation threshold for allocations based on untrusted
input. These allocations will *always* be lower than the threshold, so
the extra stack frame and check for these calls is a waste of CPU.

This commit replaces needless calls with A -> B:
- cli_malloc -> malloc
- cli_calloc -> calloc
- CLI_MALLOC -> MALLOC
- CLI_CALLOC -> CALLOC

I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by
the max-allocation threshold, when MMAP is found/enabled. But the
alternative was set to cli_malloc / cli_calloc when disabled. I changed
those as well.

I didn't change the cli_realloc/2 calls because our version of realloc
not only implements a threshold but also stabilizes the undefined
behavior in realloc to protect against accidental double-free's.
It may be worth implementing a cli_realloc that doesn't have the
threshold built-in, however, so as to allow reallocations for things
like buffers for loading signatures, which aren't subject to the same
concern as allocations for scanning possible malware.

There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC,
because it appears to be allocating based on untrusted input.
2024-03-15 13:18:47 -04:00
Micah Snyder
9cb28e51e6 Bump copyright dates for 2024 2024-01-22 11:27:17 -05:00
Răzvan Cojocaru
e4fe6654c1
Add options: --fail-if-cvd-older-than, FailIfCvdOlderThan
* Add a new function cl_cvdgetage() to the libclamav API. 

This function will retrieve the age of the youngest file in a
database directory, or the age of a single CVD (or CLD) file.

* Add new clamscan option --fail-if-cvd-older-than=days

When passed, causes clamscan to exit with a non-zero return code
if the virus database is older than the specified number of days.

* Add new clamd option --fail-if-cvd-older-than=days

When passed, causes clamd to exit on start-up with a non-zero
return code if the virus database is older than the specified
number of days.

Additionally, we introduce FailIfCvdOlderThan as a clamd.conf
synonym for --fail-if-cvd-older-than.
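
The age comparison behind these options can be sketched as below. The helper name and parameters are hypothetical; how `cl_cvdgetage()` actually obtains the database build time is not shown here.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: true if a database built at `build_time` is more
 * than `max_days` days old when checked at `now`. */
bool sketch_db_too_old(time_t build_time, time_t now, unsigned int max_days)
{
    double age_seconds = difftime(now, build_time);

    return age_seconds > (double)max_days * 86400.0;
}
```

A caller would exit with a non-zero return code when this returns true.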

Fixes #820
2023-03-28 14:22:48 -07:00
Micah Snyder
6eebecc303 Bump copyright for 2023 2023-02-12 11:20:22 -08:00
Micah Snyder
8340e55660 Tests: unit tests for cl_load(), cl_cvdverify(), cl_cvdunpack()
Some basic testing is needed for the new cl_cvdunpack() API, so this
commit adds basic unit tests for that.

For reasons unknown, a number of cl_* API's have stubs for unit tests
that weren't filled out.  The CVD load/verify ones in particular
required access to a signed CVD.  We actually ship a very basic signed
CVD with the databases now, so I added tests for those while I was at it.
2022-10-12 21:46:54 -07:00
Micah Snyder
54b69ec431 Freshclam, Sigtool: use public CVD unpack API
In the interest of using the public API's as much as possible for our
own applications (dog-fooding the API), this commit swaps sigtool and
freshclam `cli_cvdunpack()` calls to `cl_cvdunpack()`.
2022-10-12 21:46:54 -07:00
Micah Snyder
07b08ff0a9 libclamav API: Add cl_cvdunpack() function
Add `cl_cvdunpack()` function to the public API.

This new API has an option to disable verification, but otherwise it
will attempt to verify that the CVD is correctly signed.
2022-10-12 21:46:54 -07:00
micasnyd
140c88aa4e Bump copyright for 2022
Includes minor format corrections.
2022-01-09 14:23:25 -07:00
Micah Snyder
d46832d5cf clamav.net URL update for new docs, github issues
Replace new bugzilla ticket links with links to github issues.
Replace clamav.net/documentation links with docs.clamav.net equivalents.
2021-07-17 15:28:02 -07:00
Micah Snyder (micasnyd)
b9ca6ea103 Update copyright dates for 2021
Also fixes up clang-format.
2021-03-19 15:12:26 -07:00
Micah Snyder
e01ba94e36 bb12506: Fix phishing/heuristic alert verbosity
Some detections, like phishing, are considered heuristic alerts because
they match based on behavior more than on content.  A subset of these
are considered "potentially unwanted" (low-severity).  These
low-severity alerts include:
- phishing
- PDFs with obfuscated object names
- bytecode signature alerts that start with "BC.Heuristics"

The concept is that unless you enable "heuristic precedence" (a method
of lowering the threshold to immediately alert on low-severity
detections), the scan should continue after a match in case a higher
severity match is found.  Only at the end will it print the low-severity
match if nothing else was found.
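
The intended precedence can be sketched like this. The types and names are hypothetical and the scan is reduced to a flat list of match results, which is a simplification of the real recursive scan.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical severity of each match result, in scan order. */
typedef enum { SKETCH_NONE, SKETCH_LOW, SKETCH_HIGH } sketch_severity;

/* Returns the index of the match to report, or -1 if nothing matched. */
int sketch_pick_alert(const sketch_severity *matches, size_t count,
                      bool heuristic_precedence)
{
    int first_low = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (matches[i] == SKETCH_HIGH) {
            return (int)i; /* a high-severity match always stops the scan */
        }
        if (matches[i] == SKETCH_LOW) {
            if (heuristic_precedence) {
                return (int)i; /* alert immediately on low severity */
            }
            if (first_low < 0) {
                first_low = (int)i; /* remember it and keep scanning */
            }
        }
    }
    return first_low; /* reported only if nothing else was found */
}
```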

The current implementation is buggy though. Scanning of archives does
not correctly bail out for the entire archive if one email contains a
phishing link.  Instead, it sets the "heuristic found"  flag then and
alerts for every subsequent file in the archive because it doesn't know
if the heuristic was found in an embedded file or the target file.
Because it's just a heuristic and the status is "clean", it keeps
scanning.

This patch corrects the behavior by checking if low-severity alerts
were found at the end of scanning the target file, instead of at the end
of each embedded file.

Additionally, this patch fixes an issue with phishing alerts wherein
heuristic precedence mode did not cause a scan to stop after the first
alert.

The above changes required restructuring to create an fmap inside of
cl_scandesc_callback() so that scan_common() could be modified to
require an fmap and set up so that the current *ctx->fmap pointer is
never NULL when scan_common() evaluates match results.

Also fixed a couple minor bugs in the phishing unit tests and cleaned up
the test code for improved legibility and type safety.
2020-06-03 17:20:35 -04:00
Micah Snyder
206dbaefe8 Update copyright dates for 2020 2020-01-03 15:44:07 -05:00
Micah Snyder
ee40795fe2 Converted mpool calls to macros when USE_MPOOL is defined to clearly differentiate between function and macro behavior. 2019-10-02 16:08:25 -04:00
Andrew
7ba310e605 PE parsing code improvements, db loading bug fixes
Consolidate the PE parsing code into one function.  I tried to preserve all existing functionality from the previous, distinct implementations to a large extent (with the exceptions mentioned below).  If I noticed potential bugs/improvements, I added a TODO statement about those so that they can be fixed in a smaller commit later.  Also, there are more TODOs in places where I'm not entirely sure why certain actions are performed - more research is needed for these.

I'm submitting a pull request now so that regression testing can be done, and because merging what I have thus far now will likely have fewer conflicts than if I try to merge later.

PE parsing code improvements:
- PEs without all 16 data directories are parsed more appropriately now
- Added lots more debug statements

Also:
 - Allow MAX_BC and MAX_TRACKED_PCRE to be specified via CFLAGS

    When doing performance testing with the latest CVD, MAX_BC and
    MAX_TRACKED_PCRE need to be raised to track all the events.
    Allow these to be specified via CFLAGS by not redefining them
    if they are already defined

- Fix an issue preventing wildcard sizes in .MDB/.MSB rules

    I'm not sure what the original intent of the check I removed was,
    but it prevents using wildcard sizes in .MDB/.MSB rules.  AFAICT
    these wildcard sizes should be handled appropriately by the MD5
    section hash computation code, so I don't think a check on that
    is needed.

- Fix several issues related to db loading
     - .imp files will now get loaded if they exist in a directory passed
       via clamscan's '-d' flag
     - .pwdb files will now get loaded if they exist in a directory passed
       via clamscan's '-d' flag even when compiling without yara support
     - Changes to .imp, .ign, and .ign2 files will now be reflected in calls
       to cl_statinidir and cl_statchkdir (and also .pwdb files, even when
       compiling without yara support)
     - The contents of .sfp files won't be included in some of the signature
       counts, and the contents of .cud files will be
     - Any local.gdb files will no longer be loaded twice

- For .imp files, you are no longer required to specify a minimum flevel for wildcard rules, since this isn't needed
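
The MAX_BC / MAX_TRACKED_PCRE item above relies on the usual guarded-define pattern, sketched below. The default values are placeholders for this example and the helper functions are hypothetical.

```c
/* Only supply defaults when the build did not already define the limits,
 * e.g. via CFLAGS="-DMAX_BC=128 -DMAX_TRACKED_PCRE=128". */
#ifndef MAX_BC
#define MAX_BC 64 /* placeholder default for this sketch */
#endif

#ifndef MAX_TRACKED_PCRE
#define MAX_TRACKED_PCRE 64 /* placeholder default for this sketch */
#endif

/* Hypothetical helpers that report the limits chosen at compile time. */
unsigned int sketch_max_bc(void)
{
    return MAX_BC;
}

unsigned int sketch_max_tracked_pcre(void)
{
    return MAX_TRACKED_PCRE;
}
```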
2019-10-02 16:08:20 -04:00
Micah Snyder
52cddcbcfd Updating and cleaning up copyright notices. 2019-10-02 16:08:18 -04:00
Micah Snyder
72fd33c8b2 clang-format'd using new .clang-format rules. 2019-10-02 16:08:16 -04:00
Micah Snyder
964a1e7321 Converting http urls to https urls. Primary focus was on clamav.net urls. I updated a couple others and fixed a few broken links as well. There are many (non-clamav.net) urls I didn't address, especially in 3rd party or contrib code. 2018-04-02 07:58:33 -04:00
Micah Snyder
6289eda8e0 Eliminating AUTHORS file, and moving acknowledgements for various source code contributions to the file comment blocks for the individual files, as appropriate. 2018-03-06 17:44:05 -05:00
Micah Snyder
7b1f1aaf9a fixed minor warnings regarding type conversions. 2017-08-08 17:38:17 -04:00
Kevin Lin
52da917589 bb#11421 - CUD digital signature verification and empty files 2015-10-29 17:48:46 -04:00
Mickey Sola
46a35abe56 mass update of copyright headers 2015-09-17 13:41:26 -04:00
Steven Morgan
dfbb1604fd bb#11195 - change html links in code to match the current clamav.net website. 2014-11-20 15:11:15 -05:00
Joel Esler
00fb0d9118 Fixed broken links.
Across the whole of the product.
2014-09-02 11:29:35 -04:00
Shawn Webb
cd94be7a52 Silence a bunch of compiler warnings in libclamav 2014-07-10 18:11:49 -04:00
Shawn Webb
60d8d2c352 Move all the crypto API to clamav.h 2014-07-01 19:38:01 -04:00
Shawn Webb
da6e06dd68 Provide further abstractions to the OpenSSL integration work 2014-02-28 12:12:30 -05:00
Shawn Webb
f077c6174f Fix some race conditions. Fix some memory leaks. 2014-02-13 13:05:50 -05:00
Shawn Webb
a1cbd793f3 Fix all memory leaks introduced by OpenSSL backport. 2014-02-12 17:42:48 -05:00
Shawn Webb
7fb5036fb2 Make Valgrind happy. Rely less on EVP_MD_CTX_create. 2014-02-08 01:42:41 -05:00
Shawn Webb
b2e7c931d0 Use OpenSSL for hashing. 2014-02-08 00:31:12 -05:00
Shawn Webb
7c92a662d9 Add functions to set the stats callbacks. Don't submit stats when verifying CVDs. 2014-01-28 10:39:05 -05:00
David Raynor
a6c9b78168 libclamav: cli_cvdverify() patch 2013-04-10 15:28:20 -04:00
David Raynor
0944b8eeb9 cid #10234 2013-02-12 17:33:21 -05:00
David Raynor
003a784077 cid #11136 2013-02-12 16:51:10 -05:00
David Raynor
e828f534df Log messages for malformed DB cases 2012-08-10 10:17:22 -04:00
Ryan Pentney
23b89d6b34 Bug 5554, double-close bug 2012-07-27 12:17:14 -07:00
Shawn webb
a2a004df25 BB#3737 - Value too large for specified data type
Create compile-time preprocessor defines for switching from calling
stat() to stat64(). Add --enable-stat64 switch in configure script.
2012-07-16 15:36:49 -04:00
David Raynor
bebd86a60b bb#5343 2012-06-22 16:55:29 -04:00
Török Edvin
3afedd0761 fix GCC warnings.
especially the one about gzFile vs gzFile*, gzopen returns gzFile!
2012-05-30 13:37:32 +03:00
Tomasz Kojm
cdddd014ff sigtool: add support for building unsigned dbs (--unsigned)
libclamav: handle unsigned db files (.cud)
2011-05-10 21:29:49 +02:00