Commit graph

425 commits

Author SHA1 Message Date
Val S.
d4114e0d2c
Fix static analysis code quality issues; Fix old libjson-c support (#1574)
`clamscan/manager.c`: Fix double-free in an error condition in `scanfile()`.

`common/optparser.c`: Fix uninitialized use of the `numarg` variable when
`arg` is `NULL`.

`libclamav/cache.c`: Don't check if `ctx-fmap` is `NULL` when we've
already dereferenced it.

`libclamav/crypto.c`: The `win_exception` variable and associated logic
is Windows-specific and so needs preprocessor platform checks. Otherwise
it generates unused variable warnings.

`libclamav/crypto.c`: Check for `size_t` overflow of the `byte_read`
variable in the `cl_hash_file_fd_ex()` function.

`libclamav/crypto.c`: Fix a memory leak in the `cl_hash_file_fd_ex()`
function.

`libclamav/fmap.c`: Correctly the `name` and `path` pointer if
`fmap_duplicate()` fails. Also need to clear those variables when
duplicating the parent `map` so that on error it does not free the wrong
`name` or `path`.

`libclamav/fmap.c`: Refine error handling for `hash_string` cleanup in
`cl_fmap_get_hash()`. Coverity's complaint was that `hash_string` could
never be non-NULL if `status` is not `CL_SUCCESS`. I.e., the cleanup is
dead code. I don't think my cleanup actually "fixes" that though it is
definitely a better way to do the error handling.
The `if (NULL != hash_string) {` check is still technically dead code.
It safeguards against future changes that may `goto done` between the
allocation and transfering ownership from `hash_string` to `hash_out`.

`libclamav/others.c`: Fix possible memory leak in `cli_recursion_stack_push()`.

`libclamav/others.c`: Refactor an if/else + switch statement inside
`cli_dispatch_scan_callback()` so that the `CL_SCAN_CALLBACK_ALERT` case
is not dead-code. It's also easier to read now.

`libclamav/pdfdecode.c`: For logging, use the `%zu` to format `size_t`
instead of casting to `long long` and using `%llu`. Simiularly use the
`STDu32` format string macro for `uint32_t`.

`libclamav/pdfdecode.c`: Fix a possible double-free for the `decoded`
pointer in `filter_lzwdecode()`.

`libclamav/pdfdecode.c`: Remove the `if (capacity > UINT_MAX) {`
overflow check inside `filter_lzwdecode()`, which didn't do anything.
The `capacity` variable this point is a fixed value and so I also changed
the `avail_out` to be that fixed `INFLATE_CHUNK_SIZE` value rather than
using `capacity`. It is more straightforward and replicates how similar
logic works later in the file.
I also removed the copy-pasted `(Bytef *)` cast which didn't reaaally do
anything, and was a copypaste from a different algorihm. The lzw
implementation interface doesn't use `Bytef`.

`libclamav/readdb.c`: Fix a possible NULL-deref on the `matcher` variable
in the error handling/cleanup code if the function fails.

`libclamav/scanners.c`: Fix an issue where the return value from some of
the parsers may be lost/overridden by the call to
`cli_dispatch_scan_callback()` just after the `done:` label in
`cli_magic_scan()`.

`libclamav/scanners.c`: Silence an unused-return value warning when
calling `cli_basename()`.

`sigtool/sigtool.c` and `unit_tests/check_regex.c`:
Fix possible NULL-derefs of the `ctx.recursion_stack` pointer in the error
handling for several functions.

Also, and this isn't a Coverity thing:

`libclamav/json_api.c` and `libclamav/others.c`:
Fix support for libjson-c version 0.13 and older.
I don't think we *should* be using the old version, but some environments
such as the current OSS-Fuzz base image are older and still use it.
The issue is that `json_object_new_uint64()` was introduced in a later
libjson-c version, so we have to fallback to use `json_object_new_int64()`
with older libjson-c, provided the int were storing isn't too big.

CLAM-2768
2025-09-26 18:26:00 -04:00
Valerie Snyder
6d9b57eeeb
libclamav: cl_scan*_ex() functions provide verdict separate from errors
It is a shortcoming of existing scan APIs that it is not possible
to return an error without masking a verdict.
We presently work around this limitation by counting up detections at
the end and then overriding the error code with `CL_VIRUS`, if necessary.

The `cl_scanfile_ex()`, `cl_scandesc_ex()`, and `cl_scanmap_ex()` functions
should provide the scan verdict separately from the error code.

This introduces a new enum for recording and reporting a verdict:
`cl_verdict_t` with options:

- `CL_VERDICT_NOTHING_FOUND`
- `CL_VERDICT_TRUSTED`
- `CL_VERDICT_STRONG_INDICATOR`
- `CL_VERDICT_POTENTIALLY_UNWANTED`

Notably, the newer scan APIs may set the verdict to `CL_VERDICT_TRUSTED`
if there is a (hash-based) FP signature for a file, or in the cause where
Authenticode or similar certificate-based verification was performed, or
in the case where an application scan callback returned `CL_VERIFIED`.

CLAM-763
CLAM-865
2025-08-14 22:40:46 -04:00
Valerie Snyder
cf11815ae3
ClamsScan: add missing --json-store-extra-hashes option to help and manpage 2025-08-14 22:40:44 -04:00
Valerie Snyder
13c4788f36
FIPS & FIPS-like limits on hash algs for cryptographic uses
ClamAV will not function when using a FIPS-enabled OpenSSL 3.x.
This is because ClamAV uses MD5 and SHA1 algorithms for a variety of
purposes including matching for malware detection, matching to prevent
false positives on known-clean files, and for verification of MD5-based
RSA digital signatures for determining CVD (signature database archive)
authenticity.

Interestingly, FIPS had been intentionally bypassed when creating hashes
based whole buffers and whole files (by descriptor or `FILE`-pointer):
78d4a9985a
Note: this bypassed FIPS the 1.x way with:
`EVP_MD_CTX_set_flags(ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW);`

It was NOT disabled when using `cl_hash_init()` / `cl_update_hash()` /
`cl_finish_hash()`. That likely worked by coincidence in that the hash
was already calculated most of the time. It certainly would have made
use of those functions if the hash had not been calculated prior:
78d4a9985a/libclamav/matcher.c (L743)

Regardless, bypassing FIPS entirely is not the correct solution.
The FIPS restrictions against using MD5 and SHA1 are valid, particularly
when verifying CVD digital siganatures, but also I think when using a
hash to determine if the file is known-clean (i.e. the "clean cache" and
also MD5-based and SHA1-based FP signatures).

This commit extends the work to bypass FIPS using the newer 3.x method:
`md = EVP_MD_fetch(NULL, alg, "-fips");`

It does this for the legacy `cl_hash*()` functions including
`cl_hash_init()` / `cl_update_hash()` / `cl_finish_hash()`.
It also introduces extended versions that allow the caller to choose if
they want to bypass FIPS:
- `cl_hash_data_ex()`
- `cl_hash_init_ex()`
- `cl_update_hash_ex()`
- `cl_finish_hash_ex()`
- `cl_hash_destroy_ex()`
- `cl_hash_file_fd_ex()`
See the `flags` parameter for each.

Ironically, this commit does NOT use the new functions at this time.
The rational is that ClamAV may need MD5, SHA1, and SHA-256 hashes of
the same files both for determining if the file is malware, and for
determining if the file is clean.

So instead, this commit will do a checks when:

1. Creating a new ClamAV scanning engine. If FIPS-mode enabled, it will
   automatically toggle the "FIPS limits" engine option.
   When loading signatures, if the engine "FIPS limits" option is enabled,
   then MD5 and SHA1 FP signatures will be skipped.

2. Before verifying a CVD (e.g. also for loading, unpacking when
   verification enabled).
   If "FIPS limits" or FIPS-mode are enabled, then the legacy MD5-based RSA
   method is disabled.

   Note: This commit also refactors the interface for `cl_cvdverify_ex()`
   and `cl_cvdunpack_ex()` so they take a `flags` parameters, rather than a
   single `bool`. As these functions are new in this version, it does not
   break the ABI.

The cache was already switched to use SHA2-256, so that's not a concern
for checking FIPS-mode / FIPS limits options.

This adds an option for `freshclam.conf` and `clamd.conf`:

   FIPSCryptoHashLimits yes

And an equivalent command-line option for `clamscan` and `sigtool`:

   --fips-limits

You may programmatically enable FIPS-limits for a ClamAV engine like this:
```C
   cl_engine_set_num(engine, CL_ENGINE_FIPS_LIMITS, 1);
```

CLAM-2792
2025-08-14 22:39:15 -04:00
Valerie Snyder
51adfb8b61
ClamScan & libclamav: improve precision of bytes-scanned, bytes-read
The ClamScan scan summary prints bytes scanned and bytes read in
multiples of 4096 (aka `CL_COUNT_PRECISION`), as is provided by the
`cl_scanfile()`, `cl_scandesc()`, `cl_scanfile_callback()`, and
`cl_scandesc_callback()` functions.

I believe this imprecision was the result of using an `unsigned long int`
which may be 64bit or 32bit, depending on platform. I believe the
intention was to be able to support scanning more than 4 GiB of data.

Since the new `cl_scan*_ex()` functions use a `uint64_t`, which
guarantees a 64bit integer and supports ~16,777,216 terabytes, I find no
reason not to report an accurate count.

For the legacy scan functions (above) I've kept the `CL_COUNT_PRECISION`
behavior to maintain backwards compatibility.

I have also improved the bytes scanned/read output to report GiB, MiB,
KiB, or B as appropriate. Previously, it always report "MB".

CLAM-1433
2025-08-14 22:39:15 -04:00
Valerie Snyder
466c875d69
ClamScan: Add hash & file-type in/out CLI options
Adds the following ClamScan CLI options:

* --hash-hint

  The file hash so that libclamav does not need to calculate it.
  The type of hash must match the '--hash-alg'.

* --log-hash

  Print the file hash after each file scanned.
  The type of hash printed will match the '--hash-alg'.

* --hash-alg

  The hashing algorithm used for either '--hash-hint' or '--log-hash'.
  Supported algorithms are 'md5', 'sha1', 'sha2-256'.
  If not specified, the default is 'sha2-256'.

* --file-type-hint

  The file type hint so that libclamav can optimize scanning.
  E.g. 'pe', 'elf', 'zip', etc.
  You may also use ClamAV type names such as 'CL_TYPE_PE'.
  ClamAV will ignore the hint if it is not familiar with the specified type.
  See also: https://docs.clamav.net/appendix/FileTypes.html#file-types

* --log-file-type

  Print the file type after each file scanned.

Will NOT be adding this for ClamDScan, as we don't have a mechanism
in the ClamD socket API to receive scan options or a way for ClamD
to include scan metadata in the response.
2025-08-14 22:39:12 -04:00
Valerie Snyder
31dcec1e42
libclamav: Add engine option to toggle temp directory recursion
Temp directory recursion in ClamAV is when each layer of a scan gets its
own temp directory in the parent layer's temp directory.

In addition to temp directory recursion, ClamAV has been creating a new
subdirectory for each file scan as a risk-adverse method to ensure
no temporary file leaks fill up the disk.
Creating a directory is relatively slow on Windows in particular if
scanning a lot of very small files.

This commit:

1. Separates the temp directory recursion feature from the leave-temps
   feature so that libclamav can leave temp files without making
   subdirectories for each file scanned.

2. Makes it so that when temp directory recursion is off, libclamav
   will just use the configure temp directory for all files.

The new option to enable temp directory recursion is for libclamav-only
at this time. It is off by default, and you can enable it like this:

```c
cl_engine_set_num(engine, CL_ENGINE_TMPDIR_RECURSION, 1);
```

For the `clamscan` and `clamd` programs, temp directory recursion will
be enabled when `--leave-temps` / `LeaveTemporaryFiles` is enabled.

The difference is that when disabled, it will return to using the
configured temp directory without making a subdirectory for each file
scanned, so as to improve scan performance for small files, mostly on
Windows.

Under the hood, this commit also:

1. Cleans up how we keep track of tmpdirs for each layer.
   The goal here is to align how we keep track of layer-specific stuff
   using the scan_layer structure.

2. Cleans up how we record metadata JSON for embedded files.
   Note: Embedded files being different from Contained files, as they
         are extracted not with a parser, but by finding them with
         file type magic signatures.

CLAM-1583
2025-08-14 22:38:58 -04:00
Valerie Snyder
aa7b7e9421
Swap clean cache from MD5 to SHA2-256
Change the clean-cache to use SHA2-256 instead of MD5.
Note that all references are changed to specify "SHA2-256" now instead
of "SHA256", for clarity. But there is no plan to add support for SHA3
algorithms at this time.

Significant code cleanup. E.g.:
- Implemented goto-done error handling.
- Used `uint8_t *` instead of `unsigned char *`.
- Use `bool` for boolean checks, rather than `int.
- Used `#defines` instead of magic numbers.
- Removed duplicate `#defines` for things like hash length.

Add new option to calculate and record additional hash types when the
"generate metadata JSON" feature is enabled:
- libclamav option: `CL_SCAN_GENERAL_STORE_EXTRA_HASHES`
- clamscan option: `--json-store-extra-hashes` (default off)
- clamd.conf option: `JsonStoreExtraHashes` (default 'no')

Renamed the sigtool option `--sha256` to `--sha2-256`.
The original option is still functional, but is deprecated.

For the "generate metadata JSON" feature, the file hash is now stored as
"sha2-256" instead of "FileMD5". If you enable the "extra hashes" option,
then it will also record "md5" and "sha1".

Deprecate and disable the internal "SHA collect" feature.
This option had been hidden behind C #ifdef checks for an option that
wasn't exposed through CMake, so it was basically unavailable anyways.

Changes to calculate file hashes when they're needed and no sooner.

For the FP feature in the matcher module, I have mimiced the
optimization in the FMAP scan routine which makes it so that it can
calculate multiple hashes in a single pass of the file.

The `HandlerType` feature stores a hash of the file in the scan ctx to
prevent retyping the exact same data more than once.
I removed that hash field and replaced it with an attribute flag that is
applied to the new recursion stack layer when retyping a file.
This also closes a minor bug that would prevent retyping a file with an
all-zero hash. :)

The work upgrading cache.c to support SHA2-256 sized hashes thanks to:
https://github.com/m-sola

CLAM-255
CLAM-1858
CLAM-1859
CLAM-1860
2025-08-14 21:23:30 -04:00
ember91
89bacba696
Windows: Fix issue printing unicode filenames (#1461)
On Windows, use CP_UTF8 over CP_OEMCP for output.
2025-07-25 17:42:43 -04:00
Valerie Snyder
9d4fcbdb1e
clamd: Add missing scan option checks for PDF and HTML URIs 2025-06-04 14:06:52 -04:00
Val S.
e86919789f
Merge pull request #1482 from jhumlick/CLAM-2588-PDF-Urls
Clam 2588 Record PDF URIs if generating scan metadata
2025-05-31 18:57:00 -04:00
John Humlick
e1e3d4c64d
libclamav: Add URI scanning support to PDF parser
Threat Research requests scanning URIs in PDF files and adding them to
the json report file.

This change adds URI scanning support to the PDF parser, including
support for object references to URIs in PDF files.

Jira: CLAM-2588

Fix out-of-order references and other minor improvements.

CLAM-2588, CLAM-2757
2025-05-30 12:41:17 -07:00
Val Snyder
8d485b9bfd
FIPS-compliant CVD signing and verification
Add X509 certificate chain based signing with PKCS7-PEM external
signatures distributed alongside CVD's in a custom .cvd.sign format.
This new signing and verification mechanism is primarily in support
of FIPS compliance.

Fixes: https://github.com/Cisco-Talos/clamav/issues/564

Add a Rust implementation for parsing, verifying, and unpacking CVD
files.

Now installs a 'certs' directory in the app config directory
(e.g. <prefix>/etc/certs). The install location is configurable.
The CMake option to configure the CVD certs directory is:
  `-D CVD_CERTS_DIRECTORY=PATH`

New options to set an alternative CVD certs directory:
- Commandline for freshclam, clamd, clamscan, and sigtool is:
  `--cvdcertsdir PATH`
- Env variable for freshclam, clamd, clamscan, and sigtool is:
  `CVD_CERTS_DIR`
- Config option for freshclam and clamd is:
  `CVDCertsDirectory PATH`

Sigtool:
- Add sign/verify commands.
- Also verify CDIFF external digital signatures when applying CDIFFs.
- Place commonly used commands at the top of --help string.
- Fix up manpage.

Freshclam:
- Will try to download .sign files to verify CVDs and CDIFFs.
- Fix an issue where making a CLD would only include the CFG file for
daily and not if patching any other database.

libclamav.so:
- Bump version to 13:0:1 (aka 12.1.0).
- Also remove libclamav.map versioning.
  Resolves: https://github.com/Cisco-Talos/clamav/issues/1304
- Add two new API's to the public clamav.h header:
  ```c
  extern cl_error_t cl_cvdverify_ex(const char *file,
                                    const char *certs_directory);

  extern cl_error_t cl_cvdunpack_ex(const char *file,
                                    const char *dir,
                                    bool dont_verify,
                                    const char *certs_directory);
  ```
  The original `cl_cvdverify` and `cl_cvdunpack` are deprecated.
- Add `cl_engine_field` enum option `CL_ENGINE_CVDCERTSDIR`.
  You may set this option with `cl_engine_set_str` and get it
  with `cl_engine_get_str`, to override the compiled in default
  CVD certs directory.

libfreshclam.so: Bump version to 4:0:0 (aka 4.0.0).

Add sigtool sign/verify tests and test certs.

Make it so downloadFile doesn't throw a warning if the server
doesn't have the .sign file.

Replace use of md5-based FP signatures in the unit tests with
sha256-based FP signatures because the md5 implementation used
by Python may be disabled in FIPS mode.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1411

CMake: Add logic to enable the Rust openssl-sys / openssl-rs crates
to build against the same OpenSSL library as is used for the C build.
The Rust unit test application must also link directly with libcrypto
and libssl.

Fix some log messages with missing new lines.

Fix missing environment variable notes in --help messages and manpages.

Deconflict CONFDIR/DATADIR/CERTSDIR variable names that are defined in
clamav-config.h.in for libclamav from variable that had the same name
for use in clamav applications that use the optparser.

The 'clamav-test' certs for the unit tests will live for 10 years.
The 'clamav-beta.crt' public cert will only live for 120 days and will
be replaced before the stable release with a production 'clamav.crt'.
2025-03-26 19:33:25 -04:00
Val Snyder
7ff29b8c37
Bump copyright dates for 2025 2025-02-14 10:24:30 -05:00
Andy Ragusa
666e047f2b
Store URLs from HTML when recording scan metadata json
Store URLs found in HTML `<a>` and `<form>` tags during scan of HTML files
when recording scan metadata.

HTML URL recording will be ON by default, but is a part of the
generate-metadata-json feature.
The generate-metadata-json feature is OFF by default.

This introduces a new general scan option:
- libclamav: `CL_SCAN_GENERAL_STORE_HTML_URLS`.
- ClamD: `JsonStoreHTMLUrls`.
- ClamScan: `--json-store-html-urls`

Thank you Matt Jolly for the helpful comment on the pull request.
2024-09-11 13:40:29 -04:00
Micah Snyder
e7cb0ff6f1
Clang-format touchup 2024-09-09 12:46:33 -04:00
Andy Ragusa
29987c0eeb
Limit the max-recursion scan option to 100
There is presently no limit for the max-recursion scan option.
Selecting a max-recursion limit that is too high will cause confusing
errors. E.g.:

/home/aragusa/install.alz/bin/clamscan -d clamav.hdb . --max-recursion=9999999999

LibClamAV Error: fmap_fd: Attempted to get fd for NULL fmap
/home/aragusa/issue/clamav.hdb: Can't allocate memory ERROR
LibClamAV Error: fmap_fd: Attempted to get fd for NULL fmap
/home/aragusa/issue/test.sh: Can't allocate memory ERROR

This commit prevents setting the max-recursion limit higher than 100.
2024-09-09 12:32:29 -04:00
Stiliyan Tonev (Bark)
9a7b186aec
fix: Issue with --fail-if-cvd-older-than and non-CVD database files
Clamscan and ClamD will throw an error if you use the
'--fail-if-cvd-older-than=DAYS' / 'FailIfCvdOlderThan' option and
try to load any plaintext signature files.
That is, it throws an error when encountering plain signature files like
`.ign2`, `.ldb`, `.hdb`, etc.
This feature should only verify CVD / CLD files.

The feature (and bug) was introduced in ClamAV 1.1.0, here:
e4fe6654c1

With this change, the `cl_cvdgetage` checks will skip any file that is
not a CVD or CLD.

Fixes: https://github.com/Cisco-Talos/clamav/issues/1174
2024-07-23 16:01:07 -04:00
Micah Snyder
47dfe9bd5d Remove libjson-c dead code
As of ClamAV 0.105, libjson-c is required.
There is also no option to disable libjson-c support.

This commit removes the dead code associated with the old build
option.
2024-04-13 12:34:15 -04:00
Micah Snyder
a729aafc38 Remove PCRE dead code
As of ClamAV 0.105, PCRE2 is required. PCRE (1) is not an option, and
there is also no option to disable PCRE support.

This commit removes the dead code associated with those old build
options.
2024-04-13 12:34:15 -04:00
Micah Snyder
48393b67b8 clamscan: Add options missing from manpage and --help
The --force-to-disk option is missing from the clamscan --help and
clamscan manpage documentation.

Also change clamd.conf.sample suggestions to differ the from default
settings so that the sample is easier to use.
2024-03-14 16:57:48 -04:00
Micah Snyder
2cc47c83ac Make image fuzzy hashing optional
Image fuzzy hashing is enabled by default. The following options have
been added to allow users to disable it, if desired.

New clamscan options:

  --scan-image[=yes(*)/no]

  --scan-image-fuzzy-hash[=yes(*)/no]

New clamd config options:

  ScanImage yes(*)/no

  ScanImageFuzzyHash yes(*)/no

New libclamav scan options:

  options.parse &= ~CL_SCAN_PARSE_IMAGE;

  options.parse &= ~CL_SCAN_PARSE_IMAGE_FUZZY_HASH;

This commit also changes scan behavior to disable image fuzzy hashing
for specific types when the DCONF (.cfg) signatures disable those types.
That is, if DCONF disables the PNG parser, it should not only disable
the CVE/format checker for PNG files, but also disable image fuzzy
hashing for PNG files.

Also adds a DCONF option to disable image fuzzy hashing:
  OTHER_CONF_IMAGE_FUZZY_HASH

DCONF allows scanning features to be disabled using a configuration
"signature".
2024-03-14 16:57:48 -04:00
Micah Snyder
9cb28e51e6 Bump copyright dates for 2024 2024-01-22 11:27:17 -05:00
Micah Snyder
3b2f8c044a Support for extracting attachments from OneNote section files
Includes rudimentary support for getting slices from FMap's and for
interacting with libclamav's context structure.

For now will use a Cisco-Talos org fork of the onenote_parser
until the feature to read open a onenote section from a slice (instead
of from a filepath) is added to the upstream.
2023-12-11 15:18:41 -05:00
RainRat
caf324e544
Fix typos (no functional changes) 2023-11-26 18:01:19 -05:00
Craig Andrews
e70493cf61 Add options: --cache-size, CacheSize
* Add new clamd and clamscan option --cache-size

This option allows you to set the number of entries the cache can store.

Additionally, introduce CacheSize as a clamd.conf
synonym for --cache-size.

Fixes #867
2023-05-16 19:18:30 -07:00
Răzvan Cojocaru
e4fe6654c1
Add options: --fail-if-cvd-older-than, FailIfCvdOlderThan
* Add a new function cl_cvdgetage() to the libclamav API. 

This function will retrieve the age of the youngest file in a
database directory, or the age of a single CVD (or CLD) file.

* Add new clamscan option --fail-if-cvd-older-than=days

When passed, causes clamscan to exit with a non-zero return code
if the virus database is older than the specified number of days.

* Add new clamd option --fail-if-cvd-older-than=days

When passed, causes clamd to exit on start-up with a non-zero
return code if the virus database is older than the specified
number of days.

Additionally, we introduce FailIfCvdOlderThan as a clamd.conf
synonym for --fail-if-cvd-older-than.

Fixes #820
2023-03-28 14:22:48 -07:00
Micah Snyder
6eebecc303 Bump copyright for 2023 2023-02-12 11:20:22 -08:00
Micah Snyder
4a0382cf7a CMake Windows: Install debug symbol files for debug builds
CMake does not install the PDB debugging symbol files automatically.
These are useful for testing programs built using libclamav.dll.
2022-08-12 13:08:34 -07:00
Micah Snyder
15eef50656 Code cleanup: Refactor to clean up formatting issues
Refactored the clamscan code that determines 'what to scan' in order
to clean up some very messy logic and also to get around a difference in
how vscode and clang-format handle formatting #ifdef blocks in the
middle of an else/if.

In addition to refactoring, there is a slight behavior improvement. With
this change, doing `clamscan blah -` will now scan `blah` and then also
scan `stdin`.  You can even do `clamscan - blah` to now scan `stdin` and
then scan `blah`. Before, The `-` had to be the only "filename" argument
in order to scan from stdin.

In addition, added a bunch of extra empty lines or changing multi-line
function calls to single-line function calls in order to get around a
bug in clang-format with these two options do not playing nice together:
- AlignConsecutiveAssignments: true
- AlignAfterOpenBracket: true

AlignAfterOpenBracket is not taking account the spaces inserted by
AlignConsecutiveAssignments, so you end up with stuff like this:
```c
    bleeblah = 1;
    blah     = function(arg1,
                    arg2,
                    arg3);

                //  ^--- these args 4-left from where they should be.
```

VSCode, meanwhile, somehow fixes this whitespace issue so code that is
correctly formatted by VSCode doesn't have this bug, meaning that:

1. The clang-format check in GH Actions fails.
2. We'd all have to stop using format-on-save in VSCode and accept the
  bug if we wanted those GH Actions tests to pass.

Adding an empty line before variable assignments from multi-line
function calls evades the buggy behavior.

This commit should resolve the clang-format github action test failures,
for now.
2022-03-22 10:42:46 -07:00
mko-x
a21cc6dcd7
Add explicit log level parameter to application logging API
* Added loglevel parameter to logg()

* Fix logg and mprintf internals with new loglevels

* Update all logg calls to set loglevel

* Update all mprintf calls to set loglevel

* Fix hidden logg calls

* Executed clam-format
2022-02-15 15:13:55 -08:00
micasnyd
140c88aa4e Bump copyright for 2022
Includes minor format corrections.
2022-01-09 14:23:25 -07:00
Micah Snyder
016af483e6 CMake: support macOS code signing during build
To build with code signing, the macOS build must have:
  -G Xcode \
  -D CLAMAV_SIGN_FILE=ON \
  -D CODE_SIGN_IDENTITY="...your codesign ID..." \
  -D DEVELOPMENT_TEAM_ID="...your team ID..." \

You can find the codesign ID using:
  /usr/bin/env xcrun security find-identity -v -p codesigning

The team ID should also be listed in the identity description.

Also I changed the package name for APPLE to be "clamav" so it doesn't
put "ClamAV <version>" in the PKG PackageInfo like this:
  com.cisco.ClamAV 0.104.0.libraries
Instead, it should just be something like:
  com.cisco.clamav.libraries

Version is a separate field in that file and shouldn't be in the name.
2021-10-11 11:28:37 -07:00
kang-grace
23dfe8fc4c
ClamScan, ClamDScan: process memory scanning (Windows)
Add the process memory scanning feature from ClamWin's ClamScan.
This commit extends that feature to make it available in ClamDScan 
as well.

This adds three new options to ClamScan and ClamDScan on Windows:
* --memory
* --kill
* --unload

--allmatch and --stream are available for ClamDScan.

To reduce code duplication, this refactors clamd related code
used in both scanmem.c and proto.c into clamdcom. 
Moved send_fdpass(), send_stream(), chkpath(), dconnect(), and
dsresult(); as well as some type definitions.

Special thanks to Gianluigi Tiesi for allowing us to integrate the 
Windows process memory scanning feature from ClamWin into the ClamAV.
2021-08-27 09:14:45 -07:00
Micah Snyder
e0e0c8f955 CMake: Support to build deb, rpm, & macOS pkg packages
CMake/CPack is already used to build:
- TGZ source tarball
- WiX-based installer (Windows)
- ZIP install packages (Windows)

This commit adds support for building:
- macOS PKG installer
- DEB package
- RPM package

This should also enable building FreeBSD packages, but while I was able
to build all of the static dependencies using Mussels, CMake/CPack 3.20
doesn't appear to have the the FreeBSD generator despite being in the
documentation.

The package names are will be in this format:
  clamav-<version><suffix>.<os>.<arch>.<extension>

This includes changing the Windows .zip and .msi installer names.

E.g.:
- clamav-0.104.0-rc.macos.x86_64.pkg
- clamav-0.104.0-rc.win.win32.msi
- clamav-0.104.0-rc.win.win32.zip
- clamav-0.104.0-rc.win.x64.msi
- clamav-0.104.0-rc.linux.x86_64.deb
- clamav-0.104.0-rc.linux.x86_64.rpm

Notes about building the packages:

I've only tested this with building ClamAV using static dependencies that
I build using the clamav_deps "host-static" recipes from the "clamav"
Mussels cookbook. Eg:

  msl build clamav_deps -t host-static

Here's an example configuration to build clam in this way, installing to
/usr/local/clamav:

```sh
cmake .. \
  -D CMAKE_FIND_PACKAGE_PREFER_CONFIG=TRUE \
  -D CMAKE_PREFIX_PATH=$HOME/.mussels/install/host-static \
  -D CMAKE_INSTALL_PREFIX="/usr/local/clamav" \
  -D CMAKE_MODULE_PATH=$HOME/.mussels/install/host-static/lib/cmake \
  -D CMAKE_BUILD_TYPE=RelWithDebInfo \
  -D ENABLE_EXAMPLES=OFF \
  -D JSONC_INCLUDE_DIR="$HOME/.mussels/install/host-static/include/json-c" \
  -D JSONC_LIBRARY="$HOME/.mussels/install/host-static/lib/libjson-c.a" \
  -D ENABLE_JSON_SHARED=OFF \
  -D BZIP2_INCLUDE_DIR="$HOME/.mussels/install/host-static/include" \
  -D BZIP2_LIBRARY_RELEASE="$HOME/.mussels/install/host-static/lib/libbz2_static.a" \
  -D OPENSSL_ROOT_DIR="$HOME/.mussels/install/host-static" \
  -D OPENSSL_INCLUDE_DIR="$HOME/.mussels/install/host-static/include" \
  -D OPENSSL_CRYPTO_LIBRARY="$HOME/.mussels/install/host-static/lib/libcrypto.a" \
  -D OPENSSL_SSL_LIBRARY="$HOME/.mussels/install/host-static/lib/libssl.a" \
  -D LIBXML2_INCLUDE_DIR="$HOME/.mussels/install/host-static/include/libxml2" \
  -D LIBXML2_LIBRARY="$HOME/.mussels/install/host-static/lib/libxml2.a" \
  -D PCRE2_INCLUDE_DIR="$HOME/.mussels/install/host-static/include" \
  -D PCRE2_LIBRARY="$HOME/.mussels/install/host-static/lib/libpcre2-8.a" \
  -D CURSES_INCLUDE_DIR="$HOME/.mussels/install/host-static/include" \
  -D CURSES_LIBRARY="$HOME/.mussels/install/host-static/lib/libncurses.a" \
  -D ZLIB_INCLUDE_DIR="$HOME/.mussels/install/host-static/include" \
  -D ZLIB_LIBRARY="$HOME/.mussels/install/host-static/lib/libz.a" \
  -D LIBCHECK_INCLUDE_DIR="$HOME/.mussels/install/host-static/include" \
  -D LIBCHECK_LIBRARY="$HOME/.mussels/install/host-static/lib/libcheck.a"
```

Set CPACK_PACKAGING_INSTALL_PREFIX to customize the resulting package's
install location. This can be different than the install prefix. E.g.:
```sh
  -D CMAKE_INSTALL_PREFIX="/usr/local/clamav" \
  -D CPACK_PACKAGING_INSTALL_PREFIX="/usr/local/clamav" \
```

Then `make` and then one of these, depending on the platform:
```sh
cpack        # macOS: productbuild is default
cpack -G DEB # Debian-based
cpack -G RPM # RPM-based
```

On macOS you'll need to `pip3 install markdown` so that the NEWS.md file can
be converted to html so it will render in the installer.

On RPM-based systems, you'll need rpmbuild (install rpm-build)

This commit also fixes an issue where the html manual (if present) was
not correctly added to the Windows (or now other) install packages.

Fix num to hex function for Windows installer guid

Fix win32 cpack build

Fix macOS cpack build
2021-08-18 13:53:34 -07:00
Grace Kang
657a8e0ff8 CLAM-1535: Long file path support on Windows
via clam.manifest in win32/res. Opts into new Windows behavior that
does not have file path limitations.
Only works on Windows 10. In addition, you must set the registry key
"LongPathsEnabled" to  1.
(as described here: https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=powershell)
2021-08-05 14:49:55 -07:00
Micah Snyder
239ede8023 News & Install: updates to prepare for 0.104 RC
Updates to prepare for the 0.104 release candidate:

- Change documentation to explain current bytecode runtime situation.

- Document Python 2 pytest issue.

- Add additional contributors to acknowledgements.

- Update Install instructions to note that Autotool has been removed.

- Add *.cat SHA256 support and PDF bytecode hook bugfix to the News.

- Clarify purpose of the clamscan `--gen-json` option in the
  clamscan --help.
2021-07-20 12:41:44 -07:00
Micah Snyder
c5c7492781 Fix broken build introduced with progress callbacks
Fix missing return values for progress callbacks.

Fix Windows build.
The cli_debug_flag variable is not exported on Windows. The correct way
to check if in debug-mode is to check the command line options.
2021-07-17 10:39:27 -07:00
Micah Snyder
090c8990e3
libclamav, clamscan: load/unload callbacks & progress meters
Add progress callbacks to libclamav for:
- database load
- engine compile
- engine free

Add a progress bar to clamscan for load & compile.
These are disabled if you run with --debug or stdout is not a TTY or you
are using one of --quiet, --infected, or --no-summary.

Added code so you can test the engine-free callback by building with
ENABLE_ENGINE_FREE_PROGRESSBAR defined.

The compile & free progress callbacks pre-calculate the number of
tasks to complete to estimate the progress. Some tasks may take longer
than others so the progress speed my appear to vary a little.

The callbacks return type is a cl_error_t but doesn't currently do
anything. It is reserved for future use.

Minor formatting change in matcher-ac.c to counteract weird
clang-format behavior, and to make it easier to read.

Added progress callbacks and clamscan progress bars to the news.
2021-07-16 11:47:23 -07:00
Micah Snyder
a746d344df Remove Autotools build system & built-in LLVM
CMake is now required to build.

The built-in LLVM is no longer available.

Also removed support for libltdl calls, which is not used in the CMake
builds, was only used when building with Autotools.

TODO: Fix CMake LLVM support & update to work with modern versions.
2021-05-19 14:20:59 -07:00
Micah Snyder
c025afd683 Rename "shared" library to "common"
The named "shared" is confusing, especially now that these features are
built as a static library instead of being directly compiled into the
various applications.
2021-04-20 17:31:19 -07:00
Micah Snyder (micasnyd)
b9ca6ea103 Update copyright dates for 2021
Also fixes up clang-format.
2021-03-19 15:12:26 -07:00
Micah Snyder
d26b71d02d cmake: Fix vcpkg debug build issues
Adds support to the pcre2 and pthreadw32 Find<Package>.cmake modules for
correctly discovering the debug versions. This change modeled after the
upstream FindBZip2.cmake module.

Also eliminated HAVE_STRUCT_TIMESPEC redefinition warnings in Windows
builds.
2021-02-25 11:55:06 -08:00
Micah Snyder
05a8d589e7 CMake: improve multiarch support
In testing on Alpine, I found that most libs were installing to
<prefix>/lib while libclamav installed to <prefix>/lib64. Those who like
multiarch will advocate for lib64, though I only actually noticed it
because clamscan failed to find libclamav.so! Anyways, they should all
install to lib64 by default if that's what how the system is set up.
Using ${CMAKE_INSTALL_FULL_LIBDIR} instead of <prefix>/lib will do that.
2021-02-25 11:41:29 -08:00
Micah Snyder
6b3b8b2e9d CMake: CPack generate Windows installer with WIX
Also creates a ZIP for non-Admin (per-user) installs.

WIX requires the license file to have a .txt or .rtf extension so I
added the .txt extension. I've taken the opportunity to migrate the 3rd
party licenses to a COPYING subdirectory and have added licensing
details to the README.md file.

To build the installer, install WIX and simply run `cpack -C Release`

Also removed the explicit --config option from the
clamav-clamonacc.service file because it should not be required and
isn't being generated correctly when using autotools anyways, especially
after changes in this commit.
2021-02-25 11:41:27 -08:00
Micah Snyder
2552cfd0d1 CMake: Add CTest support to match Autotools checks
An ENABLE_TESTS CMake option is provided so that users can disable
testing if they don't want it. Instructions for how to use this
included in the INSTALL.cmake.md file.

If you run `ctest`, each testcase will write out a log file to the
<build>/unit_tests directory.

As with Autotools' make check, the test files are from test/.split
and unit_tests/.split files, but for CMake these are generated at
build time instead of at test time.

On Posix systems, sets the LD_LIBRARY_PATH so that ClamAV-compiled
libraries can be loaded when running tests.

On Windows systems, CTest will identify and collect all library
dependencies and assemble a temporarily install under the
build/unit_tests directory so that the libraries can be loaded when
running tests.

The same feature is used on Windows when using CMake to install to
collect all DLL dependencies so that users don't have to install them
manually afterwards.

Each of the CTest tests are run using a custom wrapper around Python's
unittest framework, which is also responsible for finding and inserting
valgrind into the valgrind tests on Posix systems.

Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by
default, if Valgrind can be found. There's no need to set VG=1.
CTest's memcheck module is NOT supported, because we use Python to
orchestrate our tests.

Added a bunch of Windows compatibility changes to the unit tests.
These were primarily changing / to PATHSEP and making adjustments
to use Win32 C headers and ifdef out the POSIX ones which aren't
available on Windows. Also disabled a bunch of tests on Win32
that don't work on Windows, notably the mmap ones and FD-passing
(i.e. FILEDES) ones.

Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate
warnings on Windows where json.h is included after inttypes.h because
json-c's inttypes replacement relies on it.
This is a it of a hack and may be removed if json-c fixes their
inttypes header stuff in the future.

Add preprocessor definitions on Windows to disable MSVC warnings about
CRT secure and nonstandard functions. While there may be a better
solution, this is needed to be able to see other more serious warnings.

Add missing file comment block and copyright statement for clamsubmit.c.
Also change json-c/json.h include filename to json.h in clamsubmit.c.
The directory name is not required.

Changed the hash table data integer type from long, which is poorly
defined, to size_t -- which is capable of storing a pointer. Fixed a
bunch of casts regarding this variable to eliminate warnings.

Fixed two bugs causing utf8 encoding unit tests to fail on Windows:
- The in_size variable should be the number of bytes, not the character
  count. This was was causing the SHIFT_JIS (japanese codepage) to UTF8
  transcoding test to only transcode half the bytes.
- It turns out that the MultiByteToWideChar() API can't transcode
  UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer
  and flip the bytes on each uint16_t. This but was causing the UTF16-BE
  to UTF8 tests to fail.

I also split up the utf8 transcoding tests into separate tests so I
could see all of the failures instead of just the first one.

Added a flags parameter to the unit test function to open testfiles
because it turns out that on Windows if a file contains the \r\n it will
replace it with just \n if you opened the file as a text file instead of
as binary. However, if we open the CBC files as binary, then a bunch of
bytecode tests fail. So I've changed the tests to open the CBC files in
the bytecode tests as text files and open all other files as binary.

Ported the feature tests from shell scripts to Python using a modified
version of our QA test-framework, which is largely compatible and will
allow us to migrate some QA tests into this repo. I'd like to add GitHub
Actions pipelines in the future so that all public PR's get some testing
before anyone has to manually review them.

The clamd --log option was missing from the help string, though it
definitely works. I've added it in this commit.
It appears that clamd.c was never clang-format'd, so this commit also
reformats clamd.c.

Some of the check_clamd tests expected the path returned by clamd to
match character for character with original path sent to clamd. However,
as we now evaluate real paths before a scan, the path returned by clamd
isn't going to match the relative (and possibly symlink-ridden) path
passed to clamdscan. I fixed this test by changing the test to search
for the basename: <signature> FOUND within the response instead of
matching the exact path.

Autotools: Link check_clamd with libclamav so we can use our utility
functions in check_clamd.c.
2021-02-25 11:41:26 -08:00
Micah Snyder
4cce1fcd20 GIF, PNG bugfixes; Add AlertBrokenMedia option
Added a new scan option to alert on broken media (graphics) file
formats. This feature mitigates the risk of malformed media files
intended to exploit vulnerabilities in other software. At present
media validation exists for JPEG, TIFF, PNG, and GIF files.

To enable this feature, set `AlertBrokenMedia yes` in clamd.conf, or
use the `--alert-broken-media` option when using `clamscan`.
These options are disabled by default for now.

Application developers may enable this scan option by enabling
`CL_SCAN_HEURISTIC_BROKEN_MEDIA` for the `heuristic` scan option bit
field.

Fixed PNG parser logic bugs that caused an excess of parsing errors
and fixed a stack exhaustion issue affecting some systems when
scanning PNG files. PNG file type detection was disabled via
signature database update for 0.103.0 to mitigate effects from these
bugs.

Fixed an issue where PNG and GIF files no longer work with Target:5
(graphics) signatures if detected as CL_TYPE_PNG/GIF rather than as
CL_TYPE_GRAPHICS. Target types now support up to 10 possible file
types to make way for additional graphics types in future releases.

Scanning JPEG, TIFF, PNG, and GIF files will no longer return "parse"
errors when file format validation fails. Instead, the scan will alert
with the "Heuristics.Broken.Media" signature prefix and a descriptive
suffix to indicate the issue, provided that the "alert broken media"
feature is enabled.

GIF format validation will no longer fail if the GIF image is missing
the trailer byte, as this appears to be a relatively common issue in
otherwise functional GIF files.

Added a TIFF dynamic configuration (DCONF) option, which was missing.
This will allow us to disable TIFF format validation via signature
database update in the event that it proves to be problematic.
This feature already exists for many other file types.

Added CL_TYPE_JPEG and CL_TYPE_TIFF types.
2021-01-28 12:54:47 -08:00
Micah Snyder (micasnyd)
3a7a20c323 Allow scans even if realpath lookup fails
The security improvement to perform file realpath lookups prior to a
scan has the adverse effect of causing file scans to fail on Windows
when scanning on some filesystems.

Specifically, it was observed that the ImDisk driver doesn't handle the
IRP_MJ_QUERY_INFORMATION message so the call to look up the realpath
using GetFinalPathNameByHandleW() doesn't work.

There are two other API's I've found which can query the real file path.
The first is to create a file mapping of the target file and then use
GetMappedFileNameW() to get the file path.  The other is to use the
NtQueryObject() undocumented NT API to get the file path.  Each of
these should return roughly the same thing.  For files in an ImDisk
RAM-disk drive, the resulting filepath for R:\clam.exe would
be \\Device\ImDisk0\clam.exe.  The trouble is, mapping
\\Device\ImDisk0\clam.exe back to R:\clam.exe would rely on an
assumption that ImDisk is using the default drive letter, which is a tad
hacky.

Instead, this patch simply allows the scan to proceed if the realpath
lookup failed.  If the user is using the quarantine (remove/move)
features AND if the scan target filepath has a directory junction (soft
link), then the quarantine action will fail.  It's not ideal but it is
quite unlikely.
2021-01-12 12:01:27 -08:00
Micah Snyder
2f62d6bf11 Fix RAR unicode filename bug on Linux
The UnRAR library requires the character-classification locale to be set
to the empty string "" so it will be set according to the environment
variables, as seeen in the rar.cpp example application `main()`.

Without this, extracting RAR archives containing unicode filenames on
non-Windows, non-macOS operating systems may fail.
2020-08-31 13:59:33 -07:00
Micah Snyder (micasnyd)
9e20cdf6ea Add CMake build tooling
This patch adds experimental-quality CMake build tooling.

The libmspack build required a modification to use "" instead of <> for
header #includes. This will hopefully be included in the libmspack
upstream project when adding CMake build tooling to libmspack.

Removed use of libltdl when using CMake.

Flex & Bison are now required to build.

If -DMAINTAINER_MODE, then GPERF is also required, though it currently
doesn't actually do anything.  TODO!

I found that the autotools build system was generating the lexer output
but not actually compiling it, instead using previously generated (and
manually renamed) lexer c source. As a consequence, changes to the .l
and .y files weren't making it into the build. To resolve this, I
removed generated flex/bison files and fixed the tooling to use the
freshly generated files. Flex and bison are now required build tools.
On Windows, this adds a dependency on the winflexbison package,
which can be obtained using Chocolatey or may be manually installed.

CMake tooling only has partial support for building with external LLVM
library, and no support for the internal LLVM (to be removed in the
future). I.e. The CMake build currently only supports the bytecode
interpreter.

Many files used include paths relative to the top source directory or
relative to the current project, rather than relative to each build
target. Modern CMake support requires including internal dependency
headers the same way you would external dependency headers (albeit
with "" instead of <>). This meant correcting all header includes to
be relative to the build targets and not relative to the workspace.

For example, ...

```c
include "../libclamav/clamav.h"
include "clamd/clamd_others.h"
```

... becomes:

```c
// libclamav
include "clamav.h"

// clamd
include "clamd_others.h"
```

Fixes header name conflicts by renaming a few of the files.

Converted the "shared" code into a static library, which depends on
libclamav. The ironically named "shared" static library provides
features common to the ClamAV apps which are not required in
libclamav itself and are not intended for use by downstream projects.
This change was required for correct modern CMake practices but was
also required to use the automake "subdir-objects" option.
This eliminates warnings when running autoreconf which, in the next
version of autoconf & automake are likely to break the build.

libclamav used to build in multiple stages where an earlier stage is
a static library containing utils required by the "shared" code.
Linking clamdscan and clamdtop with this libclamav utils static lib
allowed these two apps to function without libclamav. While this is
nice in theory, the practical gains are minimal and it complicates
the build system. As such, the autotools and CMake tooling was
simplified for improved maintainability and this feature was thrown
out. clamdtop and clamdscan now require libclamav to function.

Removed the nopthreads version of the autotools
libclamav_internal_utils static library and added pthread linking to
a couple apps that may have issues building on some platforms without
it, with the intention of removing needless complexity from the
source. Kept the regular version of libclamav_internal_utils.la
though it is no longer used anywhere but in libclamav.

Added an experimental doxygen build option which attempts to build
clamav.h and libfreshclam doxygen html docs.

The CMake build tooling also may build the example program(s), which
isn't a feature in the Autotools build system.

Changed C standard to C90+ due to inline linking issues with socket.h
when linking libfreshclam.so on Linux.

Generate common.rc for win32.

Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef
from misc.c.

Add CMake files to the automake dist, so users can try the new
CMake tooling w/out having to build from a git clone.

clamonacc changes:
- Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other
  similar macros.
- Added a new clamav-clamonacc.service systemd unit file, based on
  the work of ChadDevOps & Aaron Brighton.
- Added missing clamonacc man page.

Updates to clamdscan man page, add missing options.

Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use
libclamav).

Rename Windows mspack.dll to libmspack.dll so all ClamAV-built
libraries have the lib-prefix with Visual Studio as with CMake.
2020-08-13 00:25:34 -07:00