Commit graph

46 commits

Author SHA1 Message Date
Val S.
aadf25df6a
Fix static analysis code quality issues (#1582)
`libclamav/libmspack.c`: Initialize variables before first `goto done;`
to fix unitialized variable use in an error condition.

`libclamav/others.c`: Explicitly ignore return values for calls to add
JSON values when subsequent calls don't depend on them.
If we were to add error handling here, the only thing we'd do is debug-
log it. I don't think it's worth adding the extra lines of code.

`libclamav/unarj.c`: Removed dead code.
The `status` variable is immediately set afterwards based on whether or
not any files may be extracted.

`libclamav/unzip.c`: Removed dead code.
The `ret` variable is checked immediately after being set, above. This
check after the `do`-`while()` loop is dead code.

`sigtool/sigtool.c`: Fix potential NULL deref in error handling.
This is a fix for the same issue as was fixed in a previous commit.
I somehow overlooked this one. Copy/paste bug.

`libclamav/pdfdecode.c`: Fix leaked `stream` memory when
`filter_lzwdecode()` fails.

`clamdtop/clamdtop.c`: Fix possible NULL dereference if `strchr` returns
NULL in `read_version()` and `check_stats_available()`.

`libclamav/rtf.c`: Fix memory leak in `rtf_object_process()` if
`cli_gentemp_with_prefix()` fails.
Also change empty for-loop to resolve clang-format weirdness and make it
more obvious the for-loop has no body.

`libclamav/aspack.c`: Ensure that `endoff - old` is not negative in
`build_decrypt_array()` before passing to `CLI_ISCONTAINED()` which expects
unsigned values.

`libclamav/upx.c`: Fix integer overflow checks in multiple functions.

`libclamav/vba_extract.c`: Set `entries` pointer back to NULL after free in
`word_read_macro_entry()` error condition.

`libclamav/unzip.c`: Remove logic to return `CL_EMAXFILES` from
`index_local_file_headers()`. It seems it only overwrote the status when
not `CL_SUCCESS` in which case it could be overriding a more serious failure.
Further, updates to the how the ZIP parser works has made it so this needs
to return `CL_SUCCESS` in order for the caller to at least scan the files
found so far.
Finally, the calling function has checks of its own to make sure we don't
exceeds the max-files limit.

`libclamav/unzip.c`: Fix issue where `cli_append_potentially_unwanted()` in
`index_local_file_headers()` might overwrite an error in `status` with
`CL_CLEAN`. Instead, it now checks the return value and only overwrites the
`CL_EFORMAT` status with a different value if not `CL_SUCCESS`.

`libclamav/unzip.c`: Fix a potential leak with `combined_catalogue` and
`temp_catalogue` in an error condition. We should always free them if not NULL,
not just if the function failed. And to make this safe, we must set
`combined_catalogue` to NULL when we give ownership to `*catalogue`.

`libclamav/scanners.c`: Fix a potential leak in error handling for the
`cli_ole2_tempdir_scan_vba()` function.

CLAM-2768
2025-10-02 11:46:14 -04:00
Val S.
d4114e0d2c
Fix static analysis code quality issues; Fix old libjson-c support (#1574)
`clamscan/manager.c`: Fix double-free in an error condition in `scanfile()`.

`common/optparser.c`: Fix uninitialized use of the `numarg` variable when
`arg` is `NULL`.

`libclamav/cache.c`: Don't check if `ctx-fmap` is `NULL` when we've
already dereferenced it.

`libclamav/crypto.c`: The `win_exception` variable and associated logic
is Windows-specific and so needs preprocessor platform checks. Otherwise
it generates unused variable warnings.

`libclamav/crypto.c`: Check for `size_t` overflow of the `byte_read`
variable in the `cl_hash_file_fd_ex()` function.

`libclamav/crypto.c`: Fix a memory leak in the `cl_hash_file_fd_ex()`
function.

`libclamav/fmap.c`: Correctly the `name` and `path` pointer if
`fmap_duplicate()` fails. Also need to clear those variables when
duplicating the parent `map` so that on error it does not free the wrong
`name` or `path`.

`libclamav/fmap.c`: Refine error handling for `hash_string` cleanup in
`cl_fmap_get_hash()`. Coverity's complaint was that `hash_string` could
never be non-NULL if `status` is not `CL_SUCCESS`. I.e., the cleanup is
dead code. I don't think my cleanup actually "fixes" that though it is
definitely a better way to do the error handling.
The `if (NULL != hash_string) {` check is still technically dead code.
It safeguards against future changes that may `goto done` between the
allocation and transfering ownership from `hash_string` to `hash_out`.

`libclamav/others.c`: Fix possible memory leak in `cli_recursion_stack_push()`.

`libclamav/others.c`: Refactor an if/else + switch statement inside
`cli_dispatch_scan_callback()` so that the `CL_SCAN_CALLBACK_ALERT` case
is not dead-code. It's also easier to read now.

`libclamav/pdfdecode.c`: For logging, use the `%zu` to format `size_t`
instead of casting to `long long` and using `%llu`. Simiularly use the
`STDu32` format string macro for `uint32_t`.

`libclamav/pdfdecode.c`: Fix a possible double-free for the `decoded`
pointer in `filter_lzwdecode()`.

`libclamav/pdfdecode.c`: Remove the `if (capacity > UINT_MAX) {`
overflow check inside `filter_lzwdecode()`, which didn't do anything.
The `capacity` variable this point is a fixed value and so I also changed
the `avail_out` to be that fixed `INFLATE_CHUNK_SIZE` value rather than
using `capacity`. It is more straightforward and replicates how similar
logic works later in the file.
I also removed the copy-pasted `(Bytef *)` cast which didn't reaaally do
anything, and was a copypaste from a different algorihm. The lzw
implementation interface doesn't use `Bytef`.

`libclamav/readdb.c`: Fix a possible NULL-deref on the `matcher` variable
in the error handling/cleanup code if the function fails.

`libclamav/scanners.c`: Fix an issue where the return value from some of
the parsers may be lost/overridden by the call to
`cli_dispatch_scan_callback()` just after the `done:` label in
`cli_magic_scan()`.

`libclamav/scanners.c`: Silence an unused-return value warning when
calling `cli_basename()`.

`sigtool/sigtool.c` and `unit_tests/check_regex.c`:
Fix possible NULL-derefs of the `ctx.recursion_stack` pointer in the error
handling for several functions.

Also, and this isn't a Coverity thing:

`libclamav/json_api.c` and `libclamav/others.c`:
Fix support for libjson-c version 0.13 and older.
I don't think we *should* be using the old version, but some environments
such as the current OSS-Fuzz base image are older and still use it.
The issue is that `json_object_new_uint64()` was introduced in a later
libjson-c version, so we have to fallback to use `json_object_new_int64()`
with older libjson-c, provided the int were storing isn't too big.

CLAM-2768
2025-09-26 18:26:00 -04:00
Val S.
9198f411d5
Fix PDF double-free bug (#1559)
It's possible that the `token->content` variable may get freed and set
to an uninitialized value from the `decoded` variable.
This results in both
 "Conditional jump or move depends on uninitialised value"
and
 "Invalid free() / delete / delete[] / realloc()"

This bug appears to have been introduced during 1.5 development and does
not occur in prior versions.

To fix it, I'm initializing the `decoded` variable and adding some
callocs elsewhere to initialize a couple other things that looked iffy,
and I'm making it so it won't try to `free(token->content)` and use
`decoded` if decompression results in an empty buffer and the status
code is set to CL_BREAK.

Fixes: https://issues.oss-fuzz.com/issues/429489013

CLAM-2854
2025-08-27 14:22:35 -04:00
Val S.
d2fa46d76c
Fix integer overflow in PDF parser (#1523)
The ascii85decode function calculates the amount of memory to reserve as
a function of (4 * bytes) + 1. Since the result is stored in a uint32_t,
we need to make sure that this calculation will not overflow. If we
detect that an overflow would occur, return CL_EFORMAT and do not
proceed.

Also check additional potential overflow conditions.
Other areas were identified that could potentially overflow.
This commit adds additional checks to prevent said overflows.

Thank you Greg Walkup at Sandia National Labs for reporting this issue.

CLAM-2752
CLAM-2757
CLAM-2759

Co-authored-by: John Humlick <15677335+jhumlick@users.noreply.github.com>
2025-06-30 10:47:02 -04:00
Val Snyder
7ff29b8c37
Bump copyright dates for 2025 2025-02-14 10:24:30 -05:00
Micah Snyder
8e04c25fec Rename clamav memory allocation functions
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2

... and these names do not convey their purpose

This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2

The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double-free's.

So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2

These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
2024-03-15 13:18:47 -04:00
Micah Snyder
6d6e04ddf8 Optimization: replace limited allocation calls
There are a large number of allocations for fix sized buffers using the
`cli_malloc` and `cli_calloc` calls that check if the requested size is
larger than our allocation threshold for allocations based on untrusted
input. These allocations will *always* be higher than the threshold, so
the extra stack frame and check for these calls is a waste of CPU.

This commit replaces needless calls with A -> B:
- cli_malloc -> malloc
- cli_calloc -> calloc
- CLI_MALLOC -> MALLOC
- CLI_CALLOC -> CALLOC

I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by
the max-allocation threshold, when MMAP is found/enabled. But the
alternative was set to cli_malloc / cli_calloc when disabled. I changed
those as well.

I didn't change the cli_realloc/2 calls because our version of realloc
not only implements a threshold but also stabilizes the undefined
behavior in realloc to protect against accidental double-free's.
It may be worth implementing a cli_realloc that doesn't have the
threshold built-in, however, so as to allow reallocaitons for things
like buffers for loading signatures, which aren't subject to the same
concern as allocations for scanning possible malware.

There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC,
because it appears to be allocating based on untrusted input.
2024-03-15 13:18:47 -04:00
Micah Snyder
ebe3c50555 PDF: Minor optimizations
Store temp files with obj id and gen id so analysts know which is which.

Don't dump decoded objects immediately. They'll get dumped later at the
end of pdf_extract_obj().

At the end of PDF object extraction, we don't need to find out the
"dumpid" (aka the object index in our list of pdf objects).
It isn't actually used! So I removed the unused parameter.
2024-01-22 11:29:52 -05:00
Micah Snyder
9cb28e51e6 Bump copyright dates for 2024 2024-01-22 11:27:17 -05:00
RainRat
caf324e544
Fix typos (no functional changes) 2023-11-26 18:01:19 -05:00
Micah Snyder
6eebecc303 Bump copyright for 2023 2023-02-12 11:20:22 -08:00
Micah Snyder
3954b8d6c7 PDF: remove allmatch checks
And fix issue where critical errors in magic_scan done by pdf_scan_contents() may be ignored.
2022-10-19 13:13:57 -07:00
micasnyd
140c88aa4e Bump copyright for 2022
Includes minor format corrections.
2022-01-09 14:23:25 -07:00
Micah Snyder (micasnyd)
9910d2d426 PDF: Fix error Attempt to allocate 0 bytes
The PDF parser currently prints verbose error messages when attempting
to shrink a buffer down to actual data length after decoding if it turns
out that the decoded stream was empty (0 bytes).  With exception to the
verbose error messages, there's no real behavior issue.

This commit fixes the issue by checking if any bytes were decoded before
attempting to shrink the buffer.
2020-04-06 15:01:53 -07:00
Andy Ragusa (aragusa)
ba6467a6a6 Fixed memory leak reported by oss-fuzz.
Added checks to see if realloc succeeds before reassigning the pointers,
and made this file build without warnings when compiled with -Wextra.
2020-01-31 10:52:55 -08:00
Micah Snyder
dd4967ed55 Adds fallthrough hints to alleviate warnings on gcc with -Wextra. 2020-01-22 12:57:49 -08:00
Micah Snyder (micasnyd)
8a97a905d1 bb12387 - PDF scan speed optimization
Users have reported slow scan speeds of some PDF documents. The scan
speed was very slow on Windows in particular.  Investigation indicated
significant time spent in cli_realloc.

Performance is particularly bad with a relatively large data stream is
decompressed using small chunk sizes and the final buffer is reallocated
to a larger buffer size each time a chunk is added.

This commit replaces BUFSIZ, which varies from 256B -> 8192B, with
INFLATE_CHUNK_SIZE, set to 256kB, the chunk size recommended by the zlib
documentation for efficient inflate performance. The output buffer is
shrunk (reallocated) down to the final decoded buffer length so as not
to waste memory when many small buffers must be decompressed.

A followup fix should provide a standard way to do zlib decompression
across libclamav where a linked list of decompressed chunks are
assembled and then the final output buffer is allocated at the end.
2020-01-22 12:57:49 -08:00
Micah Snyder
97a0647e88 Additional variable type changes for correctness and to silence warnings. A handful of other minor changes to silence warnings. Corrected a number of function definitions so they return cl_error_t rather than int. 2019-10-02 16:08:25 -04:00
Micah Snyder
5f4f69102d Correcting types from int to cl_error_t where appropriate. Eliminating unused variables and referencing unused parameters to remove warnings. 2019-10-02 16:08:25 -04:00
Micah Snyder
479a9a235a Fixes for issues identified by coverity. 2019-10-02 16:08:19 -04:00
Micah Snyder
72fd33c8b2 clang-format'd using new .clang-format rules. 2019-10-02 16:08:16 -04:00
Micah Snyder
38fe8b69a0 Added .clang-format style rules, clam-format script to automate formatting of ClamAV code, and preparing select files so that clang-format does not alter carefully formatted sections. 2019-10-02 16:08:16 -04:00
Micah Snyder
d77b8ae0fb Fixes to a handful of bugs identified during regression testing of PDF and UnRAR changes.
Fix for minor memory leak in fmap_dump_to_file().
Fix to PDF object stream logic, accounting for a realloc() issue when the only pdf object stream fails to parse, and for when pdf objects in a stream appear to extend further than the size of the stream.
Fix for memory leak cleaning up PDF object stream buffer in error condition.
Fix to bug in pdf_decodestream wherein objects were found in an object stream, but the object stream could later be free'd if max scansize was exceeded, resulting in a NULL dereference.
General cleanup of pdf_decodestream/pdf_decodestream_internal exit code logic.
2018-12-02 23:07:00 -05:00
Micah Snyder
d7979d4ff7 Restructured scan options flags from a single bitflag field to a structure containing multiple bitflag fields. This also required adding a new function to the bytecode API to get scan options a la carte, and modifying the existing function to hand back scan options in the old/deprecated uint32_t bitflag format. Re-generated bytecode iface header files.
Updated libclamav documentation detailing new scan options structure.
Renamed references to 'algorithmic' detection to 'heuristic' detection. Renaming references to 'properties' to 'collect metadata'.
Renamed references to 'scan all' to 'scan all match'.
Renamed a couple of 'Hueristic.*' signature names as 'Heuristics.*' signatures (plural) to match majority of other heuristics.
2018-12-02 23:06:59 -05:00
Micah Snyder (micasnyd)
89d5207b31 Added new pdf object stream parsing capability. 2018-12-02 23:06:58 -05:00
Micah Snyder
e09d884341 eliminated a large number of warnings, many of which had to do with mixing types. i switched some types to size_t and a couple to ptrdiff_t to make things more consistent, but there is a huge amount of work to be done to make types consistent. int, unsigned int, unsigned, off_t, and other types are ill-suited to storing buffer lengths or memory addresses. 2017-08-16 17:31:45 -04:00
Steven Morgan
9807521cf2 Change some warning messages to debug messages. 2016-08-26 11:25:34 -04:00
Kevin Lin
fdcf5109d6 pdfdecode: reduced stream dumping conditions to just leave-temps + fix warnings 2016-07-26 16:39:40 -04:00
Kevin Lin
567c73ec69 lzwdec: modify dictionary max code points and change state flags 2016-05-26 15:25:54 -04:00
Kevin Lin
bfd8ca3eda pdfdecode: return raw stream if no non-forced filters succeed 2016-04-18 17:11:59 -04:00
Kevin Lin
a081b3e946 pdfdecode: do not apply forced decryption to /XRef streams 2016-04-18 17:11:12 -04:00
Kevin Lin
046d4cc91b pdfdecode: reduce errmsg to dbgmsg for log reduction 2016-04-15 13:29:56 -04:00
Kevin Lin
5c2915120a pdfdecode: add dictionary heuristic and all-match support 2016-04-15 13:29:56 -04:00
Kevin Lin
ce3cf4c64f pdfdecode: lzw dconf 2016-04-15 13:29:56 -04:00
Kevin Lin
e8a23886d2 pdfdecode: integrate lzw decompression 2016-04-15 13:29:55 -04:00
Kevin Lin
02c120e8f2 pdfdecode: change various debug messages for clarity 2016-04-15 13:29:55 -04:00
Kevin Lin
0018a8e792 pdfdecode: memory fixes for ascii85, reorganize initial crypt 2016-04-15 13:29:54 -04:00
Kevin Lin
7aad5a3b96 pdf: pdfdecode integration 2016-04-15 13:29:54 -04:00
Kevin Lin
639615af8d pdfdecode: change condition for dumping intermediate filter buffers 2016-04-15 13:29:53 -04:00
Kevin Lin
a042e6f068 pdfdecode: add error flags in asciihex and ascii85 2016-04-15 13:29:53 -04:00
Kevin Lin
1d0cdc67c1 pdf: open abi for pdfdecode usage 2016-04-15 13:29:53 -04:00
Kevin Lin
d593717b09 pdfdecode: fixes, error code handling, and limit checks 2016-04-15 13:29:53 -04:00
Kevin Lin
07a7200626 pdfdecode: add dumping intermediate filter buffers 2016-04-15 13:29:52 -04:00
Kevin Lin
eaf5221181 pdfdecode: implement crypt filter handler 2016-04-15 13:29:52 -04:00
Kevin Lin
739e5052cc pdfdecode: implement rldecode filter handler 2016-04-15 13:29:52 -04:00
Kevin Lin
7ded9e2955 pdfdecode: add new source for stream decoding (ascii85, asciihex, flate) 2016-04-15 13:29:52 -04:00