`libclamav/libmspack.c`: Initialize variables before first `goto done;`
to fix unitialized variable use in an error condition.
`libclamav/others.c`: Explicitly ignore return values for calls to add
JSON values when subsequent calls don't depend on them.
If we were to add error handling here, the only thing we'd do is debug-
log it. I don't think it's worth adding the extra lines of code.
`libclamav/unarj.c`: Removed dead code.
The `status` variable is immediately set afterwards based on whether or
not any files may be extracted.
`libclamav/unzip.c`: Removed dead code.
The `ret` variable is checked immediately after being set, above. This
check after the `do`-`while()` loop is dead code.
`sigtool/sigtool.c`: Fix potential NULL deref in error handling.
This is a fix for the same issue as was fixed in a previous commit.
I somehow overlooked this one. Copy/paste bug.
`libclamav/pdfdecode.c`: Fix leaked `stream` memory when
`filter_lzwdecode()` fails.
`clamdtop/clamdtop.c`: Fix possible NULL dereference if `strchr` returns
NULL in `read_version()` and `check_stats_available()`.
`libclamav/rtf.c`: Fix memory leak in `rtf_object_process()` if
`cli_gentemp_with_prefix()` fails.
Also change empty for-loop to resolve clang-format weirdness and make it
more obvious the for-loop has no body.
`libclamav/aspack.c`: Ensure that `endoff - old` is not negative in
`build_decrypt_array()` before passing to `CLI_ISCONTAINED()` which expects
unsigned values.
`libclamav/upx.c`: Fix integer overflow checks in multiple functions.
`libclamav/vba_extract.c`: Set `entries` pointer back to NULL after free in
`word_read_macro_entry()` error condition.
`libclamav/unzip.c`: Remove logic to return `CL_EMAXFILES` from
`index_local_file_headers()`. It seems it only overwrote the status when
not `CL_SUCCESS` in which case it could be overriding a more serious failure.
Further, updates to the how the ZIP parser works has made it so this needs
to return `CL_SUCCESS` in order for the caller to at least scan the files
found so far.
Finally, the calling function has checks of its own to make sure we don't
exceeds the max-files limit.
`libclamav/unzip.c`: Fix issue where `cli_append_potentially_unwanted()` in
`index_local_file_headers()` might overwrite an error in `status` with
`CL_CLEAN`. Instead, it now checks the return value and only overwrites the
`CL_EFORMAT` status with a different value if not `CL_SUCCESS`.
`libclamav/unzip.c`: Fix a potential leak with `combined_catalogue` and
`temp_catalogue` in an error condition. We should always free them if not NULL,
not just if the function failed. And to make this safe, we must set
`combined_catalogue` to NULL when we give ownership to `*catalogue`.
`libclamav/scanners.c`: Fix a potential leak in error handling for the
`cli_ole2_tempdir_scan_vba()` function.
CLAM-2768
`clamscan/manager.c`: Fix double-free in an error condition in `scanfile()`.
`common/optparser.c`: Fix uninitialized use of the `numarg` variable when
`arg` is `NULL`.
`libclamav/cache.c`: Don't check if `ctx-fmap` is `NULL` when we've
already dereferenced it.
`libclamav/crypto.c`: The `win_exception` variable and associated logic
is Windows-specific and so needs preprocessor platform checks. Otherwise
it generates unused variable warnings.
`libclamav/crypto.c`: Check for `size_t` overflow of the `byte_read`
variable in the `cl_hash_file_fd_ex()` function.
`libclamav/crypto.c`: Fix a memory leak in the `cl_hash_file_fd_ex()`
function.
`libclamav/fmap.c`: Correctly the `name` and `path` pointer if
`fmap_duplicate()` fails. Also need to clear those variables when
duplicating the parent `map` so that on error it does not free the wrong
`name` or `path`.
`libclamav/fmap.c`: Refine error handling for `hash_string` cleanup in
`cl_fmap_get_hash()`. Coverity's complaint was that `hash_string` could
never be non-NULL if `status` is not `CL_SUCCESS`. I.e., the cleanup is
dead code. I don't think my cleanup actually "fixes" that though it is
definitely a better way to do the error handling.
The `if (NULL != hash_string) {` check is still technically dead code.
It safeguards against future changes that may `goto done` between the
allocation and transfering ownership from `hash_string` to `hash_out`.
`libclamav/others.c`: Fix possible memory leak in `cli_recursion_stack_push()`.
`libclamav/others.c`: Refactor an if/else + switch statement inside
`cli_dispatch_scan_callback()` so that the `CL_SCAN_CALLBACK_ALERT` case
is not dead-code. It's also easier to read now.
`libclamav/pdfdecode.c`: For logging, use the `%zu` to format `size_t`
instead of casting to `long long` and using `%llu`. Simiularly use the
`STDu32` format string macro for `uint32_t`.
`libclamav/pdfdecode.c`: Fix a possible double-free for the `decoded`
pointer in `filter_lzwdecode()`.
`libclamav/pdfdecode.c`: Remove the `if (capacity > UINT_MAX) {`
overflow check inside `filter_lzwdecode()`, which didn't do anything.
The `capacity` variable this point is a fixed value and so I also changed
the `avail_out` to be that fixed `INFLATE_CHUNK_SIZE` value rather than
using `capacity`. It is more straightforward and replicates how similar
logic works later in the file.
I also removed the copy-pasted `(Bytef *)` cast which didn't reaaally do
anything, and was a copypaste from a different algorihm. The lzw
implementation interface doesn't use `Bytef`.
`libclamav/readdb.c`: Fix a possible NULL-deref on the `matcher` variable
in the error handling/cleanup code if the function fails.
`libclamav/scanners.c`: Fix an issue where the return value from some of
the parsers may be lost/overridden by the call to
`cli_dispatch_scan_callback()` just after the `done:` label in
`cli_magic_scan()`.
`libclamav/scanners.c`: Silence an unused-return value warning when
calling `cli_basename()`.
`sigtool/sigtool.c` and `unit_tests/check_regex.c`:
Fix possible NULL-derefs of the `ctx.recursion_stack` pointer in the error
handling for several functions.
Also, and this isn't a Coverity thing:
`libclamav/json_api.c` and `libclamav/others.c`:
Fix support for libjson-c version 0.13 and older.
I don't think we *should* be using the old version, but some environments
such as the current OSS-Fuzz base image are older and still use it.
The issue is that `json_object_new_uint64()` was introduced in a later
libjson-c version, so we have to fallback to use `json_object_new_int64()`
with older libjson-c, provided the int were storing isn't too big.
CLAM-2768
It's possible that the `token->content` variable may get freed and set
to an uninitialized value from the `decoded` variable.
This results in both
"Conditional jump or move depends on uninitialised value"
and
"Invalid free() / delete / delete[] / realloc()"
This bug appears to have been introduced during 1.5 development and does
not occur in prior versions.
To fix it, I'm initializing the `decoded` variable and adding some
callocs elsewhere to initialize a couple other things that looked iffy,
and I'm making it so it won't try to `free(token->content)` and use
`decoded` if decompression results in an empty buffer and the status
code is set to CL_BREAK.
Fixes: https://issues.oss-fuzz.com/issues/429489013
CLAM-2854
The ascii85decode function calculates the amount of memory to reserve as
a function of (4 * bytes) + 1. Since the result is stored in a uint32_t,
we need to make sure that this calculation will not overflow. If we
detect that an overflow would occur, return CL_EFORMAT and do not
proceed.
Also check additional potential overflow conditions.
Other areas were identified that could potentially overflow.
This commit adds additional checks to prevent said overflows.
Thank you Greg Walkup at Sandia National Labs for reporting this issue.
CLAM-2752
CLAM-2757
CLAM-2759
Co-authored-by: John Humlick <15677335+jhumlick@users.noreply.github.com>
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2
... and these names do not convey their purpose
This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2
The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double-free's.
So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2
These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
There are a large number of allocations for fix sized buffers using the
`cli_malloc` and `cli_calloc` calls that check if the requested size is
larger than our allocation threshold for allocations based on untrusted
input. These allocations will *always* be higher than the threshold, so
the extra stack frame and check for these calls is a waste of CPU.
This commit replaces needless calls with A -> B:
- cli_malloc -> malloc
- cli_calloc -> calloc
- CLI_MALLOC -> MALLOC
- CLI_CALLOC -> CALLOC
I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by
the max-allocation threshold, when MMAP is found/enabled. But the
alternative was set to cli_malloc / cli_calloc when disabled. I changed
those as well.
I didn't change the cli_realloc/2 calls because our version of realloc
not only implements a threshold but also stabilizes the undefined
behavior in realloc to protect against accidental double-free's.
It may be worth implementing a cli_realloc that doesn't have the
threshold built-in, however, so as to allow reallocaitons for things
like buffers for loading signatures, which aren't subject to the same
concern as allocations for scanning possible malware.
There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC,
because it appears to be allocating based on untrusted input.
Store temp files with obj id and gen id so analysts know which is which.
Don't dump decoded objects immediately. They'll get dumped later at the
end of pdf_extract_obj().
At the end of PDF object extraction, we don't need to find out the
"dumpid" (aka the object index in our list of pdf objects).
It isn't actually used! So I removed the unused parameter.
The PDF parser currently prints verbose error messages when attempting
to shrink a buffer down to actual data length after decoding if it turns
out that the decoded stream was empty (0 bytes). With exception to the
verbose error messages, there's no real behavior issue.
This commit fixes the issue by checking if any bytes were decoded before
attempting to shrink the buffer.
Users have reported slow scan speeds of some PDF documents. The scan
speed was very slow on Windows in particular. Investigation indicated
significant time spent in cli_realloc.
Performance is particularly bad with a relatively large data stream is
decompressed using small chunk sizes and the final buffer is reallocated
to a larger buffer size each time a chunk is added.
This commit replaces BUFSIZ, which varies from 256B -> 8192B, with
INFLATE_CHUNK_SIZE, set to 256kB, the chunk size recommended by the zlib
documentation for efficient inflate performance. The output buffer is
shrunk (reallocated) down to the final decoded buffer length so as not
to waste memory when many small buffers must be decompressed.
A followup fix should provide a standard way to do zlib decompression
across libclamav where a linked list of decompressed chunks are
assembled and then the final output buffer is allocated at the end.
Fix for minor memory leak in fmap_dump_to_file().
Fix to PDF object stream logic, accounting for a realloc() issue when the only pdf object stream fails to parse, and for when pdf objects in a stream appear to extend further than the size of the stream.
Fix for memory leak cleaning up PDF object stream buffer in error condition.
Fix to bug in pdf_decodestream wherein objects were found in an object stream, but the object stream could later be free'd if max scansize was exceeded, resulting in a NULL dereference.
General cleanup of pdf_decodestream/pdf_decodestream_internal exit code logic.
Updated libclamav documentation detailing new scan options structure.
Renamed references to 'algorithmic' detection to 'heuristic' detection. Renaming references to 'properties' to 'collect metadata'.
Renamed references to 'scan all' to 'scan all match'.
Renamed a couple of 'Hueristic.*' signature names as 'Heuristics.*' signatures (plural) to match majority of other heuristics.