We add the _OR_GOTO_DONE suffix to the macros that go to done if the
allocation fails. This makes it obvious what is different about the
macro versus the equivalent function, and that error handling is
built-in.
Renamed the cli_strdup to safer_strdup to make it obvious that it exists
because it is safer than regular strdup. Regular strdup doesn't have the
NULL check before trying to dup, and so may result in a NULL-deref
crash.
Also remove unused STRDUP (_OR_GOTO_DONE) macro, since the one with the
NULL-check is preferred.
Variables like the number of signature parts are considered trusted user input
and so allocations based on those values need not check the memory allocation
limit.
Specifically for the allocation of the normalized buffer in cli_scanscript,
I determined that the size of SCANBUFF is fixed and so safe, and the maxpatlen
comes from the signature load and is therefore also trusted, so we do not
need to check the allocation limit.
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2
... and these names do not convey their purpose
This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2
The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double-free's.
So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2
These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
There are a large number of allocations for fix sized buffers using the
`cli_malloc` and `cli_calloc` calls that check if the requested size is
larger than our allocation threshold for allocations based on untrusted
input. These allocations will *always* be higher than the threshold, so
the extra stack frame and check for these calls is a waste of CPU.
This commit replaces needless calls with A -> B:
- cli_malloc -> malloc
- cli_calloc -> calloc
- CLI_MALLOC -> MALLOC
- CLI_CALLOC -> CALLOC
I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by
the max-allocation threshold, when MMAP is found/enabled. But the
alternative was set to cli_malloc / cli_calloc when disabled. I changed
those as well.
I didn't change the cli_realloc/2 calls because our version of realloc
not only implements a threshold but also stabilizes the undefined
behavior in realloc to protect against accidental double-free's.
It may be worth implementing a cli_realloc that doesn't have the
threshold built-in, however, so as to allow reallocaitons for things
like buffers for loading signatures, which aren't subject to the same
concern as allocations for scanning possible malware.
There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC,
because it appears to be allocating based on untrusted input.
If a byte compare subsignature specifies a 0-byte length then the
process may crash with a divide-by-zero exception while loading
the signature.
byte_length had validation for invalid characters, but nothing for a
zero value. Added validation for a zero value.
There is a possible heap buffer overflow read when loading some malformed
logical signatures that use the byte-compare feature.
Previously the upper bound for loop in cli_bcomp_freemeta was hardcoded to 2.
But it's possibly for there to be less than 2 items.
This issue is not a vulnerability.
Changed "2" to "bm->comp_count" to avoid going past the end.
Fixes: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43816
Add a new logical signature subsignature type for matching on images
with image fuzzy hashes.
Image fuzzy hash subsigantures follow this format:
fuzzy_img#<hash>#<dist>
In this initial implementation, the hamming distance (dist) is ignored
and only exact fuzzy hash matches will alert.
Fuzzy hash matching is only performed for supported image types.
Also: removed some excessive debug log messages on start-up.
Fixed an issue where the signature name (virname) is being allocated and
stored for every subsignature or even ever sub-pattern in an AC-pattern
(i.e. NDB sig or LDB subsig) containing a `{n-m}` or `*` wildcard.
This fix is only for LDB subsigs though. NDB signatures are still
allocaing one virname per sub-pattern.
This fix was required because I needed a place to store the virname with
fuzzy-hash subsignatures. Storing it in the fuzzy-hash subsig
metadatathe way AC-pattern, PCRE, and BComp subsigs were doing it
wouldn't work because it would cross the C-Rust FFI boundary and giving
pointers to Rust allocated stuff is dicey. Not to mention native Rust
strings are different thatn C strings. Anyways, the correct thing to do
was to store the virname with the actual logical signature.
TODO: Keep track of NDB signatures in the same way and store the virname
for NDB sigs there instead of in AC-patterns so that we can get rid of
the virname field in the AC-pattern struct.
The byte compare feature in logical signatures will cause the rule to
alert if it successfully matches regardless of the rest of the logical
signature.
An easy way to test this is with a logical signature that has two
bcomp subsignatures and requires both to match for the rule to alert.
In the following example, we have 4 signatures where
- the first will match both bcomp subsigs.
- the second will match neither.
- the last two match just one bcomp subsig.
In an --allmatch test, you'll find that the 3 of these match, with the
first one matching *twice*, once for each bcomp subsig.
test.ldb:
```
bcomp.both;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=123);4242;2(>>5#hb2#=255)
bcomp.neither;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=124);4242;2(>>5#hb2#=254)
bcomp.second;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=124);4242;2(>>5#hb2#=255)
bcomp.first;Engine:51-255,Target:0;0&1&2&3;4141;0(>>5#hb2#=123);4242;2(>>5#hb2#=254)
```
test.sample:
```
AA = 7B; BB = FF
```
You can also try a similar test to compare the behavior with regular
ac-pattern-match subsigs with this lsig-test.ldb:
```
pattern.both;Engine:51-255,Target:0;0&1;4141;4242
pattern.neither;Engine:51-255,Target:0;0&1;4140;4241
pattern.second;Engine:51-255,Target:0;0&1;4140;4242
pattern.first;Engine:51-255,Target:0;0&1;4141;4241
```
This commit fixes the issue by incrementing the logical subsignature
count for each bcomp subsig match instead of appending an alert for
each bcomp match.
Also removed call to `lsig_sub_matched()` that didn't do anything.
The for loop in cli_bcomp_scanbuf contains a few "continue" directives
that do not free the three-bytes subsigid buffer allocated within the
loop. This code path is triggered only when a signature contains more
than one byte compare subsignatures. Over a significant amount of time,
as for example when using clamd, this leads to memory exhaustion.
- 192959 Resource leak - In cli_bcomp_compare_check:Â Leak of
memory or pointers to system resources. Several fail cases
could lead to `buffer` or `tmp_buffer` being leaked
- 192934 Resource leak - In cli_bcomp_normalize_buffer:Â Leak of
memory or pointers to system resources. `hex_buffer` leaked
under certain conditions
- 185977 Resource leak - In ole2_process_property:Â Leak of memory
or pointers to system resources. A fail case could lead to
`outstr` and `outstr2` being leaked
- 185941 Resource leak - In header_cb (clamsubmit):Â Leak of
memory or pointers to system resources. A fail case could lead
to `mem` being leaked
- 185925 Resource leak - In load_oneyara:Â Leak of memory or
pointers to system resources. Several fail cases could lead
to `newident` being leaked
- 185918 Resource leak - In parsehwp3_docsummary:Â Leak of memory
or pointers to system resources. Not actually a leak, but
caused by checking for a condition that can’t occur.
- 185915 Resource leak - In parsehwp3_docinfo:Â Leak of memory or
pointers to system resources. Not actually a leak, but caused
by checking for a condition that can’t occur.
- 147644 Resource leak - In tcpserver:Â Leak of memory or pointers
to system resources. A fail case could lead to `info` being leaked
- 147642 Resource leak - In onas_ht_add_hierarchy:Â Leak of memory
or pointers to system resources. Several fail cases could lead
to `hnode` or `elem` memory leaks
It is possible for bm->offset to be negative and (offset + bm->offset)
to be positive, in which case the bounds check was incorrectly skipped,
which could result in a segfault.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fea90598db0 in cli_bcomp_compare_check (
f_buffer=0x7fea5c9e3a3e <error: Cannot access memory at address 0x7fea5c9e3a3e>, f_buffer@entry=0x7fea5c98c1ba "\001\030\001\030",
buffer_length=buffer_length@entry=2590, offset=<optimized out>,
bm=bm@entry=0x7fea7289f9c8) at matcher-byte-comp.c:720