Commit graph

72 commits

Author SHA1 Message Date
Valerie Snyder
e64590d8b5
libclamav: Add 'ex'-scan functions to API w. hash and type in/out parameters
Add `cl_scanfile_ex()`, `cl_scanmap_ex()`, and `cl_scandesc_ex()`
functions that provide the following additional parameters:

hash_hint       (Optional) A NULL terminated string of the file hash so that
                libclamav does not need to calculate it.

[out] hash_out  (Optional) A NULL terminated string of the file hash.
                The caller is responsible for freeing the string.

hash_alg        The hashing algorithm used for either `hash_hint` or `hash_out`.
                Supported algorithms are "md5", "sha1", "sha2-256".
                If not specified, the default is "sha2-256".

file_type_hint  (Optional) A NULL terminated string of the file type hint.
                E.g. "pe", "elf", "zip", etc.
                You may also use ClamAV type names such as "CL_TYPE_PE".
                ClamAV will ignore the hint if it is not familiar with the specified type.
                See also: https://docs.clamav.net/appendix/FileTypes.html#file-types

file_type_out   (Optional) A NULL terminated string of the file type
                of the top layer as determined by ClamAV.
                Will take the form of the standard ClamAV file type format. E.g. "CL_TYPE_PE".
                See also: https://docs.clamav.net/appendix/FileTypes.html#file-types

CLAM-2626
2025-08-14 22:39:12 -04:00
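
The `hash_alg` rules in commit e64590d8b5 (three supported names, "sha2-256" as the default) can be sketched as a small caller-side helper. This is only an illustration: `sketch_resolve_hash_alg` is a hypothetical name and not part of libclamav. Treating an empty string like NULL as "not specified" and returning NULL for an unsupported name are both assumptions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a libclamav function.
 * Returns the algorithm name to use, or NULL if the name is unsupported.
 * Assumption: NULL or "" means "not specified" and selects the default.
 */
static const char *sketch_resolve_hash_alg(const char *hash_alg)
{
    static const char *const supported[] = {"md5", "sha1", "sha2-256"};
    size_t i;

    if (hash_alg == NULL || hash_alg[0] == '\0') {
        return "sha2-256"; /* default named in the commit message */
    }

    for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++) {
        if (strcmp(hash_alg, supported[i]) == 0) {
            return supported[i];
        }
    }

    return NULL; /* unsupported algorithm name */
}
```

A caller could run this check before passing `hash_hint` to a scan function, so that a hint is never sent with an algorithm name the library does not list.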
Valerie Snyder
18854bf4bc
Windows: Fix filepath basename issue
On Windows, the cli_basename function should treat both '/' and '\' as path
separators. Most Windows APIs also accept both.

On Linux/Unix, it makes sense when using a filepath that is more for
informational purposes or where it may have come from a Windows system,
to treat the '\' as a path separator.
But in situations where the path is needed for some critical action,
like moving or deleting a file, we can't treat it as a path separator.
2025-08-11 18:14:19 -04:00
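
The separator rule from commit 18854bf4bc can be sketched as follows. This is not the real `cli_basename`: the name `sketch_basename` and the `backslash_is_sep` flag are hypothetical, chosen only to show the difference between the "informational" and "critical action" cases.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the real cli_basename): return a pointer to the
 * final path component inside `path`.
 *
 * '/' is always a separator. '\\' is a separator only when
 * `backslash_is_sep` is true, e.g. on Windows or for display-only paths.
 */
static const char *sketch_basename(const char *path, bool backslash_is_sep)
{
    const char *base = path;
    const char *p;

    if (path == NULL) {
        return NULL;
    }

    for (p = path; *p != '\0'; p++) {
        if (*p == '/' || (backslash_is_sep && *p == '\\')) {
            base = p + 1; /* next component starts after the separator */
        }
    }

    return base;
}
```

With the flag off, a Unix file literally named `a\b` keeps its full name, which matters when the result is used to move or delete the file.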
Val Snyder
7ff29b8c37
Bump copyright dates for 2025 2025-02-14 10:24:30 -05:00
Micah Snyder
405829ee88 Refine max-allocation and safer-allocation function and macro names
We add the _OR_GOTO_DONE suffix to the macros that go to done if the
allocation fails. This makes it obvious what is different about the
macro versus the equivalent function, and that error handling is
built-in.

Renamed the cli_strdup to safer_strdup to make it obvious that it exists
because it is safer than regular strdup. Regular strdup doesn't have the
NULL check before trying to dup, and so may result in a NULL-deref
crash.

Also remove unused STRDUP (_OR_GOTO_DONE) macro, since the one with the
NULL-check is preferred.
2024-03-15 13:18:47 -04:00
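
A minimal sketch of the two ideas in commit 405829ee88: a strdup that checks for NULL first, and an allocation macro whose `_OR_GOTO_DONE` suffix advertises the built-in jump. All names prefixed `sketch_` or `SKETCH_` are hypothetical, and the macro's reliance on a local `status` variable and a `done` label is an assumption about the calling convention.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a "safer" strdup: NULL in, NULL out, no crash. */
static char *sketch_safer_strdup(const char *s)
{
    size_t len;
    char *copy;

    if (s == NULL) {
        return NULL; /* plain strdup(NULL) may dereference NULL here */
    }

    len  = strlen(s) + 1;
    copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Hypothetical macro: assumes the caller has `int status` and a `done:` label. */
#define SKETCH_MALLOC_OR_GOTO_DONE(var, size) \
    do {                                      \
        (var) = malloc(size);                 \
        if ((var) == NULL) {                  \
            status = -1;                      \
            goto done;                        \
        }                                     \
    } while (0)

/* Allocates a zeroed buffer. Returns 0 on success, -1 on allocation failure. */
static int sketch_make_buffer(size_t size, char **out)
{
    int status = 0;
    char *buf  = NULL;

    SKETCH_MALLOC_OR_GOTO_DONE(buf, size);
    memset(buf, 0, size);

    *out = buf;
    buf  = NULL; /* ownership moved to the caller */

done:
    free(buf); /* free(NULL) is a no-op */
    return status;
}
```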
Micah Snyder
8e04c25fec Rename clamav memory allocation functions
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2

... and these names do not convey their purpose.

This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2

The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double frees.

So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2

These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
2024-03-15 13:18:47 -04:00
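
The behavior described in commit 8e04c25fec can be sketched as below. These are not the real `cli_max_malloc` or `cli_safer_realloc`: the `sketch_` names, the 1 GiB cap, and the choice to reject zero-sized `malloc` requests are assumptions made for illustration.

```c
#include <stdlib.h>

/* Assumed cap for this sketch only; the real limit is not stated in the log. */
#define SKETCH_MAX_ALLOCATION ((size_t)1024 * 1024 * 1024)

/* Refuse zero-sized and over-limit requests before touching the allocator. */
static void *sketch_max_malloc(size_t size)
{
    if (size == 0 || size > SKETCH_MAX_ALLOCATION) {
        return NULL;
    }
    return malloc(size);
}

/*
 * realloc() without a size cap, but never called with size 0, so `ptr`
 * is never freed behind the caller's back.
 * On any failure the function returns NULL and `ptr` is still valid.
 */
static void *sketch_safer_realloc(void *ptr, size_t size)
{
    if (size == 0) {
        return NULL; /* do not free: caller still owns ptr */
    }
    return realloc(ptr, size);
}
```

Because a NULL return always leaves the original pointer valid, the caller can free it exactly once on its own error path.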
Micah Snyder
9cb28e51e6 Bump copyright dates for 2024 2024-01-22 11:27:17 -05:00
Micah Snyder
6eebecc303 Bump copyright for 2023 2023-02-12 11:20:22 -08:00
Micah Snyder
d10a3d24f3 Fix issue preventing multiple LDB PCRE subsignatures
My recent fix for the issue where a '\' followed by ':' in a Yara regex
string would fail to parse introduced a new issue that broke loading a
signature in the current daily.ldb database.

Unbeknownst to me at the time, you can have multiple PCRE subsignatures
in a logical signature, so long as they're the last subsignatures.
The previous fix made it so the signature parser muddled more than one
PCRE subsignature into one messed up regex string.

This commit essentially reverts the previous fix, while keeping some of
the code readability improvements in that function.
Instead, it addresses the problem a different way. To resolve the
original problem, I'm simply checking if the signame starts with "YARA".
If it does, we don't tokenize it by ':' delimiters.
2022-09-16 13:53:20 -07:00
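
The check described in commit d10a3d24f3 amounts to a prefix test on the signature name. A minimal sketch, assuming a hypothetical helper name and assuming that a NULL name should not be tokenized:

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the signature should be split on ':'
 * delimiters, 0 if its name starts with "YARA" (or is NULL, by assumption).
 */
static int sketch_should_tokenize(const char *signame)
{
    if (signame == NULL) {
        return 0;
    }
    return strncmp(signame, "YARA", 4) != 0;
}
```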
Micah Snyder
e4d4154492 Fix issue loading regex sigs containing '/' and ':'
There is an issue parsing PCRE patterns if the pattern contains a '/' in
the middle, followed by a ':'.  When splitting the subsignature (or yara
regex string) by ':' delimiters to identify the offset, it will
inadvertently think that the '/' in the middle of the sig is the end of
the PCRE string and will therefore consider the ':' in the string as
valid delimiter instead of ignoring it for being inside of the regex
string.

The solution I came up with is to ignore all content after a '/' when
tokenizing rather than ignoring content between a matching pair of /'s.
This works for LDB signatures because PCRE subsignatures are always
the last subsignature and because a ':' never comes *after* the PCRE
string.
It works for YARA rules because the `cli_tokenize()` function is only
ever used on the regex strings, never on the whole rule.

Fixes: https://github.com/Cisco-Talos/clamav/issues/594
2022-09-07 21:39:46 -07:00
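
The tokenizing rule from commit e4d4154492 (stop honoring ':' once a '/' has been seen) can be sketched as follows. This is not the real `cli_tokenize()`: the function name, its in-place splitting, and the sample input in the note below are assumptions.

```c
#include <stddef.h>

/*
 * Split `buf` in place on ':' and store token pointers in `tokens`.
 * Once a '/' is seen, the rest of the buffer is treated as regex content
 * and any later ':' is left alone. Returns the number of tokens stored.
 */
static size_t sketch_tokenize(char *buf, char **tokens, size_t max_tokens)
{
    size_t count = 0;
    int in_regex = 0;
    char *p;

    if (buf == NULL || tokens == NULL || max_tokens == 0) {
        return 0;
    }

    tokens[count++] = buf;
    for (p = buf; *p != '\0'; p++) {
        if (*p == '/') {
            in_regex = 1; /* everything from here on is regex content */
        } else if (*p == ':' && !in_regex) {
            if (count == max_tokens) {
                break; /* out of slots: remainder stays in the last token */
            }
            *p = '\0';
            tokens[count++] = p + 1;
        }
    }

    return count;
}
```

For an illustrative input such as `10:0/a/b:c/i`, the first ':' splits off `10`, while the ':' after the first '/' stays inside the second token.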
Micah Snyder
350a2faf67 DB read logic cleanup, fix some warnings
The logic for parsing a logical subsignature isn't clearly identified
and has been, perhaps mistakenly or out of convenience, used when
parsing NDB signatures in addition to LDB subsignatures. What this means
is that you can technically use a PCRE subsignature in an NDB file and
clam won't complain about it. It won't work however, because a PCRE
subsignature requires another matching subsignature to trigger it, but
it will parse. The same is likely true for byte-compare subsignatures.

This commit restructures that logic a bit so subsignature parsing has
its own function and is more organized.
I also renamed the functions a little bit and added lots of comments.

I fixed a few minor warnings relating to format string characters.

The change in str.c:cli_ldbtokenize is to prevent a buffer under-read if
you were to use the function on the start of a buffer, as is now done in
this commit.
2022-02-23 12:28:31 -07:00
mko-x
a21cc6dcd7
Add explicit log level parameter to application logging API
* Added loglevel parameter to logg()

* Fix logg and mprintf internals with new loglevels

* Update all logg calls to set loglevel

* Update all mprintf calls to set loglevel

* Fix hidden logg calls

* Executed clam-format
2022-02-15 15:13:55 -08:00
micasnyd
140c88aa4e Bump copyright for 2022
Includes minor format corrections.
2022-01-09 14:23:25 -07:00
Micah Snyder (micasnyd)
b9ca6ea103 Update copyright dates for 2021
Also fixes up clang-format.
2021-03-19 15:12:26 -07:00
Jonas Zaddach (jzaddach)
d5a733ef90 XLM (Excel 4.0) macro detection and extraction
XLM is a macro language in Excel that was used before VBA (before
1996). It is still parsed and executed by modern Excel and is gaining
popularity with malware authors.

This patch adds rudimentary support for detecting and extracting
Excel 4.0 (XLM) macros.

The code is based on Didier Stevens's plugin_biff for oletools.py.
2020-04-29 14:19:41 -07:00
Micah Snyder
206dbaefe8 Update copyright dates for 2020 2020-01-03 15:44:07 -05:00
Andrew
ccc24eb307 Add NULL check in cli_isnumber 2019-11-09 08:26:34 -08:00
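
A minimal sketch of a digits-only check with the NULL guard that commit ccc24eb307 describes. The name `sketch_isnumber` is hypothetical, and accepting the empty string as a number is an assumption of this sketch, not a statement about `cli_isnumber`.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `str` contains only decimal digits, 0 otherwise or if NULL. */
static int sketch_isnumber(const char *str)
{
    if (str == NULL) {
        return 0; /* guard: never dereference a NULL string */
    }

    while (*str != '\0') {
        if (!isdigit((unsigned char)*str)) {
            return 0;
        }
        str++;
    }

    return 1;
}
```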
Micah Snyder
bcb4505e60 bb12370 - cli_strndup and other str* replacements must be built and exported for every OS to be used outside of libclamav on systems that don't have the original functions (e.g. strndup). This commit renames the macros to be uppercase, renames the replacement functions to be preceded by two underscores (e.g. __cli_strndup), and removes the ifdefs so that they are built regardless, because there are no ifdefs in libclamav.map. 2019-10-02 16:08:30 -04:00
Micah Snyder
ca8b4c466e Assortment of warning fixes. 2019-10-02 16:08:25 -04:00
Micah Snyder
cef54eaf8f Freshclam refresh. This update makes libcurl a hard requirement for ClamAV.
New features added to freshclam:
- Update signature definitions over HTTPS.
- Support for HTTP protocol v1.1 (formerly v1.0).
- New libfreshclam library with an all new API and versioning separate from libclamav (v2.0.0). This library is now built and installed alongside libclamav as a hard dependency of freshclam.
- The ability to opt-in and opt-out of standard and optional official ClamAV databases (ExtraDatabase, ExcludeDatabase)
- The option to specify the protocol and port number of official and private mirror servers.
- Support for additional types of proxy servers beyond plain HTTP (SOCKS 4, SOCKS 5).

Features removed from freshclam:
- Mirror management (mirrors.dat) file. This feature is no longer needed as official signature databases are distributed using a paid content delivery network (Cloudflare).

This commit also adds the following features for Windows users:
- The clamsubmit tool.
- The json-c library dependency, which will enable the --gen-json option in clamscan.
- Third party libraries under the win32/3rdparty directory have been removed. Developers will need to build the libraries separately from ClamAV and provide the headers and lib/dll library files the same way they do for OpenSSL. This includes libxml2, pthread-win32, bzip2, zlib, pcre2 as well as new dependencies: curl, json-c. Developers are encouraged to use the build tool Mussels to simplify this task.
2019-10-02 16:08:22 -04:00
Micah Snyder
155eaaad8b bb12284 - Fix to prevent path traversal when using cli_genfname() to generate filenames that may retain path and filename information. Changed scanrar so that it will no longer retain path information for extracted files. 2019-10-02 16:08:19 -04:00
Micah Snyder
a8ca96687a Clean up of PDF object finding logic. Changes include recording object sizes as objects are found, identifying object streams in the object parsing section instead of the PDF parsing section, and limiting of stream and other object parsing to the size of the object instead of the size of the PDF. It is also easier to read and includes more inline documentation. 2019-10-02 16:08:19 -04:00
Mickey Sola
1feebda93b fuzz - 12260 - fixing undefined shift issue when handling javascript escape sequences during hex to int conversion 2019-10-02 16:08:19 -04:00
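
The log does not say how commit 1feebda93b fixed the undefined shift. One common way to avoid it, shown here only as a sketch with hypothetical names, is to accumulate hex digits in an unsigned integer so that every left shift is well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Map one hex character to 0..15, or -1 if it is not a hex digit. */
static int sketch_hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Convert exactly four hex characters (as in a "\uXXXX" escape) to a
 * 16-bit value. Returns 0 on success, -1 on NULL input or a bad digit.
 * A short string fails safely: its NUL terminator is not a hex digit.
 */
static int sketch_hex4_to_u16(const char *hex, uint16_t *out)
{
    uint32_t value = 0; /* unsigned accumulator: shifts cannot overflow into UB */
    int i;

    if (hex == NULL || out == NULL) {
        return -1;
    }

    for (i = 0; i < 4; i++) {
        int digit = sketch_hex_value(hex[i]);
        if (digit < 0) {
            return -1;
        }
        value = (value << 4) | (uint32_t)digit;
    }

    *out = (uint16_t)value;
    return 0;
}
```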
Micah Snyder
52cddcbcfd Updating and cleaning up copyright notices. 2019-10-02 16:08:18 -04:00
Micah Snyder
b3e82e5e61 Replacing libclamav/cltypes.h with clamav-types.h.in, which generates a header clamav-types.h that we install alongside clamav.h. 2019-10-02 16:08:17 -04:00
Micah Snyder
72fd33c8b2 clang-format'd using new .clang-format rules. 2019-10-02 16:08:16 -04:00
Micah Snyder
38fe8b69a0 Added .clang-format style rules, clam-format script to automate formatting of ClamAV code, and preparing select files so that clang-format does not alter carefully formatted sections. 2019-10-02 16:08:16 -04:00
Micah Snyder (micasnyd)
fef94048c8 bb12220: Converting strnlen() calls to cli_strnlen() for systems such as Solaris 10 where strnlen() is not available. Adding #else clause to cli_get_filepath_from_filedesc() for platforms where we have not implemented a mechanism to determine the filename from the file descriptor. 2018-12-02 23:07:07 -05:00
Mickey Sola
6ad41ab25f bcomp - fixing case where automatic detection would fail against little endian hex values; removing code for little endian decimal support; fixing some clang warnings; fixes for hexadecimal detection in sli_strnto functions; updating documentation 2018-12-02 23:07:04 -05:00
Mickey Sola
9e408e7658 bb4007 - adding pcre byte sequence comparison functions 2018-12-02 23:07:03 -05:00
Micah Snyder
d39cb6581f Updating libclamunrar from legacy C implementation to modern unrar 5.6.5. API changes and supporting changes included to pass the filepath of the scanned file into libclamav through the cli_ctx structure, required by the unrar library to open archives. The filename argument may be optional for the scandesc scanning variant, but libclamav will make a best effort to identify the filename from the file descriptor if it was not provided. In addition, included the ability to prefix temp file and directory names with file basenames. 2018-12-02 23:06:59 -05:00
Micah Snyder (micasnyd)
89d5207b31 Added new pdf object stream parsing capability. 2018-12-02 23:06:58 -05:00
Micah Snyder
f842e965fe Replacing strntol with strntoul to ensure proper (un)signedness when parsing numbers from PDFs. 2018-07-30 09:16:23 -04:00
Micah Snyder
bf6e777fa7 bb12133: Wrapping cli_strntol to provide easy error detection. Applying cli_strntol_wrap with error checking. Adding logic to identify when a parsing error is in fact a new revision of the PDF. 2018-07-30 09:16:22 -04:00
Micah Snyder
53cbdee38a bb12133: Implementing cli_strntol based on gnu gcc's strtol implementation with modifications to limit string buffer length for non-null terminated strings. Using cli_strntol in pdf.c for added safety. 2018-07-30 09:16:22 -04:00
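
The idea behind the two bb12133 commits, a length-limited number parser with explicit error reporting, can be sketched as below. This is a much simplified, decimal-only illustration and not the real `cli_strntol` or `cli_strntol_wrap`; the name, signature, and return convention are all assumptions.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Parse a non-negative decimal number from at most `len` bytes of `buf`.
 * Never reads past buf[len - 1], so `buf` need not be NUL-terminated.
 * Returns 0 and stores the value in *out on success.
 * Returns -1 if no digits were found or the value would overflow a long.
 * If `consumed` is not NULL it receives the number of bytes parsed.
 */
static int sketch_strntol_dec(const char *buf, size_t len, long *out, size_t *consumed)
{
    size_t i   = 0;
    long value = 0;

    if (buf == NULL || out == NULL) {
        return -1;
    }

    while (i < len && buf[i] >= '0' && buf[i] <= '9') {
        int digit = buf[i] - '0';
        if (value > (LONG_MAX - digit) / 10) {
            return -1; /* next step would overflow */
        }
        value = value * 10 + digit;
        i++;
    }

    if (i == 0) {
        return -1; /* no digits at all */
    }

    *out = value;
    if (consumed != NULL) {
        *consumed = i;
    }
    return 0;
}
```

Returning a separate status code instead of overloading the parsed value is what makes a parse failure distinguishable from a legitimate zero.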
Micah Snyder
6289eda8e0 Eliminating AUTHORS file, and moving acknowledgements for various source code contributions to the file comment blocks for the individual files, as appropriate. 2018-03-06 17:44:05 -05:00
Mickey Sola
915614e7a6 strn - adding configuration option to force use of internal strn functions for use when crosscompiling binaries against older libraries 2017-08-31 11:20:45 -04:00
Mickey Sola
6eb84c277a str - fixing internal strndup implementation to use the internal strnlen implementation 2017-08-21 18:16:35 -04:00
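
A minimal sketch of a replacement strndup built on a replacement strnlen, as commit 6eb84c277a describes. The `sketch_` names are hypothetical and this is not the libclamav source. Relying on the internal strnlen matters on platforms where the system one is missing.

```c
#include <stdlib.h>
#include <string.h>

/* Length of `s`, looking at no more than `maxlen` bytes. */
static size_t sketch_strnlen(const char *s, size_t maxlen)
{
    size_t n = 0;

    while (n < maxlen && s[n] != '\0') {
        n++;
    }
    return n;
}

/* Copy at most `maxlen` bytes of `s` into a new NUL-terminated string. */
static char *sketch_strndup(const char *s, size_t maxlen)
{
    size_t len;
    char *copy;

    if (s == NULL) {
        return NULL;
    }

    len  = sketch_strnlen(s, maxlen); /* internal helper, not the libc one */
    copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }

    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```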
Mickey Sola
47a544dc07 m4 - rework of strndup and strnlen function absence handling 2017-08-21 18:16:28 -04:00
Mickey Sola
46a35abe56 mass update of copyright headers 2015-09-17 13:41:26 -04:00
Kevin Lin
71e1364547 moved ldb_tokenize in readdb to cli_ldbtokenize in str 2015-07-21 17:29:23 -04:00
Steven Morgan
a80453e6e9 Merge master to features/yara. 2015-05-01 18:36:48 -04:00
Kevin Lin
5f31c9b450 bb#11296 - various fixes to pdf string base64 string conversion 2015-04-14 15:53:24 -04:00
Kevin Lin
0e7442f11e forced pdf json strings to be utf-8 or base64 encoded 2015-03-02 19:06:23 -05:00
Steven Morgan
a5bde84c28 Fix for errors on YARA rules when hex constants have odd lengths. 2015-02-23 17:17:08 -05:00
Shawn Webb
60d8d2c352 Move all the crypto API to clamav.h 2014-07-01 19:38:01 -04:00
Shawn Webb
b2e7c931d0 Use OpenSSL for hashing. 2014-02-08 00:31:12 -05:00
Shawn Webb
241e7eb147 bb6258 - Add warnings when allocations fail 2013-03-01 13:51:15 -05:00
aCaB
583cd65fc4 Add support for scanning different types of iso9660 image files.
The allowed sector size is within 2048 to 2448 (2352 raw + 96 sub).
Right now the only file system supported is plain iso9660 with
optional Joliet extensions.
Additionally files with multi extents and interleaved files are not
supported.

Finally, due to the multiple possible ways to interpret the content
of a cd/dvd, I cannot guarantee that we scan the "right" files.
2011-11-14 21:46:47 +01:00
Tomasz Kojm
e5777fa389 libclamav/str.c: fix cli_isnumber() (bb#2070) 2010-06-10 16:08:15 +02:00
Tomasz Kojm
2979de20da fix some compiler warnings 2010-02-19 16:10:37 +01:00