Fix valgrind issues regarding:
- Unclosed log file descriptor in libclamav unit test program.
Also need to disable debug logging for `iconv_cache_destroy()` for this
or else it will try to use that file descriptor after `main()` exits.
- Unclosed socket file descriptor in ClamDScan when doing `ping()`
function.
CLAM-2872
ClamAV will not function when using a FIPS-enabled OpenSSL 3.x.
This is because ClamAV uses MD5 and SHA1 algorithms for a variety of
purposes including matching for malware detection, matching to prevent
false positives on known-clean files, and for verification of MD5-based
RSA digital signatures for determining CVD (signature database archive)
authenticity.
Interestingly, FIPS had been intentionally bypassed when creating hashes
based whole buffers and whole files (by descriptor or `FILE`-pointer):
78d4a9985a
Note: this bypassed FIPS the 1.x way with:
`EVP_MD_CTX_set_flags(ctx, EVP_MD_CTX_FLAG_NON_FIPS_ALLOW);`
It was NOT disabled when using `cl_hash_init()` / `cl_update_hash()` /
`cl_finish_hash()`. That likely worked by coincidence in that the hash
was already calculated most of the time. It certainly would have made
use of those functions if the hash had not been calculated prior:
78d4a9985a/libclamav/matcher.c (L743)
Regardless, bypassing FIPS entirely is not the correct solution.
The FIPS restrictions against using MD5 and SHA1 are valid, particularly
when verifying CVD digital siganatures, but also I think when using a
hash to determine if the file is known-clean (i.e. the "clean cache" and
also MD5-based and SHA1-based FP signatures).
This commit extends the work to bypass FIPS using the newer 3.x method:
`md = EVP_MD_fetch(NULL, alg, "-fips");`
It does this for the legacy `cl_hash*()` functions including
`cl_hash_init()` / `cl_update_hash()` / `cl_finish_hash()`.
It also introduces extended versions that allow the caller to choose if
they want to bypass FIPS:
- `cl_hash_data_ex()`
- `cl_hash_init_ex()`
- `cl_update_hash_ex()`
- `cl_finish_hash_ex()`
- `cl_hash_destroy_ex()`
- `cl_hash_file_fd_ex()`
See the `flags` parameter for each.
Ironically, this commit does NOT use the new functions at this time.
The rational is that ClamAV may need MD5, SHA1, and SHA-256 hashes of
the same files both for determining if the file is malware, and for
determining if the file is clean.
So instead, this commit will do a checks when:
1. Creating a new ClamAV scanning engine. If FIPS-mode enabled, it will
automatically toggle the "FIPS limits" engine option.
When loading signatures, if the engine "FIPS limits" option is enabled,
then MD5 and SHA1 FP signatures will be skipped.
2. Before verifying a CVD (e.g. also for loading, unpacking when
verification enabled).
If "FIPS limits" or FIPS-mode are enabled, then the legacy MD5-based RSA
method is disabled.
Note: This commit also refactors the interface for `cl_cvdverify_ex()`
and `cl_cvdunpack_ex()` so they take a `flags` parameters, rather than a
single `bool`. As these functions are new in this version, it does not
break the ABI.
The cache was already switched to use SHA2-256, so that's not a concern
for checking FIPS-mode / FIPS limits options.
This adds an option for `freshclam.conf` and `clamd.conf`:
FIPSCryptoHashLimits yes
And an equivalent command-line option for `clamscan` and `sigtool`:
--fips-limits
You may programmatically enable FIPS-limits for a ClamAV engine like this:
```C
cl_engine_set_num(engine, CL_ENGINE_FIPS_LIMITS, 1);
```
CLAM-2792
Every time we push a new map onto the scanning recursion context, give
it a unique object id number, which counts from zero.
Moved the location where we add metadata for each file from the
"cli_magic_scan" function over to the "recursion stack push" function.
Include a "path" as a parameter for creating a new fmap, and rename some
related variables and functions to be more intuitive.
CLAM-2796
See also: CLAM-2485, CLAM-2626
Change the clean-cache to use SHA2-256 instead of MD5.
Note that all references are changed to specify "SHA2-256" now instead
of "SHA256", for clarity. But there is no plan to add support for SHA3
algorithms at this time.
Significant code cleanup. E.g.:
- Implemented goto-done error handling.
- Used `uint8_t *` instead of `unsigned char *`.
- Use `bool` for boolean checks, rather than `int.
- Used `#defines` instead of magic numbers.
- Removed duplicate `#defines` for things like hash length.
Add new option to calculate and record additional hash types when the
"generate metadata JSON" feature is enabled:
- libclamav option: `CL_SCAN_GENERAL_STORE_EXTRA_HASHES`
- clamscan option: `--json-store-extra-hashes` (default off)
- clamd.conf option: `JsonStoreExtraHashes` (default 'no')
Renamed the sigtool option `--sha256` to `--sha2-256`.
The original option is still functional, but is deprecated.
For the "generate metadata JSON" feature, the file hash is now stored as
"sha2-256" instead of "FileMD5". If you enable the "extra hashes" option,
then it will also record "md5" and "sha1".
Deprecate and disable the internal "SHA collect" feature.
This option had been hidden behind C #ifdef checks for an option that
wasn't exposed through CMake, so it was basically unavailable anyways.
Changes to calculate file hashes when they're needed and no sooner.
For the FP feature in the matcher module, I have mimiced the
optimization in the FMAP scan routine which makes it so that it can
calculate multiple hashes in a single pass of the file.
The `HandlerType` feature stores a hash of the file in the scan ctx to
prevent retyping the exact same data more than once.
I removed that hash field and replaced it with an attribute flag that is
applied to the new recursion stack layer when retyping a file.
This also closes a minor bug that would prevent retyping a file with an
all-zero hash. :)
The work upgrading cache.c to support SHA2-256 sized hashes thanks to:
https://github.com/m-sola
CLAM-255
CLAM-1858
CLAM-1859
CLAM-1860
An infinite loop may occur when scanning some malformed ZIP files.
I introduced this issue in 96c00b6d80
with this line:
```c
// decrement coff by 1 to account for the increment at the end of the loop
coff -= 1;
```
The problem is that the function may return 0, which should
indicate that there are no more files. The result was that
`coff` would stay the same and the loop would repeat.
This issue is in 1.5 development and affects the 1.5.0 beta but
does not affect any production versions.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1534
Special thanks to Sophie0x2E for an initial fix, proposed in
https://github.com/Cisco-Talos/clamav/pull/1539
In review, I was uncomfortable with other existing code and
decided to to a more significant overhaul of the error handling
in the ZIP module.
In addition to cleanup, this commit has some functional changes:
- When parsing a central directory file header inside of
`parse_central_directory_file_header()`, it will now fail out if the
"extra length" or "comment length" fields would exceced the length of
the archive. That doesn't mean the associated local file header won't
be parsed later, but it won't use the central directory file header
to find it. Instead, the ZIP module will have to find the local file
header by searching for extra records not listed in the central directory.
This change was mostly to tidy up complex error handling.
- Add two FTM new signatures to identify split ZIP archives.
This signature identifies the first segment (first file) in a split or
spanned ZIP archive. It may also be found on a single-segment "split"
archive, depending on the ZIP archiver.
```
0:0:504b0708504b0304:ZIP (First segment split/spanned):CL_TYPE_ANY:CL_TYPE_ZIP
```
Practically speaking, this new signature makes it so ClamAV identifies
the file as a ZIP right away without having to rely on SFX_ZIP detection.
Extraction is then handled by the ZIP `cli_unzip` function rather than
extracting each with `cli_unzip_single` which handles SFX_ZIP entries.
Note: ClamAV isn't capable of finding additional files on disk to support
handling the additional segments. So it doesn't make any difference with
handling those other files.
This signature is for single-segment split/spanned archives, depending
on the ZIP archiver.
```
0:0:504b0303504b0304:ZIP (Single-segment split/spanned):CL_TYPE_ANY:CL_TYPE_ZIP
```
Like the first one, this also means we won't rely on SFX_ZIP detection
and will treat this files as regular ZIPs.
- Added a test file to verify that ClamAV can extract a single-file
"split" ZIP.
- Added a clamscan test with test files to verify that scanning a split
archive across two segments correctly extracts the properly formed zip
file entries. Sadly, we can't join the segments to extract everything.
Add X509 certificate chain based signing with PKCS7-PEM external
signatures distributed alongside CVD's in a custom .cvd.sign format.
This new signing and verification mechanism is primarily in support
of FIPS compliance.
Fixes: https://github.com/Cisco-Talos/clamav/issues/564
Add a Rust implementation for parsing, verifying, and unpacking CVD
files.
Now installs a 'certs' directory in the app config directory
(e.g. <prefix>/etc/certs). The install location is configurable.
The CMake option to configure the CVD certs directory is:
`-D CVD_CERTS_DIRECTORY=PATH`
New options to set an alternative CVD certs directory:
- Commandline for freshclam, clamd, clamscan, and sigtool is:
`--cvdcertsdir PATH`
- Env variable for freshclam, clamd, clamscan, and sigtool is:
`CVD_CERTS_DIR`
- Config option for freshclam and clamd is:
`CVDCertsDirectory PATH`
Sigtool:
- Add sign/verify commands.
- Also verify CDIFF external digital signatures when applying CDIFFs.
- Place commonly used commands at the top of --help string.
- Fix up manpage.
Freshclam:
- Will try to download .sign files to verify CVDs and CDIFFs.
- Fix an issue where making a CLD would only include the CFG file for
daily and not if patching any other database.
libclamav.so:
- Bump version to 13:0:1 (aka 12.1.0).
- Also remove libclamav.map versioning.
Resolves: https://github.com/Cisco-Talos/clamav/issues/1304
- Add two new API's to the public clamav.h header:
```c
extern cl_error_t cl_cvdverify_ex(const char *file,
const char *certs_directory);
extern cl_error_t cl_cvdunpack_ex(const char *file,
const char *dir,
bool dont_verify,
const char *certs_directory);
```
The original `cl_cvdverify` and `cl_cvdunpack` are deprecated.
- Add `cl_engine_field` enum option `CL_ENGINE_CVDCERTSDIR`.
You may set this option with `cl_engine_set_str` and get it
with `cl_engine_get_str`, to override the compiled in default
CVD certs directory.
libfreshclam.so: Bump version to 4:0:0 (aka 4.0.0).
Add sigtool sign/verify tests and test certs.
Make it so downloadFile doesn't throw a warning if the server
doesn't have the .sign file.
Replace use of md5-based FP signatures in the unit tests with
sha256-based FP signatures because the md5 implementation used
by Python may be disabled in FIPS mode.
Fixes: https://github.com/Cisco-Talos/clamav/issues/1411
CMake: Add logic to enable the Rust openssl-sys / openssl-rs crates
to build against the same OpenSSL library as is used for the C build.
The Rust unit test application must also link directly with libcrypto
and libssl.
Fix some log messages with missing new lines.
Fix missing environment variable notes in --help messages and manpages.
Deconflict CONFDIR/DATADIR/CERTSDIR variable names that are defined in
clamav-config.h.in for libclamav from variable that had the same name
for use in clamav applications that use the optparser.
The 'clamav-test' certs for the unit tests will live for 10 years.
The 'clamav-beta.crt' public cert will only live for 120 days and will
be replaced before the stable release with a production 'clamav.crt'.
As of ClamAV 0.105, libbz2 is required.
There is also no option to disable bz2 support.
This commit removes the dead code associated with the old build
option.
As of ClamAV 0.105, libxml2 is required.
There is also no option to disable PCRE support.
This commit removes the dead code associated with the old build
option.
Primarily this commit fixes an issue with the size of the parameters
passed to cli_checklimits(). The parameters were "unsigned long", which
varies in size depending on platform.
I've switched them to uint64_t / u64.
While working on this, I observed some concerning warnigns on Windows,
and some less serious ones, primarily regarding inconsistencies with
`const` parameters.
Finally, in `scanmem.c`, there is a warning regarding use of `wchar_t *`
with `GetModuleFileNameEx()` instead of `GetModuleFileNameExW()`.
This made me realize this code assumes we're not defining `UNICODE`,
which would have such macros use the 'A' variant.
I have fixed it the best I can, although I'm still a little
uncomfortable with some of this code that uses `char` or `wchar_t`
instead of TCHAR.
I also remove the `if (GetModuleFileNameEx) {` conditional, because this
macro/function will always be defined. The original code was checking a
function pointer, and so this was a bug when integrating into ClamAV.
Regarding the changes to `rijndael.c`, I found that this module assumes
`unsigned long` == 32bits. It does not.
I have corrected it to use `uint32_t`.
The cli_max_malloc, cli_max_calloc, and cli_max_realloc functions
provide a way to protect against allocating too much memory
when the size of the allocation is derived from the untrusted input.
Specifically, we worry about values in the file being scanned being
manipulated to exhaust the RAM and crash the application.
There is no need to check the limits if the size of the allocation
is fixed, or if the size of the allocation is necessary for signature
loading, or the general operation of the applications.
E.g. checking the max-allocation limit for the size of a hash, or
for the size of the scan recursion stack, is a complete waste of
time.
Although we significantly increased the max-allocation limit in
a recent release, it is best not to check an allocation if the
allocation will be safe. It would be a waste of time.
I am also hopeful that if we can reduce the number allocations
that require a limit-check to those that require it for the safe
scan of a file, then eventually we can store the limit in the scan-
context, and make it configurable.
We have some special functions to wrap malloc, calloc, and realloc to
make sure we don't allocate more than some limit, similar to the
max-filesize and max-scansize limits. Our wrappers are really only
needed when allocating memory for scans based on untrusted user input,
where a scan file could have bytes that claim you need to allocate
some ridiculous amount of memory. Right now they're named:
- cli_malloc
- cli_calloc
- cli_realloc
- cli_realloc2
... and these names do not convey their purpose
This commit renames them to:
- cli_max_malloc
- cli_max_calloc
- cli_max_realloc
- cli_max_realloc2
The realloc ones also have an additional feature in that they will not
free your pointer if you try to realloc to 0 bytes. Freeing the memory
is undefined by the C spec, and only done with some realloc
implementations, so this stabilizes on the behavior of not doing that,
which should prevent accidental double-free's.
So for the case where you may want to realloc and do not need to have a
maximum, this commit adds the following functions:
- cli_safer_realloc
- cli_safer_realloc2
These are used for the MPOOL_REALLOC and MPOOL_REALLOC2 macros when
MPOOL is disabled (e.g. because mmap-support is not found), so as to
match the behavior in the mpool_realloc/2 functions that do not make use
of the allocation-limit.
There are a large number of allocations for fix sized buffers using the
`cli_malloc` and `cli_calloc` calls that check if the requested size is
larger than our allocation threshold for allocations based on untrusted
input. These allocations will *always* be higher than the threshold, so
the extra stack frame and check for these calls is a waste of CPU.
This commit replaces needless calls with A -> B:
- cli_malloc -> malloc
- cli_calloc -> calloc
- CLI_MALLOC -> MALLOC
- CLI_CALLOC -> CALLOC
I also noticed that our MPOOL_MALLOC / MPOOL_CALLOC are not limited by
the max-allocation threshold, when MMAP is found/enabled. But the
alternative was set to cli_malloc / cli_calloc when disabled. I changed
those as well.
I didn't change the cli_realloc/2 calls because our version of realloc
not only implements a threshold but also stabilizes the undefined
behavior in realloc to protect against accidental double-free's.
It may be worth implementing a cli_realloc that doesn't have the
threshold built-in, however, so as to allow reallocaitons for things
like buffers for loading signatures, which aren't subject to the same
concern as allocations for scanning possible malware.
There was one case in mbox.c where I changed MALLOC -> CLI_MALLOC,
because it appears to be allocating based on untrusted input.
Also fixed a number of conditions where magic_scan() critical errors may
be ignored.
To ensure that the scan truly aborts for signature matches (not in
allmatch mode) and for timeouts, the `ctx->abort` option is now set in
these two conditions, and checked in several spots in magic_scan().
Additionally, I've consolidated some of the "scan must halt" type of
checks (mostly large switch statements) into a function so that we can
use the exact same logic in a few places in magic_scan().
I've also fixed a few minor warnings and code format issues.
Some basic testing is needed for the new cl_cvdunpack() API, so this
commit adds basic unit tests for that.
For reasons unknown, a number of cl_* API's have stubs for unit tests
that weren't filled out. The CVD load/verify ones in particular
required access to a signed CVD. We actually ship a very basic signed
CVD with the databases now, so I added tests for those while I was at it.
* Added loglevel parameter to logg()
* Fix logg and mprintf internals with new loglevels
* Update all logg calls to set loglevel
* Update all mprintf calls to set loglevel
* Fix hidden logg calls
* Executed clam-format
The fmap module provides a mechanism for creating a mapping into an
existing map at an offset and length that's used when a file is found
with an uncompressed archive or when embedded files are found with
embedded file type recognition in scanraw(). This is the
"fmap_duplicate()" function. Duplicate fmaps just reference the original
fmap's 'data' or file handle/descriptor while allowing the caller to
treat it like a new map using offsets and lengths that don't account for
the original/actual file dimensions.
fmap's keep track of this with m->nested_offset & m->real_len, which
admittedly have confusing names. I found incorrect uses of these in a
handful of locations. Notably:
- In cli_magic_scan_nested_fmap_type().
The force-to-disk feature would have been checking incorrect sizes and
may have written incorrect offsets for duplicate fmaps.
- In XDP parser.
- A bunch of places from the previous commit when making dupe maps.
This commit fixes those and adds lots of documentation to the fmap.h API
to try to prevent confusion in the future.
nested_offset should never be referenced outside of fmap.c/h.
The fmap_* functions for accessing or reading map data have two
implementations, mem_* or handle_*, depending the data source.
I found issues with some of these so I made a unit test that covers each
of the functions I'm concerned about for both types of data sources and
for both original fmaps and nested/duplicate fmaps.
With the tests, I found and fixed issues in these fmap functions:
- handle_need_offstr(): must account for the nested_offset in dupe maps.
- handle_gets(): must account for nested_offset and use len & real_len
correctly.
- mem_need_offstr(): must account for nested_offset in dupe maps.
- mem_gets(): must account for nested_offset and use len & real_len
correctly.
Moved CDBRANGE() macro out of function definition so for better
legibility.
Fixed a few warnings.
Xcode (and perhaps some other generators?) do not like targets that have
only object files. See:
https://cmake.org/cmake/help/latest/command/add_library.html#object-libraries
And: https://cmake.org/pipermail/cmake/2016-May/063479.html
This issue manifests when using `-G Xcode` on macOS as the library
dylibs being missing when linking with other binaries.
This commit removes the object libraries for libclamav, libfreshclam,
libclamunrar_iface, libclamunrar, libclammspack, and (lib)common
because they were used by static or shared libs that didn't
themselves have any added sources.
Add getter & setter for the debug flag, so it isn't referenced by unit
tests or other code that links with libclamav. This is needed because
global variables are exported symbols on Windows.
The split test files are flagged by some AV's because they look like
broken executables. Instead of splitting the test files to prevent
detections, we should encrypt them. This commit replaces the "reassemble
testfiles" script with a basic "XOR testfiles" script that can be used
to encrypt or decrypt test files. This commit also of course then
replaces all the split files with xor'ed files.
The test and unit_tests directories were a bit of a mess, so I
reorganized them all into unit_tests with all of the test files placed
under "unit_tests/input" using subdirectories for different types of files.
On Windows, files open()'ed without the O_BINARY flag will have new-line
LF (aka \n) converted to CRLF (aka \r\n) automatically when read from or
written to. This is undesirable for all scan targets AND temp files
because it affects pattern matching and with hashing.
This commit converts a handful of instances throughout the codebase
where it appears that O_BINARY was mistakenly omitted and could result
in unexpected behavior on Windows.
Git on Windows also converts LF -> CRLF for "text" files, for editing
purposes.
This is problematic for scan files and test files that should match
verbatim.
We can prevent this issue by marking .ref test files as "binary" in the
.gitattributes file and by always opening scan files and temp files as
binary.
In this commit I've also removed the `ChangeLog merge=cl-merge` line
that was once used to reduce ChangeLog merge conflicts by using the
gnulib git-merge-changlog tool. This project now categorizes changes in
the NEWS.md.
For finer detail, git commit history is fully accessible on github.com.
An ENABLE_TESTS CMake option is provided so that users can disable
testing if they don't want it. Instructions for how to use this
included in the INSTALL.cmake.md file.
If you run `ctest`, each testcase will write out a log file to the
<build>/unit_tests directory.
As with Autotools' make check, the test files are from test/.split
and unit_tests/.split files, but for CMake these are generated at
build time instead of at test time.
On Posix systems, sets the LD_LIBRARY_PATH so that ClamAV-compiled
libraries can be loaded when running tests.
On Windows systems, CTest will identify and collect all library
dependencies and assemble a temporarily install under the
build/unit_tests directory so that the libraries can be loaded when
running tests.
The same feature is used on Windows when using CMake to install to
collect all DLL dependencies so that users don't have to install them
manually afterwards.
Each of the CTest tests are run using a custom wrapper around Python's
unittest framework, which is also responsible for finding and inserting
valgrind into the valgrind tests on Posix systems.
Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by
default, if Valgrind can be found. There's no need to set VG=1.
CTest's memcheck module is NOT supported, because we use Python to
orchestrate our tests.
Added a bunch of Windows compatibility changes to the unit tests.
These were primarily changing / to PATHSEP and making adjustments
to use Win32 C headers and ifdef out the POSIX ones which aren't
available on Windows. Also disabled a bunch of tests on Win32
that don't work on Windows, notably the mmap ones and FD-passing
(i.e. FILEDES) ones.
Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate
warnings on Windows where json.h is included after inttypes.h because
json-c's inttypes replacement relies on it.
This is a it of a hack and may be removed if json-c fixes their
inttypes header stuff in the future.
Add preprocessor definitions on Windows to disable MSVC warnings about
CRT secure and nonstandard functions. While there may be a better
solution, this is needed to be able to see other more serious warnings.
Add missing file comment block and copyright statement for clamsubmit.c.
Also change json-c/json.h include filename to json.h in clamsubmit.c.
The directory name is not required.
Changed the hash table data integer type from long, which is poorly
defined, to size_t -- which is capable of storing a pointer. Fixed a
bunch of casts regarding this variable to eliminate warnings.
Fixed two bugs causing utf8 encoding unit tests to fail on Windows:
- The in_size variable should be the number of bytes, not the character
count. This was was causing the SHIFT_JIS (japanese codepage) to UTF8
transcoding test to only transcode half the bytes.
- It turns out that the MultiByteToWideChar() API can't transcode
UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer
and flip the bytes on each uint16_t. This but was causing the UTF16-BE
to UTF8 tests to fail.
I also split up the utf8 transcoding tests into separate tests so I
could see all of the failures instead of just the first one.
Added a flags parameter to the unit test function to open testfiles
because it turns out that on Windows if a file contains the \r\n it will
replace it with just \n if you opened the file as a text file instead of
as binary. However, if we open the CBC files as binary, then a bunch of
bytecode tests fail. So I've changed the tests to open the CBC files in
the bytecode tests as text files and open all other files as binary.
Ported the feature tests from shell scripts to Python using a modified
version of our QA test-framework, which is largely compatible and will
allow us to migrate some QA tests into this repo. I'd like to add GitHub
Actions pipelines in the future so that all public PR's get some testing
before anyone has to manually review them.
The clamd --log option was missing from the help string, though it
definitely works. I've added it in this commit.
It appears that clamd.c was never clang-format'd, so this commit also
reformats clamd.c.
Some of the check_clamd tests expected the path returned by clamd to
match character for character with original path sent to clamd. However,
as we now evaluate real paths before a scan, the path returned by clamd
isn't going to match the relative (and possibly symlink-ridden) path
passed to clamdscan. I fixed this test by changing the test to search
for the basename: <signature> FOUND within the response instead of
matching the exact path.
Autotools: Link check_clamd with libclamav so we can use our utility
functions in check_clamd.c.
The fmap_duplicate function is used create a new fmap with a view into
an existing fmap. When the new view is a different size than the old
fmap, a new hash must be calculated for the duplicate fmap. However,
when the duplicated fmap is the same size as the original fmap, the hash
will be the same and there's no point recalculating.
The issue is apparent when scanning large EXE files because the hash was
being calculated at the beginning and end of the scan.
Digging into this issue revealed that hash calculations for fmaps were
also being performed at the wrong place. For scans of maps we use
fmap_duplicate() early in the process to apply the name API argument to
the duplicate fmap. Fixing the logic so we doing recalculate the hash
revealed that we never calculated hashes for fmap's created from buffers
in the first place, so that also had to be fixed be relocating where the
hash is calculated.
I also found that fmap_duplicate()'s offset argument used an off_t,
though it and all caller offsets are not allowed to be negative. This
was a bit of tangent to fix a bunch of off_t variables and paramters
that should've been size_t.
Added a couple unit tests to verify that making duplicate fmaps, and
duplicate-duplicate fmaps works as expected after the change.
Changed CLI_ISCONTAINED() and CLI_ISCONTAINED2() macros to cast to
size_t, because pointers and buffer sizes may not be negative, and these
two macros do not rely on substraction.
This patch adds experimental-quality CMake build tooling.
The libmspack build required a modification to use "" instead of <> for
header #includes. This will hopefully be included in the libmspack
upstream project when adding CMake build tooling to libmspack.
Removed use of libltdl when using CMake.
Flex & Bison are now required to build.
If -DMAINTAINER_MODE, then GPERF is also required, though it currently
doesn't actually do anything. TODO!
I found that the autotools build system was generating the lexer output
but not actually compiling it, instead using previously generated (and
manually renamed) lexer c source. As a consequence, changes to the .l
and .y files weren't making it into the build. To resolve this, I
removed generated flex/bison files and fixed the tooling to use the
freshly generated files. Flex and bison are now required build tools.
On Windows, this adds a dependency on the winflexbison package,
which can be obtained using Chocolatey or may be manually installed.
CMake tooling only has partial support for building with external LLVM
library, and no support for the internal LLVM (to be removed in the
future). I.e. The CMake build currently only supports the bytecode
interpreter.
Many files used include paths relative to the top source directory or
relative to the current project, rather than relative to each build
target. Modern CMake support requires including internal dependency
headers the same way you would external dependency headers (albeit
with "" instead of <>). This meant correcting all header includes to
be relative to the build targets and not relative to the workspace.
For example, ...
```c
include "../libclamav/clamav.h"
include "clamd/clamd_others.h"
```
... becomes:
```c
// libclamav
include "clamav.h"
// clamd
include "clamd_others.h"
```
Fixes header name conflicts by renaming a few of the files.
Converted the "shared" code into a static library, which depends on
libclamav. The ironically named "shared" static library provides
features common to the ClamAV apps which are not required in
libclamav itself and are not intended for use by downstream projects.
This change was required for correct modern CMake practices but was
also required to use the automake "subdir-objects" option.
This eliminates warnings when running autoreconf which, in the next
version of autoconf & automake are likely to break the build.
libclamav used to build in multiple stages where an earlier stage is
a static library containing utils required by the "shared" code.
Linking clamdscan and clamdtop with this libclamav utils static lib
allowed these two apps to function without libclamav. While this is
nice in theory, the practical gains are minimal and it complicates
the build system. As such, the autotools and CMake tooling was
simplified for improved maintainability and this feature was thrown
out. clamdtop and clamdscan now require libclamav to function.
Removed the nopthreads version of the autotools
libclamav_internal_utils static library and added pthread linking to
a couple apps that may have issues building on some platforms without
it, with the intention of removing needless complexity from the
source. Kept the regular version of libclamav_internal_utils.la
though it is no longer used anywhere but in libclamav.
Added an experimental doxygen build option which attempts to build
clamav.h and libfreshclam doxygen html docs.
The CMake build tooling also may build the example program(s), which
isn't a feature in the Autotools build system.
Changed C standard to C90+ due to inline linking issues with socket.h
when linking libfreshclam.so on Linux.
Generate common.rc for win32.
Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef
from misc.c.
Add CMake files to the automake dist, so users can try the new
CMake tooling w/out having to build from a git clone.
clamonacc changes:
- Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other
similar macros.
- Added a new clamav-clamonacc.service systemd unit file, based on
the work of ChadDevOps & Aaron Brighton.
- Added missing clamonacc man page.
Updates to clamdscan man page, add missing options.
Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use
libclamav).
Rename Windows mspack.dll to libmspack.dll so all ClamAV-built
libraries have the lib-prefix with Visual Studio as with CMake.
Also relocated codepage table from msdoc.h to entconv.h
Also adds new macros for codepages to reduce use of magic numbers when
referencing code pages elsewhere in libclamav.
Fixes an fmap leak in the bytecode switch_input() API. The
switch_input() API provides a way to read from an extracted file instead
of reading from the current file. The issue is that the current
implementation fails to free the fmap created to read from the extracted
file on cleanup or when switching back to the original fmap. In
addition, it fails to use the cli_bytecode_context_setfile() function
to restore the file_size in the context for the current fmap.
Fixes a couple fmap leaks in the unit tests.
Tests in libcheck 0.13 must have {} between START_TEST and END_TEST
else it will not compile.
Also replaced all deprecated "fail_" macros with "ck_" macros.
E.g. fail_unless() becomes ck_assert_msg()
The checks_common.h header file provided a couple of macros to
support versions older than 0.9.3. As these older versions are
no longer relevant, I've removed those compatibility macros
entirely.
New features added to freshclam:
- Update signature definitions over HTTPS.
- Support for HTTP protocol v1.1 (formerly v1.0).
- New libfreshclam library with an all new API and versioning separate from libclamav (v2.0.0). This library is now build and installed alongside libclamav as a hard dependency of freshclam.
- The ability to opt-in and opt-out of standard and optional official ClamAV databases (ExtraDatabase, ExcludeDatabase)
- The option to specify the protocol and port number of official and private mirror servers.
- Support for additional types of proxy servers beyond plain HTTP (SOCKS 4, SOCKS 5).
Features removed from freshclam:
- Mirror management (mirrors.dat) file. This feature is no longer needed as official signature databases are distributed using a paid content delivery network (Cloudflare).
This commit also adds the following features for Windows users:
- The clamsubmit tool.
- The json-c library dependency, which will enable the --gen-json option in clamscan.
- Third party libraries under the win32/3rdparty directory have been removed. Developers will need to build the libraries separately from ClamAV and provide the headers and lib/dll library files the same way they do for OpenSSL. This includes libxml2, pthread-win32, bzip2, zlib, pcre2 as well as new dependencies: curl, json-c. Developers are encouraged to use the build tool Mussels to simplify this task.
Updated libclamav documentation detailing new scan options structure.
Renamed references to 'algorithmic' detection to 'heuristic' detection. Renaming references to 'properties' to 'collect metadata'.
Renamed references to 'scan all' to 'scan all match'.
Renamed a couple of 'Hueristic.*' signature names as 'Heuristics.*' signatures (plural) to match majority of other heuristics.