If you receive an error message like `gcc: error: /opt/csw/lib/libstdc++.so: No such file or directory`, change versions with `/opt/csw/sbin/alternatives --config automake`
- ``--prefix=`pwd`/installed``: This causes `make install` to install into the specified directory, which avoids tainting any release install of ClamAV that you may have.
- `--enable-debug`: This will define *CL_DEBUG*, which mostly just enables additional print statements that are useful for debugging.
- `--enable-check`: Enables the unit tests, which can be run with `make check`.
- `--enable-coverage`: If using gcc, sets `-fprofile-arcs -ftest-coverage` so that code coverage metrics will get generated when the program is run. Note that the code inserted to store program flow data may show up in any generated flame graphs or profiling output, so if you don't care about code coverage, omit this.
- `--enable-libjson`: Enables `libjson`, which enables the `--gen-json` option. The json output contains additional metadata that might be helpful when debugging.
- `--with-systemdsystemunitdir=no`: Don't try to register `clamd` as a `systemd` service (on systems that use `systemd`). You likely don't want this development build of `clamd` to register as a service, and this eliminates the need to run `make install` with `sudo`.
- You might want to include the following flags also so that the optional functionality is enabled: `--enable-experimental --enable-clamdtop --enable-libjson --enable-milter --enable-xml --enable-pcre`. Note that this may require you to install additional development libraries.
- `--disable-llvm`: When enabled, LLVM provides the capability to just-in-time compile ClamAV bytecode signatures. Without LLVM, ClamAV uses a built-in bytecode interpreter to execute bytecode signatures. The mechanism is different, but the results are the same and the overall performance is comparable. At present, ClamAV only supports LLVM versions up to 3.6.2, which is old enough that newer distributions no longer provide it. Therefore, we recommend using the `--disable-llvm` configure option.
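Putting the flags above together, a `./configure` invocation might look like the following sketch. The `PREFIX` variable is an illustrative name, and `installed` is chosen as the directory so that it matches the `./installed/bin/clamscan` path used later in this document. The guard simply skips the step when run outside a ClamAV source tree.

```shell
# Hypothetical install location inside the source tree (not a system path).
PREFIX="$(pwd)/installed"

# Run from the top of the ClamAV source tree; skipped if ./configure is absent.
if [ -x ./configure ]; then
    # --enable-coverage can be dropped if you do not need coverage metrics.
    ./configure \
        --prefix="$PREFIX" \
        --enable-debug \
        --enable-check \
        --enable-coverage \
        --enable-libjson \
        --with-systemdsystemunitdir=no \
        --disable-llvm
fi
```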
NOTE: It is possible to build libclamav as a static library and have it statically linked into clamscan/clamd (to do this, run `./configure` with `--enable-static --disable-shared`). This is useful for using tools like `gprof` that do not support profiling code in shared objects. However, there are two drawbacks to doing this:
- `clamscan`/`clamd` will not be able to extract files from RAR archives. Based on the software license of the unrar library that ClamAV uses, the library can only be dynamically loaded. ClamAV will attempt to dlopen the unrar library shared object and will continue on without RAR extraction support if the library can't be found (or if it doesn't get built, which is what happens if you indicate that shared libraries should not be built).
- If you make changes to libclamav, you'll need to `make clean`, `make`, and `make install` again to have `clamscan`/`clamd` rebuilt using the new `libclamav.a`. The makefiles don't seem to know to rebuild `clamscan`/`clamd` when `libclamav.a` changes (TODO, fix this).
Run the following to finish building. `-j2` in the code below indicates that the build process should use 2 cores. Increase this if your machine is more powerful.
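A minimal sketch of the build step, assuming it is run from the top of an already configured source tree. `JOBS` is an illustrative variable holding the value passed to `-j`.

```shell
# Number of parallel build jobs; 2 corresponds to the -j2 mentioned above.
JOBS=2

# Only build if configure has already generated a Makefile here.
if [ -f Makefile ]; then
    # Install only if the build itself succeeded.
    make -j"$JOBS" && make install
fi
```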
If you plan to use custom rules for testing, you can invoke `clamscan` via `./installed/bin/clamscan`, specifying your custom rule files via `-d` parameters.
If you want to download the official ruleset to use with `clamscan`, do the following:
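One way to do this is sketched below using `freshclam` from the development install. This assumes that the install prefix is `./installed`, that the database directory is `share/clamav` under that prefix, and that a usable `freshclam.conf` already exists for this install. None of these details are confirmed by the text above, so adjust them to your setup.

```shell
# Assumed locations; both follow from a --prefix of ./installed.
PREFIX="$(pwd)/installed"
DB_DIR="$PREFIX/share/clamav"

# Skipped unless the development build of freshclam has been installed.
if [ -x "$PREFIX/bin/freshclam" ]; then
    mkdir -p "$DB_DIR"
    # --datadir tells freshclam where to store the downloaded databases.
    "$PREFIX/bin/freshclam" --datadir="$DB_DIR"
fi
```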
- `--statistics=pcre --statistics=bytecode`: Print execution statistics on any PCRE and bytecode rules that were evaluated.
- `--dev-performance`: Print per-file statistics regarding how long scanning took and the times spent in various scanning stages.
- `--detect-broken`: This will attempt to detect broken executable files. If an executable is determined to be broken, some functionality might not get invoked for the sample, and this could be an indication of an issue parsing the PE header or file. This flag causes those binaries to generate an alert instead of just continuing on. NOTE: This will be renamed to `--alert-broken` starting in ClamAV 0.101.
Effectively disables all file limits and maximums for scanning. This is useful if you'd like to ensure that all files in a set get scanned, and would prefer ClamAV to just run slowly or crash rather than skip a file because it encounters one of these thresholds.
- `--leave-temps --tmpdir=/tmp`: By default, ClamAV will attempt to extract embedded files that it finds, normalize certain text files before looking for matches, and unpack packed executables that it has unpacking support for. These flags tell ClamAV to write these intermediate files out to the directory specified. Usually when a file is written, the `--debug` output will mention the file name, so you can get some idea of the stage in the scanning process at which a temp file was created.
- `--dump-certs`: For signed PE files that match a rule, display information about the certificates stored within the binary. Note: `sigtool` has this functionality as well and doesn't require a rule match to view the cert data.
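As an illustration, several of the flags above can be combined in one `clamscan` invocation. The rule file `custom.ldb` and sample `sample.exe` are hypothetical names, and the `clamscan` path assumes an install prefix of `./installed`.

```shell
CLAMSCAN="./installed/bin/clamscan"
RULES="custom.ldb"    # hypothetical custom rule file
SAMPLE="sample.exe"   # hypothetical file to scan
TEMP_OUT="/tmp"       # where --leave-temps will keep intermediate files

# Skipped unless the development build of clamscan has been installed.
if [ -x "$CLAMSCAN" ]; then
    "$CLAMSCAN" -d "$RULES" \
        --debug \
        --statistics=pcre --statistics=bytecode \
        --dev-performance \
        --detect-broken \
        --leave-temps --tmpdir="$TEMP_OUT" \
        --dump-certs \
        "$SAMPLE"
fi
```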
When using ClamAV without libclamav statically linked, if you set breakpoints on libclamav functions by name, you'll need to make sure to indicate that the breakpoints should be resolved after libraries have been loaded.
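In `gdb`, this corresponds to the `set breakpoint pending on` setting. A sketch of such a session start is shown below. The function `cl_scandesc_callback` is used purely as an example breakpoint target, and the rule and sample file names are hypothetical.

```shell
CLAMSCAN="./installed/bin/clamscan"
BREAK_FN="cl_scandesc_callback"   # example libclamav function; substitute your own
RULES="custom.ldb"                # hypothetical custom rule file
SAMPLE="sample.exe"               # hypothetical file to scan

# Skipped unless both gdb and the development build of clamscan are available.
if command -v gdb >/dev/null 2>&1 && [ -x "$CLAMSCAN" ]; then
    # "pending on" lets gdb accept a breakpoint in a library that is not loaded yet.
    gdb -ex "set breakpoint pending on" \
        -ex "break $BREAK_FN" \
        --args "$CLAMSCAN" -d "$RULES" "$SAMPLE"
fi
```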
For other documentation about how to use `gdb`, check out the following resources:
You can easily hunt for memory leaks with valgrind. Check out this guide to get started: [Valgrind Quick Start](http://valgrind.org/docs/manual/quick-start.html)
If checking for leaks, be sure to run `clamscan` with samples that will hit as many of the unique code paths in the code you are testing as possible. An example invocation is as follows:
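A minimal sketch of such an invocation, assuming a hypothetical rule file `custom.ldb` and a hypothetical `samples/` directory holding the test files:

```shell
CLAMSCAN="./installed/bin/clamscan"
RULES="custom.ldb"     # hypothetical custom rule file
SAMPLES="samples/"     # hypothetical directory of test samples

# Skipped unless both valgrind and the development build of clamscan are available.
if command -v valgrind >/dev/null 2>&1 && [ -x "$CLAMSCAN" ]; then
    # --leak-check=full reports each leaked allocation with its stack trace.
    valgrind --leak-check=full "$CLAMSCAN" -d "$RULES" "$SAMPLES"
fi
```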
gcov/lcov can be used to produce a code coverage report indicating which lines of code were executed on a single run or by multiple runs of `clamscan`. NOTE: for these metrics to be collected, ClamAV needs to have been configured with the `--enable-coverage` option.
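After one or more `clamscan` runs have produced coverage data, a report can be generated roughly as follows. The output names `coverage.info` and `coverage-report` are arbitrary choices for this sketch, and the commands are assumed to be run from the top of the source tree.

```shell
COVERAGE_INFO="coverage.info"     # arbitrary name for the captured data
COVERAGE_HTML="coverage-report"   # arbitrary output directory for the HTML report

# Skipped unless lcov/genhtml are installed and coverage data (*.gcda) exists.
if command -v lcov >/dev/null 2>&1 && command -v genhtml >/dev/null 2>&1 \
    && [ -n "$(find . -name '*.gcda' 2>/dev/null | head -n 1)" ]; then
    # Collect the counters written by the instrumented binaries.
    lcov --directory . --capture --output-file "$COVERAGE_INFO"
    # Render them as a browsable HTML report.
    genhtml "$COVERAGE_INFO" --output-directory "$COVERAGE_HTML"
fi
```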
[FlameGraph](https://github.com/brendangregg/FlameGraph) is a great tool for generating interactive flame graphs based on collected profiling data. The GitHub page has thorough documentation on how to use the tool, but an overview is presented below:
First, install `perf`, which on Linux can be done via:
The `-F` parameter sets the sampling frequency, that is, how many samples per second are collected during program execution. If your scan will take a long time to run, a lower value should be sufficient. Otherwise, consider choosing a higher value (on Ubuntu 18.04, 7250 is the max frequency, but it can be increased via `/proc/sys/kernel/perf_event_max_sample_rate`).
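A sketch of a `perf record` invocation that uses `-F` is shown below. The frequency of 100 is an arbitrary illustrative value, and the rule and sample file names are hypothetical. `-g` asks `perf` to record call stacks, which flame graphs need.

```shell
CLAMSCAN="./installed/bin/clamscan"
FREQ=100              # samples per second; illustrative value only
RULES="custom.ldb"    # hypothetical custom rule file
SAMPLE="sample.exe"   # hypothetical file to scan

# Skipped unless both perf and the development build of clamscan are available.
if command -v perf >/dev/null 2>&1 && [ -x "$CLAMSCAN" ]; then
    # Writes the collected samples to perf.data in the current directory.
    perf record -F "$FREQ" -g -- "$CLAMSCAN" -d "$RULES" "$SAMPLE"
fi
```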
Check out the FlameGraph project and run the following commands to generate the flame graph:
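The pipeline below follows the usage described in the FlameGraph README. It assumes that the repository has been cloned into `./FlameGraph` and that a `perf.data` file from an earlier `perf record` run is in the current directory. The output name `clamscan.svg` is arbitrary.

```shell
FLAMEGRAPH_DIR="./FlameGraph"   # assumed checkout location of the FlameGraph repo
OUTPUT_SVG="clamscan.svg"       # arbitrary output file name

# Skipped unless perf data and the FlameGraph scripts are both present.
if [ -f perf.data ] && [ -x "$FLAMEGRAPH_DIR/flamegraph.pl" ]; then
    # Dump the samples, fold identical stacks together, then render the SVG.
    perf script \
        | "$FLAMEGRAPH_DIR/stackcollapse-perf.pl" \
        | "$FLAMEGRAPH_DIR/flamegraph.pl" > "$OUTPUT_SVG"
fi
```

The resulting SVG is interactive when opened in a web browser.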
Callgrind is a profiling tool included with `valgrind`. To use it, prepend `valgrind --tool=callgrind ` to the `clamscan` command.
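For example, with a hypothetical rule file and sample, a Callgrind run might look like this sketch:

```shell
CLAMSCAN="./installed/bin/clamscan"
RULES="custom.ldb"    # hypothetical custom rule file
SAMPLE="sample.exe"   # hypothetical file to scan

# Skipped unless both valgrind and the development build of clamscan are available.
if command -v valgrind >/dev/null 2>&1 && [ -x "$CLAMSCAN" ]; then
    # Profiling data is written to callgrind.out.<pid> in the current directory.
    valgrind --tool=callgrind "$CLAMSCAN" -d "$RULES" "$SAMPLE"
fi
```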
[kcachegrind](https://kcachegrind.github.io/html/Home.html) is a follow-on tool that will graphically present the profiling data and allow you to explore it visually, although if you don't already use KDE you'll have to install lots of extra packages to use it.
strace can be used to track the system calls that are performed and provide the number of calls / time spent in each system call. This can be done by prepending `strace -c ` to a `clamscan` command. Results will look something like this:
`strace` can also be used for cool things like system call fault injection. For instance, let's say you are curious whether the `read` bytecode API call is implemented in such a way that the underlying `read` system call could handle `EINTR` being returned (which can happen periodically). To test this, write the following bytecode rule:
Compile the rule, and make a test file to match against it. Then run it under `strace` to determine what underlying read system call is being used for the bytecode `read` function:
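A sketch of a fault-injection run is shown below, assuming the underlying call turns out to be `pread64`. The file names `read_test.cbc` and `test.exe` are hypothetical, and the `when=20+10` expression is one plausible choice: `strace` interprets it as injecting on the 20th matching call and on every 10th call after that. The `inject=` syntax requires a reasonably recent `strace` (4.16 or later).

```shell
CLAMSCAN="./installed/bin/clamscan"
RULE="read_test.cbc"   # hypothetical compiled bytecode rule
SAMPLE="test.exe"      # hypothetical test file that the rule matches
INJECT="pread64:error=EINTR:when=20+10"

# Skipped unless both strace and the development build of clamscan are available.
if command -v strace >/dev/null 2>&1 && [ -x "$CLAMSCAN" ]; then
    # Trace only pread64 and make the selected calls fail with EINTR.
    strace -e trace=pread64 -e inject="$INJECT" \
        "$CLAMSCAN" -d "$RULE" --bytecode-unsigned "$SAMPLE"
fi
```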
This command tells `strace` to skip the first 20 `pread64` calls (these appear to be used by the loader, which didn't seem to handle `EINTR` correctly) but to inject `EINTR` for every 10th call afterward. We can see the injection in action and that the system call is retried successfully:
For more documentation on using `strace` to perform system call fault injection, see [this presentation](https://archive.fosdem.org/2017/schedule/event/failing_strace/attachments/slides/1630/export/events/attachments/failing_strace/slides/1630/strace_fosdem2017_ta_slides.pdf) from FOSDEM 2017.