.TH "clamscan" "1" "December 4, 2013" "ClamAV @VERSION@" "Clam AntiVirus"
.SH "NAME"
.LP
clamscan \- scan files and directories for viruses
.SH "SYNOPSIS"
.LP
clamscan [options] [file/directory/\-]
.SH "DESCRIPTION"
.LP
clamscan is a command line anti\-virus scanner.
.SH "OPTIONS"
.LP
Most of the options are simple switches which enable or disable some features. Options marked with [=yes/no(*)] can be optionally followed by =yes/=no; if they get called without the boolean argument the scanner will assume 'yes'. The asterisk marks the default internal setting for a given option.
.TP
\fB\-h, \-\-help\fR
Print help information and exit.
.TP
\fB\-V, \-\-version\fR
Print version number and exit.
.TP
\fB\-v, \-\-verbose\fR
Be verbose.
.TP
\fB\-a, \-\-archive\-verbose\fR
Show filenames inside scanned archives.
.TP
\fB\-\-debug\fR
Display debug messages from libclamav.
.TP
\fB\-\-quiet\fR
Be quiet (only print error messages).
.TP
\fB\-\-stdout\fR
Write all messages (except for libclamav output) to the standard output (stdout).
.TP
\fB\-\-no\-summary\fR
Do not display summary at the end of scanning.
.TP
\fB\-i, \-\-infected\fR
Only print infected files.
.TP
\fB\-o, \-\-suppress\-ok\-results\fR
Skip printing OK files.
.TP
\fB\-\-bell\fR
Sound bell on virus detection.
.TP
\fB\-\-tempdir=DIRECTORY\fR
Create temporary files in DIRECTORY. Directory must be writable for the '@CLAMAV_USER@' user or unprivileged user running clamscan.
.TP
\fB\-\-leave\-temps\fR
Do not remove temporary files.
.TP
\fB\-\-gen\-json\fR
Generate JSON description of scanned file(s). JSON will be printed and also dropped to the temp directory if \-\-leave\-temps is enabled.
.TP
\fB\-d FILE/DIR, \-\-database=FILE/DIR\fR
Load virus database from FILE or load all virus database files from DIR.
.TP
\fB\-\-official\-db\-only[=yes/no(*)]\fR
Only load the official signatures published by the ClamAV project.
.TP
\fB\-l FILE, \-\-log=FILE\fR
Save scan report to FILE.
.TP
\fB\-r, \-\-recursive\fR
Scan directories recursively. All the subdirectories in the given directory will be scanned.
.TP
\fB\-z, \-\-allmatch\fR
After a match, continue scanning within the file for additional matches.
.TP
\fB\-\-cross\-fs[=yes(*)/no]\fR
Scan files and directories on other filesystems.
.TP
\fB\-\-follow\-dir\-symlinks=[0/1(*)/2]\fR
Follow directory symlinks. There are 3 options: 0: never follow directory symlinks; 1 (default): only follow directory symlinks that are passed as direct arguments to clamscan; 2: always follow directory symlinks.
.TP
\fB\-\-follow\-file\-symlinks=[0/1(*)/2]\fR
Follow file symlinks. There are 3 options: 0: never follow file symlinks; 1 (default): only follow file symlinks that are passed as direct arguments to clamscan; 2: always follow file symlinks.
.TP
\fB\-f FILE, \-\-file\-list=FILE\fR
Scan files listed line by line in FILE.
.TP
\fB\-\-remove[=yes/no(*)]\fR
Remove infected files. \fBBe careful!\fR
.TP
\fB\-\-move=DIRECTORY\fR
Move infected files into DIRECTORY. Directory must be writable for the '@CLAMAV_USER@' user or unprivileged user running clamscan.
.TP
\fB\-\-copy=DIRECTORY\fR
Copy infected files into DIRECTORY. Directory must be writable for the '@CLAMAV_USER@' user or unprivileged user running clamscan.
.TP
\fB\-\-exclude=REGEX, \-\-exclude\-dir=REGEX\fR
Don't scan file/directory names matching the regular expression. These options can be used multiple times.
.TP
\fB\-\-include=REGEX, \-\-include\-dir=REGEX\fR
Only scan file/directory names matching the regular expression. These options can be used multiple times.
.TP
\fB\-\-bytecode[=yes(*)/no]\fR
With this option enabled, ClamAV will load bytecode from the database. It is highly recommended that you keep this option turned on; otherwise you may miss detections for many new viruses.
.TP
\fB\-\-bytecode\-unsigned[=yes/no(*)]\fR
Allow loading bytecode from outside digitally signed .c[lv]d files. \fBCaution\fR: You should NEVER run bytecode signatures from untrusted sources. Doing so may result in arbitrary code execution.
.TP
\fB\-\-bytecode\-timeout=N\fR
Set bytecode timeout in milliseconds (default: 10000 = 10s).
.TP
\fB\-\-statistics[=none(*)/bytecode/pcre]\fR
Collect and print execution statistics.
.TP
\fB\-\-detect\-pua[=yes/no(*)]\fR
Detect Possibly Unwanted Applications.
.TP
\fB\-\-exclude\-pua=CATEGORY\fR
Exclude a specific PUA category. This option can be used multiple times. See https://docs.clamav.net/faq/faq-pua.html for the complete list of PUA categories.
.TP
\fB\-\-include\-pua=CATEGORY\fR
Only include a specific PUA category. This option can be used multiple times. See https://docs.clamav.net/faq/faq-pua.html for the complete list of PUA categories.
.TP
\fB\-\-detect\-structured[=yes/no(*)]\fR
Use the DLP (Data Loss Prevention) module to detect SSN and Credit Card numbers inside documents/text files.
.TP
\fB\-\-structured\-ssn\-format=X\fR
X=0: search for valid SSNs formatted as xxx-yy-zzzz (normal); X=1: search for valid SSNs formatted as xxxyyzzzz (stripped); X=2: search for both formats. Default is 0.
.TP
\fB\-\-structured\-ssn\-count=#n\fR
This option sets the lowest number of Social Security Numbers found in a file to generate a detection (default: 3).
.TP
\fB\-\-structured\-cc\-count=#n\fR
This option sets the lowest number of Credit Card numbers found in a file to generate a detection (default: 3).
.TP
\fB\-\-scan\-mail[=yes(*)/no]\fR
Scan mail files. If you turn off this option, the original files will still be scanned, but without parsing individual messages/attachments.
.TP
\fB\-\-phishing\-sigs[=yes(*)/no]\fR
Enable email signature\-based phishing detection.
.TP
\fB\-\-phishing\-scan\-urls[=yes(*)/no]\fR
Enable URL signature\-based phishing detection (Heuristics.Phishing.Email.*).
.TP
\fB\-\-heuristic\-alerts[=yes(*)/no]\fR
In some cases (e.g. complex malware, exploits in graphic files, and others), ClamAV uses special algorithms to provide accurate detection. This option can be used to control the algorithmic detection.
.TP
\fB\-\-heuristic\-scan\-precedence[=yes/no(*)]\fR
Allow heuristic match to take precedence. When enabled, if a heuristic scan (such as phishingScan) detects a possible virus/phish, it will stop the scan immediately. Recommended, saves CPU scan\-time. When disabled, a virus/phish detected by heuristic scans will be reported only at the end of a scan. If an archive contains both a heuristically detected virus/phish and real malware, the real malware will be reported. Keep this disabled if you intend to handle "Heuristics.*" viruses differently from "real" malware. If a non\-heuristically\-detected virus (signature\-based) is found first, the scan is interrupted immediately, regardless of this config option.
.TP
\fB\-\-normalize[=yes(*)/no]\fR
Normalize (compress whitespace, downcase, etc.) HTML, script, and text files. Use \-\-normalize=no for YARA compatibility.
.TP
\fB\-\-scan\-pe[=yes(*)/no]\fR
PE stands for Portable Executable \- it's an executable file format used in all 32\-bit versions of Windows operating systems. By default ClamAV performs deeper analysis of executable files and attempts to decompress popular executable packers such as UPX, Petite, and FSG. If you turn off this option, the original files will still be scanned but without additional processing.
.TP
\fB\-\-scan\-elf[=yes(*)/no]\fR
Executable and Linking Format is a standard format for UN*X executables. This option controls the ELF support. If you turn it off, the original files will still be scanned but without additional processing.
.TP
\fB\-\-scan\-ole2[=yes(*)/no]\fR
Scan Microsoft Office documents and .msi files. If you turn off this option, the original files will still be scanned but without additional processing.
.TP
\fB\-\-scan\-pdf[=yes(*)/no]\fR
Scan within PDF files. If you turn off this option, the original files will still be scanned, but without decoding and additional processing.
.TP
\fB\-\-scan\-swf[=yes(*)/no]\fR
Scan SWF files. If you turn off this option, the original files will still be scanned but without additional processing.
.TP
\fB\-\-scan\-html[=yes(*)/no]\fR
Detect, normalize/decrypt and scan HTML files and embedded scripts. If you turn off this option, the original files will still be scanned, but without additional processing.
.TP
\fB\-\-scan\-xmldocs[=yes(*)/no]\fR
Scan XML\-based document files supported by libclamav. If you turn off this option, the original files will still be scanned, but without additional processing.
.TP
\fB\-\-scan\-hwp3[=yes(*)/no]\fR
Scan HWP3 files. If you turn off this option, the original files will still be scanned, but without additional processing.
.TP
\fB\-\-scan\-archive[=yes(*)/no]\fR
Scan archives supported by libclamav. If you turn off this option, the original files will still be scanned, but without unpacking and additional processing.
.TP
\fB\-\-alert\-broken[=yes/no(*)]\fR
Alert on broken executable files (PE & ELF).
.TP
\fB\-\-alert\-encrypted[=yes/no(*)]\fR
Alert on encrypted archives and documents (encrypted .zip, .7zip, .rar, .pdf).
.TP
\fB\-\-alert\-encrypted\-archive[=yes/no(*)]\fR
Alert on encrypted archives (encrypted .zip, .7zip, .rar).
.TP
\fB\-\-alert\-encrypted\-doc[=yes/no(*)]\fR
Alert on encrypted documents (encrypted .pdf).
.TP
\fB\-\-alert\-macros[=yes/no(*)]\fR
Alert on OLE2 files containing VBA macros (Heuristics.OLE2.ContainsMacros).
.TP
\fB\-\-alert\-exceeds\-max[=yes/no(*)]\fR
Alert on files that exceed max file size, max scan size, or max recursion limit (Heuristics.Limits.Exceeded).
.TP
\fB\-\-alert\-phishing\-ssl[=yes/no(*)]\fR
Alert on emails containing SSL mismatches in URLs (might lead to false positives!).
.TP
\fB\-\-alert\-phishing\-cloak[=yes/no(*)]\fR
Alert on emails containing cloaked URLs (might lead to some false positives).
.TP
\fB\-\-alert\-partition\-intersection[=yes/no(*)]\fR
Detect partition intersections in raw disk images using heuristics.
.TP
\fB\-\-nocerts\fR
Disable Authenticode certificate chain verification in PE files.
.TP
\fB\-\-dumpcerts\fR
Dump Authenticode certificate chain in PE files.
.TP
\fB\-\-max\-scantime=#n\fR
The maximum time to scan before giving up. The value is in milliseconds. The value of 0 disables the limit. This option protects your system against DoS attacks (default: 120000 = 120s or 2min).
.TP
\fB\-\-max\-filesize=#n\fR
Extract and scan at most #n bytes from each archive. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number. This option protects your system against DoS attacks (default: 25 MB, max: <4 GB).
.TP
\fB\-\-max\-scansize=#n\fR
Extract and scan at most #n bytes from each archive. The size of the archive plus the sum of the sizes of all files within the archive count toward the scan size. For example, a 1M uncompressed archive containing a single 1M inner file counts as 2M toward max\-scansize. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number. This option protects your system against DoS attacks (default: 100 MB, max: <4 GB).
.TP
\fB\-\-max\-files=#n\fR
Extract at most #n files from each scanned file (when this is an archive, a document or another kind of container). This option protects your system against DoS attacks (default: 10000).
.TP
\fB\-\-max\-recursion=#n\fR
Set archive recursion level limit. This option protects your system against DoS attacks (default: 17).
.TP
\fB\-\-max\-dir\-recursion=#n\fR
Maximum depth directories are scanned at (default: 15).
.TP
\fB\-\-max\-embeddedpe=#n\fR
Maximum size of a file to check for embedded PE. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number (default: 10 MB, max: <4 GB).
.TP
\fB\-\-max\-htmlnormalize=#n\fR
Maximum size of HTML file to normalize. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number (default: 10 MB, max: <4 GB).
.TP
\fB\-\-max\-htmlnotags=#n\fR
Maximum size of normalized HTML file to scan. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number (default: 2 MB, max: <4 GB).
.TP
\fB\-\-max\-scriptnormalize=#n\fR
Maximum size of script file to normalize. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number (default: 5 MB, max: <4 GB).
.TP
\fB\-\-max\-ziptypercg=#n\fR
Maximum size of a ZIP file to reanalyze for type recognition. You may pass the value in kilobytes in format xK or xk, or megabytes in format xM or xm, where x is a number (default: 1 MB, max: <4 GB).
.TP
\fB\-\-max\-partitions=#n\fR
This option sets the maximum number of partitions of a raw disk image to be scanned. This must be a positive integer (default: 50).
.TP
\fB\-\-max\-iconspe=#n\fR
This option sets the maximum number of icons within a PE to be scanned. This must be a positive integer (default: 100).
.TP
\fB\-\-max\-rechwp3=#n\fR
This option sets the maximum number of recursive calls to the HWP3 parsing function (default: 16).
.TP
\fB\-\-pcre\-match\-limit=#n\fR
Maximum calls to the PCRE match function (default: 100000).
.TP
\fB\-\-pcre\-recmatch\-limit=#n\fR
Maximum recursive calls to the PCRE match function (default: 2000).
.TP
\fB\-\-pcre\-max\-filesize=#n\fR
Maximum size of a file on which to perform PCRE subsig matching (default: 25 MB, max: <4 GB).
.TP
\fB\-\-disable\-cache\fR
Disable caching and cache checks for hash sums of scanned files.
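.LP
Several of the size limits above accept a plain byte count or a value with a K/k or M/m suffix. The following shell function is a minimal sketch of how such a value might be expanded to bytes. The name size_to_bytes is hypothetical and not part of ClamAV, and the sketch assumes that K means 1024 bytes and M means 1024*1024 bytes.
.nf

```shell
# Hypothetical helper (not part of ClamAV): expand a size argument
# such as 25M or 512k to a byte count.
# Assumption: K = 1024 bytes, M = 1024*1024 bytes.
size_to_bytes() {
    case "$1" in
        *[Kk]) echo $(( ${1%[Kk]} * 1024 )) ;;
        *[Mm]) echo $(( ${1%[Mm]} * 1024 * 1024 )) ;;
        *)     echo "$1" ;;
    esac
}
```

.fi
.LP
Under that assumption, size_to_bytes 25M prints 26214400.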
.SH "ENVIRONMENT VARIABLES"
.LP
clamscan uses the following environment variables:
.TP
\fBLD_LIBRARY_PATH\fR
May be used on startup to find the libclamunrar_iface shared library module to enable RAR archive support.
.SH "EXAMPLES"
.LP
.TP
(0) Scan a single file:
\fBclamscan file\fR
.TP
(1) Scan the current working directory:
\fBclamscan\fR
.TP
(2) Scan all files (and subdirectories) in /home:
\fBclamscan \-r /home\fR
.TP
(3) Load database from a file:
\fBclamscan \-d /tmp/newclamdb \-r /tmp\fR
.TP
(4) Scan a data stream:
\fBcat testfile | clamscan \-\fR
.TP
(5) Scan a mail spool directory:
\fBclamscan \-r /var/spool/mail\fR
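.LP
Options can be combined in one invocation. As a sketch only, the hypothetical shell function scan_tree below (not part of ClamAV) runs a recursive scan that prints only infected files, omits the summary, and saves the report to a log file. The log path and targets are placeholders supplied by the caller.
.nf

```shell
# Hypothetical wrapper (not part of ClamAV).
# Usage: scan_tree LOGFILE PATH...
scan_tree() {
    log_file=$1
    shift
    # -r: recurse, -i: print infected files only,
    # --no-summary: skip the final summary, --log: save the report.
    clamscan -r -i --no-summary --log="$log_file" "$@"
}
```

.fi
.LP
For instance, scan_tree /tmp/scan.log /home would scan /home and write the report to /tmp/scan.log.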
.SH "RETURN CODES"
.LP
0 : No virus found.
.TP
1 : Virus(es) found.
.TP
2 : Some error(s) occurred.
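.LP
A calling script can branch on these return codes. The shell function below is a minimal sketch; the name describe_status and the labels it prints are hypothetical and chosen only for illustration.
.nf

```shell
# Hypothetical helper: map a clamscan exit status to a short label,
# following the return codes listed in this section.
describe_status() {
    case "$1" in
        0) echo "clean" ;;
        1) echo "infected" ;;
        2) echo "error" ;;
        *) echo "unknown" ;;
    esac
}

# Example use (not run here):
#   clamscan -r /home; describe_status "$?"
```

.fi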
.SH "CREDITS"
Please check the full documentation for credits.
.SH "AUTHOR"
.LP
Tomasz Kojm <tkojm@clamav.net>, Kevin Lin <klin@sourcefire.com>
.SH "SEE ALSO"
.LP
clamdscan(1), freshclam(1), freshclam.conf(5)