clamav/unit_tests/clamd_test.py
Micah Snyder 2552cfd0d1 CMake: Add CTest support to match Autotools checks
An ENABLE_TESTS CMake option is provided so that users can disable
testing if they don't want it. Instructions for how to use this are
included in the INSTALL.cmake.md file.

If you run `ctest`, each testcase will write out a log file to the
<build>/unit_tests directory.

As with Autotools' make check, the test files are from test/.split
and unit_tests/.split files, but for CMake these are generated at
build time instead of at test time.

On POSIX systems, CTest sets LD_LIBRARY_PATH so that ClamAV-compiled
libraries can be loaded when running tests.
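
For illustration only, the same idea expressed in Python. This is a sketch, not the CMake logic from this commit: the helper name `env_with_library_path` and the example directories are hypothetical.

```python
import os


def env_with_library_path(lib_dirs, base_env=None):
    """Return a copy of the environment with lib_dirs prepended to LD_LIBRARY_PATH.

    Hypothetical helper: the real build sets this variable from CMake/CTest.
    """
    env = dict(os.environ if base_env is None else base_env)
    parts = [str(d) for d in lib_dirs]
    existing = env.get("LD_LIBRARY_PATH", "")
    if existing:
        # Keep whatever the caller already had, but search our build dirs first.
        parts.append(existing)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(parts)
    return env


# Example with made-up build directories:
example = env_with_library_path(["/build/libclamav", "/build/shared"],
                                {"LD_LIBRARY_PATH": "/usr/local/lib"})
print(example["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as `env=` to `subprocess.Popen` so that only the child test process sees the modified path.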

On Windows systems, CTest will identify and collect all library
dependencies and assemble a temporary install under the
build/unit_tests directory so that the libraries can be loaded when
running tests.

The same feature is used on Windows when using CMake to install to
collect all DLL dependencies so that users don't have to install them
manually afterwards.

Each of the CTest tests is run using a custom wrapper around Python's
unittest framework, which is also responsible for finding and inserting
valgrind into the valgrind tests on POSIX systems.

Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by
default, if Valgrind can be found. There's no need to set VG=1.
CTest's memcheck module is NOT supported, because we use Python to
orchestrate our tests.
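
A minimal sketch of how a Python wrapper can insert Valgrind only when it is available. The function name `build_command` and the sample arguments are assumptions made for this example; the wrapper added in this commit may be structured differently.

```python
import shutil


def build_command(program, args, valgrind_path=None, valgrind_args=()):
    """Build an argv list, prefixing it with valgrind when a path is given."""
    cmd = []
    if valgrind_path:
        cmd.append(valgrind_path)
        cmd.extend(valgrind_args)
    cmd.append(program)
    cmd.extend(args)
    return cmd


# shutil.which() returns None when valgrind is not on PATH, in which case
# the command is built without the valgrind prefix.
valgrind = shutil.which("valgrind")
print(build_command("clamd", ["-V"], valgrind, ["--error-exitcode=123"]))
```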

Added a bunch of Windows compatibility changes to the unit tests.
These were primarily changing / to PATHSEP and making adjustments
to use Win32 C headers and ifdef out the POSIX ones which aren't
available on Windows. Also disabled a bunch of tests on Win32
that don't work on Windows, notably the mmap ones and FD-passing
(i.e. FILEDES) ones.

Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate
warnings on Windows where json.h is included after inttypes.h because
json-c's inttypes replacement relies on it.
This is a bit of a hack and may be removed if json-c fixes their
inttypes header stuff in the future.

Add preprocessor definitions on Windows to disable MSVC warnings about
CRT secure and nonstandard functions. While there may be a better
solution, this is needed to be able to see other more serious warnings.

Add missing file comment block and copyright statement for clamsubmit.c.
Also change json-c/json.h include filename to json.h in clamsubmit.c.
The directory name is not required.

Changed the hash table data integer type from long, which is poorly
defined, to size_t -- which is capable of storing a pointer. Fixed a
bunch of casts regarding this variable to eliminate warnings.
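
To see why long is a poor choice here, the sizes can be inspected from Python with ctypes. This is only an illustration: on 64-bit Windows long is 4 bytes while pointers are 8, and on the common flat-memory platforms size_t matches the pointer size (the C standard itself does not promise this).

```python
import ctypes

# Sizes in bytes of the C types involved, as seen by this Python build.
sizes = {
    "long": ctypes.sizeof(ctypes.c_long),
    "size_t": ctypes.sizeof(ctypes.c_size_t),
    "void *": ctypes.sizeof(ctypes.c_void_p),
}

for name, size in sizes.items():
    print(f"{name}: {size} bytes")

# True on mainstream platforms; long may be smaller than a pointer.
print("size_t can hold a pointer:", sizes["size_t"] >= sizes["void *"])
```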

Fixed two bugs causing utf8 encoding unit tests to fail on Windows:
- The in_size variable should be the number of bytes, not the character
  count. This was causing the SHIFT_JIS (Japanese codepage) to UTF8
  transcoding test to only transcode half the bytes.
- It turns out that the MultiByteToWideChar() API can't transcode
  UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer
  and flip the bytes on each uint16_t. This bug was causing the UTF16-BE
  to UTF8 tests to fail.
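
Both fixes can be illustrated in Python. This is a sketch of the idea, not the C code from this commit; `swap_utf16_byte_order` is a hypothetical name.

```python
def swap_utf16_byte_order(data: bytes) -> bytes:
    """Flip the two bytes of every 16-bit unit (UTF16-BE <-> UTF16-LE)."""
    if len(data) % 2 != 0:
        raise ValueError("UTF-16 data must contain an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]  # low-order positions take the odd bytes
    swapped[1::2] = data[0::2]  # high-order positions take the even bytes
    return bytes(swapped)


# Byte count vs. character count: three Japanese characters occupy six
# bytes in Shift_JIS, so passing the character count as the input size
# would cover only half of the buffer.
text = "\u65e5\u672c\u8a9e"
sjis = text.encode("shift_jis")
print(len(text), len(sjis))

# Byte swap: big-endian input becomes little-endian, then decodes normally.
be = "ClamAV".encode("utf-16-be")
print(swap_utf16_byte_order(be).decode("utf-16-le"))
```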

I also split up the utf8 transcoding tests into separate tests so I
could see all of the failures instead of just the first one.

Added a flags parameter to the unit test function to open testfiles
because it turns out that on Windows if a file contains the \r\n it will
replace it with just \n if you opened the file as a text file instead of
as binary. However, if we open the CBC files as binary, then a bunch of
bytecode tests fail. So I've changed the tests to open the CBC files in
the bytecode tests as text files and open all other files as binary.
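
The text-versus-binary difference can be reproduced in Python as an analogy. Note the assumption: Python's default text mode translates \r\n to \n on every platform, whereas the C stdio behavior described above is specific to Windows. The helper name `read_both_ways` is made up for this example.

```python
import tempfile
from pathlib import Path


def read_both_ways(payload: bytes):
    """Write payload to a temp file, then read it back as text and as binary."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.txt"
        path.write_bytes(payload)
        with path.open("r", encoding="ascii") as handle:
            as_text = handle.read()    # newline translation applied
        with path.open("rb") as handle:
            as_binary = handle.read()  # bytes returned untouched
    return as_text, as_binary


as_text, as_binary = read_both_ways(b"line1\r\nline2\r\n")
print(repr(as_text))
print(repr(as_binary))
```

A test that compares file contents byte for byte needs the binary form; a parser that expects \n line endings needs the text form.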

Ported the feature tests from shell scripts to Python using a modified
version of our QA test-framework, which is largely compatible and will
allow us to migrate some QA tests into this repo. I'd like to add GitHub
Actions pipelines in the future so that all public PRs get some testing
before anyone has to manually review them.

The clamd --log option was missing from the help string, though it
definitely works. I've added it in this commit.
It appears that clamd.c was never clang-format'd, so this commit also
reformats clamd.c.

Some of the check_clamd tests expected the path returned by clamd to
match character for character with the original path sent to clamd.
However, as we now evaluate real paths before a scan, the path returned
by clamd isn't going to match the relative (and possibly symlink-ridden)
path passed to clamdscan. I fixed this test by changing it to search
for "<basename>: <signature> FOUND" within the response instead of
matching the exact path.
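
The basename check can be sketched in Python as follows. check_clamd itself is written in C, so this only illustrates the matching rule; the function name and the sample paths are hypothetical.

```python
from pathlib import Path


def found_in_response(response: str, scanned_path: str, signature: str) -> bool:
    """Return True if '<basename>: <signature> FOUND' appears in the response."""
    needle = f"{Path(scanned_path).name}: {signature} FOUND"
    return needle in response


# clamd reports the resolved path, which differs from the relative,
# symlinked path that was sent, but the basename still matches.
response = "/tmp/real/dir/clam.exe: ClamAV-Test-File.UNOFFICIAL FOUND\n"
print(found_in_response(response, "../link/clam.exe", "ClamAV-Test-File.UNOFFICIAL"))
```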

Autotools: Link check_clamd with libclamav so we can use our utility
functions in check_clamd.c.
2021-02-25 11:41:26 -08:00


# Copyright (C) 2020 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
"""
Run clamd tests.
"""
import os
from pathlib import Path
import platform
import socket
import subprocess
import shutil
import sys
import time
import unittest
import testcase
os_platform = platform.platform()
operating_system = os_platform.split('-')[0].lower()


def check_port_available(port_num: int) -> bool:
    '''
    Check if port # is available
    '''
    port_is_available = True # It's probably available...
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    location = ("127.0.0.1", port_num)
    result_of_check = sock.connect_ex(location)
    if result_of_check == 0:
        port_is_available = False # Oh nevermind! Someone was listening!
    sock.close()
    return port_is_available


class TC(testcase.TestCase):
    @classmethod
    def setUpClass(cls):
        super(TC, cls).setUpClass()

        TC.testpaths = list(TC.path_build.glob('test/clam*')) # A list of Path()'s of each of our generated test files

        TC.clamd_pid = TC.path_tmp / 'clamd-test.pid'
        TC.clamd_socket = TC.path_build / 'unit_tests' / 'clamd-test.socket' # <-- this is hard-coded into the `check_clamd` program
        TC.clamd_port_num = 3319 # <-- this is hard-coded into the `check_clamd` program
        TC.path_db = TC.path_tmp / 'database'
        TC.path_db.mkdir(parents=True)
        shutil.copy(
            str(TC.path_build / 'unit_tests' / 'clamav.hdb'),
            str(TC.path_db),
        )
        shutil.copy(
            str(TC.path_source / 'unit_tests' / 'input' / 'daily.pdb'),
            str(TC.path_db),
        )

        # Identify a TCP port we can use.
        # Presently disabled because check_clamd's port # is hardcoded.
        #found_open_port = False
        #for port_num in range(3310, 3410):
        #    if check_port_available(port_num) == True:
        #        found_open_port = True
        #        break
        #assert found_open_port == True

        # Prep a clamd.conf to use for most (if not all) of the tests.
        config = f'''
Foreground yes
PidFile {TC.clamd_pid}
DatabaseDirectory {TC.path_db}
LogFileMaxSize 0
LogTime yes
#Debug yes
LogClean yes
LogVerbose yes
ExitOnOOM yes
DetectPUA yes
ScanPDF yes
CommandReadTimeout 1
MaxQueue 800
MaxConnectionQueueLength 1024
'''
        if operating_system == 'windows':
            # Only have TCP socket option for Windows.
            config += f'''
TCPSocket {TC.clamd_port_num}
TCPAddr 127.0.0.1
'''
        else:
            # Use LocalSocket for Posix, because that's what check_clamd expects.
            config += f'''
LocalSocket {TC.clamd_socket}
TCPSocket {TC.clamd_port_num}
TCPAddr 127.0.0.1
'''

        TC.clamd_config = TC.path_tmp / 'clamd-test.conf'
        TC.clamd_config.write_text(config)

        # Check if fdpassing is supported.
        TC.has_fdpass_support = False
        with (TC.path_build / 'clamav-config.h').open('r') as clamav_config:
            if "#define HAVE_FD_PASSING 1" in clamav_config.read():
                TC.has_fdpass_support = True

    @classmethod
    def tearDownClass(cls):
        super(TC, cls).tearDownClass()

    def setUp(self):
        super(TC, self).setUp()
        self.proc = None

    def tearDown(self):
        super(TC, self).tearDown()

        # Kill clamd (if running)
        if self.proc is not None:
            try:
                self.proc.terminate()
                self.proc.wait(timeout=120)
                self.proc.stdin.close()
            except OSError as exc:
                self.log.warning(f'Unexpected exception {exc}')
                pass # ignore
            self.proc = None

        TC.clamd_pid.unlink(missing_ok=True)
        TC.clamd_socket.unlink(missing_ok=True)

        self.verify_valgrind_log()

    def start_clamd(self):
        '''
        Start clamd
        '''
        command = f'{TC.valgrind} {TC.valgrind_args} {TC.clamd} --config-file={TC.clamd_config}'
        self.log.info(f'Starting clamd: {command}')
        self.proc = subprocess.Popen(
            command.strip().split(' '),
            stdin=subprocess.PIPE,
            stdout=sys.stdout.buffer,
            stderr=sys.stdout.buffer,
        )

    def run_clamdscan(self,
                      scan_args,
                      expected_ec=0,
                      expected_out=[],
                      expected_err=[],
                      unexpected_out=[],
                      unexpected_err=[]):
        '''
        Run clamdscan in each mode
        The first scan uses ping & wait to give clamd time to start.
        '''
        # default (filepath) mode
        output = self.execute_command(f'{TC.clamdscan} --ping 5 --wait -c {TC.clamd_config} {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

        # multi mode
        output = self.execute_command(f'{TC.clamdscan} -c {TC.clamd_config} -m {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

        if TC.has_fdpass_support:
            # fdpass
            output = self.execute_command(f'{TC.clamdscan} -c {TC.clamd_config} --fdpass {scan_args}')
            assert output.ec == expected_ec
            if expected_out != [] or unexpected_out != []:
                self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
            if expected_err != [] or unexpected_err != []:
                self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

            # fdpass multi mode
            output = self.execute_command(f'{TC.clamdscan} -c {TC.clamd_config} --fdpass -m {scan_args}')
            assert output.ec == expected_ec
            if expected_out != [] or unexpected_out != []:
                self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
            if expected_err != [] or unexpected_err != []:
                self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

        # stream
        output = self.execute_command(f'{TC.clamdscan} -c {TC.clamd_config} --stream {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

        # stream multi mode
        output = self.execute_command(f'{TC.clamdscan} -c {TC.clamd_config} --stream -m {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

    def run_clamdscan_file_only(self,
                                scan_args,
                                expected_ec=0,
                                expected_out=[],
                                expected_err=[],
                                unexpected_out=[],
                                unexpected_err=[]):
        '''
        Run clamdscan in filepath mode (and filepath multi mode)
        The first scan uses ping & wait to give clamd time to start.
        '''
        # default mode
        output = self.execute_command(f'{TC.clamdscan} --ping 5 --wait -c {TC.clamd_config} {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

        # multi mode
        output = self.execute_command(f'{TC.clamdscan} -c {TC.clamd_config} -m {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

    def run_clamdscan_fdpass_only(self,
                                  scan_args,
                                  expected_ec=0,
                                  expected_out=[],
                                  expected_err=[],
                                  unexpected_out=[],
                                  unexpected_err=[]):
        '''
        Run clamdscan fdpass mode only
        Use ping & wait to give clamd time to start.
        '''
        # fdpass
        output = self.execute_command(f'{TC.clamdscan} --ping 5 --wait -c {TC.clamd_config} --fdpass {scan_args}')
        assert output.ec == expected_ec
        if expected_out != [] or unexpected_out != []:
            self.verify_output(output.out, expected=expected_out, unexpected=unexpected_out)
        if expected_err != [] or unexpected_err != []:
            self.verify_output(output.err, expected=expected_err, unexpected=unexpected_err)

    def test_clamd_00_version(self):
        '''
        Verify that clamd -V returns the version
        '''
        self.step_name('clamd version test')

        command = f'{TC.valgrind} {TC.valgrind_args} {TC.clamd} --config-file={TC.clamd_config} -V'
        output = self.execute_command(command)

        assert output.ec == 0 # success
        expected_results = [
            f'ClamAV {TC.version}',
        ]
        self.verify_output(output.out, expected=expected_results)

    def test_clamd_01_ping_pong(self):
        '''
        Verify that clamd responds to a PING command
        '''
        self.step_name('Testing clamd + clamdscan PING PONG feature')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        output = self.execute_command(f'{TC.clamdscan} -p 5 -c {TC.clamd_config}')
        assert output.ec == 0 # success
        self.verify_output(output.out, expected=['PONG'])

    def test_clamd_02_clamdscan_version(self):
        '''
        Verify that clamdscan --version returns the expected version #
        Explanation: clamdscan --version will query clamd for its version
        and print out clamd's version. If it can't connect to clamd, it'll
        throw an error saying as much and then report its own version.
        In this test, we want to check clamd's version through clamdscan.
        '''
        self.step_name('Testing clamd + clamdscan version feature')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        # First we'll ping-pong to make sure clamd is up
        # If clamd isn't up before the version test, clamdscan will return its
        # own version, which isn't really the point of the test.
        output = self.execute_command(f'{TC.clamdscan} --ping 5 -c {TC.clamd_config}')
        assert output.ec == 0 # success
        self.verify_output(output.out, expected=['PONG'])

        # Ok now it's up, let's check clamd's version via clamdscan.
        output = self.execute_command(f'{TC.clamdscan} --version -c {TC.clamd_config}')
        assert output.ec == 0 # success
        self.verify_output(output.out,
            expected=[f'ClamAV {TC.version}'], unexpected=['Could not connect to clamd'])

    def test_clamd_03_reload(self):
        '''
        In this test, it is not supposed to detect until we actually put the
        signature there and reload!
        '''
        self.step_name('Test scan before & after reload')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        (TC.path_tmp / 'reload-testfile').write_bytes(b'ClamAV-RELOAD-Test')

        self.run_clamdscan(f'{TC.path_tmp / "reload-testfile"}',
            expected_ec=0, expected_out=['reload-testfile: OK', 'Infected files: 0'])

        (TC.path_db / 'reload-test.ndb').write_text('ClamAV-RELOAD-TestFile:0:0:436c616d41562d52454c4f41442d54657374')

        output = self.execute_command(f'{TC.clamdscan} --reload -c {TC.clamd_config}')
        assert output.ec == 0 # success

        time.sleep(2) # give clamd a moment to reload before trying again
                      # with multi-threaded reloading, clamd would happily
                      # re-scan with the old engine while it reloads.

        self.run_clamdscan(f'{TC.path_tmp / "reload-testfile"}',
            expected_ec=1, expected_out=['ClamAV-RELOAD-TestFile.UNOFFICIAL FOUND', 'Infected files: 1'])

    def test_clamd_04_all_testfiles(self):
        '''
        Verify that clamd + clamdscan detect each of our <build>/test/clam* test files.
        '''
        self.step_name('Testing clamd + clamdscan scan of all `test` files')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        testfiles = ' '.join([str(testpath) for testpath in TC.testpaths])
        expected_results = [f'{testpath.name}: ClamAV-Test-File.UNOFFICIAL FOUND' for testpath in TC.testpaths]
        expected_results.append(f'Infected files: {len(TC.testpaths)}')

        self.run_clamdscan(f'{testfiles}',
            expected_ec=1, expected_out=expected_results)

    def test_clamd_05_check_clamd(self):
        '''
        Uses the check_clamd program to test clamd's socket API in various ways
        that aren't possible with clamdscan.
        '''
        self.step_name('Testing clamd + check_clamd')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        # Let's first use the ping-pong test to make sure clamd is listening.
        output = self.execute_command(f'{TC.clamdscan} -p 5 -c {TC.clamd_config}')
        assert output.ec == 0 # success
        self.verify_output(output.out, expected=['PONG'])

        # Ok now run check_clamd to have fun with clamd's API
        output = self.execute_command(f'{TC.check_clamd}')
        assert output.ec == 0 # success

        expected_results = [
            '100%', 'Failures: 0', 'Errors: 0'
        ]
        self.verify_output(output.out, expected=expected_results)

        # Let's do another ping-pong test to see if `check_clamd` killed clamd (Mu-ha-ha).
        output = self.execute_command(f'{TC.clamdscan} -p 5 -c {TC.clamd_config}')
        assert output.ec == 0 # success
        self.verify_output(output.out, expected=['PONG'])

    def test_clamd_06_HeuristicScanPrecedence_off(self):
        '''
        Verify that HeuristicScanPrecedence off works as expected (default)
        In a later test, we'll add `HeuristicScanPrecedence yes` to the config
        and retest with it on.
        With it off, we expect the scan to complete and the "real" virus to alert
        rather than the heuristic.
        '''
        self.step_name('Testing clamd + clamdscan w/ HeuristicScanPrecedence no (default)')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        self.run_clamdscan(f'{TC.path_build / "unit_tests" / "clam-phish-exe"}',
            expected_ec=1, expected_out=['ClamAV-Test-File'])

    def test_clamd_07_HeuristicScanPrecedence_on(self):
        '''
        Verify that HeuristicScanPrecedence on works as expected.
        With it on, we expect the scan to stop and raise an alert as soon as
        the phishing heuristic is detected.
        '''
        self.step_name('Testing clamd + clamdscan w/ HeuristicScanPrecedence yes')

        with TC.clamd_config.open('a') as config:
            config.write('''
HeuristicScanPrecedence yes
''')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        self.run_clamdscan(f'{TC.path_build / "unit_tests" / "clam-phish-exe"}',
            expected_ec=1, expected_out=['Heuristics.Phishing.Email.SpoofedDomain'])

    @unittest.skipIf(operating_system == 'windows', 'This test uses a shell script to test virus-action. TODO: add Windows support to this test.')
    def test_clamd_08_VirusEvent(self):
        '''
        Test that VirusEvent works
        '''
        self.step_name('Testing clamd + clamdscan w/ VirusEvent')

        with TC.clamd_config.open('a') as config:
            config.write(f'VirusEvent {TC.path_source / "unit_tests" / "virusaction-test.sh"} {TC.path_tmp} "Virus found: %v"\n')

        self.start_clamd()

        poll = self.proc.poll()
        assert poll is None # subprocess is alive if poll() returns None

        self.run_clamdscan_file_only(f'{TC.path_build / "test" / "clam.exe"}',
            expected_ec=1)#, expected_out=['Virus found: ClamAV-Test-File.UNOFFICIAL'])

        self.log.info(f'verifying log output from virusaction-test.sh: {str(TC.path_tmp / "test-clamd.log")}')
        self.verify_log(str(TC.path_tmp / 'test-clamd.log'),
            expected=['Virus found: ClamAV-Test-File.UNOFFICIAL'],
            unexpected=['VirusEvent incorrect', 'VirusName incorrect'])