/*
 *  Copyright (C) 2019-2025 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
 *
 *  EGG is an archive format created by ESTsoft used by their ALZip
 *  archiving software.
 *
 *  This software is written from scratch based solely on ESTsoft's
 *  file format documentation and on testing with EGG format archives.
 *  ESTsoft's "unEGG" module was not used in the creation of this capability
 *  in order to avoid licensing restrictions on the ESTsoft "unEGG" module.
 *
libclamav: scan-layer callback API functions
Add the following scan callbacks:
```c
cl_engine_set_scan_callback(engine, &pre_hash_callback, CL_SCAN_CALLBACK_PRE_HASH);
cl_engine_set_scan_callback(engine, &pre_scan_callback, CL_SCAN_CALLBACK_PRE_SCAN);
cl_engine_set_scan_callback(engine, &post_scan_callback, CL_SCAN_CALLBACK_POST_SCAN);
cl_engine_set_scan_callback(engine, &alert_callback, CL_SCAN_CALLBACK_ALERT);
cl_engine_set_scan_callback(engine, &file_type_callback, CL_SCAN_CALLBACK_FILE_TYPE);
```
Each callback may alter scan behavior using the following return codes:
* CL_BREAK
Scan aborted by callback (the rest of the scan is skipped).
This does not mark the file as clean or infected; it just skips the rest of the scan.
* CL_SUCCESS / CL_CLEAN
File scan will continue.
This is different than CL_VERIFIED because it does not affect prior or future alerts.
Return CL_VERIFIED instead if you want to remove prior alerts for this layer and skip
the rest of the scan for this layer.
* CL_VIRUS
This means you don't trust the file. A new alert will be added.
For CL_SCAN_CALLBACK_ALERT: Means you agree with the alert (no extra alert needed).
* CL_VERIFIED
Layer explicitly trusted by the callback and previous alerts removed FOR THIS layer.
You might want to do this if you trust the hash or verified a digital signature.
The rest of the scan will be skipped FOR THIS layer.
For contained files, this does NOT mean that the parent or adjacent layers are trusted.
Each callback is given a pointer to the current scan layer, from which
it can get previous layers, the layer's fmap, and
various attributes of the layer and of the fmap such as:
- layer recursion level
- layer object id
- layer file type
- layer attributes (was decrypted, normalized, embedded, or re-typed)
- layer last alert
- fmap name
- fmap hash (md5, sha1, or sha2-256)
- fmap data (pointer and size)
- fmap file descriptor, if any (fd, offset, size)
- fmap filepath, if any (filepath, offset, size)
To make this possible, this commit introduces a handful of new APIs to
query scan-layer details and fmap details:
- `cl_error_t cl_fmap_set_name(cl_fmap_t *map, const char *name);`
- `cl_error_t cl_fmap_get_name(cl_fmap_t *map, const char **name_out);`
- `cl_error_t cl_fmap_set_path(cl_fmap_t *map, const char *path);`
- `cl_error_t cl_fmap_get_path(cl_fmap_t *map, const char **path_out, size_t *offset_out, size_t *len_out);`
- `cl_error_t cl_fmap_get_fd(const cl_fmap_t *map, int *fd_out, size_t *offset_out, size_t *len_out);`
- `cl_error_t cl_fmap_get_size(const cl_fmap_t *map, size_t *size_out);`
- `cl_error_t cl_fmap_set_hash(const cl_fmap_t *map, const char *hash_alg, char hash);`
- `cl_error_t cl_fmap_have_hash(const cl_fmap_t *map, const char *hash_alg, bool *have_hash_out);`
- `cl_error_t cl_fmap_will_need_hash_later(const cl_fmap_t *map, const char *hash_alg);`
- `cl_error_t cl_fmap_get_hash(const cl_fmap_t *map, const char *hash_alg, const char **hash_out);`
- `cl_error_t cl_fmap_get_data(const cl_fmap_t *map, size_t offset, size_t len, const uint8_t **data_out, size_t *data_len_out);`
- `cl_error_t cl_scan_layer_get_fmap(cl_scan_layer_t *layer, cl_fmap_t **fmap_out);`
- `cl_error_t cl_scan_layer_get_parent_layer(cl_scan_layer_t *layer, cl_scan_layer_t **parent_layer_out);`
- `cl_error_t cl_scan_layer_get_type(cl_scan_layer_t *layer, const char **type_out);`
- `cl_error_t cl_scan_layer_get_recursion_level(cl_scan_layer_t *layer, uint32_t *recursion_level_out);`
- `cl_error_t cl_scan_layer_get_object_id(cl_scan_layer_t *layer, uint64_t *object_id_out);`
- `cl_error_t cl_scan_layer_get_last_alert(cl_scan_layer_t *layer, const char **alert_name_out);`
- `cl_error_t cl_scan_layer_get_attributes(cl_scan_layer_t *layer, uint32_t *attributes_out);`
This commit deprecates but does not remove the existing scan callbacks:
- `void cl_engine_set_clcb_pre_cache(struct cl_engine *engine, clcb_pre_cache callback);`
- `void cl_engine_set_clcb_file_inspection(struct cl_engine *engine, clcb_file_inspection callback);`
- `void cl_engine_set_clcb_pre_scan(struct cl_engine *engine, clcb_pre_scan callback);`
- `void cl_engine_set_clcb_post_scan(struct cl_engine *engine, clcb_post_scan callback);`
- `void cl_engine_set_clcb_virus_found(struct cl_engine *engine, clcb_virus_found callback);`
- `void cl_engine_set_clcb_hash(struct cl_engine *engine, clcb_hash callback);`
This commit also adds an interactive test program to demonstrate the callbacks.
See: `examples/ex_scan_callbacks.c`
CLAM-255
CLAM-2485
CLAM-2626
 *  Authors: Valerie Snyder
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License version 2 as
 *  published by the Free Software Foundation.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 *  MA 02110-1301, USA.
 */

#ifndef _EGG_H
#define _EGG_H

#include <clamav.h>
#include <others.h>

/**
 * @brief Metadata list node structure modeled after the ClamAV RAR metadata structure.
 *
 * Information is primarily used by the scan metadata feature.
 */
typedef struct cl_egg_metadata {
    uint64_t pack_size;
    uint64_t unpack_size;
    char* filename;
    struct cl_egg_metadata* next;
    unsigned int encrypted;
    uint32_t is_dir;
} cl_egg_metadata;

/**
 * @brief Given an fmap to an EGG archive, open a handle for extracting archive contents.
 *
 * A best effort will be made for split archives, though it is incapable of properly extracting split
 * archives since it can only accept 1 file at a time.
 *
 * @param map               fmap representing archive file.
 * @param sfx_offset        0 for a regular file, or an offset into the fmap for the EGG archive if found embedded in another file.
 * @param[out] hArchive     Handle to opened archive.
 * @param[out] comments     Array of null-terminated archive comments, if present in archive. Array will be freed by cli_egg_close()
 * @param[out] nComments    Number of archive comments in array.
 * @return cl_error_t       CL_SUCCESS if success.
 */
cl_error_t cli_egg_open(
    fmap_t* map,
    void** hArchive,
    char*** comments,
    uint32_t* nComments);

/**
 * @brief Peek at the next file in the archive, without incrementing the current file index.
 *
 * @param hArchive          An open EGG archive handle from cli_egg_open()
 * @param file_metadata     Metadata describing the next file to be extracted (or skipped).
 * @return cl_error_t       CL_SUCCESS if success.
 */
cl_error_t cli_egg_peek_file_header(
    void* hArchive,
    cl_egg_metadata* file_metadata);

/**
 * @brief Extract the next file in the archive.
 *
 * Does not return all of the metadata provided by cli_egg_peek_file_header(), so both should be used to get file information.
 * The current file index will be incremented on both success and failure.
 *
 * @param hArchive                      An open EGG archive handle from cli_egg_open()
 * @param[out] filename                 The filename of the extracted file, in UTF-8.
 * @param[out] output_buffer            A malloc'd buffer of the file contents. Must be free()'d by caller. Set to NULL on failure.
 * @param[out] output_buffer_length     Size of buffer in bytes.
 * @return cl_error_t                   CL_SUCCESS if success.
 */
cl_error_t cli_egg_extract_file(
    void* hArchive,
    const char** filename,
    const char** output_buffer,
    size_t* output_buffer_length);

/**
 * @brief Skip the next file.
 *
 * This is useful to skip things like directories, encrypted files, or files that are too large.
 *
 * @param hArchive      An open EGG archive handle from cli_egg_open()
 * @return cl_error_t   CL_SUCCESS if success.
 */
cl_error_t cli_egg_skip_file(void* hArchive);

/**
 * @brief Close the handle to the EGG archive and free the associated resources.
 *
 * @param hArchive An open EGG archive handle from cli_egg_open()
 */
void cli_egg_close(void* hArchive);

#endif // _EGG_H
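The doc comments above imply a peek-then-decide loop: look at the next entry's metadata, then either skip it or extract it. The sketch below illustrates that control flow only. The `cli_egg_*` functions are internal to libclamav and cannot be linked standalone, so every `mock_*` name, the `mock_error_t` codes, and the in-memory "archive" are hypothetical stand-ins, not the real implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins that only mimic the shape of the cli_egg_* API. */
typedef enum { MOCK_SUCCESS = 0, MOCK_END, MOCK_ERROR } mock_error_t;

typedef struct {
    const char *filename;
    uint64_t unpack_size;
    unsigned int encrypted;
    uint32_t is_dir;
} mock_metadata;

typedef struct {
    const mock_metadata *entries;
    size_t count;
    size_t index; /* current file index */
} mock_archive;

/* Peek: report the next entry without advancing the index. */
static mock_error_t mock_peek(mock_archive *a, mock_metadata *out)
{
    if (a->index >= a->count) return MOCK_END;
    *out = a->entries[a->index];
    return MOCK_SUCCESS;
}

/* Skip: advance past the next entry without producing data. */
static mock_error_t mock_skip(mock_archive *a)
{
    if (a->index >= a->count) return MOCK_END;
    a->index++;
    return MOCK_SUCCESS;
}

/* Extract: advance the index on success and on failure; caller frees *buf. */
static mock_error_t mock_extract(mock_archive *a, char **buf, size_t *len)
{
    *buf = NULL;
    *len = 0;
    if (a->index >= a->count) return MOCK_END;
    const mock_metadata *m = &a->entries[a->index];
    a->index++;
    size_t size = (size_t)m->unpack_size;
    *buf = calloc(1, size ? size : 1); /* zero-filled placeholder contents */
    if (*buf == NULL) return MOCK_ERROR;
    *len = size;
    return MOCK_SUCCESS;
}

/* Walk the archive; return how many entries were extracted. */
static size_t walk_archive(mock_archive *a, uint64_t max_size)
{
    assert(a != NULL);
    size_t extracted = 0;
    mock_metadata md;

    while (mock_peek(a, &md) == MOCK_SUCCESS) {
        /* Directories, encrypted files, and oversized files are skipped. */
        if (md.is_dir || md.encrypted || md.unpack_size > max_size) {
            if (mock_skip(a) != MOCK_SUCCESS) break;
            continue;
        }
        char *buf = NULL;
        size_t len = 0;
        if (mock_extract(a, &buf, &len) == MOCK_SUCCESS) {
            extracted++; /* a real caller would scan buf[0..len) here */
        }
        free(buf); /* free(NULL) is a no-op, so this is safe on failure */
    }
    return extracted;
}

/* Build a five-entry mock archive and walk it with a 1 KiB size limit. */
size_t demo_walk(void)
{
    static const mock_metadata entries[] = {
        {"dir/", 0, 0, 1},
        {"a.txt", 10, 0, 0},
        {"secret.bin", 64, 1, 0},
        {"huge.iso", (uint64_t)1 << 30, 0, 0},
        {"b.txt", 20, 0, 0},
    };
    mock_archive a = {entries, sizeof(entries) / sizeof(entries[0]), 0};
    return walk_archive(&a, 1024);
}
```

Calling `demo_walk()` extracts only `a.txt` and `b.txt`; the directory, the encrypted entry, and the oversized entry are skipped. Because extraction advances the index even on failure, the loop always makes progress and needs no separate recovery step.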