/*
 * Copyright (C) 2013-2024 Cisco Systems, Inc. and/or its affiliates. All rights reserved.
 * Copyright (C) 2007-2013 Sourcefire, Inc.
 *
 * Authors: Török Edvin
 *
 * Summary: Hash-table and -set data structures.
 *
 * Acknowledgements: hash32shift() is an implementation of Thomas Wang's
 *                   32-bit integer hash function:
 *                   http://www.cris.com/~Ttwang/tech/inthash.htm
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
 * MA 02110-1301, USA.
 */

#ifndef _HASHTAB_H
#define _HASHTAB_H
#include <stdio.h>
#include <stddef.h>
#include <sys/types.h>
#include <stdbool.h>

#include "clamav.h"
#include "clamav-config.h"
#include "mpool.h"
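
/*
The acknowledgement in the file header refers to hash32shift(). As a rough
illustration only, the sketch below shows Thomas Wang's 32-bit integer hash in
its commonly published form. The names wang_hash32shift and wang_slot are
hypothetical, and the power-of-two capacity used by wang_slot is an assumption;
the library's own implementation may differ.

```c
#include <stdint.h>

// Sketch of Thomas Wang's 32-bit integer hash (commonly published form).
// wang_hash32shift is a hypothetical name, not a symbol of this library.
static uint32_t wang_hash32shift(uint32_t key)
{
    key = ~key + (key << 15); // same as (key << 15) - key - 1
    key = key ^ (key >> 12);
    key = key + (key << 2);   // same as key * 5
    key = key ^ (key >> 4);
    key = key * 2057;         // same as key + (key << 3) + (key << 11)
    key = key ^ (key >> 16);
    return key;
}

// Reduce the hash to a slot index. Assumes capacity is a power of two,
// so that masking with (capacity - 1) is equivalent to a modulo.
static uint32_t wang_slot(uint32_t key, uint32_t capacity)
{
    return wang_hash32shift(key) & (capacity - 1);
}
```

Every step of the mix is invertible, so distinct 32-bit keys always map to
distinct hash values; collisions only appear once the hash is reduced to a
slot index.
*/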

/******************************************************************************/
/* A hash table.
 *
 * There are two types:
 * 1. hashtable:
 *    The key is a const char* (string)
 *    The value (data) is a buffer, stored as a size_t (instead of a void *) and an offset.
 *
 * 2. htu32 (hashtable uint32_t)
 *    The key is a uint32_t number
 *    The value (data) is a buffer, stored as either a size_t, or as a void *, and an offset.
 */
/******************************************************************************/

/* The data type is size_t rather than long, so that it is capable of storing a pointer. */
typedef size_t cli_element_data;

/* define this for debugging/profiling purposes only, NOT in production/release code */
#ifdef PROFILE_HASHTABLE

typedef struct {
    size_t calc_hash;
    size_t found;
    size_t find_req;
    size_t found_tries;
    size_t not_found;
    size_t not_found_tries;
    size_t grow_found;
    size_t grow_found_tries;
    size_t grow;
    size_t update;
    size_t update_tries;
    size_t inserts;
    size_t insert_tries;
    size_t deleted_reuse;
    size_t deleted_tries;
    size_t deletes;
    size_t clear;
    size_t hash_exhausted;
} PROFILE_STRUCT_;

#define STRUCT_PROFILE PROFILE_STRUCT_ PROFILE_STRUCT;
#else

#define STRUCT_PROFILE

#endif
struct cli_element {
    const char *key;
    cli_element_data data;
    size_t len;
};

struct cli_hashtable {
    struct cli_element *htable;
    size_t capacity;
    size_t used;
    size_t maxfill; /* 80% */

    STRUCT_PROFILE
};

/**
 * @brief Generate C source code that represents the given hash table
 *
 * Comment: We don't really use this.
 *
 * @param s
 * @param name Some string name for the elements of this generated table.
 * @return cl_error_t
 */
cl_error_t cli_hashtab_generate_c(const struct cli_hashtable *s, const char *name);

struct cli_element *cli_hashtab_find(const struct cli_hashtable *s, const char *key, const size_t len);

/**
 * @brief Create a new hashtab with a given capacity.
 *
 * @param s
 * @param capacity
 * @return cl_error_t
 */
cl_error_t cli_hashtab_init(struct cli_hashtable *s, size_t capacity);

/**
 * @brief Insert a new key with data into the hashtable.
 *
 * @param s
 * @param key
 * @param len
 * @param data
 * @return const struct cli_element*
 */
const struct cli_element *cli_hashtab_insert(struct cli_hashtable *s, const char *key, const size_t len, const cli_element_data data);

/**
 * @brief Delete a key from the hash table
 *
 * @param s
 * @param key
 * @param len
 */
void cli_hashtab_delete(struct cli_hashtable *s, const char *key, const size_t len);

/**
 * @brief Remove all keys from the hashtable
 *
 * @param s
 */
void cli_hashtab_clear(struct cli_hashtable *s);

/**
 * @brief Free the hash table
 *
 * This will clear the hash table first. You don't need to clear it manually first.
 *
 * @param s
 */
void cli_hashtab_free(struct cli_hashtable *s);

/**
 * @brief Load a hash table from a file. (unpickle!)
 *
 * @param in
 * @param s
 * @return cl_error_t
 */
cl_error_t cli_hashtab_load(FILE *in, struct cli_hashtable *s);

/**
 * @brief Write a hash table to a file. (pickle!)
 *
 * @param s
 * @param out
 * @return cl_error_t
 */
cl_error_t cli_hashtab_store(const struct cli_hashtable *s, FILE *out);

struct cli_htu32_element {
    uint32_t key;
    union {
        size_t as_size_t;
        void *as_ptr;
    } data;
};

struct cli_htu32 {
    struct cli_htu32_element *htable;
    size_t capacity;
    size_t used;
    size_t maxfill; /* 80% */

    STRUCT_PROFILE
};

#ifdef USE_MPOOL

/**
 * @brief A macro to wrap cli_htu32_init() where you can assume MEMPOOL is enabled,
 * but will replace the last parameter with NULL if MEMPOOL is not enabled.
 */
#define CLI_HTU32_INIT(A, B, C) cli_htu32_init(A, B, C)
/**
 * @brief A macro to wrap cli_htu32_insert() where you can assume MEMPOOL is enabled,
 * but will replace the last parameter with NULL if MEMPOOL is not enabled.
 */
#define CLI_HTU32_INSERT(A, B, C) cli_htu32_insert(A, B, C)
/**
 * @brief A macro to wrap cli_htu32_free() where you can assume MEMPOOL is enabled,
 * but will replace the last parameter with NULL if MEMPOOL is not enabled.
 */
#define CLI_HTU32_FREE(A, B) cli_htu32_free(A, B)

#else

/**
 * @brief A macro to wrap cli_htu32_init() where you can assume MEMPOOL is enabled,
 * but will replace the last parameter with NULL if MEMPOOL is not enabled.
 */
#define CLI_HTU32_INIT(A, B, C) cli_htu32_init(A, B, NULL)
/**
 * @brief A macro to wrap cli_htu32_insert() where you can assume MEMPOOL is enabled,
 * but will replace the last parameter with NULL if MEMPOOL is not enabled.
 */
#define CLI_HTU32_INSERT(A, B, C) cli_htu32_insert(A, B, NULL)
/**
 * @brief A macro to wrap cli_htu32_free() where you can assume MEMPOOL is enabled,
 * but will replace the last parameter with NULL if MEMPOOL is not enabled.
 */
#define CLI_HTU32_FREE(A, B) cli_htu32_free(A, NULL)

#endif

/**
 * @brief Initialize a new u32 hashtable.
 *
 * @param s
 * @param capacity
 * @param mempool If MEMPOOL not enabled, this can be NULL.
 * @return cl_error_t
 */
cl_error_t cli_htu32_init(struct cli_htu32 *s, size_t capacity, mpool_t *mempool);

/**
 * @brief Insert a new element into the u32 hashtable.
 *
 * @param s
 * @param item
 * @param mempool
 * @return cl_error_t
 */
cl_error_t cli_htu32_insert(struct cli_htu32 *s, const struct cli_htu32_element *item, mpool_t *mempool);

/**
 * @brief Free the u32 hashtable.
 *
 * This will clear the hash table first. You don't need to clear it manually first.
 *
 * @param s
 * @param mempool
 */
void cli_htu32_free(struct cli_htu32 *s, mpool_t *mempool);

/**
 * @brief Find a specific element by key in the u32 hashtable.
 *
 * @param s
 * @param key
 * @return const struct cli_htu32_element*
 */
const struct cli_htu32_element *cli_htu32_find(const struct cli_htu32 *s, uint32_t key);

/**
 * @brief Remove a specific element from the u32 hashtable.
 *
 * @param s
 * @param key
 */
void cli_htu32_delete(struct cli_htu32 *s, uint32_t key);

/**
 * @brief Remove all elements from the u32 hashtable.
 *
 * @param s
 */
void cli_htu32_clear(struct cli_htu32 *s);

/**
 * @brief Get the next element in the table, following the provided element
 *
 * Use this to enumerate the table linearly.
 *
 * @param s
 * @param current If you feed it NULL, it will give you the first element.
 * @return const struct cli_htu32_element* Will return the next element, or NULL if there are no further elements.
 */
const struct cli_htu32_element *cli_htu32_next(const struct cli_htu32 *s, const struct cli_htu32_element *current);
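
/*
The enumeration contract documented for cli_htu32_next() (pass NULL to get the
first element, receive NULL when nothing is left) can be illustrated with a
self-contained toy table. Everything named toy_* is hypothetical, and treating
a zero key as an empty slot is an assumption made for this sketch only; it is
not a description of how the library stores its elements.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical toy table: a slot counts as used when its key is non-zero.
struct toy_slot {
    uint32_t key;
    size_t value;
};

struct toy_table {
    struct toy_slot *slots;
    size_t capacity;
};

// Return the first used slot after `current`, or the first used slot overall
// when `current` is NULL. Returns NULL when no used slot remains.
static const struct toy_slot *toy_next(const struct toy_table *t, const struct toy_slot *current)
{
    size_t i = (current == NULL) ? 0 : (size_t)(current - t->slots) + 1;
    for (; i < t->capacity; i++) {
        if (t->slots[i].key != 0) {
            return &t->slots[i];
        }
    }
    return NULL;
}

// Walk the whole table with the NULL-start convention and sum the values.
static size_t toy_sum(const struct toy_table *t)
{
    size_t sum = 0;
    const struct toy_slot *it = NULL;
    while ((it = toy_next(t, it)) != NULL) {
        sum += it->value;
    }
    return sum;
}
```

The same loop shape, starting from NULL and stopping on NULL, is what a caller
of cli_htu32_next() would presumably write against the real table.
*/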

/**
 * @brief Get the number of items in the u32 hashtable.
 *
 * @param s
 * @return size_t
 */
size_t cli_htu32_numitems(struct cli_htu32 *s);

/******************************************************************************/
/* a hashtable that stores the values too */
/******************************************************************************/

struct cli_map_value {
    void *value;
    int32_t valuesize;
};

struct cli_map {
    struct cli_hashtable htab;
    union {
        struct cli_map_value *unsized_values;
        void *sized_values;
    } u;
    uint32_t nvalues;
    int32_t keysize;
    int32_t valuesize;
    int32_t last_insert;
    int32_t last_find;
};

/**
 * @brief Initialize a new map
 *
 * @param m
 * @param keysize
 * @param valuesize
 * @param capacity
 * @return cl_error_t CL_SUCCESS on success
 * @return cl_error_t CL_E* if some error occurred
 */
cl_error_t cli_map_init(struct cli_map *m, int32_t keysize, int32_t valuesize,
                        int32_t capacity);

/**
 * @brief add key to the map
 *
 * @param m
 * @param key
 * @param keysize
 * @return cl_error_t CL_SUCCESS if added.
 * @return cl_error_t CL_ECREAT if already present.
 * @return cl_error_t CL_E* if some error occurred.
 */
cl_error_t cli_map_addkey(struct cli_map *m, const void *key, int32_t keysize);

/**
 * @brief remove key from the map
 *
 * @param m
 * @param key
 * @param keysize
 * @return cl_error_t CL_SUCCESS if removed.
 * @return cl_error_t CL_EUNLINK if not present, so didn't need to be removed.
 * @return cl_error_t CL_E* if some error occurred.
 */
cl_error_t cli_map_removekey(struct cli_map *m, const void *key, int32_t keysize);

/**
 * @brief set the value for the last inserted key with map_addkey
 *
 * @param m
 * @param value
 * @param valuesize
 * @return cl_error_t CL_SUCCESS on success
 * @return cl_error_t CL_E* if some error occurred
 */
cl_error_t cli_map_setvalue(struct cli_map *m, const void *value, int32_t valuesize);

/**
 * @brief find key in the map
 *
 * @param m
 * @param key
 * @param keysize
 * @return cl_error_t CL_SUCCESS if found
 * @return cl_error_t CL_EACCES if NOT found
 * @return cl_error_t CL_E* if some error occurred.
 */
cl_error_t cli_map_find(struct cli_map *m, const void *key, int32_t keysize);

/**
 * @brief get the size of value obtained during the last map_find
 *
 * @param m
 * @return int the value size on success
 * @return int -1 on failure
 */
int cli_map_getvalue_size(struct cli_map *m);

/**
 * @brief get the value obtained during the last map_find
 *
 * @param m
 * @return void* the value on success
 * @return void* NULL on failure
 */
void *cli_map_getvalue(struct cli_map *m);

/**
 * @brief delete the map
 *
 * @param m
 */
void cli_map_delete(struct cli_map *m);

/******************************************************************************/
/* A set of unique keys (no values).
 * The keys are just uint32_t numbers. */
/******************************************************************************/

struct cli_hashset {
    uint32_t *keys;
    uint32_t *bitmap;
    mpool_t *mempool;
    uint32_t capacity;
    uint32_t mask;
    uint32_t count;
    uint32_t limit;
};

/**
 * @brief Initialize hashset.
 *
 * When capacity * (load_factor/100) is reached, the hashset is grown.
 *
 * @param hs
 * @param initial_capacity is rounded to nearest power of 2.
 * @param load_factor is between 50 and 99.
 * @return cl_error_t
 */
cl_error_t cli_hashset_init(struct cli_hashset *hs, size_t initial_capacity, uint8_t load_factor);
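
/*
The sizing rule described for cli_hashset_init() can be sketched roughly as
follows. The helper names sketch_round_pow2 and sketch_grow_limit are
hypothetical, and rounding *up* to a power of two is an assumption: the comment
above only says "nearest", so the real implementation may round differently.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: round n up to a power of two (minimum 1).
// Rounding up is an assumption made for this sketch.
static size_t sketch_round_pow2(size_t n)
{
    size_t p = 1;
    while (p < n && p <= SIZE_MAX / 2) {
        p <<= 1;
    }
    return p;
}

// Hypothetical helper: the number of stored keys at which the set would
// grow, i.e. capacity * (load_factor / 100) computed in integer arithmetic.
// Does not guard against overflow for extremely large capacities.
static size_t sketch_grow_limit(size_t capacity, uint8_t load_factor)
{
    return (capacity * (size_t)load_factor) / 100;
}
```

With a requested capacity of 1000 and a load factor of 80, this sketch would
use 1024 slots and grow once 819 keys are stored.
*/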

/**
 * @brief Initialize hashset using the clamav MEMPOOL instead of just malloc/realloc.
 *
 * Comment: not presently used in any parsers or signature loaders or anything.
 *
 * @param hs
 * @param initial_capacity is rounded to nearest power of 2.
 * @param load_factor is between 50 and 99.
 * @param mempool the mempool
 * @return cl_error_t
 */
cl_error_t cli_hashset_init_pool(struct cli_hashset *hs, size_t initial_capacity, uint8_t load_factor, mpool_t *mempool);

/**
 * @brief Add a key to the hashset.
 *
 * @param hs
 * @param key
 * @return cl_error_t
 */
cl_error_t cli_hashset_addkey(struct cli_hashset *hs, const uint32_t key);

/**
 * @brief Remove a key from the hashset
 *
 * @param hs
 * @param key
 * @return cl_error_t
 */
cl_error_t cli_hashset_removekey(struct cli_hashset *hs, const uint32_t key);

/**
 * @brief Find out if hashset contains a key
 *
 * @param hs
 * @param key
 * @return true If found
 * @return false If not found
 */
bool cli_hashset_contains(const struct cli_hashset *hs, const uint32_t key);

/**
 * @brief Destroy/deallocate a hashset.
 *
 * @param hs
 */
void cli_hashset_destroy(struct cli_hashset *hs);

/**
 * @brief Convert the hashset to an array of uint32_t's
 *
 * It will allocate a 0-length array! You are still responsible for freeing it if
 * it returns 0!
 *
 * You don't need to free anything if it returns -1.
 *
 * @param hs
 * @param [out] array Allocated array of the length returned. Caller must free it.
 * @return ssize_t The length of the array if success, or else -1 if failed.
 */
ssize_t cli_hashset_toarray(const struct cli_hashset *hs, uint32_t **array);

/**
 * @brief Initializes the set without allocating memory
 *
 * Initializes the set without allocating memory. You can do lookups on it
 * using _contains_maybe_noalloc, but you need to initialize it using _init
 * before using _addkey or _removekey.
 *
 * @param hs
 */
void cli_hashset_init_noalloc(struct cli_hashset *hs);

/**
 * @brief Find out if hashset contains a key, even if the set was never allocated
 *
 * This works like cli_hashset_contains (above), except that the hashset may
 * have been initialized only by _init_noalloc, not by _init.
 *
 * @param hs
 * @param key
 * @return true If found
 * @return false If not found
 */
bool cli_hashset_contains_maybe_noalloc(const struct cli_hashset *hs, const uint32_t key);

#endif