--- title: Monitoring category: Getting Started chapter: 1 order: 13 --- ### Health Starting with v4.8.0, Dependency-Track exposes health information according to the [MicroProfile Health] specification. Refer to the specification for details on how the exposed endpoints behave (i.e. [MicroProfile Health REST interfaces specifications]). Currently, only a single [readiness check] is included. The *database* check verifies that database connections can be acquired and used successfully. The check spans both connection pools (see [Connection Pooling]). ```json { "status": "UP", "checks": [ { "name": "database", "status": "UP", "data": { "nontx_connection_pool": "UP", "tx_connection_pool": "UP" } } ] } ``` ### Logging Logging of the API server is configured via [Logback]. All distributions of the API server ship with a [default Logback configuration]. It defines the following behavior: 1. Log messages from the embedded Jetty server to: * `$HOME/.dependency-track/server..log` 2. Log messages from Dependency-Track and the underlying Alpine framework to: * `$HOME/.dependency-track/dependency-track..log` * Standard Output 3. Log security-related messages to: * `$HOME/.dependency-track/dependency-track-audit..log` * Standard Output 4. For log files: * Create a new log file once the current one exceeds 10MB in size * Retain a history of up to 9 files per log before overwriting them 5. Output logs in a human-friendly format > For containerized deployments, `$HOME` will refer to the `/data` directory. #### Custom Logging Configuration When operating Dependency-Track in container-centric environments, where logs are typically forwarded from containers' standard output to a centralized log aggregator (e.g. ElasticSearch, OpenSearch, Splunk), it is desirable to disable logging to disk, and even change the output to a more machine-readable format. Starting with Dependency-Track v4.9.0, it is possible to provide a custom Logback configuration, and configure JSON as output format (powered by [logstash-logback-encoder]). An example configuration file for JSON logging to standard output ([`logback-json.xml`]) is included in the API server container image, and can be enabled using the `LOGGING_CONFIG_PATH` environment variable: ```shell # (Other configuration options omitted for brevity) docker run -it --rm \ -e "LOGGING_CONFIG_PATH=logback-json.xml" \ dependencytrack/apiserver:latest ``` Refer to the [logstash-logback-encoder documentation] for advanced customization details. In order to use a truly custom configuration file, it has to be mounted into the container, e.g.: ```shell # (Other configuration options omitted for brevity) docker run -it --rm \ -v "./path/to/logback-custom.xml:/etc/dtrack/logback-custom.xml:ro" \ -e "LOGGING_CONFIG_PATH=/etc/dtrack/logback-custom.xml" \ dependencytrack/apiserver:latest ``` For non-containerized distributions of the API server, a custom configuration file may be provided via the `logback.configurationFile` JVM property: ```shell # (Other configuration options omitted for brevity) java -Dlogback.configurationFile=/path/to/logback-custom.xml \ -jar dependency-track-apiserver.jar ``` ### Metrics The API server can be configured to expose system metrics via the Prometheus [text-based exposition format]. They can then be collected and visualized using tools like [Prometheus] and [Grafana]. Especially for containerized deployments where directly attaching to the underlying Java Virtual Machine (JVM) is not possible, monitoring system metrics via Prometheus is crucial for observability. > System metrics are not the same as portfolio metrics exposed via `/api/v1/metrics` REST API or > the web UI. The metrics described here are of technical nature and meant for monitoring > the application itself, not the data managed by it. If exposition of portfolio statistics via Prometheus is desired, > refer to [community integrations] like Jetstack's [dependency-track-exporter]. To enable metrics exposition, set the `alpine.metrics.enabled` property to `true` (see [Configuration]). Metrics will be exposed in the `/metrics` endpoint, and can optionally be protected using basic authentication via `alpine.metrics.auth.username` and `alpine.metrics.auth.password`. #### Exposed Metrics Exposed metrics include various general purpose system and JVM statistics (CPU and Heap usage, thread states, garbage collector activity etc.), but also some related to Dependency-Track's internal event and notification system. More metrics covering other areas of Dependency-Track will be added in future versions. ##### Database Metrics of the ORM used by the Dependency-Track API server are exposed under the `datanucleus` namespace. They provide a high-level overview of how many, and which kind of persistence operations are performed: ```yaml # HELP datanucleus_transactions_rolledback_total Total number of rolled-back transactions # TYPE datanucleus_transactions_rolledback_total counter datanucleus_transactions_rolledback_total 0.0 # HELP datanucleus_queries_failed_total Total number of queries that completed with an error # TYPE datanucleus_queries_failed_total counter datanucleus_queries_failed_total 0.0 # HELP datanucleus_query_execution_time_ms_avg Average query execution time in milliseconds # TYPE datanucleus_query_execution_time_ms_avg gauge datanucleus_query_execution_time_ms_avg 0.0 # HELP datanucleus_transaction_execution_time_ms_avg Average transaction execution time in milliseconds # TYPE datanucleus_transaction_execution_time_ms_avg gauge datanucleus_transaction_execution_time_ms_avg 77.0 # HELP datanucleus_datastore_reads_total Total number of read operations from the datastore # TYPE datanucleus_datastore_reads_total counter datanucleus_datastore_reads_total 5650.0 # HELP datanucleus_datastore_writes_total Total number of write operations to the datastore # TYPE datanucleus_datastore_writes_total counter datanucleus_datastore_writes_total 1045.0 # HELP datanucleus_object_deletes_total Total number of objects deleted from the datastore # TYPE datanucleus_object_deletes_total counter datanucleus_object_deletes_total 0.0 # HELP datanucleus_transactions_total Total number of transactions # TYPE datanucleus_transactions_total counter datanucleus_transactions_total 1107.0 # HELP datanucleus_queries_active Number of currently active queries # TYPE datanucleus_queries_active gauge datanucleus_queries_active 0.0 # HELP datanucleus_queries_executed_total Total number of executed queries # TYPE datanucleus_queries_executed_total counter datanucleus_queries_executed_total 4095.0 # HELP datanucleus_connections_active Number of currently active managed datastore connections # TYPE datanucleus_connections_active gauge datanucleus_connections_active 0.0 # HELP datanucleus_object_inserts_total Total number of objects inserted into the datastore # TYPE datanucleus_object_inserts_total counter datanucleus_object_inserts_total 6.0 # HELP datanucleus_object_fetches_total Total number of objects fetched from the datastore # TYPE datanucleus_object_fetches_total counter datanucleus_object_fetches_total 981.0 # HELP datanucleus_transactions_active_total Number of currently active transactions # TYPE datanucleus_transactions_active_total counter datanucleus_transactions_active_total 0.0 # HELP datanucleus_object_updates_total Total number of objects updated in the datastore # TYPE datanucleus_object_updates_total counter datanucleus_object_updates_total 1039.0 # HELP datanucleus_transactions_committed_total Total number of committed transactions # TYPE datanucleus_transactions_committed_total counter datanucleus_transactions_committed_total 1107.0 ``` Additionally, metrics about the database connection pools are exposed under the `hikaricp` namespace. Monitoring these metrics is essential for tweaking the connection pool configuration (see [Connection Pooling]): ```yaml # HELP hikaricp_connections Total connections # TYPE hikaricp_connections gauge hikaricp_connections{pool="non-transactional",} 13.0 hikaricp_connections{pool="transactional",} 12.0 # HELP hikaricp_connections_usage_seconds Connection usage time # TYPE hikaricp_connections_usage_seconds summary hikaricp_connections_usage_seconds_count{pool="non-transactional",} 5888.0 hikaricp_connections_usage_seconds_sum{pool="non-transactional",} 60.928 hikaricp_connections_usage_seconds_count{pool="transactional",} 138.0 hikaricp_connections_usage_seconds_sum{pool="transactional",} 0.036 # HELP hikaricp_connections_usage_seconds_max Connection usage time # TYPE hikaricp_connections_usage_seconds_max gauge hikaricp_connections_usage_seconds_max{pool="non-transactional",} 1.319 hikaricp_connections_usage_seconds_max{pool="transactional",} 0.007 # HELP hikaricp_connections_min Min connections # TYPE hikaricp_connections_min gauge hikaricp_connections_min{pool="non-transactional",} 10.0 hikaricp_connections_min{pool="transactional",} 10.0 # HELP hikaricp_connections_pending Pending threads # TYPE hikaricp_connections_pending gauge hikaricp_connections_pending{pool="non-transactional",} 0.0 hikaricp_connections_pending{pool="transactional",} 0.0 # HELP hikaricp_connections_idle Idle connections # TYPE hikaricp_connections_idle gauge hikaricp_connections_idle{pool="non-transactional",} 13.0 hikaricp_connections_idle{pool="transactional",} 12.0 # HELP hikaricp_connections_timeout_total Connection timeout total count # TYPE hikaricp_connections_timeout_total counter hikaricp_connections_timeout_total{pool="non-transactional",} 0.0 hikaricp_connections_timeout_total{pool="transactional",} 0.0 # HELP hikaricp_connections_creation_seconds_max Connection creation time # TYPE hikaricp_connections_creation_seconds_max gauge hikaricp_connections_creation_seconds_max{pool="non-transactional",} 0.0 hikaricp_connections_creation_seconds_max{pool="transactional",} 0.0 # HELP hikaricp_connections_creation_seconds Connection creation time # TYPE hikaricp_connections_creation_seconds summary hikaricp_connections_creation_seconds_count{pool="non-transactional",} 12.0 hikaricp_connections_creation_seconds_sum{pool="non-transactional",} 0.0 hikaricp_connections_creation_seconds_count{pool="transactional",} 11.0 hikaricp_connections_creation_seconds_sum{pool="transactional",} 0.0 # HELP hikaricp_connections_active Active connections # TYPE hikaricp_connections_active gauge hikaricp_connections_active{pool="non-transactional",} 0.0 hikaricp_connections_active{pool="transactional",} 0.0 # HELP hikaricp_connections_max Max connections # TYPE hikaricp_connections_max gauge hikaricp_connections_max{pool="non-transactional",} 20.0 hikaricp_connections_max{pool="transactional",} 20.0 # HELP hikaricp_connections_acquire_seconds Connection acquire time # TYPE hikaricp_connections_acquire_seconds summary hikaricp_connections_acquire_seconds_count{pool="non-transactional",} 5888.0 hikaricp_connections_acquire_seconds_sum{pool="non-transactional",} 0.009996981 hikaricp_connections_acquire_seconds_count{pool="transactional",} 138.0 hikaricp_connections_acquire_seconds_sum{pool="transactional",} 4.68092E-4 # HELP hikaricp_connections_acquire_seconds_max Connection acquire time # TYPE hikaricp_connections_acquire_seconds_max gauge hikaricp_connections_acquire_seconds_max{pool="non-transactional",} 1.41889E-4 hikaricp_connections_acquire_seconds_max{pool="transactional",} 1.77837E-4 ``` ##### Event and Notification System Event and notification metrics include the following: ```yaml # HELP alpine_events_published_total Total number of published events # TYPE alpine_events_published_total counter alpine_events_published_total{event="",publisher="",} 1.0 # HELP alpine_notifications_published_total Total number of published notifications # TYPE alpine_notifications_published_total counter alpine_notifications_published_total{group="",level="",scope="",} 1.0 # HELP alpine_event_processing_seconds # TYPE alpine_event_processing_seconds summary alpine_event_processing_seconds_count{event="",subscriber="",} 1.0 alpine_event_processing_seconds_sum{event="",subscriber="",} 0.047599797 # HELP alpine_event_processing_seconds_max # TYPE alpine_event_processing_seconds_max gauge alpine_event_processing_seconds_max{event="",subscriber="",} 0.047599797 ``` > `alpine_notifications_published_total` will report all notifications, not only those for which an alert has been configured. Events and notifications are processed by [executors]. The executor *Alpine-EventService* corresponds to what is typically referred to as *worker pool*, and is responsible for executing the majority of events in Dependency-Track. *Alpine-SingleThreadedEventService* is a dedicated executor for events that can't safely be executed in parallel. The *SnykAnalysisTask* executor is used to perform API requests to [Snyk] (if enabled) in parallel, in order to work around the missing batch functionality in Snyk's REST API. The following executor metrics are available: ```yaml # HELP executor_pool_max_threads The maximum allowed number of threads in the pool # TYPE executor_pool_max_threads gauge executor_pool_max_threads{name="Alpine-NotificationService",} 4.0 executor_pool_max_threads{name="Alpine-SingleThreadedEventService",} 1.0 executor_pool_max_threads{name="Alpine-EventService",} 40.0 executor_pool_max_threads{name="SnykAnalysisTask",} 10.0 # HELP executor_pool_core_threads The core number of threads for the pool # TYPE executor_pool_core_threads gauge executor_pool_core_threads{name="Alpine-NotificationService",} 4.0 executor_pool_core_threads{name="Alpine-SingleThreadedEventService",} 1.0 executor_pool_core_threads{name="Alpine-EventService",} 40.0 executor_pool_core_threads{name="SnykAnalysisTask",} 10.0 # HELP executor_pool_size_threads The current number of threads in the pool # TYPE executor_pool_size_threads gauge executor_pool_size_threads{name="Alpine-NotificationService",} 0.0 executor_pool_size_threads{name="Alpine-SingleThreadedEventService",} 1.0 executor_pool_size_threads{name="Alpine-EventService",} 7.0 executor_pool_size_threads{name="SnykAnalysisTask",} 10.0 # HELP executor_active_threads The approximate number of threads that are actively executing tasks # TYPE executor_active_threads gauge executor_active_threads{name="Alpine-NotificationService",} 0.0 executor_active_threads{name="Alpine-SingleThreadedEventService",} 1.0 executor_active_threads{name="Alpine-EventService",} 2.0 executor_active_threads{name="SnykAnalysisTask",} 0.0 # HELP executor_completed_tasks_total The approximate total number of tasks that have completed execution # TYPE executor_completed_tasks_total counter executor_completed_tasks_total{name="Alpine-NotificationService",} 0.0 executor_completed_tasks_total{name="Alpine-SingleThreadedEventService",} 0.0 executor_completed_tasks_total{name="Alpine-EventService",} 5.0 executor_completed_tasks_total{name="SnykAnalysisTask",} 132.0 # HELP executor_queued_tasks The approximate number of tasks that are queued for execution # TYPE executor_queued_tasks gauge executor_queued_tasks{name="Alpine-NotificationService",} 0.0 executor_queued_tasks{name="Alpine-SingleThreadedEventService",} 160269.0 executor_queued_tasks{name="Alpine-EventService",} 0.0 executor_queued_tasks{name="SnykAnalysisTask",} 0.0 # HELP executor_queue_remaining_tasks The number of additional elements that this queue can ideally accept without blocking # TYPE executor_queue_remaining_tasks gauge executor_queue_remaining_tasks{name="Alpine-NotificationService",} 2.147483647E9 executor_queue_remaining_tasks{name="Alpine-SingleThreadedEventService",} 2.147323378E9 executor_queue_remaining_tasks{name="Alpine-EventService",} 2.147483647E9 executor_queue_remaining_tasks{name="SnykAnalysisTask",} 2.147483647E9 ``` Executor metrics are a good way to monitor how busy an API server instance is, and how good of a job it's doing keeping up with the work it's being exposed to. For example, a constantly maxed-out `executor_active_threads` value combined with a high number of `executor_queued_tasks` may indicate that the configured `alpine.worker.pool.size` is too small for the workload at hand. ##### Retries Dependency-Track will occasionally retry requests to external services. Metrics about this behavior are exposed in the following format: ``` resilience4j_retry_calls_total{kind="successful_with_retry",name="snyk-api",} 42.0 resilience4j_retry_calls_total{kind="failed_without_retry",name="snyk-api",} 0.0 resilience4j_retry_calls_total{kind="failed_with_retry",name="snyk-api",} 0.0 resilience4j_retry_calls_total{kind="successful_without_retry",name="snyk-api",} 9014.0 ``` Where `name` describes the remote endpoint that Dependency-Track uses retries for. #### Grafana Dashboard Because [Micrometer](https://micrometer.io/) is used to collect and expose metrics, common Grafana dashboards for Micrometer should just work. An [example dashboard] is provided as a quickstart. Refer to the [Grafana documentation] for instructions on how to import it. > The example dashboard is meant to be a starting point. Users are strongly encouraged to explore the available metrics > and build their own dashboards, tailored to their needs. The sample dashboard is not actively maintained by the project > team, however community contributions are more than welcome. ![System Metrics in Grafana]({{ site.baseurl }}/images/screenshots/monitoring-metrics-system.png) ![Event Metrics in Grafana]({{ site.baseurl }}/images/screenshots/monitoring-metrics-events.png) [community integrations]: {{ site.baseurl }}{% link _docs/integrations/community-integrations.md %} [Configuration]: {{ site.baseurl }}{% link _docs/getting-started/configuration.md %} [Connection Pooling]: {{ site.baseurl }}{% link _docs/getting-started/database-support.md %}#connection-pooling [default Logback configuration]: https://github.com/DependencyTrack/dependency-track/blob/master/src/main/docker/logback.xml [dependency-track-exporter]: https://github.com/jetstack/dependency-track-exporter [example dashboard]: {{ site.baseurl }}/files/grafana-dashboard.json [executors]: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/concurrent/ThreadPoolExecutor.html [Grafana]: https://grafana.com/ [Grafana documentation]: https://grafana.com/docs/grafana/latest/dashboards/export-import/#import-dashboard [Logback]: https://logback.qos.ch/ [`logback-json.xml`]: https://github.com/DependencyTrack/dependency-track/blob/master/src/main/docker/logback-json.xml [logstash-logback-encoder]: https://github.com/logfellow/logstash-logback-encoder [logstash-logback-encoder documentation]: https://github.com/logfellow/logstash-logback-encoder/tree/logstash-logback-encoder-7.3#loggingevent-fields [MicroProfile Health]: https://download.eclipse.org/microprofile/microprofile-health-3.1/microprofile-health-spec-3.1.html [MicroProfile Health REST interfaces specifications]: https://download.eclipse.org/microprofile/microprofile-health-3.1/microprofile-health-spec-3.1.html#_appendix_a_rest_interfaces_specifications [Prometheus]: https://prometheus.io/ [readiness check]: https://download.eclipse.org/microprofile/microprofile-health-3.1/microprofile-health-spec-3.1.html#_readiness_check [Snyk]: {{ site.baseurl }}{% link _docs/datasources/snyk.md %} [text-based exposition format]: https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format [thread states]: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Thread.State.html