---
stage: Security Risk Management
group: Security Insights
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
title: Vulnerability tracking overview
---

At GitLab we run Git combined with automated security testing in Continuous Integration and Continuous Delivery (CI/CD) processes. These processes continuously monitor code changes to detect security vulnerabilities as early as possible. Security testing often involves multiple Static Application Security Testing (SAST) tools, each specialized in detecting specific vulnerabilities, such as hardcoded passwords or insecure data flows. A heterogeneous SAST setup, using multiple tools, helps minimize the software's attack surface. The security findings from these tools undergo Vulnerability Management, a semi-manual process of understanding, categorizing, storing, and acting on them.

Code volatility (the constant change of the project's source code) and double reporting (the overlap of findings reported by multiple tools) are potential sources of duplication, imposing futile auditing effort on the analyst.

Vulnerability tracking is an automated process that helps deduplicate and track vulnerabilities throughout the lifetime of a software project.

Our Vulnerability tracking method is based on [Scope+Offset](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator/-/blob/master/README.md) (internal). The predecessor to the `Scope+Offset` method was line-based fingerprinting, which is more fragile and resulted in many already detected vulnerabilities being re-introduced. Avoiding duplication was the motivation for implementing the `Scope+Offset` method.

[See the corresponding research issue for more background](https://gitlab.com/groups/gitlab-org/-/epics/4626) (internal).

## Components

On a very high level, the vulnerability tracking flow is depicted below.
For the remainder of this section, we assume that the SAST analyzer and the Tracking Calculator represent the tracking signature *producer* component and the Rails backend represents the tracking signature *consumer* component for the purposes of vulnerability tracking. The components are explained in more detail below.

```mermaid
flowchart LR
  R["Repository"]
  S("SAST Analyzer [CI]")
  T("tracking-calculator [CI]")
  B("Rails backend")
  R --code--> S --gl-sast-report.json--> T --augmented gl-sast-report.json--> B
  R --code --> T
```

### Tracking signature producer

The SAST Analyzer runs in a CI context, analyzes the source code, and produces a `gl-sast-report.json` file. The [Tracking Calculator](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator) computes scopes from the source code and matches them with the vulnerabilities listed in the `gl-sast-report.json`. If there is a match, Tracking Calculator computes signatures (by means of Scope+Offset) and adds each one to the original report (augmenting `gl-sast-report.json`) in the `tracking` object (depicted below).

```json
"tracking": {
  "type": "source",
  "items": [
    {
      "file": "test.c",
      "line_start": 12,
      "line_end": 12,
      "signatures": [
        {
          "algorithm": "scope_offset_compressed",
          "value": "test.c|main()[0]:5"
        },
        {
          "algorithm": "scope_offset",
          "value": "test.c|main()[0]:8"
        }
      ]
    }
  ]
}
```

Tracking Calculator is directly embedded into the [Docker image of the SAST Analyzer](https://gitlab.com/gitlab-org/security-products/analyzers/semgrep/-/blob/52bedd15745ddb6124662e0dcda331e2e64b000b/Dockerfile#L5) (internal) and invoked by means of [this script](https://gitlab.com/gitlab-org/security-products/post-analyzers/scripts/-/blob/474cfd78054d97291155045eaef66aa3b7919368/start.sh).
Tracking Calculator already [performs deduplication](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator/-/blob/c7b6f255ad030e6b9da58c12fa87204b8df71129/trackinginfo/sast.go#L127), which is enabled by default. The example above uses two different algorithms, `scope_offset_compressed` and `scope_offset`. Because `scope_offset_compressed` is considered an improvement of `scope_offset`, it is assigned a higher priority. If `scope_offset` and `scope_offset_compressed` agree on the same fingerprint, only the result from `scope_offset_compressed` is added, because it is the algorithm with the higher priority.

The report is then ingested into the consumer component, where these signatures are used to generate vulnerability fingerprints by means of the vulnerability UUID.

---

### Tracking signature consumer

In the Rails code we differentiate between security findings (findings that originate from the report) and vulnerability findings (persisted in the DB). Security findings are generated when the [report is parsed](https://gitlab.com/gitlab-org/gitlab/-/blob/e2f0c25d56d7ee5e85e00093331e55197fe66151/lib/gitlab/ci/parsers/security/common.rb#L98); this is also the place where the [UUID is generated](https://gitlab.com/gitlab-org/gitlab/-/blob/415453f3bf788579f47fb8b471629beb1e063d56/app/services/security/vulnerability_uuid.rb#L6).

#### Scenario 1: Storing security findings temporarily

The diagram below depicts the flow that is executed on all pipelines for storing security findings temporarily. One of the most interesting components from the vulnerability tracking perspective is the `OverrideUuidsService`. The `OverrideUuidsService` matches security findings against vulnerability findings on the signature level. If there is a match, the UUID of the security finding is overwritten with the UUID of the matching vulnerability finding. The `StoreFindingsService` stores the re-calibrated findings in the `security_findings` table.
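
The signature-level matching described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the implementation of `OverrideUuidsService`: the `Signature`, `SecurityFinding`, and `VulnerabilityFinding` structs and the `find_match` and `override_uuids` helpers are hypothetical, and the real service is more involved (for example, it works against the database). The `overridden_uuid` field name is taken from the report ingestion scenario described later on this page.

```ruby
# Hypothetical, simplified stand-ins for the real models.
Signature = Struct.new(:algorithm, :value)
SecurityFinding = Struct.new(:uuid, :signatures, :overridden_uuid)
VulnerabilityFinding = Struct.new(:uuid, :signatures)

# Returns the first persisted finding that shares at least one
# (algorithm, value) signature pair with the security finding, or nil.
def find_match(security_finding, vulnerability_findings)
  wanted = security_finding.signatures.map { |s| [s.algorithm, s.value] }
  vulnerability_findings.find do |vf|
    vf.signatures.any? { |s| wanted.include?([s.algorithm, s.value]) }
  end
end

# Overwrites the UUID of every security finding that matches a persisted
# vulnerability finding, remembering the original UUID in overridden_uuid.
def override_uuids(security_findings, vulnerability_findings)
  security_findings.each do |finding|
    match = find_match(finding, vulnerability_findings)
    next if match.nil? || match.uuid == finding.uuid

    finding.overridden_uuid = finding.uuid
    finding.uuid = match.uuid
  end
end

persisted = [
  VulnerabilityFinding.new("uuid-old", [Signature.new("scope_offset", "test.c|main()[0]:8")])
]
incoming = [
  SecurityFinding.new("uuid-new", [Signature.new("scope_offset", "test.c|main()[0]:8")]),
  SecurityFinding.new("uuid-other", [Signature.new("scope_offset", "test.c|helper()[0]:2")])
]

override_uuids(incoming, persisted)
```

In this sketch the first incoming finding shares a signature with the persisted finding, so it takes over that UUID; the second has no match and keeps its own.
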
Detailed documentation about how vulnerabilities are created, starting from the security report, is available [here](security_report_ingestion_overview.md#vulnerability-creation-from-security-reports).

Source Code References:

- [StoreScansWorker](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/workers/security/store_scans_worker.rb#L19)
- [StoreScansService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_scans_service.rb#L19)
- [StoreGroupedScansService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_grouped_scans_service.rb#L60)
- [StoreScanService](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/store_scan_service.rb#L47)
- [OverrideUuidsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/override_uuids_service.rb)
- [StoreFindingsService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_findings_service.rb)

```mermaid
sequenceDiagram
    Producer->>Sidekiq: gl-sast-report.json
    Sidekiq->>StoreScansWorker: <>
    StoreScansWorker->>StoreScansService: pipeline id
    loop for all artifacts in "grouped" artifacts
        StoreScansService->>StoreGroupedScansService: artifacts
        loop for every artifact in artifacts
            StoreGroupedScansService->>StoreScanService: artifact
            StoreScanService->>OverrideUuidsService: security-report
            StoreScanService->>StoreFindingsService: store findings
        end
    end
```

#### Scenario 2: Merge request security widget

The second scenario relates to the merge request security widget.

Source code references:

- [MergeRequest](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/app/models/merge_request.rb?page=2#L1975)
- [CompareSecurityReportsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/ee/app/services/ci/compare_security_reports_service.rb#L10)
- [VulnerabilityReportsComparer](https://gitlab.com/gitlab-org/gitlab/-/blob/da6e2037cd494ac8b73bc3ee9e69009c4cdcf124/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L96)

The `VulnerabilityReportsComparer` computes the number of newly added or fixed findings. It first compares the security findings between default and non-default branches to compute the number of added and fixed findings. It then filters the results: by [recalibrating the security finding UUIDs](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L70), it avoids re-displaying security findings that correspond to existing vulnerability findings. The logic implemented in the [`UUIDOverrider`](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L161) is very similar to [OverrideUuidsService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_scan_service.rb#L47).

```mermaid
sequenceDiagram
    MergeRequestModel->>CompareSecurityReportsService: compare_sast_reports
    CompareSecurityReportsService->>VulnerabilityReportsComparer: calculate_changes
```

#### Scenario 3: Report ingestion

This is the point where either a security finding becomes a vulnerability or the vulnerability that corresponds to a security finding is updated. This scenario becomes relevant when a pipeline is triggered on the default branch upon merging a non-default branch into the default branch.
In our context, we are most interested in those cases where we have security findings with `overridden_uuid` set, which implies that there was a clash with an already existing vulnerability; `overridden_uuid` holds the UUID of the security finding that was overridden by the corresponding vulnerability UUID.

The sequence below is executed to update the UUID of a vulnerability (fingerprint). The recomputation takes place in `UpdateVulnerabilityUuids`, which ultimately invokes a database update by means of the [`UpdateVulnerabilityUuidsVulnerabilityFinding` class](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/tasks/update_vulnerability_uuids/vulnerability_findings.rb).

Source Code References:

- [IngestReportsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_reports_service.rb#L55)
- [IngestReportService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_report_service.rb#L41)
- [IngestReportSliceService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_report_slice_service.rb#L37)
- [UpdateVulnerabilityUuids](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/tasks/update_vulnerability_uuids.rb#L67)
- [FindingMap](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/finding_map.rb)

```mermaid
sequenceDiagram
    IngestReportsService->>IngestReportService: security_scan
    IngestReportService->>IngestReportSliceService: sliced security_scan
    IngestReportSliceService->>UpdateVulnerabilityUuids: findings map
```

## Hierarchy: Why are algorithms prioritized and what is the impact of this prioritization?

The supported algorithms are defined in [`VulnerabilityFindingSignatureHelpers`](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/app/models/concerns/vulnerability_finding_signature_helpers.rb). Algorithms are assigned priorities (the integer values in the map below). A higher priority indicates that an algorithm is considered better than a lower-priority algorithm. In other words, going from a lower-priority algorithm to a higher-priority algorithm corresponds to a `coarsening` (better deduplication performance), and going from a higher-priority algorithm to a lower-priority algorithm corresponds to a `refinement` (weaker deduplication performance).

```ruby
# A larger integer means a higher priority.
ALGORITHM_TYPES = {
  hash: 1,
  location: 2,
  scope_offset: 3,
  scope_offset_compressed: 4,
  rule_value: 5
}.with_indifferent_access.freeze
```
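
To illustrate how such a priority map can be used, the following minimal sketch picks the signature produced by the highest-priority algorithm from a list of signatures. It is an assumption-laden illustration, not code from the Rails backend: `ALGORITHM_PRIORITIES` is a plain-Ruby copy of the map above (without ActiveSupport), and `highest_priority_signature` is a hypothetical helper.

```ruby
# Plain-Ruby copy of the priorities shown above; string keys are used so
# that no ActiveSupport (with_indifferent_access) is required.
ALGORITHM_PRIORITIES = {
  "hash" => 1,
  "location" => 2,
  "scope_offset" => 3,
  "scope_offset_compressed" => 4,
  "rule_value" => 5
}.freeze

# Hypothetical helper: returns the signature whose algorithm has the
# highest priority, ignoring unknown algorithms. Returns nil if no
# signature uses a known algorithm.
def highest_priority_signature(signatures)
  signatures
    .select { |sig| ALGORITHM_PRIORITIES.key?(sig[:algorithm]) }
    .max_by { |sig| ALGORITHM_PRIORITIES[sig[:algorithm]] }
end

# Signatures as they appear in the `tracking` object of the report.
signatures = [
  { algorithm: "scope_offset", value: "test.c|main()[0]:8" },
  { algorithm: "scope_offset_compressed", value: "test.c|main()[0]:5" }
]

best = highest_priority_signature(signatures)
puts "#{best[:algorithm]}: #{best[:value]}"
```

With the two signatures from the `tracking` example, the helper selects the `scope_offset_compressed` entry because its priority (4) is higher than that of `scope_offset` (3).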