---
stage: Security Risk Management
group: Security Insights
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
title: Vulnerability tracking overview
---

At GitLab we run Git combined with automated security testing in Continuous
Integration and Continuous Delivery (CI/CD) processes. These processes
continuously monitor code changes to detect security vulnerabilities as early
as possible. Security testing often involves multiple Static Application
Security Testing (SAST) tools, each specialized in detecting specific
vulnerabilities, such as hardcoded passwords or insecure data flows. A
heterogeneous SAST setup, using multiple tools, helps minimize the software's
attack surface. The security findings from these tools undergo Vulnerability
Management, a semi-manual process of understanding, categorizing, storing, and
acting on them.

Code volatility (the constant change of the project's source code) and double reporting
(the overlap of findings reported by multiple tools) are potential sources of duplication,
which forces analysts to audit the same finding more than once.

Vulnerability tracking is an automated process that helps deduplicate and
track vulnerabilities throughout the lifetime of a software project.

Our Vulnerability tracking method is based on [Scope+Offset](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator/-/blob/master/README.md) (internal).

The predecessor to the `Scope+Offset` method was line-based fingerprinting, which is more
fragile and caused many already-detected vulnerabilities to be reintroduced as new findings.
Avoiding this duplication was the motivation for implementing the `Scope+Offset` method.
For more background, see the [corresponding research issue](https://gitlab.com/groups/gitlab-org/-/epics/4626) (internal).

## Components

At a high level, the vulnerability tracking flow is depicted below. For the remainder of this section, we assume that the SAST analyzer and the Tracking Calculator together represent the tracking signature *producer* component, and that the Rails backend represents the tracking signature *consumer* component for the purposes of vulnerability tracking. The components are explained in more detail below.

``` mermaid
flowchart LR
  R["Repository"]
  S("SAST Analyzer [CI]")
  T("tracking-calculator [CI]")
  B("Rails backend")

  R --code--> S --gl-sast-report.json--> T --augmented gl-sast-report.json--> B
  R --code --> T
```

### Tracking signature producer

The SAST Analyzer runs in a CI context, analyzes the source code, and produces a `gl-sast-report.json` file. The [Tracking Calculator](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator) computes scopes from the source code and matches them with the vulnerabilities listed in `gl-sast-report.json`. If there is a match, Tracking Calculator computes signatures (using Scope+Offset) and adds them to the original report (augmenting `gl-sast-report.json`) in the `tracking` object (depicted below).

``` json
      "tracking": {
        "type": "source",
        "items": [
          {
            "file": "test.c",
            "line_start": 12,
            "line_end": 12,
            "signatures": [
              {
                "algorithm": "scope_offset_compressed",
                "value": "test.c|main()[0]:5"
              },
              {
                "algorithm": "scope_offset",
                "value": "test.c|main()[0]:8"
              }
            ]
          }
        ]
      }
```
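As a rough illustration of how a consumer might read this object, the following Ruby sketch parses a `tracking` fragment and lists its signatures. The `extract_signatures` helper and the `TRACKING_JSON` constant are hypothetical and not part of the GitLab codebase. Only the JSON field names are taken from the example above.

```ruby
require 'json'

# Sample input: the `tracking` object from the example above, wrapped in
# braces so that it forms a valid standalone JSON document.
TRACKING_JSON = <<~JSON
  {
    "tracking": {
      "type": "source",
      "items": [
        {
          "file": "test.c",
          "line_start": 12,
          "line_end": 12,
          "signatures": [
            { "algorithm": "scope_offset_compressed", "value": "test.c|main()[0]:5" },
            { "algorithm": "scope_offset", "value": "test.c|main()[0]:8" }
          ]
        }
      ]
    }
  }
JSON

# Hypothetical helper: returns one hash per signature, annotated with the
# file that the signature belongs to.
def extract_signatures(report_fragment)
  tracking = JSON.parse(report_fragment).fetch('tracking')

  tracking.fetch('items').flat_map do |item|
    item.fetch('signatures').map do |signature|
      {
        file: item['file'],
        algorithm: signature['algorithm'],
        value: signature['value']
      }
    end
  end
end

extract_signatures(TRACKING_JSON).each do |signature|
  puts "#{signature[:file]}: #{signature[:algorithm]} => #{signature[:value]}"
end
```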

Tracking Calculator is directly embedded into the [Docker image of the SAST Analyzer](https://gitlab.com/gitlab-org/security-products/analyzers/semgrep/-/blob/52bedd15745ddb6124662e0dcda331e2e64b000b/Dockerfile#L5) (internal)
and invoked by [this script](https://gitlab.com/gitlab-org/security-products/post-analyzers/scripts/-/blob/474cfd78054d97291155045eaef66aa3b7919368/start.sh).

Tracking Calculator already [performs deduplication](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator/-/blob/c7b6f255ad030e6b9da58c12fa87204b8df71129/trackinginfo/sast.go#L127),
which is enabled by default. The example above contains two different
algorithms, `scope_offset_compressed` and `scope_offset`.
`scope_offset_compressed` is considered an improvement on `scope_offset` and
is therefore assigned a higher priority.

If `scope_offset` and `scope_offset_compressed` agree on the same fingerprint,
only the result from `scope_offset_compressed` is added, because it is the
algorithm with the higher priority.
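The deduplication rule can be sketched in Ruby as follows. This is only an illustration of the idea, not the Tracking Calculator implementation (which is written in Go). The `Signature` struct, the `PRODUCER_PRIORITIES` values, and the `deduplicate_signatures` helper are assumptions made for this example.

```ruby
# Hypothetical value object for a single tracking signature.
Signature = Struct.new(:algorithm, :value)

# Assumed relative priorities for this sketch; a higher number wins.
PRODUCER_PRIORITIES = {
  'scope_offset' => 1,
  'scope_offset_compressed' => 2
}.freeze

# When several algorithms produce the same value, keep only the signature
# from the algorithm with the highest priority. Signatures with distinct
# values are all kept.
def deduplicate_signatures(signatures)
  signatures
    .group_by(&:value)
    .map { |_value, group| group.max_by { |s| PRODUCER_PRIORITIES.fetch(s.algorithm) } }
end

agreeing = [
  Signature.new('scope_offset_compressed', 'test.c|main()[0]:5'),
  Signature.new('scope_offset', 'test.c|main()[0]:5')
]

p deduplicate_signatures(agreeing).map(&:algorithm)
# => ["scope_offset_compressed"]
```

When the two algorithms disagree, as in the `tracking` example above, both signatures have distinct values and both are kept.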

The report is then ingested into the consumer component where these signatures
are used to generate vulnerability fingerprints by means of the vulnerability
UUID.

---

### Tracking signature consumer

In the Rails code, we differentiate between security findings (findings that
originate from the report) and vulnerability findings (persisted in the database).
Security findings are generated when the [report is parsed](https://gitlab.com/gitlab-org/gitlab/-/blob/e2f0c25d56d7ee5e85e00093331e55197fe66151/lib/gitlab/ci/parsers/security/common.rb#L98);
this is also the place where the [UUID is generated](https://gitlab.com/gitlab-org/gitlab/-/blob/415453f3bf788579f47fb8b471629beb1e063d56/app/services/security/vulnerability_uuid.rb#L6).

#### Scenario 1: Storing security findings temporarily

The diagram below depicts the flow that is executed on all pipelines for
storing security findings temporarily. One of the most interesting components
from the vulnerability tracking perspective is the `OverrideUuidsService`,
which matches security findings against vulnerability findings at the signature level. If
there is a match, the UUID of the security finding is overwritten with the
UUID of the matching vulnerability finding. The `StoreFindingsService` then stores the recalibrated findings in
the `security_findings` table. For detailed documentation about how
vulnerabilities are created, starting from the security report, see
[vulnerability creation from security reports](security_report_ingestion_overview.md#vulnerability-creation-from-security-reports).

Source code references:

- [StoreScansWorker](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/workers/security/store_scans_worker.rb#L19)
- [StoreScansService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_scans_service.rb#L19)
- [StoreGroupedScansService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_grouped_scans_service.rb#L60)
- [StoreScanService](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/store_scan_service.rb#L47)
- [OverrideUuidsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/override_uuids_service.rb)
- [StoreFindingsService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_findings_service.rb)

``` mermaid
sequenceDiagram
    Producer->>Sidekiq: gl-sast-report.json
    Sidekiq->>StoreScansWorker: <<start>>
    StoreScansWorker->>StoreScansService: pipeline id
    loop for all artifacts in "grouped" artifacts
     StoreScansService->>StoreGroupedScansService: artifacts

     loop for every artifact in artifacts
        StoreGroupedScansService->>StoreScanService: artifact
        StoreScanService->>OverrideUuidsService: security-report

        StoreScanService->>StoreFindingsService: store findings
     end
    end
```
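The signature-level matching described for `OverrideUuidsService` can be approximated with the following plain Ruby sketch. It is not the actual service: the structs, the `override_uuids` helper, and the representation of a signature as an `[algorithm, value]` pair are assumptions for illustration, and the sketch simply takes the first signature that matches instead of applying algorithm priorities.

```ruby
# Hypothetical in-memory stand-ins for the two kinds of findings.
SecurityFinding = Struct.new(:uuid, :signatures, :overridden_uuid)
VulnerabilityFinding = Struct.new(:uuid, :signatures)

# Overwrites the UUID of every security finding that shares a signature with
# a persisted vulnerability finding. The original UUID of the security
# finding is preserved in `overridden_uuid`.
def override_uuids(security_findings, vulnerability_findings)
  # Index persisted findings by signature for fast lookup.
  index = {}
  vulnerability_findings.each do |vulnerability_finding|
    vulnerability_finding.signatures.each do |signature|
      index[signature] ||= vulnerability_finding
    end
  end

  security_findings.each do |security_finding|
    match = security_finding.signatures.map { |signature| index[signature] }.compact.first
    next if match.nil? || match.uuid == security_finding.uuid

    security_finding.overridden_uuid = security_finding.uuid
    security_finding.uuid = match.uuid
  end
end

persisted = VulnerabilityFinding.new('uuid-persisted', [['scope_offset', 'test.c|main()[0]:8']])
reported = SecurityFinding.new(
  'uuid-from-report',
  [['scope_offset_compressed', 'test.c|main()[0]:5'], ['scope_offset', 'test.c|main()[0]:8']]
)

override_uuids([reported], [persisted])
puts "uuid=#{reported.uuid} overridden_uuid=#{reported.overridden_uuid}"
```

In this example the security finding and the persisted vulnerability finding share the `scope_offset` signature, so the security finding takes over the persisted UUID and keeps its original UUID in `overridden_uuid`.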

#### Scenario 2: Merge request security widget

The second scenario relates to the merge request security widget.

Source code references:

- [MergeRequest](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/app/models/merge_request.rb?page=2#L1975)
- [CompareSecurityReportsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/ee/app/services/ci/compare_security_reports_service.rb#L10)
- [VulnerabilityReportsComparer](https://gitlab.com/gitlab-org/gitlab/-/blob/da6e2037cd494ac8b73bc3ee9e69009c4cdcf124/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L96)

The `VulnerabilityReportsComparer` computes the number of newly added or fixed
findings. It first compares the security findings between the default and
non-default branches to compute the number of added and fixed findings. It
then filters the results so that security findings that correspond to
existing vulnerability findings are not displayed again. It does this by
[recalibrating the security finding UUIDs](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L70).
The logic implemented in the
[`UUIDOverrider`](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L161)
is very similar to
[`OverrideUuidsService`](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_scan_service.rb#L47).

``` mermaid
sequenceDiagram
    MergeRequestModel->>CompareSecurityReportsService: compare_sast_reports
    CompareSecurityReportsService->>VulnerabilityReportsComparer: calculate_changes
```
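The comparison of added and fixed findings can be thought of as a set difference on finding UUIDs. The sketch below is a simplification under that assumption: `compare_findings` is a hypothetical helper, not the `VulnerabilityReportsComparer` API, and it assumes that the UUIDs have already been recalibrated.

```ruby
require 'set'

# Hypothetical helper: `base_uuids` are the finding UUIDs on the default
# branch, `head_uuids` are the finding UUIDs on the non-default branch.
def compare_findings(base_uuids, head_uuids)
  base = base_uuids.to_set
  head = head_uuids.to_set

  {
    added: (head - base).to_a, # present only on the non-default branch
    fixed: (base - head).to_a  # present only on the default branch
  }
end

changes = compare_findings(%w[uuid-a uuid-b uuid-c], %w[uuid-b uuid-c uuid-d])
puts "added: #{changes[:added].size}, fixed: #{changes[:fixed].size}"
```

Because the comparison is keyed on UUIDs, a finding whose UUID was recalibrated to match an existing vulnerability finding appears on both sides and is therefore counted as neither added nor fixed.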

#### Scenario 3: Report ingestion

This is the point where either a security finding becomes a vulnerability or the
vulnerability that corresponds to a security finding is updated. This scenario
becomes relevant when a pipeline is triggered on the default branch after a
non-default branch is merged into the default branch. In our context, we are most
interested in those cases where we have security findings with
`overridden_uuid` set, which implies that there was a clash with an already
existing vulnerability. `overridden_uuid` holds the UUID of the security
finding that was overridden by the corresponding vulnerability UUID.

The sequence below is executed to update the UUID of a vulnerability
(fingerprint). The recomputation takes place in the
`UpdateVulnerabilityUuids` task, which ultimately invokes a database update by means of the
[`UpdateVulnerabilityUuidsVulnerabilityFinding` class](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/tasks/update_vulnerability_uuids/vulnerability_findings.rb).

Source code references:

- [IngestReportsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_reports_service.rb#L55)
- [IngestReportService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_report_service.rb#L41)
- [IngestReportSliceService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_report_slice_service.rb#L37)
- [UpdateVulnerabilityUuids](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/tasks/update_vulnerability_uuids.rb#L67)
- [FindingMap](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/finding_map.rb)

``` mermaid
sequenceDiagram
    IngestReportsService->>IngestReportService: security_scan
    IngestReportService->>IngestReportSliceService: sliced security_scan
    IngestReportSliceService->>UpdateVulnerabilityUuids: findings map
```

## Hierarchy: Why are algorithms prioritized and what is the impact of this prioritization?

The supported algorithms are defined in [`VulnerabilityFindingSignatureHelpers`](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/app/models/concerns/vulnerability_finding_signature_helpers.rb). Algorithms are assigned priorities (the integer values in the map below). A higher priority indicates that an algorithm is considered better than a lower-priority algorithm. In other words, going from a lower-priority to a higher-priority algorithm corresponds to *coarsening* (better deduplication performance), and going from a higher-priority to a lower-priority algorithm corresponds to *refinement* (weaker deduplication performance).

``` ruby
  ALGORITHM_TYPES = {
    hash: 1,
    location: 2,
    scope_offset: 3,
    scope_offset_compressed: 4,
    rule_value: 5
  }.with_indifferent_access.freeze
```
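To show how such a priority map can be used, the following sketch picks the highest-priority signature of a finding. It copies the map above into a plain Ruby hash (without the Rails-specific `with_indifferent_access`). The `ALGORITHM_PRIORITIES` constant and the `highest_priority_signature` helper are hypothetical names chosen for this example.

```ruby
# Copy of the priority map above, with string keys so that the sketch runs
# without Rails.
ALGORITHM_PRIORITIES = {
  'hash' => 1,
  'location' => 2,
  'scope_offset' => 3,
  'scope_offset_compressed' => 4,
  'rule_value' => 5
}.freeze

# Hypothetical helper: returns the signature whose algorithm has the highest
# priority, ignoring unknown algorithms. Returns nil if nothing qualifies.
def highest_priority_signature(signatures)
  signatures
    .select { |signature| ALGORITHM_PRIORITIES.key?(signature[:algorithm]) }
    .max_by { |signature| ALGORITHM_PRIORITIES[signature[:algorithm]] }
end

signatures = [
  { algorithm: 'scope_offset', value: 'test.c|main()[0]:8' },
  { algorithm: 'scope_offset_compressed', value: 'test.c|main()[0]:5' }
]

p highest_priority_signature(signatures)[:algorithm]
# => "scope_offset_compressed"
```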