gitlab-foss/vulnerability_tracking.md at a46ff9290cd4ff14330eb667761d20cda3e6e34c

mirror of https://gitlab.com/gitlab-org/gitlab-foss.git synced 2025-08-10 01:31:45 +00:00

GitLab Bot 20dda16549 Add latest changes from gitlab-org/gitlab@master

2025-05-06 21:11:45 +00:00

209 lines

12 KiB

Markdown

 ---
 stage: Security Risk Management
 group: Security Insights
 info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://handbook.gitlab.com/handbook/product/ux/technical-writing/#assignments
 title: Vulnerability tracking overview
 ---
 At GitLab we run Git combined with automated security testing in Continuous
 Integration and Continuous Delivery (CI/CD) processes. These processes
 continuously monitor code changes to detect security vulnerabilities as early
 as possible. Security testing often involves multiple Static Application
 Security Testing (SAST) tools, each specialized in detecting specific
 vulnerabilities, such as hardcoded passwords or insecure data flows. A
 heterogeneous SAST setup, using multiple tools, helps minimize the software's
 attack surface. The security findings from these tools undergo Vulnerability
 Management, a semi-manual process of understanding, categorizing, storing, and
 acting on them.

 Code volatility (the constant change of the project's source code) and double reporting
 (the overlap of findings reported by multiple tools) are potential sources of duplication,
 imposing needless auditing effort on the analyst.

 Vulnerability tracking is an automated process that helps deduplicate and
 track vulnerabilities throughout the lifetime of a software project.

 Our vulnerability tracking method is based on [Scope+Offset](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator/-/blob/master/README.md) (internal).

 The predecessor to the `Scope+Offset` method was line-based fingerprinting, which is more
 fragile and caused many already-detected vulnerabilities to be reported again as new.
 Avoiding this duplication was the motivation for implementing the `Scope+Offset` method.

 [See the corresponding research issue for more background](https://gitlab.com/groups/gitlab-org/-/epics/4626) (internal).
 ## Components
 At a high level, the vulnerability tracking flow is depicted below. For the remainder of this section, we assume that the SAST analyzer and the Tracking Calculator represent the tracking signature *producer* component and the Rails backend represents the tracking signature *consumer* component for the purposes of vulnerability tracking. The components are explained in more detail below.
 ``` mermaid
 flowchart LR
   R["Repository"]
   S("SAST Analyzer [CI]")
   T("tracking-calculator [CI]")
   B("Rails backend")
   R --code--> S --gl-sast-report.json--> T --augmented gl-sast-report.json--> B
   R --code --> T
 ```
 ### Tracking signature producer
 The SAST Analyzer runs in a CI context, analyzes the source code, and produces a `gl-sast-report.json` file. The [Tracking Calculator](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator) computes scopes from the source code and matches them with the vulnerabilities listed in `gl-sast-report.json`. If there is a match, Tracking Calculator computes signatures (using Scope+Offset) and adds them to the original report (augmenting `gl-sast-report.json`) in the `tracking` object (depicted below).
 ``` json
       "tracking": {
         "type": "source",
         "items": [
           {
             "file": "test.c",
             "line_start": 12,
             "line_end": 12,
             "signatures": [
               {
                 "algorithm": "scope_offset_compressed",
                 "value": "test.c|main()[0]:5"
               },
               {
                 "algorithm": "scope_offset",
                 "value": "test.c|main()[0]:8"
               }
             ]
           }
         ]
       }
 ```
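
To make the structure concrete, the following minimal Ruby sketch flattens a `tracking` object like the one above into a list of signatures. The helper name `tracking_signatures` is hypothetical and is not part of Tracking Calculator or the Rails backend; the sketch only assumes the JSON layout shown in the example.

```ruby
require "json"

# Hypothetical helper (illustration only): flattens a `tracking` object into
# a list of { file:, algorithm:, value: } hashes.
def tracking_signatures(tracking)
  tracking.fetch("items", []).flat_map do |item|
    item.fetch("signatures", []).map do |signature|
      { file: item["file"], algorithm: signature["algorithm"], value: signature["value"] }
    end
  end
end

# The `tracking` object from the example above.
tracking = JSON.parse(<<~REPORT)
  {
    "type": "source",
    "items": [
      {
        "file": "test.c",
        "line_start": 12,
        "line_end": 12,
        "signatures": [
          { "algorithm": "scope_offset_compressed", "value": "test.c|main()[0]:5" },
          { "algorithm": "scope_offset", "value": "test.c|main()[0]:8" }
        ]
      }
    ]
  }
REPORT

tracking_signatures(tracking).each do |signature|
  puts "#{signature[:algorithm]} => #{signature[:value]}"
end
# scope_offset_compressed => test.c|main()[0]:5
# scope_offset => test.c|main()[0]:8
```
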
 Tracking Calculator is directly embedded into the [Docker image of the SAST Analyzer](https://gitlab.com/gitlab-org/security-products/analyzers/semgrep/-/blob/52bedd15745ddb6124662e0dcda331e2e64b000b/Dockerfile#L5) (internal)
 and invoked by means of [this script](https://gitlab.com/gitlab-org/security-products/post-analyzers/scripts/-/blob/474cfd78054d97291155045eaef66aa3b7919368/start.sh).
 Tracking Calculator already [performs deduplication](https://gitlab.com/gitlab-org/security-products/post-analyzers/tracking-calculator/-/blob/c7b6f255ad030e6b9da58c12fa87204b8df71129/trackinginfo/sast.go#L127),
 which is enabled by default. The example above contains two different
 algorithms, `scope_offset_compressed` and `scope_offset`.
 `scope_offset_compressed` is considered an improvement over `scope_offset`
 and is therefore assigned a higher priority.
 If `scope_offset` and `scope_offset_compressed` agree on the same fingerprint,
 only the result from `scope_offset_compressed` is added, because it is
 the algorithm with the higher priority.
 The report is then ingested into the consumer component where these signatures
 are used to generate vulnerability fingerprints by means of the vulnerability
 UUID.
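
The priority-based deduplication described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Tracking Calculator implementation (which is written in Go): the constant `PRODUCER_PRIORITIES` and the helper `deduplicate_signatures` are hypothetical names, and the numeric priorities only encode that `scope_offset_compressed` outranks `scope_offset`.

```ruby
# Hypothetical priorities (illustration only); a higher number wins.
PRODUCER_PRIORITIES = { "scope_offset" => 1, "scope_offset_compressed" => 2 }.freeze

# Hypothetical helper: when several algorithms agree on the same value, keep
# only the signature produced by the highest-priority algorithm.
def deduplicate_signatures(signatures)
  signatures
    .group_by { |signature| signature[:value] }
    .map do |_value, group|
      group.max_by { |signature| PRODUCER_PRIORITIES.fetch(signature[:algorithm]) }
    end
end

# Both algorithms agree, so only scope_offset_compressed survives.
agreeing = [
  { algorithm: "scope_offset", value: "test.c|main()[0]:5" },
  { algorithm: "scope_offset_compressed", value: "test.c|main()[0]:5" }
]
puts deduplicate_signatures(agreeing).map { |s| s[:algorithm] }.inspect
# ["scope_offset_compressed"]

# The algorithms disagree (as in the report example above), so both are kept.
differing = [
  { algorithm: "scope_offset_compressed", value: "test.c|main()[0]:5" },
  { algorithm: "scope_offset", value: "test.c|main()[0]:8" }
]
puts deduplicate_signatures(differing).size
# 2
```
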

 ---
 ### Tracking signature consumer
 In the Rails code, we differentiate between security findings (findings that
 originate from the report) and vulnerability findings (persisted in the database).
 Security findings are generated when the [report is parsed](https://gitlab.com/gitlab-org/gitlab/-/blob/e2f0c25d56d7ee5e85e00093331e55197fe66151/lib/gitlab/ci/parsers/security/common.rb#L98);
 this is also where the [UUID is generated](https://gitlab.com/gitlab-org/gitlab/-/blob/415453f3bf788579f47fb8b471629beb1e063d56/app/services/security/vulnerability_uuid.rb#L6).
 #### Scenario 1: Storing security findings temporarily
 The diagram below depicts the flow that is executed on all pipelines for
 storing security findings temporarily. One of the most interesting components
 from the vulnerability tracking perspective is the `OverrideUuidsService`.
 The `OverrideUuidsService` matches security findings against vulnerability findings at the signature level. If
 there is a match, the UUID of the security finding is overwritten
 accordingly. The `StoreFindingsService` stores the recalibrated findings in
 the `security_findings` table. Detailed documentation about how
 vulnerabilities are created, starting from the security report, is available in the
 [security report ingestion overview](security_report_ingestion_overview.md#vulnerability-creation-from-security-reports).

 Source code references:
 - [StoreScansWorker](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/workers/security/store_scans_worker.rb#L19)
 - [StoreScansService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_scans_service.rb#L19)
 - [StoreGroupedScansService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_grouped_scans_service.rb#L60)
 - [StoreScanService](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/services/security/store_scan_service.rb#L47)
 - [OverrideUuidsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/override_uuids_service.rb)
 - [StoreFindingsService](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_findings_service.rb)
 ``` mermaid
 sequenceDiagram
     Producer->>Sidekiq: gl-sast-report.json
     Sidekiq->>StoreScansWorker: <<start>>
     StoreScansWorker->>StoreScansService: pipeline id
     loop for all artifacts in "grouped" artifacts
      StoreScansService->>StoreGroupedScansService: artifacts
      loop for every artifact in artifacts
         StoreGroupedScansService->>StoreScanService: artifact
         StoreScanService->>OverrideUuidsService: security-report
         StoreScanService->>StoreFindingsService: store findings
      end
     end
 ```
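
The signature-level matching attributed to `OverrideUuidsService` above can be illustrated with a small, self-contained sketch. It is not the Rails implementation: the `SecurityFinding` and `VulnerabilityFinding` structs and the `override_uuids` helper are hypothetical stand-ins, and signatures are modeled as plain `[algorithm, value]` pairs. The sketch assumes that any shared signature counts as a match.

```ruby
# Hypothetical, simplified stand-ins for the Rails models (illustration only).
SecurityFinding = Struct.new(:uuid, :signatures, :overridden_uuid, keyword_init: true)
VulnerabilityFinding = Struct.new(:uuid, :signatures, keyword_init: true)

# Each signature is an [algorithm, value] pair. When a security finding shares
# a signature with a persisted vulnerability finding, it takes over that UUID
# and keeps its own, report-derived UUID in `overridden_uuid`.
def override_uuids(security_findings, vulnerability_findings)
  by_signature = {}
  vulnerability_findings.each do |finding|
    finding.signatures.each { |signature| by_signature[signature] ||= finding }
  end

  security_findings.each do |finding|
    shared = finding.signatures.find { |signature| by_signature.key?(signature) }
    next unless shared

    match = by_signature[shared]
    next if match.uuid == finding.uuid

    finding.overridden_uuid = finding.uuid
    finding.uuid = match.uuid
  end
end

persisted = [
  VulnerabilityFinding.new(uuid: "uuid-persisted", signatures: [["scope_offset", "test.c|main()[0]:8"]])
]
reported = [
  SecurityFinding.new(uuid: "uuid-from-report", signatures: [["scope_offset", "test.c|main()[0]:8"]]),
  SecurityFinding.new(uuid: "uuid-unrelated", signatures: [["scope_offset", "other.c|f()[0]:1"]])
]

override_uuids(reported, persisted)
reported.each { |finding| puts "#{finding.uuid} (overridden: #{finding.overridden_uuid.inspect})" }
# uuid-persisted (overridden: "uuid-from-report")
# uuid-unrelated (overridden: nil)
```
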
 #### Scenario 2: Merge request security widget
 The second scenario relates to the merge request security widget.

 Source code references:
 - [MergeRequest](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/app/models/merge_request.rb?page=2#L1975)
 - [CompareSecurityReportsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/ee/app/services/ci/compare_security_reports_service.rb#L10)
 - [VulnerabilityReportsComparer](https://gitlab.com/gitlab-org/gitlab/-/blob/da6e2037cd494ac8b73bc3ee9e69009c4cdcf124/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L96)

 The `VulnerabilityReportsComparer` computes the number of newly added or fixed
 findings. It compares the security findings between default and
 non-default branches to compute the number of added and fixed findings. The
 component also filters the results: security findings that
 correspond to existing vulnerability findings are not displayed again, which is achieved by [recalibrating the security finding UUIDs](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L70).
 The logic implemented in the
 [`UUIDOverrider`](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/ee/lib/gitlab/ci/reports/security/vulnerability_reports_comparer.rb#L161)
 is very similar to that of the
 [`OverrideUuidsService`](https://gitlab.com/gitlab-org/gitlab/-/blob/308529403c2d5ec0049b223cf444163bede4672e/ee/app/services/security/store_scan_service.rb#L47).
 ``` mermaid
 sequenceDiagram
     MergeRequestModel->>CompareSecurityReportsService: compare_sast_reports
     CompareSecurityReportsService->>VulnerabilityReportsComparer: calculate_changes
 ```
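
The added/fixed computation described for the `VulnerabilityReportsComparer` boils down to two set differences. The sketch below is a hedged illustration, not the actual comparer: the helper `compare_findings` is a hypothetical name, and findings are represented only by their (already recalibrated) UUIDs.

```ruby
# Hypothetical helper (illustration only): findings are represented by UUIDs.
# A finding is "added" if it only appears on the non-default branch and
# "fixed" if it only appears on the default branch.
def compare_findings(default_branch_uuids, non_default_branch_uuids)
  {
    added: non_default_branch_uuids - default_branch_uuids,
    fixed: default_branch_uuids - non_default_branch_uuids
  }
end

changes = compare_findings(%w[uuid-a uuid-b], %w[uuid-b uuid-c])
puts "added: #{changes[:added].size}, fixed: #{changes[:fixed].size}"
# added: 1, fixed: 1
```

Because the comparison is keyed on UUIDs, a finding whose UUID was recalibrated to that of an existing vulnerability finding appears on both sides and is reported as neither added nor fixed.
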
 #### Scenario 3: Report ingestion
 This is the point where either a security finding becomes a vulnerability or the
 vulnerability that corresponds to a security finding is updated. This scenario
 becomes relevant when a pipeline is triggered on the default branch after a
 non-default branch is merged into it. In our context, we are most
 interested in the cases where security findings have
 `overridden_uuid` set, which implies that there was a clash with an already
 existing vulnerability; `overridden_uuid` holds the UUID of the security
 finding that was overridden by the corresponding vulnerability UUID.

 The sequence below is executed to update the UUID of a vulnerability
 (fingerprint). The recomputation takes place in
 `UpdateVulnerabilityUuids`, which ultimately invokes a database update by means of the
 [`UpdateVulnerabilityUuidsVulnerabilityFinding` class](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/tasks/update_vulnerability_uuids/vulnerability_findings.rb).

 Source code references:
 - [IngestReportsService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_reports_service.rb#L55)
 - [IngestReportService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_report_service.rb#L41)
 - [IngestReportSliceService](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/ingest_report_slice_service.rb#L37)
 - [UpdateVulnerabilityUuids](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/tasks/update_vulnerability_uuids.rb#L67)
 - [FindingMap](https://gitlab.com/gitlab-org/gitlab/-/blob/1b2cc434e43b533c0b393b8c319797e69745498e/ee/app/services/security/ingestion/finding_map.rb)
 ``` mermaid
 sequenceDiagram
     IngestReportsService->>IngestReportService: security_scan
     IngestReportService->>IngestReportSliceService: sliced security_scan
     IngestReportSliceService->>UpdateVulnerabilityUuids: findings map
 ```
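
As a rough sketch of the UUID update step, the snippet below selects the findings that have `overridden_uuid` set and derives a mapping from the currently stored vulnerability UUID to the report-derived UUID. This is an assumption-laden illustration: `FindingMapSketch` and `uuid_updates` are hypothetical names, the real `FindingMap` carries more data, and the direction of the update (stored UUID replaced by `overridden_uuid`) is inferred from the description above rather than taken from the implementation.

```ruby
# Hypothetical, reduced stand-in for a finding map (illustration only).
FindingMapSketch = Struct.new(:uuid, :overridden_uuid, keyword_init: true)

# Assumption: for findings that clashed with an existing vulnerability, the
# stored UUID (`uuid`) is replaced by the report-derived UUID (`overridden_uuid`).
def uuid_updates(finding_maps)
  finding_maps
    .select(&:overridden_uuid)
    .to_h { |finding_map| [finding_map.uuid, finding_map.overridden_uuid] }
end

finding_maps = [
  FindingMapSketch.new(uuid: "uuid-stored", overridden_uuid: "uuid-recomputed"),
  FindingMapSketch.new(uuid: "uuid-unchanged", overridden_uuid: nil)
]

updates = uuid_updates(finding_maps)
stored_uuids = %w[uuid-stored uuid-unchanged]
puts stored_uuids.map { |uuid| updates.fetch(uuid, uuid) }.inspect
# ["uuid-recomputed", "uuid-unchanged"]
```
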
 ## Hierarchy: Why are algorithms prioritized and what is the impact of this prioritization?
 The supported algorithms are defined in [`VulnerabilityFindingSignatureHelpers`](https://gitlab.com/gitlab-org/gitlab/-/blob/1172e63f2485b8f3690895a3798f067429d98732/app/models/concerns/vulnerability_finding_signature_helpers.rb). Algorithms are assigned priorities (the integer values in the map below). A higher priority indicates that an algorithm is considered better than a lower-priority algorithm. In other words, going from a lower-priority algorithm to a higher-priority algorithm corresponds to a `coarsening` (better deduplication performance), and going from a higher-priority algorithm to a lower-priority algorithm corresponds to a `refinement` (weaker deduplication performance).
 ``` ruby
   ALGORITHM_TYPES = {
     hash: 1,
     location: 2,
     scope_offset: 3,
     scope_offset_compressed: 4,
     rule_value: 5
   }.with_indifferent_access.freeze
 ```
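
One way to read the priority map is that, among the signatures available for a finding, the one from the highest-priority algorithm is preferred. The following sketch illustrates that reading only. The helper `highest_priority_signature` is a hypothetical name, and the map is reproduced as a plain frozen hash because `with_indifferent_access` requires ActiveSupport.

```ruby
# Priorities copied from the map above; plain Hash instead of
# `with_indifferent_access` so the sketch runs without ActiveSupport.
ALGORITHM_TYPES = {
  hash: 1,
  location: 2,
  scope_offset: 3,
  scope_offset_compressed: 4,
  rule_value: 5
}.freeze

# Hypothetical helper (illustration only): returns the signature whose
# algorithm has the highest priority, or nil for an empty list.
def highest_priority_signature(signatures)
  signatures.max_by { |signature| ALGORITHM_TYPES.fetch(signature[:algorithm].to_sym) }
end

signatures = [
  { algorithm: "location", value: "test.c:12" },
  { algorithm: "scope_offset", value: "test.c|main()[0]:8" },
  { algorithm: "scope_offset_compressed", value: "test.c|main()[0]:5" }
]

puts highest_priority_signature(signatures)[:algorithm]
# scope_offset_compressed
```
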
