gitlab-foss/group_hierarchy_optimization.md at 43d40db9058db04df6a582a834d9e91f8b1a3f81

mirror of https://gitlab.com/gitlab-org/gitlab-foss.git synced 2025-08-06 10:19:48 +00:00

Files

GitLab Bot a4308a3876 Add latest changes from gitlab-org/gitlab@master

2025-06-05 06:07:16 +00:00

231 lines

8.5 KiB

Markdown

Raw Blame History

 ---
 stage: Data Access
 group: Database Frameworks
 info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/development/development_processes/#development-guidelines-review.
 title: Group hierarchy query optimization
 ---
 This document describes the hierarchy cache optimization strategy that helps with loading all descendants (subgroups or projects) from large group hierarchies with minimal overhead. The optimization was implemented within this GitLab [epic](https://gitlab.com/groups/gitlab-org/-/epics/11469).
 The optimization is enabled automatically via the [`Namespaces::EnableDescendantsCacheCronWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/namespaces/enable_descendants_cache_cron_worker.rb?ref_type=heads) worker for group hierarchies with descendant counts above 700 (projects and groups). Enabling the optimization manually for smaller groups will likely not have noticeable effects.
 ## Performance comparison
 Loading all group IDs for the `gitlab-org` group, including itself and its descendants.
 {{< tabs >}}
 {{< tab title="Optimized cached query" >}}
 **42 buffers** (~336.00 KiB) from the buffer pool
 ```sql
 SELECT "namespaces"."id" FROM UNNEST(
   COALESCE(
     (
       SELECT ids FROM (
         SELECT "namespace_descendants"."self_and_descendant_group_ids" AS ids
         FROM "namespace_descendants"
         WHERE "namespace_descendants"."outdated_at" IS NULL AND
         "namespace_descendants"."namespace_id" = 22
       ) cached_query
     ),
     (
       SELECT ids
       FROM (
         SELECT ARRAY_AGG("namespaces"."id") AS ids
         FROM (
           SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
           FROM "namespaces"
           WHERE "namespaces"."type" = 'Group' AND
           (traversal_ids @> ('{22}'))
         ) namespaces
       ) consistent_query
     )
   )
 ) AS namespaces(id)
 ```
 ```plaintext
  Function Scan on unnest namespaces  (cost=1296.82..1296.92 rows=10 width=8) (actual time=0.193..0.236 rows=GROUP_COUNT loops=1)
    Buffers: shared hit=42
    I/O Timings: read=0.000 write=0.000
    InitPlan 1 (returns $0)
      ->  Index Scan using namespace_descendants_12_pkey on gitlab_partitions_static.namespace_descendants_12 namespace_descendants  (cost=0.14..3.16 rows=1 width=769) (actual time=0.022..0.023 rows=1 loops=1)
            Index Cond: (namespace_descendants.namespace_id = 9970)
            Filter: (namespace_descendants.outdated_at IS NULL)
            Rows Removed by Filter: 0
            Buffers: shared hit=5
            I/O Timings: read=0.000 write=0.000
    InitPlan 2 (returns $1)
      ->  Aggregate  (cost=1293.62..1293.63 rows=1 width=32) (actual time=0.000..0.000 rows=0 loops=0)
            I/O Timings: read=0.000 write=0.000
            ->  Bitmap Heap Scan on public.namespaces namespaces_1  (cost=62.00..1289.72 rows=781 width=28) (actual time=0.000..0.000 rows=0 loops=0)
                  I/O Timings: read=0.000 write=0.000
                  ->  Bitmap Index Scan using index_namespaces_on_traversal_ids_for_groups  (cost=0.00..61.81 rows=781 width=0) (actual time=0.000..0.000 rows=0 loops=0)
                        Index Cond: (namespaces_1.traversal_ids @> '{9970}'::integer[])
                        I/O Timings: read=0.000 write=0.000
 Settings: seq_page_cost = '4', effective_cache_size = '472585MB', jit = 'off', work_mem = '100MB', random_page_cost = '1.5'
 ```
 {{< /tab >}}
 {{< tab title="Traversal ids based lookup query" >}}
 **1037 buffers** (~8.10 MiB) from the buffer pool
 ```sql
 SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
 FROM "namespaces"
 WHERE "namespaces"."type" = 'Group' AND
 (traversal_ids @> ('{22}'))
 ```
 ```plaintext
  Bitmap Heap Scan on public.namespaces  (cost=62.00..1291.67 rows=781 width=4) (actual time=0.670..2.273 rows=GROUP_COUNT loops=1)
    Buffers: shared hit=1037
    I/O Timings: read=0.000 write=0.000
    ->  Bitmap Index Scan using index_namespaces_on_traversal_ids_for_groups  (cost=0.00..61.81 rows=781 width=0) (actual time=0.561..0.561 rows=1154 loops=1)
          Index Cond: (namespaces.traversal_ids @> '{9970}'::integer[])
          Buffers: shared hit=34
          I/O Timings: read=0.000 write=0.000
 Settings: work_mem = '100MB', random_page_cost = '1.5', seq_page_cost = '4', effective_cache_size = '472585MB', jit = 'off'
 ```
 {{< /tab >}}
 {{< /tabs >}}
 ## How to use the optimization
 The optimization will be automatically used if you use one of these ActiveRecord scopes:
 ```ruby
 # Loading all groups:
 group.self_and_descendants
 # Using the IDs in subqueries:
 group.self_and_descendant_ids
 NamespaceSetting.where(namespace_id: group.self_and_descendant_ids)
 # Loading all projects:
 group.all_projects
 # Using the IDs in subqueries
 MergeRequest.where(target_project_id: group.all_project_ids)
 ```
 ## Cache invalidation
 When the group hierarchy changes, for example when a new project or subgroup is added, the cache is invalidated within the same transaction. A periodic worker called [`Namespaces::ProcessOutdatedNamespaceDescendantsCronWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/namespaces/process_outdated_namespace_descendants_cron_worker.rb?ref_type=heads) will update the cache with a slight delay. The invalidation is implemented using ActiveRecord callbacks.
 While the cache is invalidated, the hierarchical database queries will continue returning consistent values using the uncached (unoptimized) `traversal_ids` based query query.
 ## Consistent queries
 The lookup queries implement an `||` (or) functionality in SQL which allows us to check for the cached values first. If those are not present, we fall back to a full lookup of all groups or projects in the hierarchy.
 For simplification, this is how we would implement the lookup in Ruby:
 ```ruby
 if cached? && cache_up_to_date?
   return cached_project_ids
 else
   return Project.where(...).pluck(:id)
 end
 ```
 In `SQL`, we leverage the `COALESCE` function, which returns the first non-NULL expression from a list of expressions. If the first expression is not NULL, the subsequent expressions are not evaluated.
 ```sql
 SELECT COALESCE(
   (SELECT 1), -- cached query
   (SELECT 2 FROM pg_sleep(5)) -- non-cached query
 )
 ```
 The query above returns immediately however, if the first subquery returns null, the DB will execute the second query:
 ```sql
 SELECT COALESCE(
   (SELECT NULL), -- cached query
   (SELECT 2 FROM pg_sleep(5)) -- non-cached query
 )
 ```
 ## The `namespace_descendants` database table
 The cached subgroup and project ids are stored in the `namespace_descendants` database table as arrays, the most important columns:
 - `namespace_id`: primary key, this can be a top-level group ID or a subgroup ID.
 - `self_and_descendant_group_ids`: all group IDs as an array
 - `all_project_ids`: all project IDs as an array
 - `outdated_at`: signals that the cache is outdated
 ## Cached database query
 The query consists of three parts:
 - cached query
 - fallback, non-cached query
 - outer query where additional filtering and data loading (`JOIN`) can be done
 Cached query:
 ```sql
 SELECT ids -- One row, array of ids
 FROM (
   SELECT "namespace_descendants"."self_and_descendant_group_ids" AS ids
   FROM "namespace_descendants"
   WHERE "namespace_descendants"."outdated_at" IS NULL AND
   "namespace_descendants"."namespace_id" = 22
 ) cached_query
 ```
 The query returns `NULL` when the cache is outdated or the cache record does not exist.
 Fallback query, based on the `traversal_ids` lookup:
 ```sql
 SELECT ids -- One row, array of ids
 FROM (
   SELECT ARRAY_AGG("namespaces"."id") AS ids
   FROM (
     SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
     FROM "namespaces"
     WHERE "namespaces"."type" = 'Group' AND
     (traversal_ids @> ('{22}'))
   ) namespaces
 )
 ```
 Final query, combining the queries into one:
 ```sql
 SELECT "namespaces"."id" FROM UNNEST(
   COALESCE(
     (
       SELECT ids FROM (
         SELECT "namespace_descendants"."self_and_descendant_group_ids" AS ids
         FROM "namespace_descendants"
         WHERE "namespace_descendants"."outdated_at" IS NULL AND
         "namespace_descendants"."namespace_id" = 22
       ) cached_query
     ),
     (
       SELECT ids
       FROM (
         SELECT ARRAY_AGG("namespaces"."id") AS ids
         FROM (
           SELECT namespaces.traversal_ids[array_length(namespaces.traversal_ids, 1)] AS id
           FROM "namespaces"
           WHERE "namespaces"."type" = 'Group' AND
           (traversal_ids @> ('{22}'))
         ) namespaces
       ) consistent_query
     )
   )
 ) AS namespaces(id)
 ```

231 lines 8.5 KiB Markdown Raw Blame History

231 lines

8.5 KiB

Markdown

Raw Blame History