gitlab-foss/offset_pagination_optimization.md at 2d39aad5e5ae079992523c918ede3a5afeac3706

mirror of https://gitlab.com/gitlab-org/gitlab-foss.git synced 2025-08-03 16:04:30 +00:00

Files

GitLab Bot 2b2746757e Add latest changes from gitlab-org/gitlab@master

2025-05-30 00:20:13 +00:00

168 lines

7.6 KiB

Markdown

Raw Blame History

 ---
 stage: Data Access
 group: Database Frameworks
 info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/development/development_processes/#development-guidelines-review.
 title: Offset pagination optimization
 ---
 In many REST APIs endpoints, we use [offset-based pagination](pagination_guidelines.md#offset-pagination) which uses the `page` URL parameter to paginate through the results. Offset pagination [scales linearly](pagination_guidelines.md#offset-on-a-large-dataset), the higher the page number, the slower the database query gets. This means that for large page numbers, the database query can time out. This usually happens when third-party integrations and scripts interact with the system as users are unlikely to deliberately visit a high page number.
 The ideal way of dealing with scalability problems related to offset pagination is switching to [keyset pagination](pagination_guidelines.md#keyset-pagination) however, this is means a breaking API change. To provide a temporary, stop-gap measure, you can use the [`Gitlab::Pagination::Offset::PaginationWithIndexOnlyScan`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/gitlab/pagination/offset/pagination_with_index_only_scan.rb) class. The optimization can help in certain cases to improve the performance of offset-paginated queries when high `OFFSET` value is present. The performance improvement means that the queries will continue to scale linearly with improved query timings, which means that reaching database timeout will happen at a much higher `page` number if it happens at all.
 ## Requirements for using the optimization
 The optimization avoids calling `SELECT *` when determining the records based on the `ORDER BY`, `OFFSET`, and `LIMIT` clauses and attempts to use an index only scan to reduce database I/O. To use the optimization, the same requirements must be met as for keyset pagination:
 - `ORDER BY` clause is present.
 - The `ORDER BY` clause uniquely identifies one database column.
   - Good, uses the primary key: `ORDER BY id`
   - Bad, `created_at` not unique: `ORDER BY created_at`
   - Good, there is a [tie-breaker](pagination_performance_guidelines.md#tie-breaker-column): `ORDER BY created_at, id`
 - The query is well-covered with a database index.
 ## How to use the optimizator class
 The optimizator class can be used with `ActiveRecord::Relation` objects, as a result, it will return an optimized, [kaminari-paginated](https://github.com/kaminari/kaminari) `ActiveRecord::Relation` object. In case the optimization cannot be applied, the original `ActiveRecord::Relation` object will be used for the pagination.
 Basic usage:
 ```ruby
 scope = Issue.where(project_id: 1).order(:id)
 records = Gitlab::Pagination::Offset::PaginationWithIndexOnlyScan.new(scope: scope, page: 5, per_page: 100).paginate_with_kaminari
 puts records.to_a
 ```
 Optimizations should be always rolled out with feature flags, you can also target the usage of the optimization when certain conditions are met.
 ```ruby
 # - Only apply optimization for large page number lookups
 # - When label_names filter parameter is given, the optimziation will not have effect (complex JOIN).
 if params[:page] > 100 && params[:label_names].blank? && Feature.enabled?(:my_optimized_offet_query)
   Gitlab::Pagination::Offset::PaginationWithIndexOnlyScan.new(scope: scope, page: params[:page], per_page: params[:per_page]).paginate_with_kaminari
 else
   scope.page(params[:page]).per(params[:per_page])
 end
 ```
 ## How does the optimization work
 The optimization takes the passed `ActiveRecord::Relation` object and moves it into a CTE. Within the CTE, the original query is altered to only
 select the `ORDER BY` columns. This will make it possible for the database to use index only scan.
 When the query is executed, the query within the CTE is evaluated first, the CTE will contain `LIMIT` number of rows with the selected columns.
 Using the `ORDER BY` values, a `LATERAL` query will locate the full rows one by one. `LATERAL` query is used here in order to force out
 a nested loop: for each row in the CTE, look up a full row in the table.
 Original query:
 - Reads `OFFSET + LIMIT` number of entries from the index.
 - Reads `OFFSET + LIMIT` number of rows from the table.
 Optimized query:
 - Reads `OFFSET + LIMIT` number of entries from the index.
 - Reads `LIMIT` number of rows from the table.
 ## Determine if the optimization helps
 By running `EXPLAIN (buffers, analyze)` on the database query with a high (100_000) `OFFSET` value, we can clearly see if the optimization helps.
 Look for the following:
 - The optimized query plan must have an index only scan node.
 - Comparing the cached buffer count and timing should be lower.
   - This can be done when executing the same query 2 or 3 times.
 Consider the following query:
 ```sql
 SELECT issues.*
 FROM issues
 ORDER BY id
 OFFSET 100000
 LIMIT 100
 ```
 It produces an execution plan which uses an index scan:
 ```plaintext
  Limit  (cost=27800.96..27828.77 rows=100 width=1491) (actual time=138.305..138.470 rows=100 loops=1)
    Buffers: shared hit=73212
    I/O Timings: read=0.000 write=0.000
    ->  Index Scan using issues_pkey on public.issues  (cost=0.57..26077453.90 rows=93802448 width=1491) (actual time=0.063..133.688 rows=100100 loops=1)
          Buffers: shared hit=73212
          I/O Timings: read=0.000 write=0.000
 Time: 143.779 ms
   - planning: 5.222 ms
   - execution: 138.557 ms
     - I/O read: 0.000 ms
     - I/O write: 0.000 ms
 Shared buffers:
   - hits: 73212 (~572.00 MiB) from the buffer pool
   - reads: 0 from the OS file cache, including disk I/O
   - dirtied: 0
   - writes: 0
 ```
 The optimized query:
 ```sql
 WITH index_only_scan_pagination_cte AS MATERIALIZED
   (SELECT id
    FROM issues
    ORDER BY id ASC
    LIMIT 100
    OFFSET 100000)
 SELECT issues.*
 FROM
   (SELECT id
    FROM index_only_scan_pagination_cte) index_only_scan_subquery,
      LATERAL
   (SELECT issues.*
    FROM issues
    WHERE issues.id = index_only_scan_subquery.id
    LIMIT 1) issues
 ```
 Execution plan:
 ```plaintext
  Nested Loop  (cost=2453.51..2815.44 rows=100 width=1491) (actual time=23.614..23.973 rows=100 loops=1)
    Buffers: shared hit=56167
    I/O Timings: read=0.000 write=0.000
    CTE index_only_scan_pagination_cte
      ->  Limit  (cost=2450.49..2452.94 rows=100 width=4) (actual time=23.590..23.621 rows=100 loops=1)
            Buffers: shared hit=55667
            I/O Timings: read=0.000 write=0.000
            ->  Index Only Scan using issues_pkey on public.issues issues_1  (cost=0.57..2298090.72 rows=93802448 width=4) (actual time=0.070..20.412 rows=100100 loops=1)
                  Heap Fetches: 1063
                  Buffers: shared hit=55667
                  I/O Timings: read=0.000 write=0.000
    ->  CTE Scan on index_only_scan_pagination_cte  (cost=0.00..2.00 rows=100 width=4) (actual time=23.593..23.641 rows=100 loops=1)
          Buffers: shared hit=55667
          I/O Timings: read=0.000 write=0.000
    ->  Limit  (cost=0.57..3.58 rows=1 width=1491) (actual time=0.003..0.003 rows=1 loops=100)
          Buffers: shared hit=500
          I/O Timings: read=0.000 write=0.000
          ->  Index Scan using issues_pkey on public.issues  (cost=0.57..3.58 rows=1 width=1491) (actual time=0.003..0.003 rows=1 loops=100)
                Index Cond: (issues.id = index_only_scan_pagination_cte.id)
                Buffers: shared hit=500
                I/O Timings: read=0.000 write=0.000
 Time: 29.562 ms
   - planning: 5.506 ms
   - execution: 24.056 ms
     - I/O read: 0.000 ms
     - I/O write: 0.000 ms
 Shared buffers:
   - hits: 56167 (~438.80 MiB) from the buffer pool
   - reads: 0 from the OS file cache, including disk I/O
   - dirtied: 0
   - writes: 0
 ```

168 lines 7.6 KiB Markdown Raw Blame History

168 lines

7.6 KiB

Markdown

Raw Blame History