 ---
 stage: none
 group: unassigned
 info: Any user with at least the Maintainer role can merge updates to this content. For details, see https://docs.gitlab.com/development/development_processes/#development-guidelines-review.
 title: Redis development guidelines
 ---
 ## Redis instances
 GitLab uses [Redis](https://redis.io) for the following distinct purposes:
 - [Caching](#caching) (mostly via `Rails.cache`).
 - As a job processing queue with [Sidekiq](sidekiq/_index.md).
 - To manage the shared application state.
 - To store CI trace chunks.
 - As a Pub/Sub queue backend for ActionCable.
 - Rate limiting state storage.
 - Sessions.
 In most environments (including the GDK), all of these point to the same
 Redis instance.
 On GitLab.com, we use [separate Redis instances](../administration/redis/replication_and_failover.md#running-multiple-redis-clusters).
 See the [Redis SRE guide](https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/redis/redis-survival-guide-for-sres.md)
 for more details on our setup.
 Every application process is configured to use the same Redis servers, so they
 can be used for inter-process communication in cases where [PostgreSQL](sql.md)
 is less appropriate. For example, transient state or data that is written much
 more often than it is read.
 If [Geo](geo.md) is enabled, each Geo site gets its own, independent Redis
 database.
 We have [development documentation on adding a new Redis instance](redis/new_redis_instance.md).
 ## Key naming
 Redis is a flat namespace with no hierarchy, which means we must pay attention
 to key names to avoid collisions. Typically we use colon-separated elements to
 provide a semblance of structure at application level. An example might be
 `projects:1:somekey`.
 Although we split our Redis usage by purpose into distinct categories, and
 those may map to separate Redis servers in a Highly Available
 configuration like GitLab.com, the default Omnibus and GDK setups share
 a single Redis server. This means that keys should **always** be
 globally unique across all categories.
 It is usually better to use immutable identifiers - project ID rather than
 full path, for instance - in Redis key names. If full path is used, the key
 stops being consulted if the project is renamed. If the contents of the key are
 invalidated by a name change, it is better to include a hook that expires
 the entry, instead of relying on the key changing.
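
As a minimal sketch of the naming advice above, the following builds a colon-separated key from an immutable ID. The `project_cache_key` helper and the `Project` struct are hypothetical names used only for illustration; they are not part of the GitLab codebase.

```ruby
# Hypothetical stand-in for a project record: the ID never changes,
# while the full path changes when the project is renamed or moved.
Project = Struct.new(:id, :full_path)

# Hypothetical helper: builds a colon-separated key from the immutable ID.
def project_cache_key(project_id, suffix)
  raise ArgumentError, 'project_id must be an Integer' unless project_id.is_a?(Integer)

  "projects:#{project_id}:#{suffix}"
end

project = Project.new(1, 'group/old-name')
key_before_rename = project_cache_key(project.id, 'somekey')

project.full_path = 'group/new-name'
key_after_rename = project_cache_key(project.id, 'somekey')

# The key is unaffected by the rename, so the entry is still found.
puts key_before_rename == key_after_rename
```
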
 ### Multi-key commands
 GitLab supports Redis Cluster for [cache-related workloads](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/redis/cache.rb), introduced in [epic 878](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/878).
 This imposes an additional constraint on naming: where GitLab performs
 operations that require several keys to be held on the same Redis server (for
 instance, diffing two sets held in Redis), the key names must guarantee this by
 enclosing the changeable part in curly braces.
 For example:
 ```plaintext
 project:{1}:set_a
 project:{1}:set_b
 project:{2}:set_c
 ```
 `set_a` and `set_b` are guaranteed to be held on the same Redis server, while `set_c` is not.
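
The guarantee follows from how Redis Cluster assigns keys to hash slots: it applies CRC16 (XMODEM variant) modulo 16384 to the key, or only to the hash tag when the key contains a non-empty `{...}` section. The sketch below reimplements that rule in plain Ruby purely for illustration. The method names are hypothetical, and application code should rely on the Redis client rather than computing slots itself.

```ruby
# CRC16 (XMODEM variant: polynomial 0x1021, initial value 0),
# the checksum named in the Redis Cluster specification.
def crc16_xmodem(data)
  data.each_byte.reduce(0) do |crc, byte|
    crc ^= byte << 8
    8.times do
      crc = (crc & 0x8000).zero? ? (crc << 1) : ((crc << 1) ^ 0x1021)
      crc &= 0xFFFF
    end
    crc
  end
end

# Returns the part of the key that is hashed: the content between the
# first `{` and the first `}` after it, if that content is non-empty.
# Otherwise the whole key is hashed.
def hashed_part(key)
  open_index = key.index('{')
  return key unless open_index

  close_index = key.index('}', open_index + 1)
  return key if close_index.nil? || close_index == open_index + 1

  key[(open_index + 1)...close_index]
end

def hash_slot(key)
  crc16_xmodem(hashed_part(key)) % 16_384
end

# Keys sharing the `{1}` hash tag always land in the same slot.
puts hash_slot('project:{1}:set_a') == hash_slot('project:{1}:set_b')
```

Because only `1` is hashed for both `set_a` and `set_b`, the rest of the key name has no influence on placement.
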
 Currently, we validate this in the development and test environments
 with the [`RedisClusterValidator`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/instrumentation/redis_cluster_validator.rb),
 which is enabled for the `cache` and `shared_state`
 [Redis instances](https://docs.gitlab.com/omnibus/settings/redis.html#running-with-multiple-redis-instances).
 Developers are highly encouraged to use [hash-tags](https://redis.io/docs/latest/operate/oss_and_stack/reference/cluster-spec/#hash-tags)
 where appropriate to facilitate future adoption of Redis Cluster in more Redis types. For example, the Namespace model uses hash-tags
 for its [config cache keys](https://gitlab.com/gitlab-org/gitlab/-/blob/1a12337058f260d38405886d82da5e8bb5d8da0b/app/models/namespace.rb#L786).
 To perform multi-key commands, developers may use the [`.pipelined`](https://github.com/redis-rb/redis-cluster-client#interfaces) method which splits and sends commands to each node and aggregates replies.
 However, this does not work for [transactions](https://redis.io/docs/latest/develop/interact/transactions/) as Redis Cluster does not support cross-slot transactions.
 For `Rails.cache`, we handle the `MGET` command found in `read_multi_get` by [patching it](https://gitlab.com/gitlab-org/gitlab/-/blob/c2bad2aac25e2f2778897bd4759506a72b118b15/lib/gitlab/patch/redis_cache_store.rb#L10) to use the `.pipelined` method.
 The pipeline batch size defaults to 1000 commands and can be adjusted with the `GITLAB_REDIS_CLUSTER_PIPELINE_BATCH_LIMIT` environment variable.
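
The following is a rough sketch of how a multi-key read can be split into pipelined batches. It is not the GitLab patch itself: `FakeRedis` is a stand-in that mimics only the block form of `pipelined` and `get` so the example runs without a server, `read_multi_in_batches` is a hypothetical helper, and the fallback to 1000 when the environment variable is unset is an assumption.

```ruby
# Stand-in client (an assumption for this sketch): counts round trips
# and answers `get` from an in-memory hash.
class FakeRedis
  attr_reader :round_trips

  def initialize(data)
    @data = data
    @round_trips = 0
  end

  # Mimics `redis.pipelined { |pipeline| ... }`, returning all replies.
  def pipelined
    @round_trips += 1
    pipeline = Pipeline.new(@data)
    yield pipeline
    pipeline.replies
  end

  class Pipeline
    attr_reader :replies

    def initialize(data)
      @data = data
      @replies = []
    end

    def get(key)
      @replies << @data[key]
    end
  end
end

DEFAULT_BATCH_LIMIT = 1000

# Reads the limit from the environment, falling back to the default
# when the variable is unset or not a positive number (assumed behavior).
def batch_limit
  limit = ENV['GITLAB_REDIS_CLUSTER_PIPELINE_BATCH_LIMIT'].to_i
  limit.positive? ? limit : DEFAULT_BATCH_LIMIT
end

# Hypothetical helper: one pipeline per batch of keys, replies merged
# back into a key => value hash.
def read_multi_in_batches(redis, keys)
  values = keys.each_slice(batch_limit).flat_map do |batch|
    redis.pipelined { |pipeline| batch.each { |key| pipeline.get(key) } }
  end

  keys.zip(values).to_h
end

data = (1..2500).to_h { |i| ["projects:#{i}:somekey", "value-#{i}"] }
redis = FakeRedis.new(data)
result = read_multi_in_batches(redis, data.keys)
puts result.size
```
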
 ## Redis in structured logging
 For GitLab Team Members: There are <i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
 [basic](https://www.youtube.com/watch?v=Uhdj19Dc6vU) and
 <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [advanced](https://youtu.be/jw1Wv2IJxzs)
 videos that show how you can work with the Redis
 structured logging fields on GitLab.com.
 Our [structured logging](logging.md#use-structured-json-logging) for web
 requests and Sidekiq jobs contains fields for the duration, call count,
 bytes written, and bytes read per Redis instance, along with a total for
 all Redis instances. For a particular request, this might look like:
 | Field | Value |
 | --- | --- |
 | `json.queue_duration_s` | 0.01 |
 | `json.redis_cache_calls` | 1 |
 | `json.redis_cache_duration_s` | 0 |
 | `json.redis_cache_read_bytes` | 109 |
 | `json.redis_cache_write_bytes` | 49 |
 | `json.redis_calls` | 2 |
 | `json.redis_duration_s` | 0.001 |
 | `json.redis_read_bytes` | 111 |
 | `json.redis_shared_state_calls` | 1 |
 | `json.redis_shared_state_duration_s` | 0 |
 | `json.redis_shared_state_read_bytes` | 2 |
 | `json.redis_shared_state_write_bytes` | 206 |
 | `json.redis_write_bytes` | 255 |
 As all of these fields are indexed, it is then straightforward to
 investigate Redis usage in production. For instance, to find the
 requests that read the most data from the cache, we can just sort by
 `redis_cache_read_bytes` in descending order.
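
The same sorting can be done offline on exported log lines. This is a minimal sketch that assumes one JSON object per line with unprefixed field names. The `path` field and the sample values are hypothetical, and `top_cache_readers` is an illustrative helper rather than an existing tool.

```ruby
require 'json'

# Hypothetical exported log lines; only `redis_cache_read_bytes`
# is a field name taken from the table above.
LOG_LINES = [
  '{"path":"/group/project","redis_cache_read_bytes":109}',
  '{"path":"/dashboard","redis_cache_read_bytes":20480}',
  '{"path":"/health","redis_calls":0}',
  '{"path":"/group/project/-/issues","redis_cache_read_bytes":4096}'
].freeze

# Returns the entries that read the most bytes from the cache,
# largest first. Entries without the field count as zero.
def top_cache_readers(lines, limit: 2)
  lines
    .map { |line| JSON.parse(line) }
    .sort_by { |entry| -entry.fetch('redis_cache_read_bytes', 0) }
    .first(limit)
end

top_cache_readers(LOG_LINES).each do |entry|
  puts "#{entry['path']}: #{entry['redis_cache_read_bytes']} bytes"
end
```
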
 ### The slow log
 {{< alert type="note" >}}
 There is a [video showing how to see the slow log](https://youtu.be/BBI68QuYRH8) (GitLab internal)
 on GitLab.com.
 {{< /alert >}}
 On GitLab.com, entries from the [Redis slow log](https://redis.io/docs/latest/commands/slowlog/) are available in the
 `pubsub-redis-inf-gprd*` index with the [`redis.slowlog` tag](https://log.gprd.gitlab.net/app/kibana#/discover?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-1d,to:now))&_a=(columns:!(json.type,json.command,json.exec_time_s),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:AWSQX_Vf93rHTYrsexmk,key:json.tag,negate:!f,params:(query:redis.slowlog),type:phrase),query:(match:(json.tag:(query:redis.slowlog,type:phrase))))),index:AWSQX_Vf93rHTYrsexmk)).
 This shows commands that have taken a long time and may be a performance
 concern.
 The [`fluent-plugin-redis-slowlog`](https://gitlab.com/gitlab-org/ruby/gems/fluent-plugin-redis-slowlog)
 project is responsible for taking the `slowlog` entries from Redis and
 passing them to Fluentd (and ultimately Elasticsearch).
 ## Analyzing the entire keyspace
 The [Redis Keyspace Analyzer](https://gitlab.com/gitlab-com/gl-infra/redis-keyspace-analyzer)
 project contains tools for dumping the full key list and memory usage of a Redis
 instance, and then analyzing those lists while eliminating potentially sensitive
 data from the results. It can be used to find the most frequent key patterns, or
 those that use the most memory.
 Currently this is not run automatically for the GitLab.com Redis instances, but
 is run manually on an as-needed basis.
 ## N+1 calls problem
 {{< history >}}
 - Introduced in [`spec/support/helpers/redis_commands/recorder.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/support/helpers/redis_commands/recorder.rb) via [`f696f670`](https://gitlab.com/gitlab-org/gitlab/-/commit/f696f670005435472354a3dc0c01aa271aef9e32)
 {{< /history >}}
 `RedisCommands::Recorder` is a tool for detecting Redis N+1 call problems in tests.
 Redis is often used for caching purposes. Usually, cache calls are lightweight and
 cannot generate enough load to affect the Redis instance. However, it is still
 possible to trigger expensive cache recalculations without knowing that. Use this
 tool to analyze Redis calls, and define expected limits for them.
 ### Create a test
 `RedisCommands::Recorder` is implemented as an [`ActiveSupport::Notifications`](https://api.rubyonrails.org/classes/ActiveSupport/Notifications.html) instrumenter.
 You can create a test that verifies that the code under test makes only
 a single Redis call:
 ```ruby
 it 'avoids N+1 Redis calls' do
   control = RedisCommands::Recorder.new { visit_page }
   expect(control.count).to eq(1)
 end
 ```
 or a test that verifies the number of specific Redis calls:
 ```ruby
 it 'avoids N+1 sadd Redis calls' do
   control = RedisCommands::Recorder.new { visit_page }
   expect(control.by_command(:sadd).count).to eq(1)
 end
 ```
 You can also provide a pattern to capture only specific Redis calls:
 ```ruby
 it 'avoids N+1 Redis calls to forks_count key' do
   control = RedisCommands::Recorder.new(pattern: 'forks_count') { visit_page }
   expect(control.count).to eq(1)
 end
 ```
 You can also use the special matchers `exceed_redis_calls_limit` and
 `exceed_redis_command_calls_limit` to define an upper limit for
 the number of Redis calls:
 ```ruby
 it 'avoids N+1 Redis calls' do
   control = RedisCommands::Recorder.new { visit_page }
   expect(control).not_to exceed_redis_calls_limit(1)
 end
 ```
 ```ruby
 it 'avoids N+1 sadd Redis calls' do
   control = RedisCommands::Recorder.new { visit_page }
   expect(control).not_to exceed_redis_command_calls_limit(:sadd, 1)
 end
 ```
 These tests can help to identify N+1 problems related to Redis calls,
 and make sure that the fix for them works as expected.
 ### See also
 - [Database query recorder](database/query_recorder.md)
 ## Caching
 The Redis instance used by `Rails.cache` can be
 [configured with a key eviction policy](https://docs.gitlab.com/omnibus/settings/redis/#setting-the-redis-cache-instance-as-an-lru),
 generally LRU, where the "least recently used" cache items are evicted (deleted) when a memory limit is reached.
 See GitLab.com's
 [key eviction configuration](https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/blob/f7c1494f5546b12fe12352590590b8a883aa6388/roles/gprd-base-db-redis-cluster-cache.json#L54)
 for its Redis instance used by `Rails.cache`, `redis-cluster-cache`. This Redis instance should not reach its
 [max memory limit](https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/blob/f7c1494f5546b12fe12352590590b8a883aa6388/roles/gprd-base-db-redis-cluster-cache.json#L54)
 because key eviction at `maxmemory` adds latency while eviction is taking place. See [this issue](https://gitlab.com/gitlab-com/gl-infra/observability/team/-/issues/1601) for details, and see [current memory usage for `redis-cluster-cache`](https://dashboards.gitlab.net/goto/1BcjmqhNR?orgId=1).
 As data in this cache can disappear earlier than its set expiration time,
 use `Rails.cache` for data that is truly cache-like and ephemeral.
 For data that should be reliably persisted in Redis rather than cached, you can use [`Gitlab::Redis::SharedState`](#gitlabrediscachesharedstatequeues).
 ### Utility classes
 We have some extra classes to help with specific use cases. These are
 mostly for fine-grained control of Redis usage, so they wouldn't be used
 in combination with the `Rails.cache` wrapper: we'd either use
 `Rails.cache` or these classes and literal Redis commands.
 We prefer using `Rails.cache` so we can reap the benefits of future
 optimizations done to Rails. Ruby objects are
 [marshalled](https://github.com/rails/rails/blob/v6.0.3.1/activesupport/lib/active_support/cache/redis_cache_store.rb#L447)
 when written to Redis, so we must pay attention to store neither huge objects,
 nor untrusted user input.
 Typically we would only use these classes when at least one of the
 following is true:
 1. We want to manipulate data on a non-cache Redis instance.
 2. `Rails.cache` does not support the operations we want to perform.
 #### `Gitlab::Redis::{Cache,SharedState,Queues}`
 These classes wrap the Redis instances (using
 [`Gitlab::Redis::Wrapper`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/redis/wrapper.rb))
 to make it convenient to work with them directly. The typical use is to
 call `.with` on the class, which takes a block that yields the Redis
 connection. For example:
 ```ruby
 # Get the value of `key` from the shared state (persistent) Redis
 Gitlab::Redis::SharedState.with { |redis| redis.get(key) }
 # Check if `value` is a member of the set `key`
 Gitlab::Redis::Cache.with { |redis| redis.sismember(key, value) }
 ```
 `Gitlab::Redis::Cache` [shares the same Redis instance](https://gitlab.com/gitlab-org/gitlab/-/blob/fbcb99a6edec13a4d26010ce927b705f13db1ef8/config/initializers/7_redis.rb#L18)
 as `Rails.cache`, and so has a key eviction policy if one [is configured](#caching). Use this class for data
 that is truly cache-like and could be regenerated if absent.
 Ensure you **always** set a TTL for keys when using this class
 as it does not set a default TTL, unlike `Rails.cache` whose default TTL
 [is 8 hours](https://gitlab.com/gitlab-org/gitlab/-/blob/a3e435da6e9f7c98dc05eccb1caa03c1aed5a2a8/lib/gitlab/redis/cache.rb#L26). Consider using an 8-hour TTL for general caching: it matches a workday, so a user would generally have only one cache miss per day for the same content.
 When you anticipate adding a large workload to the cache or are in doubt about its production impact, reach out to [`#g_durability`](https://gitlab.enterprise.slack.com/archives/C07U8G0LHEH).
 `Gitlab::Redis::SharedState` [will not be configured with a key eviction policy](https://docs.gitlab.com/omnibus/settings/redis/#setting-the-redis-cache-instance-as-an-lru).
 Use this class for data that cannot be regenerated and is expected to be persisted until its set expiration time.
 It also does not set a default TTL for keys, so a TTL should nearly always be set for keys when using this class.
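
As a rough illustration of always setting a TTL, the sketch below writes a key with an 8-hour expiry and checks that no key is left without one. `InMemoryRedis` and `CacheStandIn` are hypothetical stand-ins so the example runs outside GitLab. They mimic only the `set(key, value, ex:)`, `get`, and `ttl` calls of the Ruby Redis client, and the stand-in does not count the TTL down. In application code, the block would be passed to `Gitlab::Redis::Cache.with` or `Gitlab::Redis::SharedState.with` instead.

```ruby
# Hypothetical in-memory stand-in for a Redis connection.
class InMemoryRedis
  def initialize
    @store = {}
  end

  # `ex:` is the expiry in seconds, as in the Redis SET command.
  def set(key, value, ex: nil)
    @store[key] = { value: value.to_s, ttl: ex }
    'OK'
  end

  def get(key)
    @store.dig(key, :value)
  end

  # Follows the Redis TTL convention: -2 for a missing key,
  # -1 for a key that has no expiry.
  def ttl(key)
    entry = @store[key]
    return -2 unless entry

    entry[:ttl] || -1
  end
end

# Hypothetical stand-in for the `.with` interface of the wrapper classes.
module CacheStandIn
  def self.with
    yield(@redis ||= InMemoryRedis.new)
  end
end

EIGHT_HOURS_IN_SECONDS = 8 * 60 * 60

CacheStandIn.with do |redis|
  redis.set('projects:1:somekey', 'cached value', ex: EIGHT_HOURS_IN_SECONDS)
end

# A TTL of -1 would mean the key never expires, which is what to avoid.
ttl = CacheStandIn.with { |redis| redis.ttl('projects:1:somekey') }
puts ttl.positive?
```
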
 #### `Gitlab::Redis::Boolean`
 In Redis, every value is a string.
 [`Gitlab::Redis::Boolean`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/redis/boolean.rb)
 makes sure that booleans are encoded and decoded consistently.
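
To see why consistent encoding matters, consider that a `false` written to Redis comes back as the string `"false"`, which Ruby treats as truthy. The sketch below shows one way to encode and decode booleans explicitly. `BooleanCodec` and its `bool:` prefix are hypothetical and are not the actual `Gitlab::Redis::Boolean` implementation; check the linked source for the real interface.

```ruby
# Hypothetical codec illustrating the idea: booleans are stored as
# tagged strings and anything else is rejected on decode.
module BooleanCodec
  PREFIX = 'bool:'
  TRUE_STRING = "#{PREFIX}1"
  FALSE_STRING = "#{PREFIX}0"

  def self.encode(value)
    raise ArgumentError, "not a boolean: #{value.inspect}" unless [true, false].include?(value)

    value ? TRUE_STRING : FALSE_STRING
  end

  def self.decode(string)
    case string
    when TRUE_STRING then true
    when FALSE_STRING then false
    else raise ArgumentError, "not an encoded boolean: #{string.inspect}"
    end
  end
end

# The naive round trip: the string "false" is truthy in Ruby.
naive = false.to_s
puts naive ? 'naive value is truthy' : 'naive value is falsy'

# The explicit round trip preserves the boolean.
puts BooleanCodec.decode(BooleanCodec.encode(false)).inspect
```
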
 #### `Gitlab::Redis::HLL`
 The Redis [`PFCOUNT`](https://redis.io/docs/latest/commands/pfcount/),
 [`PFADD`](https://redis.io/docs/latest/commands/pfadd/), and
 [`PFMERGE`](https://redis.io/docs/latest/commands/pfmerge/) commands operate on
 HyperLogLogs, a data structure that allows estimating the number of unique
 elements with low memory usage. For more information,
 see [HyperLogLogs in Redis](https://thoughtbot.com/blog/hyperloglogs-in-redis).
 [`Gitlab::Redis::HLL`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/redis/hll.rb)
 provides a convenient interface for adding and counting values in HyperLogLogs.
 #### `Gitlab::SetCache`
 For cases where we need to efficiently check whether an item is in a group
 of items, we can use a Redis set.
 [`Gitlab::SetCache`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/set_cache.rb)
 provides an `#include?` method that uses the
 [`SISMEMBER`](https://redis.io/docs/latest/commands/sismember/) command, as well as `#read`
 to fetch all entries in the set.
 This is used by the
 [`RepositorySetCache`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/repository_set_cache.rb)
 to provide a convenient way to use sets to cache repository data like branch
 names.
 ## Background migration
 Redis-based migrations involve using the `SCAN` command to scan the entire Redis instance for certain key patterns.
 For large Redis instances, the migration might [exceed the time limit](migration_style_guide.md#how-long-a-migration-should-take)
 for regular or post-deployment migrations. [`RedisMigrationWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/redis_migration_worker.rb)
 performs long-running Redis migrations as a background migration.
 To perform a background migration, create a class:
 ```ruby
 module Gitlab
   module BackgroundMigration
     module Redis
       class BackfillCertainKey
         def perform(keys)
           # implement logic to clean up or backfill keys
         end
         def scan_match_pattern
           # define the match pattern for the `SCAN` command
         end
         def redis
           # define the exact Redis instance
         end
       end
     end
   end
 end
 ```
 To trigger the worker through a post-deployment migration:
 ```ruby
 class ExampleBackfill < Gitlab::Database::Migration[2.1]
   disable_ddl_transaction!
   MIGRATION = 'BackfillCertainKey'
   def up
     queue_redis_migration_job(MIGRATION)
   end
 end
 ```