How to
Set embedding model
Prerequisite: the collection, queue, and reference classes exist.
Add target field
Add the target field for storing embeddings either in the migration that creates the collection or in a separate migration.
Set the vector field as part of the create_collection migration:
# frozen_string_literal: true

class CreateMergeRequests < ActiveContext::Migration[1.0]
  milestone '18.0'

  def migrate!
    create_collection :merge_requests, number_of_partitions: 3 do |c|
      c.bigint :issue_id, index: true
      c.prefix :traversal_ids
      c.vector :embedding_1, dimensions: 768
    end
  end
end
Separate migration helper coming soon.
Add MODELS hash to collection class
Add a hash entry on the collection class to indicate which target field and embedding model to select in the following format:
MODELS = { 0 => { field: :embedding_1, model: 'textembedding-gecko@003' } }
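As a rough illustration of how such a hash can be read, the sketch below resolves a version number to its target field and model. `ModelLookupSketch` and `embedding_config` are hypothetical names used only here; they are not part of ActiveContext, and the real collection class may resolve versions differently.

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for a collection class; not the ActiveContext API.
module ModelLookupSketch
  MODELS = { 0 => { field: :embedding_1, model: 'textembedding-gecko@003' } }.freeze

  # Returns the { field:, model: } entry for an embedding version.
  # Raises KeyError when the version has no entry in MODELS.
  def self.embedding_config(version)
    MODELS.fetch(version)
  end
end

config = ModelLookupSketch.embedding_config(0)
puts "version 0 writes #{config[:field]} using #{config[:model]}"
```

Using `fetch` rather than `[]` makes a version number with no MODELS entry fail loudly instead of returning nil.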
Set collection indexing_embedding_versions
Add a migration to set indexing_embedding_versions to [0] for the collection. This starts indexing embeddings using MODELS[0].
# frozen_string_literal: true

class SetMergeRequestIndexEmbeddingVersionsTo0 < ActiveContext::Migration[1.0]
  milestone '18.0'

  def migrate!
    update_collection_metadata(collection: collection, metadata: metadata)
  end

  def metadata
    { indexing_embedding_versions: [0] }
  end

  def collection
    Ai::Context::Collections::MergeRequest
  end
end
Add a migration to backfill target field
Backfill migration helper coming soon.
Set collection search_embedding_version
Add a migration to set search_embedding_version to 0 for the collection. This allows knn searches to run against :embedding_1 using the model in MODELS[0].
# frozen_string_literal: true

class SetMergeRequestSearchEmbeddingVersionTo0 < ActiveContext::Migration[1.0]
  milestone '18.0'

  def migrate!
    update_collection_metadata(collection: collection, metadata: metadata)
  end

  def metadata
    { search_embedding_version: 0 }
  end

  def collection
    Ai::Context::Collections::MergeRequest
  end
end
Once this migration is complete, a knn search can be performed as:
query = ActiveContext::Query.knn(content: "a question", limit: 5)
Ai::Context::Collections::MergeRequest.search(query: query, user: user)
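To make the relationship between search_embedding_version and MODELS concrete, here is a minimal sketch of how a search target could be resolved from the collection metadata. `SearchTargetSketch` and `search_target` are hypothetical, and returning nil for an unset version is an assumption made for this illustration, not documented ActiveContext behavior.

```ruby
# frozen_string_literal: true

# Hypothetical stand-in for the collection's MODELS hash and stored metadata.
module SearchTargetSketch
  MODELS = { 0 => { field: :embedding_1, model: 'textembedding-gecko@003' } }.freeze

  # Returns the { field:, model: } entry a knn search would use,
  # or nil when search_embedding_version has not been set (assumed behavior).
  def self.search_target(metadata)
    version = metadata[:search_embedding_version]
    return nil if version.nil?

    MODELS.fetch(version)
  end
end

target = SearchTargetSketch.search_target({ search_embedding_version: 0 })
puts "knn searches #{target[:field]} with #{target[:model]}"
```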
Migrate from one embedding model to another
Add new target field
Add a migration to add the new target field, for example embedding_2.
Migration helper to add new field coming soon.
Update MODELS hash
Add an entry to the collection's MODELS hash for the new target field and model.
MODELS = {
  0 => { field: :embedding_1, model: 'textembedding-gecko@003' },
  1 => { field: :embedding_2, model: 'text-embedding-005' }
}
Set collection indexing_embedding_versions
Add a migration to set indexing_embedding_versions to [0, 1] for the collection. From this point on, embeddings are indexed for both fields.
# frozen_string_literal: true

class SetMergeRequestIndexEmbeddingVersionsTo01 < ActiveContext::Migration[1.0]
  milestone '18.0'

  def migrate!
    update_collection_metadata(collection: collection, metadata: metadata)
  end

  def metadata
    { indexing_embedding_versions: [0, 1] }
  end

  def collection
    Ai::Context::Collections::MergeRequest
  end
end
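The effect of listing two versions can be sketched as a simple lookup from each indexed version to the field it populates. `DualIndexingSketch` and `indexed_fields` are hypothetical names for this illustration only and do not reflect how ActiveContext implements indexing internally.

```ruby
# frozen_string_literal: true

# Hypothetical module; the names below are illustrative, not ActiveContext API.
module DualIndexingSketch
  MODELS = {
    0 => { field: :embedding_1, model: 'textembedding-gecko@003' },
    1 => { field: :embedding_2, model: 'text-embedding-005' }
  }.freeze

  # Lists the fields that would be populated for the given metadata.
  def self.indexed_fields(metadata)
    metadata.fetch(:indexing_embedding_versions, []).map do |version|
      MODELS.fetch(version)[:field]
    end
  end
end

fields = DualIndexingSketch.indexed_fields({ indexing_embedding_versions: [0, 1] })
puts "fields written during the transition: #{fields.join(', ')}"
```

Writing both fields during the transition means searches can keep using the old field until the new one is fully populated.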
Add a migration to backfill target field
Backfill migration helper coming soon.
Set collection search_embedding_version
Add a migration to set search_embedding_version to 1 for the collection. This allows knn searches to run against :embedding_2 using the model in MODELS[1].
# frozen_string_literal: true

class SetMergeRequestSearchEmbeddingVersionTo1 < ActiveContext::Migration[1.0]
  milestone '18.0'

  def migrate!
    update_collection_metadata(collection: collection, metadata: metadata)
  end

  def metadata
    { search_embedding_version: 1 }
  end

  def collection
    Ai::Context::Collections::MergeRequest
  end
end
Searches done using ActiveContext::Query.knn(content: "a question", limit: 5) will now use the model in MODELS[1] and search over embeddings in the :embedding_2 field.
Set collection indexing_embedding_versions
Add a migration to set indexing_embedding_versions to [1] for the collection. This stops indexing embeddings for MODELS[0].
# frozen_string_literal: true

# Named To1 (not To01) so it does not clash with the earlier [0, 1] migration class.
class SetMergeRequestIndexEmbeddingVersionsTo1 < ActiveContext::Migration[1.0]
  milestone '18.0'

  def migrate!
    update_collection_metadata(collection: collection, metadata: metadata)
  end

  def metadata
    { indexing_embedding_versions: [1] }
  end

  def collection
    Ai::Context::Collections::MergeRequest
  end
end
Nullify old target field
Migration helper to nullify field coming soon.
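The metadata changes described in this guide can be summarized as an ordered list of states. The sketch below is a hypothetical recap, not ActiveContext code: `RolloutSketch` and `consistent?` are invented for illustration, the backfill and nullify steps are left out because their helpers are not yet available, and the consistency check is an assumption drawn from the order of the steps rather than a rule the framework is known to enforce.

```ruby
# frozen_string_literal: true

# Hypothetical recap of the collection metadata after each migration.
module RolloutSketch
  STEPS = [
    { name: 'index with model 0', indexing: [0], search: nil },
    { name: 'search with model 0', indexing: [0], search: 0 },
    { name: 'index with models 0 and 1', indexing: [0, 1], search: 0 },
    { name: 'search with model 1', indexing: [0, 1], search: 1 },
    { name: 'stop indexing model 0', indexing: [1], search: 1 }
  ].freeze

  # True when the search version, if set, is still being indexed.
  def self.consistent?(step)
    step[:search].nil? || step[:indexing].include?(step[:search])
  end
end

RolloutSketch::STEPS.each do |step|
  status = RolloutSketch.consistent?(step) ? 'ok' : 'search version not indexed'
  puts "#{step[:name]}: #{status}"
end
```

The ordering matters: dropping version 0 from indexing before switching the search version to 1 would leave searches reading a field that is no longer kept up to date.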