Merge pull request #11957 from nextcloud/docs/ai-updates
docs(AI/LLM2): Update requirements and document model configuration

@@ -18,10 +18,15 @@ Requirements
* This app is built as an External App and thus depends on AppAPI v2.3.0 or higher
* Nextcloud AIO is supported
-* Using GPU is currently not supported
+* We currently support NVIDIA GPUs and x86_64 CPUs
* GPU Sizing

  * An NVIDIA GPU with at least 8GB VRAM
  * At least 12GB of system RAM

* CPU Sizing

  * At least 12GB of system RAM
  * The more cores you have and the more powerful the CPU, the better; we recommend 10-20 cores
  * The app will hog all cores by default, so it is usually better to run it on a separate machine

@@ -42,6 +47,42 @@ This app allows supplying alternate LLM models as *gguf* files in the ``/nc_app_
3. Restart the llm2 ExApp
4. Select the new model in the Nextcloud AI admin settings

Configuring alternate models
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since every model requires slightly different inference parameters, you can supply a configuration file alongside each alternate model file.

The configuration file for a model must have the same name as the model file but must end in ``.json`` instead of ``.gguf``; for example, a hypothetical ``mymodel.gguf`` would be paired with ``mymodel.json``.

The strings ``{system_prompt}`` and ``{user_prompt}`` are variables that will be filled in by the app, so they must be part of your prompt template; a rendered example follows the first configuration below.
Here is an example config file for Llama 2:

.. code-block:: json

   {
       "prompt": "<|im_start|> system\n{system_prompt}\n<|im_end|>\n<|im_start|> user\n{user_prompt}\n<|im_end|>\n<|im_start|> assistant\n",
       "gpt4all_config": {
           "max_tokens": 4096,
           "n_predict": 2048,
           "stop": ["<|im_end|>"]
       }
   }
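
To make the substitution concrete, suppose the system prompt is ``You are a helpful assistant.`` and the user prompt is ``Hello!`` (both hypothetical values). The template above would then expand to the following string, with each ``\n`` becoming a line break::

   <|im_start|> system
   You are a helpful assistant.
   <|im_end|>
   <|im_start|> user
   Hello!
   <|im_end|>
   <|im_start|> assistant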

Here is an example configuration for Llama 3:

.. code-block:: json

   {
       "prompt": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{user_prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
       "gpt4all_config": {
           "max_tokens": 8000,
           "n_predict": 4000,
           "stop": ["<|eot_id|>"]
       }
   }
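
For comparison only, here is a sketch of what a configuration for a model using the Mistral Instruct chat format might look like. The prompt template, token limits, and stop token below are assumptions based on that model family's published format rather than values taken from this documentation, so check the model card of the *gguf* file you actually deploy:

.. code-block:: json

   {
       "prompt": "<s>[INST] {system_prompt}\n\n{user_prompt} [/INST]",
       "gpt4all_config": {
           "max_tokens": 8192,
           "n_predict": 4096,
           "stop": ["</s>"]
       }
   }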

Scaling
-------