* Builtin local LLM provider (llama.cpp) **features**:
  * By default, it automatically detects memory and the GPU and uses the best available computing layer. It allocates GPU layers and the context window size automatically (adopting the largest possible values) to get the best performance from the hardware without any manual configuration.
  * It is nevertheless recommended to configure the context window yourself.
  * System security: supports system-template anti-injection (to prevent jailbreaking).
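As a sketch of what pinning the context window yourself might look like: the `max_tokens` name comes from later in this guide, but the front-matter placement shown here is an assumption, not the engine's documented schema.

```yaml
---
# Hypothetical front-matter sketch: pin the context window rather than
# relying on auto-detection. Key placement is assumed.
parameters:
  max_tokens: 4096
---
```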
* Support for general tool invocation (Tool Funcs) with any LLM (only for the **builtin local LLM provider**):
  * Works without model-specific training, provided the LLM can follow instructions accurately.
  * Minimum adaptation for 3B models; 7B and above recommended.
  * Dual permission control:
    1. Scripts set the list of tools the AI can use.
    2. Users set the list of tools scripts can use.
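A minimal sketch of the script side of this dual control, assuming a hypothetical `tools` list in the script's front matter (the real key names and tool identifiers may differ):

```yaml
---
# Hypothetical sketch: the script declares which tools the AI may call;
# the user separately allow-lists which tools the script may expose.
tools:
  - get_weather
  - search_files
---
```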
* Support for General Thinking Mode (`shouldThink`) with any LLM (only for the **builtin local LLM provider**):
  * Works without model-specific training, provided the LLM can follow instructions accurately.
  * Answer first, then think (`last`).
  * Think first, then answer (`first`).
  * Think deeply, then answer (`deep`): 7B and above.
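For example, a script might request think-first behavior like this; `shouldThink` and its three values come from the list above, but placing it in the script's front matter is an assumption:

```yaml
---
# Assumed placement: ask the model to think before answering.
shouldThink: first   # alternatives: last, deep (deep needs 7B+)
---
```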
* Package support.
* PPE supports direct invocation of wasm.
* Support for multiple structured response output format types (`response_format.type`):
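A hedged sketch of requesting structured output: `response_format.type` is named above, but the `json` value and the front-matter placement are assumptions.

```yaml
---
# Assumed sketch: ask for a structured JSON response.
parameters:
  response_format:
    type: json
---
```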
Developing an intelligent application with AI Agent Script Engine involves just a few steps:
* Select a parameter size based on your application's requirements; larger sizes offer better quality but consume more resources and increase response time...
* Choose the model's expertise: different models are trained with distinct methods and datasets, resulting in unique capabilities...
* Optimize quantization: higher levels of quantization (compression) result in faster speed and smaller size, but potentially lower accuracy...
* Decide on the optimal context window size (`max_tokens`): Typically, 2048 is sufficient; this parameter also influences model performance...
* Use the client (`@offline-ai/cli`) directly to download the AI brain: `ai brain download`
* Create the AI application's agent script file and debug prompts using the client (`@offline-ai/cli`): `ai run your_script.ai.yaml --interactive --loglevel info`.
```yaml
user: |-
  ...
```
This statement represents what the user (the role) says (the message); the message content can use [jinja2](https://wsgzao.github.io/post/jinja/) template syntax.
`|-` is YAML syntax, indicating a multi-line string with line breaks preserved.
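For instance, a user message can interpolate a template variable; the `content` variable name here is purely illustrative:

```yaml
user: |-
  Please summarize the following text:
  {{content}}
```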
Let's give it a try. First, confirm that the background `llama.cpp` brain server is already running: