
Commit a4390a7

committed
docs: change for local provider
1 parent 0984107 commit a4390a7


4 files changed (+134, -134 lines)

README.cn.md

Lines changed: 44 additions & 15 deletions
@@ -26,24 +26,53 @@ AI Agent Script Engine features:
 * [Programmable prompt-engineering fixture unit tests](https://github.com/offline-ai/cli-plugin-cmd-test.js)
 * Fixtures demo: https://github.com/offline-ai/cli/tree/main/examples/split-text-paragraphs
 * Smart caching of LLM and agent invocation results, to speed up runs and reduce token costs
-* Builtin local LLM provider (llama.cpp): no need to install a llama.cpp server separately.
-  * `ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0`
-  * `ai run example.ai.yaml -P local://bartowski-qwq-32b.Q4_0.gguf`
-  * You can specify or freely switch LLM model files in PPE scripts.
+* Multiple LLM service providers supported:
+  * (**Recommended**) The **builtin local LLM provider (llama.cpp)** as the default, to protect the security and privacy of your knowledge.
+    * Download a GGUF model file first: `ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0`
+    * Run with the default brain model file: `ai run example.ai.yaml`
+    * Run with a specified model file: `ai run example.ai.yaml -P local://bartowski-qwq-32b.Q4_0.gguf`
+  * OpenAI-compatible service providers:
+    * OpenAI: `ai run example.ai.yaml -P openai://chatgpt-4o-latest --apiKey "sk-XXX"`
+    * DeepSeek: `ai run example.ai.yaml -P openai://deepseek-chat -u https://api.deepseek.com/ --apiKey "sk-XXX"`
+    * Siliconflow: `ai run example.ai.yaml -P openai://Qwen/Qwen2.5-Coder-7B-Instruct -u https://api.siliconflow.cn/ --apiKey "sk-XXX"`
+    * Anthropic (Claude): `ai run example.ai.yaml -P openai://claude-3-7-sonnet-latest -u https://api.anthropic.com/v1/ --apiKey "sk-XXX"`
+  * [llama-cpp server (llama-server) provider](https://github.com/ggml-org/llama.cpp/tree/master/examples/server): `ai run example.ai.yaml -P llamacpp`
+    * The llama-cpp server does not support specifying a model name; the model is fixed by the `model` argument passed when llama-server is started.
+  * You can specify or freely switch the *LLM model or provider* in a PPE script:
+
+    ```yaml
+    ---
+    parameters:
+      model: openai://deepseek-chat
+      apiUrl: https://api.deepseek.com/
+      apiKey: "sk-XXX"
+    ---
+    system: You are a helpful assistant.
+    user: "tell me a joke"
+    --- # first dialog begins
+    assistant: "[[AI]]"
+    --- # reset to the first dialog
+    assistant: "[[AI:model='openai://claude-3-7-sonnet-latest',apiUrl='https://api.anthropic.com/v1/',apiKey='sk-XXX']]"
+    --- # reset to the first dialog
+    assistant: "[[AI:model='local://bartowski-qwq-32b.Q4_0.gguf']]"
+    ```
+
+* **Builtin local LLM provider (llama.cpp)** features:
 * It automatically detects memory and GPU by default, chooses the best compute backend, and allocates gpu-layers and the context window size (adopting the largest possible values) to get the best performance from the hardware without any manual configuration.
   * Configuring the context window yourself is still recommended.
 * System security: system-template anti-injection (anti-jailbreak) support
-* General tool invocation (Tool Funcs) for LLMs (builtin local LLM provider only)
-  * Works without model-specific training; requires strong instruction-following ability from the model
-  * Works with models as small as 3B; 7B and above recommended
-  * Dual permission control:
-    1. The script sets the list of tools the AI may use
-    2. The user sets the list of tools the script may use
-* General thinking modes (`shouldThink`) for LLMs (builtin local LLM provider only):
-  * Works without model-specific training; requires strong instruction-following ability from the model
-  * Answer first, then think (`last`)
-  * Think first, then answer (`first`)
-  * Think deeply, then answer (`deep`): 7B and above
+  * General tool invocation (Tool Funcs) for any LLM
+    * Works without model-specific training; requires strong instruction-following ability from the model
+    * Works with models as small as 3B; 7B and above recommended
+    * Dual permission control:
+      1. The script sets the list of tools the AI may use
+      2. The user sets the list of tools the script may use
+  * General thinking modes (`shouldThink`) for any LLM
+    * Works without model-specific training; requires strong instruction-following ability from the model
+    * Works with models as small as 3B; 7B and above recommended
+    * Answer first, then think (`last`)
+    * Think first, then answer (`first`)
+    * Think deeply, then answer (`deep`): 7B and above
 * Package support
 * PPE supports calling wasm directly
 * Multiple structured response output format types (`response_format.type`) supported:

README.md

Lines changed: 84 additions & 58 deletions
@@ -26,24 +26,50 @@ Enjoying this project? Please star it! 🌟
 * [PPE Fixtures Unit Test](https://github.com/offline-ai/cli-plugin-cmd-test.js)
 * Unit Test Fixture Demo: https://github.com/offline-ai/cli/tree/main/examples/split-text-paragraphs
 * Smart caching of LLM and intelligent-agent invocation results to accelerate execution and reduce token expenses.
-* Builtin local LLM provider (llama.cpp): no need to install a llama.cpp server separately.
-  * `ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0`
-  * `ai run example.ai.yaml -P local://bartowski-qwq-32b.Q4_0.gguf`
-  * You can specify or arbitrarily switch LLM model files in the PPE script.
+* Support for multiple LLM service providers:
+  * (**Recommended**) Builtin local LLM provider (llama.cpp) as the default, to protect the security and privacy of your knowledge.
+    * Download a GGUF model file first: `ai brain download hf://bartowski/Qwen_QwQ-32B-GGUF -q q4_0`
+    * Run with the default brain model file: `ai run example.ai.yaml`
+    * Run with a specified model file: `ai run example.ai.yaml -P local://bartowski-qwq-32b.Q4_0.gguf`
+  * OpenAI-compatible service providers:
+    * OpenAI: `ai run example.ai.yaml -P openai://chatgpt-4o-latest --apiKey "sk-XXX"`
+    * DeepSeek: `ai run example.ai.yaml -P openai://deepseek-chat -u https://api.deepseek.com/ --apiKey "sk-XXX"`
+    * Siliconflow: `ai run example.ai.yaml -P openai://Qwen/Qwen2.5-Coder-7B-Instruct -u https://api.siliconflow.cn/ --apiKey "sk-XXX"`
+    * Anthropic (Claude): `ai run example.ai.yaml -P openai://claude-3-7-sonnet-latest -u https://api.anthropic.com/v1/ --apiKey "sk-XXX"`
+  * [llama-cpp server (llama-server) provider](https://github.com/ggml-org/llama.cpp/tree/master/examples/server): `ai run example.ai.yaml -P llamacpp`
+    * The llama-cpp server does not support specifying a model name; the model is fixed by the `model` argument passed when llama-server is started.
+  * You can specify or arbitrarily switch the *LLM model or provider* in the PPE script:
+
+    ```yaml
+    ---
+    parameters:
+      model: openai://deepseek-chat
+      apiUrl: https://api.deepseek.com/
+      apiKey: "sk-XXX"
+    ---
+    system: You are a helpful assistant.
+    user: "tell me a joke"
+    ---
+    assistant: "[[AI]]"
+    ---
+    assistant: "[[AI:model='local://bartowski-qwq-32b.Q4_0.gguf']]"
+    ```
+
+* Builtin local LLM provider (llama.cpp) **features**:
 * By default it automatically detects memory and GPU, chooses the best computing backend, and allocates gpu-layers and the context window size (adopting the largest possible values) to get the best performance from the hardware without manually configuring anything.
   * It is still recommended to configure the context window yourself.
 * System security: support for system template anti-injection (to avoid jailbreaking).
-* Support for general tool invocation (Tool Funcs) of large models (builtin local LLM provider only):
-  * Supported without model-specific training, requiring strong instruction-following ability from the model.
-  * Minimum adaptation for 3B models; 7B and above recommended.
-  * Dual permission control:
-    1. Scripts set the list of tools the AI can use.
-    2. Users set the list of tools scripts can use.
-* Support for General Thinking Mode (`shouldThink`) of large models (builtin local LLM provider only):
-  * Supported without model-specific training, requiring strong instruction-following ability from the model.
-  * Answer first, then think (`last`).
-  * Think first, then answer (`first`).
-  * Think deeply, then answer (`deep`): 7B and above.
+  * Support for general tool invocation (Tool Funcs) with any LLM (**builtin local LLM provider** only):
+    * Supported without model-specific training, provided the LLM can accurately follow instructions.
+    * Minimum adaptation for 3B models; 7B and above recommended.
+    * Dual permission control:
+      1. Scripts set the list of tools the AI can use.
+      2. Users set the list of tools scripts can use.
+  * Support for General Thinking Mode (`shouldThink`) with any LLM (**builtin local LLM provider** only):
+    * Supported without model-specific training, provided the LLM can accurately follow instructions.
+    * Minimum adaptation for 3B models; 7B and above recommended.
+    * Answer first, then think (`last`).
+    * Think first, then answer (`first`).
+    * Think deeply, then answer (`deep`): 7B and above.
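A minimal sketch of how a thinking mode might be selected in a PPE script. The `shouldThink` name and the mode values `last`/`first`/`deep` come from this README, but placing the parameter in the front matter is an assumption, not documented syntax:

```yaml
---
parameters:
  shouldThink: deep                            # assumed placement; think deeply, then answer
  model: local://bartowski-qwq-32b.Q4_0.gguf   # 7B+ recommended for `deep` mode
---
system: You are a helpful assistant.
user: "How many r's are in the word strawberry?"
assistant: "[[AI]]"
```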
 * Package support.
 * PPE supports direct invocation of wasm.
 * Support for multiple structured response output format types (`response_format.type`):
@@ -58,7 +84,7 @@ Developing an intelligent application with AI Agent Script Engine involves just
 * Select a parameter size based on your application's requirements; larger sizes offer better quality but consume more resources and increase response time...
 * Choose the model's expertise: different models are trained with distinct methods and datasets, resulting in unique capabilities...
 * Optimize quantization: higher levels of quantization (compression) result in faster speed and smaller size, but potentially lower accuracy...
-* Decide on the optimal context window size (`content_size`): typically, 2048 is sufficient; this parameter also influences model performance...
+* Decide on the optimal context window size (`max_tokens`): typically, 2048 is sufficient; this parameter also influences model performance...
 * Use the client (`@offline-ai/cli`) directly to download the AI brain: `ai brain download`
 * Create the AI application's agent script file and debug prompts using the client (`@offline-ai/cli`): `ai run your_script.ai.yaml --interactive --loglevel info`.
 * Integrate the script into your AI application.
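The choices above map onto a PPE script's front matter. A minimal sketch, reusing the Phi-3 brain file downloaded later in this README; placing `max_tokens` under `parameters` is an assumption based on the other front-matter examples here:

```yaml
---
parameters:
  model: local://phi-3-mini-4k-instruct.Q4_0.gguf  # quantized (Q4_0) brain from `ai brain download`
  max_tokens: 2048                                 # context window; 2048 is typically sufficient
---
system: You are a helpful assistant.
user: "tell me a joke"
assistant: "[[AI]]"
```

Debug it interactively with `ai run your_script.ai.yaml --interactive --loglevel info`, then integrate the script into your application.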
@@ -160,11 +186,6 @@ Downloading https://huggingface.co/QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2/r
 1. https://huggingface.co/QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2/resolve/main/Phi-3-mini-4k-instruct.Q4_0.gguf
 ~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
 done
-mkdir llamacpp
-cd llamacpp
-# go to https://github.com/ggerganov/llama.cpp/releases/latest and download the latest release
-wget https://github.com/ggerganov/llama.cpp/releases/download/b3563/llama-b3563-bin-ubuntu-x64.zip
-unzip llama-b3563-bin-ubuntu-x64.zip
 ```

 Upgrade:
@@ -176,16 +197,7 @@ npm install -g @offline-ai/cli

 ## Run

-First run the llama.cpp server (provider):
-
-```bash
-# run llama.cpp server
-cd llamacpp/build/bin
-# set -ngl 0 if no GPU
-./llama-server -t 4 -c 4096 -ngl 33 -m ~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-```
-
-Now you can run your ai agent script, eg, the `Dobby` character:
+Run your ai agent script, e.g. the `Dobby` character:

 ```bash
 $ ai run --interactive --script examples/char-dobby
@@ -211,7 +223,7 @@ $ npm install -g @offline-ai/cli
 $ ai COMMAND
 running command...
 $ ai (--version)
-@offline-ai/cli/0.10.0 linux-x64 node-v20.18.3
+@offline-ai/cli/0.10.1 linux-x64 node-v20.18.3
 $ ai --help [COMMAND]
 USAGE
   $ ai COMMAND
@@ -389,31 +401,45 @@ Specific script instruction manual see: [Programmable Prompt Engine Specificatio
 # Commands

 <!-- commands -->
-* [`ai agent`](#ai-agent)
-* [`ai autocomplete [SHELL]`](#ai-autocomplete-shell)
-* [`ai brain [NAME]`](#ai-brain-name)
-* [`ai brain dn [NAME]`](#ai-brain-dn-name)
-* [`ai brain down [NAME]`](#ai-brain-down-name)
-* [`ai brain download [NAME]`](#ai-brain-download-name)
-* [`ai brain list [NAME]`](#ai-brain-list-name)
-* [`ai brain refresh`](#ai-brain-refresh)
-* [`ai brain search [NAME]`](#ai-brain-search-name)
-* [`ai config [ITEM_NAME]`](#ai-config-item_name)
-* [`ai config save [DATA]`](#ai-config-save-data)
-* [`ai help [COMMAND]`](#ai-help-command)
-* [`ai plugins`](#ai-plugins)
-* [`ai plugins add PLUGIN`](#ai-plugins-add-plugin)
-* [`ai plugins:inspect PLUGIN...`](#ai-pluginsinspect-plugin)
-* [`ai plugins install PLUGIN`](#ai-plugins-install-plugin)
-* [`ai plugins link PATH`](#ai-plugins-link-path)
-* [`ai plugins remove [PLUGIN]`](#ai-plugins-remove-plugin)
-* [`ai plugins reset`](#ai-plugins-reset)
-* [`ai plugins uninstall [PLUGIN]`](#ai-plugins-uninstall-plugin)
-* [`ai plugins unlink [PLUGIN]`](#ai-plugins-unlink-plugin)
-* [`ai plugins update`](#ai-plugins-update)
-* [`ai run [FILE] [DATA]`](#ai-run-file-data)
-* [`ai test [FILE]`](#ai-test-file)
-* [`ai version`](#ai-version)
+- [Offline AI PPE CLI(WIP)](#offline-ai-ppe-cliwip)
+- [Quick Start](#quick-start)
+- [PPE CLI Command](#ppe-cli-command)
+- [Programmable Prompt Engine Language](#programmable-prompt-engine-language)
+- [I. Core Structure](#i-core-structure)
+- [II. Reusability \& Configuration](#ii-reusability--configuration)
+- [III. AI Capabilities](#iii-ai-capabilities)
+- [IV. Message Text Formatting](#iv-message-text-formatting)
+- [V. Script Capabilities](#v-script-capabilities)
+- [Install](#install)
+- [Run](#run)
+- [Usage](#usage)
+- [Commands](#commands)
+- [`ai agent`](#ai-agent)
+- [`ai autocomplete [SHELL]`](#ai-autocomplete-shell)
+- [`ai brain [NAME]`](#ai-brain-name)
+- [`ai brain dn [NAME]`](#ai-brain-dn-name)
+- [`ai brain down [NAME]`](#ai-brain-down-name)
+- [`ai brain download [NAME]`](#ai-brain-download-name)
+- [`ai brain list [NAME]`](#ai-brain-list-name)
+- [`ai brain refresh`](#ai-brain-refresh)
+- [`ai brain search [NAME]`](#ai-brain-search-name)
+- [`ai config [ITEM_NAME]`](#ai-config-item_name)
+- [`ai config save [DATA]`](#ai-config-save-data)
+- [`ai help [COMMAND]`](#ai-help-command)
+- [`ai plugins`](#ai-plugins)
+- [`ai plugins add PLUGIN`](#ai-plugins-add-plugin)
+- [`ai plugins:inspect PLUGIN...`](#ai-pluginsinspect-plugin)
+- [`ai plugins install PLUGIN`](#ai-plugins-install-plugin)
+- [`ai plugins link PATH`](#ai-plugins-link-path)
+- [`ai plugins remove [PLUGIN]`](#ai-plugins-remove-plugin)
+- [`ai plugins reset`](#ai-plugins-reset)
+- [`ai plugins uninstall [PLUGIN]`](#ai-plugins-uninstall-plugin)
+- [`ai plugins unlink [PLUGIN]`](#ai-plugins-unlink-plugin)
+- [`ai plugins update`](#ai-plugins-update)
+- [`ai run [FILE] [DATA]`](#ai-run-file-data)
+- [`ai test [FILE]`](#ai-test-file)
+- [`ai version`](#ai-version)
+- [Credit](#credit)

 ## `ai agent`
@@ -441,7 +467,7 @@ EXAMPLES
   $ ai agent publish <agent-name>
 ```

-_See code: [src/commands/agent/index.ts](https://github.com/offline-ai/cli/blob/v0.10.0/src/commands/agent/index.ts)_
+_See code: [src/commands/agent/index.ts](https://github.com/offline-ai/cli/blob/v0.10.1/src/commands/agent/index.ts)_

 ## `ai autocomplete [SHELL]`
guide-cn.md

Lines changed: 4 additions & 32 deletions
@@ -198,17 +198,7 @@ user: |-
 This statement is a message said by the user role; the message content can use [jinja2](https://wsgzao.github.io/post/jinja/) template syntax.
 `|-` is YAML syntax: a multi-line string that preserves line breaks as-is.

-
-Let's try it out. First, confirm that the `llama.cpp` server is running in the background:
-
-```bash
-# run llama.cpp server
-cd llamacpp/build/bin
-# set -ngl 0 if no GPU
-./server -t 4 -c 4096 -ngl 33 -m ~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-```
-
-Once confirmed, try translating a piece of text into Portuguese:
+Let's try it out. Now try translating a piece of text into Portuguese:

 ```bash
 ai run -f translator-simple.ai.yaml "{ \
@@ -279,18 +269,8 @@ After all this talk: how do you install it? See below:
 ### Install

 ```bash
+# install
 npm install -g @offline-ai/cli
-ai brain download QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2 -q Q4_0
-Downloading to ~/.local/share/ai/brain
-Downloading https://huggingface.co/QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2/resolve/main/Phi-3-mini-4k-instruct.Q4_0.gguf... 5.61% 121977704 bytes
-1. https://hf-mirror.com/QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2/resolve/main/Phi-3-mini-4k-instruct.Q4_0.gguf
-~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-done
-mkdir llamacpp
-cd llamacpp
-# Ubuntu x64, for example
-wget https://github.com/ggerganov/llama.cpp/releases/download/b3091/llama-b3091-bin-ubuntu-x64.zip
-unzip llama-b3091-bin-ubuntu-x64.zip
 ```

 ### Download the Brain 🧠
@@ -306,16 +286,8 @@ done

 ### Run

-First, you need to run the llama.cpp server:
-
-```bash
-# run llama.cpp server
-cd llamacpp/build/bin
-# set -ngl 0 if no GPU
-./llama-server -t 4 -c 4096 -ngl 33 -m ~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-```
-
-Now you can run the agent script:
+Now, open a terminal and you can run the agent script.
+The first run will ask you to set a default brain.

 ```bash
 # -i `--interactive`: run interactively

guide.md

Lines changed: 2 additions & 29 deletions
@@ -196,16 +196,9 @@ user: |-
 This statement represents what the user (role) says (message), and the message content can use [jinja2](https://wsgzao.github.io/post/jinja/) template syntax.
 `|-` is YAML syntax, indicating a multi-line string with line breaks preserved.

-Let's give it a try. First, confirm that the background `llama.cpp` brain server is already running:
+Let's give it a try.

-```bash
-# run llama.cpp server
-cd llamacpp/build/bin
-# set -ngl 0 if no GPU
-./server -t 4 -c 4096 -ngl 33 -m ~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-```
-
-Confirmed. Now, let's try translating a piece of text into Portuguese:
+Now, let's try translating a piece of text into Portuguese:

 ```bash
 ai run -f translator-simple.ai.yaml "{ \
@@ -273,17 +266,6 @@ Alright, the agent script has successfully returned a JSON result. How to automa

 ```bash
 npm install -g @offline-ai/cli
-ai brain download QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2 -q Q4_0
-Downloading to ~/.local/share/ai/brain
-Downloading https://huggingface.co/QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2/resolve/main/Phi-3-mini-4k-instruct.Q4_0.gguf... 5.61% 121977704 bytes
-1. https://hf-mirror.com/QuantFactory/Phi-3-mini-4k-instruct-GGUF-v2/resolve/main/Phi-3-mini-4k-instruct.Q4_0.gguf
-~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-done
-mkdir llamacpp
-cd llamacpp
-# Example for Ubuntu x64 system
-wget https://github.com/ggerganov/llama.cpp/releases/download/b3091/llama-b3091-bin-ubuntu-x64.zip
-unzip llama-b3091-bin-ubuntu-x64.zip
 ```

 ### Download Brain(LLM) File 🧠
@@ -299,15 +281,6 @@ done

 ### Run

-First, you need to run the llama.cpp brain (LLM) server in the background:
-
-```bash
-# run llama.cpp server
-cd llamacpp/build/bin
-# set -ngl 0 if no GPU
-./llama-server -t 4 -c 4096 -ngl 33 -m ~/.local/share/ai/brain/phi-3-mini-4k-instruct.Q4_0.gguf
-```
-
 Now, you can run the agent script:

 ```bash
