Commit cb61653

Merge pull request #42 from shalousun/master
feat: Add documentation for the model-config parameter
2 parents 8dd73ed + 72b3fdb commit cb61653

4 files changed

Lines changed: 108 additions & 104 deletions

crates/cli/README.md

Lines changed: 28 additions & 27 deletions
@@ -11,38 +11,39 @@ cargo run -p deepseek-ocr-cli --release -- \
 
 ### Arguments
 
-| Flag | Default | Description |
-| --- | --- | --- |
-| `--prompt` || Inline text with `<image>` markers. |
-| `--prompt-file` || UTF-8 file containing the prompt; overrides `--prompt`. |
-| `--template` | `plain` | Conversation template (`plain`, `deepseek`, `deepseekv2`, `alignment`). |
-| `--image PATH` || Image path for each `<image>` token, specified in order. Repeat the flag for multiple images. |
-| `--tokenizer PATH` | assets default | Override tokenizer location; downloaded automatically when omitted. |
-| `--weights PATH` | auto-detected | Use custom model weights instead of the default safetensor. |
-| `--device` | `cpu` | Execution backend: `cpu`, `metal`, or `cuda` (alpha). |
-| `--dtype` | backend default | Override numeric precision (`f32`, `f16`, `bf16`, …). |
-| `--base-size` | `1024` | Global view resolution supplied to the vision stack. |
-| `--image-size` | `640` | Local crop resolution when dynamic tiling is enabled. |
-| `--crop-mode` | `true` | Toggle dynamic crop sampling (`false` to disable). |
-| `--max-new-tokens` | `512` | Maximum number of tokens generated during decoding. |
-| `--no-cache` | `false` | Disable the decoder KV-cache. Helpful for debugging only. |
-| `--do-sample` | `false` | Enable sampling (requires `--temperature > 0`). |
-| `--temperature` | `0.0` | Softmax temperature used when sampling is enabled. |
-| `--top-p` | `1.0` | Nucleus sampling mass; ignored unless sampling. |
-| `--top-k` || Top-k cutoff during sampling. |
-| `--repetition-penalty` | `1.0` | Penalise previously generated tokens (>1 discourages repeats). |
-| `--no-repeat-ngram-size` | `20` | N-gram blocking window applied to every decode step. |
-| `--seed` || RNG seed for reproducible sampling runs. |
+| Flag | Default | Description |
+|--------------------------|-----------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `--prompt` || Inline text with `<image>` markers. |
+| `--prompt-file` || UTF-8 file containing the prompt; overrides `--prompt`. |
+| `--template` | `plain` | Conversation template (`plain`, `deepseek`, `deepseekv2`, `alignment`). |
+| `--image PATH` || Image path for each `<image>` token, specified in order. Repeat the flag for multiple images. |
+| `--tokenizer PATH` | assets default | Override tokenizer location; downloaded automatically when omitted. |
+| `--weights PATH` | auto-detected | Use custom model weights instead of the default safetensor. |
+| `--model-config PATH` | auto-detected | Specify the model architecture configuration file, overriding the default config.json. In offline environments, point to a pre-downloaded config file path. |
+| `--device` | `cpu` | Execution backend: `cpu`, `metal`, or `cuda` (alpha). |
+| `--dtype` | backend default | Override numeric precision (`f32`, `f16`, `bf16`, …). |
+| `--base-size` | `1024` | Global view resolution supplied to the vision stack. |
+| `--image-size` | `640` | Local crop resolution when dynamic tiling is enabled. |
+| `--crop-mode` | `true` | Toggle dynamic crop sampling (`false` to disable). |
+| `--max-new-tokens` | `512` | Maximum number of tokens generated during decoding. |
+| `--no-cache` | `false` | Disable the decoder KV-cache. Helpful for debugging only. |
+| `--do-sample` | `false` | Enable sampling (requires `--temperature > 0`). |
+| `--temperature` | `0.0` | Softmax temperature used when sampling is enabled. |
+| `--top-p` | `1.0` | Nucleus sampling mass; ignored unless sampling. |
+| `--top-k` || Top-k cutoff during sampling. |
+| `--repetition-penalty` | `1.0` | Penalise previously generated tokens (>1 discourages repeats). |
+| `--no-repeat-ngram-size` | `20` | N-gram blocking window applied to every decode step. |
+| `--seed` || RNG seed for reproducible sampling runs. |
 
 > **Heads-up:** If the final markdown appears truncated, increase `--max-new-tokens`. The model stops once it has emitted the configured number of tokens even if the prompt is unfinished.
 
 ### Configuration & Overrides
 
-| Platform | Config path | Weights cache path |
-| --- | --- | --- |
-| Linux | `~/.config/deepseek-ocr/config.toml` | `~/.cache/deepseek-ocr/models/<id>/model.safetensors` |
-| macOS | `~/Library/Application Support/deepseek-ocr/config.toml` | `~/Library/Caches/deepseek-ocr/models/<id>/model.safetensors` |
-| Windows | `%APPDATA%\deepseek-ocr\config.toml` | `%LOCALAPPDATA%\deepseek-ocr\models\<id>\model.safetensors` |
+| Platform | Config path | Weights cache path |
+|----------|----------------------------------------------------------|---------------------------------------------------------------|
+| Linux | `~/.config/deepseek-ocr/config.toml` | `~/.cache/deepseek-ocr/models/<id>/model.safetensors` |
+| macOS | `~/Library/Application Support/deepseek-ocr/config.toml` | `~/Library/Caches/deepseek-ocr/models/<id>/model.safetensors` |
+| Windows | `%APPDATA%\deepseek-ocr\config.toml` | `%LOCALAPPDATA%\deepseek-ocr\models\<id>\model.safetensors` |
 
 - Pass `--config /path/to/config.toml` to read or bootstrap an alternate file (created with defaults if missing).
 - Runtime values resolve in this order: CLI flags → values in `config.toml` → baked-in defaults. Asset paths behave the same way: explicit flags beat config entries which beat the cache locations listed above.
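
Editor's note: the resolution order stated in the README's last bullet (CLI flags, then `config.toml`, then baked-in defaults) can be sketched in Rust as below. This is only an illustration under assumptions, not the CLI's actual code: the helper names `resolve` and `resolve_path` are hypothetical, and the sketch assumes each source yields an `Option` that is `None` when the value is not set.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not part of deepseek-ocr-cli): return the first value
/// that is set, checking the CLI flag, then the config.toml entry, then the
/// baked-in default.
pub fn resolve<T>(cli: Option<T>, config: Option<T>, default: T) -> T {
    cli.or(config).unwrap_or(default)
}

/// Same precedence for asset paths such as weights or the tokenizer.
/// The cache location is passed as a closure so it is only computed
/// when neither the flag nor the config entry is present.
pub fn resolve_path(
    cli: Option<PathBuf>,
    config: Option<PathBuf>,
    cache_default: impl FnOnce() -> PathBuf,
) -> PathBuf {
    cli.or(config).unwrap_or_else(cache_default)
}
```

For example, under these assumptions `resolve(None, Some(1024), 512)` picks the config value, while `resolve(None, None, 512)` falls back to the documented `--max-new-tokens` default.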

crates/cli/README_CN.md

Lines changed: 28 additions & 27 deletions
@@ -11,38 +11,39 @@ cargo run -p deepseek-ocr-cli --release -- \
 
 ## Arguments
 
-| Parameter | Default | Description |
-| --- | --- | --- |
-| `--prompt` || Inline text prompt; use `<image>` to mark image positions. |
-| `--prompt-file` || UTF-8 file containing the prompt; overrides `--prompt` when provided. |
-| `--template` | `plain` | Conversation template; one of `plain`, `deepseek`, `deepseekv2`, `alignment`. |
-| `--image PATH` || Image path matching each `<image>`; repeat the flag in order of appearance. |
-| `--tokenizer PATH` | assets default path | Custom tokenizer path; downloaded and cached automatically by default. |
-| `--weights PATH` | auto-detected | Model weights file, overriding the default safetensor. |
-| `--device` | `cpu` | Execution backend: `cpu`, `metal`, `cuda` (testing stage). |
-| `--dtype` | depends on backend | Numeric precision override, e.g. `f32`, `f16`, `bf16`. |
-| `--base-size` | `1024` | Global view resolution passed to the vision module. |
-| `--image-size` | `640` | Local resolution when dynamic cropping is enabled. |
-| `--crop-mode` | `true` | Whether to enable dynamic cropping (pass `false` to disable). |
-| `--max-new-tokens` | `512` | Maximum number of tokens output during decoding. |
-| `--no-cache` | `false` | Disable the decoder KV cache; use only for debugging. |
-| `--do-sample` | `false` | Whether to enable sampling (requires `--temperature > 0`). |
-| `--temperature` | `0.0` | Sampling temperature; higher is more random. |
-| `--top-p` | `1.0` | Nucleus sampling cumulative probability; effective when sampling. |
-| `--top-k` || Top-k cutoff, used with sampling. |
-| `--repetition-penalty` | `1.0` | Repetition penalty coefficient (>1 lowers the chance of repeats). |
-| `--no-repeat-ngram-size` | `20` | N-gram blocking window; always active during generation. |
-| `--seed` || Random seed for reproducible sampling results. |
+| Parameter | Default | Description |
+|--------------------------|---------|------------------------------------------------------|
+| `--prompt` || Inline text prompt; use `<image>` to mark image positions. |
+| `--prompt-file` || UTF-8 file containing the prompt; overrides `--prompt` when provided. |
+| `--template` | `plain` | Conversation template; one of `plain`, `deepseek`, `deepseekv2`, `alignment`. |
+| `--image PATH` || Image path matching each `<image>`; repeat the flag in order of appearance. |
+| `--tokenizer PATH` | assets default path | Custom tokenizer path; downloaded and cached automatically by default. |
+| `--weights PATH` | auto-detected | Model weights file, overriding the default safetensor. |
+| `--model-config PATH` | auto-detected | Model architecture configuration file, overriding the default config.json. In offline environments, point it to a downloaded config path |
+| `--device` | `cpu` | Execution backend: `cpu`, `metal`, `cuda` (testing stage). |
+| `--dtype` | depends on backend | Numeric precision override, e.g. `f32`, `f16`, `bf16`. |
+| `--base-size` | `1024` | Global view resolution passed to the vision module. |
+| `--image-size` | `640` | Local resolution when dynamic cropping is enabled. |
+| `--crop-mode` | `true` | Whether to enable dynamic cropping (pass `false` to disable). |
+| `--max-new-tokens` | `512` | Maximum number of tokens output during decoding. |
+| `--no-cache` | `false` | Disable the decoder KV cache; use only for debugging. |
+| `--do-sample` | `false` | Whether to enable sampling (requires `--temperature > 0`). |
+| `--temperature` | `0.0` | Sampling temperature; higher is more random. |
+| `--top-p` | `1.0` | Nucleus sampling cumulative probability; effective when sampling. |
+| `--top-k` || Top-k cutoff, used with sampling. |
+| `--repetition-penalty` | `1.0` | Repetition penalty coefficient (>1 lowers the chance of repeats). |
+| `--no-repeat-ngram-size` | `20` | N-gram blocking window; always active during generation. |
+| `--seed` || Random seed for reproducible sampling results. |
 
 > **Important:** If the generated Markdown is cut off early, increase `--max-new-tokens`. The model stops immediately once it reaches this limit, even if the answer is not finished.
 
 ### Configuration & Overrides
 
-| Platform | Config file path | Weights cache path |
-| --- | --- | --- |
-| Linux | `~/.config/deepseek-ocr/config.toml` | `~/.cache/deepseek-ocr/models/<id>/model.safetensors` |
-| macOS | `~/Library/Application Support/deepseek-ocr/config.toml` | `~/Library/Caches/deepseek-ocr/models/<id>/model.safetensors` |
-| Windows | `%APPDATA%\deepseek-ocr\config.toml` | `%LOCALAPPDATA%\deepseek-ocr\models\<id>\model.safetensors` |
+| Platform | Config file path | Weights cache path |
+|---------|----------------------------------------------------------|---------------------------------------------------------------|
+| Linux | `~/.config/deepseek-ocr/config.toml` | `~/.cache/deepseek-ocr/models/<id>/model.safetensors` |
+| macOS | `~/Library/Application Support/deepseek-ocr/config.toml` | `~/Library/Caches/deepseek-ocr/models/<id>/model.safetensors` |
+| Windows | `%APPDATA%\deepseek-ocr\config.toml` | `%LOCALAPPDATA%\deepseek-ocr\models\<id>\model.safetensors` |
 
 - Pass `--config /path/to/config.toml` to switch to or initialize a custom path; if the file does not exist, it is created and filled with default values.
 - Parameters take effect in this order: command-line arguments → `config.toml` → built-in defaults. Asset paths follow the same order: an explicit `--weights`/`--tokenizer` overrides the config file, and if neither is specified the cache directories listed in the table above are used.
