mirror of
https://www.modelscope.cn/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.git
synced 2026-04-02 21:52:53 +08:00
Small fix
This commit is contained in:
@ -118,7 +118,7 @@ Compared to previous versions of DeepSeek-R1, the usage recommendations for Deep
|
||||
1. System prompt is supported now.
|
||||
2. It is not required to add "\<think\>\n" at the beginning of the output to force the model into thinking pattern.
|
||||
|
||||
The model architecture of DeepSeek-R1-0528-Qwen3-8B is identical to that of Qwen3-8B, but it shares the same tokenizer configuration as DeepSeek-R1-0528. This model can be run in the same manner as Qwen3-8B.
|
||||
The model architecture of DeepSeek-R1-0528-Qwen3-8B is identical to that of Qwen3-8B, but it shares the same tokenizer configuration as DeepSeek-R1-0528. This model can be run in the same manner as Qwen3-8B, but it is essential to ensure that all configuration files are sourced from our repository rather than the original Qwen3 project.
|
||||
|
||||
### System Prompt
|
||||
In the official DeepSeek web/app, we use the same system prompt with a specific date.
|
||||
|
||||
Reference in New Issue
Block a user