diff --git a/README.md b/README.md
index a2ab367..ff36fe8 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,6 @@ license: apache-2.0
pipeline_tag: image-text-to-text
tags:
- ERNIE4.5
-- PaddleOCR
- PaddlePaddle
- image-to-text
- ocr
@@ -17,7 +16,6 @@ language:
- en
- zh
- multilingual
-library_name: PaddleOCR
---
@@ -39,7 +37,7 @@ PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vi
[](./LICENSE)
**🔥 Official Demo**: [Baidu AI Studio](https://aistudio.baidu.com/application/detail/98365) |
-**📝 Blog**: [Technical Report](https://ernie.baidu.com/blog/publication/PaddleOCR-VL_Technical_Report.pdf)
+**📝 arXiv**: [Technical Report](https://arxiv.org/pdf/2510.14528)
@@ -84,14 +82,18 @@ Install [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick) and [Paddl
```bash
python -m pip install paddlepaddle-gpu==3.2.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
python -m pip install -U "paddleocr[doc-parser]"
+python -m pip install https://paddle-whl.bj.bcebos.com/nightly/cu126/safetensors/safetensors-0.6.2.dev0-cp38-abi3-linux_x86_64.whl
```
+> For Windows users, please use WSL or a Docker container.
+
+
### Basic Usage
CLI usage:
```bash
-paddleocr doc_parser -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_ocr_vl_demo.png
+paddleocr doc_parser -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png
```
Python API usage:
@@ -100,7 +102,7 @@ Python API usage:
from paddleocr import PaddleOCRVL
pipeline = PaddleOCRVL()
-output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_ocr_vl_demo.png")
+output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png")
for res in output:
res.print()
res.save_to_json(save_path="output")
@@ -117,23 +119,22 @@ for res in output:
--gpus all \
--network host \
ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddlex-genai-vllm-server
- # You can also use ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddlex-genai-vllm-server for the SGLang server
```
2. Call the PaddleOCR CLI or Python API:
```bash
paddleocr doc_parser \
- -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_ocr_vl_demo.png \
+ -i https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png \
--vl_rec_backend vllm-server \
- --vl_rec_server_url http://127.0.0.1:8080
+ --vl_rec_server_url http://127.0.0.1:8080/v1
```
```python
from paddleocr import PaddleOCRVL
- pipeline = PaddleOCRVL(vl_rec_backend="vllm-server", vl_rec_server_url="http://127.0.0.1:8080")
- output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/pp_ocr_vl_demo.png")
+ pipeline = PaddleOCRVL(vl_rec_backend="vllm-server", vl_rec_server_url="http://127.0.0.1:8080/v1")
+ output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png")
for res in output:
res.print()
res.save_to_json(save_path="output")
@@ -246,31 +247,32 @@ The evaluation set is broadly categorized into 11 chart categories, including ba
### Text
### Table
### Formula
+
### Chart
@@ -290,5 +292,4 @@ If you find PaddleOCR-VL helpful, feel free to give us a star and citation.
primaryClass={cs.CL},
howpublished={\url{https://ernie.baidu.com/blog/publication/PaddleOCR-VL_Technical_Report.pdf}}
}
-```
-
+```
\ No newline at end of file
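
Note on the `vl_rec_server_url` change in this diff: the server URL moves from `http://127.0.0.1:8080` to `http://127.0.0.1:8080/v1`. For callers that still hold the old form, a minimal sketch of a normalizing helper could look like the following. The function name `with_v1_suffix` is hypothetical and not part of PaddleOCR; it assumes only that the expected URL ends in a `/v1` path, as the updated examples show.

```python
from urllib.parse import urlsplit, urlunsplit


def with_v1_suffix(server_url: str) -> str:
    """Return server_url with a trailing '/v1' path, adding it only if missing.

    Hypothetical helper for illustration; PaddleOCR does not ship this function.
    """
    parts = urlsplit(server_url)
    # Drop any trailing slash so "/v1/" and "/v1" are treated the same.
    path = parts.path.rstrip("/")
    if not path.endswith("/v1"):
        path += "/v1"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Old-style URL from the removed lines of the diff.
    print(with_v1_suffix("http://127.0.0.1:8080"))
    # Already-correct URL is left unchanged apart from the trailing slash.
    print(with_v1_suffix("http://127.0.0.1:8080/v1/"))
```

The result can then be passed as `vl_rec_server_url` exactly as in the updated Python example, so old and new URL forms resolve to the same endpoint.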