Releases: PaddlePaddle/PaddleOCR
v3.0.3
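- Bug fixes:
  - Fixed the `enable_mkldnn` parameter not taking effect, restoring the default behavior of using MKL-DNN for CPU inference.
  - Included the other fixes that ship with PaddleX 3.0.3.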
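As a rough illustration of the `enable_mkldnn` switch mentioned above, the sketch below builds constructor keywords for a CPU pipeline. It assumes that `paddleocr.PaddleOCR` accepts `device` and `enable_mkldnn` as keyword arguments; the helper names `cpu_ocr_kwargs` and `create_cpu_ocr` are hypothetical and not part of PaddleOCR.

```python
from typing import Any, Dict


def cpu_ocr_kwargs(enable_mkldnn: bool = True) -> Dict[str, Any]:
    """Build constructor keywords for CPU inference (hypothetical helper).

    Assumption: the installed PaddleOCR 3.x accepts `device` and `enable_mkldnn`.
    """
    return {"device": "cpu", "enable_mkldnn": enable_mkldnn}


def create_cpu_ocr(enable_mkldnn: bool = True):
    """Instantiate the pipeline; requires the paddleocr package."""
    # Imported lazily so that cpu_ocr_kwargs has no third-party dependency.
    from paddleocr import PaddleOCR

    return PaddleOCR(**cpu_ocr_kwargs(enable_mkldnn))
```

Calling `create_cpu_ocr(enable_mkldnn=False)` would then be one way to opt out of MKL-DNN explicitly, if the keyword behaves as assumed.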
v3.0.2
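- New features:
  - The default model download source changed from BOS to HuggingFace. Setting the environment variable `PADDLE_PDX_MODEL_SOURCE` to `BOS` switches the download source back to Baidu Object Storage (BOS).
  - Added service invocation examples in six languages (C++, Java, Go, C#, Node.js, and PHP) for pipelines such as PP-OCRv5, PP-StructureV3, and PP-ChatOCRv4.
  - Improved the layout region sorting algorithm in the PP-StructureV3 pipeline and refined the ordering logic for complex vertical layouts, further improving results on complex layouts.
  - Improved model selection: when a language is specified but no model version is given, the latest model version that supports that language is selected automatically.
  - Set a default upper bound on the MKL-DNN cache size to prevent unbounded growth; the cache capacity is also user-configurable.
  - Updated the default high-performance inference configuration to support Paddle MKL-DNN acceleration, and made the automatic configuration logic choose settings more intelligently.
  - Adjusted the default device selection logic to account for which compute devices the installed Paddle framework actually supports, so that program behavior is more intuitive.
  - Added an Android example for PP-OCRv5 (details).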
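- Bug fixes:
  - Fixed some PP-StructureV3 CLI parameters not taking effect.
  - Fixed `export_paddlex_config_to_yaml` not working properly in some cases.
  - Fixed the actual behavior of `save_path` not matching the documentation.
  - Fixed multithreading errors that could occur when basic serving deployment uses MKL-DNN.
  - Fixed the wrong channel order in image preprocessing for the Latex-OCR model.
  - Fixed the wrong channel order when the text recognition module saves visualization images.
  - Fixed the wrong channel order in table visualization results in PP-StructureV3.
  - Fixed a variable overflow when computing `overlap_ratio` in very rare cases in the PP-StructureV3 pipeline.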
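- Documentation improvements:
  - Updated the description of the `enable_mkldnn` parameter so that it more accurately reflects the program's actual behavior.
  - Fixed errors in the descriptions of the `lang` and `ocr_version` parameters.
  - Added instructions for exporting pipeline configuration files via the CLI.
  - Fixed missing columns in the PP-OCRv5 performance data table.
  - Polished the benchmark metrics for PP-StructureV3 under different configurations.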
- Other:
  - Relaxed version constraints on dependencies such as numpy and pandas, restoring support for Python 3.12.
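The download-source switch described in this release can be sketched as follows. The variable name `PADDLE_PDX_MODEL_SOURCE` and the value `BOS` come from the notes above; setting the variable before importing `paddleocr` is a cautious assumption about when it is read, and the helpers `use_bos_model_source` and `create_ocr` are hypothetical.

```python
import os

# Variable name and the "BOS" value are taken from the release notes.
MODEL_SOURCE_ENV = "PADDLE_PDX_MODEL_SOURCE"


def use_bos_model_source() -> str:
    """Select BOS as the download source for this process and return the value set."""
    os.environ[MODEL_SOURCE_ENV] = "BOS"
    return os.environ[MODEL_SOURCE_ENV]


def create_ocr():
    """Hypothetical helper: set the source first, then import and build the pipeline."""
    use_bos_model_source()
    # Requires the paddleocr package; imported only after the variable is set.
    from paddleocr import PaddleOCR

    return PaddleOCR()
```

Exporting the variable in the shell before launching the program should have the same effect, since child processes inherit the environment.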
v3.0.1
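- Improved some models and model configurations:
  - Updated the default PP-OCRv5 model configuration: both detection and recognition now use the server models instead of the mobile models. To improve default results in most scenarios, the `limit_side_len` parameter in the configuration changed from 736 to 64.
  - Added the text line orientation classification model `PP-LCNet_x1_0_textline_ori` (accuracy 99.42%). It is now the default text line orientation classifier for the OCR, PP-StructureV3, and PP-ChatOCRv4 pipelines.
  - Improved the text line orientation classification model `PP-LCNet_x0_25_textline_ori`: accuracy rose by 3.3 percentage points to 98.85%.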
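- Improvements addressing some issues in 3.0.0:
  - Better CLI experience: running the PaddleOCR CLI without any arguments now prints a usage hint.
  - New parameter: PP-ChatOCRv3 and PP-StructureV3 support the `use_textline_orientation` parameter.
  - Faster CPU inference: MKL-DNN is enabled by default for CPU inference in all pipelines.
  - C++ inference support: the chained detection and recognition stages of PP-OCRv5 support C++ inference.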
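- Fixes for some issues in 3.0.0:
  - Fixed PP-StructureV3 failing during inference on some CPUs because the formula recognition and table recognition models cannot use MKL-DNN.
  - Fixed the `FatalError: Process abort signal is detected by the operating system` error during inference in some GPU environments.
  - Fixed type hint problems in some Python 3.8 environments.
  - Fixed the missing `PPStructureV3.concatenate_markdown_pages` method.
  - Fixed `model_name` being ignored when both `lang` and `model_name` are specified while instantiating `paddleocr.PaddleOCR`.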
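Because this release switches the PP-OCRv5 defaults from the mobile to the server models, a caller who prefers the lighter models may want to pin them explicitly. The sketch below shows one way to do that. The keyword names `text_detection_model_name` and `text_recognition_model_name`, the model names `PP-OCRv5_mobile_det` and `PP-OCRv5_mobile_rec`, and the availability of `use_textline_orientation` on `PaddleOCR` are assumptions to check against the installed version; the helper names are hypothetical.

```python
from typing import Any, Dict


def mobile_ocr_kwargs(use_textline_orientation: bool = True) -> Dict[str, Any]:
    """Keywords that pin the PP-OCRv5 mobile models (hypothetical helper).

    Assumption: keyword and model names follow PaddleOCR 3.x conventions.
    """
    return {
        "text_detection_model_name": "PP-OCRv5_mobile_det",
        "text_recognition_model_name": "PP-OCRv5_mobile_rec",
        "use_textline_orientation": use_textline_orientation,
    }


def create_mobile_ocr():
    """Instantiate the pipeline with the mobile models; requires paddleocr."""
    # Imported lazily so that mobile_ocr_kwargs has no third-party dependency.
    from paddleocr import PaddleOCR

    return PaddleOCR(**mobile_ocr_kwargs())
```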
v3.0.0
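- Released PP-OCRv5, an all-scenario text recognition model: a single model supports five text types and complex handwriting recognition; overall recognition accuracy is 13 percentage points higher than the previous generation.
- Released PP-StructureV3, a general document parsing solution: it supports high-accuracy parsing of PDFs across many scenarios and layouts, and leads many open-source and closed-source solutions on public benchmarks.
- Released PP-ChatOCRv4, an intelligent document understanding solution: it natively supports ERNIE 4.5 Turbo, with accuracy 15 percentage points higher than the previous generation.
- Rebuilt deployment capabilities and unified the inference interface: PaddleOCR 3.0 integrates the underlying capabilities of PaddleX 3.0, comprehensively upgrades the inference and deployment modules, improves on the 2.x design, and unifies and improves the Python API and command-line interface (CLI). Deployment now covers three scenarios: high-performance inference, serving deployment, and on-device deployment.
- Adapted to PaddlePaddle 3.0 and improved the training workflow: the new version is compatible with the latest PaddlePaddle 3.0 features such as the CINN compiler, and static graph model files are now named `xxx.json` instead of `xxx.pdmodel`.
- Unified model names: the naming scheme for models supported by PaddleOCR 3.0 was updated to follow more standardized, consistent rules, laying the groundwork for future iteration and maintenance.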
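Tooling that has to handle both old and new exports may need to cope with the `xxx.pdmodel` to `xxx.json` rename noted above. A minimal sketch, assuming the file stem is `inference` (an illustrative default, not stated in these notes) and using a hypothetical helper name:

```python
from pathlib import Path
from typing import Optional


def find_static_graph_file(model_dir: str, stem: str = "inference") -> Optional[Path]:
    """Return the static-graph model file in `model_dir`, or None if absent.

    The newer `.json` name is preferred; the older `.pdmodel` name is a fallback.
    `stem="inference"` is an assumed default file stem.
    """
    base = Path(model_dir)
    for suffix in (".json", ".pdmodel"):
        candidate = base / f"{stem}{suffix}"
        if candidate.is_file():
            return candidate
    return None
```

Checking `.json` first means a directory that contains both files resolves to the newer format.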
v2.10.0
What's Changed
- update docs by @cuicheng01 in #14031
- update paddle2onnx doc by @inisis in #14038
- fix gpu memory growth by @zhangyubo0722 in #14037
- updata en docs by @dyning in #14036
- fix nan in PP-OCRv4 by @wangna11BD in #14043
- update a live promotion by @Zhiiixin in #14042
- reset latex ocr by @zhangyubo0722 in #14046
- Update pyproject.toml for add dependency by @Liyulingyue in #14058
- Fix `CMAKE_CXX_FLAGS` optimize flag by @Hirozy in #14059
- fix isnan_v2 is not supported in paddle2onnx by @GreatV in #14060
- ci: Fixed docs multi version error by @SWHL in #14048
- fix hyperlinks by @AmberC0209 in #14073
- fix nan in ppocrv4 for benchmark by @wangna11BD in #14072
- ci: Support seperate update of branch docs by @SWHL in #14079
- ci: fixed main doc ci by @SWHL in #14084
- Allow `create_predictor` function to accept array of ONNX Execution Providers by @Salmondx in #14078
- docs: update quickstart by @SWHL in #14108
- docs: add command line usage documentation of quickstart page by @SWHL in #14110
- docs: add installation documentation of paddle by @SWHL in #14117
- docs: fixed typo by @SWHL in #14118
- image without any text will show a warning by @GreatV in #14132
- doc: remove duplicate paragraphs by @GreatV in #14133
- docs: update paddle2onnx documentations by @GreatV in #14144
- [third-party] Fix the issue of inference errors with KIE mode in ONNX format by @Alex37882388 in #14138
- update tests PR CI github action by @GreatV in #14159
- Remove documents under the doc directory, keeping only the fonts and doc_i18n directories by @SWHL in #14156
- Remove old documents under the ppstructure directory by @SWHL in #14161
- docs: fixed error image link (#14164) by @SWHL in #14165
- Update the i18n home page content to the new site by @SWHL in #14166
- docs: fix i18n languange code error by @SWHL in #14167
- docs: fix syntax error by @SWHL in #14168
- docs: update i18n docs by @GreatV in #14169
- upgrade to numpy 2.0 and remove imgaug by @GreatV in #13937
- docs: format multi languange docs home page by @SWHL in #14170
- docs: add the missing image by @GreatV in #14180
- Create close_inactive_issues.yaml by @GreatV in #14183
- update hpi config by @zhangyubo0722 in #14076
- Update close_inactive_issues.yaml by @GreatV in #14189
- Update close_inactive_issues.yaml by @GreatV in #14190
- remove lock inactive issues by @GreatV in #14192
- fix benchmark bug by @changdazhou in #14194
- pre-commit autoupdate && pre-commit run --all-files by @cclauss in #14201
- Remove Python 2 compatibility dependency six by @cclauss in #14202
- update quick_start by @AmberC0209 in #14200
- rename train result by @zhangyubo0722 in #14217
- fix benchmark bug by @changdazhou in #14235
- fix benchmark det_r50_vd_pse_v2_0 train error by @GreatV in #14239
- update infer/utility.py to support json format model by @GreatV in #14233
- Support inference for GCU by @EnflameGCU in #14142
- update docs by @AmberC0209 in #14230
- fix: Title text partially missing issue in `recovery_to_markdown.py` by @Coobiw in #14216
- change_support list by @liuhongen1234567 in #14293
- support latexocr static train by @liuhongen1234567 in #14297
- docs: Fix chinese image being displayed on the english readme page by @khanfarhan10 in #14299
- docs: update quick_start and recognition doc by @GreatV in #14302
- add d2s_train_image_shape for static train by @liuhongen1234567 in #14312
- update install command by @AmberC0209 in #14314
- fix: unable to export images without text to docx format by @GreatV in #14306
- paddle.shape return int64 tensor by @wanghuancoder in #14318
- docs: add warning of Apolications part by @SWHL in #14338
- Update algorithm_rec_cppd.md by @GreatV in #14366
- Update 印章弯曲文字识别.md by @BUJIQI in #14368
- update_det_static by @Sunting78 in #14372
- fix:calcute the left_center_pt and right_center_pt from min_area_quad by @fangfangzk in #14363
- add unimernet model by @liuhongen1234567 in #14357
- fix shape64 by @wanghuancoder in #14376
- add slanext models by @liu-jiaxuan in #14374
- fix: replace `rec_image_shape` when manually set by @JesuisTong in #14371
- repair type bug for ppocrv3 by @liuhongen1234567 in #14397
- [WIP]support export with pir and no pir by @zhangyubo0722 in #14379
- Add pp formulanet by @liuhongen1234567 in #14429
- repair formula bug when export by @liuhongen1234567 in #14442
- modify export with pir by @zhangyubo0722 in #14441
- update SLANet inference weights for adapt to paddle3.0b2 by @cuicheng01 in #14467
- fix_server_v4_det_output by @Sunting78 in #14472
- fix label_dict save bug by @zhangyubo0722 in #14273
- add ppocrv4_doc dict by @liuhongen1234567 in #14499
- fix latex_ocr inference by @vivienfanghuagood in #14498
- fix SLANeXt export bug by @liu-jiaxuan in #14512
- add version control for export and modify hpi config by @zhangyubo0722 in #14513
- fix slanext export bug by @liu-jiaxuan in #14519
- repair bug in latexocr cpu infer and typo in bleu score by @liuhongen1234567 in #14552
- Fix language error and spelling mistakes in the documentation by @timminator in #14571
- Keep GitHub Actions up to date with GitHub's Dependabot by @cclauss in #14569
- repair train bug in multi gpu by @liuhongen1234567 in #14576
- build(deps): bump the github-actions group with 3 updates by @dependabot in #14573
- remove max inplace grad by @phlrain in #14596
- build(deps): bump pypa/gh-action-pypi-publish from 1.12.3 to 1.12.4 in the github-actions group by @dependabot in #14603
- CPP: emplace_back() replaces many push_back()...to improve performance by @nonwill in #14610
- Add Thai character dictionary for OCR recognition by @Thanajade in #14620
- CPP: Make functions mostly noexcept to improve runtime performance by @nonwill in #14613
- CPP: tidied file header includes by @nonwill in #14621
*...
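One entry in the list above lets `create_predictor` accept a list of ONNX execution providers. The sketch below illustrates the general idea with ONNX Runtime directly and does not reproduce PaddleOCR's `create_predictor` signature. `pick_providers` and `create_session` are hypothetical helpers, and the fallback to `CPUExecutionProvider` is a design choice of this sketch.

```python
from typing import List, Sequence


def pick_providers(requested: Sequence[str], available: Sequence[str]) -> List[str]:
    """Keep the requested providers that are available, preserving priority order.

    Falls back to CPUExecutionProvider when none of the requested ones is available.
    """
    chosen = [p for p in requested if p in available]
    return chosen or ["CPUExecutionProvider"]


def create_session(model_path: str, requested: Sequence[str]):
    """Build an ONNX Runtime session; requires the onnxruntime package."""
    # Imported lazily so that pick_providers has no third-party dependency.
    import onnxruntime as ort

    providers = pick_providers(requested, ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Passing providers in priority order lets the runtime try an accelerator first and the CPU last.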
v2.9.1
v2.9.0
What's Changed
- fix: table recognition content is not escaped properly by @GreatV in #13277
- fix bug when layout_predictor is None by @GreatV in #13279
- add url in pyproject, and update version number by @GreatV in #13274
- unifying data types in the SLAHead by @GreatV in #13276
- add PaddleX info to README by @TingquanGao in #13308
- Update expired link in quickstart.md by @ZeddYu in #13253
- optimize func: get_infer_gpuid by @GreatV in #13275
- fix slice op parameters not being passed correctly by @GreatV in #13319
- Solve ModuleNotFoundError: No module named 'tools.infer' by @myhloli in #13348
- Add hardware docs by @nepeplwu in #13329
- add paddlex link by @TingquanGao in #13316
- Fix the dictionary bug in tablerec inference by @Topdu in #13362
- add bn_dict.txt by @taeefnajib in #13373
- add missing docstring in paddleocr.py using copilot by @jzhang533 in #13344
- line 445 program.py by @ManikSinghSarmaal in #13389
- fix layout recovery import error by @GreatV in #13434
- Latexocr paddle by @liuhongen1234567 in #13401
- [doc]add amp train notes for detection train by @andyjiang1116 in #13481
- remove some of the less common dependencies by @GreatV in #13461
- docs: Add a new document site by @SWHL in #13375
- Update mkdocs.yml by @GreatV in #13487
- chore: Update issue template by @SWHL in #13505
- chore: Update bug report template by @SWHL in #13508
- Fix cpp_infer "--enable_mkldnn=false" not effective by @hiroi-sora in #13539
- Enable Main Branch Support for PaddleX by @zhangyubo0722 in #13523
- docs: Update README by @SWHL in #13543
- docs: Update README_en by @SWHL in #13545
- Fix typos by @MonkeyBrothers in #13544
- docs: Remove old applications docs by @SWHL in #13551
- fix: 'numpy' has no attribute 'astype' by @laolitou in #13554
- add latexocr docs and fix some typos by @GreatV in #13532
- chore(Issue_template): Add validation of Environment and MPE code by @SWHL in #13559
- skip text files when running test ci by @GreatV in #13561
- fix bug for paddlepaddle3.0 by @changdazhou in #13568
- docs: Update the pdf file path in the operation demonstration by @Gmgge in #13575
- support benchmark for paddlepaddle3.0 by @changdazhou in #13574
- improve the reading experience of some documents by @GreatV in #13562
- update dive into OCR book link by @GreatV in #13581
- docs: Shorten the image path and remove dupliate images by @SWHL in #13585
- docs: Fix docs errors by @SWHL in #13588
- skip text files when running test ci on push by @GreatV in #13582
- docs: Add android_demo docs by @SWHL in #13601
- fix download bug when use multi gpus by @changdazhou in #13610
- disable automatic checks for new version albumentations by @GreatV in #13583
- Fix some LaTeXOCR issues in PaddleX by @liuhongen1234567 in #13646
- update docs and remove out-of-date event by @GreatV in #13660
- setuptools 72.2.0 result in that MANIFEST.in is invalid by @TingquanGao in #13670
- update docs and remove old docs by @GreatV in #13662
- update docs and fix markdown render error by @GreatV in #13678
- chore: Update issue template by @SWHL in #13679
- cache Python dependencies and PaddleOCR files by @GreatV in #13682
- Add files via upload by @lingskr in #13685
- Update ch_PP-OCRv4_rec_distillation.yml by @jiqirenfeile in #13692
- Remove channel links from docs by @zhangyubo0722 in #13674
- Code Style Unification by @zhangyubo0722 in #13697
- docs: Remove doc/datasets directory and fix docs/datasets documents by @SWHL in #13700
- Provides Vietnamese dictionary and corpus by @lingskr in #13698
- Modify the data processing part of LaTeXOCR and replace the absolute path by a relative path by @liuhongen1234567 in #13702
- use setuptools-scm extracts PaddleOCR versions by @GreatV in #13716
- Repair the bug in the inference script for LaTeX OCR by @liuhongen1234567 in #13750
- fixed: mkldnn -> onednn by @achieve-dream1221 in #13757
- remove unused enumerate by @Kayzwer in #13760
- update applications/overview.md by @GreatV in #13763
- Fix setting of make border epoch by @Sunting78 in #13783
- Fix doc link in docs by @Topdu in #13792
- Add support for Hebrew Language and Alphabet by @johnlockejrr in #13797
- Add Syriac script support by @johnlockejrr in #13800
- update KIE docs by @GreatV in #13799
- fix the CI running errors in tests. by @GreatV in #13846
- Fix pir dy2st train by @0x45f in #13853
- fix SRN algorithm infer error by @GreatV in #13851
- update pretrain for benchmark by @changdazhou in #13820
- fix bugs for SLANet infer by @liu-jiaxuan in #13861
- fix version by @TingquanGao in #13895
- set --image_dir to be required by @GreatV in #13896
- support export after save model by @zhangyubo0722 in #13844
- fix hubserving run error by @GreatV in #13918
- fix lateocr bug by @zhangyubo0722 in #13920
- 1. Add latex_ocr formula recognition to the ppstructure pipeline; 2. Add PDF-to-Markdown conversion by @ztyf-lq in #13868
- updata 2.9, adding new models and supporting all-in-one full developm… by @dyning in #13932
- updata 2.9, adding new models and supporting all-in-one full developm… by @dyning in #13933
- adding new models and supporting all-in-one full development tools by @dyning in #13934
- Update quick_start.md with html, not md by @dyning in #13935
- Update quick_start.md for paddlex by @dyning in #13936
- pdf to markdown document by @ztyf-lq in #13942
- Update algorithm_rec_vitstr_en.md by @GreatV in #13947
- update a live promotion by @Zhiiixin in #13954
- ci: Support multi version docs by @SWHL in #13957
- docs: Add tip of old documents by @SWHL in #13960
- ci: Fix mike error by @SWHL in #13962
- Update README.md, fixed broken quick start link by @Kozmosa in #13965
- fix broken link by @GreatV in #13970
- [NPU] cherrypick13983 by @Wang...
v2.8.1
v2.8.0
What's Changed
- [Cherry-pick] #10515 by @ToddBear in #10537
- [BugFix]compat_pillow by @shiyutang in #10596
- [bug fix] fix none res in recovery by @andyjiang1116 in #10603
- Fix seed passing issue of build_dataloader by @RuohengMa in #10614
- [bug fix]rm invalid params by @andyjiang1116 in #10605
- [Cherry-pick] #10441 #10512 by @moehuster in #10593
- Fix the DSR error caused by data augmentation by @xu-peng-7 in #10662
- onnxruntime support gpu by @WenmuZhou in #10668
- Update VQA to use the updated LayoutLM syntax from PaddleNLP by @sijunhe in #9791
- Feature: when --savefile is true, save the OCR inference result under --output in a file named after the current image name followed by ".txt"; resolves issues: by @WilliamQf-AI in #10628
- Cherrypicking GH-10217 and GH-10216 to PaddlePaddle:dygraph by @UserUnknownFactor in #10654
- fix numpy speed by @wanghuancoder in #10773
- Cherrypicking GH-10251 & GH-10181 to PaddleOCR:dygraph by @itasli in #10710
- rec_r45_abinet.yml add max_length and image_size by @xlg-go in #10744
- ch_PP-OCRv4_rec_distill.yml, fix KeyError: 'NRTRLabelDecode' by @xlg-go in #10761
- Based on inference requiring three-channel images and the OpenCV imread documentation for IMREAD_COLOR (If set, always convert … by @Gmgge in #10777
- Update algorithm_kie_vi_layoutxlm_en.md by @sagarjgb in #10736
- Add new recognition method "ParseQ" by @ToddBear in #10836
- rm fluid for paddle dev by @tink2123 in #10931
- rec_r45_abinet for export model by @xlg-go in #10892
- fix: resolve the PPOCRLabel startup failure caused by a channel count mismatch (#10748); the changelog points to #10655, since paddleocr added … by @Gmgge in #10847
- [New] add rec CPPD model by @Topdu in #10990
- fix `cls_x` and `bbox_x` is possibly unbound by @SigureMo in #10991
- add svtr large model by @zhangyubo0722 in #10937
- [WIP]support eval pre epoch by @zhangyubo0722 in #11003
- Update kie_datasets_en.md by @sagarjgb in #10735
- fix import collection for py310 by @tink2123 in #11012
- update ppocrv4_framework by @tink2123 in #11048
- Update how_to_do_kie_en.md by @sagarjgb in #10731
- add cppd u14m train model and doc by @Topdu in #11052
- Fixed bug with "max_text_length" for VisionLAN by @victor30608 in #11025
- Cherrypicking GH-10923 to PaddleOCR:dygraph by @itasli in #11069
- Update quickstart_en.md by @sagarjgb in #10732
- Update README.md by @sagarjgb in #10733
- Update algorithm_overview_en.md by @sagarjgb in #10734
- [Cherry-pick] Cherry-pick from release/2.6 by @shiyutang in #11092
- [TIPC]update tipc scripts by @USTCKAY in #11097
- fix satrn export for paddle2.5 by @tink2123 in #11096
- [BugFix]Fix parseq net by @shiyutang in #11126
- update uygur dict by @hfengzhi in #11125
- Add tipc for "ParseQ" method by @ToddBear in #10843
- fix SAR inference, when batch size>1, norm_img_batch and valid_ratios… by @shiyunalex in #11238
- v4 det cml configs by @sylarwcy in #11258
- Fix the extra blank line between lines in the files produced by the recognition train/test split script by @DingHsun in #11280
- Fix for Ambiguous Boolean Evaluation Error in PaddleOCR with Python 3.11 by @muhammadAgfian96 in #11287
- Dygraph【benchmark】add max_mem_reserved for benchmark by @mmglove in #11284
- Fix bug when running on XPU by @RuohengMa in #11299
- Dygraph by @RuohengMa in #11301
- Dygraph fix max_mem_reserved for benchmark by @mmglove in #11341
- Add a check for the devices available in the current environment to check_gpu by @TracebaK in #11293
- Fixed some bugs that caused PPOCRLabel to crash, added ability to expand checkboxes by @g39088902 in #11236
- fix a bug for rec_postprocess.py by @Ataraxy33 in #11389
- Optimize prediction on long image and deduplicate similar boxes with multiple lables by @marswen in #11366
- doc: add doc for satrn by @wkml in #11397
- Update zeros' comment in rec_abinet_head.py by @YesianRohn in #11374
- Fix QPointF IndexError: list index out of range by @firmament2008 in #11393
- update paddlex of readme by @zhangyubo0722 in #11422
- chore: add notes for docker gpu deploy PP-OCRv4 by @sheiy in #11390
- Fix words by @co63oc in #11448
- [Feature]Complete the ppocrv4_act by @ranchongzhi in #11345
- rm QR code in the document by @tink2123 in #11512
- rm QR code by @tink2123 in #11532
- Fix dead links by @MatKollar in #11520
- cherry-pick for lazy import pymupdf and pre-commit by @tink2123 in #11692
- adapter new type promotion rule for Paddle 2.6 by @zxcd in #11698
- setup a workflow for publishing package to pypi by @jzhang533 in #11804
- update link mentioned at #11763 by @jzhang533 in #11764
- fix AttributeError by @GreatV in #11686
- fix: Correct misuse of `try_import` from `paddle.utils` by @neteroster in #11820
- Update quickstart.md for a better python pdf demo by @qwedc001 in #11927
- Update quickstart_en.md by @qwedc001 in #11934
- Enhance the OCR recognition accuracy of PPStructure. by @RussellLuo in #11916
- add u14m results of cppd by @Topdu in #11943
- use tensor.shape bug not paddle.shape(tensor) by @wanghuancoder in #11919
- add pre-commit workflow by @GreatV in #11973
- docs: Update FAQ.md, delete repeated question by @xu8117 in #11972
- Fix the bug where Python scripts fail to execute PDF text recognition… by @guangyunms in #11994
- [OCR Issue No.9] Support Visualdl as an optional dependency by @Liyulingyue in #11947
- fix weird version info by @GreatV in #12003
- [OCR Issue No.9] Remove dependencies that clearly do not belong in the ppocr dependency list by @Liyulingyue in #11946
- Burmese Language dict and corpus by @1chimaruGin in #12020
- Complete ONNX support for layout recognition by @heweisheng in #12068
- Update README.md by @dyning in #12086
- fix readme codestyle by @GreatV in #12095
- fix wrong link for 通用OCR in README.txt by @tackhwa in #12100
- move PPOCRLabel to PFCCLab/PPOCRLabel by @GreatV in #12104
- move StyleText to PFCCLab/StyleText by @GreatV in #12121
- openocr compti code by @Topdu in #12033
- table rec code by @invictuszhao in #11999
- Error with pyclipper inhomogeneous expanded array by @zovelsanj in #12108
- [OCR Issue No.2] Fix the matching model not being found during training and the error raised when computing accuracy during training by @mattheliu in https://github.com/PaddlePaddle/Paddle...
PaddleOCRv2.7.5
fix broken v2.7.4